Word List

Data Terms:

  • Bits
  • Bytes
  • Hexadecimal
  • Nibbles

Binary Numbers:

  • Unsigned Integer
  • Signed Integer
  • Floating Point

Binary Data Abstractions:

  • Boolean
  • ASCII
  • Unicode
  • RGB

Data Compression:

  • Lossless
  • Lossy

Bits

  • Category: Data Terms
  • Definition: A bit is the smallest unit of data in a computer. It can have one of two values: 0 or 1. Groups of bits can represent any type of data.

Bits Code Example

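def DecimalToBinary(n):
    # Zero has no remainders to collect, so handle it directly
    if n == 0:
        return "0"

    binary_digits = []

    # Repeatedly divide by 2; the remainders are the bits, least significant first
    while n > 0:
        binary_digits.append(n % 2)
        n = n // 2
    return ''.join(str(d) for d in reversed(binary_digits))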

print(DecimalToBinary(10))
# Output: 1010

print(DecimalToBinary(100))  
# Output: 1100100

print(DecimalToBinary(255))  
# Output: 11111111
1010
1100100
11111111
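
As a rough cross-check of the example above, Python's built-in format() produces the same binary strings, and n bits can distinguish 2**n different values. This is only a sketch; the helper name values_for_bits is made up for illustration.

```python
# Sketch: cross-check the manual conversion with the built-in format(),
# and count how many distinct values a given number of bits can hold.

def values_for_bits(bit_count):
    """Hypothetical helper: number of distinct values bit_count bits can represent."""
    return 2 ** bit_count

# format(n, "b") gives the binary digits without the "0b" prefix
print(format(10, "b"))    # 1010
print(format(255, "b"))   # 11111111

for bits in (1, 4, 8):
    print(f"{bits} bit(s) -> {values_for_bits(bits)} values")
```

Eight bits give 256 values, which is why 255 (11111111) is the largest number that fits in 8 bits.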

Bytes

  • Category: Data Terms
  • Definition: A byte is a group of 8 bits, so it can hold 256 different values (0 to 255). On most modern computers it is the smallest unit of memory that can be directly addressed.

Bytes Code Example

string = "Hello, world!"
byte_object = bytes(string, "utf-8")

print(byte_object)  
# Output: b'Hello, world!'

print(f"Number of bytes: {len(byte_object)}")  
# Output: Number of bytes: 13

# Byte back to string
string = byte_object.decode("utf-8")

print(string)  
# Output: Hello, world!
b'Hello, world!'
Number of bytes: 13
Hello, world!
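
Bytes store numbers as well as text. The sketch below, which assumes big-endian byte order purely for readability, shows that a value larger than 255 needs more than one byte.

```python
# Sketch: a byte holds 8 bits, so a single byte can only store 0 to 255.
number = 1000

# 1000 does not fit in one byte, so ask for two bytes, most significant byte first
two_bytes = number.to_bytes(2, "big")
print(two_bytes)          # b'\x03\xe8'
print(list(two_bytes))    # [3, 232]

# Rebuild the integer from its bytes: 3 * 256 + 232 = 1000
restored = int.from_bytes(two_bytes, "big")
print(restored)           # 1000
```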

Hexadecimal

  • Category: Data Terms
  • Definition: A hexadecimal number is a number expressed in base 16, just as a decimal number is expressed in base 10. Each place represents a power of 16. The digits 0-9 keep their usual values, and the letters A-F represent the values 10-15, so A = 10, B = 11, and so on up to F = 15. Each hexadecimal digit represents 4 bits, so two hexadecimal digits are a convenient way to write one byte.

Hexadecimal Code Example

hex_string = input("Enter a hexadecimal number: ")

decimal_num = int(hex_string, 16)
binary_num = bin(decimal_num)

print("Hexadecimal number:", hex_string)

print("Decimal equivalent:", decimal_num)

print("Binary equivalent:", binary_num)

# This code takes a hexadecimal string from the user,
# converts it to an integer with int(hex_string, 16),
# and then uses bin() to show the same value in binary.
# Hexadecimal, decimal, and binary are just
# different ways of writing the same number
Hexadecimal number: 0xA5
Decimal equivalent: 165
Binary equivalent: 0b10100101
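
To make the "power of 16" idea concrete, here is a minimal sketch that does the conversion by hand instead of calling int(). The helper hex_to_decimal and the HEX_DIGITS constant are hypothetical names used only for this illustration.

```python
# Sketch: convert a hexadecimal string by hand using place values.
HEX_DIGITS = "0123456789ABCDEF"

def hex_to_decimal(hex_string):
    """Hypothetical helper: accepts strings such as 'A5' or '0xA5'."""
    digits = hex_string.upper()
    if digits.startswith("0X"):
        digits = digits[2:]

    total = 0
    for digit in digits:
        # Shift the running total one hex place to the left, then add the new digit
        total = total * 16 + HEX_DIGITS.index(digit)
    return total

print(hex_to_decimal("0xA5"))   # 10 * 16 + 5 = 165
print(hex_to_decimal("FF"))     # 15 * 16 + 15 = 255
```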

Nibbles

  • Category: Data Terms
  • Definition: A nibble is a group of 4 bits. It is half of a byte. It is often used to represent a single digit in a hexadecimal number (as each hexadecimal digit represents 4 bits). A high nibble is the most significant 4 bits of a byte. A low nibble is the least significant 4 bits of a byte.

Nibbles Code Example

byte = 0xA5

# Extract the high nibble (the four most significant bits).
# The bitwise right shift operator moves the 4 most
# significant bits (high nibble) 4 places to the right and
# discards the 4 least significant bits (low nibble)

high_nibble = byte >> 4

# Extract the low nibble (the four least significant bits).
# The mask 0x0F is 00001111 in binary, so the bitwise AND
# operator keeps the last four bits and clears the first four.
# This is the counterpart of the shift above: it keeps
# the 4 least significant bits, which are the low nibble

low_nibble = byte & 0x0F

# Convert the nibbles to hexadecimal strings
high_nibble_hex = hex(high_nibble)
low_nibble_hex = hex(low_nibble)

# Print the results

print("Byte:", byte)
print("High nibble:", high_nibble_hex)
print("Low nibble:", low_nibble_hex)
Byte: 165
High nibble: 0xa
Low nibble: 0x5
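
Going the other way, the two nibbles can be put back together into the original byte. This is a small sketch that reuses the same value, 0xA5, as the example above.

```python
# Sketch: split a byte into nibbles, then rebuild it.
byte = 0xA5

high_nibble = byte >> 4       # 0xA
low_nibble = byte & 0x0F      # 0x5

# Shift the high nibble back into place and OR in the low nibble
rebuilt = (high_nibble << 4) | low_nibble
print(hex(rebuilt))           # 0xa5

# Each nibble is exactly one hex digit and four binary digits
print(format(byte, "02X"))    # A5
print(format(byte, "08b"))    # 10100101
```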

Unsigned Integer

  • Category: Binary Numbers
  • Definition: An unsigned integer is a whole number that cannot be negative: it is either zero or positive. It is stored in a fixed number of bits, commonly 8, 16, 32, or 64 depending on the data type and language, and an n-bit unsigned integer can hold values from 0 to 2^n - 1. For example, a 32-bit unsigned integer ranges from 0 to 4,294,967,295. Unsigned integers suit situations where a value can never be negative, such as the number of items on a shopping list. The advantage is that, because no bit is needed for the sign, an unsigned integer can hold positive values about twice as large as a signed integer of the same size. The disadvantage is that it cannot represent negative numbers: if a calculation such as a subtraction would go below zero, the value wraps around to a very large positive number instead, which is a common source of bugs.

Unsigned Integer Code Example

  • FYI: Python's built-in int is signed and has no fixed size, so this example imports c_uint from the ctypes module to get a fixed-width unsigned integer.
from ctypes import c_uint

shopping_list = []

# Declare an unsigned integer to count the items in the shopping list.
# The count can never be negative, so an unsigned integer is a better fit here than a signed integer
item_count = c_uint(0)

# Function to add an item to the shopping list
def add_item(item):
  global item_count
  shopping_list.append(item)
  item_count.value += 1
  print(f"\n{item} added to shopping list.")

# Function to remove an item from the shopping list.
# The count is only decremented when it is greater than 0,
# so the number of items can never go below 0.
# Without this check, subtracting 1 from an unsigned 0 would not raise an error;
# the value would silently wrap around to the largest unsigned value
def remove_item(item):
  global item_count
  if item in shopping_list:
    shopping_list.remove(item)
    if item_count.value > 0:
        item_count.value -= 1
    else:
        item_count.value = 0
        print("\nItem count is already 0; the shopping list cannot have a negative number of items.")
    print(f"\n{item} removed from shopping list.")
  else:
    print(f"\n{item} not found in shopping list.")

# Add some items to the shopping list
add_item("Milk")
add_item("Bread")
add_item("Eggs")

# Prints the current shopping list
print(f"\nCurrent shopping list: {shopping_list}")

# Remove an item from the shopping list
remove_item("Bread")

# Print the current shopping list
print(f"\nCurrent shopping list: {shopping_list}")

# Try to remove an item that is not in the shopping list
remove_item("Sugar")
Milk added to shopping list.

Bread added to shopping list.

Eggs added to shopping list.

Current shopping list: ['Milk', 'Bread', 'Eggs']

Bread removed from shopping list.

Current shopping list: ['Milk', 'Eggs']

Sugar not found in shopping list.
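
The check in remove_item matters because ctypes integers do no overflow checking. The sketch below uses c_uint32 (rather than c_uint, whose width depends on the platform) so that the numbers are predictable; that swap is an assumption made for this illustration only.

```python
# Sketch: what happens to a fixed-width unsigned integer at its limits.
from ctypes import c_uint32

# c_uint32 is always 32 bits wide, so its largest value is 2**32 - 1
largest = 2 ** 32 - 1
print(largest)                        # 4294967295

# Going one past the largest value wraps around to 0
print(c_uint32(largest + 1).value)    # 0

# Going one below 0 wraps around to the largest value
counter = c_uint32(0)
counter.value -= 1
print(counter.value)                  # 4294967295
```

No exception is raised in either case, which is why the guard has to be written by hand.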

Signed Integer

  • Category: Binary Numbers
  • Definition: A signed integer is a whole number that can be positive, negative, or zero. Like an unsigned integer, it is stored in a fixed number of bits, commonly 8, 16, 32, or 64 depending on the data type and language. Most computers store signed integers in two's complement form, where an n-bit signed integer can hold values from -2^(n-1) to 2^(n-1) - 1; a 32-bit signed integer, for example, ranges from -2,147,483,648 to 2,147,483,647. Signed integers suit situations where a value can be negative, such as a subtraction problem whose result may be positive or negative and only whole numbers are involved. The advantage is that both positive and negative numbers can be represented. The disadvantage is that, compared with an unsigned integer of the same size, the largest positive value is only about half as big, because half of the bit patterns are used for negative numbers. Integer arithmetic is exact and typically cheaper than floating point arithmetic.

Signed Integer Code Example

def subtractor():
    num1 = int(input("Enter first number: "))
    num2 = int(input("Enter second number: "))
    print(f"The difference between {num1} and {num2} is {num1 - num2}")

subtractor()

# As illustrated, because the program uses a
# signed integer, it can store and display
# negative values as well as positive values.
# If a fixed-width unsigned integer had been used
# in this scenario, the result would have wrapped around to a large positive number
The difference between 21 and 69 is -48
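
Python hides how the -48 above is stored, so here is a rough sketch of an 8-bit two's complement encoding, the scheme most hardware uses for signed integers. The 8-bit width and both helper names are assumptions chosen for illustration.

```python
# Sketch: encode and decode signed integers as 8-bit two's complement.

def to_twos_complement(value, bits=8):
    """Hypothetical helper: bit pattern that stores value in the given width."""
    mask = (1 << bits) - 1
    return format(value & mask, f"0{bits}b")

def from_twos_complement(bit_string):
    """Hypothetical helper: read a bit pattern back as a signed integer."""
    bits = len(bit_string)
    raw = int(bit_string, 2)
    # A leading 1 means the value is negative
    if raw >= 1 << (bits - 1):
        raw -= 1 << bits
    return raw

print(to_twos_complement(48))             # 00110000
print(to_twos_complement(-48))            # 11010000
print(from_twos_complement("11010000"))   # -48

# Range of an 8-bit signed integer
print(from_twos_complement("10000000"), from_twos_complement("01111111"))  # -128 127
```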

Floating Point

  • Category: Binary Numbers
  • Definition: A floating point number is a number that can be positive or negative and can have a fractional part (a decimal point). It is stored in a fixed number of bits split into a sign, an exponent, and a fraction, most commonly 32 bits (single precision) or 64 bits (double precision) under the IEEE 754 standard; Python's float is normally a 64-bit double. Floating point numbers suit situations where a value can be negative and is not always whole, such as a temperature. The advantages are that fractions can be represented and that the range of values is far larger than that of an integer with the same number of bits. The disadvantages are that precision is limited, so many values (such as 0.1) are stored only approximately and small rounding errors can build up, and that floating point arithmetic can be slower than integer arithmetic on some hardware.

Floating Point Code Example

def convertFahrenheit(temperature):
    celsius = temperature - 32
    celsius = celsius * 5/9
    print(f"\nThe temperature outside is {round(celsius, 7)} degrees celsius")

outsideTemp = input("Enter temperature in Fahrenheit: ")
outsideTemp = float(outsideTemp)
print(f"\nThe temperature outside is {outsideTemp} degrees Fahrenheit")

convertFahrenheit(outsideTemp)

# As seen, the program handles both positive and negative numbers,
# and decimals as well as whole numbers,
# because it stores the temperature as a floating point number.
# Floats suit calculations that involve division, such as this temperature
# conversion, because the result keeps its fractional part
# instead of being limited to whole numbers
The temperature outside is 21.21 degrees Fahrenheit

The temperature outside is -5.9944444 degrees celsius
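
The rounding in the example above hints at a limit of floating point: many decimal fractions cannot be stored exactly. The following sketch, which assumes the usual 64-bit IEEE 754 double behind Python's float, shows the effect and one common way to compare floats safely.

```python
# Sketch: floating point values are approximations.
import math
import struct

total = 0.1 + 0.2
print(total == 0.3)               # False, because of a tiny rounding error
print(math.isclose(total, 0.3))   # True, compares within a tolerance

# Pack 0.1 as a big-endian IEEE 754 double to see how many bits it uses
packed = struct.pack(">d", 0.1)
print(len(packed) * 8)            # 64
print(packed.hex())               # 3fb999999999999a
```

The repeating 9s in the hex pattern are the binary equivalent of a repeating decimal: 0.1 has no exact finite representation in base 2.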

Boolean

  • Category: Binary Data Abstractions
  • Definition: A boolean is a data type that can have only two values: True or False. Booleans are used in conditional statements, where the program executes a certain action depending on whether a condition is true or false. Many procedures, such as comparisons, output a boolean value.

Boolean Code Example

things = [1, 2.0, "snowflake", True, 4, 5.0, "butterfly", False, 7, 8.0, "pumpernickle toast", True, 10, 11.0, "rainbow", False]
def thingchecker(things):
    for thing in things:
        # In Python, bool is a subclass of int, so
        # isinstance(True, int) is True. Booleans therefore have to be
        # excluded explicitly before a value is reported as an integer
        if isinstance(thing, int) and not isinstance(thing, bool):
            print(f"{thing} is an integer")
        elif isinstance(thing, float):
            print(f"{thing} is a float")
        elif isinstance(thing, str):
            print(f"{thing} is a string")
        elif isinstance(thing, bool):
            print(f"{thing} is a boolean")
        else:
            print("The input is something else")
thingchecker(things)
1 is an integer
2.0 is a float
snowflake is a string
True is a boolean
4 is an integer
5.0 is a float
butterfly is a string
False is a boolean
7 is an integer
8.0 is a float
pumpernickle toast is a string
True is a boolean
10 is an integer
11.0 is a float
rainbow is a string
False is a boolean
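
As a small illustration of a procedure that outputs a boolean, a comparison produces True or False, and that result can drive a conditional statement. The temperature value is just sample data.

```python
# Sketch: comparisons return booleans that control conditionals.
temperature = -5.99

is_freezing = temperature <= 0
print(is_freezing)          # True
print(type(is_freezing))    # <class 'bool'>

if is_freezing:
    print("Wear a coat")
else:
    print("No coat needed")

# bool is a subclass of int: True behaves like 1 and False like 0
print(isinstance(True, int))   # True
print(True + True)             # 2
```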

ASCII

  • Category: Binary Data Abstractions
  • Definition: ASCII stands for American Standard Code for Information Interchange. It is a character encoding standard for electronic communication that uses 7 bits to represent 128 different characters: the English letters, digits, punctuation, and control codes. It is one of the oldest and most widely supported character encoding standards, and it is a subset of Unicode: the first 128 Unicode code points are the ASCII characters. ASCII is used for things such as data transmission and plain-text file formats, it covers the characters used by the keywords and syntax of languages such as C, C++, Java, and Python, and it is used for the markup in HTML and XML documents, along with many more uses.

ASCII Code Example

Coded = input("Enter a string to be coded: ")
print(f'Your string was {Coded} which in ASCII codes is:')
def ASCII_Coder(Coded):
    for letter in Coded:
        print(ord(letter), end = " ")
ASCII_Coder(Coded)

# The ord function converts a character
# into its ASCII code, which is then
# printed out to the user. ord actually returns the
# Unicode code point, so it also works for non-ASCII characters
Your string was butter which in ASCII codes is:
98 117 116 116 101 114 
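
A quick sketch of the 7-bit limit, using the same word as the run above: every ASCII code is below 128, and chr() reverses ord().

```python
# Sketch: ASCII codes fit in 7 bits and can be turned back into text.
text = "butter"

codes = [ord(character) for character in text]
print(codes)                                # [98, 117, 116, 116, 101, 114]

# Every ASCII code is in the range 0 to 127
print(all(code < 128 for code in codes))    # True

# chr() is the inverse of ord()
print("".join(chr(code) for code in codes)) # butter

# The letter "b" written as a 7-bit pattern
print(format(ord("b"), "07b"))              # 1100010
```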

Unicode

  • Category: Binary Data Abstractions
  • Definition: Unicode is a character encoding standard for electronic communication. It assigns every character a number called a code point, in the range 0 to 1,114,111 (U+0000 to U+10FFFF), so it is not limited to the 65,536 values that fit in 16 bits. Code points are stored as bytes using encodings such as UTF-8, UTF-16, and UTF-32; UTF-8 is the most common encoding on the web. Unicode is used to represent text in computers and other devices, and it is a superset of ASCII. It is the standard that encodes most of the world's languages.

Unicode Code Example

Coded = input("Enter a string to be coded: ")
print(f'Your string was {Coded} which in Unicode code points is:')
def Uni_Coder(Coded):
    for letter in Coded:
        print(ord(letter), end = " ")
Uni_Coder(Coded)
# According to Google Translate, the input below should be "Hello World" in Simplified Chinese
# ord() returns the Unicode code point of each character in the string
# For ASCII characters the code point equals the ASCII code, because ASCII is a subset of Unicode
# These characters are outside ASCII, so their code points are greater than 127
Your string was 你好世界 which in Unicode code points is:
20320 22909 19990 30028 
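
A code point is only a number; how many bytes it takes depends on the encoding. The sketch below compares UTF-8 and UTF-16 for the same string and adds an emoji (an example chosen here, not taken from the run above) whose code point is larger than 65,535.

```python
# Sketch: the same code points take different numbers of bytes per encoding.
text = "你好世界"

print(len(text))                        # 4 characters (code points)
print(len(text.encode("utf-8")))        # 12 bytes, 3 per character
print(len(text.encode("utf-16-le")))    # 8 bytes, 2 per character

# ASCII characters take a single byte in UTF-8
print(len("A".encode("utf-8")))         # 1

# U+1F600 (grinning face) does not fit in 16 bits
emoji = "\U0001F600"
print(ord(emoji))                       # 128512
print(len(emoji.encode("utf-8")))       # 4
```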

RGB

  • Category: Binary Data Abstractions
  • Definition: RGB stands for Red, Green, Blue. It is a color model that describes a color by the intensity of its red, green, and blue components. In the common 24-bit form, each component is stored in 1 byte (0 to 255), so 3 bytes can represent 256 × 256 × 256 = 16,777,216 different colors. RGB is used to represent colors in computers and other devices, in images, videos, and other media, and in HTML and CSS documents.

RGB Code Example

BTW here is the picture the values are from: RGB Values

from PIL import Image
# Opens the image file
doughnut = Image.open("/home/tirth/vscode/APCSP-Blog/images/Profile.jpg")

# Get the width and height of the image
width, height = doughnut.size

center_x = width // 2
quarter_y = height // 4

# Get the color of the pixel at the center of the image 
# and a quarter of the way down from the top of the image
r, g, b = doughnut.getpixel((center_x, quarter_y))

# Print the red, green, and blue values of the pixel
print(f"Red: {r}")
print(f"Green: {g}")
print(f"Blue: {b}")
Red: 255
Green: 139
Blue: 184
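
Since each channel is one byte, the pixel above can also be written as the six-digit hex color used in HTML and CSS. This is a minimal sketch; rgb_to_hex and hex_to_rgb are hypothetical helper names.

```python
# Sketch: convert between an RGB triple and a CSS-style hex color.

def rgb_to_hex(r, g, b):
    """Hypothetical helper: each channel becomes two hex digits."""
    return f"#{r:02x}{g:02x}{b:02x}"

def hex_to_rgb(hex_color):
    """Hypothetical helper: read '#rrggbb' back into three integers."""
    hex_color = hex_color.lstrip("#")
    return tuple(int(hex_color[i:i + 2], 16) for i in (0, 2, 4))

print(rgb_to_hex(255, 139, 184))   # #ff8bb8
print(hex_to_rgb("#ff8bb8"))       # (255, 139, 184)

# Three channels with 256 levels each
print(256 ** 3)                    # 16777216
```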

Lossless

  • Category: Data Compression
  • Definition: A lossless data compression algorithm is an algorithm that compresses data without losing any information, so the original can be reconstructed exactly from its compressed form. This type of compression is used for text files, certain image formats such as PNG, databases, and audio codecs such as FLAC (Free Lossless Audio Codec) and ALAC (Apple Lossless Audio Codec), along with many more uses. Lossless audio is popular in audiophile communities, where some listeners report that they can hear the difference from lossy compression and so prefer FLAC and similar formats.
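
The "reconstructed exactly" property can be checked directly. This sketch uses Python's standard zlib module on some made-up repetitive data; it is one example of a lossless compressor, not the one used by any particular file format mentioned above.

```python
# Sketch: a lossless round trip with zlib.
import zlib

# Repetitive sample data compresses well
original = b"banana " * 200

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original))          # 1400
print(len(compressed) < len(original))   # True, the compressed form is smaller
print(restored == original)   # True, every byte comes back
```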

Lossy

  • Category: Data Compression
  • Definition: A lossy data compression algorithm is an algorithm that compresses data by removing some information. It is not reversible: the original data cannot be reconstructed exactly from its compressed form. Discarding information allows the data to be compressed into a much smaller size. This type of compression is used in things such as audio and video compression, where some loss of quality can be tolerated. Some people may be able to discern the difference; see the Lossless entry above for more info.
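
To illustrate why lossy compression cannot be undone, here is a toy sketch that keeps only the top 4 bits of each 8-bit sample. Real codecs such as JPEG are far more sophisticated; this simple quantization is a stand-in chosen for clarity, and the sample values are made up.

```python
# Sketch: toy lossy compression by dropping the low 4 bits of each sample.
samples = [0, 17, 34, 100, 139, 184, 255]

# Keep only the high nibble: each value now needs 4 bits instead of 8
compressed = [sample >> 4 for sample in samples]

# Best-effort reconstruction: the discarded low bits are gone for good
restored = [value << 4 for value in compressed]

print(compressed)            # [0, 1, 2, 6, 8, 11, 15]
print(restored)              # [0, 16, 32, 96, 128, 176, 240]
print(restored == samples)   # False

# The error per sample is never more than 15
print(max(s - r for s, r in zip(samples, restored)))   # 15
```

The data is half the size, and the restored values are close to the originals but not identical.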

Lossless and Lossy Compression Code Example

As seen, the lossy image is much smaller than the lossless image, around 7.55 times smaller in fact (see the code below). This is because lossy compression removes some of the data from the image to achieve more efficient compression, whereas lossless compression has to keep all of the data. This is why the use of lossy compression, when possible, can save storage and bandwidth costs in commercial applications. Streaming services such as Apple Music and Spotify are an example, where lossy compression is used for most songs and non-premium users. Can you tell the difference between the lossy and lossless image?

Original Image: Original Image

Lossy Compressed Image: Lossy Compressed Image

Lossless Compressed Image: Lossless Compressed Image

import os 
from PIL import Image

# Open the same image file as before
with Image.open("/home/tirth/vscode/APCSP-Blog/images/Profile.jpg") as image:
    # Save the image using lossy JPEG compression
    image.save("image_compressed_lossy.jpg", "JPEG", quality=75)
    
    # Save the image using lossless PNG compression
    image.save("image_compressed_lossless.png", "PNG")

# Get the file sizes of the compressed images
lossy_size = os.path.getsize("image_compressed_lossy.jpg")
lossless_size = os.path.getsize("image_compressed_lossless.png")

# Prints the file sizes
print(f"Lossy image size: {lossy_size} bytes")
print(f"Lossless image size: {lossless_size} bytes")
Lossy image size: 106523 bytes
Lossless image size: 803895 bytes
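
The "7.55 times smaller" figure follows from the two sizes printed above. A short sketch of the arithmetic, with the sizes copied from that output:

```python
# Sketch: compare the two file sizes reported in the output above.
lossy_size = 106523       # bytes, from the JPEG
lossless_size = 803895    # bytes, from the PNG

ratio = lossless_size / lossy_size
print(f"The lossless file is {ratio:.2f} times larger")   # 7.55

# Share of space saved by choosing the lossy file
saved = 1 - lossy_size / lossless_size
print(f"Space saved: {saved:.1%}")
```

The exact ratio depends on the image and on the JPEG quality setting, so other pictures will give different numbers.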