Sorting Algorithms
Working with Data Structures and manipulating data.
import random
numbers = []
for i in range(10):
numbers.append(random.randint(0,100))
print("Random List")
print(numbers)
Warm Up
Discuss with a partner... What are some strategies you would use to sort this list? (Don't worry about writing code for now)
- Sure, here's a simple description of a sorting algorithm:One way to sort a list is to repeatedly find the smallest element and move it to the beginning of the list. This can be done by iterating through the list and comparing each element to the current minimum. If a smaller element is found, it becomes the new minimum and is swapped with the current element at the beginning of the list. This process is repeated until the list is sorted.
Explore
Get into groups of 3
We will be focusing on 4 algorithms today.
We will look at the first one together, Bubble Sort
What is happening with this sort?
- Bubble sort is a simple sorting algorithm that repeatedly steps through the list, compares adjacent elements and swaps them if they are in the wrong order. The pass through the list is repeated until the list is sorted. The algorithm, which is a comparison sort, is named for the way smaller or larger elements "bubble" to the top of the list.
In your groups you will each choose to be an expert on a sorting algorithm. Merge, Selection, and Insertion. Take about 5 minutes to read about your algorithm and be ready to explain it to your other group members.
Practice
[75, 17, 46, 80, 67, 45, 69, 79, 40, 0]
How would you sort this list with...
- Bubble Sort
- Selection Sort
Explain.
Bubble Sort
- Compares adjacent elements in the list and swaps them if they are in the wrong order.
- Repeats this process until the list is sorted.
- For example, if we have the list [75, 17, 46, 80, 67, 45, 69, 79, 40, 0], Bubble Sort would start by comparing the first two elements, 75 and 17. Since 75 is greater than 17, they would be swapped, resulting in the list [17, 75, 46, 80, 67, 45, 69, 79, 40, 0]. The algorithm would then compare the next two elements, 75 and 46, and swap them if necessary. This process would continue until the list is fully sorted.
Selection Sort
- Finds the smallest element in the unsorted portion of the list and swaps it with the first element of the unsorted portion.
- Repeats this process for the remaining unsorted portion of the list until all elements are sorted.
- For example, if we have the list [75, 17, 46, 80, 67, 45, 69, 79, 40, 0], Selection Sort would start by finding the smallest element in the entire list, which is 0. It would then swap 0 with the first element of the list, resulting in [0, 17, 46, 80, 67, 45, 69, 79, 40, 75]. The algorithm would then find the smallest element in the remaining unsorted portion of the list (which is [17, 46, 80, 67, 45, 69, 79, 40, 75]), which is 17, and swap it with the second element of the unsorted portion, resulting in [0, 17, 46, 80, 67, 45, 69, 79, 40, 75]. This process would continue until the entire list is sorted.
[88, 39, 53, 39, 58, 43, 74, 81, 71, 51]
How would you sort this list with...
- Merge Sort
- Insertion Sort
Explain.
Merge Sort
- Divides the list into smaller sub-lists, sorts those sub-lists recursively, and then merges them back together.
- Continues this process until the entire list is sorted.
- For example, if we have the list [88, 39, 53, 39, 58, 43, 74, 81, 71, 51], Merge Sort would start by dividing the list into two sub-lists:[88, 39, 53, 39, 58] and [43, 74, 81, 71, 51]. It would then recursively divide each of these sub-lists into smaller sub-lists until each sub-list contains only one element. Next, it would merge the sub-lists back together in sorted order. For example, it would merge [88] and [39] into [39, 88], and then merge [53, 39, 88] and [58] into [39, 53, 58, 88]. It would continue this process until the entire list is sorted: [39, 39, 43, 51, 53, 58, 71, 74, 81, 88]. ### Insertion Sort
- Builds the final sorted list one element at a time.
- Iterates through the list, comparing each element to the previous elements and inserting it into the correct position in the sorted portion of the list.
- For example, if we have the list [88, 39, 53, 39, 58, 43, 74, 81, 71, 51], Insertion Sort would start by considering the first element, 88, as the sorted portion of the list. It would then iterate through the remaining unsorted portion of the list, comparing each element to the elements in the sorted portion and inserting it into the correct position. The second element, 39, would be compared to 88 and inserted before it, resulting in [39, 88, 53, 39, 58, 43, 74, 81, 71, 51]. The third element, 53, would be compared to 88 and inserted after it, resulting in [39, 88, 53, 39, 58, 43, 74, 81, 71, 51]. The fourth element, 39, would be compared to 53 and 88 and inserted before 53, resulting in [39, 39, 88, 53, 58, 43, 74, 81, 71, 51]. This process would continue until the entire list is sorted: [39, 39, 43, 51, 53, 58, 71, 74, 81, 88].
import nltk
import random
nltk.download('words') # Download the word list (only required once)
from nltk.corpus import words
english_words = words.words()
#print(len(english_words)) # Prints the number of words in the list
# You can now use the 'english_words' list in your code
words = []
for i in range(10):
words.append(english_words[random.randint(0,len(english_words))])
print("Random List")
print(words)
Discuss
Answer the following with your group.
- When should you use each algorithm? What makes an algorithm the right choice?
-
Given the following lists...
- [0, 2, 6, 4, 8, 10]
- The best algorithm to sort this list would be Bubble sort as it compares the first two elements, 0 and 2, and swaps them if they are in the wrong order. It would then compare the next two elements, 2 and 6, and swap them if necessary. This process would continue until the list is fully sorted. Since the list is already mostly sorted, Bubble sort would be the best choice as it is most efficient when the list is nearly sorted and small.
-
[Elephant, Banana, Cat, Dog, Apple]
- The best algorithm to sort this list would be Insertion sort as it builds the final sorted list one element at a time. It would iterate through the list, comparing each element to the previous elements and inserting it into the correct position in the sorted portion of the list. Since the list is small, Insertion sort would be the best choice as it is most efficient when the list is small.
-
[29, 13, 83, 47, 32, 78, 100, 60, 65, 15, 24, 9, 40, 68, 53, 8, 90, 58, 39, 32, 34, 91, 74, 94, 49, 87, 34, 87, 23, 17, 27, 2, 38, 58, 84, 15, 9, 46, 74, 40, 44, 8, 55, 28, 81, 92, 81, 88, 53, 38, 19, 21, 9, 54, 21, 67, 3, 41, 3, 74, 13, 71, 70, 45, 5, 36, 80, 64, 97, 86, 73, 74, 94, 79, 49, 32, 20, 68, 64, 69, 1, 77, 31, 56, 100, 80, 48, 75, 85, 93, 67, 57, 26, 56, 43, 53, 59, 28, 67, 50]
- Merge sort would be the best sort for this list as it is a rather large list and as Merge Sort has a time complexity of O(n log n), it would be the most efficient compared to the other sorting algorithms with are less effective with larger lists. Select the algorithm you believe is best for each, explain.
- [0, 2, 6, 4, 8, 10]
HACKS
Provided below is a Bubble Sort Algorithm sorting a list of dictionaries based off of selected key.
-
Now it's time to do some coding...
-
Run code and then research and answer these questions...
- Is a list and/or dictionary in python considered a primitive or collection type? Why?
- In Python, a list and a dictionary are considered collection types. This is because they can hold multiple values of different types in a single variable. Primitive types, on the other hand, can only hold a single value of a specific type.
- Is the list passed into bubble sort "pass-by-value" or "pass-by-reference? Describe why in relation to output.
- The list passed into bubble sort is "pass-by-reference". This is because the original list is modified in place, rather than creating a new copy of the list. Therefore, any changes made to the list inside the function are reflected in the original list outside the function.
- Is a list and/or dictionary in python considered a primitive or collection type? Why?
- Implement new cell(s) and/or organize cells to do the following.
- Create your own list
- Use your expertise sorting algorithm (selection, insertion, merge). Note, I got my bubble sort from Geek for Geeks and made modifications. Each student in a group should have a unique algorithm.
- Test your list with my bubble sort
- Test my list with your new sort, do NOT make a copy my list when doing this
- Research analysis on sorting:comparisons, swaps, time. Build this into your hacks. - Find a better way to print the data, key first, then other elements in viewable form.
Use the code below to help guide your adventure
"""
* Creator: Nighthawk Coding Society
Bubble Sort of a List with optimizations
"""
# bubble sorts a list of dictionaries, base off of provided key
def bubbleSort(list, key):
n = len(list) - 1 # list are indexed 0 to n-1, len is n
# Traverse through list with i index
for i in range(n):
swapped = False # optimize code, so it exits if now swaps on inner loop
# Inner traversal using j index
for j in range(n-i): # n-i as positions on right are in order in bubble
# Swap if the element KeyN is greater KeyN1
keyN = list[j].get(key)
keyN1 = list[j+1].get(key)
if keyN > keyN1:
swapped = True
list[j], list[j + 1] = list[j + 1], list[j] # single line swap
if not swapped: # if no swaps on inner pass, list is sorted
return # exit function
if __name__ == "__main__":
# list/dictionary sample
list_of_people = [
{"name": "Risa", "age": 18, "city": "New York"},
{"name": "John", "age": 63, "city": "Eugene"},
{"name": "Shekar", "age": 18, "city": "San Francisco"},
{"name": "Ryan", "age": 21, "city": "Los Angeles"}
]
cookie_prices = [
{"name": "Chocolate Chip", "price": 2.50},
{"name": "Oatmeal Raisin", "price": 2.00},
{"name": "Peanut Butter", "price": 2.75},
{"name": "Snickerdoodle", "price": 1.75},
{"name": "Sugar", "price": 1.50}
]
# assuming uniform keys, pick 1st row as source of keys
# key_row = list_of_people[0]
key_row = cookie_prices[0]
# print list as defined
print("Original")
print(cookie_prices)
for key in key_row: # finds each key in the row
print(key)
bubbleSort(cookie_prices, key) # sort list of people
print(cookie_prices)
import time
def merge_sort(arr, key):
"""
This function sorts the given array of dictionaries using the merge sort algorithm.
The sorting is done based on the value of the given key in each dictionary.
Args:
- arr: list of dictionaries to be sorted
- key: key in each dictionary to be used for sorting
Returns:
- sorted_arr: sorted list of dictionaries
"""
# If the length of the array is greater than 1, split it into two halves
if len(arr) > 1:
mid = len(arr) // 2
left_half = arr[:mid]
right_half = arr[mid:]
# Recursively sort the left and right halves
merge_sort(left_half, key)
merge_sort(right_half, key)
# Merge the sorted halves
i = j = k = 0
while i < len(left_half) and j < len(right_half):
# Compare the values of the given key in the left and right halves
if left_half[i][key] < right_half[j][key]:
# If the value in the left half is smaller, add it to the sorted array
arr[k] = left_half[i]
i += 1
else:
# If the value in the right half is smaller, add it to the sorted array
arr[k] = right_half[j]
j += 1
k += 1
# Add any remaining elements from the left half to the sorted array
while i < len(left_half):
arr[k] = left_half[i]
i += 1
k += 1
# Add any remaining elements from the right half to the sorted array
while j < len(right_half):
arr[k] = right_half[j]
j += 1
k += 1
# Return the sorted array
return arr
# Example usage
cookie_prices = [
{"name": "Chocolate Chip", "price": 2.50},
{"name": "Oatmeal Raisin", "price": 2.00},
{"name": "Peanut Butter", "price": 2.75},
{"name": "Snickerdoodle", "price": 1.75},
{"name": "Sugar", "price": 1.50}
]
# Record the start time
start_time = time.time()
# Sort the list of people based on their age using the merge_sort function
sorted_arr = merge_sort(cookie_prices, "price")
# Record the end time
end_time = time.time()
# Print the sorted array and the time taken to sort it
print("Sorted array:", sorted_arr)
print("Time taken:", end_time - start_time, "seconds")
import time
def merge_sort(arr, key):
"""
This function sorts the given array of dictionaries using the merge sort algorithm.
The sorting is done based on the value of the given key in each dictionary.
Args:
- arr: list of dictionaries to be sorted
- key: key in each dictionary to be used for sorting
Returns:
- sorted_arr: sorted list of dictionaries
"""
# If the length of the array is greater than 1, split it into two halves
if len(arr) > 1:
mid = len(arr) // 2
left_half = arr[:mid]
right_half = arr[mid:]
# Recursively sort the left and right halves
merge_sort(left_half, key)
merge_sort(right_half, key)
# Merge the sorted halves
i = j = k = 0
while i < len(left_half) and j < len(right_half):
# Compare the values of the given key in the left and right halves
if left_half[i][key] < right_half[j][key]:
# If the value in the left half is smaller, add it to the sorted array
arr[k] = left_half[i]
i += 1
else:
# If the value in the right half is smaller, add it to the sorted array
arr[k] = right_half[j]
j += 1
k += 1
# Add any remaining elements from the left half to the sorted array
while i < len(left_half):
arr[k] = left_half[i]
i += 1
k += 1
# Add any remaining elements from the right half to the sorted array
while j < len(right_half):
arr[k] = right_half[j]
j += 1
k += 1
# Return the sorted array
return arr
# Example usage
list_of_people = [
{"name": "Risa", "age": 18, "city": "New York"},
{"name": "John", "age": 63, "city": "Eugene"},
{"name": "Shekar", "age": 18, "city": "San Francisco"},
{"name": "Ryan", "age": 21, "city": "Los Angeles"}
]
# Record the start time
start_time = time.time()
# Sort the list of people based on their age using the merge_sort function
sorted_arr = merge_sort(list_of_people, "age")
# Record the end time
end_time = time.time()
# Print the sorted array and the time taken to sort it
print("Sorted array:", sorted_arr)
print("Time taken:", end_time - start_time, "seconds")