Week Eleven - Sorting and Efficiency
Efficiency has not really been a huge concern for me this term, but that is mainly because the data we are dealing with right now is relatively small. In the real world, however, this is definitely not the case: data structures are often huge, and if a program is not efficient enough it can take a much longer time to run and produce a result. Therefore, in my opinion, optimizing run-time should always be a goal for a good programmer. This week we discussed several sorting algorithms, some more efficient than others, and I will talk about them below:
Selection Sort
Selection sort is very easy to code and understand. In selection sort you have an index 'i' that splits the list into two parts: all the indices smaller than 'i' form the sorted part, and the indices equal to or greater than 'i' form the unsorted part. You start with 'i' at index 0 (which means the sorted part of the list is empty and the unsorted part is the whole list), and you move 'i' toward the end of the list. Each time you advance 'i' by one, you find the smallest item in the unsorted part and swap it with the item at 'i' (this can be done using tuple unpacking). There is no difference between the best, average, and worst case, because no matter how you arrange the list it takes the same number of operations with respect to the length of the list. The run-time is always in O(n^2), where n is the number of items in the list: when i is 0 you make n-1 comparisons, then n-2, n-3, n-4 and so on, so when you sum up all the comparisons you get O(n^2).
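
Here is a rough sketch of selection sort in Python (my own version of the idea, not necessarily the exact code from lecture):

def selection_sort(lst):
    """Sort lst in place by repeatedly selecting the smallest remaining item."""
    for i in range(len(lst)):
        # find the index of the smallest item in the unsorted part lst[i:]
        smallest = i
        for j in range(i + 1, len(lst)):
            if lst[j] < lst[smallest]:
                smallest = j
        # swap it into position i using tuple unpacking
        lst[i], lst[smallest] = lst[smallest], lst[i]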
Insertion Sort
Insertion sort is similar to selection sort in some ways. It also has an index 'i' and a sorted and an unsorted part of the list, but in insertion sort you "insert" the item at index 'i' into its correct place in the sorted part. (To do this you might have to shift some of the items in the sorted part to make room.) Unlike selection sort, insertion sort actually has a best case and a worst case. The best case is when the list is already sorted, so no shifting is required. The worst case is when the list is in exact reverse order, so every insertion has to shift everything before it. In the worst case the run-time is also O(n^2); logic similar to selection sort applies here.
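
A quick sketch of how I picture insertion sort (again just my own version, to show the shifting):

def insertion_sort(lst):
    """Sort lst in place by inserting each item into the sorted part lst[:i]."""
    for i in range(1, len(lst)):
        value = lst[i]
        j = i
        # shift items in the sorted part to the right until value fits
        while j > 0 and lst[j - 1] > value:
            lst[j] = lst[j - 1]
            j -= 1
        lst[j] = value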
Quick Sort
Quick sort is a new algorithm we learned in this course. I personally really liked this sort because the algorithm is straightforward and the code is extremely easy to write. This sort shows how powerful recursion can be (so does merge sort below). In quick sort you pick a pivot and compare all the other elements to it: all the elements smaller than the pivot go in one list, and all the elements greater than or equal to the pivot go in another list. Then you call quick sort on those two lists separately. The worst case occurs when all the items always end up on one side of the pivot (i.e. only one non-empty sub-list is created each time); in that case the run-time is O(n^2). This problem can be avoided by using a random number generator to pick the pivot each time, which makes the worst case very unlikely, so the expected run-time is O(nlog(n)), where n accounts for the comparisons.
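
Here is roughly what quick sort with a random pivot looks like (a sketch of the idea, returning a new list rather than sorting in place):

import random

def quick_sort(lst):
    """Return a new list containing the items of lst in sorted order."""
    if len(lst) < 2:
        return lst[:]
    # picking the pivot at random makes the O(n^2) worst case very unlikely
    pivot_index = random.randrange(len(lst))
    pivot = lst[pivot_index]
    rest = lst[:pivot_index] + lst[pivot_index + 1:]
    smaller = [x for x in rest if x < pivot]
    greater_or_equal = [x for x in rest if x >= pivot]
    return quick_sort(smaller) + [pivot] + quick_sort(greater_or_equal)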
Merge Sort
Merge sort also uses recursion. The idea is that you keep cutting the list in half, sort the left side and the right side, and then merge them back together (kind of like binary search). There is not much difference between the worst and average case of merge sort. Similar to quick sort, the run-time is O(nlog(n)); one thing to note is that the worst-case run-time of merge sort is the same as the best/average run-time of quick sort.
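
And a sketch of merge sort, with a helper function that merges two sorted lists (my own version of the idea):

def merge_sort(lst):
    """Return a new list containing the items of lst in sorted order."""
    if len(lst) < 2:
        return lst[:]
    mid = len(lst) // 2
    # sort each half, then merge the two sorted halves
    left = merge_sort(lst[:mid])
    right = merge_sort(lst[mid:])
    return merge(left, right)

def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # one of the two slices below is empty, the other holds the leftovers
    return result + left[i:] + right[j:]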