I had written about sorting algorithms (Tag: Sorting) with details about what to look out for along with their code snippets but I wanted a do a quick comparison of all the algos together to see how do they perform when the same set of input is provided to them. Hence I started working on a simple implementation for each one of them. I have now put together all of them in a single project on GitHub. I ensured that they all have the same set of procedures during their run. Some of the items I wanted to ensure was:
- Same input array
- Same number of iterations. Each iteration having the same input
- Each algo being timed the exact same way as another
Some of the algorithms being tested were:
How did we ensure Equality?
- Created a simple base class for all algorithms:
- Responsible for benchmarking everything
- Provide a function called
that would allow derived classes to implement their algorithm
- Ensures that every algorithm has a name and description – to help us distinguish
- Another class to help manage the testing of all the algorithms:
- All instances are created here for the algorithms
- The input array is provided by this class to all algorithms
As usual the code for the project is available here:
It can be run using Visual Studio without any changes.
Heap Sort algo has the following properties:
- The top element (root) is always the next in order
- This allows you to remove one element at a time (the root) and ensure that you are pulling out items in a sorted order
- Always takes O(n*log(n)) time – worst case or best case
- Simple implementations require additional space to hold heap of size n
- Hence space requirement is double of array size n
- Not included in big-O notation so something to keep in mind
- Not a stable sort
- Original order of equal values may not be maintained
Let’s look at three versions of Heap Sort and see how they compare against each other. We will also find out what differentiates them from each other.
Quick Sort is an efficient divide and conquer algorithm performed in two phases – partition and sorting phase.
Here are few pointers to remember about Quick Sort:
- Partitioning places all the elements less than the pivot in the left part of the array and greater elements in the right part
- Pivot element stays in its place
- After partitioning no element moves to the other side, of the pivot
- This allows you to sort the elements, to the left or right of the pivot, independent of the other side
- Complexity is O(n log n)
- Often fast for small arrays with a few distinct values, repeated many times
- It is a conquer-and-divide algo; with most of the work happening during partitioning phase
- If you had to choose the optimum pivot then it should the median of the given array
- Not a stable sort
Insertion Sort has the following properties:
- It works by moving elements one at a time
- Works really well for small data sets
- Consider going with this when the input data may already be sorted or partially sorted
- The may not have to move the elements around, thereby saving precious cycles
- Stable sort
- Keeps the original order of elements with equal values
It is a non-comparative integer sorting algorithm. It sorts data by grouping keys by the individual digits which share the same significant position and value. Think Tens, Hundreds, Thousands etc. Some pointers about Radix Sort:
- Even though it is an integer sorting algorithm, it is not restricted just to integers. Integers can also represent strings of characters
- Two types of radix sort are:
- LSD (Least Significant Digit): Short keys come before long keys
- MSD (Most Significant Digit) Sorting: Lexicographic Order. Better for strings.
- Uses Counting Sort as Sub Routine (which takes extra memory)
- If memory is not really a concern, forget about this issue
- Radix Sort, with at most d digits, takes O(d*(n+b)) time where b is the base
- Use Radix Sort over Counting Sort when the numbers range from 1 to n-square for example.
Counting Sort is an integer sorting algorithm. It is not very famous when somebody talks about sorting algorithms but it is great when sorting integers. In fact, many a times it may even beat other Sorting Algorithms. The highlight of Counting Sort is that it creates a bucket array (to keep track of frequency of numbers) whose size is the maximum element in the provided array.
We are looking to compare most of the sorting algorithms to find out which one performs better under different circumstances. One of the ways is to compare the complexity for each algorithm. The other way is to compare how well they perform based on the input they are all provided. I will post my code on Github but will start with Counting Sort here.
Here are some pointers about Counting Sort:
- Linear sorting algo with O(n+k) when the elements range from 1 to k
- It is really good when the numbers range from 1 to n which means the max between them is not going to be very high
How do we compare all Sorting Algorithms?
We are going to look at the following scenarios across all Sorting Algorithms:
- 10 numbers – Random + Sorted
- 1000 numbers (1K) – Random + Sorted
- 1,000,000 numbers (1M) – Random + Sorted
- 20000 iterations for each one of these scenarios
- For good measure we will look at the average time required to sort 100 numbers in each one of these cases
- This will allow us to compare all algorithms against each other by looking at the average times
- It means we will be multiplying the results of 10 numbers by 10 (to get the “average” for 100 numbers) and also