When it comes to sorting, Quicksort and Heapsort are among the most popular choices, since they both have an average complexity of O(n logn) and are reasonably easy to implement. Even though their average time complexity is similar, both being in the log-linear family, there are many differences between the two algorithms that our analysis will point out. But first, let's start by understanding how each of them works.
Quicksort
Quicksort is a sorting algorithm developed by Tony Hoare in 1959 and published in 1961. Its optimized versions are still used today. It is a comparison sort, based on a divide and conquer approach, which, if implemented correctly, has the following time complexities:
- O(n * logn) on average
- Θ(n * logn) best case
- Θ(n^2) worst case
Also, quicksort is an in-place algorithm. An in-place algorithm uses no auxiliary data structure, only a small amount of extra storage space for auxiliary variables. Quicksort fits this description: because of its recursive nature, it uses only a small auxiliary stack.
The Idea
As mentioned before, quicksort is a divide and conquer algorithm. With the input being an array with n elements A[1…n], quicksort works in the following way:
DIVIDE
Pick an element, called a pivot, from the array and partition (re-arrange the elements of) array A so that for A[position] (where position is the index of the pivot element, so A[position] = pivot):
- All elements on the left of position (A[1…position-1]) are less than or equal to the pivot (i.e. A[position])
- All elements on the right of position (A[position+1…n]) are greater than or equal to the pivot (i.e. A[position])
CONQUER
Sort the sub-arrays A[1…position-1] and A[position+1…n] by recursive executions of step 1.
COMBINE
Join the sorted sub-arrays to obtain the sorted array.
Why does it work?
At every step, we choose a pivot and we partition the array. This actually puts the pivot in its final position in the sorted array, because in a sorted array, for each element, all the elements on its left are less than or equal to it, and all the elements on its right are greater than or equal to it. Through the recursive calls, each element will eventually be picked as the pivot, so every element will be placed in its correct position. Now the trick is to find a way to achieve this in an efficient manner, and that is what the partition function does.
Implementation
The quicksort method is pretty straightforward; we will consider a Python implementation:
def quickSort (A, left, right):
    # The base case for quicksort is when the array only has one element or none, which is by definition sorted,
    # so in that case, we have nothing to do.
    # That happens only if left >= right, so we are outside the base case if left < right (a simple negation of the predicate left >= right)
    if (left < right):
        position = partition (A, left, right)
        quickSort (A, left, position-1)
        quickSort (A, position+1, right)
The Partition Function
Partition is the central sorting operation of quicksort. There are two widely used partitioning schemes: Lomuto's partition scheme and Hoare's partition scheme. We will only consider Lomuto's partition scheme, even though its performance is worse than that of Hoare's scheme, because it is more compact and easier to understand.
Lomuto’s Partition Scheme
This scheme usually picks the last element in A as the pivot. It is easier to understand and therefore it is used in many tutorials. The idea is as follows:
Starting from the leftmost element, we keep track of the index of the first element that is greater than or equal to the pivot as i. We traverse the array with another index j. If at a certain point, A[j] is smaller than the pivot, we swap A[j] with A[i] and increase the value of i, otherwise we ignore the current element. In other words, we keep track of index i such that all the elements from left to i-1 are smaller than our pivot, and elements from i to j are greater than or equal to the pivot.
Why does it work?
As our index i goes through the array from left to right, no element greater than or equal to the pivot remains before index i, because whenever an element smaller than the pivot is identified, it is swapped into position i and i is then increased, obtaining our desired behaviour. After going through the whole array, we also swap the pivot with the element at index i, to place it in its correct position. Let's see the implementation for this:
# Partition returns an index i for which
# all elements A[left...i-1] are smaller than A[i] and
# all elements A[i+1...right] are greater than or equal to A[i]
# In other words, the element A[i] is now in its correct position in the sorted array
def partition (A, left, right):
    pivot = A[right]
    i = left
    for j in range (left, right):
        if (A[j] < pivot):
            A[i], A[j] = A[j], A[i]
            i += 1
    A[i], A[right] = A[right], A[i]
    return i
Let’s consider an example:
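For instance, a minimal usage sketch of the functions above, run on a small array of arbitrary values, would look like this:

A = [7, 2, 9, 4, 1, 8]
quickSort (A, 0, len(A) - 1)  # left and right are the first and last indices of the array
print (A)                     # [1, 2, 4, 7, 8, 9]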
Performance
The performance is affected by the choice of the pivot element during partitioning, because it can lead to either a balanced or an unbalanced split.
- Worst case, Θ(n^2), happens when partitioning is always completely unbalanced (i.e. when the choice of the pivot generates sub-arrays that always have n-1 and 0 elements), for example when the array is already sorted, or, in our implementation, when all the elements in the array are equal.
- Best case, Θ(n logn), happens when partitioning is always fairly balanced (i.e. the choice of the pivot generates sub-arrays that always have [n/2] and [n/2]-1 elements).
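To see the worst case concretely with the implementation above, an already-sorted array can be used as a small sketch (the recursion limit has to be raised, because every call recurses on a sub-array of n-1 elements, so the recursion depth grows linearly):

import sys
sys.setrecursionlimit(5000)

A = list(range(2000))         # already sorted: the last element is always the largest, a worst-case pivot
quickSort (A, 0, len(A) - 1)  # every partition produces sub-arrays of n-1 and 0 elements, giving Θ(n^2) behavior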
Heapsort
Heapsort is a comparison-based sorting algorithm that can be considered an improved selection sort: it also divides its input into one sorted and one unsorted region, iteratively shrinking the unsorted region by extracting the largest element and moving it to the sorted region. The improvement comes from using a binary heap, rather than a linear scan of an array, to find the maximum: extracting the maximum from a heap takes logarithmic time, compared to the linear time of scanning an array.
Like quicksort, heapsort is an in-place algorithm.
The Idea
Given an array A, first the algorithm converts it into a max-heap. Let's say that the size of our array is n. We will use an index named crt to store how many elements we are currently considering from the max-heap (initially its value is equal to n). The algorithm repeatedly swaps the first value of the array with the value at A[crt], decrements crt, and then calls the sink method on the heap, until crt becomes 1. To make it more clear, the algorithm follows these steps:
- Build the max-heap from the array A.
- Swap A[1] with A[crt] and decrease crt (which represents the number of elements that we currently consider from our max-heap) by 1.
- Call the sink function to restore heap order.
- Go back to step 2 if the number of elements in the max-heap is greater than 1.
Why does it work?
Remember the association with selection sort from the beginning of the section. It said that heapsort divides its input into one sorted and one unsorted region. This is done by the crt index. Initially its value is n, so we are considering the range [1…n]. In other words, initially, the whole array is unsorted (i.e. no element is placed in its correct position).
We know that at each step, the first element of our array (A[1]) stores the largest value in the range that we are considering (i.e. in the range 1 to crt), because the array A represents a max-heap. So A[1] >= A[1…crt] at each step. Now, the largest value in the range [1…crt] in a sorted array should be placed at index crt. Therefore, we swap A[1] with A[crt].
Currently we know that element A[crt] is in its correct place, so we decrease the range of values that are unsorted (decrease the value of crt by 1) and restore the heap order for elements from 1 to the new crt and keep applying this procedure until we get all the values sorted.
What Is a Binary Heap?
A binary heap is a complete binary tree that satisfies the heap property: in a max-heap, the key in each node is greater than or equal to the keys in that node's children (if any) and in a min-heap, the key in each node is less than or equal to the keys in that node's children (if any). In a complete binary tree, each node has at most two children and every level, except possibly the last, is completely filled from left to right. Note that in a max-heap, the root (A[1] in the array representation) is the largest value, and in a min-heap, the root is the smallest value.
A heap can be easily represented by an array. If the parent node is stored at index i, then, because the heap is a binary tree, the left child can be found at index 2 * i and the right child at 2 * i + 1 (assuming that we are not using index 0, so our indexing starts from 1), and for a node with the index i, its parent is at index [i/2]. Let's see an example of a max-heap and its representation as an array that will help us visualize the relation between indices and child-parent relations.
We can now see the reason why heapsort is an in-place algorithm: Max-heaps can be implemented as an implicit data structure, storing keys in an array and using their relative positions within that array to represent child-parent relationships.
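In code, these 1-indexed parent/child relations might look like the following sketch (the helper names are purely illustrative and are not used in the implementation below):

def parent (i):
    return i // 2      # integer division gives the floor of i/2

def leftChild (i):
    return 2 * i

def rightChild (i):
    return 2 * i + 1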
There are two main operations on a binary heap: insertion and extraction (or deletion) of the root, both having a complexity of O(logn). After any of these two, the heap property can be violated and certain operations must be implemented to correct such violations. This process of maintaining the heap property is called reheapifying or restoring heap order:
- When a new element is added to the heap, we travel up the heap to restore the order.
- When the root is deleted, we travel down the heap to restore the order.
For the first case, the solution is called bottom-up reheapify or swim.
Bottom-Up Reheapify (Swim)
The algorithm for inserting has the following steps:
- Add the new element to the bottom level of the heap.
- Compare it with its parent; if they are in the correct order (i.e. the value of the new element is <= the value of its parent for a max-heap, or >= the value of its parent for a min-heap), stop.
- If not, swap the element with its parent and return to step 2.
Note: The source of this algorithm is Wikipedia.
Let’s say that we want to add the value 17 to our heap:
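A minimal sketch of this insertion procedure, using the same 1-indexed array convention as the rest of this article, might look like the code below (the function name swim and the example values are only illustrative; the heapsort implementation later on does not need this function, since it relies only on sink):

def swim (array, k):
    # While k is not the root and its parent is smaller, move the new element up
    while (k > 1 and array[k // 2] < array[k]):
        array[k // 2], array[k] = array[k], array[k // 2]
        k = k // 2

# Example: appending 17 to a small max-heap and restoring heap order
heap = [None, 16, 11, 9, 10, 5]  # index 0 is unused
heap.append(17)
swim (heap, len(heap) - 1)       # 17 travels up; the heap becomes [None, 17, 11, 16, 10, 5, 9]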
Top-Down Reheapify (Sink)
In the case of deleting the root, the operation is called top-down reheapify or sink, and can be achieved by:
- Replace the root of the heap with the last element from the last level in the heap.
- Compare it to its children. If they are in correct order (i.e. greater than both children in a max-heap, and smaller than both in a min-heap), then stop.
- Else, swap the element with one of its children and go back to 2. (In case of a max-heap swap with its larger child, and in case of a min-heap swap with its smaller child).
Note: The source of this algorithm is Wikipedia.
A Python implementation for this algorithm can be:
# Sink actually implements steps 2 and 3 of the algorithm
def sink (array, k, n):
    while (2 * k <= n):  # While k is not a leaf
        j = 2 * k  # j is the index of the left child
        if (j < n and array[j] < array[j+1]):  # if there is a right child, check to see which one is the largest
            j += 1  # if the right one is larger, then we consider it, so we increase j by 1
        if (array[k] > array[j]):  # if the node is greater than the largest child, then we stop
            break
        array[k], array[j] = array[j], array[k]  # else, we swap it with its largest child and continue down the heap, so we give k the value of j
        k = j
And to actually delete the root, considering that array is our array representing a max-heap, with size n, one would do:
array[1], array[n] = array[n], array[1] # step one
n -= 1 # decrement the number of elements in the heap
sink (array, 1, n) # implementation of steps 2 and 3
Let’s see an example on our original heap:
Building the Heap
The most intuitive method of building a heap is simply taking each of the n input elements and inserting it into an initially empty heap. We know that insertion has O(logn) complexity, so this method of constructing a heap has O(n * logn) complexity, since we perform an insertion n times (once for each element). However, there is a better way of doing this, called Floyd's method.
Floyd’s Method
This method runs in O(n) time and only uses sink to create the heap, avoiding the need to implement the swim function. It is based on the observation that every leaf is already a heap by itself. The algorithm then begins to add parents to the leaves, using the sink function to make sure that the heap property is respected. Since a heap is a complete binary tree, roughly half of the nodes are leaves, so the algorithm starts from index [n/2] (the last node that has children) and goes towards the first element, at each step applying the sink function to 'merge' heaps into bigger but still valid heaps, until the whole heap is completed. An implementation can be found below:
def createHeap (array):
    # Because in Python indexing starts from 0, if we want to build a heap with elements at
    # indices [1-5] we must use an array of 6 elements,
    # but our n must be equal to 5, which is the length - 1; that is why n takes the value of len(array) - 1
    n = len(array) - 1
    for k in range (n//2, 0, -1):
        sink (array, k, n)
A good visualization of this process can be found here. Note that our sink function is the equivalent of the max-heapify function from the video.
Now we can put all the pieces together to create our heapsort implementation:
def heapSort (array):
    n = len(array) - 1                  # Step 1 - building the max-heap
    for k in range (n//2, 0, -1):       # Step 1 - building the max-heap
        sink (array, k, n)              # Step 1 - building the max-heap
    crt = n
    while (crt > 1):                                  # Step 2
        array[1], array[crt] = array[crt], array[1]   # Step 2
        crt -= 1                                      # Step 2
        sink (array, 1, crt)                          # Step 3
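Since this implementation is 1-indexed, index 0 of the array is just a placeholder and is never touched. A minimal usage sketch with arbitrary values:

data = [None, 5, 3, 8, 1, 9, 2]  # index 0 is unused; the heap lives in indices 1..n
heapSort (data)
print (data[1:])                 # [1, 2, 3, 5, 8, 9]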
Performance
Heapsort's performance is affected by the heap construction algorithm, but most importantly by caching: once the array gets bigger than the CPU cache, a lot of cache misses occur, because heapsort does not access contiguous memory locations (it often jumps from an index i to 2*i, for example). However, it is important to note that the complexities of heapsort are:
- O(n * logn) on average
- O(n * logn) best case
- O(n * logn) worst case
Now that we know how the algorithms work and we have seen potential implementations, let’s start analyzing the differences between them.
Comparison Method
We are going to compare the performance of quicksort and heapsort for various input sizes, using two criteria: run time and the number of comparisons and swaps.
We will use graphs to visualize the differences between the two algorithms. In order to generate the graphs, the following procedure was used:
- For each run of the experiment, 8 different input sizes were chosen in the range [100 000, 10 000 000]. The sizes remained the same for all of the runs.
- For each input size, an array of random integers in the range (1, 10^12) was generated. It is important to specify that this range is reasonably large, to compensate for the fact that random integers are used rather than random real numbers. (As discussed in the section that inspects running time, the size of the interval of random integers is relevant for the comparison.)
- Both quicksort and heapsort were applied to the same array generated at the previous step, and the execution time and the number of comparisons and swaps were recorded and plotted for each run.
- The experiment was repeated 10 times. The average time and number of comparisons and swaps were calculated for each of the input sizes and plotted together with the standard deviation.
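A stripped-down sketch of this procedure (timing only; this hypothetical helper omits the comparison/swap counters and the averaging over the 10 runs described above):

import random
import time

def timeSort (sortCall, n, low=1, high=10**12):
    # Generate one array of n random integers in [low, high] and time a single in-place sort of it
    data = [random.randint(low, high) for _ in range(n)]
    start = time.perf_counter()
    sortCall (data)
    return time.perf_counter() - start

# Example calls (heapSort expects a 1-indexed array, so a placeholder is prepended)
quickTime = timeSort (lambda a: quickSort (a, 0, len(a) - 1), 100000)
heapTime = timeSort (lambda a: heapSort ([None] + a), 100000)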
The graphs for each run can be found here, and the code, which contains the implementation of quicksort and heapsort that I explained in the first sections, can be found here. Let’s now begin analyzing the results.
We can summarize the graphs into two tables that show the actual values of the averages and deviations.
As stated before, the comparison of the two sorting algorithms is based on two criteria: running time and the number of comparisons and swaps. Let’s first consider the number of comparisons and swaps.
Comparisons and Swaps
By simply looking at the graphs for each run, it can be noticed that quicksort performs fewer comparisons and swaps than heapsort. One can also observe that as the input size gets bigger, the difference between the numbers of comparisons and swaps increases as well. Both algorithms have an average complexity of O(n * logn), so the fact that this gap grows with the input size suggests that quicksort has smaller constant factors than heapsort. These two simple remarks are supported by the graph of the averages and by the tables. It can be clearly seen that on random inputs, quicksort tends to perform better than heapsort.
By taking a closer look at the graphs and the tables above, an interesting phenomenon emerges. The standard deviation of the number of comparisons and swaps for heapsort is extremely small, as seen in Table 2, and is not even visible on the average graph. That shows that, on average, heapsort has a nearly constant number of comparisons and swaps, which suggests that its performance is affected not so much by the actual values of the input array as by the size of the array. On the other hand, in the case of quicksort, the number of comparisons and swaps varies, showing that the values of the input array have a much more noticeable effect on its performance. This observation will be detailed later in this section. In order to support this statement, the following graphs show the average number of comparisons and the average number of swaps independently, based on the same inputs.
As seen in the above graphs, the same behavior applies when comparing the numbers of swaps and comparisons separately. A factor that contributes to the notable difference in the number of swaps is that in the case of heapsort, all the elements in the array are going to be swapped in the process of ordering, whilst quicksort does not swap elements that are already in the correct position.
Running Time
There is a direct link between the number of comparisons and swaps and the running time of the two algorithms. Even though comparisons may not have an important effect on running time, swaps are considerably more time consuming and tend to impact the execution time. By looking at Figure 1.4, one can notice that heapsort performs, on average, about 40% more swaps for an array of 10M elements, which significantly increases the running time. The correlation between the number of swaps and the running time can be noticed by looking at Figure 1.1, which shows that for an array of 10M elements, quicksort runs 4 to 5 times faster than heapsort, a trend that also holds for input sizes greater than 5M.
Another factor that causes quicksort to perform better than heapsort is the way in which the algorithms interact with the cache memory. Quicksort has a sequential access pattern, accessing contiguous memory locations, allowing a very effective use of caching and therefore speeding up the execution of the algorithm. On the other hand, heapsort does not have such an access pattern. In fact, heapsort jumps from one memory address to another, causing a large number of cache misses once the size of the input exceeds the size of the CPU cache.
So far, the running time comparison only took into consideration the case when the input array has random values (i.e. the average case for both quicksort and heapsort). It is important to also consider the worst case of the algorithms. For heapsort, the worst case is still Θ(n * logn). For quicksort, the worst-case complexity is Θ(n^2). As mentioned before, the worst case occurs when partitioning is always completely unbalanced. An example that causes quicksort to run in Θ(n^2) is when the array is already ordered. However, the worst case can be avoided by changing the pivot from the rightmost element to the middle element. This is the idea behind Hoare's partition scheme we mentioned earlier. A detailed explanation of it can be found here.
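As a concrete sketch of this kind of fix, applied to the Lomuto partition shown earlier, one common alternative mitigation is a randomized pivot: a randomly chosen element is swapped to the right end before partitioning (the function name randomizedQuickSort is only illustrative; this is not the middle-element/Hoare approach linked above):

import random

def randomizedQuickSort (A, left, right):
    # Same structure as quickSort above, but a random element is moved to A[right],
    # so partition, which always takes the rightmost element, ends up with a random pivot
    if (left < right):
        p = random.randint(left, right)
        A[p], A[right] = A[right], A[p]
        position = partition (A, left, right)
        randomizedQuickSort (A, left, position - 1)
        randomizedQuickSort (A, position + 1, right)

With a random pivot, an already-sorted input no longer produces systematically unbalanced partitions, although, with this partition scheme, arrays containing many equal elements can still degrade towards the quadratic behaviour discussed next.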
Another case that causes quicksort to run slower than average is the sorting of integers drawn from a small range. If the range of integers is too small, the input array will contain many repetitions, and the partition function will create unbalanced sub-arrays, causing quicksort to run considerably slower. An example illustrating this behavior can be seen in the graphs below. The graphs represent the average of ten runs of quicksort in which the random range was set to [1, 10^12] for the blue graph, while the red graph had a range of only [1, 10^4].
Having a considerable amount of repetitions in the input array causes heapsort to perform better than quicksort, as seen in Figures 1.7 and 1.8, which represent the average of 10 runs of quicksort and heapsort with the input array containing integers in the range [1, 10^4].
Summary
To summarize, both quicksort and heapsort have an average time complexity of O(n * logn), as can be seen from the graphs, which follow an n log(n) curve; quicksort performs better than heapsort on random inputs that contain a very small number of repetitions, due to better use of caching and fewer swapping operations. However, if the input contains many repetitions, or if the input array is already sorted, quicksort's performance drops significantly, causing it to be slower than heapsort. A change in the pivot element eliminates the possibility of the Θ(n^2) worst case for quicksort. Therefore, one would choose quicksort if the input data is guaranteed to have few repetitions and to be in random order, not already sorted, whilst heapsort's upper bound of O(n * logn) makes it a choice when there is no such guarantee about the input data. It is worth mentioning that improved versions of quicksort are used very often; the improvements deal with the cases of the array being already sorted or containing repetitions, and, because of how well quicksort interacts with caching, these optimized versions are usually preferred over heapsort, since, as we have seen, quicksort tends to perform better than heapsort.