For the sake of simplicity (and because it relates to binary search trees, which we will treat later), we shall only treat the case of sorting lists without repeated elements. (See the end of this section for further discussion of this point.)
As is well known, quicksort has quadratic worst-case performance if the pivot is chosen poorly. Using the true median as the pivot would solve this, but is impractical. Instead, a simple alternative is to choose the pivot randomly, which is the variant that we shall analyse first.
3.1 Randomised Quicksort
Intuitively, the good performance of randomised quicksort can be explained by the fact that a random pivot will usually not be among the most extreme values of the list, but somewhere in the middle, which means that, on average, the size of the lists is reduced significantly in every recursion step.
To make this more rigorous, let us look at the definition of the algorithm in Isabelle. Note that the algorithm returns not only the result, but a pair of the result and the number of comparisons that it made in total:
Here, \({\mathbin {@}}\) denotes list concatenation and \(\textit{xs}\mathbin {!}i\) denotes the ith element of the list \(\textit{xs}\), where \(0 \le i < |\textit{xs}|\). The delete_index function removes the ith element of a list, and the parameter R is a linear ordering represented as a set of pairs.
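For illustration, the algorithm can be sketched in Python as follows. This is a hypothetical analogue of the Isabelle definition, not the formalised code itself: the name \(\textit{rquicksort}\) is reused for readability, but the ordering \(R\) is fixed to the built-in \(<\), and elements are assumed distinct.

```python
import random

def rquicksort(xs):
    """Randomised quicksort on a list of distinct elements, returning a
    pair of the sorted list and the number of comparisons made.
    (A hedged sketch of the Isabelle definition, with the ordering R
    fixed to `<`.)"""
    if not xs:
        return [], 0
    i = random.randrange(len(xs))      # choose the pivot index uniformly
    pivot = xs[i]
    rest = xs[:i] + xs[i + 1:]         # delete_index i xs
    # Each element of `rest` is compared against the pivot once, giving
    # the |xs| - 1 comparisons of one partitioning step.
    ls = [y for y in rest if y < pivot]
    rs = [y for y in rest if not y < pivot]
    ls_sorted, m = rquicksort(ls)
    rs_sorted, n = rquicksort(rs)
    return ls_sorted + [pivot] + rs_sorted, len(xs) - 1 + m + n
```

Note that the comparison count is independent of which pivots are drawn only in the base and partitioning structure; the total count is a random variable, which is exactly what the analysis below studies.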
It is easy to prove that all of the lists that can be returned by the algorithm are sorted w. r. t. \(R\). The base case makes 0 comparisons and the recursive case makes \(|\textit{xs}| - 1 + m + n\) comparisons, where \(m\) and \(n\) are the numbers of comparisons made by the two recursive calls. This could easily be encapsulated in a resource monad (as we have done elsewhere [30] for more complex code), but it is not worth the effort in this case.
For an element x and some list \(\textit{xs}\), we call the number of elements of \(\textit{xs}\) that are smaller than x the rank of x w. r. t. \(\textit{xs}\). In lists with distinct elements, each element can clearly be uniquely identified by either its index in the list or its rank w. r. t. that list, so choosing an element uniformly at random, choosing an index uniformly at random, or choosing a rank uniformly at random are all interchangeable.
In the above algorithm, the length of \(\textit{ls}\) is simply the rank r of the pivot, and the length of \(\textit{rs}\) is simply \(|\textit{xs}| - 1 - r\), so choosing the pivot uniformly at random means that the length of \(\textit{ls}\) is also distributed uniformly between 0 and \(|\textit{xs}| - 1\). From this, we can see that the distribution of the number of comparisons does not actually depend on the content of the list or the ordering R at all, but only on the length of the list, and we can find the following recurrence for it:
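This recurrence can be mirrored directly in Python as a distribution over comparison counts, using exact rational probabilities. The function name \(\textit{rqs}\_\textit{cost}\) is borrowed from the paper, but the code is our sketch, not the Isabelle definition:

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def rqs_cost(n):
    """Distribution of the number of comparisons for a list of length n,
    as a tuple of (cost, probability) pairs.  The pivot rank r is uniform
    on {0, ..., n-1}; the recursive calls then have sizes r and n-1-r.
    (A sketch mirroring the recurrence, not the Isabelle definition.)"""
    if n == 0:
        return ((0, Fraction(1)),)
    dist = {}
    for r in range(n):                   # pivot rank, each with prob. 1/n
        for c1, p1 in rqs_cost(r):
            for c2, p2 in rqs_cost(n - 1 - r):
                c = n - 1 + c1 + c2      # n - 1 comparisons to partition
                dist[c] = dist.get(c, Fraction(0)) + p1 * p2 * Fraction(1, n)
    return tuple(sorted(dist.items()))
```

For example, for a three-element list the cost is 2 with probability 1/3 (median pivot) and 3 with probability 2/3 (extremal pivot).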
We then have the following relationship between \(\text {rquicksort} \) and \(\text {rqs}\_\text {cost}\):
In other words: Projecting out the number of comparisons from our cost-aware randomised quicksort yields the distribution given by \(\textit{rqs}\_\textit{cost}\).
Next, we will attempt to determine the expected value of \(\textit{rqs}\_\textit{cost}\), which we shall denote by \(Q(n)\). The recursive definition of \(\textit{rqs}\_\textit{cost}\) directly leads us to the recurrence
$$\begin{aligned} Q(n+1) = n + \frac{1}{n+1}\left( \sum _{i=0}^n Q(i) + Q(n - i)\right) \ , \end{aligned}$$
or, equivalently,
$$\begin{aligned} Q(n+1) = n + \frac{2}{n+1}\left( \sum _{i=0}^n Q(i)\right) \ . \end{aligned}$$
This is often called the quicksort recurrence. Cichoń [6] gave a simple way of solving it by turning it into a linear recurrence: multiplying both sides by \(n+1\), subtracting the corresponding identity for \(n\,Q(n)\), and dividing by \((n+1)(n+2)\) yields
$$\begin{aligned} \frac{Q(n+1)}{n+2} = \frac{2n}{(n+1)(n+2)} + \frac{Q(n)}{n+1}\ , \end{aligned}$$
which gives us
$$\begin{aligned} \frac{Q(n)}{n+1} = 2\sum _{k=1}^n \frac{k-1}{k(k+1)} = 4 H_{n+1} - 2 H_n - 4 \end{aligned}$$
by telescoping and thereby the closed-form solution
$$\begin{aligned} Q(n) = 2 (n+1) H_n - 4n\ , \end{aligned}$$
where \(H_n\) is the \(n\)th harmonic number. We can use the well-known asymptotics \(H_n \sim \ln n + \gamma \) (where \(\gamma \approx 0.5772\) is the Euler–Mascheroni constant) from the Isabelle library and obtain that the expected number of comparisons fulfils \(Q(n) \sim 2 n \ln n\).
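The telescoping and the closed form can be checked numerically. The following sketch (our code, not part of the formalisation) computes \(Q(n)\) exactly from the quicksort recurrence and compares it with \(2(n+1)H_n - 4n\):

```python
from fractions import Fraction

def harmonic(n):
    """The nth harmonic number H_n as an exact rational."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

def Q_closed(n):
    """Closed form Q(n) = 2 (n+1) H_n - 4 n."""
    return 2 * (n + 1) * harmonic(n) - 4 * n

def Q_recurrence(N):
    """Q(0), ..., Q(N) via Q(n+1) = n + 2/(n+1) * (Q(0) + ... + Q(n))."""
    Q = [Fraction(0)]
    for n in range(N):
        Q.append(n + Fraction(2, n + 1) * sum(Q))
    return Q
```

For instance, both computations give \(Q(3) = 8/3\), and they agree exactly for every \(n\) up to any bound one cares to check.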
Remember, however, that we only considered lists with no repeated elements. If there are repeated elements, the performance of the above algorithm can deteriorate to quadratic time. This can be fixed easily by using a three-way partitioning function instead, although this makes things slightly more complicated, since the number of comparisons made now depends on the content of the list and not just its length. The only real difference in the cost analysis is that the lists in the recursive calls no longer simply have lengths \(r\) and \(n - r - 1\), but can also be shorter if the pivot is contained in the list more than once. We can still show that the expected number of comparisons is at most \(Q(n)\) in much the same way as before (and our entry [9] in the Archive of Formal Proofs does contain that proof), but we shall not go into more detail here.
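A three-way variant can be sketched as follows (our Python sketch with a hypothetical name, not the AFP formalisation; we charge one three-way comparison per non-pivot element, so a partitioning step on a list of length \(k\) costs \(k - 1\)):

```python
import random

def rquicksort3(xs):
    """Randomised quicksort with three-way partitioning, returning the
    sorted list and a comparison count.  The recursive calls receive
    strictly fewer elements whenever the pivot occurs more than once,
    since its duplicates go into `es` rather than `ls` or `rs`.
    (A hedged sketch, not the formalised algorithm.)"""
    if not xs:
        return [], 0
    i = random.randrange(len(xs))
    pivot = xs[i]
    rest = xs[:i] + xs[i + 1:]
    ls = [y for y in rest if y < pivot]
    es = [y for y in rest if y == pivot]   # duplicates of the pivot
    rs = [y for y in rest if y > pivot]
    ls_sorted, m = rquicksort3(ls)
    rs_sorted, n = rquicksort3(rs)
    return ls_sorted + [pivot] + es + rs_sorted, len(rest) + m + n
```

On a list with all elements equal, this makes only \(n-1\) comparisons in total, whereas the two-way version from before would recurse on a sublist of length \(n-1\) and degrade to quadratic cost.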
Comparing our proof to those in the literature, note that both Cormen et al. [7] and Knuth [24] also restrict their analysis to distinct elements. Cormen et al. use a non-compositional approach with indicator variables and only derive the logarithmic upper bound, whereas Knuth’s analysis counts the detailed number of different operations made by a particular implementation of the algorithm in MIX. His general approach is very similar to the one presented here.
3.2 Average-Case of Non-Randomised Quicksort
The above results carry over directly to the average-case analysis of non-randomised quicksort (again, we will only consider lists with distinct elements). Here, the pivot is chosen deterministically; we always choose the first element for simplicity. This gives us the following definitions of quicksort and its cost:
Interestingly, the number of comparisons made on a randomly permuted input list has exactly the same distribution as the number of comparisons in randomised quicksort from before. The underlying idea is that when randomly permuting the input, the randomness can be ‘deferred’ to the first point where an element is actually inspected, which means that choosing the first element of a randomly permuted list as the pivot is equivalent to choosing the pivot uniformly at random.
The formal proof of this starts by noting that choosing a random permutation of a non-empty finite set A is the same as first choosing the first list element \(x\in A\) uniformly at random and then choosing a random permutation of \(A\setminus \{x\}\) as the remainder of the list, allowing us to pull out the pivot selection. Then, we note that taking a random permutation of \(A\setminus \{x\}\) and partitioning it into elements that are smaller and bigger than x is the same as first partitioning the set \(A\setminus \{x\}\) into \(\{y\in A\setminus \{x\} \mid (y,x)\in R\}\) and \(\{y\in A\setminus \{x\}\mid (y,x)\notin R\}\) and choosing independent random permutations of these sets.
This last step, which interchanges partitioning and drawing a random permutation, is probably the most crucial one and one that we will need again later, so we present the corresponding lemma in full here. Let \(\textit{partition}\ P\ \textit{xs}\) be the function that splits the list \(\textit{xs}\) into the pair of sub-sequences that satisfy (resp. do not satisfy) the predicate P. Then, we have:
Here, the function \(\textit{permutations}\_\textit{of}\_\textit{set}\ A\) returns the set of all permutations of the given finite set A, i. e. the set of all lists that contain each element of A exactly once. The lemma is easily proven directly by extensionality, i. e. fixing permutations \(\textit{xs}\) of \(\{x\in A \mid P\ x\}\) and \(\textit{ys}\) of \(\{x\in A \mid \lnot P\ x\}\), computing their probabilities in the two distributions and noting that they are the same.
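The lemma can be checked exhaustively for a small example. In the following sketch (our code; the set \(A\) and the predicate \(P\) are arbitrary choices for illustration), the left-hand side partitions a uniformly random permutation of \(A\), and the right-hand side draws independent uniform permutations of the two parts:

```python
from itertools import permutations
from fractions import Fraction
from collections import Counter

A = {1, 2, 3, 4}            # an arbitrary small finite set
P = lambda x: x % 2 == 0    # an arbitrary predicate

# Left-hand side: draw a uniform permutation of A, then partition it.
perms = list(permutations(sorted(A)))
lhs = Counter()
for xs in perms:
    pair = (tuple(x for x in xs if P(x)), tuple(x for x in xs if not P(x)))
    lhs[pair] += Fraction(1, len(perms))

# Right-hand side: independent uniform permutations of the two parts.
ps = list(permutations(sorted(x for x in A if P(x))))
qs = list(permutations(sorted(x for x in A if not P(x))))
rhs = Counter()
for ys in ps:
    for zs in qs:
        rhs[(ys, zs)] += Fraction(1, len(ps) * len(qs))

assert lhs == rhs           # the two distributions coincide
```

Here every pair of sub-permutations arises from the same number of interleavings, which is exactly why the probabilities match.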
With this, the proof of the following theorem is just a straightforward induction on the recursive definition of \(\textit{rqs}\_\textit{cost}\):
Thus, the cost distribution of deterministic quicksort applied to a randomly permuted list (with distinct elements) is the same as that of randomised quicksort applied to any list of the same size (with distinct elements). In particular, the asymptotic results about the expectation of \(\textit{rqs}\_\textit{cost}\) carry over directly.
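As a sanity check of this correspondence, the two distributions can be compared exhaustively for small list lengths. The following sketch (our code, using a hypothetical first-element-pivot quicksort and the cost recurrence from before) enumerates all permutations of a small distinct-element list:

```python
from itertools import permutations
from fractions import Fraction
from functools import lru_cache
from collections import Counter

def qs_cost(xs):
    """Comparisons made by deterministic quicksort with the first
    element as pivot (distinct elements assumed)."""
    if not xs:
        return 0
    pivot, rest = xs[0], xs[1:]
    ls = [y for y in rest if y < pivot]
    rs = [y for y in rest if y > pivot]
    return len(rest) + qs_cost(ls) + qs_cost(rs)

@lru_cache(maxsize=None)
def rqs_cost_dist(n):
    """Cost distribution of randomised quicksort via the recurrence."""
    if n == 0:
        return Counter({0: Fraction(1)})
    d = Counter()
    for r in range(n):                      # uniform pivot rank
        for c1, p1 in rqs_cost_dist(r).items():
            for c2, p2 in rqs_cost_dist(n - 1 - r).items():
                d[n - 1 + c1 + c2] += p1 * p2 * Fraction(1, n)
    return d

n = 5
perms = list(permutations(range(n)))
deterministic = Counter()
for xs in perms:
    deterministic[qs_cost(list(xs))] += Fraction(1, len(perms))
assert deterministic == rqs_cost_dist(n)    # identical distributions
```

Of course, such an enumeration is no substitute for the formal induction, but it illustrates the theorem concretely.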