1 Introduction
-
We derive an exact formula for the probability distribution into how many elements a given element is inserted (Theorem 2). This is the crucial first step in order to obtain better bounds for the average case of MergeInsertion.
-
We experimentally examine different decision trees for binary insertion. We obtain the best result when assigning shorter decision paths to positions located further to the left (Section 3.1).
-
We use Theorem 2 in order to compute quite precise numerical estimates for the average number of comparisons for n up to roughly 15000 (Section 4.1).
-
We improve the bound of [3, 4] to \(n \log n - 1.4005n + o(n)\) (Theorem 3). This partially answers a conjecture from [12] which asks for an in-place algorithm with \(n\log n - 1.4n\) comparisons on average and \(n\log n - 1.3 n\) comparisons in the worst case. Although MergeInsertion is not in-place, the the techniques from [3, 4] or [12] can be used to make it so.
-
We evaluate a slightly different insertion order decreasing the gap between the lower bound and the average number of comparisons of MergeInsertion by roughly 30% for n ≈ 2k/3 (Section 5.2).
2 Preliminaries
2.1 Description of MergeInsertion
3 Average Case Analysis of the Insertion Step
i | 1 | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|---|
P(Xi = 0) | \(\frac {1}{11}\) | \(\frac {1}{11} \cdot \frac {12}{13}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdot \frac {16}{17}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdots \frac {18}{19}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdots \frac {20}{21}\) |
P(Xi = 1) | \(\frac {1}{11}\) | \(\frac {1}{11}\cdot \frac {12}{13}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdot \frac {16}{17}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdots \frac {18}{19}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdots \frac {20}{21}\) |
P(Xi = 2) | \(\frac {1}{11}\) | \(\frac {1}{11}\cdot \frac {12}{13}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdot \frac {16}{17}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdots \frac {18}{19}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdots \frac {20}{21}\) |
P(Xi = 3) | \(\frac {1}{11}\) | \(\frac {1}{11}\cdot \frac {12}{13}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdot \frac {16}{17}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdots \frac {18}{19}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdots \frac {20}{21}\) |
P(Xi = 4) | \(\frac {1}{11}\) | \(\frac {1}{11}\cdot \frac {12}{13}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdot \frac {16}{17}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdots \frac {18}{19}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdots \frac {20}{21}\) |
P(Xi = 5) | \(\frac {1}{11}\) | \(\frac {1}{11}\cdot \frac {12}{13}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdot \frac {16}{17}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdots \frac {18}{19}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdots \frac {20}{21}\) |
P(Xi = 6) | \(\frac {1}{11}\) | \(\frac {1}{11}\cdot \frac {12}{13}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdot \frac {16}{17}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdots \frac {18}{19}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdots \frac {20}{21}\) |
P(Xi = 7) | \(\frac {1}{11}\) | \(\frac {1}{11}\cdot \frac {12}{13}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdot \frac {16}{17}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdots \frac {18}{19}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdots \frac {20}{21}\) |
P(Xi = 8) | \(\frac {1}{11}\) | \(\frac {1}{11}\cdot \frac {12}{13}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdot \frac {16}{17}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdots \frac {18}{19}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdots \frac {20}{21}\) |
P(Xi = 9) | \(\frac {1}{11}\) | \(\frac {1}{11}\cdot \frac {12}{13}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdot \frac {16}{17}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdots \frac {18}{19}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdots \frac {20}{21}\) |
P(Xi = 10) | \(\frac {1}{11}\) | \(\frac {1}{11}\cdot \frac {12}{13}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdot \frac {16}{17}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdots \frac {18}{19}\) | \(\frac {1}{11}\cdot \frac {12}{13}\cdot \frac {14}{15}\cdots \frac {20}{21}\) |
P(Xi = 11) | 0 | \(\frac {1}{13}\) | \(\frac {1}{13}\cdot \frac {14}{15}\) | \(\frac {1}{13}\cdot \frac {14}{15}\cdot \frac {16}{17}\) | \(\frac {1}{13}\cdot \frac {14}{15}\cdot \frac {16}{17}\cdot \frac {18}{19}\) | \(\frac {1}{13}\cdot \frac {14}{15}\cdot \frac {16}{17}\cdots \frac {20}{21}\) |
P(Xi = 12) | 0 | 0 | \(\frac {1}{15}\) | \(\frac {1}{15}\cdot \frac {16}{17}\) | \(\frac {1}{15}\cdot \frac {16}{17}\cdot \frac {18}{19}\) | \(\frac {1}{15}\cdot \frac {16}{17}\cdot \frac {18}{19}\cdot \frac {20}{21}\) |
P(Xi = 13) | 0 | 0 | 0 | \(\frac {1}{17}\) | \(\frac {1}{17}\cdot \frac {18}{19}\) | \(\frac {1}{17}\cdot \frac {18}{19}\cdot \frac {20}{21}\) |
P(Xi = 14) | 0 | 0 | 0 | 0 | \(\frac {1}{19}\) | \(\frac {1}{19}\cdot \frac {20}{21}\) |
P(Xi = 15) | 0 | 0 | 0 | 0 | 0 | \(\frac {1}{21}\) |
3.1 Binary Insertion and Different Decision Trees
center-left
and center-right
strategies (the standard options for binary insertion): they compare the element to be inserted with the middle element, rounding down(up) in case of an odd number. The left
strategy chooses the element to compare with in a way such that the positions where only k − 1 comparisons are required are at the very left. The right
strategy is similar, here the positions where one can insert with just k − 1 comparisons are at the right. To summarize, the element to compare with is
\(\left \lfloor \frac {n+1}{2}\right \rfloor \) | strategy center-left |
\(\left \lceil \frac {n+1}{2}\right \rceil \vphantom {2^{k^{k}}}\) | strategy center-right |
\(\max \limits \lbrace n-2^{k}+1, 2^{k-1}\rbrace \quad \) | strategy left |
\( \min \limits \lbrace 2^{k}, n - 2^{k-1}+1\rbrace \) | strategy right |
left
strategy is also used in [7], where it is called right-hand-binary-search. Figure 7 shows experimental results comparing the different strategies for binary insertion regarding their effect on the average-case of MergeInsertion. As we can see the left
strategy performs the best, closely followed by center-left
and center-right
. right
performs the worst. The left
strategy performing best is no surprise since the probability that an element is inserted into one of the left positions is higher that it being inserted to the right. Therefore, in all further experiments we use the left
strategy.
4 Improved Upper Bounds for MergeInsertion
4.1 Numeric Upper Bound
4.2 Computing the Exact Number of Comparisons
exact.txt
in [13]. Our results match up with the values for \(n \in \lbrace 1, \dots , 8\rbrace \) calculated in [8]. Note that for these values the chosen insertion strategy does not affect the average case (we use the left
strategy).
4.3 Improved theoretical upper bounds
5 Experiments
left
strategy for binary insertion (see Section 3.1). The number of comparisons has been averaged over 10 to 10000 runs, depending on the size of the input.