A comparison sort is a particular type of sorting algorithm; a number of well-known algorithms are comparison sorts. It is defined as a sorting algorithm which can only read the list elements through a single abstract comparison operation (often a "less than" operator) that determines which of two elements should occur first in the final sorted list. The only requirement is that the operator obey the three defining properties of a total order:
Some of the most well-known comparison sorts include:
Some examples of sorts which are not comparison sorts include:
There are fundamental limits on the performance of comparison sorts. A comparison sort must have an upper bound of at least Ω(n log n) comparison operations in the worst case. This is a consequence of the limited information available through comparisons alone. In this sense, mergesort, heapsort, and introsort are asymptotically optimal in terms of the number of comparisons they must perform, although this metric neglects other operations. The three non-comparison sorts above achieve O(n) performance by using operations other than comparisons, allowing them to sidestep this lower bound (assuming elements are constant-sized).
Nevertheless, comparison sorts offer the notable advantage that control over the comparison function allows sorting of many different datatypes and fine control over how the list is sorted. For example, reversing the result of the comparison function allows the list to be sorted in reverse, and it's simple to sort a list of tuples in lexicographic order by just creating a comparison function that compares by each part in sequence: function compare((lefta, leftb, leftc), (righta, rightb, rightc)) if lefta ≠ righta return compare(lefta, righta) else if leftb ≠ rightb return compare(leftb, rightb) else return compare(leftc, rightc)
Comparison sorts generally adapt more easily to complex orders such as the order of floating-point numbers. Additionally, once a comparison function is written, any comparison sort can be used without modification; non-comparison sorts typically require specialized versions for each datatype.
This flexibility, together with the efficiency of the above comparison sorting algorithms on modern computers, has led to widespread preference for comparison sorts in most practical work.
Imagine we're sorting a list of numbers, all distinct (we can assume this because this is a worst-case analysis). There are n factorial (n!) possible ways to rearrange this list, and each one must be rearranged differently to come up with a sorted list. Therefore, if we don't perform enough comparisons to distinguish n! different lists, we can't have a correct sort.
But how many comparisons is this? We can think of the control flow of a comparison sort algorithm as following a very large conceptual binary tree, where each node represents a comparison between two list elements. There are n! leaves in this tree corresponding to the n! permutations; when we reach a leaf, we have derived enough information to successfully sort the list. The worst-case time of the algorithm is the maximum distance of any leaf from the root, called the height of the tree. Although the specific tree depends on the sorting algorithm, any tree with n! leaves has height at least as large as the complete binary tree with n! leaves. This tree has Ω(log(n!)) height. But Stirling's approximation tells us that log(n!) is Ω(n log n).
This article is licensed under the GNU Free Documentation License.
It uses material from the
"Comparison sort".
Home Page • arts • business • computers • games • health • hospitals • home • kids & teens • news • physicians • recreation• reference • regional • science • shopping • society • sports • world