# Quicksort – follow-up

Posted on

Problem

This is a follow-up to

Quicksort function

Like the previous code, this does not optimize for containers with large numbers of duplicate elements. How can I improve this code, particularly with iterators?

``````template<typename C>
void insertion_sort(C low, C high) {
for (auto i = low; i < high; ++i) {
for (auto j = i; j != low && *j < *(j-1); --j) {
std::swap(*j, *(j-1));
}
}
}

template<typename C, typename I>
I partition(C& c, I low, I high) {
auto p = *low;
I i = low;
I j = high;
while (true) {
while (i+1 != std::end(c) && *(++i) < p) {
if (i == high) {
break;
}
}
while (p < *(--j)) { }
if (i >= j) {
break;
}
std::swap(*i, *j);
}
std::swap(*low, *j);
return j;
}

template<typename C, typename I>
void sort(C& c, I low, I high) {
if (high <= low+15) {
insertion_sort(low, high);
return;
}
I p = partition(c, low,high);
sort(c, low, p);
sort(c, p+1,high);
}

template<typename C>
void sort(C& c) {
if (std::begin(c) == std::end(c)) {
throw std::out_of_range("no size");
}
if (c.size() == 1) {
return;
}
std::random_shuffle(std::begin(c), std::end(c));
sort(c, std::begin(c), std::end(c));
}
``````

Solution

Sorting an empty array is perfectly valid.

``````if (c.size() == 0) {
throw std::out_of_range("no size");
}
``````

The result id an empty array. An exception is not the correct way to go.

OK optimize. An array of size 0 or 1 is already sorted.

``````if (c.size() == 1) {  // I would make this c.size() <= 1 (as I would remove the exception)
return;
}
``````

Special casing the `std::end(c)` just looks wrong.

``````   while (i+1 != std::end(c) && *(++i) < p) {
``````

I would say that if `i+1 == high` would be a better test. Also adding the ++ and * operators into the expression makes it really hard to read. I would go with:

``````    while ((i < high) && (*i < p)) { ++i;}
``````

The decrementing the j pointer should look exactly the same as the increment.

This is not tested (so may have bugs).
But this is how I would write partition.

``````// Note: this function require std::distance(low, high) >= 3
// Note: it is called from sort when std::distance(low, high) > 15
template<typename I>
std::pair<I,I> partition(I low, I high) {
--high;
I   pivot   = high;

I   highBot = low;
I   lowTop  = high;
--lowTop;
while (highBot < lowTop) {
while ((highBot < high)  && (*highBot <= *pivot))   { ++highBot;}
while ((lowTop >= low)   && (*pivot < *lowTop))     { --lowTop;}
if (highBot < lowTop) {
std::swap(*highBot, *lowTop);
}
}
// At this point highBot and lowTop have just passed each other.
// Thus highBot points to the first element greater than *pivot
// and lowTop points to the last element smaller or equal to *pivot
swap(*highBot, *pivot);

// Now: highBot points to the pivot value and the value that was
// at highBot is in the top end of the high range.

// We want to to remove pivot value from the range of elements to be sorted.
// so we want to sort the two ranges likes this:
//
//      [low..lowTop+1)
//      highBot                 The pivot point removed from further sorting.
//                              Note: lopTop+1 == highBot
//      [highBot+1..high)
//

// Small optimization
// If the value at lowTop is also equal to the pivot value
// Then we can remove it from further sorting.
for(; (lowTop >= low) && (*p == *lowTop); --lowTop) {}

return std::make_pair(lowTop+1, highBot+1);
}
``````

Usage:

``````std::pair<I,I> part = partition(low, high);
sort(c, low, part.first);
sort(c, part.second, high);
``````

@Loki already covered the most important points, but there are still a few things that you could change to improve the algorithm further (provided you’re still interested in this question one year later):

• A well-designed sorting algorithm should be able to take advantage of user-defined swap functions instead of simply using `std::swap` everywhere; these user-defined overloads might be optimized. We can use argument-dependent lookup to do the job:

``````using std::swap;
swap(*it1, *it2);
``````

The first line tells the compiler to take `std::swap` into account for unqualified calls to `swap`. The second line makes an unqualified call to `swap`: the compiler will check the types of `*it1` and `*it2` and search for a `swap` function in their associated namespace. If the lookup finds such a function, it will use it, otherwise it will use `std::swap` instead. Note that you can also use `std::iter_swap` which does exactly that:

``````std::iter_swap(it1, it2);
``````
• Algorithms that care about comparison generally allow to pass an additional `compare` parameter to use instead of `operator<`. It allows to sort types that do not define `operator<` and to sort things differently (for example, it allows to use `std::greater<>` to sort a collection in reverse order. Here is a comparison-enhanced insertion sort:

``````template<typename C, typename Compare=std::less<>>
void insertion_sort(C low, C high, Compare compare={}) {
for (auto i = low; i < high; ++i) {
for (auto j = i; j != low && compare(*j, *(j-1)); --j) {
std::iter_swap(j, j - 1);
}
}
}
``````
• `std::sort` is required to work with move-only types, and quicksort generally requires to copy the pivot before the partitioning operation to avoid tricky problems. To implement a quicksort without compying the pivot, the trick either to choose `std::prev(high)` as the pivot or to swap the pivot with `std::prev(high)` before the partitioning step.

• It’s a mere detail, but the condition `if (high <= low+15)` is not the easiest to reason about. I would use `if (std::distance(low, high) <= 15)`, which tells more about the reasoning in my opinion.