Find the min and max elements of a collection, and check whether it is sorted

Posted on

Problem

The function I propose allows to find the minimum and maximum values of a collection and to check that it is sorted up to a certain element. Basically it is a combination of std::minmax_element and std::is_sorted_until in a single pass function, without any significant overhead compared to a call to std::minmax_element.

template<typename ForwardIterator, typename Compare = std::less<>>
auto minmax_element_and_is_sorted_until(ForwardIterator first, ForwardIterator last,
                                        Compare compare={})
    -> decltype(auto)
{
    // Function-local result type, only the names of the
    // data members matter
    struct result_type
    {
        ForwardIterator min;
        ForwardIterator max;
        ForwardIterator sorted_until;
    } result = { first, first, last };

    // 0 or 1 elements
    if (first == last) return result;
    auto next = std::next(first);
    if (next == last) return result;

    // While it is sorted, the min and max are obvious
    auto current = first;
    while (not compare(*next, *current)) {
        ++current;
        ++next;

        // The range is fully sorted
        if (next == last) {
            result.max = current;
            return result;
        }
    }

    // The range is not sorted, use a regular minmax_element algorithm
    result.min = first;
    result.max = current;
    result.sorted_until = next;

    auto tmp = std::minmax_element(next, last, compare);
    if (compare(*tmp.first, *result.min)) {
        result.min = tmp.first;
    }
    if (not compare(*tmp.second, *result.max)) {
        result.max = tmp.second;
    }
    return result;
}

The algorithm minmax_element was introduced in Boost a long time ago before making its way into C++11. Its principal advantage is that finding both the min and max element of a collection only costs at most max(32(N1),0)max(32(N1),0) comparisons vs. approximately 2N2N comparisons with separate calls to std::min_element and std::max_element.

The algorithm minmax_element_and_is_sorted_until takes the optimization logic one step further and allows to also find until which element a collection is sorted with virtually no additional comparison, while a separate call to std::is_sorted_until would have added another O(n)O(n) comparisons.

While a bit obscure, this algorithm notably helps to optimize counting sort: some flavours of this sorting algorithm first need to know the minimal and maximal values of the collection to sort. In such a case, checking for free whether the collection is sorted allows to return early when the collection is already sorted, which is always something nice to have since it’s free.

Do you think anything could be improved with this algorithm, be it a matter of correctness, style or performance?

Solution

The quick exit seems a bit clunky.

// 0 or 1 elements
if (first == last) return result;
auto next = std::next(first);
if (next == last) return result;

I also don’t like the return on the same line as the if (test) it makes it harder to read (subjective).

I think I would simplify like this:

auto next = first;
if ((next == last) || (++next == last)) {
    return {first, first, last};
}

This also removed the need to declare result so far before it is actually needed.

Thus this:

// The range is not sorted, use a regular minmax_element algorithm
result.min = first;
result.max = current;
result.sorted_until = next;

can now be replaced with:

result_type result = { first, current, next };

Lets also use the std library to convey meaning for the min and max. So this:

if (compare(*tmp.first, *result.min)) {
    result.min = tmp.first;
}
if (not compare(*tmp.second, *result.max)) {
    result.max = tmp.second;
}

Can be replaced by:

auto iterCompare = [&compare](ForwardIterator lhs, ForwardIterator rhs){return compare(*lhs, *rhs);};
result.min = std::min(tmp.first,  result.min, iterCompare); 
result.max = std::max(tmp.second, result.max, iterCompare); 

I would personally avoid de-referencing the iterators (pointers) with “*” inside the compare function. I think you can use:

compare(tmp->first, result->min)

which should be a safer option.

Leave a Reply

Your email address will not be published. Required fields are marked *