Multiple column sorting and arraging algorithm

Posted on

Problem

Problem

The goal is to perform the following transformation:

enter image description here

Additionally, if two rows can be merged, they are combined into one:

enter image description here

Code

I wrote the following algorithm to solve this problem:

#include <vector>
#include <iostream>
#include <algorithm>

using Matrix = std::vector<std::vector<int>>;
using Column = std::vector<int>;

Column max_col(const Matrix &v) {
    return *std::max_element(v.begin(), v.end(),
                             [](auto& lhs, auto& rhs)
                             {
                                 return lhs.size() < rhs.size();
                             });
}

bool is_element_in_next_rows(const Matrix &v, int element,int row) {
    for (auto& col : v) {
        if (row >= static_cast<int>(col.size())) continue; // out of range
        if (std::lower_bound(col.begin()+row+1,col.end(),element) != 
            std::upper_bound(col.begin()+row+1,col.end(),element)) {
            return true;
        }
    }
    return false;
}

int min_element_in_row(const Matrix &v, int row) { 
    int min_element = max_col(v)[row];
    for (auto& col : v) {
        if (row >= static_cast<int>(col.size())) continue; // out of range
        if (col[row] != 0) min_element = std::min(min_element, col[row]);
    }
    return min_element;
}

void print_elements(const Matrix &v) {
    for (auto& i : v) {
        for (auto& j : i) {
            std::cout << j << " ";
        }
        std::cout << std::endl;
    }
}

void organize_elements(Matrix &v) {
    for (auto& col : v) {
        std::sort(col.begin(),col.end());
    }
    auto current_max_col = max_col(v);
    for (int row{0}; current_max_col.begin()+row!=current_max_col.end(); ++row) {
        int min_element = min_element_in_row(v,row);
        for(auto& col : v) {
            if (row >= static_cast<int>(col.size())) continue; // out of range
            int candidate = col[row];
            if (candidate > min_element) {
                if (is_element_in_next_rows(v,candidate,row)) {
                    col.push_back(0);
                    std::rotate(col.begin()+row,col.end()-1,col.end());
                }
            }
        }
        current_max_col = max_col(v);
    }
}

int main() {

    Column c1 = {5,6,8,11};
    Column c2 = {2,5,3,1,4,6};
    Column c3 = {8,7,2,4,5,3,1};

    Matrix v;
    v.push_back(c1);
    v.push_back(c2);
    v.push_back(c3);

    std::cout << "original: n";
    print_elements(v);
    organize_elements(v);
    std::cout << "organized: n";
    print_elements(v);
    return 0;
}

Which produces the following output:

original: 
5 6 8 11 
2 5 3 1 4 6 
8 7 2 4 5 3 1 
organized: 
0 0 0 0 5 6 8 11 
1 2 3 4 5 6 
1 2 3 4 5 7 8 

Questions

I particularly don’t like this part of my code:

auto current_max_col = max_col(v);
for (int row{0}; current_max_col.begin()+row!=current_max_col.end(); ++row) {
    // ....
    current_max_col = max_col(v);
}

I also don’t like that I have to do this cast:

if (row >= static_cast<int>(col.size())) continue; // out of range

Is there a way to do this check better?

The fact that I need to declare current_max_col out of the for, and then update it inside the loop. Any idea how to avoid this?

Any other ideas about how to solve this problem? Any other tips? Thank you!

Solution

I believe that you can leverage the stl algorithms better. There are still too many raw loops in your code…

I understand why you fell in love with std::rotate but it isn’t the only tool at your disposal.

For instance, std::min_element is flexible enough to handle the min_element_in_row:

int row_minimum(const Matrix& m, std::size_t row) {
    return std::min_element(m.begin(), m.end(), [row](auto&& lhs, auto&& rhs) {
            if (row < lhs.size()) {
                if (row < rhs.size()) return lhs[row] < rhs[row];
                return true;
            }
            return false;
        })->at(row);
}

Also, there is no need to compare the results of std::lower_bound and std::upper_bound. You know that the element is in the range if *std::lower_bound(b, e, elem) == elem, because it returns an iterator to the first element equal to or greater than elem.

That said, here’s my suggestion (I’m not handling row combinations though):

#include <vector>
#include <iostream>
#include <algorithm>
#include <numeric>

void normalize_columns(Matrix& m) {
    for (auto& col : m) std::sort(col.begin(), col.end());
    // build a 'synthetized' column by merging columns and removing duplicates
    Column merger;
    for (const auto& col : m) {
        Column tmp;
        std::merge(col.begin(), col.end(), merger.begin(), merger.end(), std::back_inserter(tmp));
        std::swap(tmp, merger);
    }
    auto uniques = std::unique(merger.begin(), merger.end());
    // replace each column by a copy of the synthetized column 
    //from which all elements not included in the original column are replaced by 0
    for (auto& col : m) {
        Column normalized;
        std::replace_copy_if(merger.begin(), uniques, std::back_inserter(normalized),
                             [&col](auto&& elem) { 
                                 return *std::lower_bound(col.begin(), col.end(), elem) != elem; 
                             }, 0);
        std::swap(col, normalized);
    }
}

int main() {

    Column c1 = {5,6,8,11};
    Column c2 = {2,5,3,1,4,6};
    Column c3 = {8,7,2,4,5,3,1};

    Matrix m{c1, c2, c3};

    normalize_columns(m);
    for (auto& col : m) {
        for (auto elem : col) std::cout << elem << ' ';
        std::cout << 'n';
    }

}

The output is slightly different from yours but seems to fit the problem specification:

0 0 0 0 5 6 0 8 11    
1 2 3 4 5 6 0 0 0 
1 2 3 4 5 0 7 8 0 

It seems that the transform you are after is a k-way merge in disguise.

In the nutshell, set up a min-heap of pairs <col_id, vector<int>::iterator>. On each iteration construct a vector of values equal to heap head, 0 otherwise; push it to the resulting matrix; remove those entries from heap an pull new values from the respective columns. Sorry if I miss something obvious.

I don’t think you shall strive for in-place transform.


I am pretty sure that merging rows deserve a separate pass.


The

                col.push_back(0);
                std::rotate(col.begin()+row,col.end()-1,col.end());

is no better than col.insert(col.begin() + row, 0);


The static_cast<int>(col.size()) suggests that row should not be int, but along the lines of std::vector<int>::size_type.

Leave a Reply

Your email address will not be published. Required fields are marked *