Compare collections

Problem

I have two Collections and want to compare them.

var valueFirst = new ValueFirst();
valueFirst.Id = 1;
valueFirst.Amount = 12;

...

var valueSecond = new ValueSecond();
valueSecond.Id = 123;
valueSecond.Amount = 12;


var firstCollection = new FirstCollection<ValueFirst>();
firstCollection.Add(valueFirst);

var secondCollection = new SecondCollection<ValueSecond>();
secondCollection.Add(valueSecond);

The values in the second collection may have changed compared to the first collection. In that case I need to copy the changed values from the second collection over the values in the first collection that have the same ID.

It is also possible that values were deleted from the second collection. Then I have to delete the values with those IDs from the first collection.

Values can also be inserted into the second collection. Then I need to create them and add them to the first collection.

This is how I solved it, but it is not very easy to understand.

foreach (var valueSecond in secondCollection)
{
   if (firstCollection.Any(valueFirst => valueFirst.Id == valueSecond.Id))
   {
      var valueChanged = firstCollection.First(v => v.Id == valueSecond.Id);

      valueChanged.Amount = valueSecond.Amount;
   }
   else
   {
      var valueCreated = new ValueFirst();
      valueCreated.Id = valueSecond.Id;
      valueCreated.Amount = valueSecond.Amount;
      firstCollection.Add(valueCreated);
   }
}

//// delete
var listValuesToDelete = new Collection<ValueFirst>();

foreach (var valueFirst in firstCollection)
{
   if (secondCollection.Any(valueSecond => valueSecond.Id != valueFirst.Id))
   {
      listValuesToDelete.Add(valueFirst);
   }
}

foreach (var valueFirstToDelete in listValuesToDelete)
{
   firstCollection.Remove(valueFirstToDelete);
}

Solution

I would write it in this way:

// Copy from the second collection for items with same Id
foreach (var a in firstCollection.Join(secondCollection, 
                                       ok => ok.Id, ik => ik.Id, 
                                       (vf, vs) => new { vf, vs }))
{
    a.vf.Amount = a.vs.Amount;
}

// Add items to the first collection which are only in the second.
foreach (var valueSecond in secondCollection.Where(
                            vs => firstCollection.All(vf => vf.Id != vs.Id))) 
{
    firstCollection.Add(new ValueFirst(valueSecond));
}

// Remove from the first which are not in the second.
foreach (var valueFirst in firstCollection.Where(
                           vf => secondCollection.All(vs => vs.Id != vf.Id)).ToList())
{
    firstCollection.Remove(valueFirst);
}

If your types ValueFirst and ValueSecond have a common base type and your collections don’t contain duplicate items, I would prefer to create an EqualityComparer for them and use the Except method for the second and third blocks.
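
A minimal sketch of that Except-based variant might look like the following. The base type ValueBase, the comparer IdComparer, the helper CollectionSync.AddAndRemove, and the decimal type of Amount are all assumptions made for illustration; the original post shows none of them. Both Except results are materialized with ToList() before the first collection is modified.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical common base type; the original post does not show one.
public abstract class ValueBase
{
    public int Id { get; set; }
    public decimal Amount { get; set; }   // the type of Amount is assumed
}

public class ValueFirst : ValueBase { }
public class ValueSecond : ValueBase { }

// Hypothetical comparer: two values are equal when their Ids match.
public sealed class IdComparer : IEqualityComparer<ValueBase>
{
    public bool Equals(ValueBase x, ValueBase y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Id == y.Id;
    }

    public int GetHashCode(ValueBase obj) => obj.Id.GetHashCode();
}

public static class CollectionSync
{
    // Covers only the "add" and "remove" blocks; the Join-based update stays as it is.
    public static void AddAndRemove(ICollection<ValueFirst> first,
                                    IEnumerable<ValueSecond> second)
    {
        var comparer = new IdComparer();
        IEnumerable<ValueBase> firstBase = first;    // covariant conversion
        IEnumerable<ValueBase> secondBase = second;

        // Materialize both results before mutating 'first'.
        var toAdd = secondBase.Except(firstBase, comparer).ToList();
        var toRemove = firstBase.Except(secondBase, comparer)
                                .Cast<ValueFirst>()
                                .ToList();

        foreach (var value in toRemove)
        {
            first.Remove(value);
        }

        foreach (var value in toAdd)
        {
            first.Add(new ValueFirst { Id = value.Id, Amount = value.Amount });
        }
    }
}
```

Except returns distinct elements, which is why this only fits when the collections contain no duplicate items.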

There’s a bug here:

foreach (var valueFirst in firstCollection)
{
    if (secondCollection.Any(valueSecond => valueSecond.Id != valueFirst.Id))
    {
        listValuesToDelete.Add(valueFirst);
    }
}

That should be

foreach (var valueFirst in firstCollection)
{
    if (!secondCollection.Any(valueSecond => valueSecond.Id == valueFirst.Id))
    {
        listValuesToDelete.Add(valueFirst);
    }
}
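
To see why the two conditions differ, here is a small sketch that reduces the check to plain int Ids. The class DeleteCheck and its method names are hypothetical and exist only for this illustration.

```csharp
using System.Linq;

public static class DeleteCheck
{
    // Original condition: true as soon as ANY Id in the second collection
    // differs from the given Id, even if another element matches it.
    public static bool BuggyShouldDelete(int id, int[] secondIds) =>
        secondIds.Any(secondId => secondId != id);

    // Corrected condition: true only when NO Id in the second collection matches.
    public static bool ShouldDelete(int id, int[] secondIds) =>
        !secondIds.Any(secondId => secondId == id);
}
```

With second Ids { 1, 2 } and a first value whose Id is 1, the original condition marks the value for deletion because 2 differs from 1, while the corrected condition keeps it because 1 is present.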

If you don’t have duplicate Ids in secondCollection, then it seems you can achieve what you want with

firstCollection = secondCollection
    .Select(x => new ValueFirst { Id = x.Id, Amount = x.Amount })
    .ToList();

I’d like to challenge the notion that you need to identify each change in collection 2 and update collection 1 with each change. Why not just overwrite collection 1 with collection 2 when you determine they’re not equal?

First, compare the counts: if they differ, the collections cannot be equal and you can skip the more processor-heavy comparison. Use the .Count property if the collections have one; Count() may have to enumerate the whole sequence when the source does not implement ICollection<T>.

Second, do one of the following:

  • Implement an IEqualityComparer that compares Id and Amount, sort the collections, and call SequenceEqual().
  • Use Join() to join the collections on Id and Amount (you could do this with an IEqualityComparer or with lambdas). Then compare the count of the result set with the count of the collections (you only need one comparison, since you’ve already established that collections 1 and 2 have equal counts).

If at any point these values are not equal, simply overwrite collection 1 with collection 2.
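
The count check followed by the sort-and-SequenceEqual option could be sketched roughly as below. Because ValueFirst and ValueSecond are not shown to share a base type, this sketch projects both sides to (Id, Amount) tuples instead of writing an IEqualityComparer; that substitution, the helper names HaveSameContent and OverwriteIfDifferent, and the decimal type of Amount are assumptions, not part of the original answer.

```csharp
using System.Collections.Generic;
using System.Linq;

// Minimal stand-ins for the types in the question; Amount's type is assumed.
public class ValueFirst
{
    public int Id { get; set; }
    public decimal Amount { get; set; }
}

public class ValueSecond
{
    public int Id { get; set; }
    public decimal Amount { get; set; }
}

public static class CollectionCompare
{
    // True when both collections hold the same (Id, Amount) pairs, in any order.
    public static bool HaveSameContent(ICollection<ValueFirst> first,
                                       ICollection<ValueSecond> second)
    {
        // Cheap check first: different counts can never be equal.
        if (first.Count != second.Count)
        {
            return false;
        }

        // Sort both sides the same way, then compare element by element.
        var a = first.OrderBy(v => v.Id).ThenBy(v => v.Amount)
                     .Select(v => (v.Id, v.Amount));
        var b = second.OrderBy(v => v.Id).ThenBy(v => v.Amount)
                      .Select(v => (v.Id, v.Amount));

        return a.SequenceEqual(b);
    }

    // Returns 'first' unchanged when equal, otherwise a fresh copy of 'second'.
    public static List<ValueFirst> OverwriteIfDifferent(List<ValueFirst> first,
                                                        ICollection<ValueSecond> second)
    {
        if (HaveSameContent(first, second))
        {
            return first;
        }

        return second
            .Select(v => new ValueFirst { Id = v.Id, Amount = v.Amount })
            .ToList();
    }
}
```

A caller would then write something like firstCollection = CollectionCompare.OverwriteIfDifferent(firstCollection, secondCollection), assuming firstCollection is a List<ValueFirst>.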
