Problem
Purpose
I came across this relatively simple question on HackerRank regarding finding missing values across two List
s. I took a fairly straight-forward approach, but is there a “better” / “slicker” way of implementing a solution?
Task
Strategy
- Iterate through
List
with complete data. - If
Integer
is inList
with incomplete data, removeInteger
from bothList
s. Else, addInteger
to aTreeSet
of all missing values.
Implementation
public class MissingNumbersImpl implements MissingNumbers {
@Override
public Set<Integer> identifyMissingNumbers(final List<Integer> all, final List<Integer> incomplete) {
if (all.size() <= incomplete.size()) {
throw new IllegalArgumentException("list containing all numbers must have a size greater than list not containing all numbers");
}
final Set<Integer> missingNumbers = new TreeSet<>();
final Iterator<Integer> iterator = all.iterator();
while (iterator.hasNext()) {
final Integer next = iterator.next();
if (incomplete.contains(next)) {
incomplete.remove(next);
iterator.remove();
} else {
missingNumbers.add(next);
}
}
return missingNumbers;
}
}
Solution
Building on the comments made by h.j.k., you forgot to handle one case in your implementation: the case when the bigger list has a count of a number n
that is less than the count for that number n
for the smaller list. For example, if you consider the following:
public static void main(String[] args) {
List<Integer> all = new ArrayList<>(Arrays.asList(1, 1, 2, 3, 4, 5, 5));
List<Integer> incomplete = new ArrayList<>(Arrays.asList(1, 1, 1, 2));
System.out.println(identifyMissingNumbers(all, incomplete));
}
It will print with your current code [3, 4, 5]
disregarding 1 as a missing number although it should be missing since it does not appear with the same frequency in both lists.
The reason is that you’re iterating on the number in the bigger list, removing them from the smaller list as you encounter them and adding them to the missing number when you finally don’t find a match. But if the smaller list as a bigger count for that number, it won’t be taken into account.
There are a couple of possibilities to tackle this: you could introduce a new loop that would have the same logic but iterating on the smaller list instead and removing elements from the bigger list.
I find a better approach would be the following:
- Sort each input list in ascending order
- Create an empty
Map<Integer, Integer>
that will store the count for each number - Iterate over the first list. If the number if not in the Map, add it with a value of 1. Else, increment the current count by 1.
- Iterate over the second list. If the number if not in the Map, add it with a value of -1. Else, decrement the current count by 1.
- At this point, all values in the map that are different than 0 are missing numbers: since we added 1 when encountered in the first list and subtracted 1 when encountered in the second list, the result must be 0 if that number is not missing. Hence, we remove all entries in the map where the value is 0.
- Finally, we return the keys of the map in ascending order.
Assuming Java 8, an implementation would be the following:
public static Set<Integer> identifyMissingNumbers(List<Integer> list1, List<Integer> list2) {
Map<Integer, Integer> count = new HashMap<>();
for (Integer number : list1) {
count.merge(number, 1, Integer::sum);
}
for (Integer number : list2) {
count.merge(number, -1, Integer::sum);
}
count.values().removeIf(i -> i == 0);
return new TreeSet<>(count.keySet());
}
Mutability
After getting the missing numbers, Numeros has a new problem now: most of the original lists of numbers are gone!
You should not be calling Iterator.remove()
or List.remove()
on the incoming arguments, as that modifies their state (which can be really bad if the caller isn’t expecting such a behavior), or throws a UnsupportedOperationException
when immutable List
instances are passed in (which isn’t bad, but you have to deal with exception handling now).
Validation
if (all.size() <= incomplete.size()) { ... }
Is it really an error when there’s the same number of numbers in both inputs? What if it’s a best-case scenario and there are no dropped numbers?