Problem
As soon as I saw the First missing positive question, I decided to write my own method for this before writing my review on the question.
Judge me by the same standards as bazang wants to be judged, i.e. As if this was written for a big company such as Google for an interview.
I wasn’t sure how the input 7, 8, 9 should be treated so to make it a bit harder for myself I gave myself the requirement that it should return 10.
@Test
public void test() {
Assert.assertEquals(3, missingInt(new int[]{ 1, 2, 0 }));
Assert.assertEquals(4, missingInt(new int[]{ 1, 2, 3 }));
Assert.assertEquals(2, missingInt(new int[]{ 3, 4, -1, 1 }));
Assert.assertEquals(10, missingInt(new int[]{ 7, 8, 9 }));
Assert.assertEquals(5, missingInt(new int[]{ 3, 4, 2, 7, 6, 1 }));
Assert.assertEquals(5, missingInt(new int[]{ 3, 4, 2, 7, 6, 1, -4 }));
}
public int missingInt(int[] array) {
boolean[] foundIntegers = new boolean[array.length];
int smallestPositive = Integer.MAX_VALUE;
for (int i : array) {
if (i > 0 && i < smallestPositive)
smallestPositive = i;
}
for (int i : array) {
if (i < smallestPositive)
continue;
int index = i - smallestPositive;
if (index < foundIntegers.length)
foundIntegers[index] = true;
}
for (int i = 0; i < foundIntegers.length; i++) {
if (!foundIntegers[i])
return i + smallestPositive;
}
return foundIntegers.length + smallestPositive;
}
Solution
That solution is the algorithm I would have chosen.
I like the data you chose for your list of test cases; you could have added at least one more test case, i.e. an input array with zero elements.
In fact IMO your code will fail when the input is a zero-length array.
You code will also fail when all the input integers are <= 0
.
Stupid edge-cases!
While doing some testing for coming up with a better version for the First missing positive question, I added a bunch of more test cases and compared several different methods with each other. Then I found the following problems:
Input [-5] returned 2147483647 but expected 1
Input [-5, -4, -3] returned 2147483647 but expected 1
Input [0, 0, 0, 0] returned 2147483647 but expected 1
Input [] returned 2147483647 but expected 1
To correct these problems, I added some edge-case checking; to check for array where there are no positive integers. As these really are edge-cases, I haven’t decided how to handle these. As there technically isn’t a clear “first missing positive number”. If I would have handled 7 8 9
as return 1
then I would have returned 1
for these edge cases also, but I feel that wouldn’t be expected behavior from this algorithm so therefore I decided to throw an exception.
The new code is:
public int simonNew(int[] array) {
boolean[] foundIntegers = new boolean[array.length];
int smallestPositive = Integer.MAX_VALUE;
for (int i : array) {
if (i > 0 && i < smallestPositive)
smallestPositive = i;
}
if (smallestPositive == Integer.MAX_VALUE)
throw new IllegalArgumentException("Array must not be null and must contain at least one positive integer");
for (int i : array) {
if (i < smallestPositive)
continue;
int index = i - smallestPositive;
if (index < foundIntegers.length)
foundIntegers[index] = true;
}
for (int i = 0; i < foundIntegers.length; i++) {
if (!foundIntegers[i])
return i + smallestPositive;
}
return foundIntegers.length + smallestPositive;
}
-
if (i > 0 && i < smallestPositive) smallestPositive = i;
could be
if (i > 0) smallestPositive = Math.min(smallestPositive, i);
I think it’s a little bit easier to read.
-
At work I’d use parametrized tests which help defect localization. Anyway, you could extract out a verifier method to remove some duplication:
@Test public void test() { verifyMissingInt(3, 1, 2, 0); verifyMissingInt(4, 1, 2, 3); verifyMissingInt(2, 3, 4, -1, 1); verifyMissingInt(10, 7, 8, 9); verifyMissingInt(5, 3, 4, 2, 7, 6, 1); verifyMissingInt(5, 3, 4, 2, 7, 6, 1, -4); } private void verifyMissingInt(int expected, int... data) { assertEquals(expected, missingInt(data)); }
Unfortunately, every parameter is integer, so it’s not easy to separate expected output from input data. Two other ways:
verifyMissingInt(3, new int[] { 1, 2, 0 });
and
verifyMissingInt(3, array(1, 2, 0));
with
private int[] array(int... data) { return data; }
It might be possible to solve this problem without using a bitfield or boolean array at all. In an interview, it’s important to ask questions to clarify the requirements.
Assuming that the positive inputs always consist of a consecutive range of numbers with no duplicates and exactly one missing number, the answer can be obtained by arithmetic in a single pass. The following solution passes all of the tests in the question.
public static int missingInt(int[] array) {
int min = Integer.MAX_VALUE, max = 0, count = 0;
long sum = 0;
for (int i : array) {
if (i <= 0) continue;
count++;
sum += i;
if (i < min) min = i;
if (i > max) max = i;
}
int range = max - min;
if (count == 0) {
// No positive inputs
assert sum == 0;
return 1;
} else if (range == count) {
// Probably one missing element between min and max.
// Calculate expected sum of all integers from min to max, inclusive.
long rangeSum = ((long)max + min) * (max - min + 1) / 2;
return (int)(rangeSum - sum);
} else if (range == count - 1) {
// Probably all consecutive numbers. As defined in the
// question, we want max + 1.
assert sum == ((long)max + min) * (max - min + 1) / 2;
return max + 1;
} else {
throw new IllegalArgumentException();
}
}
The caveat is that if the input array contains duplicate entries, or if it is missing multiple entries between the smallest and largest positive number, then the behaviour is undefined. It is sometimes possible to detect such unexpected inputs, but not always.
As indicated by the assertions, there is some redundancy in the statistics. If you don’t want to perform the consistency check (which is pretty weak anyway), you could eliminate count
.
Since there are multiple ways to solve this problem, and you don’t know which solution your interviewer prefers to see, it’s important to ask and clarify what the requirements are.
Tests
As I said in my comment, you should use import static Assert.*
as it help improve the readability of the code. This is very minor, but it could help you demonstrate that you know this features of the language. (I know that you’re using it, but for future readers it could help)
You could have more than one method for your tests. In modern IDE, having single method/test case with a meaningful case would help to know at first glance what is working, from what is not. This is minor, but it could show to an interviewer that you’re as organize in your test than in your code. If the major company that you’re applying works with TDD, it could really help.
Coding Style
I was wondering why you were not using brackets for all your if
:
if (i > 0 && i < smallestPositive)
smallestPositive = i;
In the same time, you’re not doing the same thing for for
loops :
for (int i = 0; i < foundIntegers.length; i++) {
if (!foundIntegers[i])
return i + smallestPositive;
}
I don’t know if this would come up in an interview, but this struck me. Why the difference ? If the reason is readability, I would : Why is it good for for
but not for if
?
Anyway, I would recommend to always use brackets.
Conclusion
This is just consideration about the style of your code, and nothing that would mark you as a bad or good candidate. Your code in general is really impressively good.