Problem
Input:
Array of n integers, containing numbers in the range [1,n].
Each integer appears once except A which repeats twice and B which is missing.
Output:
Return A and B.
Example:
Input:[3 1 2 5 3]
Output:[3, 4]
A = 3, B = 4
My approach:
public class Solution {
// DO NOT MODIFY THE LIST. IT IS READ ONLY
public ArrayList<Integer> repeatedNumber(final List<Integer> A) {
//O(n) solution
//Store the count of all the numbers appearing in the list
int [] count = new int[A.size()];
int rep_num = 0,miss = 0;
//Increase the count at the index location
for( int i = 0; i < A.size(); i++ )
{
int ind = A.get(i);
count[ind  1]++;
//
if(count[ind  1] == 2)
{
rep_num = ind;
break;
}
}
//If the count has not been updated, then the number is missing in the array
for(int i = 0; i < count.length; i++ )
{
if(count[i] == 0)
miss = i+1;
}
ArrayList<Integer> num = new ArrayList<Integer>();
num.add(rep_num);
num.add(miss);
return num;
}
}
I have the following questions:

How can I further optimize my code?

Is there any better way to solve this question (i.e. using a better data structure, lesser lines of code)?
Question asked on: interviewbit.com
Solution

Your solution takes about 25 lines which is, IMO, a lot for this task. I’ll propose my solution at the end.

int rep_num = 0,miss = 0;
Avoid declaring two variables on the same line, also rep_num
does not correspond to the java recommended notations. Java uses lowerCamelCase for variable (and method) name so you should write rep_num
as repNum
.
Also miss
is not really clear as a variable name IMO. Maybe you should consider missingNumber
.
 You are using an array to store duplicated elements, but it’s clearer to use Set or a Map (or even multiSet with a library) to count duplicate elements. Actually if you have to count a lot of duplicate (or missing) elements, it’s likely to be faster than your solution.
Here is a code sample :
Set<Integer> seen = ...
for (Integer elementFromA : a) {
if (seen.contains(elementFromA)) {
// a set can only contains one elements, thus we the current value of elementFromA is a duplicate
// let's do something with it
}
seen.add(v);
}
 You should consider turning this method into a
static
method as it don’t use any field from your object.
Here is the solution I came up with (it should be about the same performance as yours) :
public static List<Integer> repeatedNumber(final List<Integer> a) {
ArrayList<Integer> res = IntStream.range(1, a.size() + 1).boxed().collect(toCollection(ArrayList::new));
// by removing everything elements from a, we are basically doing an exclusion/disjunction
// res will now contain only the missing elements from a
res.removeAll(a);
final Set<Integer> seen = new HashSet<>();
for (Integer v : a) {
if (seen.contains(v)) {
res.add(0, v);
break;
}
seen.add(v);
}
return res;
}
To make sure I wasn’t making mistakes, I made a quick unit test class :
public class SolutionTest {
@Test
public void basicTest() {
List<Integer> expected = Arrays.asList(3, 4);
Assert.assertEquals(expected, Solution.repeatedNumber(Arrays.asList(3, 1, 2, 5, 3)));
}
@Test
public void testWithLastNumberModified() {
List<Integer> expected = Arrays.asList(6, 7);
Assert.assertEquals(expected, Solution.repeatedNumber(Arrays.asList(1, 2, 3, 4, 5, 6, 6)));
}
@Test
public void testWithBigList() {
List<Integer> expected = Arrays.asList(2, 3);
Assert.assertEquals(expected, Solution.repeatedNumber(Arrays.asList(11, 10, 9, 8, 7, 6, 5, 4, 2, 2, 1)));
}
@Test
public void anotherTestWithBigList() {
List<Integer> expected = Arrays.asList(6, 1);
Assert.assertEquals(expected, Solution.repeatedNumber(Arrays.asList(10, 11, 7, 8, 9, 3, 2, 6, 4, 5, 6)));
}
}
There is a shortcut you can take to find the missing number after you’ve found the duplicated number. You may have come across this fact before, where the sum of the numbers 1 to n
is n*(n+1)/2
. We can leverage this along with the duplicated number to find the missing number. If you add up every number in your list and subtract that from what the expected sum would be from 1 to n
, most of the terms will cancel, leaving you with missing  duplicate
. You can visualize that with an example:
(1+2+3+4+5+6+7)
(1+2+3+4+5+1+7)
=(6)(1)
In that example, n
is 7, the duplicated number is 1, and the missing number is 6. So you get the formula expected_sum  actual_sum = missing  duplicate
. We can solve for missing
by adding duplicate
to both sides, leaving us with missing = expected_sum  actual_sum + duplicate
! Let’s do that.
First, you can calculate the expected sum from that formula, n*(n+1)/2
. In your loop searching for the duplicate, you will need to keep a running total (and don’t break early). After you find the sum of your list and identify the duplicate number, you can calculate the missing number using the formula we talked about in the above example, missing = expected_sum  actual_sum + duplicate
.
Also, you don’t need to record the frequency as an integer, you can just keep a list of booleans for numbers already seen. I’ve also taken the liberty of cleaning up the indentation issues and renaming some of the variables for the purpose of readability. Here is the solution:
public class Solution {
// DO NOT MODIFY THE LIST. IT IS READ ONLY
public ArrayList<Integer> repeatedNumber(final List<Integer> A) {
//O(n) solution
//Keep a record of all the numbers appearing in the list
boolean [] seen = new boolean[A.size()];
int rep_num = 0, miss_num = 0;
int expected_sum = A.size() * (A.size() + 1) / 2;
int actual_sum = 0;
//Find sum of list and identify duplicate number
for( int i = 0; i < A.size(); i++ )
{
int num = A.get(i);
if(seen[num  1])
rep_num = num;
seen[num  1] = true;
actual_sum += num;
}
miss_num = expected_sum  actual_sum + rep_num;
ArrayList<Integer> results = new ArrayList<Integer>();
results.add(rep_num);
results.add(miss_num);
return results;
}
}