Problem
I recently interviewed with a company for software engineering position. I was asked the question of finding the longest unique sub-string in a string.
My algorithms was as follows:
Start from the left-most character, and keep storing the character in a hash table with the key as the character and the value as the index_where_it_last_occurred. Add the character to the answer string as long as its not present in the hash table. If we encounter a stored character again, I stop and note down the length. I empty the hash table and then start again from the right index of the repeated character. The right index is retrieved from the (index_where_it_last_occurred) flag. If I ever reach the end of the string, I stop and return the longest length.
For example, say the string was abcdecfg
.
I start with a
, store in hash table. I store b and so on till e. Their indexes are stored as well. When I encounter c
again, I stop since it’s already hashed and note down the length which is 5. I empty the hash table, and start again from the right index of the repeated character. The repeated character being c
, I start again from the position 3 ie., the character d
. I keep doing this while I don’t reach the end of string.
I know there could be better algorithms than this. But I am interested in knowing what the time complexity of this algorithm will be. I believe it’ll be O(n2).
import java.util.*;
public class longest
{
static int longest_length = -1;
public static void main(String[] args)
{
Scanner in = new Scanner(System.in);
String str = in.nextLine();
calc(str,0);
System.out.println(longest_length);
}
public static void calc(String str,int index)
{
if(index >= str.length()) return;
int temp_length = 0;
LinkedHashMap<Character,Integer> map = new LinkedHashMap<Character,Integer>();
for (int i = index; i<str.length();i++)
{
if(!map.containsKey(str.charAt(i)))
{
map.put(str.charAt(i),i);
++temp_length;
}
else if(map.containsKey(str.charAt(i)))
{
if(longest_length < temp_length)
{
longest_length = temp_length;
}
int last_index = map.get(str.charAt(i));
// System.out.println(last_index);
calc(str,last_index+1);
break;
}
}
if(longest_length < temp_length)
longest_length = temp_length;
}
}
Solution
static int longest_length = -1;
No. Static is useless here and makes you method much less usable. A non-static variable would be better, but actually, why not return a result???
There’s no longest_length
in Java. Maybe longestLength
.
Better, return the substring, the length can be easily obtained from it.
Separated printing and computation, good.
if(index >= str.length()) return;
Spacing (after if
). Also according to conventions, you should always use braces (I personally don’t).
LinkedHashMap<Character,Integer> map = new LinkedHashMap<Character,Integer>();
Use Java 7 diamonds or Guava’s Maps.newLinkedHashMap()
to save yourself repeated generics.
You need no linked map.
calc(str,last_index+1);
I find the recursion confusing here. You really do nothing, but a simple loop, so use a loop (Even a goto
would be clearer than recursion here).
The algorithm indeed runs in O(n*n)
, which can be shown with a string like
abcde.... abcde....
where you always run through half of the string. I was assuming an unlimited alphabet here, more precisely the complexity is O(n*alphabetSize)
as no runs can be longer than alphabetSize
.
A much better O(n)
algorithm would keep track of the starting and ending positions. Wherever you see a duplicate, advance the start. Sounds damn simple.