# Finding the longest unique sub-string in a string

Posted on

Problem

I recently interviewed with a company for software engineering position. I was asked the question of finding the longest unique sub-string in a string.

My algorithms was as follows:

Start from the left-most character, and keep storing the character in a hash table with the key as the character and the value as the index_where_it_last_occurred. Add the character to the answer string as long as its not present in the hash table. If we encounter a stored character again, I stop and note down the length. I empty the hash table and then start again from the right index of the repeated character. The right index is retrieved from the (index_where_it_last_occurred) flag. If I ever reach the end of the string, I stop and return the longest length.

For example, say the string was abcdecfg.

I start with a, store in hash table. I store b and so on till e. Their indexes are stored as well. When I encounter c again, I stop since it’s already hashed and note down the length which is 5. I empty the hash table, and start again from the right index of the repeated character. The repeated character being c, I start again from the position 3 ie., the character d. I keep doing this while I don’t reach the end of string.

I know there could be better algorithms than this. But I am interested in knowing what the time complexity of this algorithm will be. I believe it’ll be O(n2)$O\left({n}^{2}\right)$$O(n^2)$.

import java.util.*;
public class longest
{
static int longest_length = -1;
public static void main(String[] args)
{
Scanner in = new Scanner(System.in);
String str = in.nextLine();
calc(str,0);
System.out.println(longest_length);
}

public static void calc(String str,int index)
{
if(index >= str.length()) return;
int temp_length = 0;
for (int i = index; i<str.length();i++)
{
if(!map.containsKey(str.charAt(i)))
{
map.put(str.charAt(i),i);
++temp_length;
}
else if(map.containsKey(str.charAt(i)))
{
if(longest_length < temp_length)
{
longest_length = temp_length;
}
int last_index = map.get(str.charAt(i));
//              System.out.println(last_index);
calc(str,last_index+1);
break;
}
}
if(longest_length < temp_length)
longest_length = temp_length;
}
}


Solution

static int longest_length = -1;


No. Static is useless here and makes you method much less usable. A non-static variable would be better, but actually, why not return a result???

There’s no longest_length in Java. Maybe longestLength.

Better, return the substring, the length can be easily obtained from it.

Separated printing and computation, good.

if(index >= str.length()) return;


Spacing (after if). Also according to conventions, you should always use braces (I personally don’t).

LinkedHashMap<Character,Integer> map = new LinkedHashMap<Character,Integer>();


Use Java 7 diamonds or Guava’s Maps.newLinkedHashMap() to save yourself repeated generics.

calc(str,last_index+1);


I find the recursion confusing here. You really do nothing, but a simple loop, so use a loop (Even a goto would be clearer than recursion here).

The algorithm indeed runs in O(n*n), which can be shown with a string like

abcde.... abcde....


where you always run through half of the string. I was assuming an unlimited alphabet here, more precisely the complexity is O(n*alphabetSize) as no runs can be longer than alphabetSize.

A much better O(n) algorithm would keep track of the starting and ending positions. Wherever you see a duplicate, advance the start. Sounds damn simple.