# Determine if a string has all unique characters

Posted on

Problem

I’m practicing algorithms (for future interviews) while learning about Big O notation and was wondering if there’s a faster or more optimal way of writing an algorithm to see if a string has all unique characters (assuming you can’t use additional data structures).

- (BOOL)containsUniqueCharacters:(NSString *)string {

//Make sure that it's case insensitve
string = [string uppercaseString];

//Create a buffer for our string
NSUInteger length = [string length];
unichar buffer[length];
[string getCharacters:buffer range:NSMakeRange(0, length)];

//Iterate through each of our characters
for (int i = 0; i < length; i++) {

//Iterate through every other character and compare it with our current character
for (int n = 0; n < length; n++) {

//Compare current character with every other character
if (buffer[i] == buffer[n] && i != n) {
return NO;
}
}
}

return YES;
}


If I’m correct, is the time complexity for this O(n2)$O\left({n}^{2}\right)$$O(n^2)$, and the space complexity O(1)$O\left(1\right)$$O(1)$?

Solution

You are right on the time complexity, but wrong on the space complexity.

because you copy the data from the String to the buffer, you incur an O(n)$O\left(n\right)$$O(n)$ space complexity too.

vnp has suggested an alternate algorithm that scales at better than O(n2)$O\left({n}^{2}\right)$$O(n^2)$, but you can also significantly improve your current algorithm. Your current system compares each character with every other character twice…. Consider the word “and”, you will compare a with n and d, then compare n with a and d, but you have already done the n and a comparison.

What you need to do is alter your inner loop to restrict the range of the comparison. Consider the following:

//Iterate through each of our characters
// **Note, start from 1
for (int i = 1; i < string.length; i++) {

unichar letter = [string characterAtIndex:i];

//Iterate through all previous other characters and compare it with our current character
for (int n = 0; n < i; n++) {

//Compare current character with every other character
if (letter == [string characterAtIndex:n]) {
return NO;
}
}
}


The above code still checks each letter, but only does it once. It is smart about the indices, so you only compare previous letters with the current, and it does half the comparisons. Also, by using a starting offset of 1, and a < comparison on the n loop, you never compare the character against itself.

For smallish strings (less than 100 characters or so), this O(n2)$O\left({n}^{2}\right)$$O(n^2)$ algorithm, using no additional space (other than the letter variable), will be very fast still.

I would suggest an algorithmically better O(n) solution by using a buffer or hashtable or other such structure to record whether or not you have “seen” a character.

For example if you are dealing with a basic string containing char, then you can get away with a bitset of size 256. Each character you see marks a bit. If the bit was already marked, return YES.

Furthermore, if the size of the string you’re testing is longer than 256, then that is an automatic NO (not all unique chars) as well.

Since you are dealing with unicode for which the possible values a character can take are very many, then you will want to use a hash table. It will be dramatically faster than even any O(n log n) sorting method.

Your analysis is correct. Sorting an original string, followed by a linear scan to fund duplicates, reduces the time complexity to O(nlogn))$O\left(n\mathrm{log}n\right)\right)$$O(n log n))$, for a price of O(logn)$O\left(\mathrm{log}n\right)$$O(log n)$ space. If the string modification is not allowed, an additional O(n)$O\left(n\right)$$O(n)$ space penalty is incurred.