Problem
I am working on Leetcode challenge #3 (https://leetcode.com/problems/longest-substring-without-repeating-characters/)
Here is my solution using sliding window and a dictionary. I specifically added start = seen[s[i]]+1
to skip ahead. I am still told I am far slower than most people (for example, given abcdefgdabc
, I am skipping abc
when I see the second d
. I thought this would save ton of time, but apparently this algorithm has a poor run time.
class Solution(object):
def lengthOfLongestSubstring(self, s):
"""
:type s: str
:rtype: int
"""
seen = {}
start = 0
max_size = 0
# check whether left pointer has reached the end yet
while start < len(s):
size = 0
for i in range(start, len(s)):
if s[i] in seen:
start = seen[s[i]]+1
seen = {}
break
else:
seen[s[i]] = i
size += 1
max_size = max(size, max_size)
return max_size
UPDATE
i = 0
j = 0
seen = {}
max_length = 0
while j < len(s):
if s[j] not in seen:
seen[s[j]] = j
j += 1
max_length = max(max_length, j-i)
else:
i = j = seen[s[j]] + 1
seen = {}
return max_length
Solution
In this loop, while start < len(s):
, the string s
is not changing, so the length of the string is not going to change either. Yet, for each iteration, you will evaluate len(s)
in order to evaluate the loop condition. len(s)
is undoubtably a fast operation, but you should still cache the result to avoid the function call every iteration.
limit = len(s)
while start < limit:
# ...
The loop for i in range(start, len(s)):
, you are again calling len(s)
for each iteration (of the outer loop). You could replace this with for i in range(start, limit):
, but we will still have a different issue…
You are (mostly) not using i
inside the loop; you are using s[i]
. And you look up the character s[i]
in 3 different places inside the loop. Instead of looping over the indices, and looking up the character at the index, you should directly loop over the characters of the string.
for ch in s[start:]:
# ...
Except for that pesky seen[...] = i
statement. You still want the index. The enumerate()
function can be used to count as you are looping:
for i, ch in enumerate(s[start:], start):
if ch in seen:
start = seen[ch]+1
seen = {}
break
else:
seen[ch] = i
size += 1
Hint
When scanning the string abcdefgdhijklmn
, after encountering the second d
, you reset to the character after the first d
, and continue your scan. But …
Do you have to? Aren’t all the characters between the first d
and the second d
unique? Is there any way you could preserve that information, and continue on, without needing to restart with the e
? Maybe you don’t need nested loops!