Problem
https://leetcode.com/problems/is-subsequence/
Given a string s and a string t, check if s is subsequence of t.
A subsequence of a string is a new string which is formed from the original string by deleting some (can be none) of the characters without disturbing the relative positions of the remaining characters. (ie, “ace” is a subsequence of “abcde” while “aec” is not).
Follow up:
If there are lots of incoming S, say S1, S2, … , Sk where k >= 1B, and you want to check one by one to see if T has its subsequence. In this scenario, how would you change your code?
Credits:
Special thanks to @pbrother for adding this problem and creating all test cases.
Example 1:
Input: s = “abc”, t = “ahbgdc”
Output: true
Example 2:
Input: s = “axc”, t = “ahbgdc”
Output: false
Constraints:
0 <= s.length <= 100
0 <= t.length <= 10^4
Both strings consists only of lowercase characters.
Please review for performance.
using System.Collections.Generic;
using Microsoft.VisualStudio.TestTools.UnitTesting;
namespace RecurssionQuestions
{
/// <summary>
/// https://leetcode.com/problems/is-subsequence/
/// </summary>
[TestClass]
public class IsSubsequenceTest
{
[TestMethod]
public void TestMethod1()
{
string s = "abc", t = "ahbgdc";
Assert.IsTrue(IsSubsequence(s, t));
Assert.IsTrue(IsSubsequence2(s, t));
}
[TestMethod]
public void TestMethod2()
{
string s = "axc", t = "ahbgdc";
Assert.IsFalse(IsSubsequence(s, t));
Assert.IsFalse(IsSubsequence2(s, t));
}
public bool IsSubsequence(string s, string t)
{
return Helper(s, t, 0, 0);
}
/// <summary>
/// if the letters are equal (letter is found) continue to the next letter of s
/// if the letter is not found in t continue to the next letter of t.
/// if you reached the end of s, then s is a subsequence of t.
/// if you reach the end of t. then s is not a subsequence of t
/// </summary>
/// <param name="s"></param>
/// <param name="t"></param>
/// <param name="sIndex"></param>
/// <param name="tIndex"></param>
/// <returns></returns>
private bool Helper(string s, string t, int sIndex, int tIndex)
{
if (sIndex == s.Length)
{
return true;
}
if (tIndex == t.Length)
{
return false;
}
if (s[sIndex] == t[tIndex])
{
return Helper(s, t, sIndex + 1, tIndex + 1);
}
else
{
return Helper(s, t, sIndex, tIndex + 1);
}
}
public bool IsSubsequence2(string s, string t)
{
if (string.IsNullOrEmpty(s) || string.IsNullOrEmpty(t))
{
return false;
}
Dictionary<char, List<int>> map = new Dictionary<char, List<int>>();
//pre-process t
for (int i = 0; i < t.Length; i++)
{
if (!map.ContainsKey(t[i]))
{
map.Add(t[i], new List<int>());
}
map[t[i]].Add(i);
}
int prev = -1;
for (int i = 0; i < s.Length; i++)
{
if (!map.ContainsKey(s[i]))
{
return false;
}
else
{
var list = map[s[i]];
prev = BinarySearch(prev, list, 0, list.Count - 1);
if (prev == -1)
{
return false;
}
prev++;
}
}
return true;
}
private int BinarySearch(int prev, List<int> list, int start, int end)
{
while (start <= end)
{
int mid = (end - start) / 2 + start;
if (list[mid] < prev)
{
start = mid + 1;
}
else
{
end = mid - 1;
}
}
if (start == list.Count)
{
return -1;
}
return list[start];
}
}
}
Solution
From a performance point of perspective, IsSubsequence
will always be faster than IsSubsequence2
, because it only needs to iterate 2 arrays simultaneously and has no further allocations on the heap.
If you want to improve on performance and also make sure, you never get a StackOverFlowException
, you should get rid of the recursion in your Helper
method and instead use a loop.
public static bool IsSubsequence3(string s, string t)
{
var sIndex = 0;
for (var tIndex = 0; tIndex < t.Length; tIndex++)
{
if (s[sIndex] == t[tIndex])
sIndex++;
if(sIndex == s.Length)
return true;
}
return false;
}
When running a Debug build, using the input string s = "abc", t = "ahbgdc";
and executing the Method more than 1,000,000 times, IsSubsequence4
becomes faster. However when you run it in a Release build, the performance is the same, as compiler optimizations kick in and they will implicitly do what is written here explicitly (trade the property access for a local field).
public static bool IsSubsequence4(string s, string t)
{
var sIndex = 0;
var sLength = s.Length;
var tLength = t.Length;
for (var tIndex = 0; tIndex < tLength; tIndex++)
{
if (s[sIndex] == t[tIndex])
sIndex++;
if(sIndex == sLength)
return true;
}
return false;
}
I also tried using AsSpan()
and ToCharArray()
and the strings to get rid of the callvirt System.String.get_Chars
in the IL code and I tried using an Enumerator. Both where slower than IsSubsequence3
public static bool IsSubsequence5(string s, string t)
{
var sIndex = 0;
var sLength = s.Length;
var tLength = t.Length;
var sChars = s.AsSpan();
var tChars = t.AsSpan();
for (var tIndex = 0; tIndex < tLength; tIndex++)
{
if (sChars[sIndex] == tChars[tIndex])
sIndex++;
if(sIndex == sLength)
return true;
}
return false;
}
public static bool IsSubsequence6(string s, string t)
{
using var sEnum = s.GetEnumerator();
sEnum.MoveNext();
using var tEnum = t.GetEnumerator();
while (tEnum.MoveNext())
{
if (sEnum.Current == tEnum.Current)
{
if(!sEnum.MoveNext())
return true;
}
}
return false;
}