Problem
I’ve a list of names that I’d like to iterate over and extract from it those names that have two letters that are the same right next to each other (i.e. Bill, Terry, Debby, Aaron, etc.).
The below code I have accomplishes this task, however I was wondering if there was a more efficient or line-friendly method of approaching this problem.
I started thinking about a solution involving LINQ’s Where<>
method, but then I wasn’t sure how to use both a string
variable and an int
index variable inside a lambda expression.
Also, does there exist a way to accomplish this task in faster than O(n^2)
time?
public static List<string> DoubleLetters(string[] array)
{
List<string> doubleLetter = new List<string>();
for (int i = 0; i < array.Length; i++)
{
for (int j = 0; j < array[i].Length - 1; j++)
{
if (char.ToLower(array[i][j]) == char.ToLower(array[i][j + 1]))
{
doubleLetter.Add(array[i]);
break;
}
}
}
return doubleLetter;
}
Solution
- The outer
for
loop can be replaced with aforeach
loop:foreach (var name in names)
. It removes the need for two counter variables, and simplifies indexing of the array, making the code easier to read. - This method would be more flexible if it accepted a single string and returned a boolean:
bool HasRepeatingLetters(string str)
. That allows you to use it for single names as well as for multiple names. - For multiple names, the above method can easily be combined with Linq:
var matchingNames = names.Where(name => HasRepeatingLetters(name)).ToArray()
(you can remove theToArray
(orToList
) call if you want lazy evaluation). In this case, you don’t even need an anonymous function, because the signature ofHasRepeatingLetters
already matches whatWhere
requires:names.Where(HasRepeatingLetters)
. - Keep in mind that
ToLower
is culture-dependent (not all languages use the same upper/lowercase conversion rules). You may want to give your function an extra (optional)CultureInfo
argument to control its behavior. - Performance of your function should be
O(n)
, notO(n^2)
. Passing in twice as many strings, or strings that are twice as long, requires twice as much work, not four times as much. It would only beO(n^2)
if, for example, you compared each string against every other string. - Calling
ToLower
on the given string would simplify your code, and it should be a bit faster too for names without repeating characters. You could also keep yourfor
loop, but store the previous lower-cased character as you go, so you don’t need to convert it to lower-case twice. That would make the code a little more complicated, so it’s probably not worth the hassle. Of course, if performance is important, you should actually measure it.
Here is a LINQ version using Zip.
Enumerate over each character in a word and compares adjacent characters by offsetting, one enumeration, by one character, using ‘Skip(1)’.
Pair characters with ‘Zip’.
A match on ‘Any’ pair confirms.
var doubleLetters =
array
.Where(word => word.Zip(
word.Skip(1),
(ch1, ch2) => char.ToUpper(ch1) == char.ToUpper(ch2))
.Any(
match => match))
.ToList();
I was able to find a solution to my question through the use of lovely, lovely regular expressions (and LINQ, conveniently enough)!
I was able to extract the names Terry
, Bill
, Aaron
, and Debby
using the below line of code:
string[] newArray = array.Where(x => new Regex(@"((?i)[a-zd])1").IsMatch(x)).ToArray();
…where
(?i)
– Ignore case of letter in the name(a-zd)
– Look for any letter of the alphabet1
– Repeat search of(a-zd)
group
Still not too sure about time complexity, though. Regex time complexity eludes me 😛
For a while I thought you code was broken but I had a bad test
I think this would work but not much different than your
I only wrote it because I thought your was broken
public static List<string> DoubleLetters2(string[] array)
{
List<string> doubleLetter = new List<string>();
foreach(string s in array)
{
if (DoubleLettersHelper(s))
{
doubleLetter.Add(s);
}
}
return doubleLetter;
}
public static bool DoubleLettersHelper(string array)
{
string lower = array.ToLower();
for (int i = 0; i < lower.Length-1; i++)
{
if (lower[i] == lower[i + 1])
{
return true;
}
}
return false;
}