Problem
I know this question has been asked before now but in Python. Recently, I set out to write an anagram in C#. For instance, orchestra can be rearranged into carthorse and the anagram should not have more repeated than the original. The function is meant to check if the two words are the same and return true in this case. I started with putting the two words in two separate dictionaries with the letters as a key and the count as value.
The idea is using a dictionary ensures the keys can not be repeated. Then I looped through the second dictionary (secondString
) and checked if each letter existed as a key in the first dictionary (firstString
). Additionally, if the letter existed, the code checks if they have the same count. The rest of the code is self-explanatory.
using System;
using System.Collections.Generic;
public class AreAnagrams
{
public static bool AreStringsAnagrams(string a, string b)
{
Dictionary<char,int> firstString = new Dictionary<char,int>();
Dictionary<char,int> secondString = new Dictionary<char,int>();
foreach(char character in a)
{
if(firstString.ContainsKey(character)== true)
{
firstString[character]+=1;
}
else
firstString[character]=1;
}
foreach(char character2 in b)
{
if(secondString.ContainsKey(character2)== true)
{
secondString[character2]+=1;
}
else
secondString[character2]=1;
}
foreach(KeyValuePair<char,int> letterValue in secondString)
{
if(firstString.ContainsKey(letterValue.Key))
{
if(firstString[letterValue.Key] != secondString[letterValue.Key]){
return false;
}
}
else
{
return false;
}
}
//throw new NotImplementedException("Waiting to be implemented.");
return true;
}
public static void Main(string[] args)
{
Console.WriteLine(AreStringsAnagrams("momdad", "dadmom"));
}
}
But I scored 75% and I feel this can be improved on. Also, is there any easy conversion from string to dictionary?
Solution
I took your new code and cleaned it up a bit by
- adjusting bracket location
- removed unnecessary nesting of if statements
- gave some operations some breathing room
- fixed some odd indentations
I took your code
using System;
using System.Collections.Generic;
public class AreAnagrams
{
public static bool AreStringsAnagrams(string a, string b)
{
Dictionary<char,int> firstString = new Dictionary<char,int>();
Dictionary<char,int> secondString = new Dictionary<char,int>();
foreach(char character in a)
{
if(firstString.ContainsKey(character))
{
firstString[character]+=1;
}
else{
firstString[character]=1;
}
}
foreach(char character2 in b)
{
if(secondString.ContainsKey(character2))
{
secondString[character2]+=1;
}
else{
secondString[character2]=1;
}
}
if(firstString.Count !=secondString.Count){
return false;
}
else {
foreach(KeyValuePair<char,int> letterValue in secondString)
{
if(firstString.ContainsKey(letterValue.Key))
{
if(firstString[letterValue.Key] != secondString[letterValue.Key]){
return false;
}
}
else
{
return false;
}
}
}
return true;
}
public static void Main(string[] args)
{
Console.WriteLine(AreStringsAnagrams("momdad", "dadmom"));
}
}
and refactored it to this (there is an issue with the if then statement inside last for loop, which I address later)
public class AreAnagrams
{
public static bool AreStringsAnagrams(string a, string b)
{
Dictionary<char,int> firstString = new Dictionary<char,int>();
Dictionary<char,int> secondString = new Dictionary<char,int>();
foreach(char character in a)
{
if(firstString.ContainsKey(character))
{
firstString[character] += 1;
}
else
{
firstString[character] = 1;
}
}
foreach(char character2 in b)
{
if(secondString.ContainsKey(character2))
{
secondString[character2] += 1;
}
else
{
secondString[character2] = 1;
}
}
if(firstString.Count != secondString.Count)
{
return false;
}
else
{
foreach(KeyValuePair<char,int> letterValue in secondString)
{
if(firstString.ContainsKey(letterValue.Key)
&& (firstString[letterValue.Key] != secondString[letterValue.Key]))
{
return false;
}
else
{
return false;
}
}
}
return true;
}
public static void Main(string[] args)
{
Console.WriteLine(AreStringsAnagrams("momdad", "dadmom"));
}
}
I also decided to change this so that there are two separate if statements, this will reduce the O(n) in the optimistic case
(Your code cleaned up)
foreach(KeyValuePair<char,int> letterValue in secondString)
{
if(firstString.ContainsKey(letterValue.Key))
{
if(firstString[letterValue.Key] != secondString[letterValue.Key]){
return false;
}
}
else
{
return false;
}
}
New Code with less complexity
foreach(KeyValuePair<char,int> letterValue in secondString)
{
if (!firstString.ContainsKey(letterValue.Key))
{
return false;
}
if (firstString[letterValue.Key] != secondString[letterValue.Key])
{
return false;
}
}
You could merge these into a single if statement, but for the sake of reading I left them separate here.
If one doesn’t contain the key of the other one, you want to return false, no need to check the other condition.
Another overall thing that you could do to reduce the O(n) is to evaluate if the strings are the same length, immediately and return false if they are not, because you already know they cannot be anagrams of each other.
like this
using System;
using System.Collections.Generic;
public class AreAnagrams
{
public static bool AreStringsAnagrams(string a, string b)
{
if (a.Length != b.Length) {
return false;
}
Dictionary<char,int> firstString = new Dictionary<char,int>();
Dictionary<char,int> secondString = new Dictionary<char,int>();
/// ...
then you can remove the check later in the code.
I did that, and also changed += 1
to ++
twice in your code. I also made it case insensitive, but sentences will not work.
Here is the code that works, I tested with your example and with Joker
and JReko
if (a.Length != b.Length)
{
return false;
}
Dictionary<char, int> firstString = new Dictionary<char, int>();
Dictionary<char, int> secondString = new Dictionary<char, int>();
foreach (char character in a.ToLower())
{
if (firstString.ContainsKey(character))
{
firstString[character]++;
}
else
{
firstString[character] = 1;
}
}
foreach (char character2 in b.ToLower())
{
if (secondString.ContainsKey(character2))
{
secondString[character2]++;
}
else
{
secondString[character2] = 1;
}
}
foreach (KeyValuePair<char, int> letterValue in secondString)
{
if (!firstString.ContainsKey(letterValue.Key)
|| (firstString[letterValue.Key] != secondString[letterValue.Key]))
{
return false;
}
}
return true;
There seems to be a lot of answers but not much explaining why one would be better than the others. Generally when improving on a solution, you either want to cut down on memory usage or execution time. Memory usage is pretty easy to track and in Computer Science we have Big-O notation to figure out worst case execution time.
I’ll preface this with I’m doing this all in my head so I maybe off here and there.
So lets take the final code that Malachi provided and first go through worst case execution time.
- Populating firstString dictionary – O(n)
- Populating secondString dictionary – O(n)
- Then we iterate over the second dictionary – O(n)
- Do a look up in the dictionary – O(1)
And finally return
So execution wise this is O(3n) which would just simplify down to O(n)
Memory wise we are creating a lot of pointers here.
Each dictionary is a pointer, the pointer to the key and the pointer to the value are all either 32bit or 64bit depending on your .Net system. Then the iteration of the KeyValuePair creates yet another pointer along with some underlying logic to create the class that holds those values (I’m too lazy to go look up the open source code).
Overall we have
- 1 pointer for each dictionary
- n pointers for each character in the key
- n pointers for the values
- n char (which is an int)
- n int
And all that multiplied by 2 since we have two of them
So on a 64 bit machine where pointers are 8 bytes and ints are 4 bytes this ends up being 2 * (8 + 8*n + 8*n + 4*n + 4*n) and this doesn’t even include the creation of the KeyValuePair stuff.
Now lets take a look at something that maybe a bit less execution efficient but likely more memory efficient.
public static bool AreStringsAnagrams(string a, string b)
{
if (a == b)
{
return true;
}
if (a == null || b == null)
{
return false;
}
if (a.Length != b.Length)
{
return false;
}
var charInA = a.ToCharArray();
var charInB = b.ToCharArray();
Array.Sort(charInA);
Array.Sort(charInB);
for (int i = 0; i < charInA.Length; i++)
{
if (charInA[i] != charInB[i])
{
return false;
}
}
return true;
}
static void Main(string[] args)
{
Console.Out.WriteLine(AreStringsAnagrams("orchestra", "carthorse"));
Console.ReadKey();
}
So worst case run time here would be:
- Copy all characters in string a to array – O(n)
- Copy all characters in string b to array – O(n)
- Sort charInA – O(nlog(n))
- Sort charInb – O(nlog(n))
- Iterate through both sorted arrays and see if there are any differences – O(n)
Memory would be the following:
- Copy all characters in string a to array – n * 4 bytes
- Copy all characters in string b to array – n * 4 bytes
Total 8 * n bytes.
So it all depends on what you want to optimize here memory or time.
And there are ways of combining both solutions, for example:
public static bool AreStringsAnagrams(string a, string b)
{
if (a == b)
{
return true;
}
if (a == null || b == null)
{
return false;
}
if (a.Length != b.Length)
{
return false;
}
Dictionary<char, int> firstString = new Dictionary<char, int>();
foreach (char character in a)
{
if (firstString.ContainsKey(character))
{
firstString[character]++;
}
else
{
firstString[character] = 1;
}
}
for (int i = 0; i < b.Length; i++)
{
if (!firstString.ContainsKey(b[i]))
{
return false;
}
else
{
firstString[b[i]]--;
}
if (firstString[b[i]] < 0)
{
return false;
}
}
return true;
}
which reduces the number of dictionaries to 1 but still maintains an O(n) worst case runtime.
Data collections are useful but tend to be memory intensive if used incorrectly. Just something to keep in mind :-).
Also broader community, it’s pretty easy to see when someone is posting a homework assignment. I’m posting this larger analysis because the poster said they already got a 100%. Is there a policy regarding doing other peoples homework questions on this site? Seems like we should guide people to the answer in that situation, not give them the answer… or else I wouldn’t have learned anything in university!
It looks like you got your answer for making it more efficient. Here is how you can use LINQ to make your code (in my opinion) more concise, safer, and more readable.
To create a dictionary with the count of each letter, you can group a collection of letters by their value, get each group’s count, and then convert the result into a dictionary:
Dictionary<char,int> firstString = a.GroupBy(x => x).ToDictionary(g => g.Key, g => g.Count());
The body of the foreach loop that helps determine if the strings are anagrams can be condensed into one if statement:
if(!firstString.ContainsKey(letterValue.Key) ||
firstString[letterValue.Key] != secondString[letterValue.Key])
{
return false;
}
You can actually replace that foreach loop and the if statement above it with code similar to this answer to determine if two dictionaries are equal:
return firstString.Count == secondString.Count && !firstString.Except(secondString).Any();
Edit: You can check if the strings are the same length right away to avoid doing extra work when they aren’t. Thanks to JonathanR for the suggestion.
The class then looks like this:
using System;
using System.Collections.Generic;
using System.Linq;
public class AreAnagrams
{
public static bool AreStringsAnagrams(string a, string b)
{
if(a.Count() != b.Count())
{
return false;
}
Dictionary<char,int> firstString = a
.GroupBy(x => x)
.ToDictionary(g => g.Key, g => g.Count());
Dictionary<char,int> secondString = b
.GroupBy(x => x)
.ToDictionary(g => g.Key, g => g.Count());
return firstString.Count == secondString.Count &&
!firstString.Except(secondString).Any();
}
public static void Main(string[] args)
{
Console.WriteLine(AreStringsAnagrams("momdad", "dadmom"));
}
}
If you want, you can extract the code that converts a string to a dictionary into a method, since it’s used twice. Because the string lengths are checked right away, firstString.Count == secondString.Count &&
is not necessary if you want to leave that out.
Edit 2: k2snowman69 pointed out that you could sacrifice some speed for less memory usage by avoiding dictionaries and instead sorting both strings and comparing each character in sequence. Now I have an excuse to post this solution, which not only does that, but is shorter and a lot more readable:
public static bool AreStringsAnagrams(string a, string b)
{
return a.Count() == b.Count() && a.OrderBy(x => x).SequenceEqual(b.OrderBy(x => x));
}
How about this:
bool AreAnagrams(string a, string b)
{
return Enumerable.SequenceEqual(
a.Normalize(NormalizationForm.FormC).OrderBy(c => c),
b.Normalize(NormalizationForm.FormC).OrderBy(c => c));
}
Shorter and very readable. First it normalizes the string, which solves the issue raised in RobH’s answer. Then it arranges the characters of each string into alphabetical order, which should return the same sequence for both strings if they are anagrams. Finally it compares those two sequences.
Your original attempt, and a lot of these answers, seem to involve a hell of a lot of code. I would suggest that unless you’re looking for micro-optimised performance, this is a good implementation.
After adding a little detail based on Jonathan’s help. The full code is
using System;
using System.Collections.Generic;
public class AreAnagrams
{
public static bool AreStringsAnagrams(string a, string b)
{
Dictionary<char,int> firstString = new Dictionary<char,int>();
Dictionary<char,int> secondString = new Dictionary<char,int>();
foreach(char character in a)
{
if(firstString.ContainsKey(character))
{
firstString[character]+=1;
}
else{
firstString[character]=1;
}
}
foreach(char character2 in b)
{
if(secondString.ContainsKey(character2))
{
secondString[character2]+=1;
}
else{
secondString[character2]=1;
}
}
if(firstString.Count !=secondString.Count){
return false;
}
else {
foreach(KeyValuePair<char,int> letterValue in secondString)
{
if(firstString.ContainsKey(letterValue.Key))
{
if(firstString[letterValue.Key] != secondString[letterValue.Key]){
return false;
}
}
else
{
return false;
}
}
}
return true;
}
public static void Main(string[] args)
{
Console.WriteLine(AreStringsAnagrams("momdad", "dadmom"));
}
}