Problem
I know there is a simpler way of doing this, but I just really can’t think of it right now. Can you please help me out?
String sample = "hello world";
char arraysample[] = sample.toCharArray();
int length = sample.length();
//count the number of each letter in the string
int acount = 0;
int bcount = 0;
int ccount = 0;
int dcount = 0;
int ecount = 0;
int fcount = 0;
int gcount = 0;
int hcount = 0;
int icount = 0;
int jcount = 0;
int kcount = 0;
int lcount = 0;
int mcount = 0;
int ncount = 0;
int ocount = 0;
int pcount = 0;
int qcount = 0;
int rcount = 0;
int scount = 0;
int tcount = 0;
int ucount = 0;
int vcount = 0;
int wcount = 0;
int xcount = 0;
int ycount = 0;
int zcount = 0;
for(int i = 0; i < length; i++)
{
char c = arraysample[i];
switch (c)
{
case 'a':
acount++;
break;
case 'b':
bcount++;
break;
case 'c':
ccount++;
break;
case 'd':
dcount++;
break;
case 'e':
ecount++;
break;
case 'f':
fcount++;
break;
case 'g':
gcount++;
break;
case 'h':
hcount++;
break;
case 'i':
icount++;
break;
case 'j':
jcount++;
break;
case 'k':
kcount++;
break;
case 'l':
lcount++;
break;
case 'm':
mcount++;
break;
case 'n':
ncount++;
break;
case 'o':
ocount++;
break;
case 'p':
pcount++;
break;
case 'q':
qcount++;
break;
case 'r':
rcount++;
break;
case 's':
scount++;
break;
case 't':
tcount++;
break;
case 'u':
ucount++;
break;
case 'v':
vcount++;
break;
case 'w':
wcount++;
break;
case 'x':
xcount++;
break;
case 'y':
ycount++;
break;
case 'z':
zcount++;
break;
}
}
System.out.println ("There are " +hcount+" h's in here ");
System.out.println ("There are " +ocount+" o's in here ");
Solution
Oh woah! xD It’s just.. woah! What patience you have to write all those variables.
Well, it’s Java so you can use a HashMap.
Write something like this:
String str = "Hello World";
int len = str.length();
Map<Character, Integer> numChars = new HashMap<Character, Integer>(Math.min(len, 26));
for (int i = 0; i < len; ++i)
{
char charAt = str.charAt(i);
if (!numChars.containsKey(charAt))
{
numChars.put(charAt, 1);
}
else
{
numChars.put(charAt, numChars.get(charAt) + 1);
}
}
System.out.println(numChars);
- We do a for loop over all the string’s characters and save the current char in the charAt variable.
- We check if our HashMap already has a charAt key inside it.
  - If it’s true, we just get the current value and add one; this means the string has already been found to have this char.
  - If it’s false (i.e. we never found a char like this in the string), we add a key with value 1 because we found a new char.
- Stop! Our HashMap will contain all chars found (keys) and how many times each one is repeated (values)!
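The check-then-put steps above can also be written as a single call. This is a minimal sketch assuming Java 8 or later, where `Map.merge` is available; the class name `MergeCounter` and its `count` method are made up for illustration and are not part of the original answer.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class, named here only for the example.
final class MergeCounter {
    // Counts every char in the input, including spaces and punctuation.
    static Map<Character, Integer> count(String str) {
        Map<Character, Integer> numChars = new HashMap<>();
        for (int i = 0; i < str.length(); i++) {
            // Stores 1 for a char seen for the first time,
            // otherwise adds 1 to the value already stored.
            numChars.merge(str.charAt(i), 1, Integer::sum);
        }
        return numChars;
    }

    public static void main(String[] args) {
        System.out.println(count("Hello World"));
    }
}
```

The behavior should match the loop above; `merge` just folds the containsKey/get/put sequence into one step.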
A possibly faster, and at least more compact, alternative to a HashMap is a good old integer array. A char can be cast to an int, which gives its UTF-16 code unit value (the same as the ASCII code for ASCII characters).
String str = "Hello World";
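// One slot per possible char value (0 through Character.MAX_VALUE), hence the + 1
int[] counts = new int[Character.MAX_VALUE + 1];
// If you are certain you will only have ASCII characters, `new int[128]` is enough (`new int[256]` also covers Latin-1)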
for (int i = 0; i < str.length(); i++) {
char charAt = str.charAt(i);
counts[(int) charAt]++;
}
System.out.println(Arrays.toString(counts));
As the above output is a bit big, by looping through the integer array you can output just the characters which actually occur:
for (int i = 0; i < counts.length; i++) {
if (counts[i] > 0)
System.out.println("Number of " + (char) i + ": " + counts[i]);
}
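If only the letters a to z matter, as in the original question, the array can shrink to 26 slots by subtracting 'a' from each char. The following is a sketch under that assumption; the class name `LetterArrayCounter` is hypothetical, and uppercase letters, digits and punctuation are deliberately skipped.

```java
// Hypothetical helper class, named here only for the example.
final class LetterArrayCounter {
    // Returns 26 counters: index 0 holds the count for 'a', index 25 for 'z'.
    static int[] count(String text) {
        int[] counts = new int[26];
        for (char c : text.toCharArray()) {
            // Anything outside 'a'..'z' is ignored.
            if (c >= 'a' && c <= 'z') {
                counts[c - 'a']++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = count("hello world");
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] > 0) {
                // Map the index back to its letter for display.
                System.out.println((char) ('a' + i) + ": " + counts[i]);
            }
        }
    }
}
```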
- Actually, there is an even better structure than maps and arrays for this kind of counting: Multisets. The Google Guava documentation mentions a very similar case:
The traditional Java idiom for e.g. counting how many times a word occurs in a document is something like:
Map<String, Integer> counts = new HashMap<String, Integer>();
for (String word : words) {
    Integer count = counts.get(word);
    if (count == null) {
        counts.put(word, 1);
    } else {
        counts.put(word, count + 1);
    }
}
This is awkward, prone to mistakes, and doesn’t support collecting a variety of useful statistics, like the total number of words. We can do better.
With a multiset you can get rid of the contains (or if (get(c) != null)) calls; all you need to call is a simple add in every iteration. Calling add the first time adds a single occurrence of the given element.
String input = "Hello world!";
Multiset<Character> characterCount = HashMultiset.create();
for (char c: input.toCharArray()) {
    characterCount.add(c);
}
for (Entry<Character> entry: characterCount.entrySet()) {
    System.out.println(entry.getElement() + ": " + entry.getCount());
}
(See also: Effective Java, 2nd edition, Item 47: Know and use the libraries. The author mentions only the JDK’s built-in libraries, but I think the reasoning could be true for other libraries too.)
-
int length = sample.length();
....
for (int i = 0; i < length; i++) {
    char c = arraysample[i];
You could replace these three lines with a foreach loop:
for (char c: arraysample) {
-
int length = sample.length();
....
for (int i = 0; i < length; i++) {
    char c = arraysample[i];
You don’t need the length variable; you could use sample.length() in the loop directly:
for (int i = 0; i < sample.length(); i++) {
The JVM is smart; it will optimize that for you.
-
char arraysample[] = sample.toCharArray();
int length = sample.length();
for (int i = 0; i < length; i++) {
    char c = arraysample[i];
It’s a little bit confusing that the loop iterates over arraysample but uses sample.length() as the upper bound. Although the two values are the same, it would be cleaner to use arraysample.length as the upper bound.
Yes… there is a simpler way. You have two choices, and they are about the same: use an array or a Map. The more advanced way of doing this would certainly be with a Map.
Think of a map as a type of array where, instead of using an integer to index the array, you can use anything. In our case here we’ll use char as the index. Because chars convert to ints you could just use a simple array in this case, and just mentally think of ‘a’ as 0, but we’re going to take the larger step today.
String sample = "Hello World!";
// Initialization
Map <Character, Integer> counter = new HashMap<Character, Integer>();
// Loop with a char: an int cannot be cast directly to Character
for (char c = 'a'; c <= 'z'; c++) {
counter.put(c, 0);
}
// Populate
for (int i = 0; i < sample.length(); i++){
if(counter.containsKey(sample.charAt(i)))
counter.put(sample.charAt(i), counter.get(sample.charAt(i)) + 1 );
}
Now, any time you want to know how many of a given character there were, just call this method:
int getCount(Character character){
if(counter.containsKey(character))
return counter.get(character);
else return 0;
}
Note: As written, this only counts the lowercase letters a to z. Uppercase letters and punctuation are ignored because they are never added as keys.
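If uppercase letters should count too, one option is to fold each letter to lowercase before storing it and to skip the pre-filling step. This is only a sketch of that idea: the class name `CaseInsensitiveCounter` is invented for the example, and it assumes Java 8 or later for `Map.getOrDefault`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class, named here only for the example.
final class CaseInsensitiveCounter {
    private final Map<Character, Integer> counter = new HashMap<>();

    CaseInsensitiveCounter(String sample) {
        for (char c : sample.toCharArray()) {
            // Skip spaces, digits and punctuation.
            if (Character.isLetter(c)) {
                char key = Character.toLowerCase(c);
                // A missing key is treated as a count of 0.
                counter.put(key, counter.getOrDefault(key, 0) + 1);
            }
        }
    }

    // Returns how often the letter occurred, ignoring case; 0 if never seen.
    int getCount(char character) {
        return counter.getOrDefault(Character.toLowerCase(character), 0);
    }
}
```

With this variant, `new CaseInsensitiveCounter("Hello World!").getCount('h')` would count the capital H as well.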
The use of functional programming can even simplify this problem to a great extent.
public static void main(String[] args) {
String myString = "hello world";
System.out.println(
myString
.chars()
.mapToObj(c -> String.valueOf((char) c))
.filter(str -> !str.equals(" "))
.collect(Collectors.groupingBy(ch -> ch, Collectors.counting()))
);
}
First we get the stream of integers from the string
myString.chars()
Next we transform each integer into a one-character string
mapToObj(c -> String.valueOf((char) c))
Then we filter out the characters we don’t need to consider; in the example above we filtered out the spaces.
filter(str -> !str.equals(" "))
Then finally we collect them grouping by the characters and counting them
collect(Collectors.groupingBy(ch -> ch, Collectors.counting()))
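A small variation on the pipeline above keeps the keys as Character rather than String and collects into a TreeMap so the output is sorted by letter. Treat it as a sketch, not the answer's own code: the class name `StreamCounter` is hypothetical, and the three-argument `Collectors.groupingBy` overload is used to supply the map type.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper class, named here only for the example.
final class StreamCounter {
    static Map<Character, Long> count(String text) {
        return text.chars()
                // Box each int code unit as a Character.
                .mapToObj(c -> (char) c)
                // Drop spaces, as in the pipeline above.
                .filter(ch -> ch != ' ')
                // Group equal characters and count each group; TreeMap sorts the keys.
                .collect(Collectors.groupingBy(
                        Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // prints {d=1, e=1, h=1, l=3, o=2, r=1, w=1}
        System.out.println(count("hello world"));
    }
}
```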