Cops and robbers program to find the most frequent character in a string

Posted on

Problem

I made this program to train me for a competition. The hypothetical question I’ve used was

The cops need help finding who is the most probable criminal to commit a crime.
Witnesses tell them multiple different people, making it hard to find the criminal.
Use your knowledge of coding and the one-letter identifications provided for the criminals
to find who most likely did it and print.

It was fun making the code; however, it felt like it took too long for such a simple code. Is there any way to make this simpler or better?

import java.util.*;

public class CopsAndRobbers {

    public static void main(String[] args) {
        //The cops need help finding who is the most probable criminal to commit a crime
        //Witnesses tell them multiple different people, making it hard to find the criminal
        //Use your knowledge of coding and the one-letter identifications provided for the criminals
        //To find who most likely did it and print.

        TreeSet<String> done = new TreeSet<String>();
        ArrayList<String> all = new ArrayList<String>();
        ArrayList<Integer> num = new ArrayList<Integer>();
        ArrayList<String> some = new ArrayList<String>();
        @SuppressWarnings("resource")
        Scanner sc = new Scanner(System.in);
        String input = sc.nextLine();

        for(int i = 0; i < input.length(); i++)
        {
            done.add(String.valueOf(input.charAt(i)));
            all.add(String.valueOf(input.charAt(i)));
        }
        int amount = done.size();
        for(int i = 0; i < amount; i++)
        {
            String letter = done.first();
            int tempNum = 0;
            for(int j = 0; j < all.size(); j++)
            {
                if(all.get(j).equals(letter))
                {
                    tempNum++;
                }
            }
            num.add(tempNum);
            some.add(letter);
            done.remove(letter);
        }
        int highest = 0;
        int overallIndex = 0;
        for(int i = 0; i < num.size(); i++)
        {
            if(num.get(i) > highest)
            {
                highest = num.get(i);
                overallIndex = i;
            }
        }
        System.out.println("The most likely criminal is " + some.get(overallIndex) + " with " + highest + " witnesses.");
    }

}

Sample Input

aggdhaamfa

Sample Output

The most likely criminal is a with 4 witnesses.

Solution

try-with-resources

Since Java 7, you can use try-with-resources on your Scanner for safe and efficient handling of the underlying I/O resource.

// Scanner scanner = new Scanner(System.in);
// ...
// scanner.close();
try (Scanner scanner = new Scanner(System.in)) {
    // ...
}

Interfaces over implementations, and generic type inference

Declarations should be done as List instead of ArrayList, and since Java 7, you are able to rely on generic type inference for simplification too:

// ArrayList<String> list = new ArrayList<String>();
List<String> list = new ArrayList<>();

Index-based retrieval across different List instances

Your usage of the num and some lists are merely the keys and values to what should be modeled as a Map, i.e. the number of occurrences to the ‘smallest’

Problem formulation (and unit testing)

What will be the expected solution if there is an equal number of witnesses for two suspects? Or when there are no witnesses?

From your implementation, it suggests that it will pinpoint the ‘smallest’ character for the former, and will simply crash with an IndexOutOfBoundsException at some.get(overallIndex) for the latter, since some will be empty and overallIndex is initialized to 0.

This suggests that you should be writing a unit test for your implementation too (after putting the logic in its own method), to make sure your expectations are verified. On a related note, even your sample output is wrong, there’s only 4 occurrences of a in aggdhaamfa.

Stream-based approach

Putting it altogether, the whole logic can be quite easily done in one return statement:

private static Map<Character, Integer> maxOccurrence(String input) {
    return Stream.of(Objects.toString(input, "").chars()
            .boxed()
            .collect(Collectors.groupingBy(
                    i -> (char) i.intValue(),
                    Collectors.counting()))
            .entrySet().stream()
            .collect(Collectors.groupingBy(
                    Entry::getValue,
                    TreeMap::new,
                    Collectors.mapping(
                            Entry::getKey,
                            Collectors.reducing('~',
                                    (a, b) -> a.compareTo(b) < 0 ? a : b))))
            .lastEntry())
            .filter(Objects::nonNull)
            .collect(Collectors.toMap(
                    Entry::getValue,
                    entry -> entry.getKey().intValue()));
}

For starters, we deal with null input safely via Objects.toString(Object, String), before we stream on its characters via String.chars().

Then, the first Collectors.groupingBy(Function, Collector) gives us our preliminary result, counting() each character’s presence.

Next, we need to then group by their occurrence, so the second Collectors.groupingBy(Function, Supplier, Collector) is used in conjunction with a TreeMap so that we can get our highest repeated character(s) as the lastEntry().

The character(s) are checked against each other using Collectors.reducing(T, BinaryOperator) to find out which is the ‘smallest’, using ~ as an initial value as it is an arbitrary large character compared to the standard English alphabet.

Finally, since lastEntry() can potentially give us a null for an empty Map, we wrap it up in a Stream just so we can filter null away, before finally converting the Entry representation into a Map via Collectors.toMap(Function, Function). This will return an empty Map at the very least.

Plain-old iteration

Of course, if the above approach looks too… advanced, the plain-old iteration approach is even simpler and easier to comprehend (with only a tiny bit of Java 8/9 magic):

private static Map<Character, Integer> maxOccurrence(String input) {
    Map<Character, Integer> occurrences = new HashMap<>();
    int maxCount = 0;
    char mostRepeated = '~';
    for (char c : Objects.toString(input, "").toCharArray()) {
        if (occurrences.merge(c, 1, Integer::sum) > maxCount
                && c < mostRepeated) {
            maxCount = occurrences.get(c);
            mostRepeated = c;
        }
    }
    return maxCount == 0
            ? Collections.emptyMap()
            : Map.of(mostRepeated, occurrences.get(mostRepeated));
}

Here, we update our occurrences map by Map.merge(K, V, BiFunction) and compare against maxCount/mostRepeated sequentially. The use of Collections.emptyMap() and Map.of(K, V) is slightly better too, as they ensure the results are immutable.

for(int i = 0; i < input.length(); i++)
{
    done.add(String.valueOf(input.charAt(i)));
    all.add(String.valueOf(input.charAt(i)));
}

Instead of iterating over the input string like that, converting every single character to a new String, actually doing all of this twice, and adding each to a collection, you can convert the string to a list of Strings (for each character) like this:

List<String> inputChars = Arrays.asList(input.split(""));

Note that previously to Java 8 this results in the first element being an empty string, which must be handled accordingly.

Then add all of the characters in the array to the collections:

done.addAll(inputChars);
all.addAll(inputChars);

int highest = 0;
int overallIndex = 0;
for(int i = 0; i < num.size(); i++)
{
    if(num.get(i) > highest)
    {
        highest = num.get(i);
        overallIndex = i;
    }
}

To better structure your code, move the loop to a method findIndexOfHighest(). Your overallIndex actually is the indexOfHighest. Then you get the index of the highest element and the highest element like this:

int indexOfHighest = findIndexOfHighest();
int highest = num.get(indexOfHighest);

Add blank lines between blocks of code, e. g. between for loops. That makes your code much more readible.


int amount = done.size();
for(int i = 0; i < amount; i++)
{
    String letter = done.first();
    int tempNum = 0;
    for(int j = 0; j < all.size(); j++)
    {
        if(all.get(j).equals(letter))
        {
            tempNum++;
        }
    }
    num.add(tempNum);
    some.add(letter);
    done.remove(letter);
}

Here the inner loop can be a for-each loop. Also it should be in a separate method, because what it does is countOccurences(String s, char c). The variable tempNum should be called count or occurences.

Leave a Reply

Your email address will not be published. Required fields are marked *