Split word based on terms from an array

Posted on

Problem

I was faced with a coding challenge and I’m looking for a more elegant way to achieve the goal.

The challenge

Take the word baseball from the first item in the string
array and split it into two comma separated words using only the terms
provided in the second item of the string array. If it’s impossible to
split the word into two separate comma separated words, return the
string “not possible”.

Example

Given the array new String[]
{“baseball”,”a,all,b,ball,bas,base,cat,code,d,e,quit,z”};

The program should output base,ball

If your program is unable to split baseball using the words from the second item in the string array return “not possible”

I provided the following solution, does anybody have something a little bit more elegant?

public static String ArrayChallenge(String[] strArr) {

    //my code starts here

    String pattern = strArr[0];
    String[] dictionaryWords = strArr[1].split(",");

    Set<String> dictionaryWordSet = new HashSet<>(Arrays.asList(dictionaryWords));

    String[] tempArr = new String[2];

    for (int i = 0; i < pattern.length(); i++) {
        String firstWord = pattern.substring(0, i);

        if(dictionaryWordSet.contains(firstWord)) {
            if(tempArr[0] == null) {
                tempArr[0] = firstWord;
            } else if(tempArr[0].length() < firstWord.length()) {
                tempArr[0] = firstWord;
            }
        }

        String lastWord = pattern.substring(i);

        if(dictionaryWordSet.contains(lastWord)) {
            tempArr[1] = lastWord;
            break;
        }
    }

    if(!pattern.equals(tempArr[0] + tempArr[1])) {
        return "not possible";
    }

    strArr[0] = tempArr[0] + "," + tempArr[1];

    //My code ends here

    return strArr[0];
}

public static void main(String[] args) {
    String[] arr = new String[] {"baseball", "a,all,b,ball,bas,base,cat,code,d,e,quit,z"};

    System.out.println(ArrayChallenge(arr));
}

Solution

The goal when writing code should almost always be to optimize for readability. Only worry about performance when you have a known performance bottleneck. Readable code can always be rewritten to be performant later. The reverse is much harder.

In Idiomatic java, methods begin with a lowercase letter. I understand this is not your method name, but I’m immediately dubious about whoever has created this challenge.

In idiomatic java, there is whitespace between if and (, to visually distinguish control flow keywords from method names.

pattern is a misleading name, as this is not a pattern. It’s the word to be split.

Streaming might make it cleaner to pull out the words into a set.

dictionaryWordSet can be shortened to dictionaryWords. It is not generally necessary to embed the type of a variable in its name.

tempArr is not a helpful name. Try to avoid abbreviations, as they make the code harder to read, and try to find a descriptive name.

Your algorithm appears to be trying to maximize the length of the first word. This is not in the problem description. The more elegant way to handle this is to reverse the loop so you’re starting from the back, not the front. This will allow you to get rid of tempArr altogether, which is good because it’s confusing. Return early if you find a valid partition.

If you make these changes, your code might look something like:

public static String arrayChallenge(String[] strArr) {

    String wordToSplit = strArr[0];
    Set<String> dictionaryWords = Arrays.stream(strArr[1].split(",")).collect(toSet());

    for (int i = wordToSplit.length() - 1; i > 0; i--) {
        String firstWord = wordToSplit.substring(0, i);
        String lastWord = wordToSplit.substring(i);

        if (dictionaryWords.contains(firstWord) && dictionaryWords.contains(lastWord)) {
            return firstWord + "," + lastWord;
        }
    }
    return "not possible";
}

I provided the following solution, does anybody have something a little bit more elegant?

Depends on how you define elegant. Shorter? More readable? Extendable to not only include string matching but also regex matching?

I’d argue that shorter is only guaranteed more elegant if it removes repetition. There can be shorter versions but at least some of them are less readable and therefore less elegant in my opinion.

Which transitions to my next topic, readability. I think improving readability always improves elegance.

Your code has some variables that have a pretty wide scope (in the context of your program): take the variable tmpArr, defined on the 4th line of the logic function. It is used through the entire function, and I have no clue what it is there for from the name.

Here are my main rules for naming variables:

  • I usually always write out abbreviations unless they are well known in my context. Sorry C devs.
  • I never name my variables tmp if they live longer than two lines of code.
  • I name my variables after their purpose.
  • Variables are objects, their names are nouns.

Testing for tempArr[0].length() < firstWord.length() looks unwarranted. Nothing in the problem statement requires the first word to be as long as possible. Besides, it introduces a bug: running "baseball" against the dictionary "base", "ball", "baseb" results in "not possible". Do you see why?

Expanding on the above observation, you better гet rid of tempArray at all, and return as soon as the decomposition is found:

    if (dictionaryWordSet.contains(firstWord) && dictionaryWordSet.contains*secondWord))

Leave a Reply

Your email address will not be published.