Hamming distance implementation in Java

Posted on

Problem

I’m exploring String relativity and comparison techniques, and I found Hamming distance algorithm in my search so I give a try to implement that approach in Java and below I’m sharing my implementation.

Please let me know if this approach is correct or I need to improve it more.

HammingDistance.java

public interface HammingDistance {

 public int distance(String first,String second);
    
}

HammingDistanceImpl.java

public class HammingDistanceImpl implements HammingDistance {

    @Override
    public int distance(String first, String second) {

        first = first.trim().toLowerCase();
        first = flipFyString(first);

        second = second.trim().toLowerCase();
        second = flipFyString(second);

        // System.out.println(first + " " + second);

        if (first.length() - second.length() == 1 || first.length() - second.length() == 2
                || first.length() - second.length() == 3) {
            // System.out.println("Second String bit shorter in length to First String");
            return calculatedDistance(second, first);
        }

        if (second.length() - first.length() == 1 || second.length() - first.length() == 2
                || second.length() - first.length() == 3) {
            // System.out.println("First String bit shorter in length to Second String");
            return calculatedDistance(first, second);
        }

        if (first.length() == second.length()) {
            // System.out.println("Both String Are Equals");
            return calculatedDistance(first, second);
        }

        if (first.length() == second.length() / 2) {
            // System.out.println("First String is Half of Second String");
            int result = calculatedDistance(first, second.substring(0, second.length() / 2));

            return result;
        }

        if (first.length() == second.length() / 2) {

            int result2 = calculatedDistance(first, second.substring(second.length() / 2, second.length()));

            return result2;
        }

        if (second.length() == first.length() / 2) {
            // System.out.println("Second String is Half of First String");
            int result = calculatedDistance(second, first.substring(0, first.length() / 2));

            return result;
        }

        if (second.length() == first.length() / 2) {

            int result2 = calculatedDistance(second, first.substring(first.length() / 2, first.length()));

            return result2;
        }

        return 0;
    }

    private int calculatedDistance(String first, String second) {
        int messasureDistance = 0;
        char[] firstStrCharArray = first.toCharArray();
        char[] secondStrCharArray = second.toCharArray();

        // System.out.println(first + " " + second);

        for (int i = 0; i < firstStrCharArray.length; i++) {

            if (firstStrCharArray[i] != secondStrCharArray[i]) {
                messasureDistance++;
            }

        }

        return messasureDistance;
    }

    private String flipFyString(String str) {

        StringBuffer word = new StringBuffer();
        char[] charArray = str.toCharArray();

        for (int i = 0; i <= charArray.length; i++) {
            int j = i + 1;
            if (j <= charArray.length - 1 && charArray[i] != charArray[j]) {
                word.append(charArray[i]);
            }
        }

        word.append(charArray[charArray.length - 1]);
        return word.toString();
    }

}

HammingImplTest.java

import java.util.ArrayList;
import java.util.List;
import java.util.OptionalInt;

public class HammingImplTest {

    public static void main(String... strings) {

        List<Boolean> resultlist = new ArrayList<>();

        resultlist.add(testHammingAlgo("India is near to pakistan.", "India is neighbour of pakistan."));
        resultlist.add(testHammingAlgo("India is near to pakistan.", "India is neighbour of nepal."));
        resultlist.add(testHammingAlgo("Simmant Yadav", "Seemant Yadav"));
        resultlist.add(testHammingAlgo("I love code in java", "I love coding in java"));
        /*
         * Specific result count
         * 
         * System.out.println(resultlist.stream().filter(data->!data).count());
         * if(resultlist.stream().filter(data->data).count()==3 &&
         * resultlist.stream().filter(data->!data).count()==1) {
         * System.out.println("Test cases satisfied"); }
         */
        resultlist.forEach(data -> System.out.println(data));

    }

    private static boolean testHammingAlgo(String first, String second) {

        boolean result = false;

        HammingDistanceImpl hamingImpl = new HammingDistanceImpl();

        String[] sentenceFirst = first.split(" ");
        String[] sentenceTwo = second.split(" ");

        ArrayList<Integer> distance = new ArrayList<>();

        for (int i = 0; i <= sentenceFirst.length - 1; i++) {
            distance.add(hamingImpl.distance(sentenceFirst[i], sentenceTwo[i]));
        }

        OptionalInt maxD = distance.stream().mapToInt(v -> v).max();

        if (maxD.getAsInt() <= 3) {
            result = true;
        }
        return result;
    }

}

Test Results :

true
false
true
true

Solution

I have some suggestions.

Don’t edit the parameter values, create your own instance / copy

In my opinion, this is a bad habit, since some objects in Java are not immutable (Collections, Date, ect) when passed as parameters, you will edit the original instance of the caller. In your code, since the string is immutable, you are fine, but keep this in mind.

HammingDistanceImpl#distance

Before

public int distance(String first, String second) {
   first = first.trim().toLowerCase();
   //[...]
}

After

public int distance(String first, String second) {
   String firstValue = first.trim().toLowerCase();
   //[...]
}

Move the conversion of the string in the method flipFyString

Since this method seems to need the trimmed and lowered case version of it, I suggest that you do this in the method.

Before

public int distance(String first, String second) {
   first = first.trim().toLowerCase();
   first = flipFyString(first);
   //[...]
}

After

public int distance(String first, String second) {
   first = flipFyString(first);
   //[...]
}

private String flipFyString(String str) {
   String value = str.trim().toLowerCase();
   //[...]
}

Use java.lang.StringBuilder instead of java.lang.StringBuffer (flipFyString)

As stated in the java documentation, the java.lang.StringBuilder class is recommended over the java.lang.StringBuffer since it performs no synchronization and is generally faster.

Extract the result of .length() in variables

In your code, instead of calling multiples times the method, I suggest that you extract the result in a variable and reuse it.

Before

//[...]
if (first.length() - second.length() == 1 || first.length() - second.length() == 2
      || first.length() - second.length() == 3) {
   return calculatedDistance(second, first);
}
//[...]

After

//[...]
int firstLength = firstValue.length();
int secondLength = secondValue.length();

if (firstLength - secondLength == 1 || firstLength - secondLength == 2 || firstLength - secondLength == 3) {
   return calculatedDistance(secondValue, firstValue);
}
//[...]

Instead of using a variable to store the value, return the result directly

In your code, you can directly return the result, instead of storing it into a variable and then, returning it. This will make the code shorter and easier to read.

Before

//[...]
if (first.length() == second.length() / 2) {
   int result2 = calculatedDistance(first, second.substring(second.length() / 2, second.length()));
   return result2;
}
//[...]

After

//[...]
if (first.length() == second.length() / 2) {
   return calculatedDistance(first, second.substring(second.length() / 2, second.length()));
}
//[...]

Review your logic, since you have duplicated condition ‘first.length() == second.length() / 2’

You have two if with the same condition, the second one is dead code at the moment.

Leave a Reply

Your email address will not be published. Required fields are marked *