Increase performance of removing character from String

Posted on

Problem

I wrote this function as part of interview practice. This method removes a character from a given string.

I was wondering how I could make this code more efficient when it comes to runtime/space. I think my code is O(n) and I’m not sure if I can increase the efficiency. However, perhaps using things such as StringBuffer or StringBuilder would increase it a bit? I’m not sure as I’m still a bit new to Java.

public static String takeOut(String str, char c) {
    int len = str.length();
    String newstr = "";
    for (int i = 0; i < len; i++) {
        if (str.charAt(i) != c) {
            newstr = newstr + str.charAt(i);
        }
    }
    return newstr;
}

Solution

Learn from existing implementations, they usually have solutions to corner cases, common pitfalls and performance bottlenecks. For example, Apache Commons Lang StringUtils also has a remove function. It’s implementation is quite simple and similar to yours:

public static String remove(final String str, final char remove) {
    if (isEmpty(str) || str.indexOf(remove) == INDEX_NOT_FOUND) {
        return str;
    }
    final char[] chars = str.toCharArray();
    int pos = 0;
    for (int i = 0; i < chars.length; i++) {
        if (chars[i] != remove) {
            chars[pos++] = chars[i];
        }
    }
    return new String(chars, 0, pos);
}

Anyway, there are some differences:

  1. It uses char array. Strings are immutable, the following concatenation creates a new String object:

    newstr = newstr + str.charAt(i);
    

    I guess it’s slower than changing a value in an array.

    Using the string concatenation operator
    repeatedly to concatenate n strings requires time quadratic in n.

    Source: Effective Java, 2nd edition, Item 51: Beware the performance of string concatenation

  2. At the first line it checks whether the String contains the removed character at all. If it doesn’t contain it remove saves the creation the char array. It might help but it depends on your input.

  3. The code in the question calls String.charAt() in the comparison which checks the given index:

    public char charAt(int index) {
        if ((index < 0) || (index >= value.length)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return value[index];
    }
    

    The StringUtils implementation skips this test since it knows that the index is valid.

(See also: Effective Java, 2nd edition, Item 47: Know and use the libraries)

First indeed your code has a complexity of O(n) (n the number of characters of the String), it would precisely depend on the implementation of the method String.charAt(index), but we could most likely consider it has direct access.

O(n) is the lowest one can do. Indeed, think as the following: how could you remove all instances of a character of the String if you do not look at all the characters of this String. So this is hardly possible to do better.

There is a question on SO about performance of String concatenation in Java. For your question I would say then the StringBuilder might be a better option. I would also consider creating an array of char from scratch, but I don’t know how the String constructor from an array behaves in terms of performance.

In the end, if the goal is just to remove a character from a String (out of an interview question), usually call String.replaceAll(myChar, “”);

I would add the following:

  • unit test

    Code some unit tests and it has to become an habit for you. If you publish some code then also add the unit tests. First, you can check yourself the correctness of your code and second it will help us to refactor your code (producing a better solution).

  • good names

    Please use good names. I know, even the names used by Apache are not good. No ‘c’, no ‘str’. A name that is self descriptive. No comments needed. I know that not everybody will agree because the code seems a little bit more ‘heavy’, but what a pleasure to read!

Leave a Reply

Your email address will not be published. Required fields are marked *