Reverse character order of substrings delineated by white space chars

Posted on

Problem

The Challenge: Given a string, return a string where each word is reversed individually without changing its position. EDIT: (i.e.: the input “hello world I am a frog?” should produce the output: “olleh dlrow I ma a gorf?”)

Thoughts and questions: Perhaps I was cheating the spirit of the challenge by using string builder? What do you think?

Complexity? Would the complexity be? I feel like it is O(n) since it will basically look at almost every char twice. I’m thinking there is a more efficient way to do it where n is just looked at once, but I think that to achieve that we will double the use of memory. Not that that is a problem, but I think that is the trade off.

Coding tests tend to be timed and I did this quickly. In evaluating the code, how do you factor in the pressure of time as you administer the test? I am talking myself into the idea that the answer should have been creating a new string and entering the chars one at a time using pointers so that each string char is looked at once and moved once. Is that really better then this solution? Perhaps there are points for easily readable code? Do you think this is readable?

    public static string ReverseOnlyWords(string InitialValue)
    {
        if (InitialValue == null)
        {
            throw new System.NullReferenceException();
        }
        StringBuilder seperatedValue = new StringBuilder();
        StringBuilder currentWord = new StringBuilder();
        foreach (char c in InitialValue)
        {
            if (c == ' ' || c == '.' || c == ',')
            {
                seperatedValue.Append(Reverse(currentWord.ToString()));
                seperatedValue.Append(c.ToString());
                currentWord.Clear();
            }
            else
            {
                currentWord.Append(c.ToString());
            }
        }
        seperatedValue.Append(Reverse(currentWord.ToString()));

        string resultValue = seperatedValue.ToString();
        return resultValue;
    }

In Reverse, perhaps I should have simply created a copy of toBeReversed and inserted/replaced characters in the reversed order instead of a string builder?

    static public string Reverse(string toBeReversed)
    {
        if (toBeReversed == null)
        {
            throw new System.NullReferenceException();
        }
        StringBuilder reversedString = new StringBuilder();
        for (int i = (toBeReversed.Length - 1); i >= 0; i--)
        {
            reversedString.Append(toBeReversed[i].ToString());
        }
        return reversedString.ToString();
    }

Solution

First, NullReferenceException is used by the CLR, and it’s the default exception for nullity in most cases. The ArgumentNullException used to prevent NullReferenceException from thrown, and gives more meaningful exception to your code.

So, in your current code, if you pass a null, the thrown exception message would be something like

Object reference not set to an instance of an object

. This would not give you any hint on what’s going on your code, so if you used

throw new ArgumentNullException(nameof(InitialValue));

this would give you something like :

Value cannot be null. (Parameter 'InitialValue')   

isn’t clearer to point out the issue ?

Another point is that you don’t need to throw an exception all the time as a way to handle your code. Only throw exception when it’s critical to the application. As exceptions like a Stop sign to the application, whenever it’s thrown, it stops the application from running the code. if the code is related to other dependencies such as required class, extensions, databases, ..etc. Then, this code should throw an exception if it leaks the dependency requirements.

In your case, it’s not critical to pass null, yes it would break the code, but also it’s not required to stop the code just for this, as you can simply return null to the caller. Same thing if Empty string or Whitespace passed. because you

so if you do this :

if(string.IsNullOrWhitespace(value)) { return value; }

would be enough and better than thrown exception for this case.

For the methods ReverseOnlyWords and Reverse I don’t see what was the reason behind splitting them?. It should be one method that does the process. having two methods is confusing, because it’s required to us (or any developer) to read both methods to understand the logic and to know which to use ! of course, splitting methods can be useful if there are independent procedures inside the method that can be reused or simply outside the main scope of the method, but not in your current code.

Using StringBuilder is a good practice. However, you only need one, the other one is unnecessary. Also, you don’t need ToString() when appending char in StringBuilder

this line :

if (c == ' ' || c == '.' || c == ',') { ... }

punctuations can’t be handled like this, because if you keep it that way, you will miss other punctuations. if we assumed you’ll process English context, then you only covered 3 out of 14 punctuations that I know of. What about other languages? So Instead, use the built in char.IsPunctuation which covers most of the punctuations that in UnicodeCategory.

Complexity? Would the complexity be? I feel like it is O(n) since it
will basically look at almost every char twice.

You’re using 3 loops, so your complexity is O(N2)

O(N2)

. though, it can be simplified to two loops (one for the words, the other for the characters), and one StringBuilder is enough. However, the overall code can be written as follow :

public string Reverse(string value, char separator)
{       
    if(string.IsNullOrEmpty(value)) { return value; } // just return the value, leave the handling to the caller.
    
    var words = value.Split(separator); // split it to words by spaces

    // initiate a string builder with a fixed size based on the original string size. 
    // setting the capacity would avoid oversized allocation.
    var resultBuilder = new StringBuilder(value.Length);

    // iterate over words 
    for(int x=0; x < words.Length; x++)
    {           
        // store the tailing punctuation
        char? punctuation = null;
        // iterate over characters in reverse 
        for(int c = words[x].Length - 1; c >= 0; c--)
        {    
            var current = words[x][c];
        
            if(char.IsPunctuation(current))
            {
                if(c == 0) // for leading punctuation
                {
                    // get the first poistion of the current word 
                    var index = resultBuilder.ToString().Length - (words[x].Length - 1);
                    
                    // insert the leading punctuation to the first poition (its correct poistion)
                    resultBuilder.Insert(index, current);                     
                }
                else 
                {
                    // store tailing punctuation to insert it afterward
                    punctuation = current;
                }
                
            }
            else 
            {
                // everything else, just append
                resultBuilder.Append(current);                
            }                                
            
      }
        
        if(punctuation != null)
        {
            // insert tailing punctuation 
            resultBuilder.Append(punctuation);
            punctuation = null; //reset 
        }

        resultBuilder.Append(separator);        
    }

    return resultBuilder.ToString();
}           

this is just a revision of your code in its simplest form possible (at least to my knowledge) and it’s one way process (process each word and character once). Using pointers would not increase the overall performance to the point where it would be recommended!. Array.Reverse also can be used, but still might add more memory allocation and slow down the performance specially with large strings.

Other missing part that needs to be counted when dealing with strings is Unicode which in some cases invalidate the results if not handled correctly.

UPDATE :

Here is another version that uses one loop, and two StringBuilder (one stores the results, and one for the processing).

public static string ReverseString(string value , char separator)
{
    if(string.IsNullOrEmpty(value)) { return value; }

    var tempBuilder = new StringBuilder(value.Length);

    var resultBuilder = new StringBuilder(value.Length);

    var isCompleted = false;

    for(int x = 0, index = value.Length - 1; index >= 0; index--, x++)
    {
        var current = value[index];
        
        if(current == separator)
        {
            isCompleted = true;
        }
        else
        {
            tempBuilder.Append(current);

            if(index == 0)
            {
                isCompleted = true;
            }
        }

        if(isCompleted)
        {
            // handle the lead and tail punctuations
            if(char.IsPunctuation(tempBuilder[0]) && char.IsPunctuation(tempBuilder[tempBuilder.Length - 1]))
            {
                var tail = tempBuilder[0];
                var lead = tempBuilder[tempBuilder.Length - 1];
                tempBuilder.Remove(0 , 1);
                tempBuilder.Remove(tempBuilder.Length - 1 , 1);
                tempBuilder.Insert(0 , lead);
                tempBuilder.Append(tail);
            }
            else if(char.IsPunctuation(tempBuilder[0]))
            {
                tempBuilder.Append(tempBuilder[0]);
                tempBuilder.Remove(0 , 1);
            }
            else if(char.IsPunctuation(tempBuilder[tempBuilder.Length - 1]))
            {
                tempBuilder.Insert(0 , tempBuilder[0]);
                tempBuilder.Remove(tempBuilder.Length - 1 , 1);
            }

            //store the results
            resultBuilder.Insert(0 , separator);
            resultBuilder.Insert(0 , tempBuilder);

            //reset
            tempBuilder.Clear();
            x = 0;
            isCompleted = false;

        }
    }

    return resultBuilder.ToString();
}

it may need more work, but I thought it would be useful to share it. It still lakes the Unicode handling though.

Leave a Reply

Your email address will not be published. Required fields are marked *