Length of longest palindrome subsequence

Posted on

Problem

I had an interview, where I was asked to solve to this question:

Given an string str, find the maximum length of palindrome subsequence.

Example: >if str = ‘abcda’:

maximum_length = 3 ‘aca’

How can I improve this code?

def maximum_length_palindrome_subsequence(a):
    a = list(a)
    n = len(a)
    table = [[0 for i in xrange(n)] for j in xrange(n)]
    for i in xrange(n):
        table[i][i] = 1
    for gap in xrange(1,n):
        for i in xrange(n-1):
            h = i+gap
            if h>=n:
                continue
            if a[i] == a[h]:
                if i+1==h:
                    table[i][h] = 2
                else:    
                    table[i][h] = table[i+1][h-1] + 2 
            else:
                if h<n:
                    table[i][h] = max(table[i+1][h], table[i][h-1])
    return table[0][n-1]                
                
print maximum_length_palindrome_subsequence('abcbda')

Solution

Back to the basic of code review, what does this code do and how do you do it. At first glance I’ve got not clue what so ever this code does. I don’t know what the content of the table is. I don’t know what the h or the gap is referring to, and I don’t know why the table[0][n-1] is the correct answer.

The only thing I know is that your function is supposed to return the “maximum length palindrome subsequence” of a. Which is not entirely clear either. (I wrote an answer (now deleted) based on a misconception of the terms included, as I didn’t fully realize the definition of subseqence)

So let us review some unknown code:

  • Vague variable naming – The variable could strongly benefit from indicating what they are actually containing. Like instead of a, it could be sourceString or source or text. The i and n is commonly known as the loop counter, and the number of elements. The a, gap, h and table, no idea what they contain. Especially the table is badly named.
  • No comments – It would have been nice to see some comments describing what the various loops are trying to achieve, or how the table is initialized. For the method possibly some examples of valid responses.
  • Break out early – As in the answer by @Dair, since i is ever increasing, whenever the h >= n you can do a break.
  • Strange comparison, no I – You compare a[i] == a[h], why? If i == h this will always be true. However h = i + gap, so this does essentially say a[i] == a[i + gap] which would be a little bit clearer, although I still don’t quite get why we compare this.
  • Strange comparison, no II – Right below you compare i+1 ==h, which is the same as i + 1 == i + gap, which can be simplified to gap == 1.
  • Unneeded if h<n – If you get this far, then it’s a given that h < n, as otherwise you’d continue‘d. Hence the if could be removed all together.

Based on the code so far I’m guessing that table holds the length of the palindrome starting at that first index and ending at last index. Not quite how it gets there, but that is my assumption. Let us rewrite the code according to this, and my comments above:

def maximum_length_palindrome_subsequence(text):
    """Search through text to find the maximum length of a 
       palindrome looking at subsequences of the text.
         text = "abcda" -> palindrome: aca -> length = 3
         text = "xaybzcydxe" -> palindrome: xyzyx -> length = 5
    """
    text_length = len(text)

    # palindrome_length is a two dimensional table of palindrome
    # lengths where the first dimension is start index, and the 
    # second dimension is the end index of the palindrome in text

    # Initialize with all zeroes
    palindrome_length = [[0 for i in xrange(text_length)] for j in xrange(text_length)]

    # As a base, every single character through text is a
    # palindrome of length 1
    for i in xrange(text_length):
        palindrome_length[i][i] = 1

    # Loop through the various gaps in text, and check for
    # similarity for every index of the text. Update the
    # palindrome_length accordingly if matching characters
    for gap in xrange(1, text_length):
        for i in xrange(text_length - 1):

            # h is the candidate index? (or something like that?!)
            h = i + gap
            # If candidate index is after length of text, break out
            if h >= text_length:
                break

            # If equal characters, update palindrome_length
            if a[i] == a[h]:
                if gap == 1:
                    palindrome_length[i][h] = 2
                else:    
                    palindrome_length[i][h] = palindrome_length[i + 1][h - 1] + 2 
            else:
                # Characters not equal, so choose a max based upon
                # ... some criteria ...
                palindrome_length[i][h] = max(palindrome_length[i + 1][h], 
 palindrome_length[i][h - 1])

    # Return the maximum length based upon entire text
    return palindrome_length[0][n - 1]                

print maximum_length_palindrome_subsequence('abcda')

Based upon this rewrite I’ve got some additional comments:

  • Simplify initialization of palindromeLength – Instead of first filling it with 0, and then refilling parts with 1, you could do:

    palindrome_length = [[ 1 if i == j else 0
                             for i in xrange(text_length)]
                                for j in xrange(text_length)]
    
  • Criteria for the max() – Is this somewhat related to palindromes of an even or odd length? That it chooses the longest of those two alternatives?

To conclude, your code does follow proper indentation, is a little lacking in space around operators, but the main issue is bad naming and unclear intentions of the various code blocks. There is some magic happening in this code, which could benefit from clear comments sprinkled out in the code.

Imaging what you yourself would think of your code if it was presented to you as it stood originally, but with the name function_x and the question: What does function_x do? Would you be able to answer it?

If my assumption on the max is correct, my final version is:

def maximum_length_palindrome_subsequence(text):
    """Search through text to find the maximum length of a 
       palindrome looking at subsequences of the text.
         text = "abcda" -> palindrome: aca -> length = 3
         text = "xaybzcydxe" -> palindrome: xyzyx -> length = 5
    """
    text_length = len(text)

    # palindromeLength is a two dimensional table of palindrome
    # lengths where the first dimension is start index, and the 
    # second dimension is the end index of the palindrome in text

    # Every character is considered a palindrom of length 1, whilst
    # the rest is initialized to 0 
    palindrome_length = [[ 0 if i != j else 1 
                             for i in xrange(text_length)] 
 for j in xrange(text_length)]

    # Loop through the various gaps in text, and check for
    # similarity for every index of the text. Update the
    # palindrome_length accordingly if matching characters
    for gap in xrange(1, text_length):
        for i in xrange(text_length - 1):

            # h is the candidate index? (or something like that?!)
            h = i + gap
            # If candidate index is after length of text, break out
            if h >= text_length:
                break

            # If equal characters, update palindrome_length
            if text[i] == text[h]:
                if gap == 1:
                    palindrome_length[i][h] = 2
                else:    
                    palindrome_length[i][h] = palindrome_length[i + 1][h - 1] + 2 
            else:
                # Characters not equal, so choose a max based upon
                # the even or odd length palindrome length?
                palindrome_length[i][h] = max(palindrome_length[i + 1][h], 
 palindrome_length[i][h - 1])

    # Return the maximum length based upon entire text
    return palindrome_length[0][text_length - 1]                

print maximum_length_palindrome_subsequence('abcda')           # "3"
print maximum_length_palindrome_subsequence('xaybzcydxe')      # "5"
print maximum_length_palindrome_subsequence('abcdaabcda')      # "6"
print maximum_length_palindrome_subsequence('racecar')         # "7"
print maximum_length_palindrome_subsequence('racecars')        # "7"
print maximum_length_palindrome_subsequence('iiiiragceciars')  # "7"

At the very end I presented a few more cases, to verify the correctness of the revised version.

Update: Changed to snake_case to conform to standards. Hopefully I didn’t break the code. 🙂

You don’t need to convert a string to a list in this case. Remove:

a = list(a)

Also:

h = i+gap
if h>=n:
    continue

h is strictly increasing. Calling continue will stop a particular iteration quickly, but really you can stop a whole cycle, switch it out with break.

Leave a Reply

Your email address will not be published. Required fields are marked *