Code Style and Efficiency of 3SUM

Posted on

Problem

I’m trying to implement a version of the 3SUM problem. Any help with code style and efficiency would be greatly appreciated. I’m trying to get it roughly O(n2)O(n2). Also, there should be no duplicates. I specifically wanted to try and build upon 2SUM. I know there are other, maybe better, ways to do it.

def two_sum(nums, target):
    d = {}
    for i in range(len(nums)):
        key = target - nums[i]
        if key in d:
            return [nums[d[key]], nums[i]]
        else:
            d[nums[i]] = i
    return []

def three_sum(nums):
    ans = set()
    for n in nums:
        lst = two_sum(nums, -n)
        if lst:
            lst.append(n)
            ans.add(tuple(sorted(lst)))
    return ans

print(three_sum(nums))

Solution

1. Bug

The algorithm in three_sum is not correct (at least not in the usual version of the 3SUM problem):

>>> three_sum([1, -2])
{(-2, 1, 1)}

In the call two_sum(nums, -n) there’s nothing to stop two_sum from using the number n in its pair, even though n is also needed to make up the triple.

2. Review

  1. two_sum and three_sum have similar names, so you might guess that they do similar things, but two_sum return at most one pair adding up to an a given target whereas three_sum returns a list of triples adding up to zero. When writing a group of similar functions it’s worth taking care that they have similar interfaces — this makes them easier to use reliably.

    In this case the difference in implementations means that it’s not clear exactly what three_sum does — we can’t say that it returns all the triples adding up to zero, because when it calls two_sum it gets at most one result, and this might cause it to miss some triples. This is quite unsatisfactory — we can’t use this function if we want all the triples, but if we only want one triple then we have to wait longer than we need to.

    (This observation is similar in spirit to the zero one infinity rule — we often need functions that find one thing, or find all the things, but very rarely do we need functions that find some of the things.)

  2. There are no docstrings. What do these functions do? What do they return? (I think that if you had written docstrings then you might have spotted the problem in point 1 above.)

  3. In two_sum, it’s not clear what the name d means. (I guess d stands for dictionary, but a dictionary containing what?) I would use a more descriptive name like seen. Similarly, instead of key I would use a name like remainder.

Leave a Reply

Your email address will not be published. Required fields are marked *