I’d like suggestions for optimizing this brute force solution to problem 1. The algorithm currently checks every integer between 3 and 1000. I’d like to cut as many unnecessary calls to
isMultiple as possible:
''' If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23. Find the sum of all the multiples of 3 or 5 below 1000. ''' end = 1000 def Solution01(): ''' Solved by brute force #OPTIMIZE ''' sum = 0 for i in range(3, end): if isMultiple(i): sum += i print(sum) def isMultiple(i): return (i % 3 == 0) or (i % 5 == 0)
The sum 3+6+9+12+…+999 = 3(1+2+3+…+333) = 3 (n(n+1))/2 for n = 333. And 333 = 1000/3, where “/” is integral arithmetic.
Also, note that multiples of 15 are counted twice.
def sum_factors_of_n_below_k(k, n): m = (k-1) // n return n * m * (m+1) // 2 def solution_01(): return (sum_factors_of_n_below_k(1000, 3) + sum_factors_of_n_below_k(1000, 5) - sum_factors_of_n_below_k(1000, 15))
I think the best way to cut out possible checks is something like this:
valid = set() for i in range(3, end, 3): valid.add(i) for i in range(5, end, 5): valid.add(i) total = sum(valid)
There’s still a bit of redundancy (numbers which are multiples of both 3 and 5 are checked twice) but it’s minimal.
I would get rid of the
for loops and use
sum on generator expressions.
def solution_01(n): partialsum = sum(xrange(3, n, 3)) return partialsum + sum(x for x in xrange(5, n, 5) if x % 3)
Note that we’re using
xrange instead of
range for python 2. I have never seen a case where this isn’t faster for a
for loop or generator expression. Also consuming a generator expression with
sum should be faster than adding them up manually in a
If you wanted to do it with sets, then there’s still no need for
def solution_01(n): values = set(range(3, n, 3)) | set(range(5, n, 5)) return sum(values)
Here, we’re just passing the multiples to the set constructor, taking the union of the two sets and returning their sum. Here, I’m using
range instead of
xrange. For some reason, I’ve seen that it’s faster when passing to
list. I guess it would be faster for
set as well. You would probably want to benchmark though.
Well, this is an answer to a Project Euler question after all, so perhaps the best solution is basically a pencil-paper one.
sum(i for i in range(n + 1)) # sums all numbers from zero to n
is a triangular number, the same as
n * (n + 1) / 2
This is the triangular function we all know and love. Rather, more formally,
triangle(n) = n * (n + 1) / 2
With that in mind, we next note that the sum of the series
3, 6, 9, 12, 15, 18, 21, 24, ...
is 3 * the above triangle function. And the sums of
5, 10, 15, 20, 25, 30, 35, ...
are 5 * the triangle function. We however have one problem with these current sums, since a number like 15 or 30 would be counted in each triangle number. Not to worry, the inclusion-exclusion principle comes to the rescue! The sum of
15, 30, 45 ,60, 75, 90, 105, ...
is 15 * the triangle function. Well, if this so far makes little sense don’t worry. Finding the sum of the series from 1 up to n, incrementing by k, is but
triangle_with_increments(n, k = 1) = k * (n/k) * (n/k + 1) / 2
with this, and the inclusion-exclusion principle, the final answer is but
triangle_with_increments(100, 5) + triangle_with_increments(100, 3) - triangle_with_increments(100, 15)
Wow. Who’da thunk? an n complexity problem suddenly became a constant time one. That’s what I call optimization IMHO :P. But in all seriousness, Project Euler asks you to answer problems in really the lowest computational complexity possible.
I would do it like this:
total = 0 for i in range(3, end, 3): total += i for i in range(5, end, 5): if i % 3 != 0: # Only add the number if it hasn't already total += i # been added as a multiple of 3
The basic approach is the same as g.d.d.c’s: iterate all the multiples of 3, then 5. However instead of using a set to remove duplicates, we simply check that the multiples of 5 aren’t also multiples of 3. This has the following upsides:
- Checking divisibility is less expensive than adding to a set.
- We build the total up incrementally, so we don’t need a separate call to sum at the end.
- We got rid of the set, so we only need constant space again.