Problem
Project Euler Problem 12 asks (paraphrased):
Considering triangular numbers Tn = 1 + 2 + 3 + … + n, what is the first Tn with over 500 divisors? (For example, T7 = 28 has six divisors: 1, 2, 4, 7, 14, 28.)
I have written the following solution. The code is calculates the correct answer, but the run time is appalling (6m9s on a 2.4GHz i7!!!)…
I know Python is definitely not the fastest of languages, but I’m clearly missing a trick here.
Could anyone highlight any optimizations that could be made to bring the execution time down to something sensible?
from math import sqrt, floor
class findDivisors:
def isPrime(self, num):
for i in range(2, int(floor(sqrt(num)) + 1)):
if not num % i:
return False
return True
def numDivisors(self, num):
factors = {}
numFactors = 2
currNum = num
currFactor = 2
while not self.isPrime(currNum):
if not currNum % currFactor:
if currFactor not in factors:
factors[currFactor] = 1
else:
factors[currFactor] += 1
currNum = currNum / currFactor
else:
currFactor += 1
if currNum not in factors:
factors[currNum] = 1
else:
factors[currNum] += 1
total = 1
for key in factors:
total *= factors[key] + 1
return total
def firstWithNDivisors(self, noDivisors):
triGen = genTriangular()
candidate = triGen.next()
while self.numDivisors(candidate) < noDivisors:
candidate = triGen.next()
return candidate
def genTriangular():
next = 1
current = 0
while True:
current = next + current
next += 1
yield current
if __name__ == '__main__':
fd = findDivisors()
print fd.firstWithNDivisors(5)
Addendum: I have since made to revisions of the code The first is the alteration made as suggested by nabla; check for currentNum being > 1 instead of prime. This gives an execution time of 2.9s
I have since made a further improvement by precomputing the first 10k primes for quick lookup. This improves runtime to 1.6s and is shown below. Does anyone have any suggestions for further optimisations to try and break below the 1s mark?
from math import sqrt, floor
from time import time
class findDivisors2:
def __init__(self):
self.primes = self.initPrimes(10000)
def initPrimes(self, num):
candidate = 3
primes = [2]
while len(primes) < num:
if self.isPrime(candidate):
primes.append(candidate)
candidate += 1
return primes
def isPrime(self, num):
for i in range(2, int(floor(sqrt(num)) + 1)):
if not num % i:
return False
return True
def primeGen(self):
index = 0
while index < len(self.primes):
yield self.primes[index]
index += 1
def factorize(self, num):
factors = {}
curr = num
pg = self.primeGen()
i = pg.next()
while curr > 1:
if not curr % i:
self.addfactor(factors, i)
curr = curr / i
else:
i = pg.next()
return factors
def addfactor(self, factor, n):
if n not in factor:
factor[n] = 1
else:
factor[n] += 1
def firstWithNDivisors(self, num):
gt = genTriangular()
candidate = gt.next()
noDivisors = -1
while noDivisors < num:
candidate = gt.next()
factors = self.factorize(candidate)
total = 1
for key in factors:
total *= factors[key] + 1
noDivisors = total
return candidate
def genTriangular():
next = 1
current = 0
while True:
current = next + current
next += 1
yield current
if __name__ == '__main__':
fd2 = findDivisors2()
start = time()
res2 = fd2.firstWithNDivisors(500)
fin = time()
t2 = fin - start
print "Took", t2, "s", res
Solution
Here is one possible improvement:
Don’t check whether currNum
is prime, instead simply search factors until currNum
becomes 1
. Checking for prime takes O(sqrt(n))
time, but the rest of the loop checking for factors takes approximately O(1)
per loop iteration, so you can save those O(sqrt(n))
here:
while currNum > 1:
if not currNum % currFactor:
if currFactor not in factors:
factors[currFactor] = 1
else:
factors[currFactor] += 1
currNum = currNum / currFactor
else:
currFactor += 1
total = 1
for key in factors:
total *= factors[key] + 1
Result and timing for firstWithNDivisors(300)
with the original code:
2162160 2.21085190773
and without checking for prime:
2162160 0.0829100608826
In fact with this improvement alone the firstWithNDivisors(501)
only takes 2.3 sec on my machine.
There are very likely much more efficient algorithms to do this. One can probably make use of the fact that the n
-th triangle number is n*(n+1)/2
.