Problem

Note: this script requires `gmpy2`

, if you are using Python 3.9.5 x64 on Windows 10 21H1 like me, then you need to download the pre-built `gmpy2`

wheel from here: https://www.lfd.uci.edu/~gohlke/pythonlibs/.

So this is a script that approaches pi using 9 algorithms (actually only 7, two of the algorithms are implemented in two different ways each).

In this script I used Liu Hui’s π algorithm, Leibniz formula for π, Summing the area of a disk, Madhava’s series, Archimedes’ method, Arctangent series, and

Bailey–Borwein–Plouffe formula.

As these methods are all well-known and implemented countless times well before me, I will not explain the principles and details.

This question is about making the implementations as efficient and accurate as possible.

The code:

```
import gmpy2
import sys
from gmpy2 import mpfr, sqrt, atan, ceil, factorial
from itertools import product
gmpy2.get_context().precision = 339
def cyclotomy(n):
s = 6
M = 1
for _ in range(n):
G = mpfr(sqrt(1 - M ** 2 / 4))
j = mpfr(1 - G)
m = mpfr(sqrt((M / 2) ** 2 + j ** 2))
s *= 2
M = m
return '{0:.100f}'.format(mpfr(s * m / 2))
def Leibniz(n):
pi = mpfr(0.0)
d = 1
for i in range(1, n):
a = 2 * (i % 2) - 1
pi += mpfr(a * 4 / d)
d += 2
return '{0:.100f}'.format(mpfr(pi))
def sumarea(r):
r_sq = r**2
quardrant_sq = product((x**2 for x in range(r+1)), repeat=2)
q = sum(a+b <= r_sq for a, b in quardrant_sq)
return '{0:.100f}'.format(mpfr((4*q - 4*r + 1) / r_sq))
def pixelpi(r):
r_sq = r**2
s = sum(ceil(sqrt(r_sq - x**2)) for x in range(r+1))
return '{0:.100f}'.format(mpfr((4*s - 4*r + 1) / r_sq))
def Madhava(n):
a = mpfr(0.0)
for i in range(n):
a += mpfr((-1.0/3.0)**i/(2.0*i+1.0))
return '{0:.100f}'.format(mpfr(sqrt(12)*a))
def Archimedes(n):
pa = 6
pb = 4 * sqrt(3)
for _ in range(n):
pb = mpfr(2 * pa * pb / (pa + pb))
pa = mpfr(sqrt(pa * pb))
return '{0:.100f}'.format(mpfr((pa + pb) / 4))
def Arctangent(n):
term = lambda x: mpfr((2 ** (x + 1) * factorial(x) ** 2) / factorial(2 * x + 1))
return '{0:.100f}'.format(mpfr(sum(term(i) for i in range(n))))
def ArctanPi(n):
term = lambda x: mpfr(sqrt(2)) if x == 1 else mpfr(sqrt(2 + term(x - 1)))
if n >= 2:
return '{0:.100f}'.format(mpfr(2 ** (n + 1) * atan(sqrt(2 - term(n - 1)) / term(n))))
def BBP(n):
term = lambda x: mpfr((4 / (8 * x + 1) - 2 / (8 * x + 4) - 1 / (8 * x + 5) - 1 / (8 * x + 6)) * 16 ** -x)
return '{0:.100f}'.format(mpfr(sum(term(i) for i in range(n))))
def main(a, b):
switcher = {
0: cyclotomy,
1: Leibniz,
2: sumarea,
3: pixelpi,
4: Madhava,
5: Archimedes,
6: Arctangent,
7: ArctanPi,
8: BBP
}
return switcher.get(a)(b)
if __name__ == '__main__':
print(main(*map(int, sys.argv[1:])))
```

So far I have found the algorithm based on Liu Hui’s converges the fastest (except the recursive arctangent function which is prone to error), it yields the first 100 decimal places of pi at 166th iteration with 339 bit precision, though it will produce inaccurate result (the last two digits will change from “79” to “80”) if you input any higher number due to rounding.

But at the 166th iteration we are calculating the perimeter of a regular polygon with 6 * 2 ^ 166 sides to get the result.

The third and fourth methods use the same logic, but the third counts all the cells inside a circle which is always an integer, and this is somehow more accurate than the fourth.

The fourth method sums the maximum y value at each x value, because it uses square roots y isn’t always an integer, because issues of rounding this is less accurate than the third, but this is much faster than the third. And I have found rounding up y make the result most accurate.

The seventh and eighth methods are using the same logic, but the eighth uses recursion, and it achieves the most accurate result at the fifth iteration (input is 6), then it gives very inaccurate results at inputs 164 to 169, and then it gives complete wrong results at input 170, and at input 172 it gives 4.0000, and at any number after that it yields -0.0000, and the maximum input is 997 in terminal and 998 in Python shell, otherwise it throws `RecursionError: maximum recursion depth exceeded in comparison`

.

These issues are caused by the way the computer works, the errors in precision and recursion limits, and I am unable to fix it.

Beyond that there are no issues to report.

I would like to know how can I improve the efficiency and accuracy of the algorithms, how can they be optimized. Any help will be appreciated.

Solution

# Efficiency and accuracy

If the goal was to have the best function to calculate π, then you should just pick the algorithm which is known the converge the fastest, and optimize that as much as possible. There’s not much point in wanting to optimize an algorithm you know isn’t ever going to be the most efficient one to begin with. But I assume you want to do this anyway.

Accuracy is the goal, and efficiency is how fast you reach the goal. The number of iterations to reach a certain accuracy is probably a bad measure of efficiency, because an algorithm that needs twice the number of iterations to reach the desired accuracy but is three times faster per iteration is going to win. So ideally, you want to benchmark your code and measure the time it takes to reach a given number of digits of accuracy.

# Short != fast

Dense code is not necessarily fast, and making the code as compact as possible hurts readability and maintainability. I had to read this line a few times before I understood how it worked:

```
print(main(*map(int, sys.argv[1:])))
```

Just keep it plain and simple. I would have written this as:

```
algorithm = int(sys.argv[1])
n_iterations = int(sys.argv[2])
result = main(algorithm, n_iterations)
print(result)
```

The temporary variables are not *necessary*, but just by giving a name to things this will automatically document the code: it’s clear from this that we expect the first argument to be an integer that tells us which `algorithm`

to use, and so on.

Also, you can nest function definitions in Python instead of writing lambdas. This allows you to write more verbose code, for example:

```
def Arctangent(n):
def term(x):
numerator = 2 ** (x + 1) * factorial(x) ** 2
denominator = factorial(2 * x + 1)
return numerator / denominator
return sum(term(mpfr(i)) for i in range(n))
```

# Use consistent and descriptive variable names

Related to the above, you use a lot of one and two letter variable names. It’s hard to figure out what each of them means. If you are implementing a formula from a Wikipedia page or a scientific paper, then add a comment with a reference to the source material. Otherwise, just give the variables names that self-document them if possible, for example `d`

in `Leibniz()`

could be renamed `denominator`

.

Exceptions to this are commonly used short variable names, such as `i`

for iterators, and `x`

in lambdas.

# Return raw results instead of formatting them

Functions are more useful if they `return`

the raw results without post-processing them unnecessarily. This allows the caller to decide what to do with the data. It can always decide to format it to a string if necessary. In this case, it also removes code duplication. So instead of:

```
return '{0:.100f}'.format(mpfr(pi))
```

Just call:

```
return pi
```

And then the caller can do:

```
result = main(algorithm, n_iterations)
print("{:.100f}".format(result))
```

# Casting to `mpfr`

Let’s look at this line:

```
pi += mpfr(a * 4 / d)
```

Since `a`

and `d`

are just regular `float`

s, this means that here we first calculate `a * 4 / d`

using Python’s `float`

precision, and then cast the result of that to an `mpfr`

. That’s of course not so great. Make sure you do all operations with variables of type `mpfr`

to fully benefit from gmpy2’s multi-precision reals.

Also, once a variable is an `mpfr`

, there is no need to cast it again, like in this line:

```
return '{0:.100f}'.format(mpfr(pi))
```

Hmm, I have tested many algorithms and found the Ramanujan algorithm to be fastest, it only takes 13 iterations to arrive at pi to 100 decimal places (with 336 bits precision, though I always use 512 bits precision because it’s binary solid).

So this is the code:

```
import gmpy2
from gmpy2 import factorial, mpfr, sqrt
gmpy2.get_context().precision = 512
def Ramanujan(n):
term = lambda k: mpfr(factorial(4 * k) * (1103 + 26390 * k)) / (factorial(k) ** 4 * 396 ** (4 * k))
s = sum(term(i) for i in range(n))
return mpfr(1) / (2 * sqrt(2) * s / 9801)
```

At exactly n == 13 the result is correct to 100 decimal places:

```
str(Ramanujan(13))[0:102]
'3.1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679'
```

Performance information:

```
%timeit Ramanujan(13)
111 µs ± 1.74 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```

Also I have improved the BBP algorithm and reduced it to one line:

```
def BBP(n):
return sum(1/mpfr(16)**k * (mpfr(4)/(8*k+1) - mpfr(2)/(8*k+4) - mpfr(1)/(8*k+5) - mpfr(1)/(8*k+6)) for k in range(n))
```

It takes 80 iterations to get the first 100 decimal places of pi.

Performance:

```
%timeit BBP(80)
377 µs ± 1.02 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```

The Archimedes method and cyclotomy method both take 166 iterations to get the first 100 decimal places.

Performance:

```
%timeit Archimedes(166)
466 µs ± 3.15 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit cyclotomy(166)
1.32 ms ± 19 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```

All the other methods take much longer time to compute and many more iterations to arrive at 100 decimal places.

P.S.

I know of the triangular function expressions of pi, for example:

pi = arccos(-1) and pi = 4 * arctan(1)

However they aren’t iterative algorithms and do need a value of pi predefined to get the results.

Though I do use them to check the validity of the results.