# Sieve of Eratosthenes

dodo · May 21, 2020

In this notebook, I implement a few versions of the sieve of Eratosthenes for finding prime numbers.

Starting from $n=2$ and given the set $P$ of prime numbers up until $n-1$, we check if $n$ is divisible by any $p \in P$. If not, $P \leftarrow P + \{n\}$.

from math import sqrt

from math import sqrt, pi
import numpy as np

cached_primes = [2]
evaluated = 2


The first version of the sieve works just like explained at the top.

def eratosthenes(n):
primes = []
for i in range(2, n):
for p in primes:
if i % p == 0:
break
else:
primes.append(i)

return primes


The second version starts with the number 2 (the only even prime) already in the set of primes and only checks the primality of odd numbers. It should be twice as fast.

def eratosthenes_no_evens(n):
primes = [2]
for i in range(3, n, 2):
for p in primes:
if i % p == 0:
break
else:
primes.append(i)

return primes


The last version uses all the gimmicks from the previous ones, but also caches the primes found in previous runs and reuses them in subsequent calls of the function.

def eratosthenes_cached(n):
global evaluated, cached_primes

if evaluated < n:
start = evaluated if evaluated % 2 != 0 else evaluated + 1
for i in range(start, n + 1, 2):
for p in cached_primes:
if i % p == 0:
break
else:
cached_primes.append(i)

evaluated = n

for i in range(len(cached_primes)):
if cached_primes[i]>=n:
return cached_primes[:i]

return cached_primes


Check if the output of all functions is equal for an arbitrary number of input integers.

from numpy.random import randint

for v in randint(5000, size=100):
a = eratosthenes(v)
b = eratosthenes_no_evens(v)
c = eratosthenes_cached(v)

if not a == b == c:
raise RuntimeError('not equal')


Let’s check the performance of the three functions.

%timeit -r 5 -n 30 [eratosthenes(v) for v in randint(5000, size=100)]
%timeit -r 5 -n 30 [eratosthenes_no_evens(v) for v in randint(5000, size=100)]
%timeit -r 5 -n 30 [eratosthenes_cached(v) for v in randint(5000, size=100)]

30 loops, best of 5: 430 ms per loop
30 loops, best of 5: 404 ms per loop
30 loops, best of 5: 200 ms per loop


We can see that, because the third function caches primes from previous runs, it is twice as fast than the other two.