This blog post is all about the solution to Problem 10 of Project Euler. Just like Problem 7 the problem is all about primes. And the solution strategy I posted for problem 7 would be valid for this problem as well.
The problem reads
The sum of the primes below 10 is 2 + 3 + 5 + 7 = 17.
Find the sum of all the primes below two million.
There is nothing particular tricky about this question, and since there isn’t a formula for finding all primes, we will have to brute force a solution. However, the brute force solution can be more or less elegant. In the source code I have made available, you can find the solution approach from problem 7, adopted to this problem. I wont spent any more time on that method, but instead introduce you to Sieve of Eratosthenes.
Sieve of Eratosthenes
Sieve of Eratosthenes was as the name implies invented by Eratosthenes who was a Greek Mathematician living around 200 BC.
The algorithm needs to have an upper limit for the primes to find. Lets call this limit N
The algorithm works as follows.
- Create a list l of consecutive integers {2,3,…,N}.
- Select p as the first prime number in the list, p=2.
- Remove all multiples of p from the l.
- set p equal to the next integer in l which has not been removed.
- Repeat steps 3 and 4 until p2 > N, all the remaining numbers in the list are primes
It is a pretty simple algorithm, and the description of it on Wikipedia has a really nice graphical illustration, which I decided I couldn’t do better my self. So go and have a look at it. The algorithm finds a prime, and then marks all multiples of that primes. The first new number will always be a prime, since all numbers which are not will have been removed.
It can be pretty straight forward to implement the algorithm, and the challenge is definitely to optimize the implementation both execution and memory wise.
I found an implementation two implementation which are similar over at Stack Overflow and digitalBush which seemed very promising. I have then further optimized it a bit. The optimized code is shown here
public int[] ESieve(int upperLimit) {
int sieveBound = (int)(upperLimit - 1) / 2;
int upperSqrt = ((int)Math.Sqrt(upperLimit) - 1) / 2;
BitArray PrimeBits = new BitArray(sieveBound + 1, true);
for (int i = 1; i <= upperSqrt; i++) {
if (PrimeBits.Get(i)) {
for (int j = i * 2 * (i + 1); j <= sieveBound; j += 2 * i + 1) {
PrimeBits.Set(j, false);
}
}
}
List numbers = new List((int)(upperLimit / (Math.Log(upperLimit) - 1.08366)));
numbers.Add(2);
for (int i = 1; i <= sieveBound; i++) {
if (PrimeBits.Get(i)) {
numbers.Add(2 * i + 1);
}
}
return numbers.ToArray();
}
Once we have a list of all the primes we need the rest of the code is a trivial for loop summing up the array. You can check the source code for that bit. In the following sections I will touch on different aspects of the code.
Data representation
It uses a BitArray to store all the numbers. It is a enumerable type which uses one bit per boolean. Using a BitArray means the algorithm will limit the memory usage by a factor 32 compared to an array of booleans according this discussion. However, it will lower the operational performance. We need an array to hold 2.000.000 numbers, which means a difference of 250kB vs. 8MB.
I played around with it a bit, and didn’t see much of a performance difference for a small set of primes. For a large set of primes I noticed that the BitArray was slightly faster. This is likely due to better cache optimization, since the the BitArray is easier to store in the CPU cache, and thus increasing the performance.
Eliminating even numbers
Over at digitalBush he optimizes his code by skipping the even numbers in the loops. I have chosen a bit of another approach, to avoid allocating memory for it. It just takes a bit of keeping track of my indexes.
Basically I want to start with three and then for a counter i = {1,2,….,N/2} represent every odd number p = {3,5,7,….,N}. That can be done as p = 2i+1. And that is what I have done in the code. It makes the code a bit more complex, but saves half the memory, and thus it can treat even larger sets.
Furthermore we start our inner loop at p2 = (2i+1)(2i+1) = 4i2 + 4i + 1, which will have the index 2i(i+1), which is where we start the search of the inner loop. By increasing p=2i+1 indexes in every iteration of the inner loop, I am jumping 2p, and thus only taking the odd multiples of p. Since multiplying with an even number will give an even result, and therefore not a prime.
Sieve of Atkin
Another Sieve method to generate primes is the Sieve of Atkin. It should be faster than the Sieve of Eratosthenes. I have made a reference implementation of it, but I can’t wrap my head around how to optimize it, so I have never been able to optimize it to the same degree as the Sieve of Eratosthenes. However, I have included the reference implementation in the source code, so you can play around with it if you like.
Wrapping up
This post took a bit of a different approach. I have worked hard to really optimize the code. Since I believe that the Sieve method for finding primes will be used later. So making a reusable bit of fast code, will make later problems easier to solve.
I took three 3 different approaches and tried to optimize them. The result of the three are
Prime sum of all primes below 2000000 is 142913828922
Solution took 203,125 ms using Trial Division
Prime sum of all primes below 2000000 is 142913828922
Solution took 31,25 ms using Sieve of Eratosthenes
Prime sum of all primes below 2000000 is 142913828922
Solution took 31,25 ms using Sieve of Atkin
The difference between the two sieve methods are not really noticeable for such small numbers, but if we start calculating all primes below one billion, the differences in execution time will show. And as usual you can download the source code.
If you can come up with further optimizations of the sieve methods, I would appreciate you to leave a comment. I am always eager to learn more and speeding things up further.