This blog post is all about the solution to Problem 10 of Project Euler. Just like Problem 7 the problem is all about primes. And the solution strategy I posted for problem 7 would be valid for this problem as well.

The problem reads

The sum of the primes below 10 is 2 + 3 + 5 + 7 = 17.

Find the sum of all the primes below two million.

There is nothing particular tricky about this question, and since there isn’t a formula for finding all primes, we will have to brute force a solution. However, the brute force solution can be more or less elegant. In the source code I have made available, you can find the solution approach from problem 7, adopted to this problem. I wont spent any more time on that method, but instead introduce you to Sieve of Eratosthenes.

## Sieve of Eratosthenes

Sieve of Eratosthenes was as the name implies invented by Eratosthenes who was a Greek Mathematician living around 200 BC.

The algorithm needs to have an upper limit for the primes to find. Lets call this limit *N*

The algorithm works as follows.

- Create a list
*l*of consecutive integers {2,3,…,N}. - Select
*p*as the first prime number in the list, p=2. - Remove all multiples of
*p*from the*l*. - set
*p*equal to the next integer in*l*which has not been removed.

- Repeat steps 3 and 4 until
*p*^{2}>*N,*all the remaining numbers in the list are primes

It is a pretty simple algorithm, and the description of it on Wikipedia has a really nice graphical illustration, which I decided I couldn’t do better my self. So go and have a look at it. The algorithm finds a prime, and then marks all multiples of that primes. The first new number will always be a prime, since all numbers which are not will have been removed.

It can be pretty straight forward to implement the algorithm, and the challenge is definitely to optimize the implementation both execution and memory wise.

I found an implementation two implementation which are similar over at Stack Overflow and digitalBush which seemed very promising. I have then further optimized it a bit. The optimized code is shown here

public int[] ESieve(int upperLimit) { int sieveBound = (int)(upperLimit - 1) / 2; int upperSqrt = ((int)Math.Sqrt(upperLimit) - 1) / 2; BitArray PrimeBits = new BitArray(sieveBound + 1, true); for (int i = 1; i <= upperSqrt; i++) { if (PrimeBits.Get(i)) { for (int j = i * 2 * (i + 1); j <= sieveBound; j += 2 * i + 1) { PrimeBits.Set(j, false); } } } List numbers = new List((int)(upperLimit / (Math.Log(upperLimit) - 1.08366))); numbers.Add(2); for (int i = 1; i <= sieveBound; i++) { if (PrimeBits.Get(i)) { numbers.Add(2 * i + 1); } } return numbers.ToArray(); }

Once we have a list of all the primes we need the rest of the code is a trivial for loop summing up the array. You can check the source code for that bit. In the following sections I will touch on different aspects of the code.

### Data representation

It uses a BitArray to store all the numbers. It is a enumerable type which uses one bit per boolean. Using a BitArray means the algorithm will limit the memory usage by a factor 32 compared to an array of booleans according this discussion. However, it will lower the operational performance. We need an array to hold 2.000.000 numbers, which means a difference of 250kB vs. 8MB.

I played around with it a bit, and didn’t see much of a performance difference for a small set of primes. For a large set of primes I noticed that the BitArray was slightly faster. This is likely due to better cache optimization, since the the BitArray is easier to store in the CPU cache, and thus increasing the performance.

### Eliminating even numbers

Over at digitalBush he optimizes his code by skipping the even numbers in the loops. I have chosen a bit of another approach, to avoid allocating memory for it. It just takes a bit of keeping track of my indexes.

Basically I want to start with three and then for a counter i = {1,2,….,N/2} represent every odd number *p* = {3,5,7,….,N}. That can be done as *p* = 2i+1. And that is what I have done in the code. It makes the code a bit more complex, but saves half the memory, and thus it can treat even larger sets.

Furthermore we start our inner loop at p^{2} = (2i+1)(2i+1) = 4i^{2} + 4i + 1, which will have the index 2i(i+1), which is where we start the search of the inner loop. By increasing p=2i+1 indexes in every iteration of the inner loop, I am jumping 2p, and thus only taking the odd multiples of *p*. Since multiplying with an even number will give an even result, and therefore not a prime.

## Sieve of Atkin

Another Sieve method to generate primes is the Sieve of Atkin. It should be faster than the Sieve of Eratosthenes. I have made a reference implementation of it, but I can’t wrap my head around how to optimize it, so I have never been able to optimize it to the same degree as the Sieve of Eratosthenes. However, I have included the reference implementation in the source code, so you can play around with it if you like.

## Wrapping up

This post took a bit of a different approach. I have worked hard to really optimize the code. Since I believe that the Sieve method for finding primes will be used later. So making a reusable bit of fast code, will make later problems easier to solve.

I took three 3 different approaches and tried to optimize them. The result of the three are

Prime sum of all primes below 2000000 is 142913828922 Solution took 203,125 ms using Trial Division Prime sum of all primes below 2000000 is 142913828922 Solution took 31,25 ms using Sieve of Eratosthenes Prime sum of all primes below 2000000 is 142913828922 Solution took 31,25 ms using Sieve of Atkin

The difference between the two sieve methods are not really noticeable for such small numbers, but if we start calculating all primes below one billion, the differences in execution time will show. And as usual you can download the source code.

If you can come up with further optimizations of the sieve methods, I would appreciate you to leave a comment. I am always eager to learn more and speeding things up further.

Here is my LINQ version of your Sieve Of Eratosthenes implementation:

https://gist.github.com/999053

I used a bool array instead of a BitArray for more speed (even though it allocates more memory).

Hi SuprDewd

First of all, thanks for sharing the code.

It is funny that you mention this piece of code right now. I was just messing around with it to speed it up. I had a problem which I could solve in 11ms and the 10 of them was used to generate primes.

I had the hypothesis that there would be two speed ups in the code.

1. Changing the BitArray to a bool array, which gives a speed up of 30-40%. This could be significantly reduced for larger problems, as we might not be able to hold all the information in cache.

2. Avoiding to convert it all to an int array in the end. This turned out to make no difference.

Even a search on the internet didn’t really turn up anything faster, so either this is pretty good or I am missing something obvious.

/Kristian

Your sieve implementation is the fastest I’ve found, apart from http://cr.yp.to/primegen.html.

I am always on the hunt for faster prime generators.

You might want to take a look at:

https://github.com/SuprDewd/SharpBag/blob/master/SharpBag/Math/BagMath.cs

This is what I use for prime problems.

I have 3 prime sieve methods and 1 optimized trial division method with the primes below 2000 cached.

I also have a Miller-Rabin test for BigIntegers here:

https://github.com/SuprDewd/SharpBag/blob/master/SharpBag/Math/MathExtensions.cs

Hope you find some of that interesting.

The prime generator you are pointing to looks really fast, I haven’t tried it though.

Also, nice work on the math library you compiled, it looks like a huge collection you have made by now. Some of it could be optimized, but for most practical application it should be awesome.

/Kristian

Thanks 😀

I also use my library for private/school projects, so I’ve packed a lot more than just math functions into it. All from networking to Database stuff to various little helpers.

If there’s anything that you can optimize or make better in some way, please, feel free to help me through GitHub.

Hi Kristian,

i need your help in understanding your approach.

doubts :-

1) why BitArray size is 1/2 of the upperLimit.

eg upperLimit=30 and you are considering upperLimit = 15.

but 29 is prime number, hence it will not be considered.

2) in first iteration

j=4

j=7 -> but 7 is prime and you are setting it false.

I guess your iteration starts from index 2.

0 1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 –> you are setting this ..so at 7th position = 8 so we mark it false.

Please help me in understanding this.

Thanks in advance.

replacing space by ‘.’

above format get messed up.

0 1 2 3 4 5 6 7 8

….1 2 3 4 5 6 7 –> you are setting this ..so at 7th position = 8 so we mark it false.

I think the missing part is the fact that I only consider odd numbers. so index 0 represents 3, index 1 represents 5 and so on.

Does it make sense now?

ok , i got the idea…but still have some confusion

0 1 2 3 4 5 6 7 – bitArray

3 5 7 9 11 13 15 17 – representation

now in inner loop :-

j=4 which is equal to 11…you are setting it to false but 11 is prime number.

now j=7

again set 7 to false which is equivalent to 17(as represented in above array) – which is prime

where i am getting it wrong? 🙁

Sorry I forgot. The indexing starts at 1. So index 0 represents 1. Does it fit then?

yes now it fine,

now my last doubt is about the inner loop.As mentioned in Wikipedia link

..make every 2p,3p,4p element as false.

but according to your inner loop , it seems to me that it follows some rule which states that for a given value j=some number , every element at 2*i+1 is non-prime.So how can we say that element at that index is a non-prime number ?

so what formula are you applying while making it false.(like in Wikipedia it is 2p , 3p , 4p etc)

Try to go through it. I think it will make sense. Let me give you an example.

If we take i= 1, we are at the number p=3, which we agree is prime.

We then need to eliminate every multiple of 3 which is 6,9,12,15. However, we don’t really count 6 and 12, since they are even. So we don’t need to eliminate those. So we increment 2p everytime instead. Furthermore we can start at p^2 according to wikipedia.

Which is also what I try to explain in the “eliminating even numbers”

We have that p^2 in terms of i is (2i+1)^2 = 4i^2 + 4i + 1. If we want to express that on the form 2(xxx) – 1 we get that 2(2i^2+2i) + 1 or 2i(i+1) if you like. This is our starting number.

After that we need to add 2p = 2*(2i+1), but since we only count every other number, the increments should be halved so we divide by 2 and get that we should increment the counter with 2i+1.

Lastly an example:

We start at j=i*2*(i+1) = 4 (which corresponds to p=9=3^2).

We then increment the counter 2*i+1 = 3 so we reach j=4+3 = 7 (which corresponds to 15).

is below code is equivalent faster then the one posted or its slower ?

or your code is more optimized then the one i have posted below ?

please let me know

my above code is not correct , here is updated one :-

Looking at the code I would guess mine is faster, since pow is usually slow and division is slower than multiplication. But why don’t you make the comparison yourself?

i had doubt that by incrementing it by 2 every time and running it upto upperSqrt will nullify the use costlier operation i am using in inside loop.

thanks you for the time 🙂

Well, did you try to time it? I would love to see the results.

yes , For num = 2000000

your’s took = Solution took 23.177 ms

mine took = Solution took 45.302 ms

Hey.. This is really nice blog.. liked it a lot..

By the way.. I tried this code to calculate sum of all primes below 30.

TrialDivision() function calculates sum as 160 which is wrong where as other two “Sieve” functions calculate it correct:129.

Hi Pradnya

Yes I can see that it did. It wrongly counted 31 as being a prime below 30 as well. It is fixed now. Thanks for noticing.

/Kristian

Thanks Kristian..! Happy coding 🙂

Hey Kristian,

I was trying this problem in C. Since the elements given are 2 million I tries to use an array of 1 million elements. But some how my program crashes while running. It works fine for 50000 elements. So is there a limit to the size of array in C. What can I do to fix the issue?

Any help would be highly appreciated.

My c skills are very limited. But it is likely that there is an upper bound for array sizes. It may be limited by the max integer value numer bytes and when you try to store an integer you use 4 or 8 bytes. But I am not really sure what happens to you.

A crash while running sounds more like a pointer error though.

Table 1

Sum of all Prime numbers below 2 000 000

Re: Project Euler – Problem 10

Sum of all Primes below 2000000

http://img11.hostingpics.net/pics/260336pe10tableone.jpg

Table 2

Prime numbers below 2 000 000

Re: Project Euler – Problem 10

Sum of all Primes below 2000000

http://img11.hostingpics.net/pics/579534pe10table2.jpg

___

Sources:

1) Project Euler 10

Kristian’s algorithm

http://www.mathblog.dk/files/euler/Problem10.cs

2) Microsoft Visual C# 2010 Express

3) Microsoft Office Excel (2007)

4) MathBlog

Prime Factorization Calculator

http://www.mathblog.dk/tools/prime-factorization/

5) BIG Primes.NET

http://www.bigprimes.net/index.php

public static void runProblem10(){

long sum = 2;

for(int i = 3;i<2000000;i+=2){

if(isPrime(i)){sum+=i;}

}System.out.println("Sum: "+sum);

}

public static boolean isPrime(int num){

for(int i = 3;i<=(int)Math.sqrt(num)*2;i+=2){

if(num%i==0 || num==i)return false;

}return true;

}

4 seconds – please critique my code for improvements 😀

edit to previous comment: change

“for(int i = 3;i<=(int)Math.sqrt(num)*2;i+=2){"

to

"for(int i = 3;i<=(int)Math.sqrt(num);i+=2){"

to cut the time in half – around 2 sec 😀

I think inclusion-exclusion would be faster, or if you’re doing this this way why not add up all the numbers you’re setting up as false instead of removing them and subtract that from the total sum of 1+…2 million?

Because I wanted to develop a general sieving algorithm which I could use later on, and I wanted the output to be a list of primes. So you might be right, but that was the reason for doing it this way.

why do I get 142913828917…its just 5 short of the right answer..so wat could be the problem??

got it..left out 3 and 2

Hello,

i calculated the sum as 142,915,960,832 with a C program and with a Python program on the same computer i got 142,913,828,922.

Line by line comparison of the two summing processes shows “rounding” errors in the Python program execution. This is somewhat irritating as one would not expect rounding errors using integer values.

I am sure that the first sum is the right one.

If someone tells me a way i can upload two data files showing the differences.