Project Euler

My solutions for Project Euler, most of them are written in C#. I have two aims for the solution – speed and readability.

Whenever possible, I exploit the math behind the problem to obtain a much faster solution than what can be done through bruteforcing the problem.

Project Euler 18: Maximum Sum from Top to Bottom of the Triangle

The problem presented by Project Euler in Problem 18 is an optimization problem where you need to find the route through a triangle which maximizes the sum.  The problem reads

By starting at the top of the triangle below and moving to adjacent numbers on the row below, the maximum total from top to bottom is 23.

3
7 4
2
4 6
8 5
9 3

That is, 3 + 7 + 4 + 9 = 23.

Find the maximum total from top to bottom of the triangle below:

75
95 64
17 47 82
18 35 87 10
20 04 82 47 65
19 01 23 75 03 34
88 02 77 73 07 63 67
99 65 04 28 06 16 70 92
41 41 26 56 83 40 80 70 33
41 48 72 33 47 32 37 16 94 29
53 71 44 65 25 43 91 52 97 51 14
70 11 33 28 77 73 17 78 39 68 17 57
91 71 52 38 17 14 91 43 58 50 27 29 48
63 66 04 68 89 53 67 30 73 16 69 87 40 31
04 62 98 27 23 09 70 98 73 93 38 53 60 04 23

NOTE: As there are only 16384 routes, it is possible to solve this problem by trying every route. However, Problem 67, is the same challenge with a triangle containing one-hundred rows; it cannot be solved by brute force, and requires a clever method! ;o)

I will present you with two different solutions for this Project Euler problem. First a brute force solution, which the note state is possible for this problem, but not for Problem 67.  And latter I will present a Dynamic programming solution which can be reused for Problem 67. However, before addressing the algorithm part of the answer I will address the data representation, and how to read the data input.

At each level in the pyramid we always have two choices; we can go either left or right. Choosing left all he time will make us end up in the lower left corner, while choosing right will end us up in the lower right corner.

Data Representation

I have chosen to store the input data in a 2-dimensional array (int[ , ] in C# syntax) for this problem, instead of an array of arrays (int[ ][ ] in C# syntax). I know that I could have saved almost half of the memory by doing it the other way, but I felt it was easier.

I have stored the problem in memory such in the following way

3 0 0 0
7 4 0 0
2 4 6 0
8 5 9 3

such that going right increases the index by one, and going left keeps the index, and the triangle is padded with zeros.

I wrote a small input function so I could read the problem data from a text file and store it in a 2-dimensional array. I will use the input reader for both solutions. The C# implementation of the input function looks like

private int[,] readInput(string filename) {
    string line;
    string[] linePieces;
    int lines = 0;

    StreamReader r = new StreamReader(filename);
    while ((line = r.ReadLine()) != null) {
        lines++;
    }

    int[,]  inputTriangle = new int[lines, lines];
    r.BaseStream.Seek(0, SeekOrigin.Begin);

    int j = 0;
    while ((line = r.ReadLine()) != null) {
        linePieces = line.Split(' ');
        for (int i = 0; i < linePieces.Length; i++) {
            inputTriangle[j, i] = int.Parse(linePieces[i]);
        }
        j++;
    }
    r.Close();
    return inputTriangle;
}

All it does is find the number of lines in the file, allocate an array that fits, and read in the data.

Brute Force

When answering the problem with a brute force solution, it is pretty simple to go through the steps, we just need to try all combinations. Since we have a binary choice each time. We can iterate through all possibilities with a normal integer counter, and use the bits of the number to pick the direction left or right. While running through the path, we sum up the numbers and check if they are larger than the current maximum found.

I chose an implementation where I bitshift to find the left/right direction. It can be implemented in C# as

string filename = Environment.GetFolderPath(Environment.SpecialFolder.DesktopDirectory) + "\\input.txt";
int[,] inputTriangle = readInput(filename);

int posSolutions = (int)Math.Pow(2, inputTriangle.GetLength(0) - 1);
int largestSum = 0;
int tempSum, index;

for (int i = 0; i <= posSolutions; i++) {
    tempSum = inputTriangle[0, 0];
    index = 0;
    for (int j = 0; j < inputTriangle.GetLength(0) - 1; j++) {         index = index + (i >> j & 1);
        tempSum += inputTriangle[j + 1, index];
    }
    if (tempSum > largestSum) {
        largestSum = tempSum;
    }
}

It is a simple and easy approach, which works fine for this problem. The output of the code is

The largest sum through the triangle is: 1074
Solution took 15 ms

Dynamic Programming

The given problem is a classical example of dynamic programming, and it really works well for it. The methodology is a bit more complex than the brute force solution, but I will take a shot at explaining it anyway.
I will use the four line example to explain the idea throughout this section.
Two Sub-problems

Standing at the top of the triangle we have to choose between going left and right. In order to make the optimal choice (which maximizes the sum), we would have to know how large a sum we can get if we go either way. So in order to answer the question we would basically have to solve the two smaller problems which I have marked with blue and the orange in the figure to the right.

Solving a sub-problemWe can break each of the sub-problems down in a similar way, and we can continue to do so until we reach a sub-problem at the bottom line consisting of only one number, then the question becomes trivial to answer, since the answer is the number it self. Once that question is answered we can move up one line, and answer the questions posed there with a solution which is a + max(b,c).

Once we know the answer to all 3 sub-problems on the next to last line, we can move up and answer the posed sub-problems by the same formula already applied. And we can continue to do so until we reach the original question of whether to go left or right.

Optimal Sub-structure

Optimal sub-structreIf we break down the problem into sub-problems we can see that breaking the orange sub-problem into two and breaking the blue sub-problem into two would yield us 4 sub-problems. However the sub-problem in the overlapping part is identical.  In this problem solving the sub-problem yields the same result no matter how we reached the it. This is fairly easy to see in this example. When a problem has this property it is said to have optimal sub-structure. Since we have a problem with optimal we only have to solve three sub-problem in the next to bottom line, and therefore the dynamic programming is effective.

It can be proven that the problem has an optimal sub-structure. However, I wont go into the details of that here, but leave that as an open end you can pick up and explore.

Savings with Dynamic Programming

If we want to solve the small problem with brute force, we would need to test all 8 paths, each resulting in 3 additions, in total 24 additions.

If we use dynamic programming, the first iteration would require 3 maximum comparison operations and 3 additions. The next line would require 2 maximum comparison operations and 2 additions, and the first line would require one of each. So a total of 6 maximum comparison operations and 6 additions.

For small problems this saving is small if any at all, but for a problem with 15 lines, solving the first iteration would and brute forcing from there would reduce the number of brute force additions from 15*214 to 14*213 a saving of approximately 131000 fewer additions, at the cost of 15 additions and 15 maximum comparison operations. That is a pretty good saving.

Dynamic Programming – The Algorithm

We can make a short-cut with the algorithm, as we don’t have to break the problem into sub-problems, but can start from the bottom and work the way up through the triangle until we reach the top and the algorithm spits out a number.

We start with a triangle that looks like

3
7 4
2 4 6
8 5 9 3

Applying the algorithm to the small problem we will need three iterations. The first iteration we apply the rule a + max(b,c) which creates a new triangle which looks as

3
7 4
10 13 15

Making the second iteration of the algorithm makes the triangle look

3
20 19

And if we run the algorithm once more, the triangle collapses to one number – 23 – which is the answer to the question.

My implementation of the algorithm in C# looks like

string filename = Environment.GetFolderPath(Environment.SpecialFolder.DesktopDirectory) + "\\input.txt";
int[,] inputTriangle = readInput(filename);
int lines = inputTriangle.GetLength(0);

for (int i = lines - 2; i >= 0; i--) {
    for (int j = 0; j <= i; j++) {
        inputTriangle[i, j] += Math.Max(inputTriangle[i+1,j], inputTriangle[i+1, j+1]);
    }
}

The output of the algorithm is

The largest sum through the triangle is: 1074
Solution took 3 ms

Closing Remarks

For Problem 18 as we have solved here, the difference is very small. However, once we apply the algorithms one Problem 67, the different approaches will be a matter of getting the answer or not.

I have seen another solution using Dynamic Programming over at Functional fun, where he goes the other way through the triangle. I think mine is easier to understand, and ends up with one solution rather than a series of numbers you need to find the maximum of, but I will let you decide.

As usual I have uploaded the source code for you to play around with. Any comments, suggestions or improvements are very welcome; both for the code and for the explanation. I love to get feedback on how to improve my communication and problem solving skills.

 

Posted by Kristian in Project Euler, 33 comments

Project Euler 17: Letters in the numbers 1-1000

Problem 17 of Project Euler changes character completely compared to the previous exercises. The problem reads

If the numbers 1 to 5 are written out in words: one, two, three, four, five, then there are 3 + 3 + 5 + 4 + 4 = 19 letters used in total.

If all the numbers from 1 to 1000 (one thousand) inclusive were written out in words, how many letters would be used?

NOTE: Do not count spaces or hyphens. For example, 342 (three hundred and forty-two) contains 23 letters and 115 (one hundred and fifteen) contains 20 letters. The use of “and” when writing out numbers is in compliance with British usage.

Continue reading →

Posted by Kristian in Project Euler, 23 comments

Project Euler 16: The sum of digits in 2^1000

Just like the solution to problem 13 the answer to problem 16 of Project Euler has become trivial with .NET 4.0. And since I am lazy I intend to exploit the tools I am given. The problem description reads

215 = 32768 and the sum of its digits is 3 + 2 + 7 + 6 + 8 = 26.
What is the sum of the digits of the number 21000?

If there didn’t exist a BigInteger class in .NET, then we would have needed to implement a way of storing the large result, which would need 1000 bits of storage, or around the size of 32 integers. Continue reading →

Posted by Kristian in Project Euler, 25 comments

Project Euler 15: Routes through a 20×20 grid

The problem description in Problem 15 of Project Euler contains a figure, which I wont copy, so go ahead an read the full description at the Project Euler site. The problem can be understood without it though. The problem reads

Starting in the top left corner of a 2×2 grid, there are 6 routes (without backtracking) to the bottom right corner.

How many routes are there through a 20×20 grid?

My first question for many of the problems has been – Can it be brute forced? And my best answer to that is “probably”, but I cannot figure out how to generate all the routes. So instead I will give you two other approaches, which are both efficient. One is inspired by dynamic programming and the other gives an analytic solution using combinatorics. Continue reading →

Posted by Kristian in Project Euler, 40 comments

Project Euler 14: Longest Collatz Sequence

Project Euler is asking a question regarding the Collatz Conjecture in Problem 14

The problem reads

The following iterative sequence is defined for the set of positive integers:

nn/2 (n is even)
n
→ 3n + 1 (n is odd)

Using the rule above and starting with 13, we generate the following sequence:

 

13 40 20 10 5 16 8 4 2 1

It can be seen that this sequence (starting at 13 and finishing at 1) contains 10 terms. Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at 1.

Which starting number, under one million, produces the longest chain?

Continue reading →

Posted by Kristian in Project Euler, 45 comments

Project Euler 13: Sum of 50-digit numbers

There is nothing particularly mathematically interesting in Problem 13 of Project Euler.  Since the question is about summing numbers. The tricky part of the question is that the numbers are so big they don’t fit into an ordinary data type. The question goes

Work out the first ten digits of the sum of the following one-hundred 50-digit numbers.

I have left out the actual numbers, but check the question for them.  I originally solved the question using arrays of integers to store each number in, and then I thought I was very smug and all. But when I sought a bit of inspiration on the internet before writing this blog post, I realised that many other languages has built in support for large integers. Continue reading →

Posted by Kristian in Project Euler, 23 comments

Project Euler 12: Triangle Number with 500 Divisors

Problem 12 of Project Euler has a wording which is somewhat different than previous problems. However, as we shall see deriving efficient solutions for the problem, we can use theory which is very similar to some of the previous problems.  The problem reads

The sequence of triangle numbers is generated by adding the natural numbers. So the 7th triangle number would be 1 + 2 + 3 + 4 + 5 + 6 + 7 = 28. The first ten terms would be:

1, 3, 6, 10, 15, 21, 28, 36, 45, 55, …

Let us list the factors of the first seven triangle numbers:

1: 1
3: 1,3
6: 1,2,3,6
10: 1,2,5,10
15: 1,3,5,15
21: 1,3,7,21
28: 1,2,4,7,14,28
We can see that 28 is the first triangle number to have over five divisors.

What is the value of the first triangle number to have over five hundred divisors?

Continue reading →

Posted by Kristian in Project Euler, 32 comments

Greatest product in 20×20 grid – Problem 11

We are more or less revisiting a well known problem. I here I am thinking of Problem 8. Only in Problem 11 of Project Euler. The problem has become two dimensional.

The problem reads

In the 20*20 grid below, four numbers along a diagonal line have been marked in red.

08 02 22 97 38 15 00 40 00 75 04 05 07 78 52 12 50 77 91 08
49 49 99 40 17 81 18 57 60 87 17 40 98 43 69 48 04 56 62 00
81 49 31 73 55 79 14 29 93 71 40 67 53 88 30 03 49 13 36 65
52 70 95 23 04 60 11 42 69 24 68 56 01 32 56 71 37 02 36 91
22 31 16 71 51 67 63 89 41 92 36 54 22 40 40 28 66 33 13 80
24 47 32 60 99 03 45 02 44 75 33 53 78 36 84 20 35 17 12 50
32 98 81 28 64 23 67 10 26 38 40 67 59 54 70 66 18 38 64 70
67 26 20 68 02 62 12 20 95 63 94 39 63 08 40 91 66 49 94 21
24 55 58 05 66 73 99 26 97 17 78 78 96 83 14 88 34 89 63 72
21 36 23 09 75 00 76 44 20 45 35 14 00 61 33 97 34 31 33 95
78 17 53 28 22 75 31 67 15 94 03 80 04 62 16 14 09 53 56 92
16 39 05 42 96 35 31 47 55 58 88 24 00 17 54 24 36 29 85 57
86 56 00 48 35 71 89 07 05 44 44 37 44 60 21 58 51 54 17 58
19 80 81 68 05 94 47 69 28 73 92 13 86 52 17 77 04 89 55 40
04 52 08 83 97 35 99 16 07 97 57 32 16 26 26 79 33 27 98 66
88 36 68 87 57 62 20 72 03 46 33 67 46 55 12 32 63 93 53 69
04 42 16 73 38 25 39 11 24 94 72 18 08 46 29 32 40 62 76 36
20 69 36 41 72 30 23 88 34 62 99 69 82 67 59 85 74 04 36 16
20 73 35 29 78 31 90 01 74 31 49 71 48 86 81 16 23 57 05 54
01 70 54 71 83 51 54 69 16 92 33 48 61 43 52 01 89 19 67 48

The product of these numbers is 26 * 63 * 78 * 14 = 1788696.

What is the greatest product of four adjacent numbers in any direction (up, down, left, right, or diagonally) in the 20*20 grid?

Continue reading →

Posted by Kristian in Project Euler, 18 comments

Sum of all Primes below 2000000 – Problem 10

This blog post is all about the solution to Problem 10 of Project Euler. Just like Problem 7 the problem is all about primes.  And the solution strategy I posted for problem 7 would be valid for this problem as well.

The problem reads

The sum of the primes below 10 is 2 + 3 + 5 + 7 = 17.

Find the sum of all the primes below two million.

There is nothing particular tricky about this question, and since there isn’t a formula for finding all primes, we will have to brute force a solution. However, the brute force solution can be more or less elegant. In the source code I have made available, you can find the solution approach from problem 7, adopted to this problem. I wont spent any more time on that method, but instead introduce you to Sieve of Eratosthenes.

Sieve of Eratosthenes

Sieve of Eratosthenes was as the name implies invented by Eratosthenes who was a Greek Mathematician living around 200 BC.

The algorithm needs to have an upper limit for the primes to find. Lets call this limit N

The algorithm works as follows.

  1. Create a list l of consecutive integers {2,3,…,N}.
  2. Select p as the first prime number in the list, p=2.
  3. Remove all multiples of p from the l.
  4. set p equal to the next integer in l which has not been removed.
  5. Repeat steps 3 and 4 until p2 > N, all the remaining numbers in the list are primes

It is a pretty simple algorithm, and the description of it on Wikipedia has a really nice graphical illustration, which I decided I couldn’t do better my self. So go and have a look at it. The algorithm finds a prime, and then marks all multiples of that primes.  The first new number will always be a prime, since all numbers which are not will have been removed.

It can be pretty straight forward to implement the algorithm, and the challenge is definitely to optimize the implementation both execution and memory wise.

I found an implementation two implementation which are similar over at Stack Overflow and digitalBush which seemed very promising. I have then further optimized it a bit. The optimized code is shown here

public int[] ESieve(int upperLimit) {
    int sieveBound = (int)(upperLimit - 1) / 2;
    int upperSqrt = ((int)Math.Sqrt(upperLimit) - 1) / 2;

    BitArray PrimeBits = new BitArray(sieveBound + 1, true);

    for (int i = 1; i <= upperSqrt; i++) {
        if (PrimeBits.Get(i)) {
            for (int j = i * 2 * (i + 1); j <= sieveBound; j += 2 * i + 1) {
                PrimeBits.Set(j, false);
            }
        }
    }

    List numbers = new List((int)(upperLimit / (Math.Log(upperLimit) - 1.08366)));
    numbers.Add(2);

    for (int i = 1; i <= sieveBound; i++) {
        if (PrimeBits.Get(i)) {
            numbers.Add(2 * i + 1);
        }
    }

    return numbers.ToArray();
}

Once we have a list of all the primes we need the rest of the code is a trivial for loop summing up the array. You can check the source code for that bit. In the following sections I will touch on different aspects of the code.

Data representation

It uses a BitArray to store all the numbers. It is a enumerable type which uses one bit per boolean. Using a BitArray means the algorithm will limit the memory usage by a factor 32 compared to an array of booleans according this discussion. However, it will lower the operational performance. We need an array to hold 2.000.000 numbers, which means a difference of 250kB vs. 8MB.

I played around with it a bit, and didn’t see much of a performance difference for a small set of primes. For a large set of primes I noticed that the BitArray was slightly faster. This is likely due to better cache optimization, since the the BitArray is easier to store in the CPU cache, and thus increasing the performance.

Eliminating even numbers

Over at digitalBush he optimizes his code by skipping the even numbers in the loops. I have chosen a bit of another approach, to avoid allocating memory for it. It just takes a bit of keeping track of my indexes.

Basically I want to  start with three and then for a counter i = {1,2,….,N/2} represent every odd number p = {3,5,7,….,N}. That can be done as p = 2i+1. And that is what I have done in the code. It makes the code a bit more complex, but saves half the memory, and thus it can treat even larger sets.

Furthermore we start our inner loop at p2 = (2i+1)(2i+1) = 4i2 + 4i + 1, which will have the index 2i(i+1), which is where we start the search of the inner loop. By increasing p=2i+1 indexes in every iteration of the inner loop, I am jumping 2p, and thus only taking the odd multiples of p. Since multiplying with an even number will give an even result, and therefore not a prime.

Sieve of Atkin

Another Sieve method to generate primes is the Sieve of Atkin. It should be faster than the Sieve of Eratosthenes. I have made a reference implementation of it, but I can’t wrap my head around how to optimize it, so I have never been able to optimize it to the same degree as the Sieve of Eratosthenes. However, I have included the reference implementation in the source code, so you can play around with it if you like.

Wrapping up

This post took a bit of a different approach. I have worked hard to really optimize the code. Since I believe that the Sieve method for finding primes will be used later. So making a reusable bit of fast code, will make later problems easier to solve.

I took three 3 different approaches and tried to optimize them.  The result of the three are

Prime sum of all primes below 2000000 is 142913828922
Solution took 203,125 ms using Trial Division
Prime sum of all primes below 2000000 is 142913828922
Solution took 31,25 ms using Sieve of Eratosthenes
Prime sum of all primes below 2000000 is 142913828922
Solution took 31,25 ms using Sieve of Atkin

The difference between the two sieve methods are not really noticeable for such small numbers, but if we start calculating all primes below one billion, the differences in execution time will show. And as usual you can download the source code.

If you can come up with further optimizations of the sieve methods, I would appreciate you to leave a comment.  I am always eager to learn more and speeding things up further.

Posted by Kristian in Project Euler, 31 comments

A 1000 Pythagorean triplets – Problem 9

Today’s problem in Project Euler, is related to probably THE most well known theorem of all times.  Pythagoras theorem stating that Geometric interpretation of Pythagoras

The square of the hypotenuse is the sum of squares of the two other sides.

It can also be stated in a more geometrical way as

In any right triangle, the area of the square whose side is the hypotenuse (the side opposite the right angle) is equal to the sum of the areas of the squares whose sides are the two legs (the two sides that meet at a right angle

Which can also be shown in a graphical sense, on the figure to the right, where the blue area equals the orange area.

But enough with the background info, the problem reads

A Pythagorean triplet is a set of three natural numbers, a < b < c, for which,

a2 + b2 = c2

For example, 32 + 42 = 9 + 16 = 25 = 52.

There exists exactly one Pythagorean triplet for which a + b + c = 1000.

Find the product abc.

Continue reading →

Posted by Kristian in Project Euler, 41 comments