This problem caused me quite a lot of trouble, and from what I gather afterwards no one seems to have found a really nice solution for this. The problem reads

The primes 3, 7, 109, and 673, are quite remarkable. By taking any two primes and concatenating them in any order the result will always be prime. For example, taking 7 and 109, both 7109 and 1097 are prime. The sum of these four primes, 792, represents the lowest sum for a set of four primes with this property.

Find the lowest sum for a set of five primes for which any two primes concatenate to produce another prime.

I found two ways to approach this a brute force method and a method involving intersecting sets. After a bit of optimization they are about the same speed.

## Brute force method

One thing I realised when solving this exercise was that the most expensive part of the program was to concatenate two primes and check if the result was a prime as well. So we don’t want to limit the amount of times we need to do this operation and avoid doing it more than once for the same primes.

I have assumed an upper limit of 30.000 was sufficient when I started to solve this problem, so I started out by making an array of all primes below 30.000 with a sieving method which I wont comment on here, but you can check out problem 10.

The next thing I wanted to do was to ensure that checking if two primes yielded a prime when concatenated wasn’t done more than once per pair. For this we need a data structure, and to avoid a 3245×3245 array to store these values in, I made an array of HashSets where the Hashset on index 0 is a list of all primes which can be concatenated with the first prime. The hash set on index k is the list of all primes larger than k which can be concatenated with k and yield a prime.

The method for generating such a set is rather simple and looks like

private HashSet MakePairs(int a) { HashSet pairs = new HashSet(); for (int b = a + 1; b < primes.Length; b++) { if (isPrime(concat(primes[a], primes[b])) && isPrime(concat(primes[b], primes[a]))) pairs.Add(primes[b]); } return pairs; }

Where functions such as concat and isPrime is written for earlier problems, and you can check them out in the source code if you like.

The main method for the brute force is method is pretty bulky and consists of 5 nested for loops. There is nothing graceful in it.

int result = int.MaxValue; primes = ESieve(30000); HashSet[] pairs = new HashSet[primes.Length]; for (int a = 1; a < primes.Length; a++) { if (primes[a] * 5 >= result) break; if (pairs[a] == null) pairs[a] = MakePairs(a); for (int b = a + 1; b < primes.Length; b++) { if (primes[a] + primes[b] * 4 >= result) break; if (!pairs[a].Contains(primes[b])) continue; if (pairs[b] == null) pairs[b] = MakePairs(b); for (int c = b + 1; c < primes.Length; c++) { if (primes[a] + primes[b] + primes * 3 >= result) break; if (!pairs[a].Contains(primes) || !pairs[b].Contains(primes)) continue; if (pairs == null) pairs = MakePairs(c); for (int d = c + 1; d < primes.Length; d++) { if (primes[a] + primes[b] + primes + primes[d] * 2 >= result) break; if (!pairs[a].Contains(primes[d]) || !pairs[b].Contains(primes[d]) || !pairs.Contains(primes[d])) continue; if (pairs[d] == null) pairs[d] = MakePairs(d); for (int e = d + 1; e < primes.Length; e++) { if (primes[a] + primes[b] + primes + primes[d] + primes[e] >= result) break; if (!pairs[a].Contains(primes[e]) || !pairs[b].Contains(primes[e]) || !pairs.Contains(primes[e]) || !pairs[d].Contains(primes[e])) continue; if (result > primes[a] + primes[b] + primes + primes[d] + primes[e]) result = primes[a] + primes[b] + primes + primes[d] + primes[e]; Console.WriteLine("{0} + {1} + {2} + {3} + {4} = {5}", primes[a], primes[b], primes, primes[d], primes[e], result); break; } } } } }

There are a few things to note here.

First thing is that we can’t stop once we find a solution, since we need to find the smallest sum, and that is indeed not the first sum we find so we have to keep looking.

The second thing to note is that I don’t call makePairs() before I need to, and that actually speeds the code up, which means that not all prime numbers will need to be checked.

The last thing I want to note is the break conditions, once we have found one solution we don’t need to search for prime sets where the sum is larger than the already found best solution.

Running the code gives the following result

7 + 1237 + 2341 + 12409 + 18433 = 34427 13 + 5197 + 5701 + 6733 + 8389 = 26033 Lowest sum of 5 primes 26033 Solution took 2722 ms

It finds two sets and then spends the majority of the time searching for other solutions without finding more. It is a pretty computational heavy exercise and solving it in under 3 seconds is not that impressive. I could have limited the upper bound on prime numbers to 10.000 and still found the solution, but keeping it at 30.000 ensures that I do indeed find the smallest sum.

## Set Intersections

The code for the second approach is fairly similar to the first approach, but the method is different.

Once again I need to generate the sets of pairs I described in the brute force method. The method then goes on to take the first set and find the smallest prime in that set. Using 3 as an example we get the set {19, 61, 97, 109, 127, 229, 283,…}. So we take 19 and generate the corresponding set and intersects the two. The result is all primes that can be concatenated with 3 and 19. We can then continue this operation until we have either

- An empty set, which means that the chosen primes do not fulfil the property
- A set of 4 primes which yield a non empty set. The smallest entry in this set is then the 5th prime we are looking for to generate a sum

The code I wrote for this strategy looks a lot like the first one. But with some more house hold code, since I need to keep track of the set I am working on to be able to back track once I find an empty set.

int result = int.MaxValue; primes = ESieve(30000); HashSet[] pairs = new HashSet[primes.Length]; for (int a = 1; a < primes.Length; a++) { if (primes[a] * 5 >= result) break; if (pairs[a] == null) pairs[a] = MakePairs(a); SortedSet testSet = new SortedSet(pairs[a]); for (int b = a + 1; b < primes.Length; b++) { if (primes[a] + primes[b] * 4 >= result) break; if (!testSet.Contains(primes[b])) continue; if (pairs[b] == null) pairs[b] = MakePairs(b); SortedSet tempA = new SortedSet(testSet); testSet.IntersectWith(pairs[b]); if (testSet.Count == 0) { testSet = tempA; continue; } for (int c = b + 1; c < primes.Length; c++) { if (primes[a] + primes[b] + primes * 3 >= result) break; if (!testSet.Contains(primes)) continue; if (pairs == null) pairs = MakePairs(c); SortedSet tempB = new SortedSet(testSet); testSet.IntersectWith(pairs); if (testSet.Count == 0) { testSet = tempB; continue; } for (int d = c + 1; d < primes.Length; d++) { if (primes[a] + primes[b] + primes + primes[d] * 2 >= result) break; if (!testSet.Contains(primes[d])) continue; if (pairs[d] == null) pairs[d] = MakePairs(d); SortedSet tempC = new SortedSet(testSet); testSet.IntersectWith(pairs[d]); if (testSet.Count == 0) { testSet = tempC; continue; } int e = testSet.Min; if (primes[a] + primes[b] + primes + primes[d] + e < result) result = primes[a] + primes[b] + primes + primes[d] + e; Console.WriteLine("{0} + {1} + {2} + {3} + {4} = {5}", primes[a], primes[b], primes, primes[d], e, result); testSet = tempC; } testSet = tempB; } testSet = tempA; } }

Not pretty, and the result is actually as slow as the brute force method. However writing this method helped me to immensely improve the brute force method which started out with a run time of 10 seconds or so.

The result of this code is

7 + 1237 + 2341 + 12409 + 18433 = 34427 13 + 5197 + 5701 + 6733 + 8389 = 26033 Lowest sum of 5 primes 26033 Solution took 2645 ms

## Wrapping up

This problem was rather easy to make a brute force attempt for once you have found a reasonable upper limit on the primes we want to investigate. However, the problem is computationally very expensive and getting it trimmed to yield just a reasonable performance has been pretty difficult.

I would like to see some faster code, but I haven’t figured out how to do that. The other side is that the current piece of code scales poorly. But the problem is solved in no less than two ways.

As always you can find the source code for download. I would as always love to hear from you if you have comments, questions or other solution strategies.

In lack of better ideas for the image for this post, I chose something from one of my favourite cartoons xkcd. I can recommend checking it out.

Hi Kristian, love this site.

My solution to this problem keeps returning the list-

3 37 1237 1699 8713 – sum being 11689. Any ideas?

I know it must be wrong… And i’m probably missing something very obvious…

Apologies if this is glaringly stupid!

Hi Luc.

There is one combination which is prime when concatenated: 1237 and 1699.

That is, 12371699 is not prime.

But all other combinations, according to my program, are prime.

Hope this helps. If not, you could show us your source code.

Ah yes of course. I was not thorough enough with my checks.

Thank you so much.

Hi Luc

Thanks for the compliment and thanks for asking the question. I don’t think it is stupid at all since I have tried infinite number times to be stuck on a problem and sometimes the solution is simple, but I still can’t see it.

I have been away all night meditating, but it seems Bjarki helped you well on your way, so I guess the problem is solved already.

/Kristian

Thank you Kristian,

The mistake in my solution related to a slight indexing error. As a result it was giving me answers that were almost correct. It’s very frustrating when the core logic is correct but a sloppy error can cause such errors!

Anyway, I find your site extremely helpful, interesting, educating and a great place to go when I’m stuck on project euler! Thanks again,

Luc

I have a question for you: how did you come up with the limit of 30,000 ? You say at some point that “keeping it at 30.000 ensures that I do indeed find the smallest sum.”. Why is that so ?

The limit of 30.000 was an assumption to begin with as the article mentions. Since we find a solution where the sum if less than 30.000 we know that any solution with a prime larger then 30.000 will give us a solution with a sum which is higher.

If I had set the limit to 10.000, I could not have been sure that there isn’t a solution with a prime above 10.000 that would have given me a smaller total sum.

/Kristian

I think a slightly optimized version of the brute force solution is to add primes as you go, checking to see if each prime you add is part of the set with 4 other primes. This means that instead of having 5 nested loops, you only need 4, since if you haven’t found the set before adding the last prime, it must contain that last prime if it does. Furthermore, you can add that last prime to your test set before starting the loops so you don’t repeat the smaller sequences of primes every time. Again, not sure if this actually optimizes the solution or is merely a different way of soliving it, but my program ran in 15 seconds.

This sounds like a very interesting approach. As you say, I am not sure it is actually more optimal to do it that way, but certainly a different way to do it.

Do you have the code for it, I would love to see it?

Certainly! Excuse any sloppiness. I don’t have a lot of experience programming and mostly do it for fun.

My isAnswer method checks that a given vector satisfies the prime question regardless of how many integers it holds

Hi, don’t you need a much larger sieve to test if the concatenated numbers prime?

How did u confirm that 18433’34427 and 34427’18433 are prime?

I mean 12409’18433 and 18433’12409…

Yes I would if I used the sieve to test the primality of the concatenated numbers. However, I guessed that would be rather inefficient, so instead I used a Rabin Miller primality test, something I did’t really touch on in the blog post but you can see it in the source code.

Great post as usual.

I made all the possible mistakes while solving this problem. I was checking the primality of concatenated numbers repeatedly. Once I figured that out, I used the grid method (by creating a 2D bool array). Populating the grid itself was taking more than 50 seconds. Once I implemented hashset, I forgot to continue and was still taking a lot of time.

The nice little check where we compare the result with the numbers is still pretty cool (it improves my time by a big margin). I learned so many little tricks to squeeze the every bit out of my program after looking at your post.

I started with 10000 primes and it was taking forever to loop through them. I still don’t know how did you come to the 30000 limit but that speeds up things quite a bit (10000 primes take 2 mins 2 seconds)

Thanks again for helping out poor souls like me.

I am now at a point where I have implemented my code very similar to you but I still take 23 seconds to get the result.Your machine takes 2722 ms (2.7 seconds). You have a heck a machine I say.

I took a bit of a different approach and only mention it because you said that looking for another possible solution after the first set of five primes was discovered took the bulk of the time.

What I did was keep a list of the sets of size two, three, and four primes as I went along. No nested loop. Each was a function call. I used a “next prime” and an “is it a prime” algorithm rather than computing a list of primes up front. And I did not need to go with something like the 30,000 max prime limit like you used.

I did not go the hash table route either. A sort of pseudo code in middle of the initial search looks like.

Compute the next larger prime.

Check the list of the sets of size four. Can the next prime be added to the any of the sets of four to make a set of five – if so add it.

Check the list of the sets of size three. Can the next prime be added to any of those sets – if so add it and put the new set in the list of sets of size four. You have to leave the original set in the list of sets of size three just in case there is another prime that will work with that set of three.

Check the list of the sets of size two. Can the next prime be added to any of those sets – if so add it and put it in the list of sets of size three.

Can the new prime be combined with a smaller (I saved those as I went along) to form a set of size two – if so add it to the list of sets of size two.

Loop through those steps until I find a set of size five. The sum as you found is 26033. This is basically a while loop with the stopping criterion nothing to stop it form going forever. I change that limit once the set of five prime is found.

The downside at this point is I am doing the same pairing multiple times as I work my way down the list of set. The primes 3 and 7 are in sets of size two, three, and four. So I do those each time they appear in the set. Your hash table setup can avoid this.

Once the set of five primes is found I know how much further I need to go. First I change the limit on the while loop to stop me at the largest prime that might work – 26033. A smaller number actually can be used and is a bit better choice. I can can restrict some of the calculations – much like you did. I basically loop through the sets as above with one change. When looking at the sets of size three for example if sum of the three primes plus twice the value of the new prime is greater than 26033 don’t do any additional work on that prime for that set.

I can still save move time if once I discover that say a set of three primes I have previously identified will not work with an new prime due to constraint on the sum I never have to look at that set again. So I should have deleted from my list.

Is there a reason why for the third loop (with running index c): You have

if (primes[a] + primes[b] + primes * 3 >= result) break;

if (!testSet.Contains(primes)) continue;

instead of

if (primes[a] + primes[b] + primes * 3 >= result) break;

if (!testSet.Contains(primes)) continue;

Isn’t primes an array?