Problem 124 of Project Euler deals with radical functions, something I have never heard of before. So that always makes it interesting. It reads
The radical of n, rad(n), is the product of distinct prime factors of n. For example, 504 = 2^{3} x 3^{2} x 7, so rad(504) = 2 x 3 x 7 = 42.
If we calculate rad(n) for 1 ≤ n ≤ 10, then sort them on rad(n), and sorting on n if the radical values are equal, we get:
Unsorted

Sorted


n 
rad(n) 
n 
rad(n) 
k 

1

1

1

1

1


2

2

2

2

2


3

3

4

2

3


4

2

8

2

4


5

5

3

3

5


6

6

9

3

6


7

7

5

5

7


8

2

6

6

8


9

3

7

7

9


10

10

10

10

10

Let E(k) be the kth element in the sorted n column; for example, E(4) = 8 and E(6) = 9.
If rad(n) is sorted for 1 ≤ n ≤ 100000, find E(10000).
At first I thought I could do it by generating number with the smallest radical number, such that I started by taking 2, 4, 8, 16… which are all multiple of 2. Followed by 3,9, 27….. But then I realized that I had to keep track of that I should take the numbers with a radical of 5, followed by 2*3 followed by 7, and soon it started to seem a lot less like a good idea. So I abandoned the idea of just generating the numbers in the correct order until I reached E(10000).
Sieving all the numbers
Instead I started to look into a method such that I could generate all the numbers through a sieving method, like we have done several times before. And then sort it afterwards. The sieving turns out to be rather easy. I have made a small extra class in order to sort it, which I will get back to in a few paragraphs. But the sieving looks like
Radical[] radicals = new Radical[limit]; for (int i = 0; i < radicals.Length; i++) { radicals[i] = new Radical(1, i); } for (int i = 2; i < limit; i++) { if (radicals[i].rad == 1) { radicals[i].rad = i; for (int j = i + i; j < limit; j += i) { radicals[j].rad *= i; } } }
We are starting at 2 and go through every multiple of 2 and multiplying the radical with that. After that we take 3 and since the current value of 3 is 1, it means that it is indeed a prime, so we do the same here. We then continue until we have generated all radicals.
Sorting
C# has a nice array sort, which should be usable on the problem. However, we have two requirements, we need to sort by radical and then by number. Which is why I made a small helper class which holds both and then implement the IComparable interface, such that I can make my own compare function for the sorting algorithm. This class looks like
private class Radical : IComparable { public int number, rad; public Radical(int rad, int number) { this.number = number; this.rad = rad; } int IComparable.CompareTo(object x) { Radical other = x as Radical; if (this.rad > other.rad) return 1; if (this.rad < other.rad) return 1; return this.number > other.number ? 1 : 1; } }
Not exactly rocket science, but it solves the problem of sorting. Once the array is sorted, the rest is trivial, since it is just picking entry 10000.
Wrapping up
The solution takes 55ms, which I am not too keen about. However, I haven’t found a better solution. I get the result
E(10000) = 21417
You can see the complete source code here. I am quite keen to learn your solution approach if it is different.
Cheated using factor function in Matlab:
function [ answer ] = problem124()
for i=1:100000
rad(1,i) = prod(unique(factor(i)));
rad(2,i) = i;
end
rad = sortrows(rad')';
answer = rad(2,10000);
end
The code ran little bit slow (~15 seconds), may be due to factor function implementation in Matlab, but it works for me.
Yup, Matlab is an awesome tool for this kind of problem, though not really known for it’s speed 🙂
One nice thing about Matlab though, is the profiling tool which can help you identify the costly operations.
My original solution was pretty much the same as yours. But now when I was reading your solution, I came up with an idea that, in theory, should be faster.
It’s very similar to generating integers of the form [latex]2^i3^j[/latex] in ascending order using a (min) priority queue. You start by pushing 1 onto the queue ([latex]2^03^0[/latex]), and then in a loop pop the smallest value from the priority queue, let’s call it [latex]x[/latex], and then push both [latex]2x[/latex] and [latex]3x[/latex].
For this problem, we have a priority queue. Each item is a struct containing the actual number, the radical of that number, and finally the index of the prime that is the largest prime factor of the number. First off, we find all the prime numbers up to 100000 using an optimized sieve. Then for each of the primes, we push that prime onto the queue. The radical function of that prime is of course the prime itself, and the index of the largest prime factor of that prime is of course the index of the prime itself. The priority queue is minbased and uses the radical value as the key (and the number itself if the radical values are equal). Then we create a loop, similar to the loop above, that pops the next value from the queue and then does a loop through each prime with an index greater than or equal to the index of the largest prime factor in the current number, and pushes the current number * the current prime number we are on, and of course with the other values adjusted appropriately.
The 10000th element we pop from the queue is then our answer.
The advantage of this solution is that we are kind of only sorting the first 10000 elements, instead of the whole 100000 elements (as is done in the naive solution).
So the solution looks as follows:
To my surprise, it ran in the exact time as my naive solution (~30ms on average), so I tried it with other inputs, to see how the two solutions would differ.
Test 1 (large n, small k):
n = 10000000
k = 100
Naive solution: 4470ms
Using PQ: 530ms
Test 2 (large n, moderately large k):
n = 10000000
k = 10000
Naive solution: 4470ms
Using PQ: 1260ms
Test 3 (large n, large k):
n = 10000000
k = 10000000
Naive solution: 4470ms
Using PQ: 20100ms
These tests went exactly as I thought they would go. The solution with the priority queue is slower than the naive solution (by a factor of ~45), but it’s advantage is that it doesn’t do more than is needed (that is, only processes the [latex]k[/latex] first values).
So my conclusion is that, since, in the problem, [latex]k[/latex] is smaller than [latex]n[/latex], it can be used as an advantage to create a faster solution. To get the running time down even more probably just means more optimization.
But fun problem and good article.
Actually my first solution attempt, as you can see from the description is very similar to the one you describe. However, I couldn’t manage to code it. So nicely done. I am a bit surprised that it is that much slower, but I guess involving a queue and pushing/popping from that gives some overhead.
Using python and the factor function from http://blog.dreamshire.com/2009/03/26/94/
Leads to a solution that runs in under 2 seconds and doesn’t use a prime sieve. I’m not sure it scales up too well though
I used the sieve approach like you. I don’t think there’s a faster method for smallish N. Rather than sorting, however, I took this as a good opportunity to learn about the quickselect algorithm, which is like quicksort but only branches down the one side that brings you closer to finding the kth element. In c++, mine runs in about 5ms.
For large N (e.g. 10^18), you need to generate radicals sequentially using a sieve, then you can count how many n’s there are for a given radical using loops and logarithms (e.g. for a radical consisting of one prime p1, count(n)=log_p1(N/p1)+1 ). Eventually you can determine what the target radical is, then you can generate the n’s in sequential order using some loops and a priority queue.
For N=10^18, I get E(1000000)= 98403614720000000 in 30ms.
I did it in C with a prime sieve multiplying like you did and stdlib’s qsort, so far equivalent to your solution, but instead of having a data structure containing two numbers, I had two copies of one integer array, sorted one of them, looked at sorted array position [10000] and counted backwards the number of equal values, then I did one forloop accumulating += (unsorted==sorted[10000]) stopping at the count found in the first pass.
Therefore I got away with using simpler data structures and not having to care about the sorting being “stable”, but at the cost of some post processing calculations.