Project Euler 267: Billionaire

It has been a long while since I solved any Project Euler problem. For some reason I read an article about it and someone references Problem 267, so I decided to take a look at it, and it sucked me in. The problem reads

You are given a unique investment opportunity.

Starting with £1 of capital, you can choose a fixed proportion, f, of your capital to bet on a fair coin toss repeatedly for 1000 tosses.

Your return is double your bet for heads and you lose your bet for tails.

For example, if f = 1/4, for the first toss you bet £0.25, and if heads comes up you win £0.5 and so then have £1.5. You then bet £0.375 and if the second toss is tails, you have £1.125.

Choosing f to maximize your chances of having at least £1,000,000,000 after 1,000 flips, what is the chance that you become a billionaire?

All computations are assumed to be exact (no rounding), but give your answer rounded to 12 digits behind the decimal point in the form 0.abcdefghijkl.


binomialtreeThis problem actually combines several things which I know just a little about. The first thing, is that the problem can be expressed in terms of binomial trees. Which is something that is quite used for options pricing.

For this example I have chosen f= 1/4. We start on the left side of the tree with 1, and if we get heads we go up from there such that we now have 1.5, and if we get get tail we go down to 0.75.

For the second throw we can the either win or loose again, and depending on the first throw we end up in one of three positions. 2 heads, 1 head and 1 tail, or 2 tails.

What you can see is that we can now get all the outcomes sorted in how much we end up with. If we have a thousand throws and a given f, we will find that somewhere along the list of the 1000 different outcomes we will reach our limit of 1,000,000,000. All numbers above that will also fulfil the goal. So what we are looking for is a proportion f that pushes the limit as far down the list as possible. In other words an f that minimizes the number of heads we need.

In general we can express the amount in each of the end states after a thousand throws as

where is the starting capital of 1, u is the proportion we earn if we get a heads, and d is the proportion we earn if we get a tail. h is the number of heads.
We can express u and d as a function of f. In case of u, we get the two times f plus our initial capital before the throw. meaning that , in case of d, we loose f, meaning that we have left.
So for each of the 1000 possible outcomes we can express our capital S as

How does this help us? Well for a given f there is a minimum number of heads we need in order to earn more than S >= 1,000,000,000. And the fewer heads we need in order to earn at least our billion pounds the larger the probability of doing so. So let us see if we can isolate h and f.

So what we have now is that we need to find an f that minimizes the left hand side of the function. By the way, let us denote the left hand side L(f)

It is quite possible that you could differentiate this and find a minimum. I haven’t been able to do so easily, so I decided to implement a numeric algorithm for this instead. For this problem we can make a simple gradient descent, which isn’t as complicated as Wikipedia might make it look like.

In fact we can implement it as

double f = 0.5;
double alpha = 1e-4;
double fold = 0;

while(Math.Abs(f-fold) > 1e-4){
    double fdev = (L(f + 1e-6) - L(f)) / 1e-6;
    fold = f;
    f -= alpha * fdev;

private double L(double f) {
    return (9 * Math.Log(10) - 1000 * Math.Log(1 - f)) / (Math.Log(1 + 2 * f) - Math.Log(1-f));

Let me explain the algorithm. On line 11-13 we have an evaulation of the function we want to minimize.
On line 6 we calculate the gradient of the function using the numeric approximation


where h is a really small number. In our case 10-6.  On line8 we then move f a step in the direction the gradient point, this is modifed by some factor alpha. Usually I start out with 10-4 as alpha. If alpha is too big the algorithm never converges, and if it is too small it will take a long time. However, 10-4 is usually a good value. We continue to update f until the step we are taking gets below a certain threshold. In our case 10-4 again.

Doing this, gives us that an f value of 0.146, which in turn gives us that we need 431.25 heads in order to get at least 1,000,000,000 after 1000 throws. Since the number of heads needs to be an integer that means we need 432 heads in order to reach our limit.

Now that we have the minimum number of heads we need in order to reach our limit we can calculate the chance of getting that.  This can be calculated as number of combinations that gives us 432 or more heads divided by the total number of combinations. In the denominator we have  combinations.

So let us start by counting combinations that will give us 432 or more heads. For exactly 432 heads we can use “1000 choose 432”, which is a standard binomial coefficient. We can then calculate this for all numbers up to 1000 choose 1000.

So from before we have the minimum f, and then we can implement

BigInteger totalComb = BigInteger.Pow(2, 1000);
BigInteger winningComb = 0;

for(int h = (int) Math.Ceiling(L(f)); h <= 1000; h++){
    winningComb += Choose(1000, h);

This is a really large number, so we need to use big integer. So far we have the numerator and denominator in absolute numbers, so now we can calculate the chance of winning in the format the question asks.

Wrapping up

It took me 112ms to get the solution, and the majority of the time is spent in the Big integer calculations. You can of course download the source code here.

The blog image is created by Philip Taylor and shared under Creative commons

Posted by Kristian


Scott Dennison

In the large equation section where you isolate h and f, the second and third line are the same.
In addition, please could you add an extra line, for people like me in the future, who forgot the rules ln(m^n) = n*ln(m), and ln(m*n) = ln(m) + ln(n).

Other than that, nice explanation 🙂 I still don’t understand gradient descent so I will research that more before I attempt this.

Jean-Marie Hachey

Hi Kristian,

Good to see you back on a new problem.

Running your algo gives me the following results:
1) The chance of winning is maximized at 0,146982289954167
2) The chance of winning is 0,999992836186714

My interpretation:
After betting on a fair coin toss repeatedly for 1000 tosses
in the conditions described in the article:

Chances of winning 14,698…% of the time are 99,999…%

Do you agreee with this interpretation ?

Thanks for this great article and algo.

This problem can be solved without using bigint. n choose k and 2**1000 can both be computed in floating point; you don’t need all of the signifiicant digits you get using large integers. Since you convert the numbers back to floating point anyway, your program would break if the problem asked for more than 1023 tosses (the max exponent in a floating point number.)

@scott: The third line is remove, but I am unsure where I should add the extra rules. If nothing else, I guess your comment covers it 🙂

The gradient descent idea is pretty simple. I like to visualise it in a three dimensional graph, such that the function creates a landscape for us to travel around in.

We want to find the lowest point in the landscape. We then pick some starting point. From that starting point we figure out which way goes downhill the most. We then take a step in that direction. At the new found point we then do the same until we reach a point where we stopped moving. That should be a minimum. This only finds the global minimum if the function is convex and we take small enough steps.

@david: Yes I realised that afterwards, but I wasn’t sure and therefore I choose bigInt in order to ensure precision for as long as possible.

Bjarki Ágúst

Great to see another post from you! I just solved the problem myself, but in a slightly less mathematical way, I guess.

First, to compute the probability of becoming a billionaire using some specified [latex]f[/latex], I used dynamic programming to compute the following recurrence:

double P(double f, int wins, int losses) {
    double money = pow(1 + 2*f, wins) * pow(1 - f, losses);

    if (wins + losses == 1000)
        return money >= 1e9 ? 1.0 : 0.0;

    return 0.5 * P(f, wins+1, losses) + 0.5 * P(f, wins, losses+1);

Notice that the form of the recurrence is almost the same as that of the binomial coefficients, possibly giving some intuition into the connection.

After looping through a couple of values for [latex]f[/latex] and computing [latex]P(f,0,0)[/latex], I noticed that the function was first increasing and then decreasing (i.e. looks like a [latex]2^\textrm{nd}[/latex] order polynomial). I could easily find the maximum of this function in a couple of iterations using ternary search. It’s basically the same as binary search, but you sample two values on the interval each time, instead of just one (I’m pretty sure that this generalizes, the maximum/minimum of a [latex]k^\textrm{th}[/latex] order polynomial can be found using [latex](k+1)[/latex]-ary search). The code for that is roughly as follows:

double f0 = 0.0,
       f3 = 1.0;

for (int i = 0; i < 10; i++) {
    double f1 = f0 + 1 * (f3 - f0) / 3,
           f2 = f0 + 2 * (f3 - f0) / 3;

    if (P(f1, 0, 0) < P(f2, 0, 0))
        f0 = f1;
        f3 = f2;

printf("%0.12lf\n", P(f0, 0, 0));

The loop can then be iterated as often as necessary to get better precision. Ten iterations seem to be enough for this problem.

As always, I really like your mathematical analysis. I haven’t used Gradient descent before, but will check it out. In my opinion, ternary search seems to be a better fit for this particular problem, which is natural since Gradient descent is pretty much just a generalization.

Looking forward to your next post! 🙂

Gregory E. Fisher

Nice one! The best part is that using some mathematical equations you are also providing some the solution in some programming Language that is very great and in future it help other definitely…..

Hi Kristian,

It was great going through your blogs on Euler. Will you be blogging about it again?

It is my hope that I will begin again some time, but there has been a whole lot going on in my life lately and it has not left me with time or energy to start blogging again.

Janko Jerinic

Hey Kristian,

No need to overdo it with the BigIntegers 🙂 As you noticed, they can be a bit slow. Besides, calculating every single binomial coefficient is wasteful, sincethey are recursively connected:

(N choose K+1) = (N choose K) * (N – K) / (K + 1);

Also, you might as well calculate the first 432 coefficient, rather then last 568, and just invert the result.

Here’s the code for the calculation of probability. Number of heads is given as nHeads:
double probability = 0.0;
double totalCombs = Math.Pow(2, 1000); // Fine because double is 1e+308
double binCoeff = 1;
for (int i = 0; i < nHeads; i++)
probability += binCoeff/totalCombs;
binCoeff *= (1000.0 – i) / (i + 1.0);

probability = 1 – probability;

So, with that in place, the running time of the algorithm is substantially reduced, to just a few milliseconds:

Optimum bet ratio is 0.1468844488, requiring only 432 heads.
Total probability is: 0.999992836187
Solution took 1.2904 ms


I am amazed by this. I mean you have a low chance of winning as you could throw
tails all day. Do you guys work in this area. Euler was from the 17 hundreds
I feel like I’m from the Stone Age, the internet was great
You can see all kinds of information but to you super brains
I’m not in the same ball park. Feel like a alien really. Complete novice
I just worked out logic. Otherwise I was just improvising

Paulo Mendes

Instead of gradient descent, you could have used a golden section search in part 1.

Regarding gradient descent, the Numerical Recipes books (old editions are online for free) discuss them in more depth and focus on some important practical aspects that Wikipedia doesn’t broach.

Leave a Reply