Problem 573: Unfair Race
View on Project EulerProject Euler Problem 573 Solution
EulerSolve provides an optimized solution for Project Euler Problem 573, Unfair Race, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary For a given \(n\), the quantity to compute is an expected value \(E(n)\) coming from the unfair-race model. The key simplification used by the implementation is that this expectation can be written as the finite sum $E(n)=\sum_{k=1}^{n}\binom{n}{k}\left(\frac{k}{n}\right)^k\left(\frac{n-k}{n}\right)^{n-k}.$ The target input is \(n=10^6\), so a direct evaluation based on factorials or repeated binomial-coefficient construction would spend most of its time rebuilding the same information. The program instead walks through the terms sequentially and updates each one from the previous one by a compact multiplicative ratio. Mathematical Approach The whole method rests on turning the exact summand into a first-term-plus-ratio recurrence. That replaces heavy combinatorial arithmetic by a single linear pass....
Detailed mathematical approach
Problem Summary
For a given \(n\), the quantity to compute is an expected value \(E(n)\) coming from the unfair-race model. The key simplification used by the implementation is that this expectation can be written as the finite sum
$E(n)=\sum_{k=1}^{n}\binom{n}{k}\left(\frac{k}{n}\right)^k\left(\frac{n-k}{n}\right)^{n-k}.$
The target input is \(n=10^6\), so a direct evaluation based on factorials or repeated binomial-coefficient construction would spend most of its time rebuilding the same information. The program instead walks through the terms sequentially and updates each one from the previous one by a compact multiplicative ratio.
Mathematical Approach
The whole method rests on turning the exact summand into a first-term-plus-ratio recurrence. That replaces heavy combinatorial arithmetic by a single linear pass.
Step 1: Isolate the Contribution of a Fixed \(k\)
Define
$T_k=\binom{n}{k}\left(\frac{k}{n}\right)^k\left(\frac{n-k}{n}\right)^{n-k}\qquad (1\le k\le n).$
Then
$E(n)=\sum_{k=1}^{n}T_k.$
The last term is especially simple:
$T_n=\binom{n}{n}\left(\frac{n}{n}\right)^n\left(\frac{0}{n}\right)^0=1.$
So it is convenient to separate it and write
$E(n)=1+\sum_{k=1}^{n-1}T_k.$
The first nontrivial term is
$T_1=\binom{n}{1}\left(\frac{1}{n}\right)\left(\frac{n-1}{n}\right)^{n-1}=\left(\frac{n-1}{n}\right)^{n-1}.$
This matches exactly the starting value used by the implementations.
Step 2: Derive the Neighbor-to-Neighbor Ratio
Instead of recomputing \(T_{k+1}\) from scratch, compare it with \(T_k\):
$\frac{T_{k+1}}{T_k}=\frac{\binom{n}{k+1}}{\binom{n}{k}}\cdot\frac{(k+1)^{k+1}}{k^k}\cdot\frac{(n-k-1)^{n-k-1}}{(n-k)^{n-k}}.$
Using
$\frac{\binom{n}{k+1}}{\binom{n}{k}}=\frac{n-k}{k+1},$
the factors simplify cleanly to
$\frac{T_{k+1}}{T_k}=\left(\frac{k+1}{k}\right)^k\left(\frac{n-k-1}{n-k}\right)^{n-k-1}\qquad (1\le k\le n-2).$
Therefore the whole series can be generated from the recurrence
$T_{k+1}=T_k\left(\frac{k+1}{k}\right)^k\left(\frac{n-k-1}{n-k}\right)^{n-k-1}.$
This is the central algebraic reduction: one exact starting term plus one exact ratio produces every remaining summand.
Step 3: Evaluate the Ratio in the Log Domain
For large \(n\), both factors in the ratio are close to \(1\). Multiplying powers that are individually near \(1\) is numerically safer in logarithmic form:
$\log\frac{T_{k+1}}{T_k}=k\log\left(1+\frac{1}{k}\right)+(n-k-1)\log\left(1-\frac{1}{n-k}\right).$
Exponentiating gives back the ratio:
$\frac{T_{k+1}}{T_k}=\exp\left(k\log\left(1+\frac{1}{k}\right)+(n-k-1)\log\left(1-\frac{1}{n-k}\right)\right).$
This is why the implementations use a numerically stable \(\log(1+x)\) routine rather than forming the powers directly at every step. It avoids unnecessary cancellation when \(k\) or \(n-k\) is large.
Step 4: Structural Observations
Several useful properties follow immediately from the closed form.
Every term is positive, so the running sum grows monotonically.
The summand is symmetric:
$T_k=T_{n-k}\qquad (1\le k\le n-1),$
because
$\binom{n}{k}=\binom{n}{n-k}$
and swapping \(k\) with \(n-k\) leaves the remaining factors unchanged. The implementations still keep the simpler full sweep from left to right, but this symmetry is a good consistency check when deriving the formula.
Step 5: Worked Example for \(n=4\)
For \(n=4\), the four summands are
$T_1=\binom{4}{1}\left(\frac{1}{4}\right)\left(\frac{3}{4}\right)^3=\left(\frac{3}{4}\right)^3=\frac{27}{64},$
$T_2=\binom{4}{2}\left(\frac{2}{4}\right)^2\left(\frac{2}{4}\right)^2=6\cdot\frac{1}{16}=\frac{24}{64},$
$T_3=\binom{4}{3}\left(\frac{3}{4}\right)^3\left(\frac{1}{4}\right)=\frac{27}{64},$
$T_4=1.$
Hence
$E(4)=\frac{27}{64}+\frac{24}{64}+\frac{27}{64}+1=\frac{142}{64}=\frac{71}{32}=2.21875.$
This matches the numerical checkpoint used by the implementations and shows concretely how the summation formula behaves.
How the Code Works
The C++, Python, and Java implementations all follow the same plan. They first handle the trivial case \(n\le 1\), for which the expected value is \(1\). For \(n>1\), they initialize the running term with
$T_1=\left(\frac{n-1}{n}\right)^{n-1},$
and initialize the running total with the already known terminal contribution \(1\) plus this first nontrivial term.
They then loop through \(k=1,2,\dots,n-2\). In each iteration, the implementation computes the logarithm of the ratio from Step 3, exponentiates it, multiplies the current term by that ratio, and adds the result to the total. This produces the sequence \(T_2,T_3,\dots,T_{n-1}\) in order without ever rebuilding a factorial or a binomial coefficient.
The same code path also confirms small benchmark values such as \(E(3)=17/9\), \(E(4)=2.21875\), \(E(5)=2.5104\), and \(E(10)=3.66021568\). Once those checkpoints agree, the implementations evaluate the required large input and print the result rounded to four decimal places.
Complexity Analysis
The algorithm performs one loop over \(k=1\) to \(n-2\), and each iteration uses only a constant amount of floating-point arithmetic. Therefore the running time is \(O(n)\). The method stores only the current term, the accumulated total, and a few temporary values, so the memory usage is \(O(1)\). This is precisely why the recurrence is practical even for \(n=10^6\).
Footnotes and References
- Problem page: https://projecteuler.net/problem=573
- Expected value: Wikipedia - Expected value
- Binomial distribution: Wikipedia - Binomial distribution
- Numerical stability: Wikipedia - Numerical stability
- Natural logarithm: Wikipedia - Natural logarithm