Problem 430: Range Flips
View on Project EulerProject Euler Problem 430 Solution
EulerSolve provides an optimized solution for Project Euler Problem 430, Range Flips, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary We have \(N\) disks, all initially white. One move chooses two endpoints \(A,B \in \{1,\dots,N\}\) independently and uniformly, then flips every disk in the closed interval between them. If a disk is flipped an odd number of times it ends black; if it is flipped an even number of times it ends white. The quantity of interest is \(E(N,M)\), the expected number of white disks after \(M\) moves. For the target input \(N=10^{10}\) and \(M=4000\), a direct simulation or a direct \(O(N)\) summation is far too slow, so the solution reduces the whole problem to a one-disk probability formula plus a carefully bounded truncation of the final sum. Mathematical Approach 1. One-Turn Behavior of a Fixed Disk Fix a disk position \(i\). In one move, disk \(i\) is not flipped exactly when both chosen endpoints lie strictly to its left or both lie strictly to its right. Because ordered pairs \((A,B)\) are chosen uniformly from \(N^2\) possibilities, the one-turn no-flip probability is $q_i=\frac{(i-1)^2+(N-i)^2}{N^2}.$ Therefore the one-turn flip probability is $p_i=1-q_i.$ It is convenient to combine these into the signed coefficient $r_i=q_i-p_i=2q_i-1=\frac{2\big((i-1)^2+(N-i)^2\big)-N^2}{N^2}.$ This is exactly the quantity raised to the \(M\)-th power by the implementations. 2....
Detailed mathematical approach
Problem Summary
We have \(N\) disks, all initially white. One move chooses two endpoints \(A,B \in \{1,\dots,N\}\) independently and uniformly, then flips every disk in the closed interval between them. If a disk is flipped an odd number of times it ends black; if it is flipped an even number of times it ends white.
The quantity of interest is \(E(N,M)\), the expected number of white disks after \(M\) moves. For the target input \(N=10^{10}\) and \(M=4000\), a direct simulation or a direct \(O(N)\) summation is far too slow, so the solution reduces the whole problem to a one-disk probability formula plus a carefully bounded truncation of the final sum.
Mathematical Approach
1. One-Turn Behavior of a Fixed Disk
Fix a disk position \(i\). In one move, disk \(i\) is not flipped exactly when both chosen endpoints lie strictly to its left or both lie strictly to its right. Because ordered pairs \((A,B)\) are chosen uniformly from \(N^2\) possibilities, the one-turn no-flip probability is
$q_i=\frac{(i-1)^2+(N-i)^2}{N^2}.$
Therefore the one-turn flip probability is
$p_i=1-q_i.$
It is convenient to combine these into the signed coefficient
$r_i=q_i-p_i=2q_i-1=\frac{2\big((i-1)^2+(N-i)^2\big)-N^2}{N^2}.$
This is exactly the quantity raised to the \(M\)-th power by the implementations.
2. From Flip Parity to the Probability of White
For disk \(i\), introduce a sign variable \(S_t \in \{+1,-1\}\), where \(+1\) means white after \(t\) moves and \(-1\) means black. Initially \(S_0=+1\). Each move multiplies the sign by \(+1\) with probability \(q_i\) and by \(-1\) with probability \(p_i\), so
$\mathbb{E}[S_{t+1}\mid S_t]=r_i\,S_t.$
Taking expectations repeatedly gives the simple recurrence
$\mathbb{E}[S_t]=r_i^t,$
because \(\mathbb{E}[S_0]=1\). Now the white-indicator of disk \(i\) after \(M\) moves is
$\mathbf{1}_{i,\text{white}}=\frac{1+S_M}{2},$
hence
$\Pr(\text{disk } i \text{ is white after } M \text{ moves})=\frac{1+r_i^M}{2}.$
This is the core closed form. It converts a random sequence of interval flips into a deterministic per-position contribution.
3. Summing All Disks by Linearity of Expectation
Expected values add even when the disk colors are not independent. Therefore
$E(N,M)=\sum_{i=1}^{N}\Pr(\text{disk } i \text{ is white after } M \text{ moves})$
and the previous formula yields
$E(N,M)=\sum_{i=1}^{N}\frac{1+r_i^M}{2}=\frac{N}{2}+\frac{1}{2}\sum_{i=1}^{N} r_i^M.$
So the entire problem is reduced to evaluating the sum of \(r_i^M\) over all positions.
4. Symmetry and Monotonicity Near the Edges
The coefficients are symmetric:
$r_i=r_{N+1-i}.$
This follows directly from the formula, since replacing \(i\) by \(N+1-i\) swaps the two squared terms.
On the left half of the board, the sequence is monotone decreasing. A direct subtraction gives
$r_{i+1}-r_i=\frac{4(2i-N)}{N^2}.$
Hence \(r_{i+1} \lt r_i\) whenever \(i \lt N/2\). The importance of this monotonicity is algorithmic: once a threshold is chosen, the last significant edge position can be found by binary search instead of by scanning all the way from the edge to the center.
For large \(M\), any value with \(\lvert r_i\rvert \lt 1\) becomes tiny after taking the \(M\)-th power. Since the disks in the middle have coefficients much farther from \(1\) than the disks near the edges, almost the entire contribution comes from the two boundaries.
5. Truncating the Middle with a Rigorous Error Bound
Choose a threshold \(0 \lt \rho \lt 1\), and let \(L\) be the largest index on the left side with \(r_L \gt \rho\). By symmetry, the same number of disks is retained on the right side. The approximation is then
$E(N,M)\approx \frac{N}{2}+\sum_{i=1}^{L} r_i^M,$
because the factor \(1/2\) in the exact formula is canceled by the two symmetric edge blocks.
The omitted middle block contains \(N-2L\) disks. In the large-\(N\) regime where the accelerated method is used, those terms satisfy \(\lvert r_i\rvert \le \rho\), so the neglected contribution is bounded by
$\left|E(N,M)-\left(\frac{N}{2}+\sum_{i=1}^{L} r_i^M\right)\right|\le \frac{1}{2}(N-2L)\rho^M.$
The implementations start from a target tail size \(\varepsilon=10^{-4}\) and choose
$\rho=\left(\frac{2\varepsilon}{N}\right)^{1/M},$
so that the theoretical omitted contribution is tiny. They then compute the explicit upper bound above and compare it with \(0.005\), the threshold relevant for safe rounding to two decimal places. If needed, the threshold is tightened slightly and the edge sum is recomputed.
6. Small Worked Example
The checkpoint \(E(3,1)=10/9\) follows immediately from the formula. For \(N=3\), the edge disks have
$q_1=q_3=\frac{4}{9},\qquad r_1=r_3=-\frac{1}{9},$
while the middle disk has
$q_2=\frac{2}{9},\qquad r_2=-\frac{5}{9}.$
Therefore
$E(3,1)=\frac{3}{2}+\frac{1}{2}\left(-\frac{1}{9}-\frac{5}{9}-\frac{1}{9}\right)=\frac{10}{9}.$
Squaring the same coefficients gives
$E(3,2)=\frac{3}{2}+\frac{1}{2}\left(\frac{1}{81}+\frac{25}{81}+\frac{1}{81}\right)=\frac{5}{3}.$
These are exactly the small exact checkpoints used by the implementations before the large-input calculation.
How the Code Works
The C++, Python, and Java implementations all follow the same structure. For small inputs they evaluate the exact formula
$E(N,M)=\frac{N}{2}+\frac{1}{2}\sum_{i=1}^{N} r_i^M$
directly, which is useful for validation and checkpoint testing. For very large inputs they switch to the accelerated method: compute the coefficient for a single disk, find the edge cutoff by binary search, sum only the retained edge terms, and use symmetry to reconstruct the total contribution.
The expensive part is the edge summation, so the implementation partitions that retained range across multiple workers and combines the partial sums at the end. The arithmetic itself remains simple: each retained disk contributes one power \(r_i^M\), and the final estimate is protected by the tail bound described above.
Complexity Analysis
The exact method is \(O(N)\) time and \(O(1)\) memory, which is fine for checkpoints but impossible for \(N=10^{10}\). The accelerated method uses \(O(\log N)\) work to find the cutoff and \(O(L)\) work to sum the retained edge terms, where \(L \ll N\). Memory usage stays \(O(1)\) apart from a small number of partial accumulators used for parallel execution.
Footnotes and References
- Problem page: https://projecteuler.net/problem=430
- Expected value and linearity: Wikipedia - Expected value
- Two-state Markov chains: Wikipedia - Markov chain
- Binomial parity identity and repeated Bernoulli trials: Wikipedia - Binomial theorem