Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 426: Box-Ball System

View on Project Euler

Project Euler Problem 426 Solution

EulerSolve provides an optimized solution for Project Euler Problem 426, Box-Ball System, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary Set \(s_0=290797\), define $s_{k+1}=s_k^2 \bmod 50515093,\qquad t_k=(s_k \bmod 64)+1,$ and build the initial box-ball configuration as the infinite binary word $W=1^{t_0}0^{t_1}1^{t_2}0^{t_3}\cdots 1^{t_{10^7}}0^\infty,$ where \(1\) denotes an occupied box and \(0\) an empty box. One BBS turn moves each ball once, from left to right, to the nearest empty box on its right. After sufficiently many turns the configuration separates into stable occupied blocks with lengths \(L_1,\dots,L_m\), and the target quantity is $\sum_{j=1}^{m} L_j^2.$ Mathematical Approach Step 1: Replace the Dynamics by Repeated \(10\)-Elimination A standard invariant of the box-ball system is obtained by repeatedly deleting every adjacent pattern \(10\), with all deletions in a given round performed simultaneously and with the infinite tail of zeros already present on the right. Let \(N_r\) be the number of pairs deleted in elimination round \(r\). The stable soliton lengths form the conjugate partition of these row counts. Equivalently, $N_r=\#\{j:L_j\ge r\}.$ This identity is the key reduction: once the numbers \(N_r\) are known, the final occupied block lengths are known as well....

Detailed mathematical approach

Problem Summary

Set \(s_0=290797\), define

$s_{k+1}=s_k^2 \bmod 50515093,\qquad t_k=(s_k \bmod 64)+1,$

and build the initial box-ball configuration as the infinite binary word

$W=1^{t_0}0^{t_1}1^{t_2}0^{t_3}\cdots 1^{t_{10^7}}0^\infty,$

where \(1\) denotes an occupied box and \(0\) an empty box. One BBS turn moves each ball once, from left to right, to the nearest empty box on its right. After sufficiently many turns the configuration separates into stable occupied blocks with lengths \(L_1,\dots,L_m\), and the target quantity is

$\sum_{j=1}^{m} L_j^2.$

Mathematical Approach

Step 1: Replace the Dynamics by Repeated \(10\)-Elimination

A standard invariant of the box-ball system is obtained by repeatedly deleting every adjacent pattern \(10\), with all deletions in a given round performed simultaneously and with the infinite tail of zeros already present on the right. Let \(N_r\) be the number of pairs deleted in elimination round \(r\).

The stable soliton lengths form the conjugate partition of these row counts. Equivalently,

$N_r=\#\{j:L_j\ge r\}.$

This identity is the key reduction: once the numbers \(N_r\) are known, the final occupied block lengths are known as well. The number of blocks of exact length \(r\) is therefore

$N_r-N_{r+1}.$

Geometrically, each final block of length \(\ell\) contributes one cell to each of the first \(\ell\) elimination rows, so the Ferrers diagram of the stable blocks has row lengths \(N_1,N_2,\dots\).

Step 2: A One-Pass Stack Interpretation

The implementations compute the elimination rounds without physically deleting symbols. Scan the binary word from left to right and keep a stack of unmatched occupied boxes. For each unmatched \(1\), store the largest elimination round of any matched pair nested inside it.

When an occupied box is read, push \(0\): at that moment no inner pair has been formed yet. When an empty box is read, it matches the most recent unmatched occupied box. If the stored maximum for that occupied box is \(d\), then all nested pairs vanish by round \(d\), so the outer pair becomes adjacent exactly one round later. Its elimination round is therefore

$d+1.$

After recording one more pair in round \(d+1\), this value must influence the nearest unmatched occupied box on the left, because that box now contains a nested structure whose maximum round may have increased. Thus the new value is propagated leftward by a max-update on the stack top.

This is the same recurrence as parenthesis matching with nesting depth: a pair disappears one round after everything strictly inside it has disappeared. At the end of the explicit word, the infinite zero tail is handled by draining the remaining stack in the same way.

Step 3: Work Directly with the Run-Length Encoding

The initial state is given as alternating run lengths, so the algorithm never materializes a full turn-by-turn BBS trajectory. For an occupied run of length \(\ell\), it pushes \(\ell\) zeros onto the stack. For an empty run of length \(\ell\), it performs up to \(\ell\) closures, stopping early if no unmatched occupied box remains.

Because every \(t_k\) lies in \(\{1,\dots,64\}\), each generated run costs only \(O(1)\) work, and the complete computation is just a streaming pass over the pseudorandom sequence.

Step 4: Convert Row Counts into the Required Sum

Once the counts \(N_r\) are available, the sum of squares follows from the elementary identity

$\ell^2=\sum_{r=1}^{\ell}(2r-1).$

Summing over all stable blocks and exchanging the order of summation gives

$\sum_{j=1}^{m}L_j^2=\sum_{j=1}^{m}\sum_{r=1}^{L_j}(2r-1)=\sum_{r\ge 1}(2r-1)\#\{j:L_j\ge r\}.$

Now substitute \(N_r=\#\{j:L_j\ge r\}\) to obtain the exact formula used by the implementations:

$\boxed{\sum_{j=1}^{m}L_j^2=\sum_{r\ge 1}(2r-1)N_r.}$

Worked Examples

Consider the small run encoding \([2,2,2,1,2]\). It represents the infinite word

$110011011\,0^\infty.$

Repeated \(10\)-elimination removes \(3\) pairs in the first round, \(2\) in the second, and \(1\) in the third. Hence

$N_1=3,\qquad N_2=2,\qquad N_3=1.$

The exact multiplicities are

$N_1-N_2=1,\qquad N_2-N_3=1,\qquad N_3-N_4=1,$

so the stable occupied blocks have lengths \([1,2,3]\).

For the \(11\)-run sample from the problem statement, the stable lengths are \([1,3,10,24,51,75]\). Therefore \(N_r\) is piecewise constant:

$N_r=\begin{cases} 6,& r=1,\\ 5,& 2\le r\le 3,\\ 4,& 4\le r\le 10,\\ 3,& 11\le r\le 24,\\ 2,& 25\le r\le 51,\\ 1,& 52\le r\le 75,\\ 0,& r\ge 76. \end{cases}$

Substituting these row counts into the boxed identity yields

$\sum_{r\ge 1}(2r-1)N_r=8912,$

which matches the sample value exactly.

How the Code Works

The C++, Python, and Java implementations all stream the pseudorandom generator, alternate between occupied and empty runs, and maintain the same stack invariant. Every closure contributes one unit to the appropriate elimination round, and a final drain accounts for the infinite empty tail. After that single pass, the implementations evaluate \(\sum_{r\ge 1}(2r-1)N_r\) directly from the stored round counts. No explicit simulation of many BBS turns is performed.

Complexity Analysis

Let \(B\) be the total number of occupied boxes in the encoded initial state. Each occupied box is pushed onto the stack once and popped once, so the total stack work is \(O(B)\). The round-count array has length equal to the largest elimination round encountered. Thus the memory usage is \(O(H+D)\), where \(H\) is the maximum active stack height and \(D\) the maximum round index; both are bounded by \(B\).

Because every run length satisfies \(1\le t_k\le 64\), we also have \(B=O(R)\) for \(R=10^7+1\) generated runs. Therefore the full algorithm is linear in the size of the encoded input and dramatically faster than any direct simulation over many BBS turns.

Footnotes and References

  1. Problem page: https://projecteuler.net/problem=426
  2. D. Takahashi and J. Satsuma, A Soliton Cellular Automaton, Journal of the Physical Society of Japan, 59(10), 3514-3519, 1990.
  3. Box-ball system overview: Wikipedia — Box-ball system
  4. Partition conjugation and Ferrers diagrams: Wikipedia — Partition (number theory)
  5. Run-length encoding: Wikipedia — Run-length encoding

Mathematical approach · C++ solution · Python solution · Java solution

Previous: Problem 425 · All Project Euler solutions · Next: Problem 427

Need help with a problem? Ask me! 💡
e
✦ Euler GLM 5.2
Hi! I'm Euler. Ask me anything about Project Euler problems, math concepts, or solution approaches! 🧮