Problem 922: Young's Game A
View on Project EulerProject Euler Problem 922 Solution
EulerSolve provides an optimized solution for Project Euler Problem 922, Young's Game A, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary The target quantity is \(R(8,64)\) modulo \(10^9+7\). The implementations do not explore full Young positions cell by cell. Instead, they rely on the decomposition used by the game: a position splits into \(m\) independent diagonal components, and one component is encoded by positive integers \(a\) and \(b\) together with a nimber \(g\), subject to $a\ge 1,\qquad b\ge 1,\qquad 0\le g\le w-a-b-1.$ Such a component contributes a game value of the form $d+\ast g,\qquad d=b-a.$ For an \(m\)-component position, the integer parts add normally and the nimber parts combine by xor. So the whole position is controlled by two global invariants: $D=\sum_{i=1}^m d_i,\qquad X=g_1\oplus g_2\oplus \cdots \oplus g_m.$ The implementations count exactly the states satisfying the final criterion \(D>0\), or \(D=0\) with \(X\ne 0\). Once that reduction is recognized, the problem becomes a counting problem over \((D,X)\) rather than a direct search over Young diagrams. Mathematical Approach The derivation has two stages. First we catalogue every possible contribution of a single diagonal component. Then we combine \(m\) such components with a dynamic program that uses ordinary addition in the signed-difference coordinate and xor in the nimber coordinate. The local object: one diagonal component Fix \(w\)....
Detailed mathematical approach
Problem Summary
The target quantity is \(R(8,64)\) modulo \(10^9+7\). The implementations do not explore full Young positions cell by cell. Instead, they rely on the decomposition used by the game: a position splits into \(m\) independent diagonal components, and one component is encoded by positive integers \(a\) and \(b\) together with a nimber \(g\), subject to
$a\ge 1,\qquad b\ge 1,\qquad 0\le g\le w-a-b-1.$
Such a component contributes a game value of the form
$d+\ast g,\qquad d=b-a.$
For an \(m\)-component position, the integer parts add normally and the nimber parts combine by xor. So the whole position is controlled by two global invariants:
$D=\sum_{i=1}^m d_i,\qquad X=g_1\oplus g_2\oplus \cdots \oplus g_m.$
The implementations count exactly the states satisfying the final criterion \(D>0\), or \(D=0\) with \(X\ne 0\). Once that reduction is recognized, the problem becomes a counting problem over \((D,X)\) rather than a direct search over Young diagrams.
Mathematical Approach
The derivation has two stages. First we catalogue every possible contribution of a single diagonal component. Then we combine \(m\) such components with a dynamic program that uses ordinary addition in the signed-difference coordinate and xor in the nimber coordinate.
The local object: one diagonal component
Fix \(w\). The solution code explicitly enumerates all triples \((a,b,g)\) satisfying the admissibility conditions above. If we rewrite \(g+1=h\), then \(h\ge 1\) and
$a+b+h\le w.$
So the total number of local triples is
$\sum_{a=1}^{w-2}\sum_{b=1}^{w-a-1}(w-a-b)=\binom{w}{3}.$
For the target width \(w=64\), this local catalogue has \(\binom{64}{3}=41664\) entries. The important point is that the global computation does not need to remember the full triple. Only the signed difference \(d=b-a\) and the nimber \(g\) survive into later stages.
The local counting table \(C_w(d,g)\)
Many different pairs \((a,b)\) can produce the same local value \(d+\ast g\). The implementations therefore build a multiplicity table
$C_w(d,g)=\#\left\{(a,b)\in\mathbb Z_{>0}^2:\ b-a=d,\ a+b+g\le w-1\right\}.$
Every admissible triple contributes 1 to exactly one entry of this table. If we set \(u=w-g-1\), then the constraints become \(b-a=d\) and \(a+b\le u\). Solving them gives the closed form
$C_w(d,g)=\max\left(0,\left\lfloor\frac{w-g-1-|d|}{2}\right\rfloor\right).$
This explains several features of the code at once. The distribution is symmetric in \(d\), the support is finite, and the largest possible nimber is \(w-3\). For the target case \(w=64\), that means \(g\in\{0,1,\dots,61\}\), so a 64-state xor dimension is enough.
Compressing the whole position to \((D,X)\)
Let \(P_t(D,X)\) be the number of ways to choose \(t\) components whose total integer part is \(D\) and whose total nimber is \(X\). The empty choice gives the initial state
$P_0(0,0)=1,$
and every other state is zero at \(t=0\). When a new component with local value \(\delta+\ast g\) is appended, its integer part adds and its nimber xors, so the exact recurrence is
$P_{t+1}(D,X)=\sum_{\delta}\sum_g P_t(D-\delta,\ X\oplus g)\,C_w(\delta,g).$
This is the central formula of the solution. It replaces a huge space of board positions by a compact state space indexed only by a signed sum and an xor value.
Extracting the winning-value condition
After \(m\) rounds, we do not sum every state. The final selection rule used by the implementations is direct in this representation: keep every state with positive integer part, and from the row \(D=0\) keep only the nonzero xor states. Therefore
$R(m,w)=\sum_{D>0}\sum_X P_m(D,X)+\sum_{X\ne 0} P_m(0,X).$
The unique state \((D,X)=(0,0)\) is the exact zero value and must be excluded. In the code, this is implemented by summing every row with \(D>0\) completely and, for the single row \(D=0\), subtracting the entry at xor state 0.
Why the Walsh-Hadamard transform is the right tool
The slow part of the recurrence is the xor-convolution over \(X\). A direct implementation would compare every previous xor state with every new xor state. Instead, the implementations apply the XOR Walsh-Hadamard transform to every row in the xor dimension. After that transform, xor-convolution becomes pointwise multiplication in frequency space:
$\widehat{P}_{t+1}(D,\omega)=\sum_{\delta}\widehat{P}_t(D-\delta,\omega)\,\widehat{C}_w(\delta,\omega).$
Now each frequency \(\omega\) can be processed independently, and the only remaining convolution is the ordinary one along the signed-sum axis. This is why the code first transforms the local table, then transforms the current dynamic-programming table at each round, performs one ordinary convolution per frequency, and finally applies the inverse transform.
Worked Example: \(R(2,4)=7\)
When \(w=4\), the admissible triples are
$ (a,b,g)\in\{(1,1,0),(1,1,1),(1,2,0),(2,1,0)\}. $
So the only local values are
$0,\qquad \ast 1,\qquad 1,\qquad -1.$
Equivalently, the only nonzero local counts are
$C_4(0,0)=1,\qquad C_4(0,1)=1,\qquad C_4(1,0)=1,\qquad C_4(-1,0)=1.$
For \(m=2\), we form ordered pairs of these four local values and keep only totals with \(D>0\), or with \(D=0\) and nonzero nimber. The counted pairs are
$ (1,1),\ (1,0),\ (0,1),\ (1,\ast 1),\ (\ast 1,1),\ (0,\ast 1),\ (\ast 1,0). $
There are exactly 7 such pairs, matching the small checkpoint used by the implementations. This tiny case shows the whole method in miniature: build the local catalogue, combine components through \((D,X)\), and apply the final positivity rule.
How the Code Works
Precomputing the local catalogue
The C++, Python, and Java implementations first enumerate all admissible triples \((a,b,g)\) and accumulate their multiplicities in the table \(C_w(d,g)\). To keep array indexing simple, they reserve a slightly wider signed-difference range than the mathematical support really needs. The same idea is used later for the global signed-sum axis, so every possible total \(D\) can be stored with a fixed offset.
Running each dynamic-programming round in transform space
The current state distribution starts from the single empty state \((D,X)=(0,0)\). At each of the \(m\) rounds, every xor row is transformed, the transformed local table is convolved with the transformed state table independently for each of the 64 frequencies, and then an inverse Walsh-Hadamard transform brings the result back to xor space. The recurrence is identical in all three languages. The C++ implementation additionally parallelizes the independent frequency loops, but the mathematics is unchanged.
Summing the final rows
After the last round, each row corresponds to one possible total \(D\). Summing across that row gives \(\sum_X P_m(D,X)\). Rows with positive \(D\) are added in full. For the unique row \(D=0\), the entry with xor value 0 is removed. That exactly implements the formula for \(R(m,w)\) above.
Complexity Analysis
Building the local catalogue costs \(O(w^3)\) time, because the admissible triples \((a,b,g)\) are enumerated explicitly. The transformed dynamic program uses a local difference table with \(2w-3\) allocated rows and a global signed-sum table with \(2m(w-2)+1\) allocated rows, each with 64 xor entries.
In each of the \(m\) rounds, the algorithm performs Walsh-Hadamard transforms on all global rows and then runs 64 ordinary convolutions against the local table. This gives a running time of
$O\bigl(w^3+m(2m(w-2)+1)\cdot 64\log 64+m(2m(w-2)+1)(2w-3)\cdot 64\bigr).$
The memory usage is linear in the number of stored states:
$O\bigl((2m(w-2)+1+2w-3)\cdot 64\bigr).$
Because the xor dimension is fixed at 64 for the actual problem, the practical growth is dominated by the signed-sum axis rather than by any exponential explosion in the number of Young positions.
Footnotes and References
- Project Euler problem page: https://projecteuler.net/problem=922
- Young diagram: Wikipedia - Young diagram
- Combinatorial game theory: Wikipedia - Combinatorial game theory
- Sprague-Grundy theorem: Wikipedia - Sprague-Grundy theorem
- Nimber: Wikipedia - Nimber
- Hadamard transform: Wikipedia - Hadamard transform