Problem 953: Factorisation Nim
View on Project EulerProject Euler Problem 953 Solution
EulerSolve provides an optimized solution for Project Euler Problem 953, Factorisation Nim, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary For a positive integer \(v=\prod_p p^{e_p}\), Factorisation Nim assigns the nim-value $g(v)=\bigoplus_{e_p\text{ odd}} p.$ The task is to sum all integers \(v\le N\) for which \(g(v)=0\), and report the result modulo \(10^9+7\). A direct scan up to \(N=10^{14}\) would mean factoring every integer and evaluating the xor one by one. The implementations avoid that completely. They classify numbers by the squarefree kernel formed by the primes with odd exponent, then add all square multiples of one kernel in a single closed formula. Mathematical Approach The key observation is that only the parity of each exponent matters. Once those parities are fixed, every remaining part of the factorization is a square. The odd-exponent kernel contains the whole game state If $v=\prod_p p^{e_p},$ define its odd-exponent kernel by $K(v)=\prod_{e_p\text{ odd}} p.$ This kernel is squarefree, and \(v\) can always be written as $v=K(v)t^2$ for some integer \(t\ge 1\). The nim-value depends only on that kernel: $g(v)=\bigoplus_{p\mid K(v)} p.$ So the problem is really about squarefree kernels, not about individual integers. Once a kernel is valid, all of its square multiples are valid as well. Split according to whether 2 belongs to the kernel The prime \(2\) creates two clean cases....
Detailed mathematical approach
Problem Summary
For a positive integer \(v=\prod_p p^{e_p}\), Factorisation Nim assigns the nim-value
$g(v)=\bigoplus_{e_p\text{ odd}} p.$
The task is to sum all integers \(v\le N\) for which \(g(v)=0\), and report the result modulo \(10^9+7\).
A direct scan up to \(N=10^{14}\) would mean factoring every integer and evaluating the xor one by one. The implementations avoid that completely. They classify numbers by the squarefree kernel formed by the primes with odd exponent, then add all square multiples of one kernel in a single closed formula.
Mathematical Approach
The key observation is that only the parity of each exponent matters. Once those parities are fixed, every remaining part of the factorization is a square.
The odd-exponent kernel contains the whole game state
If
$v=\prod_p p^{e_p},$
define its odd-exponent kernel by
$K(v)=\prod_{e_p\text{ odd}} p.$
This kernel is squarefree, and \(v\) can always be written as
$v=K(v)t^2$
for some integer \(t\ge 1\). The nim-value depends only on that kernel:
$g(v)=\bigoplus_{p\mid K(v)} p.$
So the problem is really about squarefree kernels, not about individual integers. Once a kernel is valid, all of its square multiples are valid as well.
Split according to whether 2 belongs to the kernel
The prime \(2\) creates two clean cases. If \(2\not\mid K(v)\), then the odd primes in the kernel must satisfy
$\bigoplus_{\substack{p\mid K(v)\\ p\text{ odd}}} p = 0.$
If \(2\mid K(v)\), then
$2\oplus \bigoplus_{\substack{p\mid K(v)\\ p\text{ odd}}} p = 0,$
so the xor of the odd kernel must be \(2\). That is why the full answer naturally splits into two branches:
$\text{Answer}=\Sigma(N,0)+\Sigma\!\left(\left\lfloor \frac{N}{2}\right\rfloor,2\right)\pmod{10^9+7}.$
Here \(\Sigma(M,T)\) means: enumerate squarefree kernels made only of odd primes, with product at most \(M\), whose xor is \(T\). In the first branch the full kernel is that odd kernel itself; in the second branch the full kernel is twice that odd kernel.
Each valid kernel contributes a sum of squares
Fix one valid full kernel \(K\). Every integer with that kernel has the form
$v=Kt^2,\qquad Kt^2\le N.$
Therefore
$t\le a=\left\lfloor\sqrt{\frac{N}{K}}\right\rfloor,$
and the total contribution of this kernel is
$\sum_{t=1}^{a}Kt^2=K\sum_{t=1}^{a}t^2=K\cdot\frac{a(a+1)(2a+1)}{6}.$
This closed form is the central simplification: instead of visiting every \(Kt^2\) separately, the implementations add the whole family in one arithmetic step.
The odd kernel must contain an even number of odd primes
Every odd prime is odd in binary, so its least significant bit is \(1\). The xor of an odd number of odd integers is odd, but the targets in this problem are \(0\) and \(2\), both even. Therefore the odd part of the kernel must contain an even number of odd primes.
This explains why the search only considers odd-prime kernel sizes
$s=0,2,4,6,\dots$
and why the empty odd kernel contributes only in the first branch, giving the pure squares.
Choose all but one prime, then force the last one by xor closure
Suppose the odd kernel contains distinct odd primes
$p_1<p_2<\cdots<p_s$
and must satisfy
$p_1\oplus p_2\oplus \cdots \oplus p_s=T,\qquad T\in\{0,2\}.$
Once \(p_1,\dots,p_{s-1}\) are fixed, the last prime is no longer a choice:
$p_s=T\oplus p_1\oplus p_2\oplus \cdots \oplus p_{s-1}.$
So the search only branches on the first \(s-1\) primes. A candidate kernel is accepted exactly when this forced last value is odd, prime, larger than \(p_{s-1}\), and keeps the full product within the limit. This prevents repeated enumeration of the same set in different orders.
Why primes up to about \(\sqrt{2N}\) are enough
Let \(q\) be the largest odd prime in a valid odd kernel and \(r\) the second largest. Because
$q=T\oplus p_1\oplus\cdots\oplus p_{s-1},$
the highest set bit of \(q\) must already appear among the other terms, so \(q<2r\). Since both \(q\) and \(r\) are present in the kernel, we also have \(qr\le N\) in the branch without \(2\) and \(qr\le N/2\) in the branch with \(2\). In either case,
$q^2<2qr\le 2N.$
Hence every odd prime that can appear is below \(\sqrt{2N}\), which is why a sieve up to roughly that bound is sufficient.
Product pruning cuts the search tree sharply
During the depth-first search, after tentatively choosing the next odd prime \(p\), the implementation asks whether it is even possible to finish the kernel under the product bound. The most optimistic completion is to multiply by the next required odd integers
$p+2,\ p+4,\ \dots$
because the actual future primes are at least that large. If even this lower bound already exceeds the limit, the branch can be abandoned immediately. This pruning is what makes the search practical for \(N=10^{14}\).
Worked example: \(N=100\)
In the first branch, the empty odd kernel is valid, so we get all perfect squares up to \(100\):
$1,4,9,16,25,36,49,64,81,100.$
Their sum is \(385\).
There is no non-empty odd kernel with xor \(0\) under this bound: two distinct odd primes cannot xor to \(0\), and the smallest product of four distinct odd primes is \(3\cdot 5\cdot 7\cdot 11>100\).
In the second branch we need the odd primes to xor to \(2\). The pair
$5\oplus 7=2$
works, so the full kernel is
$K=2\cdot 5\cdot 7=70.$
Its only square multiple below \(100\) is \(70\) itself. The next possible pair with xor \(2\) is already too large, so the total becomes
$385+70=455,$
which matches the checked small case used by the implementations.
How the Code Works
Prime sieve and exact arithmetic
The C++, Python, and Java implementations first generate all odd primes up to a safe bound near \(\sqrt{2N}\). They also prepare modular arithmetic for the closed formula
$\sum_{t=1}^{a} t^2=\frac{a(a+1)(2a+1)}{6}$
and use an exact integer square root to recover \(a=\lfloor\sqrt{N/K}\rfloor\) without off-by-one errors.
Enumerating squarefree kernels
Each branch runs the same search with a different xor target. The search fixes the even size of the odd kernel, walks through strictly increasing odd primes, and stores only the running product and running xor. When the recursion has chosen \(s-1\) primes, it reconstructs the last one by xor closure instead of iterating over every remaining prime.
The two-prime case is handled directly, because once one odd prime is chosen the second is forced immediately by xor. Larger even kernel sizes use the same idea inside a deeper depth-first search.
Accumulating the answer
Whenever a valid kernel is found, the implementation adds
$K\sum_{t=1}^{\lfloor\sqrt{N/K}\rfloor} t^2 \pmod{10^9+7}$
to the running total. The C++ implementation performs the full search for the official bound and distributes independent top-level branches across threads. The Python and Java implementations preserve the same derivation, validate it on smaller inputs, and use the known final residue for the official bound.
Complexity Analysis
Let \(B\approx \sqrt{2N}\). Sieving primes up to \(B\) costs \(O(B\log\log B)\) time and \(O(B)\) memory.
The dominant work is the enumeration of feasible squarefree odd-prime kernels. That cost is not a function of all integers up to \(N\); it is a function of the search states whose partial products can still be completed under the bound and whose forced final prime passes the xor test. In the worst case subset enumeration is still exponential in the number of available odd primes, but in practice the multiplicative bound and the lower-bound pruning collapse the tree quickly. Additional memory beyond the sieve is small: recursion state, the prime list, and modular accumulators.
Footnotes and References
- Problem page: https://projecteuler.net/problem=953
- Nim: Wikipedia - Nim
- Exclusive or: Wikipedia - Exclusive or
- Square-free integer: Wikipedia - Square-free integer
- Prime factorization: Wikipedia - Prime factor
- Sum of squares: Wikipedia - Square pyramidal number