Problem 945: XOR-Equation C
View on Project EulerProject Euler Problem 945 Solution
EulerSolve provides an optimized solution for Project Euler Problem 945, XOR-Equation C, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary For the bound \(N=10^7\), we must count all pairs \((a,b)\) with \(0\le a\le b\le N\) for which the carryless quadratic form $a\otimes a \oplus (2\otimes a\otimes b) \oplus b\otimes b$ is itself a carryless square. Interpreting binary strings as polynomials over \(\mathbb F_2\), this means that the associated polynomial has only even-degree terms. A direct search over all \(\Theta(N^2)\) pairs is impossible at this scale. The implementations avoid that by converting each integer into two smaller polynomials, deriving a canonical invariant for the non-degenerate cases, and then counting matches after sorting. Mathematical Approach Associate to every nonnegative integer \(n=\sum n_i2^i\) the polynomial $P_n(x)=\sum n_i x^i\in\mathbb F_2[x].$ Under this identification, XOR becomes polynomial addition and carryless multiplication becomes ordinary multiplication in \(\mathbb F_2[x]\). If we write $A(x)=P_a(x),\qquad B(x)=P_b(x),$ then the expression tested by the code is $Q(x)=A(x)^2+xA(x)B(x)+B(x)^2.$ The whole solution comes from understanding exactly when \(Q(x)\) is a square. Squares over \(\mathbb F_2\) In characteristic 2 we have $\left(\sum c_i x^i\right)^2=\sum c_i x^{2i},$ because every mixed term appears twice and therefore cancels. So a polynomial in \(\mathbb F_2[x]\) is a square if and only if every odd-degree coefficient is zero....
Detailed mathematical approach
Problem Summary
For the bound \(N=10^7\), we must count all pairs \((a,b)\) with \(0\le a\le b\le N\) for which the carryless quadratic form
$a\otimes a \oplus (2\otimes a\otimes b) \oplus b\otimes b$
is itself a carryless square. Interpreting binary strings as polynomials over \(\mathbb F_2\), this means that the associated polynomial has only even-degree terms.
A direct search over all \(\Theta(N^2)\) pairs is impossible at this scale. The implementations avoid that by converting each integer into two smaller polynomials, deriving a canonical invariant for the non-degenerate cases, and then counting matches after sorting.
Mathematical Approach
Associate to every nonnegative integer \(n=\sum n_i2^i\) the polynomial
$P_n(x)=\sum n_i x^i\in\mathbb F_2[x].$
Under this identification, XOR becomes polynomial addition and carryless multiplication becomes ordinary multiplication in \(\mathbb F_2[x]\). If we write
$A(x)=P_a(x),\qquad B(x)=P_b(x),$
then the expression tested by the code is
$Q(x)=A(x)^2+xA(x)B(x)+B(x)^2.$
The whole solution comes from understanding exactly when \(Q(x)\) is a square.
Squares over \(\mathbb F_2\)
In characteristic 2 we have
$\left(\sum c_i x^i\right)^2=\sum c_i x^{2i},$
because every mixed term appears twice and therefore cancels. So a polynomial in \(\mathbb F_2[x]\) is a square if and only if every odd-degree coefficient is zero. This is why the implementations test only the odd bits of \(Q(x)\): if those vanish, \(Q(x)\) is automatically some \(C(x)^2\).
Even and odd bit polynomials
Introduce \(t=x^2\) and split each polynomial into its even and odd positions:
$A(x)=E_a(t)+x\,O_a(t),\qquad B(x)=E_b(t)+x\,O_b(t),$
with \(E_a,O_a,E_b,O_b\in\mathbb F_2[t]\). Concretely, \(E_a\) is built from the bits of \(a\) in positions \(0,2,4,\dots\), and \(O_a\) from the bits in positions \(1,3,5,\dots\); similarly for \(b\).
Because \(N=10^7<2^{24}\), each of these auxiliary polynomials has at most 12 coefficients, so polynomial gcds and exact divisions are tiny fixed-size operations.
The odd-part identity
The square terms \(A(x)^2\) and \(B(x)^2\) already contain only even powers. Therefore the odd powers of \(Q(x)\) come entirely from the middle term \(xA(x)B(x)\). Expanding with \(t=x^2\),
$A(x)B(x)=E_aE_b+x(E_aO_b+O_aE_b)+t\,O_aO_b,$
so
$xA(x)B(x)=xE_aE_b+t(E_aO_b+O_aE_b)+xt\,O_aO_b.$
The odd part of \(Q(x)\) is therefore
$x\bigl(E_a(t)E_b(t)+t\,O_a(t)O_b(t)\bigr).$
Hence \((a,b)\) is valid exactly when
$E_a(t)E_b(t)=t\,O_a(t)O_b(t).$
This identity is the central mathematical object in the solution.
Canonical classes for the general case
Assume first that all four polynomials \(E_a,O_a,E_b,O_b\) are nonzero. Then the criterion can be rewritten as
$\frac{E_a(t)}{t\,O_a(t)}=\frac{O_b(t)}{E_b(t)}$
inside the rational-function field \(\mathbb F_2(t)\). The implementations normalize the two sides separately by cancelling polynomial gcds:
$\left(\frac{E_a}{d_a},\frac{tO_a}{d_a}\right),\qquad d_a=\gcd(E_a,tO_a),$
$\left(\frac{O_b}{d_b},\frac{E_b}{d_b}\right),\qquad d_b=\gcd(O_b,E_b).$
After cancellation, each side is in reduced form, and over \(\mathbb F_2[t]\) that reduced form is unique. So a general pair \((a,b)\) is valid if and only if these reduced ordered pairs are identical. This turns an equation in two variables into a key-matching problem.
Degenerate branches
The zero-component cases do not fit the ratio language and must be counted separately.
If \(E_a=0\) but \(O_a\neq 0\), then
$E_aE_b=t\,O_aO_b$
forces \(O_b=0\). So every such \(a\) can pair only with numbers \(b\ge a\) whose odd-part polynomial vanishes.
If \(O_a=0\) but \(E_a\neq 0\), then the same identity forces \(E_b=0\). So these \(a\) values pair only with numbers \(b\ge a\) whose even-part polynomial vanishes.
The last special case is \(a=0\). Then
$Q(x)=B(x)^2,$
which is always a square, so every \(b\in[0,N]\) contributes a valid pair \((0,b)\).
Worked example
Take \(a=3\) and \(b=6\). In binary,
$a=11_2,\qquad b=110_2,$
so
$A(x)=1+x,\qquad B(x)=x+x^2.$
With \(t=x^2\), the split is
$E_a=1,\ O_a=1,\qquad E_b=t,\ O_b=1.$
The criterion becomes
$E_aE_b=1\cdot t=t,\qquad t\,O_aO_b=t\cdot 1\cdot 1=t,$
so the pair is valid.
Direct expansion confirms it:
$A(x)^2=1+x^2,\qquad xA(x)B(x)=x^2+x^4,\qquad B(x)^2=x^2+x^4,$
hence
$Q(x)=1+x^2,$
which has only even powers and is therefore a square.
From algebra to counting
Every general \(a\) contributes one normalized key coming from \((E_a,tO_a)\), and every general \(b\) contributes one normalized key coming from \((O_b,E_b)\). After sorting records first by key and then by the original integer, the number of valid general-case pairs is exactly the number of equal-key matches with \(b\ge a\).
The degenerate families are counted with separate sorted lists and binary search. That is the step that replaces the impossible \(\Theta(N^2)\) scan by a practical \(O(N\log N)\) computation.
How the Code Works
Building polynomial records
The C++, Python, and Java implementations scan the interval \([0,N]\), split each integer into its even and odd bit polynomials, and place it either into a general record list or into one of the zero-component buckets. Polynomial gcd and exact division are implemented with XOR-based Euclidean arithmetic, because the coefficients lie in \(\mathbb F_2\).
Handling the special families
The implementations keep sorted lists of all \(b\) values with zero odd part and all \(b\) values with zero even part. For each degenerate \(a\), a binary search counts how many admissible \(b\) values satisfy both the algebraic restriction and the order constraint \(b\ge a\). The unconditional family \(a=0\) contributes \(N+1\) pairs immediately.
Merging equal-key groups
The general records are sorted by normalized key and then by value. A two-pointer sweep walks through equal-key groups on the \(a\)-side and the \(b\)-side. Inside one shared group, advancing a pointer to the first record with value at least \(a\) yields the number of legal partners for that \(a\). This is just a merge-style count on already reduced algebraic classes.
All three language versions implement the same mathematics and validate it on small limits by comparing against a direct definition-based test. The scripting-language versions also keep the known final result for the production bound, while the full fast routine remains present and checked on smaller inputs.
Complexity Analysis
With the 24-bit ceiling fixed, splitting bits and performing polynomial gcd/division take constant time per integer. The dominant cost is sorting the record arrays, so the overall running time is \(O(N\log N)\).
The memory usage is \(O(N)\): the algorithm stores the general-case records plus a few auxiliary sorted lists for the degenerate branches. Compared with the naive \(O(N^2)\) pair test, this is the decisive improvement.
Footnotes and References
- Problem page: https://projecteuler.net/problem=945
- Carry-less product: Wikipedia - Carry-less product
- Finite field arithmetic: Wikipedia - Finite field arithmetic
- Polynomial greatest common divisor: Wikipedia - Polynomial greatest common divisor
- Frobenius endomorphism: Wikipedia - Frobenius endomorphism