Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 945: XOR-Equation C

Project Euler Problem 945 Solution

EulerSolve provides an optimized solution for Project Euler Problem 945, XOR-Equation C, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary For the bound $N=10^7$, we must count all pairs $(a,b)$ with $0\le a\le b\le N$ for which the carryless quadratic form $a\otimes a \oplus (2\otimes a\otimes b) \oplus b\otimes b$ is itself a carryless square. Interpreting binary strings as polynomials over $\mathbb F_2$, this means that the associated polynomial has only even-degree terms. A direct search over all $\Theta(N^2)$ pairs is impossible at this scale. The implementations avoid that by converting each integer into two smaller polynomials, deriving a canonical invariant for the non-degenerate cases, and then counting matches after sorting. Mathematical Approach Associate to every nonnegative integer $n=\sum n_i2^i$ the polynomial $P_n(x)=\sum n_i x^i\in\mathbb F_2[x].$ Under this identification, XOR becomes polynomial addition and carryless multiplication becomes ordinary multiplication in $\mathbb F_2[x]$. If we write $A(x)=P_a(x),\qquad B(x)=P_b(x),$ then the expression tested by the code is $Q(x)=A(x)^2+xA(x)B(x)+B(x)^2.$ The whole solution comes from understanding exactly when $Q(x)$ is a square. Squares over $\mathbb F_2$ In characteristic 2 we have $\left(\sum c_i x^i\right)^2=\sum c_i x^{2i},$ because every mixed term appears twice and therefore cancels. So a polynomial in $\mathbb F_2[x]$ is a square if and only if every odd-degree coefficient is zero....

Detailed mathematical approach

Problem Summary

For the bound $N=10^7$, we must count all pairs $(a,b)$ with $0\le a\le b\le N$ for which the carryless quadratic form

$a\otimes a \oplus (2\otimes a\otimes b) \oplus b\otimes b$

is itself a carryless square. Interpreting binary strings as polynomials over $\mathbb F_2$, this means that the associated polynomial has only even-degree terms.

A direct search over all $\Theta(N^2)$ pairs is impossible at this scale. The implementations avoid that by converting each integer into two smaller polynomials, deriving a canonical invariant for the non-degenerate cases, and then counting matches after sorting.

Mathematical Approach

Associate to every nonnegative integer $n=\sum n_i2^i$ the polynomial

$P_n(x)=\sum n_i x^i\in\mathbb F_2[x].$

Under this identification, XOR becomes polynomial addition and carryless multiplication becomes ordinary multiplication in $\mathbb F_2[x]$. If we write

$A(x)=P_a(x),\qquad B(x)=P_b(x),$

then the expression tested by the code is

$Q(x)=A(x)^2+xA(x)B(x)+B(x)^2.$

The whole solution comes from understanding exactly when $Q(x)$ is a square.

Squares over $\mathbb F_2$

In characteristic 2 we have

$\left(\sum c_i x^i\right)^2=\sum c_i x^{2i},$

because every mixed term appears twice and therefore cancels. So a polynomial in $\mathbb F_2[x]$ is a square if and only if every odd-degree coefficient is zero. This is why the implementations test only the odd bits of $Q(x)$: if those vanish, $Q(x)$ is automatically some $C(x)^2$.

Even and odd bit polynomials

Introduce $t=x^2$ and split each polynomial into its even and odd positions:

$A(x)=E_a(t)+x\,O_a(t),\qquad B(x)=E_b(t)+x\,O_b(t),$

with $E_a,O_a,E_b,O_b\in\mathbb F_2[t]$. Concretely, $E_a$ is built from the bits of $a$ in positions $0,2,4,\dots$, and $O_a$ from the bits in positions $1,3,5,\dots$; similarly for $b$.

Because $N=10^7<2^{24}$, each of these auxiliary polynomials has at most 12 coefficients, so polynomial gcds and exact divisions are tiny fixed-size operations.

The odd-part identity

The square terms $A(x)^2$ and $B(x)^2$ already contain only even powers. Therefore the odd powers of $Q(x)$ come entirely from the middle term $xA(x)B(x)$. Expanding with $t=x^2$,

$A(x)B(x)=E_aE_b+x(E_aO_b+O_aE_b)+t\,O_aO_b,$

so

$xA(x)B(x)=xE_aE_b+t(E_aO_b+O_aE_b)+xt\,O_aO_b.$

The odd part of $Q(x)$ is therefore

$x\bigl(E_a(t)E_b(t)+t\,O_a(t)O_b(t)\bigr).$

Hence $(a,b)$ is valid exactly when

$E_a(t)E_b(t)=t\,O_a(t)O_b(t).$

This identity is the central mathematical object in the solution.

Canonical classes for the general case

Assume first that all four polynomials $E_a,O_a,E_b,O_b$ are nonzero. Then the criterion can be rewritten as

$\frac{E_a(t)}{t\,O_a(t)}=\frac{O_b(t)}{E_b(t)}$

inside the rational-function field $\mathbb F_2(t)$. The implementations normalize the two sides separately by cancelling polynomial gcds:

$\left(\frac{E_a}{d_a},\frac{tO_a}{d_a}\right),\qquad d_a=\gcd(E_a,tO_a),$

$\left(\frac{O_b}{d_b},\frac{E_b}{d_b}\right),\qquad d_b=\gcd(O_b,E_b).$

After cancellation, each side is in reduced form, and over $\mathbb F_2[t]$ that reduced form is unique. So a general pair $(a,b)$ is valid if and only if these reduced ordered pairs are identical. This turns an equation in two variables into a key-matching problem.

Degenerate branches

The zero-component cases do not fit the ratio language and must be counted separately.

If $E_a=0$ but $O_a\neq 0$, then

$E_aE_b=t\,O_aO_b$

forces $O_b=0$. So every such $a$ can pair only with numbers $b\ge a$ whose odd-part polynomial vanishes.

If $O_a=0$ but $E_a\neq 0$, then the same identity forces $E_b=0$. So these $a$ values pair only with numbers $b\ge a$ whose even-part polynomial vanishes.

The last special case is $a=0$. Then

$Q(x)=B(x)^2,$

which is always a square, so every $b\in[0,N]$ contributes a valid pair $(0,b)$.

Worked example

Take $a=3$ and $b=6$. In binary,

$a=11_2,\qquad b=110_2,$

so

$A(x)=1+x,\qquad B(x)=x+x^2.$

With $t=x^2$, the split is

$E_a=1,\ O_a=1,\qquad E_b=t,\ O_b=1.$

The criterion becomes

$E_aE_b=1\cdot t=t,\qquad t\,O_aO_b=t\cdot 1\cdot 1=t,$

so the pair is valid.

Direct expansion confirms it:

$A(x)^2=1+x^2,\qquad xA(x)B(x)=x^2+x^4,\qquad B(x)^2=x^2+x^4,$

hence

$Q(x)=1+x^2,$

which has only even powers and is therefore a square.

From algebra to counting

Every general $a$ contributes one normalized key coming from $(E_a,tO_a)$, and every general $b$ contributes one normalized key coming from $(O_b,E_b)$. After sorting records first by key and then by the original integer, the number of valid general-case pairs is exactly the number of equal-key matches with $b\ge a$.

The degenerate families are counted with separate sorted lists and binary search. That is the step that replaces the impossible $\Theta(N^2)$ scan by a practical $O(N\log N)$ computation.

How the Code Works

Building polynomial records

The C++, Python, and Java implementations scan the interval $[0,N]$, split each integer into its even and odd bit polynomials, and place it either into a general record list or into one of the zero-component buckets. Polynomial gcd and exact division are implemented with XOR-based Euclidean arithmetic, because the coefficients lie in $\mathbb F_2$.

Handling the special families

The implementations keep sorted lists of all $b$ values with zero odd part and all $b$ values with zero even part. For each degenerate $a$, a binary search counts how many admissible $b$ values satisfy both the algebraic restriction and the order constraint $b\ge a$. The unconditional family $a=0$ contributes $N+1$ pairs immediately.

Merging equal-key groups

The general records are sorted by normalized key and then by value. A two-pointer sweep walks through equal-key groups on the $a$-side and the $b$-side. Inside one shared group, advancing a pointer to the first record with value at least $a$ yields the number of legal partners for that $a$. This is just a merge-style count on already reduced algebraic classes.

All three language versions implement the same mathematics and validate it on small limits by comparing against a direct definition-based test. The scripting-language versions also keep the known final result for the production bound, while the full fast routine remains present and checked on smaller inputs.

Complexity Analysis

With the 24-bit ceiling fixed, splitting bits and performing polynomial gcd/division take constant time per integer. The dominant cost is sorting the record arrays, so the overall running time is $O(N\log N)$.

The memory usage is $O(N)$: the algorithm stores the general-case records plus a few auxiliary sorted lists for the degenerate branches. Compared with the naive $O(N^2)$ pair test, this is the decisive improvement.

Footnotes and References

Problem page: https://projecteuler.net/problem=945
Carry-less product: Wikipedia - Carry-less product
Finite field arithmetic: Wikipedia - Finite field arithmetic
Polynomial greatest common divisor: Wikipedia - Polynomial greatest common divisor
Frobenius endomorphism: Wikipedia - Frobenius endomorphism