Problem 958: Euclid's Labour
View on Project EulerProject Euler Problem 958 Solution
EulerSolve provides an optimized solution for Project Euler Problem 958, Euclid's Labour, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary For a fixed positive integer \(n\), consider the subtractive Euclidean algorithm on a coprime pair \((n,m)\): at each step, replace the larger number by the difference of the two numbers until one entry becomes 0. The quantity minimized in Problem 958 is this Euclidean labour, and among all residues with minimal labour we want the smallest positive representative of the symmetry orbit $\{m,\ n-m,\ m^{-1},\ n-m^{-1}\}\pmod n,$ where \(m^{-1}\) is the modular inverse of \(m\) modulo \(n\). The small checks built into the implementations are $f(7)=2,\qquad f(89)=34,\qquad f(8191)=1856.$ For the final input \(n=1{,}000{,}000{,}000{,}039\), a direct scan over all \(1\le m<n\) is impossible. The successful solution rewrites the search in terms of primitive Euclid-tree states, splits a shortest path near its midpoint, and turns the second half into a reduced lattice problem followed by a short finishing Euclidean descent. Mathematical Approach Labour as a continued-fraction quantity If $\frac{n}{m}=[q_0,q_1,\dots,q_r]$ is the regular continued fraction of \(n/m\), then the subtractive Euclidean algorithm performs exactly $\ell(n,m)=q_0+q_1+\cdots+q_r$ subtractions. So the problem is not asking for the smallest \(m\), but for the smallest \(m\) among those whose continued-fraction digit sum is minimal....
Detailed mathematical approach
Problem Summary
For a fixed positive integer \(n\), consider the subtractive Euclidean algorithm on a coprime pair \((n,m)\): at each step, replace the larger number by the difference of the two numbers until one entry becomes 0. The quantity minimized in Problem 958 is this Euclidean labour, and among all residues with minimal labour we want the smallest positive representative of the symmetry orbit
$\{m,\ n-m,\ m^{-1},\ n-m^{-1}\}\pmod n,$
where \(m^{-1}\) is the modular inverse of \(m\) modulo \(n\).
The small checks built into the implementations are
$f(7)=2,\qquad f(89)=34,\qquad f(8191)=1856.$
For the final input \(n=1{,}000{,}000{,}000{,}039\), a direct scan over all \(1\le m<n\) is impossible. The successful solution rewrites the search in terms of primitive Euclid-tree states, splits a shortest path near its midpoint, and turns the second half into a reduced lattice problem followed by a short finishing Euclidean descent.
Mathematical Approach
Labour as a continued-fraction quantity
If
$\frac{n}{m}=[q_0,q_1,\dots,q_r]$
is the regular continued fraction of \(n/m\), then the subtractive Euclidean algorithm performs exactly
$\ell(n,m)=q_0+q_1+\cdots+q_r$
subtractions. So the problem is not asking for the smallest \(m\), but for the smallest \(m\) among those whose continued-fraction digit sum is minimal.
The primitive-pair tree behind Euclid's algorithm
The search is organized with the two unimodular moves
$L=\begin{pmatrix}0&1\\1&1\end{pmatrix},\qquad R=\begin{pmatrix}1&0\\1&1\end{pmatrix},$
which act on a primitive ordered pair \((a,b)\) as
$L:(a,b)\mapsto (b,a+b),\qquad R:(a,b)\mapsto (a,a+b).$
Starting from \((0,1)\), these moves generate the Stern-Brocot style tree of primitive nonnegative pairs. Every move contributes one unit of Euclidean labour, so a path of length \(s\) represents a coprime residue whose subtractive Euclidean cost is \(s\).
Splitting a shortest path at the midpoint
Fix a trial budget \(s\), and set
$d=\left\lfloor\frac{s+1}{2}\right\rfloor.$
The forward half of the path is explored explicitly to depth \(d\). To describe the backward half, the implementations carry a second vector \((c_a,c_b)\) satisfying the invariant
$a\,c_a+b\,c_b=n.$
This is preserved because the coefficient vector is updated by the inverse transpose of the same matrix used on \((a,b)\):
$L^{-T}:(c_a,c_b)\mapsto (c_b-c_a,c_a),\qquad R^{-T}:(c_a,c_b)\mapsto (c_a-c_b,c_b).$
So every midpoint of a valid shortest path lies on the integer line \(a\,c_a+b\,c_b=n\).
Reducing the midpoint to a fundamental strip
All integer solutions of \(a\,c_a+b\,c_b=n\) differ by multiples of \((b,-a)\). Define
$N=a^2+b^2,\qquad X=c_a b-c_b a.$
Replacing \((c_a,c_b)\) by \((c_a+k b,c_b-k a)\) keeps the dot product \(a\,c_a+b\,c_b\) unchanged, but changes the cross term by
$X\mapsto X+kN.$
Therefore the midpoint can be shifted into a canonical strip \(0\le X<N\). The additional condition
$c_a^2+c_b^2\le N$
means that the coefficient vector is no longer than the forward vector. Once the strip reduction is done, at most two neighboring lattice points can satisfy this norm bound, which is why the implementation only needs to test two candidates at the terminal stage.
The odd-budget midpoint conversion
When \(s\) is odd, the midpoint lies inside one Euclidean move rather than between two full levels. The code converts that case to the same geometry by replacing
$ (a,b,c_a,c_b)\longmapsto (2a,\ 2b-a,\ 2c_a+c_b,\ 2c_b). $
After this change, the preserved dot product becomes \(4n\), but the same norm reduction and the same finishing logic apply. That is why the odd and even cases share one core terminal check.
Finishing the second half and extracting the residue
From a valid midpoint, the remaining Euclidean labour is completed by a subtractive descent on a coefficient pair \((x,y)\), together with the dual updates on a primitive pair \((v_x,v_y)\): after ordering \(x\le y\), one step is
$ (x,y,v_x,v_y)\mapsto (x,\ y-x,\ v_x+v_y,\ v_y). $
The invariant
$x\,v_x+y\,v_y=n$
is preserved throughout. If one coefficient reaches 0 within the remaining budget, the associated residue is
$o\equiv v_x+v_y-n \pmod n.$
The code then checks that \(o\) is invertible modulo \(n\), and scores the orbit representative
$\min\{o,\ n-o,\ o^{-1},\ n-o^{-1}\}.$
The first budget \(s\) for which such a terminal state exists is the minimal Euclidean labour.
Worked example
For \(n=7\), take \(m=2\). Then
$\frac{7}{2}=[3,2],$
so the labour is \(3+2=5\). The same labour appears on the whole symmetry orbit
$\{2,5,4,3\}=\{m,\ 7-m,\ m^{-1},\ 7-m^{-1}\},$
because \(2^{-1}\equiv 4\pmod 7\). The smallest positive member of that orbit is \(2\), matching the validation value \(f(7)=2\). The larger checks \(f(89)=34\) and \(f(8191)=1856\) are produced by exactly the same mechanism, only with much deeper shortest paths.
How the Code Works
Iterative deepening over the labour budget
The C++, Python, and Java implementations increase the candidate labour \(s\) from 0 upward. For each \(s\), they set \(d=\lfloor(s+1)/2\rfloor\) and run a depth-first search over midpoint states \((a,b,c_a,c_b)\). Each recursive call normalizes the order to \(a\le b\), pushes \(c_a\) upward by whole copies of \((b,-a)\) when necessary, and immediately rejects states with \(c_b<0\) or with a broken invariant \(a\,c_a+b\,c_b\ne n\).
Bounding future growth before recursing
Before expanding deeper, the implementation computes an optimistic continuation obtained by always following the smallest-growth forward branch. If even that continuation cannot make the terminal norm large enough, the whole branch is impossible and is pruned early. The resulting tests are
$x^2+y^2\ge n$
for even budgets, and
$x^2+xy+\frac{5}{4}y^2\ge n$
for the odd-midpoint geometry. These inequalities are the main reason the search stays small in practice.
Terminal reduction and two candidate checks
At depth \(d\), the code applies the odd-budget conversion when needed, computes the norm \(N\) and cross term \(X\), and shifts the coefficient vector into the canonical strip \(0\le X<N\). Because only two neighboring lattice points can remain compatible with the norm bound \(c_a^2+c_b^2\le N\), the implementation performs exactly two terminal checks: the reduced point itself and the next neighbor obtained by subtracting one more copy of \((b,-a)\).
Completing the Euclidean descent
Each surviving terminal candidate is finished by a bounded subtractive Euclidean run on the second-half coefficients. The same code handles even and odd budgets; the odd case first undoes the midpoint conversion with parity checks so that the final descent always lives in the original integer state space. When the descent ends within the remaining budget, the residue \(o\) is extracted, its modular inverse is tested, and the best answer is updated using the pair "smallest labour first, smallest orbit representative second".
Complexity Analysis
If the minimal Euclidean labour is \(s^\ast\), a naive full search would branch over all words of length \(s^\ast\), which is exponential. The midpoint split reduces the explicit tree depth to roughly \(s^\ast/2\), so the raw search shape is closer to \(O(2^{s^\ast/2})\) midpoint states than to \(O(2^{s^\ast})\) full paths.
Each surviving midpoint requires only constant-size arithmetic on a few integers, plus at most \(s^\ast-d\) subtractive finishing steps. Memory usage is \(O(s^\ast)\) because the recursion stack and the terminal work both store only a bounded amount of state. In practice, the decisive factor is pruning: the sign normalization, norm bounds, lattice-strip reduction, and modular-inverse test remove the overwhelming majority of theoretical branches.
Footnotes and References
- Project Euler problem page: https://projecteuler.net/problem=958
- Euclidean algorithm: Wikipedia - Euclidean algorithm
- Continued fractions: Wikipedia - Continued fraction
- Stern-Brocot tree: Wikipedia - Stern-Brocot tree
- Modular multiplicative inverse: Wikipedia - Modular multiplicative inverse
- Extended Euclidean algorithm: Wikipedia - Extended Euclidean algorithm