Problem 702: Jumping Flea
View on Project EulerProject Euler Problem 702 Solution
EulerSolve provides an optimized solution for Project Euler Problem 702, Jumping Flea, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary Project Euler 702 asks for the exact value of a quantity \(S(N)\) coming from the jumping-flea process. For the real input \(N=123456789\), any direct simulation is far too large, so the solution used by the implementation rewrites the problem as a short dyadic decomposition plus a recursively reduced correction term. The important point is that the geometry of the statement has already been compressed into arithmetic. Instead of tracking every move, the computation only needs powers of two around \(N\), one final gap term \(2^\lambda-N\), and a memoized Euclidean-style kernel on reduced integer pairs. Mathematical Approach Let $\lambda=\lfloor\log_2 N\rfloor+1,$ so that $2^{\lambda-1}\le N<2^\lambda.$ The implementation evaluates \(S(N)\) by splitting it into a closed bulk contribution and a family of correction terms. Step 1: Isolate the Dyadic Scale The top-level identity used by the implementation is $S(N)=\frac{N(3N+1)}{2}(\lambda+1)-\sum_{d=2}^{\lambda} C(N,2^d)+2C(N,2^\lambda-N).$ The factor \(\lambda\) is simply the bit length of \(N\). This formula says that the whole answer is controlled by the powers of two that bracket \(N\), plus one last correction measuring the distance from \(N\) to the next power of two. That already changes the scale of the task: instead of a huge search, we only need \(\lambda-1=O(\log N)\) correction evaluations....
Detailed mathematical approach
Problem Summary
Project Euler 702 asks for the exact value of a quantity \(S(N)\) coming from the jumping-flea process. For the real input \(N=123456789\), any direct simulation is far too large, so the solution used by the implementation rewrites the problem as a short dyadic decomposition plus a recursively reduced correction term.
The important point is that the geometry of the statement has already been compressed into arithmetic. Instead of tracking every move, the computation only needs powers of two around \(N\), one final gap term \(2^\lambda-N\), and a memoized Euclidean-style kernel on reduced integer pairs.
Mathematical Approach
Let
$\lambda=\lfloor\log_2 N\rfloor+1,$
so that
$2^{\lambda-1}\le N<2^\lambda.$
The implementation evaluates \(S(N)\) by splitting it into a closed bulk contribution and a family of correction terms.
Step 1: Isolate the Dyadic Scale
The top-level identity used by the implementation is
$S(N)=\frac{N(3N+1)}{2}(\lambda+1)-\sum_{d=2}^{\lambda} C(N,2^d)+2C(N,2^\lambda-N).$
The factor \(\lambda\) is simply the bit length of \(N\). This formula says that the whole answer is controlled by the powers of two that bracket \(N\), plus one last correction measuring the distance from \(N\) to the next power of two.
That already changes the scale of the task: instead of a huge search, we only need \(\lambda-1=O(\log N)\) correction evaluations.
Step 2: Split Off the Explicit Quadratic Part
The correction term is written as
$C(u,v)=(v-1)(v-2)-\Phi(u,v),$
where \(\Phi(u,v)\) is the genuinely recursive part. The quadratic piece \((v-1)(v-2)\) is immediate, so the computational difficulty is concentrated entirely in \(\Phi\).
Before doing anything else, the first argument is reduced modulo the second. Only the residue of \(u\) modulo \(v\) matters. If that reduced value is \(0\) or \(1\), then the recursion stops and
$\Phi(u,v)=0.$
This base case is the mechanism that makes the descent finite.
Step 3: Apply the Euclidean Quotient Decomposition
Assume the reduced first argument is \(a\), with \(2\le a<v\). Write
$q=\left\lfloor\frac{v}{a}\right\rfloor,\qquad r=v\bmod a.$
Then all three implementations use the exact identity
$\Phi(u,v)=\frac{q(q+1)a(a-1)}{4}+(q+1)\Phi(a,r)-q\,\Phi(a,a-r).$
The first term summarizes an entire quotient block in closed form. The remaining two terms are smaller subproblems. Both new moduli, \(r\) and \(a-r\), are strictly smaller than \(a\), so the recursion shrinks in the same spirit as the Euclidean algorithm.
Step 4: Why the Recurrence Stays Exact
The polynomial factor
$\frac{q(q+1)a(a-1)}{4}$
is always an integer. The reason is that \(q(q+1)\) is even and \(a(a-1)\) is also even, so their product is divisible by \(4\).
No floating-point approximation is needed anywhere. Every recursive branch stays in integer arithmetic, which is why memoization can store exact values and why the final result agrees across the C++, Python, and Java implementations.
Step 5: Why Memoization Removes Most of the Work
The same reduced pair can appear from several top-level correction terms. Once \(\Phi(a,v)\) has been computed, every later request for the same pair is just a table lookup.
This matters because the outer formula repeatedly asks for nearby dyadic moduli. Memoization collapses a large recursion tree into a much smaller directed acyclic graph of distinct residue pairs.
Worked Example: \(N=5\)
For \(N=5\), we have \(\lambda=\lfloor\log_2 5\rfloor+1=3\). Therefore
$S(5)=\frac{5(3\cdot5+1)}{2}(3+1)-C(5,4)-C(5,8)+2C(5,3).$
The closed bulk term is
$\frac{5\cdot16}{2}\cdot4=160.$
Now compute the corrections.
For \(C(5,4)\), reducing \(5\bmod 4\) gives \(1\), so \(\Phi(5,4)=0\). Hence
$C(5,4)=(4-1)(4-2)=6.$
For \(C(5,3)\), the reduced first argument is \(2\). Then \(q=\lfloor 3/2\rfloor=1\) and \(r=1\), so
$\Phi(5,3)=\frac{1\cdot2\cdot2\cdot1}{4}+2\Phi(2,1)-\Phi(2,1)=1.$
Therefore
$C(5,3)=(3-1)(3-2)-1=1.$
For \(C(5,8)\), we have \(a=5\), \(q=1\), \(r=3\). Thus
$\Phi(5,8)=\frac{1\cdot2\cdot5\cdot4}{4}+2\Phi(5,3)-\Phi(5,2)=10+2\cdot1-0=12,$
because \(5\bmod 2=1\), so \(\Phi(5,2)=0\). Therefore
$C(5,8)=(8-1)(8-2)-12=42-12=30.$
Putting the pieces together,
$S(5)=160-6-30+2\cdot1=126,$
which matches the checkpoint used by the implementation.
How the Code Works
The C++, Python, and Java implementations all follow the same structure. They first compute \(\lambda\), then form the bulk term \(\frac{N(3N+1)}{2}(\lambda+1)\), then subtract the dyadic corrections \(C(N,2^d)\) for \(2\le d\le\lambda\), and finally add \(2C(N,2^\lambda-N)\).
Each correction evaluation begins by reducing the first argument modulo the second. If the reduced value is \(0\) or \(1\), the answer is immediate. Otherwise the implementation computes the quotient \(q\), the remainder \(r\), applies the recurrence above, and stores the result in a memo table keyed by the reduced pair.
The compiled implementations widen intermediate products for the quadratic term before converting back to a 64-bit result, while Python uses arbitrary-precision integers automatically. Algorithmically, though, the three versions are the same, and they agree with the published checkpoints \(S(3)=42\), \(S(5)=126\), \(S(123)=167178\), and \(S(12345)=3185041956\).
Complexity Analysis
The outer decomposition uses only \(O(\log N)\) correction terms because \(\lambda\) is the bit length of \(N\). The real cost is determined by the number of distinct reduced pairs that appear during the recursive descent.
If \(M\) denotes the number of distinct memoized states, then the running time is \(O(M)\) after hash-table lookups, and the memory usage is also \(O(M)\). In practice \(M\) stays small because each uncached step replaces the current modulus by values below the current residue, giving Euclidean-style shrinkage rather than combinatorial explosion.
Footnotes and References
- Problem page: https://projecteuler.net/problem=702
- Euclidean algorithm: Wikipedia - Euclidean algorithm
- Division algorithm: Wikipedia - Division algorithm
- Binary logarithm: Wikipedia - Binary logarithm
- Memoization: Wikipedia - Memoization