Problem 758: Buckets of Water
View on Project EulerProject Euler Problem 758 Solution
EulerSolve provides an optimized solution for Project Euler Problem 758, Buckets of Water, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary For coprime positive integers \(a\le b\), the quantity studied here is built from Bézout relations between \(a\) and \(b\). In the Project Euler instance, the inputs are Mersenne numbers \(M_u=2^u-1\) and \(M_v=2^v-1\), where \(u=r^5\) and \(v=s^5\) for distinct primes \(r\lt s\lt 1000\). The goal is to evaluate the contribution for every such prime pair and sum all contributions modulo \(10^9+7\). Directly constructing \(2^u-1\) and \(2^v-1\) is hopeless because the exponents are enormous. The key observation is that the Euclidean algorithm on Mersenne numbers can be simulated on the exponents alone, while every coefficient that matters is tracked only modulo the final modulus. Mathematical Approach Write \(M_n=2^n-1\). For two coprime positive integers \(a\le b\), choose the least positive inverse \(x\) of \(a\) modulo \(b\): $a x \equiv 1 \pmod{b},\qquad 1\le x \lt b.$ Then $y=\frac{a x - 1}{b}$ is a positive integer and satisfies \(a x - b y = 1\). The complementary positive solution to the opposite sign equation is \((b-x,\ a-y)\), so the implementations evaluate $P(a,b)=2\min\left(x+y-1,\ (b-x)+(a-y)-1\right).$ The problem is therefore to compute \(P(M_u,M_v)\) quickly for very large exponents \(u\) and \(v\). Step 1: Replace gigantic integers by Mersenne exponents Let \(a=M_u\) and \(b=M_v\) with \(u\lt v\)....
Detailed mathematical approach
Problem Summary
For coprime positive integers \(a\le b\), the quantity studied here is built from Bézout relations between \(a\) and \(b\). In the Project Euler instance, the inputs are Mersenne numbers \(M_u=2^u-1\) and \(M_v=2^v-1\), where \(u=r^5\) and \(v=s^5\) for distinct primes \(r\lt s\lt 1000\). The goal is to evaluate the contribution for every such prime pair and sum all contributions modulo \(10^9+7\).
Directly constructing \(2^u-1\) and \(2^v-1\) is hopeless because the exponents are enormous. The key observation is that the Euclidean algorithm on Mersenne numbers can be simulated on the exponents alone, while every coefficient that matters is tracked only modulo the final modulus.
Mathematical Approach
Write \(M_n=2^n-1\). For two coprime positive integers \(a\le b\), choose the least positive inverse \(x\) of \(a\) modulo \(b\):
$a x \equiv 1 \pmod{b},\qquad 1\le x \lt b.$
Then
$y=\frac{a x - 1}{b}$
is a positive integer and satisfies \(a x - b y = 1\). The complementary positive solution to the opposite sign equation is \((b-x,\ a-y)\), so the implementations evaluate
$P(a,b)=2\min\left(x+y-1,\ (b-x)+(a-y)-1\right).$
The problem is therefore to compute \(P(M_u,M_v)\) quickly for very large exponents \(u\) and \(v\).
Step 1: Replace gigantic integers by Mersenne exponents
Let \(a=M_u\) and \(b=M_v\) with \(u\lt v\). If \(u\) and \(v\) are coprime, then the classical identity
$\gcd(M_u,M_v)=M_{\gcd(u,v)}$
gives
$\gcd(M_u,M_v)=M_1=1.$
This is exactly the situation here, because distinct prime fifth powers remain coprime:
$\gcd(r^5,s^5)=1\qquad (r\ne s).$
So every target pair is valid for the Bézout setup, even though the numbers themselves are astronomically large.
Step 2: Run the Euclidean algorithm on exponents
If \(A=qB+r\) with \(0\le r \lt B\), then
$2^A-1=2^r\left(2^{qB}-1\right)+\left(2^r-1\right).$
After factoring \(2^{qB}-1=(2^B-1)\sum_{j=0}^{q-1}2^{jB}\), this becomes
$M_A = Q\,M_B + M_r,$
with quotient
$Q = 2^r\sum_{j=0}^{q-1}2^{jB}.$
Therefore the remainder when dividing \(M_A\) by \(M_B\) is exactly \(M_r\). The full remainder sequence for \((M_A,M_B)\) is obtained by applying ordinary Euclid to the exponent pair \((A,B)\). This is the compression that makes the problem tractable.
Step 3: Compute Euclidean quotients without big integers
Each Euclidean step needs the quotient \(Q\) only modulo \(10^9+7\). Using the formula above, we only need
$Q \equiv 2^r\sum_{j=0}^{q-1}\left(2^B\right)^j \pmod{10^9+7}.$
The implementation evaluates this geometric sum by divide and conquer, simultaneously tracking a power and the corresponding partial sum. As a result, every quotient update is handled with modular arithmetic and logarithmic recursion depth, not with gigantic integers.
Feeding these modular quotients into the extended Euclidean recurrence yields the inverse of a Mersenne remainder modulo another Mersenne number, but represented only modulo \(10^9+7\).
Step 4: Recover the inverse of the Mersenne remainder
Let
$w = v \bmod u,$
and let \(t\) be the inverse of \(w\) modulo \(u\):
$w t \equiv 1 \pmod{u},\qquad 1\le t \lt u.$
Now consider
$R = 1 + 2^w + 2^{2w} + \cdots + 2^{(t-1)w}.$
Then
$\begin{aligned} (2^w-1)R &= 2^{tw}-1 \\ &\equiv 2^1-1 \equiv 1 \pmod{2^u-1}, \end{aligned}$
so \(R\) is an inverse of \(M_w\) modulo \(M_u\). Because \(M_v\equiv M_w \pmod{M_u}\), the same value also acts as an inverse of \(M_v\) modulo \(M_u\).
The opposite Bézout sign branch is obtained by replacing \(R\) with \(M_u-R\). The implementation chooses between these two branches with the criterion
$2t\le u,$
which is equivalent to choosing between the \(t\)-term geometric sum and its \((u-t)\)-term complement inside \(M_u=1+2+\cdots+2^{u-1}\).
Step 5: Convert the inverse into the required Bézout sum
Let \(d\) be the branch selected in the previous step. Then one of the two relations
$M_v d = M_u x \pm 1.$
holds, depending on the chosen sign. Once \(x\) is recovered, the contribution is
$P(M_u,M_v)=2(x+d-1).$
This is the same minimum as in the generic formula for \(P(a,b)\); the only difference is that the implementation works with the coefficient attached to \(M_v\), because that is the quantity naturally produced by the Mersenne inverse computation.
Step 6: Worked example with small exponents
Take \(u=3\) and \(v=5\). Then
$M_3=7,\qquad M_5=31,\qquad w=5\bmod 3=2.$
The inverse of \(w=2\) modulo \(u=3\) is \(t=2\), since
$2\cdot 2 \equiv 1 \pmod{3}.$
Hence
$R = 1 + 2^2 = 5,$
and indeed
$M_2 R = 3\cdot 5 = 15 \equiv 1 \pmod{7}.$
Because \(2t=4>3\), the smaller branch uses the complement
$d = M_3 - R = 7-5=2.$
Now
$x=\frac{M_5 d + 1}{M_3}=\frac{31\cdot 2 + 1}{7}=9,$
so
$P(M_3,M_5)=2(x+d-1)=2(9+2-1)=20.$
This matches the small verification case built into the implementation.
Step 7: Final summation over prime fifth powers
If \(p_1,p_2,\dots,p_k\) are the primes below \(1000\), the final value is
$\sum_{1\le i \lt j\le k} P\left(M_{p_i^5},M_{p_j^5}\right)\pmod{10^9+7}.$
The entire mathematical task is therefore to evaluate each pair contribution via exponent-level Euclid and accumulate the results modulo the final prime modulus.
How the Code Works
The C++, Python, and Java implementations all expose the same computation, but the actual number-theoretic work is performed by the C++ implementation. It begins with several small checkpoints: first on ordinary coprime integer pairs, and then on small Mersenne exponents where a direct comparison against the generic Bézout formula is still possible.
After the checkpoints, the implementation generates all primes below \(1000\), raises each of them to the fifth power, and iterates over every unordered pair. For each pair \((u,v)\), it computes \(M_u \bmod 10^9+7\) and \(M_v \bmod 10^9+7\), runs the Euclidean algorithm on the exponent pair, and reconstructs the required quotient information only modulo \(10^9+7\).
The key accelerator is the divide-and-conquer evaluation of geometric sums. Instead of forming the true quotient between two huge Mersenne numbers, the implementation directly produces that quotient modulo \(10^9+7\), which is exactly what the extended Euclidean recurrence needs. The branch decision based on the inverse of \(w\) modulo \(u\) then determines which Bézout sign gives the smaller contribution.
Finally, the contribution of every prime pair is added into a running sum modulo \(10^9+7\). The Python and Java implementations are thin launchers around that same compiled computation, so all three language entries stay synchronized on both the mathematics and the final output.
Complexity Analysis
There are \(168\) primes below \(1000\), so the outer loop evaluates \(\binom{168}{2}=14028\) unordered pairs. For one pair, the exponent-level Euclidean algorithm takes \(O(\log \max(u,v))\) divisions, just as ordinary Euclid does on integers. Each division performs a constant number of modular exponentiations and one divide-and-conquer geometric-sum evaluation, both logarithmic in the relevant exponent sizes. Therefore the total running time is \(O(P^2\operatorname{polylog} U)\) with \(P=168\) and \(U=\max p^5\), while memory usage is \(O(P)\) for the prime list plus \(O(\log U)\) recursion depth.
Footnotes and References
- Problem page: https://projecteuler.net/problem=758
- Bézout's identity: Wikipedia — Bézout's identity
- Extended Euclidean algorithm: Wikipedia — Extended Euclidean algorithm
- Mersenne number: Wikipedia — Mersenne number
- Modular multiplicative inverse: Wikipedia — Modular multiplicative inverse
- Geometric series: Wikipedia — Geometric series