Problem 643: $2$-Friendly
View on Project EulerProject Euler Problem 643 Solution
EulerSolve provides an optimized solution for Project Euler Problem 643, $2$-Friendly, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary A pair \((a,b)\) with \(1 \le a \lt b \le N\) is called 2-friendly when \(\gcd(a,b)\) is a power of two strictly larger than \(1\). If \(f(N)\) denotes the number of such pairs, the task is to compute $f(10^{11}) \pmod{10^9+7}.$ A direct scan over all pairs would be quadratic in \(N\), so the solution has to reorganize the counting problem around the exact gcd. Mathematical Approach We want to count all pairs whose gcd has the form \(2^t\) with \(t \ge 1\). The decisive observation is that once the exact gcd is fixed, what remains is a coprime pair-counting problem. Step 1: Separate the Exact Power-of-Two gcd Suppose \(\gcd(a,b)=2^t\) for some \(t \ge 1\). Then we can write $a=2^t x,\qquad b=2^t y,$ with $1 \le x \lt y \le \left\lfloor \frac{N}{2^t} \right\rfloor.$ Because the gcd was assumed to be exactly \(2^t\), the reduced pair must satisfy $\gcd(x,y)=1.$ Conversely, every coprime pair \((x,y)\) in that range produces exactly one original pair \((2^t x,2^t y)\) whose gcd is exactly \(2^t\). So there is no overcounting between different values of \(t\). Step 2: Count Coprime Pairs with Euler's Totient Function For a fixed upper bound \(m\), define $C(m)=\#\{(x,y):1 \le x \lt y \le m,\ \gcd(x,y)=1\}.$ If we fix the larger coordinate \(y\), then the valid values of \(x\) are precisely the integers \(1 \le x \lt y\) that are coprime to \(y\)....
Detailed mathematical approach
Problem Summary
A pair \((a,b)\) with \(1 \le a \lt b \le N\) is called 2-friendly when \(\gcd(a,b)\) is a power of two strictly larger than \(1\). If \(f(N)\) denotes the number of such pairs, the task is to compute
$f(10^{11}) \pmod{10^9+7}.$
A direct scan over all pairs would be quadratic in \(N\), so the solution has to reorganize the counting problem around the exact gcd.
Mathematical Approach
We want to count all pairs whose gcd has the form \(2^t\) with \(t \ge 1\). The decisive observation is that once the exact gcd is fixed, what remains is a coprime pair-counting problem.
Step 1: Separate the Exact Power-of-Two gcd
Suppose \(\gcd(a,b)=2^t\) for some \(t \ge 1\). Then we can write
$a=2^t x,\qquad b=2^t y,$
with
$1 \le x \lt y \le \left\lfloor \frac{N}{2^t} \right\rfloor.$
Because the gcd was assumed to be exactly \(2^t\), the reduced pair must satisfy
$\gcd(x,y)=1.$
Conversely, every coprime pair \((x,y)\) in that range produces exactly one original pair \((2^t x,2^t y)\) whose gcd is exactly \(2^t\). So there is no overcounting between different values of \(t\).
Step 2: Count Coprime Pairs with Euler's Totient Function
For a fixed upper bound \(m\), define
$C(m)=\#\{(x,y):1 \le x \lt y \le m,\ \gcd(x,y)=1\}.$
If we fix the larger coordinate \(y\), then the valid values of \(x\) are precisely the integers \(1 \le x \lt y\) that are coprime to \(y\). Their number is \(\varphi(y)\). Therefore
$C(m)=\sum_{y=2}^{m}\varphi(y).$
Now introduce the summatory totient function
$\Phi(m)=\sum_{k=1}^{m}\varphi(k).$
Since \(\varphi(1)=1\), we get the compact identity
$C(m)=\Phi(m)-1.$
Step 3: Sum over All Powers of Two
For a fixed \(t\), the admissible pairs with gcd \(2^t\) are counted by
$C\left(\left\lfloor \frac{N}{2^t} \right\rfloor\right)=\Phi\left(\left\lfloor \frac{N}{2^t} \right\rfloor\right)-1.$
Hence
$f(N)=\sum_{t \ge 1}\left(\Phi\left(\left\lfloor \frac{N}{2^t} \right\rfloor\right)-1\right).$
Only finitely many terms are nonzero, because once \(\left\lfloor N/2^t \right\rfloor \le 1\), there is no room for a pair with \(x \lt y\).
Step 4: Derive a Fast Recurrence for \(\Phi(n)\)
The implementations do not sum \(\varphi(k)\) up to \(n\) from scratch for every query. Instead they use the classical divisor identity
$\sum_{d \mid m}\varphi(d)=m.$
Summing this from \(m=1\) to \(m=n\) gives
$\sum_{m=1}^{n} m=\sum_{m=1}^{n}\sum_{d \mid m}\varphi(d).$
After exchanging the order of summation, this becomes
$\frac{n(n+1)}{2}=\sum_{q=1}^{n}\Phi\left(\left\lfloor \frac{n}{q} \right\rfloor\right).$
Separating the \(q=1\) term yields the recurrence
$\Phi(n)=\frac{n(n+1)}{2}-\sum_{q=2}^{n}\Phi\left(\left\lfloor \frac{n}{q} \right\rfloor\right).$
The floor value \(\left\lfloor n/q \right\rfloor\) is constant on intervals, so the sum is grouped by quotient blocks rather than processed one index at a time.
Step 5: Use Quotient Grouping and Memoization
If a block starts at \(l\), let
$v=\left\lfloor \frac{n}{l} \right\rfloor,\qquad r=\left\lfloor \frac{n}{v} \right\rfloor.$
Then every \(q\) in the interval \(l \le q \le r\) has the same quotient \(v\), so their total contribution is
$ (r-l+1)\,\Phi(v). $
This reduces the amount of work dramatically. Memoization then stores each large \(\Phi(v)\) value after its first computation, which is important because the outer formula asks for \(\Phi\) at the related arguments \(N/2,N/4,N/8,\dots\).
Worked Example: \(N=10\)
For \(N=10\), the relevant halvings are
$\left\lfloor \frac{10}{2} \right\rfloor=5,\qquad \left\lfloor \frac{10}{4} \right\rfloor=2,\qquad \left\lfloor \frac{10}{8} \right\rfloor=1.$
The last term contributes nothing because \(\Phi(1)-1=0\). So
$f(10)=\left(\Phi(5)-1\right)+\left(\Phi(2)-1\right).$
Now
$\Phi(5)=1+1+2+2+4=10,\qquad \Phi(2)=1+1=2,$
hence
$f(10)=9+1=10.$
The ten pairs are
$(2,4),(2,6),(2,8),(2,10),(4,6),(4,10),(6,8),(6,10),(8,10),(4,8).$
The first nine have gcd \(2\), and the last one has gcd \(4\).
How the Code Works
The C++, Python, and Java implementations follow the same structure. They first precompute \(\varphi(n)\) up to a fixed cutoff of five million with a linear sieve and store the prefix sums modulo \(10^9+7\). That makes every small \(\Phi(n)\) query an \(O(1)\) table lookup.
For larger \(n\), the implementation evaluates the recurrence
$\Phi(n)=\frac{n(n+1)}{2}-\sum_{q=2}^{n}\Phi\left(\left\lfloor \frac{n}{q} \right\rfloor\right)$
using quotient grouping, so each interval of equal floor value is handled in one step. The result is memoized, which prevents the same large summatory-totient value from being recomputed.
Finally, the main solve loop starts with \(m=\lfloor N/2 \rfloor\), repeatedly halves \(m\), and accumulates \(\Phi(m)-1\) until \(m \lt 2\). Every arithmetic step is reduced modulo \(10^9+7\).
Complexity Analysis
Let \(B=5{,}000{,}000\) be the precomputation cutoff used by the implementations. The linear sieve and prefix table take \(O(B)\) time and \(O(B)\) memory. The outer summation over \(N/2,N/4,N/8,\dots\) has only \(O(\log N)\) terms.
The expensive part is evaluating large \(\Phi(n)\) values, but quotient grouping compresses each summation into blocks of equal floor quotient, and memoization ensures repeated subproblems are solved once. That hybrid strategy is what makes \(N=10^{11}\) practical.
Footnotes and References
- Problem page: https://projecteuler.net/problem=643
- Euler's totient function: Wikipedia — Euler's totient function
- Coprime integers: Wikipedia — Coprime integers
- Greatest common divisor: Wikipedia — Greatest common divisor
- Dirichlet hyperbola method: Wikipedia — Dirichlet hyperbola method