Problem 72: Counting Fractions
View on Project EulerProject Euler Problem 72 Solution
EulerSolve provides an optimized solution for Project Euler Problem 72, Counting Fractions, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary We must count all reduced proper fractions \(a/b\) with \(0 \lt a \lt b \le 1{,}000{,}000\) and \(\gcd(a,b)=1\). Equivalently, if \(F_N\) is the Farey sequence of order \(N\), the requested quantity is \(|F_N|-2\), because the two boundary fractions \(0/1\) and \(1/1\) are excluded. The implementations do not enumerate fractions directly. Instead, they group the count by denominator and reduce the problem to one exact summation over Euler's totient function. Mathematical Approach Let \(T(N)\) denote the number of reduced proper fractions with denominator at most \(N\). The decisive observation is that for each fixed denominator, the allowed numerators are counted by a standard arithmetic function, so the whole problem collapses to a sieve followed by a sum. Count Fractions One Denominator at a Time Fix a denominator \(d \ge 2\). The admissible numerators are the integers \(a\) with \(1 \le a \lt d\) and \(\gcd(a,d)=1\). By definition, the number of such integers is Euler's totient: $\varphi(d)=\#\{\,1 \le a \lt d : \gcd(a,d)=1\,\}.$ Therefore the full answer is $T(N)=\sum_{d=2}^{N}\varphi(d).$ That identity is the real content of the problem: every reduced proper fraction appears exactly once, classified by its denominator....
Detailed mathematical approach
Problem Summary
We must count all reduced proper fractions \(a/b\) with \(0 \lt a \lt b \le 1{,}000{,}000\) and \(\gcd(a,b)=1\). Equivalently, if \(F_N\) is the Farey sequence of order \(N\), the requested quantity is \(|F_N|-2\), because the two boundary fractions \(0/1\) and \(1/1\) are excluded.
The implementations do not enumerate fractions directly. Instead, they group the count by denominator and reduce the problem to one exact summation over Euler's totient function.
Mathematical Approach
Let \(T(N)\) denote the number of reduced proper fractions with denominator at most \(N\). The decisive observation is that for each fixed denominator, the allowed numerators are counted by a standard arithmetic function, so the whole problem collapses to a sieve followed by a sum.
Count Fractions One Denominator at a Time
Fix a denominator \(d \ge 2\). The admissible numerators are the integers \(a\) with \(1 \le a \lt d\) and \(\gcd(a,d)=1\). By definition, the number of such integers is Euler's totient:
$\varphi(d)=\#\{\,1 \le a \lt d : \gcd(a,d)=1\,\}.$
Therefore the full answer is
$T(N)=\sum_{d=2}^{N}\varphi(d).$
That identity is the real content of the problem: every reduced proper fraction appears exactly once, classified by its denominator.
Why the Sieve Update Is Correct
If the distinct prime divisors of \(d\) are \(p_1,\dots,p_k\), then
$\varphi(d)=d\prod_{p\mid d}\left(1-\frac{1}{p}\right).$
The implementations use this product formula in sieve form. They start from \(\varphi(d)=d\) for every \(d\). When a prime \(p\) is processed, every multiple of \(p\) must lose exactly the fraction \(\frac{1}{p}\) of the numbers still being counted, so the update becomes
$\varphi(m)\leftarrow \varphi(m)-\frac{\varphi(m)}{p}, \quad p\mid m.$
This is exact integer arithmetic. Before prime \(p\) is handled, the entry for \(m\) has already been adjusted for smaller prime divisors of \(m\), but the factor \(p\) from the original number \(m\) is still present. So the current value remains divisible by \(p\), and no floating-point approximation is involved.
The Prime-Detection Invariant
The loop scans \(2,3,4,\dots,N\). At the moment it reaches \(i\), the equality \(\varphi(i)=i\) means that no smaller prime has modified that entry. A composite number would already have been touched by its smallest prime factor, so an unchanged entry must be prime.
This is the key invariant behind the sieve: unchanged entries identify primes, and each discovered prime updates all of its multiples exactly once.
Worked Example: \(N=8\)
For denominator \(5\), every numerator \(1,2,3,4\) is coprime to \(5\), so \(\varphi(5)=4\). For denominator \(6\), only \(1\) and \(5\) survive because \(2,3,4\) share a common factor with \(6\), so \(\varphi(6)=2\).
Continuing through all denominators from \(2\) to \(8\) gives
$\varphi(2)=1,\ \varphi(3)=2,\ \varphi(4)=2,\ \varphi(5)=4,\ \varphi(6)=2,\ \varphi(7)=6,\ \varphi(8)=4.$
Hence
$T(8)=1+2+2+4+2+6+4=21.$
This small case makes the structure visible and is also the checkpoint used by the implementations that include a built-in sanity test.
Why a 64-Bit Total Is Necessary
Each individual totient value satisfies \(\varphi(d)\le d\le 10^6\), so the table itself fits comfortably in 32-bit integers. The final sum does not: it grows on the order of \(N^2\). As a result, the accumulator must use a 64-bit integer type in C++ and Java, while Python relies on arbitrary-precision integers automatically.
How the Code Works
Initialize the Totient Table
The C++, Python, and Java implementations allocate an array of length \(N+1\) and store the index itself in each position. After this step, the table represents the unreduced starting point \(\varphi(d)=d\).
Sieve Over the Primes
The main loop runs from \(2\) to \(N\). Whenever an entry is still equal to its index, the current number is prime. The implementation then visits every multiple of that prime and applies the exact update \(\varphi(m)\leftarrow \varphi(m)-\varphi(m)/p\).
By the end of the sweep, each table entry has received one multiplicative correction for every distinct prime dividing its index, so the array contains all totient values up to the limit.
Accumulate the Answer
The final pass computes \(\varphi(2)+\varphi(3)+\cdots+\varphi(N)\). Two of the implementations also verify the checkpoint \(T(8)=21\) before the full run, which is a simple guard against an indexing or sieve mistake.
Complexity Analysis
Initializing the table costs \(O(N)\). The sieve work is
$\sum_{p\le N}\frac{N}{p}=O(N\log\log N),$
where the sum is over primes. The final accumulation is another \(O(N)\) pass, so the overall running time is \(O(N\log\log N)\).
The space usage is \(O(N)\) for the totient table. For \(N=10^6\), this is easily practical, which is why the direct summatory-totient method is the natural exact solution.
Footnotes and References
- Problem page: Project Euler 72 - Counting Fractions
- Euler's totient function: Wikipedia - Euler's totient function
- Farey sequence: Wikipedia - Farey sequence
- Coprime integers: Wikipedia - Coprime integers