Problem 931: Totient Graph
View on Project EulerProject Euler Problem 931 Solution
EulerSolve provides an optimized solution for Project Euler Problem 931, Totient Graph, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary For each positive integer \(n\), consider the graph whose vertices are the positive divisors of \(n\). There is an upward edge from \(b\) to \(a\) exactly when \(a/b\) is prime, so each edge multiplies a divisor by one prime factor and nothing else. The quantity attached to \(n\) is the total edge weight $T(n)=\sum_{\substack{b\mid n,\ a\mid n\\ a/b\text{ prime}}}\bigl(\varphi(a)-\varphi(b)\bigr).$ The problem asks for the prefix sum $S(N)=\sum_{n=1}^{N} T(n)\pmod{715827883},\qquad N=10^{12}.$ A direct computation would have to build a divisor graph for every \(n\le N\), which is far too expensive. The implementations succeed because the graph sum collapses to a clean prime-power formula, and the global prefix can then be reorganized into prime-power loops plus a prime-prefix query structure. Mathematical Approach The key is to understand one fixed \(n\) exactly, and only then sum over all \(n\le N\). The divisor graph splits by prime direction Write $n=\prod_i p_i^{e_i}.$ Every allowed edge in the graph is obtained by choosing one prime \(p\mid n\) and moving from a divisor \(d\) to \(pd\), provided \(pd\mid n\). So the whole graph decomposes into independent families of edges, one family for each prime dividing \(n\)....
Detailed mathematical approach
Problem Summary
For each positive integer \(n\), consider the graph whose vertices are the positive divisors of \(n\). There is an upward edge from \(b\) to \(a\) exactly when \(a/b\) is prime, so each edge multiplies a divisor by one prime factor and nothing else. The quantity attached to \(n\) is the total edge weight
$T(n)=\sum_{\substack{b\mid n,\ a\mid n\\ a/b\text{ prime}}}\bigl(\varphi(a)-\varphi(b)\bigr).$
The problem asks for the prefix sum
$S(N)=\sum_{n=1}^{N} T(n)\pmod{715827883},\qquad N=10^{12}.$
A direct computation would have to build a divisor graph for every \(n\le N\), which is far too expensive. The implementations succeed because the graph sum collapses to a clean prime-power formula, and the global prefix can then be reorganized into prime-power loops plus a prime-prefix query structure.
Mathematical Approach
The key is to understand one fixed \(n\) exactly, and only then sum over all \(n\le N\).
The divisor graph splits by prime direction
Write
$n=\prod_i p_i^{e_i}.$
Every allowed edge in the graph is obtained by choosing one prime \(p\mid n\) and moving from a divisor \(d\) to \(pd\), provided \(pd\mid n\). So the whole graph decomposes into independent families of edges, one family for each prime dividing \(n\).
Fix one prime \(p\) with exact exponent \(p^e\parallel n\), and write
$n=p^e m,\qquad p\nmid m.$
Then every edge whose ratio is \(p\) has the form
$p^r d \longrightarrow p^{r+1} d,$
with \(0\le r<e\) and \(d\mid m\). So to find the total contribution of the \(p\)-direction, we only need to sum the totient differences on these edges.
Summing one prime direction exactly
Because \(p\nmid d\), Euler's totient function satisfies
$\varphi(p^r d)=\varphi(p^r)\varphi(d).$
There are two cases.
If \(r=0\), then
$\varphi(pd)-\varphi(d)=(p-1)\varphi(d)-\varphi(d)=(p-2)\varphi(d).$
If \(r\ge 1\), then
$\varphi(p^{r+1}d)-\varphi(p^r d)=p^r(p-1)\varphi(d)-p^{r-1}(p-1)\varphi(d)=p^{r-1}(p-1)^2\varphi(d).$
Therefore the full contribution of all \(p\)-edges is
$\sum_{d\mid m}\varphi(d)\left[(p-2)+\sum_{r=1}^{e-1}p^{r-1}(p-1)^2\right].$
The bracket simplifies to
$\begin{aligned} (p-2)+(p-1)^2\sum_{r=1}^{e-1}p^{r-1} &=(p-2)+(p-1)^2\frac{p^{e-1}-1}{p-1} \\ &=(p-2)+(p-1)(p^{e-1}-1) \\ &=p^e-p^{e-1}-1. \end{aligned}$
Now use the classical identity
$\sum_{d\mid m}\varphi(d)=m.$
So the entire \(p\)-direction contributes
$m\bigl(p^e-p^{e-1}-1\bigr)=n-\frac{n}{p}-\frac{n}{p^e}.$
Summing over the distinct prime powers of \(n\) gives the closed form used by the code:
$\boxed{T(n)=\sum_{p^e\parallel n}\left(n-\frac{n}{p}-\frac{n}{p^e}\right).}$
Worked example: \(n=12\)
The divisors of \(12\) are \(1,2,3,4,6,12\). The prime-ratio edges are
$1\to2,\quad 1\to3,\quad 2\to4,\quad 2\to6,\quad 3\to6,\quad 4\to12,\quad 6\to12.$
Their weights are
$0,\ 1,\ 1,\ 1,\ 0,\ 2,\ 2,$
because
$\varphi(1)=1,\ \varphi(2)=1,\ \varphi(3)=2,\ \varphi(4)=2,\ \varphi(6)=2,\ \varphi(12)=4.$
So \(T(12)=7\).
The closed form gives exactly the same result. Since \(12=2^2\cdot 3\),
$T(12)=\left(12-\frac{12}{2}-\frac{12}{2^2}\right)+\left(12-\frac{12}{3}-\frac{12}{3}\right)=3+4=7.$
This example shows what the formula is really doing: it adds the total contribution of each prime direction in the divisor graph.
Reordering the prefix sum by exact prime powers
To compute \(S(N)\), rewrite every \(n\) contributing to the term for \(p^e\) as
$n=p^e m,\qquad p\nmid m.$
For such an \(n\), the corresponding summand equals
$n-\frac{n}{p}-\frac{n}{p^e}=m\bigl(p^e-p^{e-1}-1\bigr).$
Hence
$S(N)=\sum_{p^e\le N}\bigl(p^e-p^{e-1}-1\bigr)\sum_{\substack{m\le N/p^e\\ p\nmid m}} m.$
The inner sum is simple once the multiples of \(p\) are removed. Define the triangular sum
$\operatorname{Tri}(x)=\frac{x(x+1)}{2}.$
Then
$\sum_{\substack{m\le x\\ p\nmid m}} m=\operatorname{Tri}(x)-p\,\operatorname{Tri}\!\left(\left\lfloor\frac{x}{p}\right\rfloor\right),$
because the excluded multiples are \(p,2p,\dots,p\lfloor x/p\rfloor\).
So the whole prefix sum becomes
$\boxed{S(N)=\sum_{p^e\le N}\bigl(p^e-p^{e-1}-1\bigr)\left[\operatorname{Tri}\!\left(\left\lfloor\frac{N}{p^e}\right\rfloor\right)-p\,\operatorname{Tri}\!\left(\left\lfloor\frac{N}{p^{e+1}}\right\rfloor\right)\right].}$
Why primes above \(\sqrt N\) can be batched together
The implementations split the summation at \(\sqrt N\).
If \(p\le \sqrt N\), the code can iterate over \(p,p^2,p^3,\dots\) directly.
If \(p>\sqrt N\), then \(p^2>N\), so only the exponent \(e=1\) can occur. In that case
$x=\left\lfloor\frac{N}{p}\right\rfloor<p,$
which means there is no positive multiple of \(p\) inside \(\{1,\dots,x\}\). Therefore
$\sum_{\substack{m\le x\\ p\nmid m}}m=\operatorname{Tri}(x),$
and the contribution of a large prime is simply
$\bigl(p-2\bigr)\operatorname{Tri}\!\left(\left\lfloor\frac{N}{p}\right\rfloor\right).$
Now group large primes by the constant quotient
$q=\left\lfloor\frac{N}{p}\right\rfloor.$
All primes in the interval
$L=\left\lfloor\frac{N}{q+1}\right\rfloor+1,\qquad R=\left\lfloor\frac{N}{q}\right\rfloor$
share this same \(q\), so their total contribution is
$\operatorname{Tri}(q)\sum_{L\le p\le R}(p-2).$
This reduces the large-prime side to interval queries for
$\pi(x)=\#\{p\le x\},\qquad P(x)=\sum_{p\le x}p.$
Once those two prime-prefix functions are available, every large-prime block can be evaluated as
$\operatorname{Tri}(q)\Bigl(P(R)-P(L-1)-2\bigl(\pi(R)-\pi(L-1)\bigr)\Bigr).$
How the Code Works
Validation comes first
The reference implementation does not assume the closed form blindly. It first builds the divisor graph directly for small \(n\), evaluates the edge sum from the definition, and checks that this agrees with the prime-power formula for all \(n\) in a small validation range. It also verifies several prefix values, including \(S(10)=26\) and \(S(100)=5282\), before moving on to the full \(N=10^{12}\) computation.
Compressed prime-prefix preprocessing
To answer many prime-count and prime-sum queries quickly, the implementation stores the distinct floor-quotients
$\left\lfloor\frac{N}{1}\right\rfloor,\left\lfloor\frac{N}{2}\right\rfloor,\left\lfloor\frac{N}{3}\right\rfloor,\dots$
instead of every integer up to \(N\). There are only \(O(\sqrt N)\) distinct values of this form. On that compressed coordinate set it initializes counts and sums of all integers at least \(2\), then removes composite contributions prime by prime. After that preprocessing, the code can query \(\pi(x)\) and \(P(x)\) for every relevant \(x\) in constant time.
Accumulating the answer
The final sum is assembled in two phases. First, every prime \(p\le\sqrt N\) is traversed through its powers \(p,p^2,p^3,\dots\), and the exact prime-power formula is added. Second, primes above \(\sqrt N\) are processed in quotient blocks with constant \(q=\lfloor N/p\rfloor\), using prime counts and prime sums on intervals.
All arithmetic is reduced modulo \(715827883\). The C++ implementation uses 128-bit intermediates before taking residues, the Python implementation relies on arbitrary-precision integers, and the Java entry point delegates the heavy computation to the compiled C++ solver and returns its result.
Complexity Analysis
The distinct values of \(\lfloor N/t\rfloor\) form an \(O(\sqrt N)\)-sized state space, and both the large-prime grouping and the query structure are built around that compression. Memory usage is therefore \(O(\sqrt N)\).
Time is dominated by the compressed prime-prefix sieve together with the sweep over primes up to \(\sqrt N\) and their powers. The important point is qualitative: the algorithm is far below linear in \(N\) and avoids any attempt to enumerate divisor graphs for all \(n\le 10^{12}\). That is what makes the computation practical.
Footnotes and References
- Problem page: https://projecteuler.net/problem=931
- Euler's totient function: Wikipedia - Euler's totient function
- Divisor: Wikipedia - Divisor
- Prime-counting function: Wikipedia - Prime-counting function
- Dirichlet hyperbola method: Wikipedia - Dirichlet hyperbola method