Problem 448: Average Least Common Multiple
View on Project EulerProject Euler Problem 448 Solution
EulerSolve provides an optimized solution for Project Euler Problem 448, Average Least Common Multiple, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary Define $A(n)=\frac{1}{n}\sum_{i=1}^{n}\operatorname{lcm}(i,n),\qquad S(N)=\sum_{k=1}^{N}A(k).$ The goal is to evaluate \(S(99999999019)\bmod 999999017\). A direct computation of every least common multiple is far too slow, so the solution rewrites the average in terms of multiplicative functions and then evaluates the resulting summatory functions with floor-division blocks. Mathematical Approach Step 1: Rewrite One Average by Grouping Equal GCDs Fix \(n\). For each \(i\in\{1,\dots,n\}\), let \(d=\gcd(i,n)\), write \(n=dk\), and write \(i=dj\). Then \(\gcd(j,k)=1\), and $\operatorname{lcm}(i,n)=\frac{in}{\gcd(i,n)}=\frac{djn}{d}=nj.$ So the contribution depends only on the reduced residue \(j\) modulo \(k\). For a fixed divisor \(k\mid n\), the admissible \(j\) are exactly the integers \(1\le j\le k\) with \(\gcd(j,k)=1\). Therefore $\sum_{i=1}^{n}\operatorname{lcm}(i,n)=n\sum_{k\mid n}\sum_{\substack{1\le j\le k\\ \gcd(j,k)=1}} j.$ For \(k>1\), reduced residues come in pairs \(j\) and \(k-j\), so their sum is $\sum_{\substack{1\le j\le k\\ \gcd(j,k)=1}} j=\frac{k\varphi(k)}{2}.$ The special case \(k=1\) contributes \(1\)....
Detailed mathematical approach
Problem Summary
Define
$A(n)=\frac{1}{n}\sum_{i=1}^{n}\operatorname{lcm}(i,n),\qquad S(N)=\sum_{k=1}^{N}A(k).$
The goal is to evaluate \(S(99999999019)\bmod 999999017\). A direct computation of every least common multiple is far too slow, so the solution rewrites the average in terms of multiplicative functions and then evaluates the resulting summatory functions with floor-division blocks.
Mathematical Approach
Step 1: Rewrite One Average by Grouping Equal GCDs
Fix \(n\). For each \(i\in\{1,\dots,n\}\), let \(d=\gcd(i,n)\), write \(n=dk\), and write \(i=dj\). Then \(\gcd(j,k)=1\), and
$\operatorname{lcm}(i,n)=\frac{in}{\gcd(i,n)}=\frac{djn}{d}=nj.$
So the contribution depends only on the reduced residue \(j\) modulo \(k\). For a fixed divisor \(k\mid n\), the admissible \(j\) are exactly the integers \(1\le j\le k\) with \(\gcd(j,k)=1\). Therefore
$\sum_{i=1}^{n}\operatorname{lcm}(i,n)=n\sum_{k\mid n}\sum_{\substack{1\le j\le k\\ \gcd(j,k)=1}} j.$
For \(k>1\), reduced residues come in pairs \(j\) and \(k-j\), so their sum is
$\sum_{\substack{1\le j\le k\\ \gcd(j,k)=1}} j=\frac{k\varphi(k)}{2}.$
The special case \(k=1\) contributes \(1\). Combining both cases gives the classical identity
$\sum_{i=1}^{n}\operatorname{lcm}(i,n)=\frac{n}{2}\left(1+\sum_{k\mid n}k\varphi(k)\right),$
hence
$\boxed{A(n)=\frac{1}{2}\left(1+\sum_{k\mid n}k\varphi(k)\right).}$
Step 2: Turn the Outer Sum into a Divisor Sum
Summing the formula for \(A(n)\) over \(1\le n\le N\) yields
$S(N)=\frac{1}{2}\left(N+\sum_{n=1}^{N}\sum_{d\mid n}d\varphi(d)\right).$
Swap the order of summation: every divisor \(d\) contributes once for each multiple of \(d\) up to \(N\), that is, \(\left\lfloor N/d\right\rfloor\) times. Define
$R(N)=\sum_{d=1}^{N}d\varphi(d)\left\lfloor\frac{N}{d}\right\rfloor.$
Then
$\boxed{S(N)=\frac{N+R(N)}{2}.}$
This is the key reduction: the original lcm-average problem becomes a weighted divisor summatory problem.
Step 3: Evaluate \(R(N)\) by Quotient Blocks
Introduce the weighted totient prefix sum
$P(x)=\sum_{n\le x}n\varphi(n).$
When \(\left\lfloor N/d\right\rfloor\) is constant on an interval \([l,r]\), we can collapse that whole interval into one prefix-difference:
$R(N)=\sum_{[l,r]}\left\lfloor\frac{N}{l}\right\rfloor\bigl(P(r)-P(l-1)\bigr).$
The intervals are determined by the standard rule \(v=\left\lfloor N/l\right\rfloor\), \(r=\left\lfloor N/v\right\rfloor\). There are only \(O(\sqrt{N})\) distinct quotient values, so block decomposition removes the need to scan every \(d\le N\) individually.
Step 4: Express \(P(x)\) with the Möbius Function
Use the identity
$\varphi(n)=\sum_{d\mid n}\mu(d)\frac{n}{d}.$
Multiplying by \(n\) gives
$n\varphi(n)=\sum_{d\mid n}d\mu(d)\left(\frac{n}{d}\right)^2.$
Now sum over \(n\le x\), write \(n=dt\), and separate the variables:
$P(x)=\sum_{d\le x}d\mu(d)\sum_{t\le x/d}t^2.$
Let
$U(m)=\sum_{t=1}^{m}t^2=\frac{m(m+1)(2m+1)}{6}.$
Then
$\boxed{P(x)=\sum_{d\le x}d\mu(d)\,U\!\left(\left\lfloor\frac{x}{d}\right\rfloor\right).}$
So the only missing ingredient is a fast way to evaluate the prefix sum of \(d\mu(d)\).
Step 5: Recurrence for the Weighted Möbius Prefix
Define
$M(x)=\sum_{n\le x}n\mu(n).$
Set \(f(n)=n\mu(n)\). The Dirichlet-convolution identity \(\operatorname{id}*\mu=\varphi\) implies another useful relation:
$\sum_{d\mid m} d\,f\!\left(\frac{m}{d}\right)=m\sum_{e\mid m}\mu(e)= \begin{cases} 1,&m=1,\\ 0,&m>1. \end{cases}$
Summing this over \(1\le m\le x\) and exchanging the order of summation gives
$\sum_{d\le x} d\,M\!\left(\left\lfloor\frac{x}{d}\right\rfloor\right)=1.$
Separating the term \(d=1\) yields the recursion
$\boxed{M(x)=1-\sum_{d=2}^{x} d\,M\!\left(\left\lfloor\frac{x}{d}\right\rfloor\right).}$
Again, \(\left\lfloor x/d\right\rfloor\) is constant on quotient blocks, so this becomes
$M(x)=1-\sum_{[l,r]\subseteq[2,x]}\left(\sum_{d=l}^{r}d\right)M\!\left(\left\lfloor\frac{x}{l}\right\rfloor\right).$
The arithmetic progression sum \(\sum_{d=l}^{r}d=\frac{(l+r)(r-l+1)}{2}\) is what makes each block computable in constant time.
Worked Example: \(n=10\) and \(N=10\)
For \(n=10\), the divisors are \(1,2,5,10\), and
$1\cdot\varphi(1)=1,\qquad 2\cdot\varphi(2)=2,\qquad 5\cdot\varphi(5)=20,\qquad 10\cdot\varphi(10)=40.$
Therefore
$A(10)=\frac{1}{2}(1+1+2+20+40)=32.$
For the full prefix \(N=10\), compute
$R(10)=\sum_{d=1}^{10}d\varphi(d)\left\lfloor\frac{10}{d}\right\rfloor=274,$
so
$S(10)=\frac{10+274}{2}=142.$
This agrees with a direct brute-force evaluation of the first ten averages.
How the Code Works
The C++, Python, and Java implementations all follow the same plan. They precompute \(\mu(n)\), \(\varphi(n)\), and the small prefix sums of \(n\mu(n)\) and \(n\varphi(n)\) up to a threshold \(L\approx N^{2/3}\) using a linear sieve. For arguments above that threshold, they evaluate the two large prefix functions recursively, cache every large result, and always group terms by equal floor-division quotients. The final summation for \(R(N)\) uses the same block structure. Because the modulus is prime, the divisions by \(2\) and \(6\) are carried out through modular inverses.
Complexity Analysis
The sieve up to \(L\approx N^{2/3}\) costs \(O(L)\) time and \(O(L)\) memory. Each uncached large query visits only the distinct intervals on which a floor quotient is constant, rather than every integer one by one. With memoization, the total amount of large-query work stays on the same practical scale as the preprocessing. For the target size \(N\approx 10^{11}\), this reduces the problem from impossible brute force to an \(O(N^{2/3})\)-scale method in time and memory.
Footnotes and References
- Problem page: https://projecteuler.net/problem=448
- Euler's totient function: Wikipedia — Euler's totient function
- Möbius function: Wikipedia — Möbius function
- Dirichlet convolution: Wikipedia — Dirichlet convolution
- Floor-division block technique: cp-algorithms — divisor summatory techniques