Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 401: Sum of Squares of Divisors

Project Euler Problem 401 Solution

EulerSolve provides an optimized solution for Project Euler Problem 401, Sum of Squares of Divisors, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary We need the summatory function of squared divisor sums, $\Sigma_2(N)=\sum_{n=1}^{N}\sigma_2(n),\qquad \sigma_2(n)=\sum_{d\mid n} d^2,$ for the very large input $N=10^{15}$, and we only need the result modulo $10^9$. A direct evaluation of $\sigma_2(n)$ for every $n\le N$ is far too slow, so the implementation reorganizes the summation and then skips large ranges at once. Mathematical Approach Step 1: Exchange the Order of Summation Start from the definition of $\sigma_2(n)$. If we sum over all $n\le N$, every divisor $d$ contributes its square $d^2$ once for each multiple of $d$ up to $N$. Therefore $\Sigma_2(N)=\sum_{n=1}^{N}\sum_{d\mid n} d^2=\sum_{d=1}^{N} d^2\left\lfloor\frac{N}{d}\right\rfloor.$ This identity is the key simplification. Instead of asking for the divisors of each $n$, we ask how many integers up to $N$ are divisible by a fixed $d$. That count is exactly $\left\lfloor N/d\right\rfloor$. Step 2: Group Equal Quotients into Blocks The transformed sum still runs over $d=1,2,\dots,N$, but the value $q(d)=\left\lfloor\frac{N}{d}\right\rfloor$ does not change at every step....

Detailed mathematical approach

Problem Summary

We need the summatory function of squared divisor sums,

$\Sigma_2(N)=\sum_{n=1}^{N}\sigma_2(n),\qquad \sigma_2(n)=\sum_{d\mid n} d^2,$

for the very large input $N=10^{15}$, and we only need the result modulo $10^9$. A direct evaluation of $\sigma_2(n)$ for every $n\le N$ is far too slow, so the implementation reorganizes the summation and then skips large ranges at once.

Mathematical Approach

Step 1: Exchange the Order of Summation

Start from the definition of $\sigma_2(n)$. If we sum over all $n\le N$, every divisor $d$ contributes its square $d^2$ once for each multiple of $d$ up to $N$.

Therefore

$\Sigma_2(N)=\sum_{n=1}^{N}\sum_{d\mid n} d^2=\sum_{d=1}^{N} d^2\left\lfloor\frac{N}{d}\right\rfloor.$

This identity is the key simplification. Instead of asking for the divisors of each $n$, we ask how many integers up to $N$ are divisible by a fixed $d$. That count is exactly $\left\lfloor N/d\right\rfloor$.

Step 2: Group Equal Quotients into Blocks

The transformed sum still runs over $d=1,2,\dots,N$, but the value

$q(d)=\left\lfloor\frac{N}{d}\right\rfloor$

does not change at every step. If the current block starts at $l$, then the common quotient is

$q=\left\lfloor\frac{N}{l}\right\rfloor,$

and the largest index with the same quotient is

$r=\left\lfloor\frac{N}{q}\right\rfloor.$

So one whole block contributes

$q\sum_{d=l}^{r} d^2.$

After adding that block, the next unexplored position is $r+1$. This is the reason the algorithm is fast: it jumps across entire intervals where the floor value is constant.

Step 3: Evaluate Each Block with a Closed Form

Let

$S_2(x)=\sum_{k=1}^{x} k^2=\frac{x(x+1)(2x+1)}{6}.$

Then the square sum over any interval is just a difference of prefixes:

$\sum_{d=l}^{r} d^2=S_2(r)-S_2(l-1).$

Substituting this into the block formula gives

$\Sigma_2(N)=\sum_{[l,r]} \left\lfloor\frac{N}{l}\right\rfloor \bigl(S_2(r)-S_2(l-1)\bigr).$

This is exactly the computation performed by the implementation.

Step 4: Why the Number of Blocks is Only $O(\sqrt N)$

There are two standard ways to see the bound. First, for $d\le \sqrt N$ there can be at most $\sqrt N$ starting positions. Second, once $d>\sqrt N$, the quotient $\left\lfloor N/d\right\rfloor$ is strictly less than $\sqrt N$, so there are at most another $\sqrt N$ distinct quotient values to process.

Hence the number of blocks is at most about $2\sqrt N$, which turns a hopeless linear scan up to $10^{15}$ into a practical computation.

Step 5: Exact Modular Evaluation of $S_2(x)$

The formula

$S_2(x)=\frac{x(x+1)(2x+1)}{6}$

is mathematically simple, but the product $x(x+1)(2x+1)$ is enormous when $x$ is close to $10^{15}$. The C++, Python, and Java implementations therefore divide out the denominator before multiplying.

That cancellation is always valid. One of $x$ and $x+1$ is even, so a factor $2$ can be removed there. Also, among $x$, $x+1$, and $2x+1$, one is divisible by $3$, so the remaining factor $3$ can also be removed exactly. Only after these exact divisions do the implementations reduce modulo $M$ and multiply the factors.

The C++ implementation additionally uses 128-bit intermediates for safe modular products. Python has arbitrary-precision integers, and the Java version follows the same cancellation pattern before modular multiplication.

Worked Example: $N=6$

Using the reordered identity directly,

$\Sigma_2(6)=1^2\left\lfloor\frac{6}{1}\right\rfloor+2^2\left\lfloor\frac{6}{2}\right\rfloor+3^2\left\lfloor\frac{6}{3}\right\rfloor+4^2\left\lfloor\frac{6}{4}\right\rfloor+5^2\left\lfloor\frac{6}{5}\right\rfloor+6^2\left\lfloor\frac{6}{6}\right\rfloor.$

So

$\Sigma_2(6)=1\cdot 6+4\cdot 3+9\cdot 2+16\cdot 1+25\cdot 1+36\cdot 1=113.$

The block method sees the same value in four groups:

$[1,1]\to 6,\qquad [2,2]\to 12,\qquad [3,3]\to 18,\qquad [4,6]\to 77.$

The total is again

$6+12+18+77=113.$

How the Code Works

The implementation keeps an inclusive left boundary $l$ for the current quotient block. It computes the common quotient $q=\lfloor N/l\rfloor$, determines the matching right boundary $r=\lfloor N/q\rfloor$, evaluates the range sum of squares modulo $M$, multiplies by $q$ modulo $M$, adds the contribution to the running answer, and then moves on to $l=r+1$.

No divisor enumeration is performed inside the block. The heavy work has already been absorbed into the closed form for $S_2(x)$, so every iteration uses only a constant number of integer divisions and modular products.

Complexity Analysis

The number of quotient blocks is $O(\sqrt N)$, so the running time is $O(\sqrt N)$. The memory usage is $O(1)$, because the algorithm stores only the current block boundaries, the current quotient, and a few modular temporaries. Compared with naive divisor-based enumeration, this is the essential improvement that makes $N=10^{15}$ feasible.

References

Problem page: https://projecteuler.net/problem=401
Divisor function: Wikipedia — Divisor function
Divisor summatory function: Wikipedia — Divisor summatory function
Faulhaber's formula: Wikipedia — Faulhaber's formula