Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 401: Sum of Squares of Divisors

View on Project Euler

Project Euler Problem 401 Solution

EulerSolve provides an optimized solution for Project Euler Problem 401, Sum of Squares of Divisors, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary We need the summatory function of squared divisor sums, $\Sigma_2(N)=\sum_{n=1}^{N}\sigma_2(n),\qquad \sigma_2(n)=\sum_{d\mid n} d^2,$ for the very large input \(N=10^{15}\), and we only need the result modulo \(10^9\). A direct evaluation of \(\sigma_2(n)\) for every \(n\le N\) is far too slow, so the implementation reorganizes the summation and then skips large ranges at once. Mathematical Approach Step 1: Exchange the Order of Summation Start from the definition of \(\sigma_2(n)\). If we sum over all \(n\le N\), every divisor \(d\) contributes its square \(d^2\) once for each multiple of \(d\) up to \(N\). Therefore $\Sigma_2(N)=\sum_{n=1}^{N}\sum_{d\mid n} d^2=\sum_{d=1}^{N} d^2\left\lfloor\frac{N}{d}\right\rfloor.$ This identity is the key simplification. Instead of asking for the divisors of each \(n\), we ask how many integers up to \(N\) are divisible by a fixed \(d\). That count is exactly \(\left\lfloor N/d\right\rfloor\). Step 2: Group Equal Quotients into Blocks The transformed sum still runs over \(d=1,2,\dots,N\), but the value $q(d)=\left\lfloor\frac{N}{d}\right\rfloor$ does not change at every step....

Detailed mathematical approach

Problem Summary

We need the summatory function of squared divisor sums,

$\Sigma_2(N)=\sum_{n=1}^{N}\sigma_2(n),\qquad \sigma_2(n)=\sum_{d\mid n} d^2,$

for the very large input \(N=10^{15}\), and we only need the result modulo \(10^9\). A direct evaluation of \(\sigma_2(n)\) for every \(n\le N\) is far too slow, so the implementation reorganizes the summation and then skips large ranges at once.

Mathematical Approach

Step 1: Exchange the Order of Summation

Start from the definition of \(\sigma_2(n)\). If we sum over all \(n\le N\), every divisor \(d\) contributes its square \(d^2\) once for each multiple of \(d\) up to \(N\).

Therefore

$\Sigma_2(N)=\sum_{n=1}^{N}\sum_{d\mid n} d^2=\sum_{d=1}^{N} d^2\left\lfloor\frac{N}{d}\right\rfloor.$

This identity is the key simplification. Instead of asking for the divisors of each \(n\), we ask how many integers up to \(N\) are divisible by a fixed \(d\). That count is exactly \(\left\lfloor N/d\right\rfloor\).

Step 2: Group Equal Quotients into Blocks

The transformed sum still runs over \(d=1,2,\dots,N\), but the value

$q(d)=\left\lfloor\frac{N}{d}\right\rfloor$

does not change at every step. If the current block starts at \(l\), then the common quotient is

$q=\left\lfloor\frac{N}{l}\right\rfloor,$

and the largest index with the same quotient is

$r=\left\lfloor\frac{N}{q}\right\rfloor.$

So one whole block contributes

$q\sum_{d=l}^{r} d^2.$

After adding that block, the next unexplored position is \(r+1\). This is the reason the algorithm is fast: it jumps across entire intervals where the floor value is constant.

Step 3: Evaluate Each Block with a Closed Form

Let

$S_2(x)=\sum_{k=1}^{x} k^2=\frac{x(x+1)(2x+1)}{6}.$

Then the square sum over any interval is just a difference of prefixes:

$\sum_{d=l}^{r} d^2=S_2(r)-S_2(l-1).$

Substituting this into the block formula gives

$\Sigma_2(N)=\sum_{[l,r]} \left\lfloor\frac{N}{l}\right\rfloor \bigl(S_2(r)-S_2(l-1)\bigr).$

This is exactly the computation performed by the implementation.

Step 4: Why the Number of Blocks is Only \(O(\sqrt N)\)

There are two standard ways to see the bound. First, for \(d\le \sqrt N\) there can be at most \(\sqrt N\) starting positions. Second, once \(d>\sqrt N\), the quotient \(\left\lfloor N/d\right\rfloor\) is strictly less than \(\sqrt N\), so there are at most another \(\sqrt N\) distinct quotient values to process.

Hence the number of blocks is at most about \(2\sqrt N\), which turns a hopeless linear scan up to \(10^{15}\) into a practical computation.

Step 5: Exact Modular Evaluation of \(S_2(x)\)

The formula

$S_2(x)=\frac{x(x+1)(2x+1)}{6}$

is mathematically simple, but the product \(x(x+1)(2x+1)\) is enormous when \(x\) is close to \(10^{15}\). The C++, Python, and Java implementations therefore divide out the denominator before multiplying.

That cancellation is always valid. One of \(x\) and \(x+1\) is even, so a factor \(2\) can be removed there. Also, among \(x\), \(x+1\), and \(2x+1\), one is divisible by \(3\), so the remaining factor \(3\) can also be removed exactly. Only after these exact divisions do the implementations reduce modulo \(M\) and multiply the factors.

The C++ implementation additionally uses 128-bit intermediates for safe modular products. Python has arbitrary-precision integers, and the Java version follows the same cancellation pattern before modular multiplication.

Worked Example: \(N=6\)

Using the reordered identity directly,

$\Sigma_2(6)=1^2\left\lfloor\frac{6}{1}\right\rfloor+2^2\left\lfloor\frac{6}{2}\right\rfloor+3^2\left\lfloor\frac{6}{3}\right\rfloor+4^2\left\lfloor\frac{6}{4}\right\rfloor+5^2\left\lfloor\frac{6}{5}\right\rfloor+6^2\left\lfloor\frac{6}{6}\right\rfloor.$

So

$\Sigma_2(6)=1\cdot 6+4\cdot 3+9\cdot 2+16\cdot 1+25\cdot 1+36\cdot 1=113.$

The block method sees the same value in four groups:

$[1,1]\to 6,\qquad [2,2]\to 12,\qquad [3,3]\to 18,\qquad [4,6]\to 77.$

The total is again

$6+12+18+77=113.$

How the Code Works

The implementation keeps an inclusive left boundary \(l\) for the current quotient block. It computes the common quotient \(q=\lfloor N/l\rfloor\), determines the matching right boundary \(r=\lfloor N/q\rfloor\), evaluates the range sum of squares modulo \(M\), multiplies by \(q\) modulo \(M\), adds the contribution to the running answer, and then moves on to \(l=r+1\).

No divisor enumeration is performed inside the block. The heavy work has already been absorbed into the closed form for \(S_2(x)\), so every iteration uses only a constant number of integer divisions and modular products.

Complexity Analysis

The number of quotient blocks is \(O(\sqrt N)\), so the running time is \(O(\sqrt N)\). The memory usage is \(O(1)\), because the algorithm stores only the current block boundaries, the current quotient, and a few modular temporaries. Compared with naive divisor-based enumeration, this is the essential improvement that makes \(N=10^{15}\) feasible.

References

  1. Problem page: https://projecteuler.net/problem=401
  2. Divisor function: Wikipedia — Divisor function
  3. Divisor summatory function: Wikipedia — Divisor summatory function
  4. Faulhaber's formula: Wikipedia — Faulhaber's formula

Mathematical approach · C++ solution · Python solution · Java solution

Previous: Problem 400 · All Project Euler solutions · Next: Problem 402

Need help with a problem? Ask me! 💡
e
✦ Euler GLM 5.2
Hi! I'm Euler. Ask me anything about Project Euler problems, math concepts, or solution approaches! 🧮