Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 10: Summation of Primes

Project Euler Problem 10 Solution

EulerSolve provides an optimized solution for Project Euler Problem 10, Summation of Primes, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary For a given upper bound $L$, compute $S(L)=\sum_{\substack{p\lt L\\ p\text{ prime}}} p.$ The inequality is strict: if $L$ itself is prime, it is still excluded. When $L\le 2$, the sum is $0$ because there are no primes below $2$. The task is not to test one number for primality, but to add all primes below a large bound without repeating the same divisibility work for each candidate. The implementation therefore classifies every integer from $0$ to $L-1$ in one coordinated pass and then sums the entries that survive. Mathematical Approach The solution uses the Sieve of Eratosthenes. Instead of treating each integer independently, it keeps a table of numbers that are still possible primes and progressively deletes the composite ones. The Quantity Being Summed A convenient way to write the target is $S(L)=\sum_{n=2}^{L-1} n\,\mathbf{1}_{n\text{ is prime}}.$ So the real objective is to build the indicator $\mathbf{1}_{n\text{ is prime}}$ for every $n\lt L$. Once that information is known, the sum itself is just a linear scan through the surviving indices. The Sieve State and Its Invariant Begin with a Boolean table indexed by $0,1,\dots,L-1$. Mark $0$ and $1$ as non-prime, and treat every integer from $2$ onward as provisionally prime. Then inspect candidate bases in increasing order....

Detailed mathematical approach

Problem Summary

For a given upper bound $L$, compute

$S(L)=\sum_{\substack{p\lt L\\ p\text{ prime}}} p.$

The inequality is strict: if $L$ itself is prime, it is still excluded. When $L\le 2$, the sum is $0$ because there are no primes below $2$.

The task is not to test one number for primality, but to add all primes below a large bound without repeating the same divisibility work for each candidate. The implementation therefore classifies every integer from $0$ to $L-1$ in one coordinated pass and then sums the entries that survive.

Mathematical Approach

The solution uses the Sieve of Eratosthenes. Instead of treating each integer independently, it keeps a table of numbers that are still possible primes and progressively deletes the composite ones.

The Quantity Being Summed

A convenient way to write the target is

$S(L)=\sum_{n=2}^{L-1} n\,\mathbf{1}_{n\text{ is prime}}.$

So the real objective is to build the indicator $\mathbf{1}_{n\text{ is prime}}$ for every $n\lt L$. Once that information is known, the sum itself is just a linear scan through the surviving indices.

The Sieve State and Its Invariant

Begin with a Boolean table indexed by $0,1,\dots,L-1$. Mark $0$ and $1$ as non-prime, and treat every integer from $2$ onward as provisionally prime. Then inspect candidate bases in increasing order.

The key invariant is this: after all prime bases smaller than $p$ have been processed, every number below $L$ that has a prime factor smaller than $p$ has already been marked composite. Therefore, if the entry for $p$ is still unmarked when the scan reaches it, then $p$ must itself be prime. If it were composite, it would have a least prime factor $r\lt p$, and the pass for $r$ would already have crossed out $p$.

Why Crossing Off Starts at $p^2$

Once a prime $p$ is confirmed, the sieve marks

$p^2,\ p^2+p,\ p^2+2p,\ \dots$

as composite. The first term is $p^2$, not $2p$, for a precise reason. Any smaller multiple $kp$ with $2\le k\lt p$ already has a factor smaller than $p$, so it was removed earlier while processing that smaller factor or one of its prime divisors.

This is where the sieve gains most of its efficiency: the inner pass never restarts from the full list of multiples.

Why It Is Enough to Stop at $\sqrt{L}$

Let $n\lt L$ be composite. Write $n=ab$ with $2\le a\le b$. Then $a^2\le n\lt L$, so $a\lt \sqrt{L}$. In particular, $n$ has a prime factor $r\le a\lt \sqrt{L}$. That prime factor will appear as a sieve base before the outer loop finishes, and its marking pass will cross out $n$.

So once the scan reaches a value with $p^2\ge L$, no composite below $L$ can still be waiting for its first prime factor to be processed. Every remaining unmarked entry is prime, and the algorithm can move directly to summation.

Worked Example: $L=30$

Initially, the candidates are $2,3,4,\dots,29$.

Processing $p=2$ removes

$4,6,8,10,12,14,16,18,20,22,24,26,28.$

Processing $p=3$ removes

$9,12,15,18,21,24,27,$

though some of those were already removed by $2$.

The candidate $4$ is skipped because it is already known to be composite. Next, $p=5$ removes only $25$, since the marking starts at $5^2$. After that, $6^2>30$, so the marking phase is over.

The numbers still marked prime are

$2,3,5,7,11,13,17,19,23,29,$

and therefore

$S(30)=129.$

The same mechanism explains the standard checkpoint $S(10)=2+3+5+7=17$.

How the Code Works

The C++, Python, and Java implementations all use the same three-phase structure.

Initialize the Table and Handle Small Limits

If the bound is at most $2$, the answer is returned immediately as $0$. Otherwise, the implementation allocates a Boolean table of length $L$, sets the entries for $0$ and $1$ to false, and leaves the rest provisionally true.

Strike Composite Multiples in Increasing Base Order

The outer scan starts at $2$ and continues while $p^2\lt L$. Whenever the current entry is still marked prime, the implementation crosses out the arithmetic progression

$p^2,\ p^2+p,\ p^2+2p,\ \dots \lt L.$

Candidates already known to be composite are skipped automatically, so only genuine prime bases launch a marking pass.

Accumulate the Surviving Primes

After the sieve phase, a second linear pass adds every index that remains marked prime. The typed implementations use a 64-bit accumulator because the total for the problem input is far beyond 32-bit range; the Python implementation relies on arbitrary-precision integers.

Complexity Analysis

Initializing the Boolean table costs $O(L)$ time. The marking work is bounded by

$\sum_{\substack{p\le \sqrt{L}\\ p\text{ prime}}}\left(\left\lfloor\frac{L-1-p^2}{p}\right\rfloor+1\right)=O\!\left(L\sum_{p\le\sqrt{L}}\frac{1}{p}\right)=O(L\log\log L).$

The final accumulation pass adds another $O(L)$, so the total running time is $O(L\log\log L)$. The Boolean table uses $O(L)$ memory.

Footnotes and References

Project Euler problem page: Problem 10 - Summation of Primes
Sieve of Eratosthenes: Wikipedia
Algorithmic discussion of the sieve: cp-algorithms
Background on prime numbers: Wikipedia - Prime number