Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 10: Summation of Primes

View on Project Euler

Project Euler Problem 10 Solution

EulerSolve provides an optimized solution for Project Euler Problem 10, Summation of Primes, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary For a given upper bound \(L\), compute $S(L)=\sum_{\substack{p\lt L\\ p\text{ prime}}} p.$ The inequality is strict: if \(L\) itself is prime, it is still excluded. When \(L\le 2\), the sum is \(0\) because there are no primes below \(2\). The task is not to test one number for primality, but to add all primes below a large bound without repeating the same divisibility work for each candidate. The implementation therefore classifies every integer from \(0\) to \(L-1\) in one coordinated pass and then sums the entries that survive. Mathematical Approach The solution uses the Sieve of Eratosthenes. Instead of treating each integer independently, it keeps a table of numbers that are still possible primes and progressively deletes the composite ones. The Quantity Being Summed A convenient way to write the target is $S(L)=\sum_{n=2}^{L-1} n\,\mathbf{1}_{n\text{ is prime}}.$ So the real objective is to build the indicator \(\mathbf{1}_{n\text{ is prime}}\) for every \(n\lt L\). Once that information is known, the sum itself is just a linear scan through the surviving indices. The Sieve State and Its Invariant Begin with a Boolean table indexed by \(0,1,\dots,L-1\). Mark \(0\) and \(1\) as non-prime, and treat every integer from \(2\) onward as provisionally prime. Then inspect candidate bases in increasing order....

Detailed mathematical approach

Problem Summary

For a given upper bound \(L\), compute

$S(L)=\sum_{\substack{p\lt L\\ p\text{ prime}}} p.$

The inequality is strict: if \(L\) itself is prime, it is still excluded. When \(L\le 2\), the sum is \(0\) because there are no primes below \(2\).

The task is not to test one number for primality, but to add all primes below a large bound without repeating the same divisibility work for each candidate. The implementation therefore classifies every integer from \(0\) to \(L-1\) in one coordinated pass and then sums the entries that survive.

Mathematical Approach

The solution uses the Sieve of Eratosthenes. Instead of treating each integer independently, it keeps a table of numbers that are still possible primes and progressively deletes the composite ones.

The Quantity Being Summed

A convenient way to write the target is

$S(L)=\sum_{n=2}^{L-1} n\,\mathbf{1}_{n\text{ is prime}}.$

So the real objective is to build the indicator \(\mathbf{1}_{n\text{ is prime}}\) for every \(n\lt L\). Once that information is known, the sum itself is just a linear scan through the surviving indices.

The Sieve State and Its Invariant

Begin with a Boolean table indexed by \(0,1,\dots,L-1\). Mark \(0\) and \(1\) as non-prime, and treat every integer from \(2\) onward as provisionally prime. Then inspect candidate bases in increasing order.

The key invariant is this: after all prime bases smaller than \(p\) have been processed, every number below \(L\) that has a prime factor smaller than \(p\) has already been marked composite. Therefore, if the entry for \(p\) is still unmarked when the scan reaches it, then \(p\) must itself be prime. If it were composite, it would have a least prime factor \(r\lt p\), and the pass for \(r\) would already have crossed out \(p\).

Why Crossing Off Starts at \(p^2\)

Once a prime \(p\) is confirmed, the sieve marks

$p^2,\ p^2+p,\ p^2+2p,\ \dots$

as composite. The first term is \(p^2\), not \(2p\), for a precise reason. Any smaller multiple \(kp\) with \(2\le k\lt p\) already has a factor smaller than \(p\), so it was removed earlier while processing that smaller factor or one of its prime divisors.

This is where the sieve gains most of its efficiency: the inner pass never restarts from the full list of multiples.

Why It Is Enough to Stop at \(\sqrt{L}\)

Let \(n\lt L\) be composite. Write \(n=ab\) with \(2\le a\le b\). Then \(a^2\le n\lt L\), so \(a\lt \sqrt{L}\). In particular, \(n\) has a prime factor \(r\le a\lt \sqrt{L}\). That prime factor will appear as a sieve base before the outer loop finishes, and its marking pass will cross out \(n\).

So once the scan reaches a value with \(p^2\ge L\), no composite below \(L\) can still be waiting for its first prime factor to be processed. Every remaining unmarked entry is prime, and the algorithm can move directly to summation.

Worked Example: \(L=30\)

Initially, the candidates are \(2,3,4,\dots,29\).

Processing \(p=2\) removes

$4,6,8,10,12,14,16,18,20,22,24,26,28.$

Processing \(p=3\) removes

$9,12,15,18,21,24,27,$

though some of those were already removed by \(2\).

The candidate \(4\) is skipped because it is already known to be composite. Next, \(p=5\) removes only \(25\), since the marking starts at \(5^2\). After that, \(6^2>30\), so the marking phase is over.

The numbers still marked prime are

$2,3,5,7,11,13,17,19,23,29,$

and therefore

$S(30)=129.$

The same mechanism explains the standard checkpoint \(S(10)=2+3+5+7=17\).

How the Code Works

The C++, Python, and Java implementations all use the same three-phase structure.

Initialize the Table and Handle Small Limits

If the bound is at most \(2\), the answer is returned immediately as \(0\). Otherwise, the implementation allocates a Boolean table of length \(L\), sets the entries for \(0\) and \(1\) to false, and leaves the rest provisionally true.

Strike Composite Multiples in Increasing Base Order

The outer scan starts at \(2\) and continues while \(p^2\lt L\). Whenever the current entry is still marked prime, the implementation crosses out the arithmetic progression

$p^2,\ p^2+p,\ p^2+2p,\ \dots \lt L.$

Candidates already known to be composite are skipped automatically, so only genuine prime bases launch a marking pass.

Accumulate the Surviving Primes

After the sieve phase, a second linear pass adds every index that remains marked prime. The typed implementations use a 64-bit accumulator because the total for the problem input is far beyond 32-bit range; the Python implementation relies on arbitrary-precision integers.

Complexity Analysis

Initializing the Boolean table costs \(O(L)\) time. The marking work is bounded by

$\sum_{\substack{p\le \sqrt{L}\\ p\text{ prime}}}\left(\left\lfloor\frac{L-1-p^2}{p}\right\rfloor+1\right)=O\!\left(L\sum_{p\le\sqrt{L}}\frac{1}{p}\right)=O(L\log\log L).$

The final accumulation pass adds another \(O(L)\), so the total running time is \(O(L\log\log L)\). The Boolean table uses \(O(L)\) memory.

Footnotes and References

  1. Project Euler problem page: Problem 10 - Summation of Primes
  2. Sieve of Eratosthenes: Wikipedia
  3. Algorithmic discussion of the sieve: cp-algorithms
  4. Background on prime numbers: Wikipedia - Prime number

Mathematical approach · C++ solution · Python solution · Java solution

Previous: Problem 9 · All Project Euler solutions · Next: Problem 11

Need help with a problem? Ask me! 💡
e
✦ Euler GLM 5.2
Hi! I'm Euler. Ask me anything about Project Euler problems, math concepts, or solution approaches! 🧮