Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 642: Sum of Largest Prime Factors

View on Project Euler

Project Euler Problem 642 Solution

EulerSolve provides an optimized solution for Project Euler Problem 642, Sum of Largest Prime Factors, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary For \(n \ge 2\), let \(P(n)\) denote the largest prime factor of \(n\). We must compute $S(N)=\sum_{n=2}^{N} P(n)$ for \(N=201820182018\), modulo \(10^9\). A direct sieve up to \(N\) is completely impractical, so the solution reorganizes the sum by largest prime factor and evaluates the resulting prime sums with a Min_25-style prime-sum transform plus a sparse recursion over prime powers. Mathematical Approach Every integer contributes exactly one prime, namely its largest prime factor. The key is to separate that terminal prime from the rest of the factorization and then count the remaining cofactor in a structured way. Step 1: Separate the terminal prime If \(P(n)=p\), then \(n\) can be written as $n=p\,m,$ where every prime factor of \(m\) is at most \(p\). Conversely, any product \(p\,m\) with \(p\) prime and all prime factors of \(m\) at most \(p\) has largest prime factor \(p\). Therefore $S(N)=\sum_{p\le N} p\,\Psi\left(\left\lfloor\frac{N}{p}\right\rfloor,p\right),$ where \(\Psi(x,y)\) is the number of positive integers \(\le x\) whose prime factors are all \(\le y\). So the whole problem becomes a weighted count of smooth cofactors....

Detailed mathematical approach

Problem Summary

For \(n \ge 2\), let \(P(n)\) denote the largest prime factor of \(n\). We must compute

$S(N)=\sum_{n=2}^{N} P(n)$

for \(N=201820182018\), modulo \(10^9\). A direct sieve up to \(N\) is completely impractical, so the solution reorganizes the sum by largest prime factor and evaluates the resulting prime sums with a Min_25-style prime-sum transform plus a sparse recursion over prime powers.

Mathematical Approach

Every integer contributes exactly one prime, namely its largest prime factor. The key is to separate that terminal prime from the rest of the factorization and then count the remaining cofactor in a structured way.

Step 1: Separate the terminal prime

If \(P(n)=p\), then \(n\) can be written as

$n=p\,m,$

where every prime factor of \(m\) is at most \(p\). Conversely, any product \(p\,m\) with \(p\) prime and all prime factors of \(m\) at most \(p\) has largest prime factor \(p\).

Therefore

$S(N)=\sum_{p\le N} p\,\Psi\left(\left\lfloor\frac{N}{p}\right\rfloor,p\right),$

where \(\Psi(x,y)\) is the number of positive integers \(\le x\) whose prime factors are all \(\le y\). So the whole problem becomes a weighted count of smooth cofactors.

Step 2: Expand each smooth cofactor uniquely

Write the cofactor in increasing prime order:

$m=r_1^{a_1}r_2^{a_2}\cdots r_t^{a_t},\qquad r_1<r_2<\cdots<r_t\le p.$

Then every \(n\le N\) has a unique representation

$n=r_1^{a_1}r_2^{a_2}\cdots r_t^{a_t}p,$

where \(p=P(n)\). This uniqueness is the reason the recursion can enumerate factorizations without double counting: primes are introduced strictly in increasing order.

If the largest prime appears with exponent \(b\ge 1\), then the cofactor contains \(p^{b-1}\), while the final factor \(p\) is the one that contributes to the sum.

Step 3: Derive the recursion

Let \(p_1<p_2<\cdots\) be the primes. For a fixed index \(j\), define

$\Sigma_j(x)=\sum_{p_j\le q\le x,\ q\text{ prime}} q.$

Now suppose we have already fixed a prefix made of smaller prime powers, and the remaining quotient bound is \(R\). Let \(F(R,j)\) be the total contribution from this state when the next admissible prime is \(p_j\). Then

$F(R,j)=\Sigma_j(R)+\sum_{i\ge j,\ p_i^2\le R}\ \sum_{e\ge 1,\ p_i^{e+1}\le R}\left(F\left(\left\lfloor\frac{R}{p_i^e}\right\rfloor,i+1\right)+p_i\right).$

The first term counts the case where we stop immediately with one terminal prime \(q\ge p_j\). The double sum chooses the next prime \(p_i\), inserts \(p_i^e\) into the smooth part, and leaves one more copy of \(p_i\) as the possible largest prime. The extra \(+p_i\) is exactly the contribution from numbers whose largest prime factor is \(p_i\) itself.

The final answer is

$S(N)=F(N,1).$

Step 4: Fast prime sums with a Min_25-style transform

The recursion constantly needs values of \(\Sigma_j(R)\), so it must be able to query prime sums up to many different bounds very quickly. The implementation prepares two synchronized table families:

one indexed directly by \(x\le \lfloor\sqrt N\rfloor\), and one indexed by quotient values \(\left\lfloor N/x\right\rfloor\).

They start from the naive quantities

$C(x)=x-1,\qquad T(x)=\sum_{k=2}^{x}k=\frac{x(x+1)}{2}-1,$

and then remove composite contributions prime by prime in the standard Min_25 manner. After preprocessing, the tables can return both the number of primes up to \(x\) and the sum of primes up to \(x\):

$\pi(x)=\#\{q\le x:q\text{ prime}\},\qquad \Sigma(x)=\sum_{q\le x,\ q\text{ prime}} q.$

Interval prime sums are obtained by subtraction:

$\Sigma_j(R)=\Sigma(R)-\Sigma(p_j-1).$

Step 5: Why every integer is counted once

Take any \(n\le N\) with factorization

$n=r_1^{a_1}r_2^{a_2}\cdots r_t^{a_t}p^b,\qquad r_1<r_2<\cdots<r_t<p.$

The recursion follows one and only one path: it picks \(r_1^{a_1}\), then \(r_2^{a_2}\), and so on, always moving to larger primes. After all smaller primes have been fixed, the terminal prime \(p\) is counted either by an interval prime-sum term when \(b=1\), or by one of the direct \(+p\) terms when \(b\ge 2\). No other path can create the same factorization because the prime order is strictly increasing.

Worked Example: \(N=10\)

The largest prime factors are

$P(2)=2,\ P(3)=3,\ P(4)=2,\ P(5)=5,\ P(6)=3,\ P(7)=7,\ P(8)=2,\ P(9)=3,\ P(10)=5,$

so

$S(10)=2+3+2+5+3+7+2+3+5=32.$

The recursion sees the same total as follows. The root prime-sum term gives

$\Sigma(10)=2+3+5+7=17.$

Then the branch for \(2\) contributes \(2\) for \(4=2^2\), another \(2\) for \(8=2^3\), and \(3+5\) for \(6=2\cdot 3\) and \(10=2\cdot 5\). The branch for \(3\) contributes \(3\) for \(9=3^2\). Hence

$17+2+2+(3+5)+3=32.$

How the Code Works

The C++, Python, and Java implementations follow the same algorithmic structure. They first set \(M=\lfloor\sqrt N\rfloor\) and build two families of tables, one for direct arguments \(x\le M\) and one for quotient arguments \(\lfloor N/x\rfloor\). These tables are initialized with counts of integers from \(2\) onward and with the corresponding arithmetic-series sums.

Next, the implementation sweeps through the primes up to \(M\). Each newly discovered prime updates both families of tables so that composite contributions created by that prime are removed. After this preprocessing, the implementation can answer prime-count and prime-sum queries for every bound that appears later in the recursion.

The recursive stage then explores prime powers in increasing prime order. For a current bound \(R\), it first adds the sum of all admissible terminal primes in the interval \([p_j,R]\). It then tries each next prime \(p_i\) with \(p_i^2\le R\), repeatedly divides the bound by \(p_i\), recurses on the reduced bound with a stricter lower bound on the next prime, and adds one direct contribution of \(p_i\) for each extra copy of that prime. Every addition is reduced modulo \(10^9\).

The Python implementation is only a thin bridge to the same compiled algorithm, so the mathematics and the execution strategy are identical in all three languages.

Complexity Analysis

Let \(M=\lfloor\sqrt N\rfloor\). The preprocessing stores \(O(M)\) values. Its running time follows the usual Min_25-style profile: it is sublinear in \(N\) and grows roughly like \(O(N^{3/4}/\log N)\) for this kind of frontier update. The recursive prime-power search is sparse because it only continues while \(p^2\le R\), so it is far smaller than a full scan up to \(N\). The overall memory usage is \(O(\sqrt N)\).

Footnotes and References

  1. Project Euler problem page: https://projecteuler.net/problem=642
  2. Largest prime factor: Wikipedia - Largest prime factor
  3. Smooth number: Wikipedia - Smooth number
  4. Prime-counting function: Wikipedia - Prime-counting function
  5. Min_25 sieve overview: OI Wiki - Min_25 sieve

Mathematical approach · C++ solution · Python solution · Java solution

Previous: Problem 641 · All Project Euler solutions · Next: Problem 643

Need help with a problem? Ask me! 💡
e
✦ Euler GLM 5.2
Hi! I'm Euler. Ask me anything about Project Euler problems, math concepts, or solution approaches! 🧮