Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 659: Largest Prime

Project Euler Problem 659 Solution

EulerSolve provides an optimized solution for Project Euler Problem 659, Largest Prime, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary For each positive integer $k$, define $N_k=4k^2+1,$ and let $P(k)$ denote the largest prime factor of $N_k$. The goal is to compute $S(K)=\sum_{k=1}^{K} P(k)\pmod{10^{18}},\qquad K=10^7.$ Factoring each $N_k$ independently would be far too slow, so the efficient approach is to sieve prime divisors across all $k$ at once. Mathematical Approach The key observation is that a prime divisor of $4k^2+1$ forces $k$ into very specific residue classes. That turns the problem into a structured sieve on arithmetic progressions rather than $10^7$ separate factorizations. Step 1: Restrict the Shape of the Prime Divisors If a prime $p$ divides $4k^2+1$, then $4k^2\equiv -1\pmod p,$ so $-1$ must be a quadratic residue modulo $p$. Since $4k^2+1$ is always odd, $p=2$ never occurs. For an odd prime, the existence of a solution to $x^2\equiv -1\pmod p$ implies $p\equiv 1\pmod 4.$ Therefore only primes congruent to $1 \bmod 4$ can appear as prime divisors of the numbers in this family. Step 2: Convert a Prime Divisor into Residue Classes of $k$ Fix an odd prime $p\equiv 1\pmod 4$....

Detailed mathematical approach

Problem Summary

For each positive integer $k$, define

$N_k=4k^2+1,$

and let $P(k)$ denote the largest prime factor of $N_k$. The goal is to compute

$S(K)=\sum_{k=1}^{K} P(k)\pmod{10^{18}},\qquad K=10^7.$

Factoring each $N_k$ independently would be far too slow, so the efficient approach is to sieve prime divisors across all $k$ at once.

Mathematical Approach

The key observation is that a prime divisor of $4k^2+1$ forces $k$ into very specific residue classes. That turns the problem into a structured sieve on arithmetic progressions rather than $10^7$ separate factorizations.

Step 1: Restrict the Shape of the Prime Divisors

If a prime $p$ divides $4k^2+1$, then

$4k^2\equiv -1\pmod p,$

so $-1$ must be a quadratic residue modulo $p$. Since $4k^2+1$ is always odd, $p=2$ never occurs. For an odd prime, the existence of a solution to

$x^2\equiv -1\pmod p$

implies

$p\equiv 1\pmod 4.$

Therefore only primes congruent to $1 \bmod 4$ can appear as prime divisors of the numbers in this family.

Step 2: Convert a Prime Divisor into Residue Classes of $k$

Fix an odd prime $p\equiv 1\pmod 4$. Once we find a square root $x$ of $-1$ modulo $p$, we have

$x^2\equiv -1\pmod p.$

Because $4k^2=(2k)^2$, the divisibility condition $p\mid 4k^2+1$ is equivalent to

$2k\equiv \pm x \pmod p.$

Since $2$ is invertible modulo every odd prime, this becomes

$k\equiv \pm x\cdot 2^{-1}\pmod p.$

So each eligible prime contributes at most two arithmetic progressions in $k$. The implementations use Tonelli-Shanks to obtain the modular square root and then sweep exactly those two progressions.

Step 3: Remove the Full Prime Power

When a progression reaches an index $k$, the current value attached to that index is checked for divisibility by $p$. If $p$ divides it, the algorithm divides by $p$ repeatedly until the factor is gone:

$N_k=p^{v_p(N_k)}\cdot M_k,\qquad p\nmid M_k.$

That repeated division matters because values such as $325=5^2\cdot 13$ contain higher powers. Processing the complete $p$-adic valuation ensures the remaining cofactor is always correct after the sweep for $p$.

Step 4: Track the Largest Processed Prime Factor

The prime sweep runs in increasing order. For each $k$, the implementation stores two pieces of information:

$\text{current remainder of }N_k,\qquad \text{largest processed prime that has divided }N_k.$

Because the primes are handled from small to large, whenever a new prime divides the current remainder it automatically becomes the largest processed prime seen so far for that $k$. After all eligible small primes have been removed, the largest prime factor is simply the larger of those two stored quantities.

Step 5: Why Scanning Primes Only up to $2K$ Is Enough

The implementations generate primes only up to $2K$. This is sufficient because every prime factor $p\le 2K$ of any $N_k$ is removed during the sieve. After that, the remaining cofactor has no prime divisor $\le 2K$.

Suppose the leftover cofactor were composite and greater than $1$. Then it would have at least two prime factors, both strictly larger than $2K$, so it would be at least

$(2K+1)^2=4K^2+4K+1,$

which is larger than every value in the family because

$4k^2+1\le 4K^2+1.$

This contradiction shows that the leftover is either $1$ or one prime. Hence the final answer for each $k$ is

$P(k)=\max(\text{largest processed prime for }k,\ \text{leftover cofactor for }k).$

Worked Example: The First Ten Terms

For $K=10$, the values are small enough to inspect directly:

$\begin{aligned} 4(1)^2+1&=5 &&\Rightarrow P(1)=5,\\ 4(2)^2+1&=17 &&\Rightarrow P(2)=17,\\ 4(3)^2+1&=37 &&\Rightarrow P(3)=37,\\ 4(4)^2+1&=65=5\cdot 13 &&\Rightarrow P(4)=13,\\ 4(5)^2+1&=101 &&\Rightarrow P(5)=101,\\ 4(6)^2+1&=145=5\cdot 29 &&\Rightarrow P(6)=29,\\ 4(7)^2+1&=197 &&\Rightarrow P(7)=197,\\ 4(8)^2+1&=257 &&\Rightarrow P(8)=257,\\ 4(9)^2+1&=325=5^2\cdot 13 &&\Rightarrow P(9)=13,\\ 4(10)^2+1&=401 &&\Rightarrow P(10)=401. \end{aligned}$

Therefore

$S(10)=5+17+37+13+101+29+197+257+13+401=1070.$

This is also a useful checkpoint for the implementation. The sieve interpretation is visible here: the prime $5$ hits the residue classes $k\equiv 1,4\pmod 5$, so among the first ten indices it strips factors from $k=1,4,6,9$.

How the Code Works

The C++, Python, and Java implementations all follow the same pipeline. First they build all primes up to $2K$ with a linear sieve. Then they initialize an array containing the values $4k^2+1$ and a second array that stores the largest prime factor already confirmed during the sweep.

For each prime $p\equiv 1\pmod 4$, the implementation computes a square root of $-1$ modulo $p$, converts that root into the two valid residue classes of $k$, and walks those arithmetic progressions with step $p$. At each visited index, if the current remainder is divisible by $p$, the code divides out all powers of $p$ and updates the stored largest processed prime.

After every eligible prime up to $2K$ has been handled, each index $k$ has a remaining cofactor that is either $1$ or prime. The implementation takes the maximum of that leftover value and the largest processed prime, adds it to the running total, and reduces the sum modulo $10^{18}$.

Complexity Analysis

Generating all primes up to $2K$ with a linear sieve costs $O(K)$ time and $O(K)$ memory. The residue-class sweeps contribute about

$\sum_{\substack{p\le 2K\\ p\equiv 1\!\!\!\pmod 4}} \frac{2K}{p},$

which has the usual near-harmonic sieve growth, so the overall running time is $O(K\log\log K)$ in practice. The repeated divisions by prime powers do not change that asymptotic bound, and the dominant storage remains the two arrays of length $K+1$, so the memory usage is $O(K)$.

Footnotes and References

Problem page: https://projecteuler.net/problem=659
Tonelli-Shanks algorithm: Wikipedia — Tonelli-Shanks algorithm
Quadratic residue: Wikipedia — Quadratic residue
Euler's criterion: Wikipedia — Euler's criterion
Modular square root: Wikipedia — Modular square root