Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 216: Investigating the Primality of 2n²-1

Project Euler Problem 216 Solution

EulerSolve provides an optimized solution for Project Euler Problem 216, Investigating the Primality of 2n²-1, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary For integers $n \ge 2$, define $t_n=2n^2-1.$ Problem 216 asks for the number of indices $2 \le n \le 50{,}000{,}000$ for which $t_n$ is prime. A naive solution would evaluate fifty million numbers of size about $5\times10^{15}$ and run a primality test on each of them. The implementations take a more number-theoretic route: instead of proving primality one value at a time, they sieve the indices $n$ for which some smaller prime is known to divide $2n^2-1$. The survivors are exactly the prime values. Mathematical Approach The key observation is that divisibility of $2n^2-1$ by a fixed prime turns into a modular square-root problem, so the sequence can be attacked with a residue-class sieve on $n$. Why sieving only up to $\sqrt{2N^2-1}$ is enough Let $N=50{,}000{,}000$. If $t_n$ is composite, then it has a prime divisor $p$ with $p \le \sqrt{t_n} \le \sqrt{2N^2-1}.$ Therefore every composite value $2n^2-1$ is detected by some prime at most $\sqrt{2N^2-1}$. Conversely, if no prime in that range divides $t_n$, then $t_n$ has no proper prime factor and must itself be prime. This is the invariant that makes the sieve correct. Since $2n^2-1$ is always odd, the prime $2$ can never divide it, so only odd primes need to be examined. Which primes can divide $2n^2-1$?...

Detailed mathematical approach

Problem Summary

For integers $n \ge 2$, define

$t_n=2n^2-1.$

Problem 216 asks for the number of indices $2 \le n \le 50{,}000{,}000$ for which $t_n$ is prime.

A naive solution would evaluate fifty million numbers of size about $5\times10^{15}$ and run a primality test on each of them. The implementations take a more number-theoretic route: instead of proving primality one value at a time, they sieve the indices $n$ for which some smaller prime is known to divide $2n^2-1$. The survivors are exactly the prime values.

Mathematical Approach

The key observation is that divisibility of $2n^2-1$ by a fixed prime turns into a modular square-root problem, so the sequence can be attacked with a residue-class sieve on $n$.

Why sieving only up to $\sqrt{2N^2-1}$ is enough

Let $N=50{,}000{,}000$. If $t_n$ is composite, then it has a prime divisor $p$ with

$p \le \sqrt{t_n} \le \sqrt{2N^2-1}.$

Therefore every composite value $2n^2-1$ is detected by some prime at most $\sqrt{2N^2-1}$. Conversely, if no prime in that range divides $t_n$, then $t_n$ has no proper prime factor and must itself be prime. This is the invariant that makes the sieve correct.

Since $2n^2-1$ is always odd, the prime $2$ can never divide it, so only odd primes need to be examined.

Which primes can divide $2n^2-1$?

If an odd prime $p$ divides $t_n$, then

$2n^2\equiv1\pmod p,$

hence

$n^2\equiv2^{-1}\pmod p.$

For odd $p$, the inverse of $2$ modulo $p$ is

$2^{-1}\equiv\frac{p+1}{2}\pmod p,$

because $2\cdot\frac{p+1}{2}=p+1\equiv1\pmod p$.

So the congruence has a solution exactly when $2^{-1}$, and therefore also $2$, is a quadratic residue modulo $p$. By the classical formula

$\left(\frac{2}{p}\right)=(-1)^{(p^2-1)/8},$

this happens precisely for

$p\equiv1\text{ or }7\pmod8.$

That congruence filter removes half of the odd primes immediately. Any prime with $p\equiv3$ or $5\pmod8$ cannot divide a value of the form $2n^2-1$.

From modular roots to arithmetic progressions

For each eligible prime $p$, solve

$n^2\equiv\frac{p+1}{2}\pmod p.$

If one solution is $r$, then the second is $-r\pmod p$. Every integer $n$ satisfying

$n\equiv r\pmod p \qquad\text{or}\qquad n\equiv-r\pmod p$

makes $p\mid(2n^2-1)$. Therefore the composite values are found by marking at most two arithmetic progressions for each relevant prime.

This is the central reduction: the original primality problem over very large integers is converted into a structured sieve over residue classes of the index $n$.

How the modular roots are computed

The implementations split the root-finding step into two cases.

If $p\equiv7\pmod8$, then $p\equiv3\pmod4$, and $2$ has the explicit square root

$\sqrt{2}\equiv2^{(p+1)/4}\pmod p.$

Multiplying by $2^{-1}$ gives a square root of $2^{-1}$:

$r\equiv2^{(p+1)/4}\cdot\frac{p+1}{2}\pmod p.$

If $p\equiv1\pmod8$, there is no equally short special formula in general, so the implementations use the Tonelli-Shanks algorithm to compute a square root of $(p+1)/2$ modulo $p$. In both cases the second residue class is obtained by negation modulo $p$.

The exceptional case $2n^2-1=p$

One subtlety is essential. When the sieve processes a prime $p$, it marks every index $n$ such that $p\mid(2n^2-1)$. But sometimes that value is not composite at all: it can equal $p$ itself. The equality

$2n^2-1=p$

is equivalent to

$n^2=\frac{p+1}{2}.$

So there is exactly one index to skip whenever $(p+1)/2$ is an ordinary integer square. The implementations test that condition once per prime, and if it holds they leave that single $n$ unmarked while still marking the rest of the progression.

Without this correction, every prime of the special form $p=2m^2-1$ would incorrectly eliminate its own generating index $n=m$.

Worked example: $p=7$ and the range $2\le n\le8$

Take $p=7$. Then

$2^{-1}\equiv4\pmod7,$

so the relevant congruence is

$n^2\equiv4\pmod7.$

Its two roots are $n\equiv2$ and $n\equiv5\pmod7$. Therefore every $n$ in either residue class makes $7\mid(2n^2-1)$.

Now list the values for $2\le n\le8$:

$7,\ 17,\ 31,\ 49,\ 71,\ 97,\ 127.$

The classes $2$ and $5$ modulo $7$ hit $n=2$ and $n=5$ in this interval. The first one must be skipped because

$2\cdot2^2-1=7,$

which is prime, not composite. The second one is correctly marked because

$2\cdot5^2-1=49=7^2.$

This tiny example already contains the whole method: solve one modular equation, mark its residue classes, and protect the one possible self-factor case.

How the Code Works

The C++, Python, and Java implementations follow the same algorithmic structure. They first generate all primes up to $\sqrt{2N^2-1}$, then maintain an array indexed by $n$ recording whether $2n^2-1$ has been proved composite by at least one sieving prime.

Prime generation and filtering

The first phase is a standard prime sieve. Each prime is then filtered by its residue modulo $8$: $2$ is discarded because $2n^2-1$ is odd, and primes congruent to $3$ or $5$ modulo $8$ are ignored because the congruence $n^2\equiv2^{-1}\pmod p$ has no solution for them.

Root finding and progression marking

For every remaining prime, the implementation computes one root of $(p+1)/2$ modulo $p$. When $p\equiv7\pmod8$, it uses the direct power formula; when $p\equiv1\pmod8$, it uses Tonelli-Shanks. The second root is the negative of the first one modulo $p$.

Those roots define up to two arithmetic progressions of valid indices. The implementation advances from the first term in each progression that is at least $2$, then steps forward by $p$, marking every visited index as composite.

Skipping the self-factor and counting the survivors

Before the marking loop begins, the implementation checks whether $(p+1)/2$ is an exact integer square. If so, that single index is excluded from marking, because its value is $p$ itself rather than a composite multiple of $p$.

After all relevant primes have been processed, the answer is the number of unmarked indices $n\ge2$. By the sieving invariant, these are exactly the $n$ for which $2n^2-1$ is prime.

Complexity Analysis

Let $N$ denote the upper limit for $n$. The initial prime sieve runs up to $\sqrt{2}N$, so its cost is on the usual order of

$O(N\log\log N).$

The dominant work is the marking of arithmetic progressions. For each relevant prime $p$, at most two residue classes are visited, which gives a total cost comparable to

$\sum_{p\le\sqrt{2}N} O\!\left(\frac{N}{p}\right)=O(N\log\log N).$

The modular square-root computations add extra work, but they do not change the overall asymptotic behavior of the sieve. Memory usage is $O(N)$, because the implementations store one boolean or byte flag for each index $n$.

Footnotes and References

Project Euler problem page: https://projecteuler.net/problem=216
Quadratic residue: https://en.wikipedia.org/wiki/Quadratic_residue
Legendre symbol: https://en.wikipedia.org/wiki/Legendre_symbol
Tonelli-Shanks algorithm: https://en.wikipedia.org/wiki/Tonelli%E2%80%93Shanks_algorithm
Sieve of Eratosthenes: https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes