Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 51: Prime Digit Replacements

View on Project Euler

Project Euler Problem 51 Solution

EulerSolve provides an optimized solution for Project Euler Problem 51, Prime Digit Replacements, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary We seek the smallest prime \(p\) for which there exists a non-empty set of digit positions such that replacing all of those positions by the same digit \(r\in\{0,\dots,9\}\) produces eight primes. The chosen positions do not need to be adjacent, but every replacement uses one common digit across all chosen positions. The standard example \(56**3\) shows the phenomenon: replacing the two middle zeros in \(56003\) by \(0,1,\dots,9\) produces seven primes. Problem 51 asks for the first family whose prime count reaches eight. Mathematical Approach The C++, Python, and Java implementations treat the problem as a search over decimal patterns. For a fixed prime, the decisive mathematical objects are a target digit, a non-empty subset of the positions carrying that digit, and the replacement digit applied to all of those positions at once. Defining a replacement family Write the prime \(p\) in decimal as \(a_0a_1\dots a_{k-1}\). For a digit \(\tau\in\{0,\dots,9\}\), let $P_\tau=\{i:a_i=\tau\}.$ If \(M\subseteq P_\tau\) is non-empty and \(r\in\{0,\dots,9\}\), define the candidate \(N_r(p,\tau,M)\) by $b_i(r)=\begin{cases} r, & i\in M,\\ a_i, & i\notin M. \end{cases}$ Then \(N_r(p,\tau,M)\) is the integer with digits \(b_0(r)b_1(r)\dots b_{k-1}(r)\). If \(0\in M\) and \(r=0\), the result has a leading zero and is discarded....

Detailed mathematical approach

Problem Summary

We seek the smallest prime \(p\) for which there exists a non-empty set of digit positions such that replacing all of those positions by the same digit \(r\in\{0,\dots,9\}\) produces eight primes. The chosen positions do not need to be adjacent, but every replacement uses one common digit across all chosen positions.

The standard example \(56**3\) shows the phenomenon: replacing the two middle zeros in \(56003\) by \(0,1,\dots,9\) produces seven primes. Problem 51 asks for the first family whose prime count reaches eight.

Mathematical Approach

The C++, Python, and Java implementations treat the problem as a search over decimal patterns. For a fixed prime, the decisive mathematical objects are a target digit, a non-empty subset of the positions carrying that digit, and the replacement digit applied to all of those positions at once.

Defining a replacement family

Write the prime \(p\) in decimal as \(a_0a_1\dots a_{k-1}\). For a digit \(\tau\in\{0,\dots,9\}\), let

$P_\tau=\{i:a_i=\tau\}.$

If \(M\subseteq P_\tau\) is non-empty and \(r\in\{0,\dots,9\}\), define the candidate \(N_r(p,\tau,M)\) by

$b_i(r)=\begin{cases} r, & i\in M,\\ a_i, & i\notin M. \end{cases}$

Then \(N_r(p,\tau,M)\) is the integer with digits \(b_0(r)b_1(r)\dots b_{k-1}(r)\). If \(0\in M\) and \(r=0\), the result has a leading zero and is discarded. The prime family attached to \((p,\tau,M)\) is therefore

$F(p,\tau,M)=\{N_r(p,\tau,M):0\le r\le 9,\ N_r(p,\tau,M)\text{ is valid and prime}\}.$

The problem becomes: find the smallest prime \(p\) for which \(|F(p,\tau,M)|\ge 8\) for some \(\tau\) and some non-empty \(M\).

Why only equal digits need to be grouped together

If a prime \(p\) belongs to a valid family, then one of the ten replacements must reproduce \(p\) itself. In that replacement, every selected position receives the same digit. Therefore the selected positions in \(p\) must already contain one common digit. This proves that every family containing \(p\) can be recovered by choosing one digit value already present in \(p\) and then choosing a non-empty subset of the positions where that digit occurs.

That observation exactly matches the implementations: they first collect all positions of each decimal digit and then enumerate only non-empty subsets inside those groups. No valid family containing the current prime is missed by doing so.

Family size and the canonical representative

For fixed \((p,\tau,M)\), there are only ten possible replacements, so the family size is just

$|F(p,\tau,M)|=\sum_{r=0}^{9} I_r,$

where \(I_r=1\) if \(N_r(p,\tau,M)\) is a valid prime and \(I_r=0\) otherwise.

The same replacement pattern can be rediscovered from several members of the same family. To avoid accepting a later member, define

$m(p,\tau,M)=\min F(p,\tau,M).$

The implementations accept the current prime only when both \(|F(p,\tau,M)|\) reaches the required family size and \(p=m(p,\tau,M)\). Because primes are scanned in increasing order, the first accepted prime is automatically the smallest prime that belongs to any qualifying family.

Worked example: the seven-prime family from \(56003\)

Take \(p=56003\). The digit \(0\) appears at positions \(2\) and \(3\), so choose \(\tau=0\) and \(M=\{2,3\}\). Then every candidate has the form

$N_r=56rr3.$

The ten replacements are 56003, 56113, 56223, 56333, 56443, 56553, 56663, 56773, 56883, and 56993. Among them, 56223, 56553, and 56883 are composite, while the other seven are prime. Hence

$|F(56003,0,\{2,3\})|=7.$

This is the family cited in the problem statement, and it illustrates the exact object that the implementations enumerate for every prime.

Why the search is finite for each prime

If the digit \(\tau\) appears \(m_\tau\) times in a \(k\)-digit prime, then there are \(2^{m_\tau}-1\) non-empty subsets of \(P_\tau\). So the total number of replacement masks checked for that prime is

$\sum_{\tau=0}^{9}\bigl(2^{m_\tau}-1\bigr),$

where digits with \(m_\tau=0\) contribute nothing. Since the position sets \(P_0,\dots,P_9\) partition the \(k\) digit positions, we have

$\sum_{\tau=0}^{9} m_\tau=k$

and the counted subsets are only a portion of all non-empty subsets of the \(k\) positions, so

$\sum_{\tau=0}^{9}\bigl(2^{m_\tau}-1\bigr)\le 2^k-1.$

Each such mask generates exactly ten candidate numbers. That is why exhaustive search is practical once primality tests are cheap.

How the Code Works

Prime table and fallback primality test

The C++, Python, and Java implementations first build a sieve up to a configurable limit. Any candidate within that range is tested by direct table lookup. If a candidate lies above the sieve range, the implementations fall back to ordinary trial division up to \(\sqrt{n}\), so correctness does not depend on the sieve alone.

Enumerating repeated-digit subsets

Primes are scanned in increasing order, starting from 11. For each prime, the implementation records every position occupied by each decimal digit. For each digit group it loops through all non-empty subsets, substitutes replacement digits \(0\) through \(9\), discards leading-zero cases, and counts how many resulting values are prime.

Returning only the minimal family member

While processing one subset, the implementation also tracks the smallest prime produced anywhere in that family. A prime is returned only if the family size reaches the requested target and the current prime is itself the smallest prime in that family. This prevents the same family from being reported again from a larger member.

The built-in checkpoint uses the smaller target family size 6. In that easier setting, the first accepted prime is \(13\), obtained by replacing the leading digit in the sequence \(13,23,33,\dots,93\) while rejecting \(03\) because of the leading zero.

Complexity Analysis

Let \(L\) be the sieve limit. Building the sieve costs \(O(L\log\log L)\) time and \(O(L)\) memory. After that, the search phase iterates over primes \(p\le L\).

If \(p\) has \(k\) digits and digit multiplicities \(m_0,\dots,m_9\), then the family-generation work for that one prime is

$O\!\left(10\sum_{\tau=0}^{9}\bigl(2^{m_\tau}-1\bigr)\right),$

because every admissible subset is tried against the ten replacement digits. Since this sum is at most \(2^k-1\), the per-prime search cost is bounded by \(O(10\cdot 2^k)\). Under the default search limit, \(k\le 7\), so the exponential factor stays tiny. Inside the sieve range, each primality test is \(O(1)\); outside it, the fallback test is \(O(\sqrt{n})\).

In practice the method is fast because the digit length remains small and the search space is tightly constrained by the repeated-digit structure.

Footnotes and References

  1. Problem page: Project Euler 51
  2. Prime number: Wikipedia - Prime number
  3. Sieve of Eratosthenes: Wikipedia - Sieve of Eratosthenes
  4. Primality test: Wikipedia - Primality test
  5. Positional notation: Wikipedia - Positional notation
  6. Power set: Wikipedia - Power set

Mathematical approach · C++ solution · Python solution · Java solution

Previous: Problem 50 · All Project Euler solutions · Next: Problem 52

Need help with a problem? Ask me! 💡
e
✦ Euler GLM 5.2
Hi! I'm Euler. Ask me anything about Project Euler problems, math concepts, or solution approaches! 🧮