Problem 845: Prime Digit Sum
View on Project EulerProject Euler Problem 845 Solution
EulerSolve provides an optimized solution for Project Euler Problem 845, Prime Digit Sum, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary We seek the \(n\)-th positive integer whose decimal digit sum is prime. A direct scan would be far too slow for the real target, so the solution counts how many valid numbers lie in \([1,x]\), then finds the first \(x\) whose count reaches \(n\). Let \(s(x)\) be the base-10 digit sum of \(x\), and let \(\mathcal{P}\) denote the set of prime numbers. The central counting function is $C(x)=\#\{1\le m\le x : s(m)\in\mathcal{P}\}.$ Once \(C(x)\) can be evaluated quickly, the answer is simply the smallest \(x\) with \(C(x)\ge n\). Mathematical Approach The method is a digit dynamic program. Instead of enumerating every number up to \(x\), it counts possible suffixes by their digit sums and reuses those counts while scanning the decimal representation of \(x\). The implementations reserve tables for at most \(D=20\) decimal positions, so the largest relevant digit sum is \(S=9D=180\). Step 1: Count digit strings by length and sum For \(\ell\ge 0\) and \(0\le t\le S\), define $W(\ell,t)=\#\{(d_1,\dots,d_\ell)\in\{0,\dots,9\}^{\ell}: d_1+\cdots+d_\ell=t\}.$ Leading zeros are allowed. This matters because a string such as \(04\) represents the ordinary number \(4\), so the same table automatically covers shorter numbers when we later compare against a fixed-length bound....
Detailed mathematical approach
Problem Summary
We seek the \(n\)-th positive integer whose decimal digit sum is prime. A direct scan would be far too slow for the real target, so the solution counts how many valid numbers lie in \([1,x]\), then finds the first \(x\) whose count reaches \(n\).
Let \(s(x)\) be the base-10 digit sum of \(x\), and let \(\mathcal{P}\) denote the set of prime numbers. The central counting function is
$C(x)=\#\{1\le m\le x : s(m)\in\mathcal{P}\}.$
Once \(C(x)\) can be evaluated quickly, the answer is simply the smallest \(x\) with \(C(x)\ge n\).
Mathematical Approach
The method is a digit dynamic program. Instead of enumerating every number up to \(x\), it counts possible suffixes by their digit sums and reuses those counts while scanning the decimal representation of \(x\).
The implementations reserve tables for at most \(D=20\) decimal positions, so the largest relevant digit sum is \(S=9D=180\).
Step 1: Count digit strings by length and sum
For \(\ell\ge 0\) and \(0\le t\le S\), define
$W(\ell,t)=\#\{(d_1,\dots,d_\ell)\in\{0,\dots,9\}^{\ell}: d_1+\cdots+d_\ell=t\}.$
Leading zeros are allowed. This matters because a string such as \(04\) represents the ordinary number \(4\), so the same table automatically covers shorter numbers when we later compare against a fixed-length bound.
The initial conditions are
$W(0,0)=1,\qquad W(0,t)=0\text{ for }t>0,$
and the recurrence is
$W(\ell,t)=\sum_{d=0}^{9} W(\ell-1,t-d),$
with the convention that \(W(\ell,u)=0\) whenever \(u \lt 0\). After this table is built, we know how many tails of any remaining length realize any required digit sum.
Step 2: Convert those counts into prime-completing suffix counts
Suppose a prefix already contributes digit sum \(p\), and there are \(r\) digits still to choose. We need the number of tails that make the final total prime. Define
$G(r,p)=\sum_{\substack{u\ge 0\\ p+u\in\mathcal{P}}} W(r,u).$
Thus \(G(r,p)\) counts all length-\(r\) suffixes whose digit sum \(u\) turns the current prefix sum \(p\) into a prime total \(p+u\). This second table becomes the counting oracle used for every query.
Because the largest possible total is only \(180\), primality can be precomputed once by a small sieve and then reused for every state \((r,p)\).
Step 3: Evaluate \(C(x)\) digit by digit
Write the decimal expansion of \(x\) as \(x_0x_1\dots x_{k-1}\), and let
$\sigma_i=x_0+x_1+\cdots+x_{i-1}$
be the sum of the digits strictly before position \(i\), with \(\sigma_0=0\).
While scanning from left to right, position \(i\) offers two possibilities: follow the actual digit \(x_i\), or choose any smaller digit \(d \lt x_i\). If we choose a smaller digit, then the remaining \(k-1-i\) positions are completely free, so their contribution is exactly
$G(k-1-i,\sigma_i+d).$
Summing this over every position and every smaller digit yields all valid numbers strictly below \(x\). After the scan, we add one more if the digit sum of \(x\) itself is prime. In compact form,
$C(x)=\sum_{i=0}^{k-1}\sum_{d=0}^{x_i-1} G(k-1-i,\sigma_i+d)+\varepsilon(x),$
where \(\varepsilon(x)=1\) when \(s(x)\in\mathcal{P}\), and \(\varepsilon(x)=0\) otherwise.
Step 4: Why leading zeros handle shorter numbers automatically
If \(x\) has \(k\) digits, then every positive integer with fewer than \(k\) digits is still represented as a length-\(k\) string by padding it with leading zeros. For example, \(37\) is treated as \(037\) when compared against a 3-digit bound.
This removes the need for separate cases for 1-digit, 2-digit, and longer numbers. The number \(0\) causes no trouble, because its digit sum is \(0\), and \(0\notin\mathcal{P}\).
Step 5: Recover the \(n\)-th valid number by monotone search
The function \(C(x)\) is nondecreasing, because enlarging \([1,x]\) can only add more valid integers. Therefore the target value is
$\min\{x\ge 1 : C(x)\ge n\}.$
The implementations first grow an upper bound by repeated doubling until the count reaches \(n\). Once such a bound is found, ordinary binary search isolates the first \(x\) with \(C(x)\ge n\).
Worked Example: Why the 61st term is \(157\)
A useful checkpoint is that the 61st positive integer with prime digit sum is \(157\). The digit-DP explains this neatly.
First consider all numbers from \(0\) to \(99\). Using 2-digit strings with leading zeros, the prime digit sums are \(2,3,5,7,11,13,17\). The table \(W(2,t)\) gives
$W(2,2)=3,\quad W(2,3)=4,\quad W(2,5)=6,\quad W(2,7)=8,\quad W(2,11)=8,\quad W(2,13)=6,\quad W(2,17)=2,$
so
$G(2,0)=3+4+6+8+8+6+2=37.$
Thus exactly \(37\) numbers in \([1,99]\) have prime digit sum.
Now move to numbers below \(157\). Fix the hundreds digit as \(1\). For a tens digit smaller than \(5\), the current prefix sums are \(1,2,3,4,5\). The 1-digit suffix table gives
$G(1,1)=4,\quad G(1,2)=5,\quad G(1,3)=4,\quad G(1,4)=4,\quad G(1,5)=4,$
hence the block \(100\) to \(149\) contributes
$4+5+4+4+4=21.$
Finally, within \(150\) to \(156\), only \(151\) and \(155\) have prime digit sum, so this last partial block contributes \(2\). Therefore
$C(156)=37+21+2=60.$
Since \(1+5+7=13\) is prime, \(157\) itself is valid, giving
$C(157)=61.$
So the 61st term is indeed \(157\), exactly matching the checkpoint used by the implementations.
How the Code Works
The C++, Python, and Java implementations follow the same pipeline. They first precompute primality for every possible digit sum from \(0\) to \(180\). Next they build the table \(W(\ell,t)\) for all lengths up to \(20\), then build the suffix-completion table \(G(r,p)\).
For a query \(C(x)\), the implementation converts \(x\) to decimal, scans left to right, and whenever a smaller digit than the bound is possible at the current position, it adds the precomputed number of prime-completing tails. After the full scan, it checks whether the bound itself has prime digit sum and includes it if appropriate.
To obtain the \(n\)-th valid integer, the implementation repeatedly doubles an upper bound until enough valid numbers are covered, and then performs a standard binary search on that interval. No brute-force pass over all intermediate integers is needed.
Complexity Analysis
Let \(D\) be the maximum number of digits and \(S=9D\) the maximum digit sum. Building the prime table costs \(O(S\log\log S)\). Building \(W\) costs \(O(10DS)\). Building the suffix table \(G\) in the straightforward form used here costs \(O(DS^2)\), because each state \((r,p)\) scans all feasible tail sums.
One evaluation of \(C(x)\) costs \(O(10D)\): there are \(D\) positions and at most 10 candidate digits per position. The final search uses \(O(\log X)\) such evaluations, where \(X\) is the answer. Memory usage is \(O(DS)\). In these implementations, \(D=20\) and \(S=180\), so the constants are very small.
Footnotes and References
- Problem page: https://projecteuler.net/problem=845
- Digit sum: Wikipedia — Digit sum
- Dynamic programming: Wikipedia — Dynamic programming
- Prime number: Wikipedia — Prime number
- Binary search algorithm: Wikipedia — Binary search algorithm