Problem 387: Harshad Numbers
View on Project EulerProject Euler Problem 387 Solution
EulerSolve provides an optimized solution for Project Euler Problem 387, Harshad Numbers, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary We want the sum of all primes \(p \lt L\) such that deleting the last decimal digit of \(p\) leaves a strong right-truncatable Harshad number . In the local solution files the default bound is \(L = 10^{14}\). Let \(s(n)\) denote the sum of the decimal digits of \(n\). A positive integer \(n\) is a Harshad number if $s(n)\mid n.$ It is strong if $\frac{n}{s(n)}$ is prime. It is right-truncatable if every non-empty decimal prefix is Harshad. Writing \(n=\overline{d_1d_2\cdots d_k}\), each prefix \(\overline{d_1d_2\cdots d_i}\) for \(1 \le i \le k\) must satisfy the Harshad divisibility condition. Every candidate answer prime has a unique decomposition $p = 10h + d,\qquad d = p \bmod 10,\qquad h=\left\lfloor\frac{p}{10}\right\rfloor.$ The problem therefore reduces to generating exactly those prefixes \(h\) that are strong and right-truncatable Harshad numbers, then checking whether one more digit turns them into a prime below \(L\). Mathematical Approach Step 1: Search the prefix, not the final prime A brute-force approach would iterate over primes \(p \lt L\), remove the last digit, and test the whole prefix chain. The repository code does the opposite: it constructs only valid Harshad prefixes and never explores arbitrary integers. All one-digit numbers \(1,\dots,9\) are Harshad because for a single decimal digit we have \(s(n)=n\), hence \(n/s(n)=1\)....
Detailed mathematical approach
Problem Summary
We want the sum of all primes \(p \lt L\) such that deleting the last decimal digit of \(p\) leaves a strong right-truncatable Harshad number. In the local solution files the default bound is \(L = 10^{14}\).
Let \(s(n)\) denote the sum of the decimal digits of \(n\). A positive integer \(n\) is a Harshad number if
$s(n)\mid n.$
It is strong if
$\frac{n}{s(n)}$
is prime. It is right-truncatable if every non-empty decimal prefix is Harshad. Writing \(n=\overline{d_1d_2\cdots d_k}\), each prefix \(\overline{d_1d_2\cdots d_i}\) for \(1 \le i \le k\) must satisfy the Harshad divisibility condition.
Every candidate answer prime has a unique decomposition
$p = 10h + d,\qquad d = p \bmod 10,\qquad h=\left\lfloor\frac{p}{10}\right\rfloor.$
The problem therefore reduces to generating exactly those prefixes \(h\) that are strong and right-truncatable Harshad numbers, then checking whether one more digit turns them into a prime below \(L\).
Mathematical Approach
Step 1: Search the prefix, not the final prime
A brute-force approach would iterate over primes \(p \lt L\), remove the last digit, and test the whole prefix chain. The repository code does the opposite: it constructs only valid Harshad prefixes and never explores arbitrary integers.
All one-digit numbers \(1,\dots,9\) are Harshad because for a single decimal digit we have \(s(n)=n\), hence \(n/s(n)=1\). These values are the roots of the search tree.
Step 2: Extend prefixes with a digit-sum recurrence
Suppose the current node is a Harshad number \(n\) with digit sum \(s=s(n)\). Appending a digit \(d\in\{0,1,\dots,9\}\) gives
$n' = 10n + d,\qquad s(n') = s + d.$
The child is useful only if it is Harshad, namely
$10n + d \equiv 0 \pmod{s+d}.$
This test is exactly what the DFS applies before recursing. Because every child is created from an already valid Harshad parent, right-truncatability is automatic by induction: each visited node has a Harshad parent chain all the way back to a one-digit root.
Step 3: Identify strong Harshad prefixes immediately
Whenever a visited prefix \(n \ge 10\) is Harshad, the code computes
$q=\frac{n}{s(n)}.$
If \(q\) is prime, then \(n\) is strong. This is the decisive filter, because a final answer prime must come from a strong prefix. One-digit prefixes are excluded from this stage in the code since they would only produce \(q=1\), which is not prime.
Step 4: Only four last digits can produce a prime
For any prime \(p \gt 10\), the last digit cannot be even and cannot be \(5\). Therefore every strong prefix \(h\) produces at most four relevant prime candidates:
$p = 10h + d,\qquad d \in \{1,3,7,9\}.$
The implementations test each such \(p\) for the two remaining conditions: \(p \lt L\) and \(p\) is prime.
If both hold, \(p\) is added to the total. This characterization is complete and duplicate-free, because decimal truncation is unique: every valid prime determines exactly one prefix \(h=\lfloor p/10 \rfloor\).
Step 5: Bound the recursion by the largest useful prefix
If \(p=10h+d \lt L\), then necessarily
$h \le \left\lfloor\frac{L-1}{10}\right\rfloor.$
The C++ solution stores this value in harshad_limit. Any Harshad number larger than this can never be the prefix of a valid answer, so exploring beyond it is pointless.
The recursive stop test in the code is slightly sharper. If
$n \gt \left\lfloor\frac{\left\lfloor(L-1)/10\right\rfloor}{10}\right\rfloor,$
then every child \(10n+d\) already exceeds harshad_limit. The current node may still produce final primes \(10n+d\), but it cannot produce deeper Harshad prefixes that will later matter.
Worked Example
Take \(n=18\). Its digit sum is \(s(18)=9\), and \(18/9=2\), so \(18\) is a strong Harshad number. Since \(1\) and \(18\) are both Harshad, it is also right-truncatable.
The only possible prime endings are
$181,\ 183,\ 187,\ 189.$
Among these, \(181\) is prime, \(183\) and \(189\) are divisible by \(3\), and \(187=11\cdot 17\). Hence \(181\) is one of the required primes.
The local C++ program also validates the sample checkpoint
$\sum_{p \lt 10^4} p = 90619,$
and then cross-checks the optimized solver against a brute-force routine at \(L=10^6\). Those checks are consistent with the derivation above.
How the Code Works
The three local implementations use the same core state: the current prefix \(n\) and its digit sum \(s(n)\). Carrying the digit sum forward via \(s(n')=s(n)+d\) makes each tree transition constant time instead of recomputing all decimal digits from scratch.
The C++ file exposes solve(limit), optional argument parsing with --limit=..., and a --skip-checkpoints flag. Its DFS function dfs_harshad returns the sum contributed by one subtree. For primality it first removes small prime divisors up to \(37\), then runs deterministic Miller-Rabin with bases
$\{2,3,5,7,11,13,17\}.$
To avoid overflow in modular multiplication, C++ uses __uint128_t.
The Python file implements the same recurrence and the same Miller-Rabin base set, but accumulates the answer in a nonlocal variable inside dfs. The Java file mirrors the same DFS structure in dfsHarshad and delegates modular exponentiation to BigInteger.modPow. Mathematically all three versions are identical.
Complexity Analysis
Let \(H(L)\) be the number of right-truncatable Harshad prefixes actually visited, and let \(S(L)\) be the number of those prefixes that are strong. The DFS work per visited node is \(O(1)\): one digit-sum update and one divisibility test for each appended digit.
Each strong prefix triggers at most four primality tests, so a precise description is
$O\bigl(H(L) + 4S(L)\,T_{\mathrm{prime}}\bigr),$
where \(T_{\mathrm{prime}}\) is the cost of one Miller-Rabin test in the 64-bit range used here. Since \(S(L)\le H(L)\), this is often summarized as
$O\bigl(H(L)\,T_{\mathrm{prime}}\bigr).$
The recursion depth is the number of decimal digits of the largest explored prefix, so the auxiliary stack space is \(O(\log_{10} L)\). This is far smaller than scanning all integers or all primes below \(L\).
Footnotes and References
- Problem page: https://projecteuler.net/problem=387
- Harshad numbers: https://en.wikipedia.org/wiki/Harshad_number
- Miller-Rabin primality test: https://en.wikipedia.org/wiki/Miller%E2%80%93Rabin_primality_test
- Deterministic 64-bit primality testing reference: https://cp-algorithms.com/algebra/primality_tests.html