Problem 159: Digital Root Sums of Factorisations
View on Project EulerProject Euler Problem 159 Solution
EulerSolve provides an optimized solution for Project Euler Problem 159, Digital Root Sums of Factorisations, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary For every positive integer \(n\), let \(\operatorname{dr}(n)\) be its digital root, obtained by repeatedly summing decimal digits until only one digit remains. If $n=f_1f_2\cdots f_k,\qquad f_i \ge 2,$ define the digital-root sum of that multiplicative factorisation by $\operatorname{drs}(f_1,\dots,f_k)=\sum_{i=1}^{k}\operatorname{dr}(f_i).$ The quantity of interest is $\operatorname{mdrs}(n)=\max \operatorname{drs}(f_1,\dots,f_k),$ where the maximum runs over all multiplicative factorizations of \(n\) into integers at least 2, including the trivial one-part factorisation \(n\) itself. The problem asks for $\sum_{n=2}^{10^6-1}\operatorname{mdrs}(n).$ A brute-force search over all factorizations of every integer below one million would repeat the same subproblems many times. The implementations avoid that by turning \(\operatorname{mdrs}\) into a dynamic program on factor pairs. Mathematical Approach The whole solution rests on one observation: once the best value is known for smaller integers, the best value for a larger composite can be assembled from a split \(n=ab\). Digital roots and the score of a factorisation For \(n \ge 1\), the digital root is given directly by $\operatorname{dr}(n)=1+((n-1)\bmod 9).$ So \(\operatorname{dr}(n)\) is always one of \(1,2,\dots,9\), and multiples of 9 have digital root 9....
Detailed mathematical approach
Problem Summary
For every positive integer \(n\), let \(\operatorname{dr}(n)\) be its digital root, obtained by repeatedly summing decimal digits until only one digit remains. If
$n=f_1f_2\cdots f_k,\qquad f_i \ge 2,$
define the digital-root sum of that multiplicative factorisation by
$\operatorname{drs}(f_1,\dots,f_k)=\sum_{i=1}^{k}\operatorname{dr}(f_i).$
The quantity of interest is
$\operatorname{mdrs}(n)=\max \operatorname{drs}(f_1,\dots,f_k),$
where the maximum runs over all multiplicative factorizations of \(n\) into integers at least 2, including the trivial one-part factorisation \(n\) itself. The problem asks for
$\sum_{n=2}^{10^6-1}\operatorname{mdrs}(n).$
A brute-force search over all factorizations of every integer below one million would repeat the same subproblems many times. The implementations avoid that by turning \(\operatorname{mdrs}\) into a dynamic program on factor pairs.
Mathematical Approach
The whole solution rests on one observation: once the best value is known for smaller integers, the best value for a larger composite can be assembled from a split \(n=ab\).
Digital roots and the score of a factorisation
For \(n \ge 1\), the digital root is given directly by
$\operatorname{dr}(n)=1+((n-1)\bmod 9).$
So \(\operatorname{dr}(n)\) is always one of \(1,2,\dots,9\), and multiples of 9 have digital root 9. If we choose not to factor \(n\) at all, then its score is simply \(\operatorname{dr}(n)\). That provides the baseline value for \(\operatorname{mdrs}(n)\).
The exact recurrence for \(\operatorname{mdrs}(n)\)
Take any nontrivial factorisation of \(n\):
$n=f_1f_2\cdots f_k,\qquad k\ge 2.$
Group the factors as \(a=f_1\) and \(b=f_2\cdots f_k\). Then \(n=ab\) with \(a,b\ge 2\), and the score of the original factorisation is at most \(\operatorname{mdrs}(a)+\operatorname{mdrs}(b)\). Conversely, for every split \(n=ab\), concatenating an optimal factorisation of \(a\) with an optimal factorisation of \(b\) produces a factorisation of \(n\) whose score is exactly that sum. Therefore
$\operatorname{mdrs}(n)=\max\!\left(\operatorname{dr}(n),\max_{\substack{ab=n\\a,b\ge 2}}\bigl(\operatorname{mdrs}(a)+\operatorname{mdrs}(b)\bigr)\right).$
This is the central mathematical object used by all three implementations. Prime numbers never enter the inner maximum, so for primes the value stays equal to \(\operatorname{dr}(n)\). Composite numbers are improved by whichever proper factor pair gives the largest sum.
Why one multiplicative sweep is sufficient
The table is initialized with
$\operatorname{mdrs}(n)=\operatorname{dr}(n)\qquad (1\le n<10^6).$
After that, every product \(p=ab\) with \(a,b\ge 2\) is used to relax the table entry for \(p\):
$\operatorname{mdrs}(p)\leftarrow \max\!\bigl(\operatorname{mdrs}(p),\operatorname{mdrs}(a)+\operatorname{mdrs}(b)\bigr).$
The only subtle point is correctness of the loop order. When the outer sweep reaches a value \(a\), all proper factorizations of \(a\) have already been processed, because every factor pair of \(a\) uses a smaller left factor. So \(\operatorname{mdrs}(a)\) is final at that moment. If the companion factor \(b\) is larger and later receives a better value, the same product \(p=ab\) reappears again when the outer sweep eventually reaches \(b\). Hence every unordered factor pair is considered at least once after both component values are final, and a single pass over multiples is enough.
Worked example: \(n=24\)
The direct digital root is \(\operatorname{dr}(24)=6\), so the baseline starts at 6. The proper factor pairs are
$24=2\cdot 12=3\cdot 8=4\cdot 6.$
Using already-computed smaller values gives
$\operatorname{mdrs}(2)+\operatorname{mdrs}(12)=2+8=10,$
$\operatorname{mdrs}(3)+\operatorname{mdrs}(8)=3+8=11,$
$\operatorname{mdrs}(4)+\operatorname{mdrs}(6)=4+6=10.$
So
$\operatorname{mdrs}(24)=11,$
attained by the factorisation \(24=3\cdot 8\). This example is important because it shows why the problem is not solved by the digital root alone: the best multiplicative split can exceed \(\operatorname{dr}(n)\) by a large margin.
How the Code Works
Build the baseline table
The C++, Python, and Java implementations allocate an array indexed by the integers below \(10^6\). Each entry is first filled with the closed-form digital root, so every number begins with the score of its trivial one-factor decomposition.
Relax every composite through its multiples
The main sweep chooses an outer factor \(a\) from 2 upward and then visits the multiples \(2a,3a,4a,\dots\). For each multiple \(p\), the companion factor is \(b=p/a\), so the implementation tries the candidate \(\operatorname{mdrs}(a)+\operatorname{mdrs}(b)\). If that candidate is better than the current entry for \(p\), the table is updated. One implementation also checks the known value \(\operatorname{mdrs}(24)=11\) before doing the full run.
Accumulate the required sum
Once the table has been relaxed over all products, the remaining work is linear: add \(\operatorname{mdrs}(n)\) for every \(n\) from 2 through \(10^6-1\). The three language versions differ only in surface syntax; the mathematical workflow is the same in all of them.
Complexity Analysis
Initializing the table costs \(O(N)\) with \(N=10^6\). The dominant phase is the sweep over multiples:
$\sum_{a=2}^{N-1}\left\lfloor\frac{N-1}{a}\right\rfloor = O(N\log N).$
The final accumulation is another \(O(N)\) pass, so the total running time is \(O(N\log N)\). The memory usage is \(O(N)\) for the array of \(\operatorname{mdrs}\) values.
Footnotes and References
- Project Euler Problem 159: Digital Root Sums of Factorisations
- Digital root: Wikipedia - Digital root
- Dynamic programming: Wikipedia - Dynamic programming
- Divisor and factor pairs: Wikipedia - Divisor