Problem 20: Factorial Digit Sum
View on Project EulerProject Euler Problem 20 Solution
EulerSolve provides an optimized solution for Project Euler Problem 20, Factorial Digit Sum, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary The Project Euler instance asks for the sum of the decimal digits of \(100!\). The number \(100!=1\cdot2\cdot3\cdots100\) is far too large for ordinary fixed-width integers: it has \(158\) decimal digits and ends with \(24\) zeros. So the task is not to approximate the factorial, but to construct its exact decimal expansion and then add those digits. Mathematical Approach Let \(F_k=k!\) and let \(s(k)\) denote the sum of the decimal digits of \(F_k\). All three implementations rely on the same mathematical fact: factorials grow by the recurrence \(F_k=kF_{k-1}\), so if we can maintain the exact decimal representation after each multiplication, the final digit sum is immediate. The factorial recurrence gives a natural state progression The starting point is $F_0=1,\qquad F_k=kF_{k-1}\quad(k\ge1).$ This recurrence is problem-specific and complete: to reach \(100!\), we do not need any combinatorial reformulation, only a sequence of exact multiplications by \(2,3,\dots,100\). The boundary case \(0!=1\) is also built into the recurrence and is a useful correctness check. Represent the current factorial in base \(10\) Write the current factorial as $F_k=\sum_{j=0}^{m_k-1} d_j\,10^j,\qquad 0\le d_j\le9.$ Here \(d_0\) is the units digit, \(d_1\) the tens digit, and so on....
Detailed mathematical approach
Problem Summary
The Project Euler instance asks for the sum of the decimal digits of \(100!\). The number \(100!=1\cdot2\cdot3\cdots100\) is far too large for ordinary fixed-width integers: it has \(158\) decimal digits and ends with \(24\) zeros. So the task is not to approximate the factorial, but to construct its exact decimal expansion and then add those digits.
Mathematical Approach
Let \(F_k=k!\) and let \(s(k)\) denote the sum of the decimal digits of \(F_k\). All three implementations rely on the same mathematical fact: factorials grow by the recurrence \(F_k=kF_{k-1}\), so if we can maintain the exact decimal representation after each multiplication, the final digit sum is immediate.
The factorial recurrence gives a natural state progression
The starting point is
$F_0=1,\qquad F_k=kF_{k-1}\quad(k\ge1).$
This recurrence is problem-specific and complete: to reach \(100!\), we do not need any combinatorial reformulation, only a sequence of exact multiplications by \(2,3,\dots,100\). The boundary case \(0!=1\) is also built into the recurrence and is a useful correctness check.
Represent the current factorial in base \(10\)
Write the current factorial as
$F_k=\sum_{j=0}^{m_k-1} d_j\,10^j,\qquad 0\le d_j\le9.$
Here \(d_0\) is the units digit, \(d_1\) the tens digit, and so on. Storing digits in this little-endian order is convenient because multiplication by a small integer begins at the units place and pushes carry to higher places. Once the last multiplication is finished, the target quantity is simply
$s(k)=\sum_{j=0}^{m_k-1} d_j.$
Carry propagation is the exact multiplication rule
Assume the digits \(d_j\) already represent \((k-1)!\). To multiply by \(k\), process the decimal expansion one digit at a time. With initial carry \(c_0=0\), define
$t_j=kd_j+c_j,\qquad d'_j=t_j\bmod 10,\qquad c_{j+1}=\left\lfloor\frac{t_j}{10}\right\rfloor.$
After all existing digits have been updated, the remaining carry \(c_{m_k}\) is written out as additional decimal digits. This is exactly schoolbook multiplication in base \(10\). The key invariant is: after finishing multiplier \(k\), the updated digit list represents \(k!\) exactly. That invariant is what makes the final digit sum trustworthy.
Why the full decimal state must be kept
The sum-of-digits function is not multiplicative, because carries matter. Knowing only the current digit sum does not tell us the next one after multiplication. For instance, \(8\cdot5=40\): the product has digit sum \(4\), not \(8\cdot5\), not \(8+5\), and not anything that can be recovered without the actual decimal carries. That is why the implementations compute the whole factorial exactly and postpone digit summation until the end.
Worked example: building \(10!\)
The recurrence produces
$1,\ 2,\ 6,\ 24,\ 120,\ 720,\ 5040,\ 40320,\ 362880,\ 3628800.$
Therefore
$10!=3628800,\qquad s(10)=3+6+2+8+8+0+0=27.$
This example is large enough to show both phenomena that matter for the full problem: carries appear repeatedly, and the decimal representation acquires trailing zeros automatically when enough factors of \(2\) and \(5\) have accumulated.
Useful numerical invariants
Two classical formulas provide strong sanity checks. First, for any nonnegative integer \(N\), the decimal digit sum satisfies
$s(N)\equiv N\pmod 9.$
Since \(n!\) is divisible by \(9\) for every \(n\ge6\), we must have \(s(n!)\equiv0\pmod 9\) in the Project Euler range. Second, the number of trailing zeros is controlled by the exponent of \(5\):
$z(n)=\sum_{r\ge1}\left\lfloor\frac{n}{5^r}\right\rfloor.$
For \(n=100\), this gives \(z(100)=\lfloor100/5\rfloor+\lfloor100/25\rfloor=20+4=24\). Those zeros add nothing to the digit sum, but they explain the long string of final zeros. A size estimate also comes from
$D(n)=\lfloor\log_{10}(n!)\rfloor+1,$
and Stirling's approximation confirms that \(100!\) has \(158\) digits, so an explicit decimal-digit method is entirely manageable.
How the Code Works
The C++ implementation follows the mathematics directly. It starts from the decimal number \(1\), stored as a one-digit list, then multiplies by every integer from \(2\) through \(n\). Each pass updates every stored digit with the carry recurrence above, appends any remaining carry digits, and preserves the invariant that the list is the exact decimal representation of the current factorial. After the final multiplication, it sums the digits in that list.
The Python and Java implementations perform the same logical computation at a higher level. They build the exact factorial using the language runtime's arbitrary-precision integer machinery, convert the result to decimal text, and add the decimal digits. So the implementations differ only in how explicitly they expose the big-integer arithmetic; mathematically, all three compute the same exact object before taking the digit sum.
Complexity Analysis
For the explicit decimal-digit method, if \(D(k)\) is the number of decimal digits of \(k!\), then multiplying \((k-1)!\) by \(k\) touches \(D(k-1)\) digits and performs constant work per digit. Therefore the total running time is
$\sum_{k=1}^{n} O(D(k))=O\!\left(\sum_{k=1}^{n} k\log k\right)=O(n^2\log n).$
The storage requirement is the size of the current decimal expansion, namely \(O(D(n))=O(n\log n)\) digits. For the actual Project Euler input \(n=100\), this is tiny: only \(158\) decimal digits are stored at the end, and the final digit-summing pass is linear in that length. The higher-level arbitrary-precision implementations may use faster internal multiplication routines, but for this problem they are still dominated by constructing and scanning one exact factorial.
Footnotes and References
- Problem page: Project Euler Problem 20
- Factorial: Wikipedia - Factorial
- Arbitrary-precision arithmetic: Wikipedia - Arbitrary-precision arithmetic
- Legendre's formula for exponents in \(n!\): Wikipedia - Legendre's formula
- Stirling's approximation: Wikipedia - Stirling's approximation
- Digit sums and congruences modulo \(9\): Wikipedia - Digital root