Problem 17: Number Letter Counts
View on Project EulerProject Euler Problem 17 Solution
EulerSolve provides an optimized solution for Project Euler Problem 17, Number Letter Counts, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary Problem 17 asks for the total number of letters used when the integers from 1 to 1000 are written out in British English words, with spaces and hyphens ignored. For example, three hundred and forty-two contributes 23 letters, and one hundred and fifteen contributes 20. The key point is that this is not a string-processing problem in disguise. Once the English naming rules are fixed, every number from 1 to 1000 falls into a small number of arithmetic cases: units, teens, tens with an optional unit, hundreds with an optional remainder, and the special case 1000. The implementations exploit exactly that structure. Mathematical Approach Let \(L(n)\) be the number of letters in the British English name of \(n\), after removing spaces and hyphens. The solution is built from three tiny lookup tables and one recursive reduction from a number in \(100\) to \(999\) down to its remainder modulo \(100\). The Letter-Count Function The implementations use the following fixed letter counts: $u(0),u(1),\dots,u(9)=(0,3,3,5,4,4,3,5,5,4),$ $v(0),v(1),\dots,v(9)=(3,6,6,8,8,7,7,9,8,8),$ $t(0),t(1),\dots,t(9)=(0,0,6,6,5,5,5,7,6,6).$ Here \(u(d)\) counts the words one through nine, \(v(r)\) counts ten through nineteen via \(10+r\), and \(t(q)\) counts the decade words twenty through ninety. The constants \(7\) and \(3\) come from the words hundred and and ....
Detailed mathematical approach
Problem Summary
Problem 17 asks for the total number of letters used when the integers from 1 to 1000 are written out in British English words, with spaces and hyphens ignored. For example, three hundred and forty-two contributes 23 letters, and one hundred and fifteen contributes 20.
The key point is that this is not a string-processing problem in disguise. Once the English naming rules are fixed, every number from 1 to 1000 falls into a small number of arithmetic cases: units, teens, tens with an optional unit, hundreds with an optional remainder, and the special case 1000. The implementations exploit exactly that structure.
Mathematical Approach
Let \(L(n)\) be the number of letters in the British English name of \(n\), after removing spaces and hyphens. The solution is built from three tiny lookup tables and one recursive reduction from a number in \(100\) to \(999\) down to its remainder modulo \(100\).
The Letter-Count Function
The implementations use the following fixed letter counts:
$u(0),u(1),\dots,u(9)=(0,3,3,5,4,4,3,5,5,4),$
$v(0),v(1),\dots,v(9)=(3,6,6,8,8,7,7,9,8,8),$
$t(0),t(1),\dots,t(9)=(0,0,6,6,5,5,5,7,6,6).$
Here \(u(d)\) counts the words one through nine, \(v(r)\) counts ten through nineteen via \(10+r\), and \(t(q)\) counts the decade words twenty through ninety. The constants \(7\) and \(3\) come from the words hundred and and.
With those objects, \(L(n)\) is completely described by the piecewise rule
$L(n)=\begin{cases} u(n), & 1\le n\le 9,\\ v(n-10), & 10\le n\le 19,\\ t(\lfloor n/10\rfloor)+u(n\bmod 10), & 20\le n\le 99,\\ u(h)+7, & n=100h,\ 1\le h\le 9,\\ u(h)+7+3+L(r), & n=100h+r,\ 1\le h\le 9,\ 1\le r\le 99,\\ 11, & n=1000. \end{cases}$
The invariant behind the last two lines is simple: after stripping off the hundreds prefix, every non-round hundred leaves a remainder \(r\in\{1,\dots,99\}\), and the counting problem becomes exactly the same subproblem as before, just on a smaller number.
Summing the Numbers Below One Hundred
The first useful block is the range \(1\) to \(99\). Define
$U=\sum_{d=1}^{9}u(d)=36,\qquad V=\sum_{r=0}^{9}v(r)=70.$
For each decade \(20q\) to \(20q+9\) with \(q\in\{2,\dots,9\}\), the decade word appears ten times and the unit words \(1\) through \(9\) appear once each. Therefore
$\sum_{n=10q}^{10q+9}L(n)=10\,t(q)+U.$
Adding the unit block, the teen block, and the eight decade blocks gives
$B=\sum_{n=1}^{99}L(n)=U+V+\sum_{q=2}^{9}\bigl(10\,t(q)+U\bigr)=854.$
This constant \(B\) is the backbone of the whole count, because every nontrivial hundreds block contains one copy of the entire \(1\) to \(99\) pattern.
Hundreds, the British “And”, and the Block Formula
Fix a hundreds digit \(h\in\{1,\dots,9\}\). The block from \(100h\) to \(100h+99\) contains:
$100\bigl(u(h)+7\bigr)$
letters from the prefix \(h\) hundred, because that prefix appears in all 100 numbers of the block. Among those 100 numbers, exactly 99 are not multiples of 100, so British usage inserts and exactly 99 times, contributing
$99\cdot 3.$
The remainders \(1\) through \(99\) also appear once each, so the rest of the contribution is another copy of \(B\). Hence the entire block sum is
$H_h=\sum_{n=100h}^{100h+99}L(n)=100\bigl(u(h)+7\bigr)+99\cdot 3+B.$
Now sum the initial block \(1\) to \(99\) together with the nine hundreds blocks:
$\sum_{n=1}^{999}L(n)=B+\sum_{h=1}^{9}H_h=10B+100\sum_{h=1}^{9}\bigl(u(h)+7\bigr)+9\cdot 99\cdot 3=21113.$
Finally, \(1000\) contributes one thousand, which has \(11\) letters, so
$\sum_{n=1}^{1000}L(n)=21113+11=21124.$
Worked Examples
The two checkpoint values used by the implementations illustrate the rule cleanly. For \(342\), we write \(342=300+42\), so
$L(342)=L(3)+7+3+L(42)=5+7+3+(5+3)=23.$
For \(115\), the remainder lies in the teen range:
$L(115)=L(1)+7+3+L(15)=3+7+3+7=20.$
These examples show why the British and matters: omitting it would make both counts too small.
How the Code Works
Constant Tables Instead of Strings
The C++, Python, and Java implementations never build the English words themselves. They store only the three constant tables above, plus the two fixed additions for hundred and and. That turns the problem from text handling into pure integer arithmetic.
Evaluate One Number, Then Accumulate the Range
For a single number, the implementation first handles \(1000\) as a special case. Otherwise it checks whether the number has a hundreds digit; if so, it adds the hundreds prefix, reduces the number modulo \(100\), and adds \(3\) more letters exactly when the remainder is nonzero. The remaining value is then dispatched to one of three cases: \(20\) to \(99\), \(10\) to \(19\), or \(0\) to \(9\).
After that, the total answer is just the sum of this per-number routine over the desired interval. The Project Euler target is \(1\) through \(1000\), while the C++ implementation also exposes smaller limits between \(1\) and \(1000\). The implementations include checkpoint examples such as \(1\) through \(5\mapsto 19\), \(342\mapsto 23\), and \(115\mapsto 20\), all of which follow directly from the piecewise formula above.
Complexity Analysis
The implementations perform constant work for each integer in the range, so the running time is \(O(N)\) for a limit \(N\le 1000\). Since every lookup table has fixed size and only a few integer variables are used, the extra space usage is \(O(1)\).
Mathematically, the derivation above also yields an \(O(1)\) closed-form evaluation for the specific target \(N=1000\). The implementations choose the simpler loop because at this scale it is already instantaneous and mirrors the English-number rules very directly.