Problem 698: 123 Numbers
View on Project EulerProject Euler Problem 698 Solution
EulerSolve provides an optimized solution for Project Euler Problem 698, 123 Numbers, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary A positive integer is called a 123-number if its decimal expansion uses only the digits \(1\), \(2\), and \(3\), and every nonzero digit count is itself a 123-number. If \(c_1(n)\), \(c_2(n)\), and \(c_3(n)\) denote the numbers of \(1\)'s, \(2\)'s, and \(3\)'s in \(n\), then each positive value among those counts must again satisfy the same rule. The task is to locate the \(N\)-th such number in increasing order and report its value modulo \(123123123\). Mathematical Approach The key observation is that a candidate number is determined by two layers of information: its total length and the triple of digit counts \((A,B,C)\). Once that triple is admissible, the number of corresponding decimal strings is a multinomial coefficient. The implementation therefore counts entire families of strings and jumps directly to the desired rank instead of enumerating all valid numbers one by one. Step 1: Define the recursive admissibility set Let \(\mathcal{S}\) be the set of all positive integers that are valid as digit counts....
Detailed mathematical approach
Problem Summary
A positive integer is called a 123-number if its decimal expansion uses only the digits \(1\), \(2\), and \(3\), and every nonzero digit count is itself a 123-number. If \(c_1(n)\), \(c_2(n)\), and \(c_3(n)\) denote the numbers of \(1\)'s, \(2\)'s, and \(3\)'s in \(n\), then each positive value among those counts must again satisfy the same rule. The task is to locate the \(N\)-th such number in increasing order and report its value modulo \(123123123\).
Mathematical Approach
The key observation is that a candidate number is determined by two layers of information: its total length and the triple of digit counts \((A,B,C)\). Once that triple is admissible, the number of corresponding decimal strings is a multinomial coefficient. The implementation therefore counts entire families of strings and jumps directly to the desired rank instead of enumerating all valid numbers one by one.
Step 1: Define the recursive admissibility set
Let \(\mathcal{S}\) be the set of all positive integers that are valid as digit counts. We start with
$1 \in \mathcal{S}.$
For \(n>1\), membership means two things simultaneously:
$n \in \mathcal{S} \iff \text{every digit of } n \text{ lies in } \{1,2,3\} \text{ and } \forall i \in \{1,2,3\},\ c_i(n)=0 \text{ or } c_i(n)\in\mathcal{S}.$
This recursion is well founded because each count \(c_i(n)\) is at most the number of digits of \(n\), hence strictly smaller than \(n\) for every \(n>1\). So admissibility can be computed bottom-up: when testing \(n\), all smaller count values are already known.
Step 2: Count valid strings of a fixed length
Fix a length \(L\). Suppose the final string contains \(A\) copies of digit \(1\), \(B\) copies of digit \(2\), and \(C\) copies of digit \(3\). Then
$A+B+C=L,$
and the triple is admissible exactly when every nonzero member of \(\{A,B,C\}\) lies in \(\mathcal{S}\).
For one such admissible triple, the number of distinct strings is
$M(L;A,B,C)=\frac{L!}{A!\,B!\,C!}=\binom{L}{A}\binom{L-A}{B}.$
Therefore the total number of valid 123-numbers of length \(L\) is
$F(L)=\sum_{\substack{A+B+C=L\\A,B,C\ge 0\\A,B,C\in \mathcal{S}\cup\{0\}}} \binom{L}{A}\binom{L-A}{B}.$
This reduces the fixed-length problem to a scan over admissible count triples rather than over all \(3^L\) raw strings.
Step 3: Convert numeric order into length order plus lexicographic order
Every valid number with fewer digits is numerically smaller than every valid number with more digits. Inside one fixed length, ordinary lexicographic order agrees with numeric order because there are no leading zeros.
So the search proceeds in two stages. First find the unique length \(L\) such that
$\sum_{\ell=1}^{L-1} F(\ell) < N \le \sum_{\ell=1}^{L} F(\ell).$
Then define the residual rank inside that length by
$R=N-\sum_{\ell=1}^{L-1} F(\ell).$
Now the task is purely combinatorial: find the \(R\)-th admissible length-\(L\) string in lexicographic order.
Step 4: Count completions from a fixed prefix
Assume we have already fixed a prefix of length \(p\), and that this prefix uses \(p_1\) ones, \(p_2\) twos, and \(p_3\) threes. Let the remaining length be
$r=L-p.$
If the final count triple is \((A,B,C)\), then the suffix still needs
$a=A-p_1,\qquad b=B-p_2,\qquad c=C-p_3,$
with \(a,b,c\ge 0\) and \(a+b+c=r\). For this particular final triple, the number of admissible completions of the current prefix is
$\frac{r!}{a!\,b!\,c!}=\binom{r}{a}\binom{r-a}{b}.$
Summing over all admissible final triples gives the block size
$G_L(p,p_1,p_2,p_3)=\sum_{\substack{A+B+C=L\\A,B,C\in \mathcal{S}\cup\{0\}\\A\ge p_1,\ B\ge p_2,\ C\ge p_3}} \binom{r}{A-p_1}\binom{r-(A-p_1)}{B-p_2}.$
Testing the next digit in the order \(1,2,3\) partitions all valid completions into consecutive lexicographic blocks. If the desired rank is larger than the first block, subtract that block and continue with the next digit; otherwise keep that digit and move to the next position.
Step 5: Worked example
The first admissible count values are
$1,2,3,11,12,13,21,22,23,31,32,33,\dots$
so \(4\notin\mathcal{S}\). That immediately explains the early lengths. For \(L=1,2,3\), every count triple uses only the values \(0,1,2,3\), so every ternary string is valid and
$F(1)=3,\qquad F(2)=9,\qquad F(3)=27.$
The cumulative total through length \(3\) is therefore
$3+9+27=39.$
Hence the \(40\)-th 123-number must be the first admissible string of length \(4\). The lexicographically smallest candidate is \(1111\), but its count triple is \((4,0,0)\), which is invalid because \(4\notin\mathcal{S}\). The next candidate is \(1112\), whose count triple is \((3,1,0)\); both positive counts \(3\) and \(1\) are admissible. Therefore the first valid length-\(4\) string is \(1112\), so the \(40\)-th 123-number is \(1112\).
How the Code Works
The C++, Python, and Java implementations all follow the same structure. They first build a boolean table of admissible count values in increasing order, which resolves the recursive definition iteratively. Next they construct Pascal-triangle rows on demand, so every binomial coefficient needed for the multinomial formulas can be read in constant time afterward.
All counting is performed with saturation at \(N+1\). This is enough because the search never needs an exact block size once that block is already larger than the remaining rank. After the admissibility table and binomial table are available, the implementation sums \(F(1),F(2),\dots\) until it identifies the target length, then builds the answer one digit at a time by evaluating prefix-completion counts for tentative next digits \(1\), \(2\), and \(3\).
When the final decimal string is known, the remainder modulo \(123123123\) is evaluated left to right via
$v_0=0,\qquad v_{k+1}\equiv 10v_k+d_{k+1}\pmod{123123123}.$
This avoids ever converting the full answer into a large integer type.
Complexity Analysis
Let \(M\) be the size of the admissibility table and \(L\) the length of the target number. Building the admissibility table costs \(O(M\log M)\) time because each candidate is scanned digit by digit, and it uses \(O(M)\) memory. Growing Pascal rows up to length \(L\) costs \(O(L^2)\) time and memory.
To locate the correct length, the implementation evaluates \(F(\ell)\) for \(\ell=1,2,\dots,L\). Each such evaluation scans \(O(\ell^2)\) count pairs, so the total cost is
$\sum_{\ell=1}^{L} O(\ell^2)=O(L^3).$
The unranking stage performs at most three prefix queries per position, and each query scans admissible final triples in \(O(L^2)\), giving another \(O(L^3)\) phase. Overall the method runs in
$O(M\log M+L^3)$
time and uses \(O(M+L^2)\) memory for the target instance.
Footnotes and References
- Problem page: https://projecteuler.net/problem=698
- Multinomial coefficients: Wikipedia — Multinomial theorem
- Binomial coefficients: Wikipedia — Binomial coefficient
- Lexicographical order: Wikipedia — Lexicographical order
- Dynamic programming: Wikipedia — Dynamic programming