Problem 414: Kaprekar Constant
View on Project EulerProject Euler Problem 414 Solution
EulerSolve provides an optimized solution for Project Euler Problem 414, Kaprekar Constant, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary For each base \(B=6k+3\) with \(2 \le k \le 300\), we consider every integer \(i\) with \(0 \lt i \lt B^5\), write it as a 5-digit base-\(B\) string with leading zeros allowed, sort its digits in decreasing and increasing order, subtract, and repeat. For these bases, the 5-digit routine has a unique non-zero Kaprekar constant \(C_B\). Define \(s_B(i)=0\) when \(i=C_B\) or when the 5 digits of \(i\) are all equal, and otherwise let \(s_B(i)\) be the number of iterations needed to reach \(C_B\). Then $S(B)=\sum_{0 \lt i \lt B^5} s_B(i),$ and the goal is the last \(18\) digits of $\sum_{k=2}^{300} S(6k+3).$ Mathematical Approach Step 1: Compress Every 5-Digit State to Two Gaps Let the sorted digits of the current 5-digit string be $x_0 \le x_1 \le x_2 \le x_3 \le x_4.$ The descending and ascending numbers are $N_{\downarrow}=x_4 B^4 + x_3 B^3 + x_2 B^2 + x_1 B + x_0,$ $N_{\uparrow}=x_0 B^4 + x_1 B^3 + x_2 B^2 + x_3 B + x_4.$ The Kaprekar subtraction is therefore $N_{\downarrow}-N_{\uparrow}=(x_4-x_0)(B^4-1)+(x_3-x_1)(B^3-B).$ So the next value depends only on the two outer gaps $u=x_4-x_0,\qquad v=x_3-x_1,$ with \(0 \le v \le u \le B-1\). Define $R_B(u,v)=u(B^4-1)+vB(B^2-1).$ To advance one step, the implementation writes \(R_B(u,v)\) in base \(B\), sorts its five digits, and recomputes the new pair \((u',v')\)....
Detailed mathematical approach
Problem Summary
For each base \(B=6k+3\) with \(2 \le k \le 300\), we consider every integer \(i\) with \(0 \lt i \lt B^5\), write it as a 5-digit base-\(B\) string with leading zeros allowed, sort its digits in decreasing and increasing order, subtract, and repeat. For these bases, the 5-digit routine has a unique non-zero Kaprekar constant \(C_B\).
Define \(s_B(i)=0\) when \(i=C_B\) or when the 5 digits of \(i\) are all equal, and otherwise let \(s_B(i)\) be the number of iterations needed to reach \(C_B\). Then
$S(B)=\sum_{0 \lt i \lt B^5} s_B(i),$
and the goal is the last \(18\) digits of
$\sum_{k=2}^{300} S(6k+3).$
Mathematical Approach
Step 1: Compress Every 5-Digit State to Two Gaps
Let the sorted digits of the current 5-digit string be
$x_0 \le x_1 \le x_2 \le x_3 \le x_4.$
The descending and ascending numbers are
$N_{\downarrow}=x_4 B^4 + x_3 B^3 + x_2 B^2 + x_1 B + x_0,$
$N_{\uparrow}=x_0 B^4 + x_1 B^3 + x_2 B^2 + x_3 B + x_4.$
The Kaprekar subtraction is therefore
$N_{\downarrow}-N_{\uparrow}=(x_4-x_0)(B^4-1)+(x_3-x_1)(B^3-B).$
So the next value depends only on the two outer gaps
$u=x_4-x_0,\qquad v=x_3-x_1,$
with \(0 \le v \le u \le B-1\). Define
$R_B(u,v)=u(B^4-1)+vB(B^2-1).$
To advance one step, the implementation writes \(R_B(u,v)\) in base \(B\), sorts its five digits, and recomputes the new pair \((u',v')\). This gives a deterministic transition
$T_B(u,v)=(u',v').$
The original \(B^5\) search space is thus reduced to only \(O(B^2)\) gap states.
Step 2: The Fixed State and the Kaprekar Constant
Because \(B=6k+3\), the base is divisible by \(3\). Write
$B=3m,$
where \(m\) is odd. The non-zero fixed state used by the routine is
$u=2m,\qquad v=m.$
Substituting these values into the closed form gives
$R_B(2m,m)=2m(B^4-1)+m(B^3-B)=2mB^4+mB^3-mB-2m.$
In base \(B\), this is exactly
$R_B(2m,m)=(2m,\; m-1,\; B-1,\; 2m-1,\; m)_B.$
Its sorted digit multiset is
$\{m-1,\; m,\; 2m-1,\; 2m,\; B-1\},$
so the outer differences remain
$u=(B-1)-(m-1)=2m,\qquad v=(2m)-m=m,$
hence
$T_B(2m,m)=(2m,m).$
This is the Kaprekar constant state. For example, when \(B=15\) we have \(m=5\), so the constant is
$C_{15}=(10,4,14,9,5)_{15},$
and when \(B=21\) we get
$C_{21}=(14,6,20,13,7)_{21}.$
Step 3: Count How Many Inputs Share One Gap State
Let \(M_B(u,v)\) be the number of 5-digit base-\(B\) strings whose sorted digits have outer gaps \((u,v)\). Fix \(u \gt 0\), let the smallest digit be \(a\), and let the largest digit be \(e=a+u\). There are exactly
$B-u$
possible values of \(a\). For each such \(a\), the count depends only on the equality pattern of the middle digits. The constants \(20,30,60,120\) below are multinomial counts such as \(5!/3!\), \(5!/(2!2!)\), \(5!/2!\), and \(5!\).
Case \(v=0\). Then the second and fourth sorted digits are equal, so the multiset has the form
$[a,c,c,c,e].$
If \(c=a\) or \(c=e\), there are \(5\) permutations in each case. If \(a \lt c \lt e\), there are \(u-1\) choices for \(c\), each with \(20\) permutations. Therefore
$M_B(u,0)=(B-u)\bigl(5+5+20(u-1)\bigr),\qquad u \ge 1.$
Case \(u=v\). Now the second digit must equal the minimum and the fourth digit must equal the maximum, so the multiset is
$[a,a,c,e,e].$
If \(c=a\) or \(c=e\), there are \(10\) permutations in each case. If \(a \lt c \lt e\), there are \(u-1\) choices and \(30\) permutations each. Hence
$M_B(u,u)=(B-u)\bigl(10+10+30(u-1)\bigr),\qquad u \ge 1.$
Case \(0 \lt v \lt u\). There are two boundary families where one inner digit collapses onto an endpoint:
$[a,a,c,d,e],\quad [a,a,d,d,e],\quad [a,a,a,d,e],$
and symmetrically
$[a,b,c,e,e],\quad [a,b,b,e,e],\quad [a,b,e,e,e].$
These contribute
$2\bigl(60(v-1)+30+20\bigr).$
If both inner gaps are strict, choose \(b\) with \(a \lt b \lt e-v\), so there are \(u-v-1\) possibilities. Then \(d=b+v\), and the remaining multisets are
$[a,b,b,d,e],\qquad [a,b,d,d,e],\qquad [a,b,c,d,e],$
where in the last form the middle digit satisfies \(b \lt c \lt d\), giving \(v-1\) choices. Their total contribution is
$60(u-v-1)+60(u-v-1)+120(u-v-1)(v-1).$
Combining everything,
$M_B(u,v)=(B-u)\Bigl(100+120(v-1)+120(u-v-1)+120(u-v-1)(v-1)\Bigr),$
for \(0 \lt v \lt u\).
Step 4: Depth on the Functional Graph
Every state \((u,v)\) has exactly one outgoing edge, namely \(T_B(u,v)\). So the gap states form a functional graph. Let \(D_B(u,v)\) be the number of additional Kaprekar steps needed after the first subtraction has produced the state \((u,v)\). Then
$D_B(u,v)= \begin{cases} 0,& T_B(u,v)=(u,v),\\ 1 + D_B\!\bigl(T_B(u,v)\bigr),& T_B(u,v)\ne(u,v). \end{cases}$
The implementation memoizes this recurrence, so each state depth is computed only once.
Step 5: Assemble \(S(B)\)
Among the \(B^5-1\) positive 5-digit strings, exactly \(B-1\) have all digits equal, and the Kaprekar constant itself also contributes \(0\). Every other starting value contributes one unavoidable first iteration before the gap-state depth takes over. Therefore the universal first-step contribution is
$B^5-1-(B-1)-1=B^5-B-1.$
After that first step, only the gap state matters, so
$\boxed{S(B)=B^5-B-1+\sum_{u=1}^{B-1}\sum_{v=0}^{u} M_B(u,v)\,D_B(u,v).}$
This is exactly the quantity accumulated by the implementation. The published checkpoints
$S(15)=5274369,\qquad S(111)=400668930299$
are recovered by this formula.
How the Code Works
The C++, Python, and Java implementations all use the same mathematical reduction. For each base \(B\), they enumerate the triangular set of gap states \((u,v)\), compute the deterministic successor by evaluating \(R_B(u,v)\) and sorting its five base-\(B\) digits, memoize the depth to the fixed state, multiply that depth by the corresponding multiplicity \(M_B(u,v)\), and finally add the universal offset \(B^5-B-1\). All arithmetic is reduced modulo \(10^{18}\) because only the last \(18\) digits are required.
Complexity Analysis
For a fixed base \(B\), the number of states is \(O(B^2)\). Each transition uses only constant-time arithmetic plus a sort of five digits, which is also constant. Memoization ensures that each state depth is evaluated once, so the total work per base is \(O(B^2)\) time and \(O(B^2)\) memory. This is exponentially smaller than brute-forcing all \(B^5\) starting values.
Footnotes and References
- Problem page: https://projecteuler.net/problem=414
- Kaprekar routine: Wikipedia — Kaprekar's routine
- Functional graph: Wikipedia — Functional graph
- Multinomial coefficient: Wikipedia — Multinomial theorem