Problem 990: Addition Equations
View on Project EulerProject Euler Problem 990 Solution
EulerSolve provides an optimized solution for Project Euler Problem 990, Addition Equations, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary Let \(F(n)\) denote the number of true decimal equations of the form \[ A_1+\cdots+A_\ell = B_1+\cdots+B_r, \] where every term is a positive integer written without leading zeros. The total length counts every digit, every plus sign, and the equality sign: \[ |E|=\sum_{i=1}^{\ell}\operatorname{len}(A_i)+\sum_{j=1}^{r}\operatorname{len}(B_j)+(\ell-1)+(r-1)+1. \] The problem asks for \(F(50)\bmod 10^9+7\). The implementations verify themselves against the small values \[ F(3)=9,\qquad F(5)=171,\qquad F(7)=4878. \] Brute force is not realistic: even before checking arithmetic, the number of possible strings explodes with the number of terms, their lengths, and all digit choices. The successful approach is to read the equation from right to left, one decimal column at a time, and count only the data that can still affect higher columns. Mathematical Approach The key simplification is that decimal addition is local. After the low columns have been fixed, the unfinished part of the equation is determined only by the terms that still have digits left and by the carry coming from below. Active Terms and the Length Bound A term is active in a column if it still contributes a digit in that column. If the equation has \(T=\ell+r\) total terms, then even the shortest possible equation has \(T\) digits and \(T-1\) separators, so \[ |E|\ge 2T-1....
Detailed mathematical approach
Problem Summary
Let \(F(n)\) denote the number of true decimal equations of the form
\[ A_1+\cdots+A_\ell = B_1+\cdots+B_r, \]
where every term is a positive integer written without leading zeros. The total length counts every digit, every plus sign, and the equality sign:
\[ |E|=\sum_{i=1}^{\ell}\operatorname{len}(A_i)+\sum_{j=1}^{r}\operatorname{len}(B_j)+(\ell-1)+(r-1)+1. \]
The problem asks for \(F(50)\bmod 10^9+7\). The implementations verify themselves against the small values
\[ F(3)=9,\qquad F(5)=171,\qquad F(7)=4878. \]
Brute force is not realistic: even before checking arithmetic, the number of possible strings explodes with the number of terms, their lengths, and all digit choices. The successful approach is to read the equation from right to left, one decimal column at a time, and count only the data that can still affect higher columns.
Mathematical Approach
The key simplification is that decimal addition is local. After the low columns have been fixed, the unfinished part of the equation is determined only by the terms that still have digits left and by the carry coming from below.
Active Terms and the Length Bound
A term is active in a column if it still contributes a digit in that column. If the equation has \(T=\ell+r\) total terms, then even the shortest possible equation has \(T\) digits and \(T-1\) separators, so
\[ |E|\ge 2T-1. \]
Therefore \(|E|\le 50\) implies \(T\le 25\). This is exactly the bound used by the implementations: there are never more than 25 active terms in total, and that keeps the state space finite and small.
Before any digit column is processed, the only committed characters are the separators. If there are \(\ell\) terms on the left and \(r\) on the right, the initial committed length is
\[ (\ell-1)+(r-1)+1=\ell+r-1. \]
One-Side Column Polynomials
Suppose a column begins with \(a\) active terms on one side, and exactly \(a'\le a\) of them survive to the next higher column. Then \(a-a'\) terms end in the current column, so their current digit is the leading digit of that number and must lie in \(\{1,\dots,9\}\). The surviving \(a'\) terms may contribute any digit in \(\{0,\dots,9\}\).
This gives the generating polynomial
\[ P_{a,a'}(x)=\bigl(x+x^2+\cdots+x^9\bigr)^{a-a'}\bigl(1+x+x^2+\cdots+x^9\bigr)^{a'}. \]
If we write
\[ c_{a,a'}(s)=[x^s]\,P_{a,a'}(x), \]
then \(c_{a,a'}(s)\) counts the admissible digit assignments on that side whose column sum is \(s\). Because the terms are distinguishable by their positions in the written equation, choosing which \(a'\) of the \(a\) active terms survive contributes the factor \(\binom{a}{a'}\).
The Carry Invariant
Let \(L\) and \(R\) be the left and right digit sums in the current column, and let \(\delta\) be the carry entering this column from the already processed lower columns. Valid base-10 arithmetic is equivalent to
\[ L-R+\delta=10\delta', \]
where \(\delta'\) is the carry passed to the next higher column. So only differences
\[ d=L-R \]
with \(d\equiv -\delta \pmod{10}\) are feasible for the current state.
For fixed survivor counts on both sides it is convenient to precompute
\[ T_{a,a',b,b'}(d)=\sum_{s-t=d} c_{a,a'}(s)\,c_{b,b'}(t), \]
the number of digit assignments that produce column difference \(d\). This is the real combinatorial object used by the implementations: once \(T_{a,a',b,b'}(d)\) is known, the individual digit sums \(s\) and \(t\) no longer have to be revisited during the DP.
State and Recurrence
Define \(D(u,a,b,\delta)\) as the number of partially constructed equations whose lower columns are already fixed, whose committed total length is \(u\), which still have \(a\) active left terms and \(b\) active right terms, and whose carry into the current column is \(\delta\).
The initial states are
\[ D(\ell+r-1,\ell,r,0)=1 \]
for every \(\ell\ge 1\), \(r\ge 1\), \(\ell+r\le 25\). Processing one new column appends one digit for every active term, so the length update is
\[ u'=u+a+b. \]
If the next column leaves \(a'\le a\) active terms on the left and \(b'\le b\) on the right, then the transition is
\[ D(u+a+b,a',b',\delta') \;+\!=\; D(u,a,b,\delta)\binom{a}{a'}\binom{b}{b'}T_{a,a',b,b'}(d), \]
whenever \(d+\delta=10\delta'\). An equation is complete exactly when
\[ a=b=0,\qquad \delta=0, \]
so the final count is
\[ F(n)=\sum_{u=0}^{n} D(u,0,0,0). \]
Why the Carry Window Is Only \([-25,25]\)
At any moment there are at most 25 active terms in total, so the column difference always satisfies
\[ |L-R|\le 9\cdot 25=225. \]
If \(|\delta|\le 25\), then
\[ |\delta'|=\left|\frac{L-R+\delta}{10}\right|\le \frac{225+25}{10}=25. \]
Since the process starts with carry \(0\), every reachable carry remains inside \([-25,25]\). That is why all three implementations can use a fixed carry dimension instead of an unbounded map.
Worked Example: Recovering \(F(5)=171\)
The length-5 check is a good sanity test because it already uses the same machinery as the full problem. Consider equations of the shape \(a+b=c\) with all three numbers one-digit. The separators contribute length 2, so one processed column gives total length 5. The initial state is \(D(2,2,1,0)=1\), and the only possible survivor counts are \(a'=0\), \(b'=0\).
The relevant polynomials are
\[ P_{2,0}(x)=\bigl(x+\cdots+x^9\bigr)^2,\qquad P_{1,0}(x)=x+\cdots+x^9. \]
Because the final carry must be zero, we need difference \(d=0\). For each right-hand digit \(c\in\{1,\dots,9\}\), there are exactly \(c-1\) ordered pairs \((a,b)\) with \(a+b=c\). Therefore
\[ T_{2,0,1,0}(0)=\sum_{c=2}^{9}(c-1)=36. \]
The mirrored shape \(c=a+b\) contributes another 36. Together with the 9 one-digit identities \(d=d\) and the 90 two-digit identities \(m=m\), we obtain
\[ F(5)=9+90+36+36=171, \]
which matches the checkpoint exactly.
How the Code Works
Precompute Combinatorial Tables
The C++, Python, and Java implementations first build all binomial coefficients up to 25. They then compute the coefficient lists \(c_{a,a'}(s)\) for every possible pair \((a,a')\) by multiplying the nonzero-leading-digit polynomial with the unrestricted-digit polynomial.
Build Difference Tables by Residue Class
For every quadruple \((a,a',b,b')\), the implementation combines the two one-side tables into the difference counts \(T_{a,a',b,b'}(d)\). These counts are grouped by \(d\bmod 10\), so a DP state with incoming carry \(\delta\) only has to inspect the compatible residue class \(d\equiv -\delta \pmod{10}\). This is the reason the transition step stays compact.
Sweep the DP and Read Off the Answer
The DP starts from every admissible split of the total terms between the two sides. States are processed in increasing committed length. Each update chooses survivor counts, multiplies by the corresponding binomial factors, reads the precomputed difference table, computes the next carry \((d+\delta)/10\), and adds the contribution to the state with length \(u+a+b\). After the sweep finishes, the answer is the sum of all completed states with length at most \(n\). The three implementations follow this same recurrence; the C++ version can additionally spread part of the preprocessing across multiple worker threads.
Complexity Analysis
The decisive fact is that \(|E|\le 50\) forces at most 25 total terms and only 51 possible carry values. The DP storage therefore has \((n+1)(26)(26)(51)\) slots; for \(n=50\) this is about \(1.76\times 10^6\) entries, and many of them are unreachable because \(a+b\le 25\).
The main work is the sweep over reachable states together with all survivor choices \((a',b')\) and the short precomputed lists of feasible differences in the matching residue class. Memory usage is linear in the number of stored states plus the finite family of difference tables. In practice this is fast because the algorithm never enumerates actual equation strings; it counts them indirectly through column sums, carries, and survivor sets.
Footnotes and References
- Problem page: Project Euler 990
- Positional notation: Wikipedia - Positional notation
- Carrying in arithmetic: Wikipedia - Carry (arithmetic)
- Generating functions: Wikipedia - Generating function
- Binomial coefficients: Wikipedia - Binomial coefficient
- Dynamic programming: Wikipedia - Dynamic programming