Problem 866: Tidying Up B
View on Project EulerProject Euler Problem 866 Solution
EulerSolve provides an optimized solution for Project Euler Problem 866, Tidying Up B, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary The task is to evaluate \(G(100)\) modulo $M=987654319.$ For exposition, write \(T_n=G(n)\). The implementations use the base value \(T_0=1\) and, for every \(n\ge 1\), the recurrence $T_n=\frac{n(2n-1)}{n}\sum_{i=0}^{n-1} T_i T_{n-1-i}=(2n-1)\sum_{i=0}^{n-1} T_i T_{n-1-i}.$ This is a weighted Catalan-type convolution. Every new term is built from all ordered splits of \(n-1\) into a left part of size \(i\) and a right part of size \(n-1-i\). A direct recursive evaluation would repeat the same smaller terms many times, so the practical method is a bottom-up dynamic program performed modulo \(M\). Mathematical Approach The key observation is that the entire problem is encoded by one quadratic recurrence. Once that recurrence is rewritten clearly, the rest is coefficient extraction, dynamic programming, and modular arithmetic. Step 1: Put the recurrence into a usable form The formula used by the implementations contains a factor \(n(2n-1)\) followed by division by \(n\). Algebraically, the \(n\) cancels immediately, so the mathematical recurrence is simply $T_n=(2n-1)\sum_{i=0}^{n-1} T_i T_{n-1-i}\qquad (n\ge 1),$ with \(T_0=1\). The factor \(2n-1\) depends only on the current level \(n\)....
Detailed mathematical approach
Problem Summary
The task is to evaluate \(G(100)\) modulo
$M=987654319.$
For exposition, write \(T_n=G(n)\). The implementations use the base value \(T_0=1\) and, for every \(n\ge 1\), the recurrence
$T_n=\frac{n(2n-1)}{n}\sum_{i=0}^{n-1} T_i T_{n-1-i}=(2n-1)\sum_{i=0}^{n-1} T_i T_{n-1-i}.$
This is a weighted Catalan-type convolution. Every new term is built from all ordered splits of \(n-1\) into a left part of size \(i\) and a right part of size \(n-1-i\). A direct recursive evaluation would repeat the same smaller terms many times, so the practical method is a bottom-up dynamic program performed modulo \(M\).
Mathematical Approach
The key observation is that the entire problem is encoded by one quadratic recurrence. Once that recurrence is rewritten clearly, the rest is coefficient extraction, dynamic programming, and modular arithmetic.
Step 1: Put the recurrence into a usable form
The formula used by the implementations contains a factor \(n(2n-1)\) followed by division by \(n\). Algebraically, the \(n\) cancels immediately, so the mathematical recurrence is simply
$T_n=(2n-1)\sum_{i=0}^{n-1} T_i T_{n-1-i}\qquad (n\ge 1),$
with \(T_0=1\).
The factor \(2n-1\) depends only on the current level \(n\). Everything structurally interesting happens inside the convolution
$T_0T_{n-1}+T_1T_{n-2}+\cdots+T_{n-1}T_0.$
Step 2: Recognize the Cauchy product
Define the generating function
$F(x)=\sum_{n\ge 0} T_n x^n.$
Then the square \(F(x)^2\) has coefficients
$[x^m]F(x)^2=\sum_{i=0}^{m} T_i T_{m-i}.$
Therefore the inner sum in the recurrence is exactly the coefficient of \(x^{n-1}\) in \(F(x)^2\):
$T_n=(2n-1)[x^{n-1}]F(x)^2.$
This explains why the recurrence is a textbook Cauchy-product convolution and why only previously computed terms are needed.
Step 3: Derive a generating-function identity
Substitute the coefficient formula back into the series for \(F(x)\):
$F(x)-1=\sum_{n\ge 1} T_n x^n=x\sum_{m\ge 0}(2m+1)[x^m]F(x)^2\,x^m.$
If we write
$F(x)^2=\sum_{m\ge 0} c_m x^m,$
then
$\sum_{m\ge 0}(2m+1)c_m x^m=F(x)^2+2x\bigl(F(x)^2\bigr)'.$
Hence
$F(x)-1=xF(x)^2+2x^2\bigl(F(x)^2\bigr)'=xF(x)^2+4x^2F(x)F'(x).$
This differential identity is not required to compute the answer, but it confirms that the sequence is governed entirely by the weighted square of its own generating function.
Step 4: Turn the recurrence into dynamic programming
Because \(T_n\) depends only on \(T_0,T_1,\dots,T_{n-1}\), the values can be computed in increasing order of \(n\). For a fixed \(n\), the convolution contains exactly \(n\) products. Once \(T_n\) has been stored, it can be reused by every later term. This turns an exponentially branching recursive definition into a quadratic-time table fill.
Step 5: Worked example
The first few terms show the pattern clearly:
$T_1=1\cdot(T_0T_0)=1.$
$T_2=3\cdot(T_0T_1+T_1T_0)=3(1+1)=6.$
$T_3=5\cdot(T_0T_2+T_1T_1+T_2T_0)=5(6+1+6)=65.$
$T_4=7\cdot(T_0T_3+T_1T_2+T_2T_1+T_3T_0)=7(65+6+6+65)=994.$
The value \(T_4=994\) is the natural checkpoint for the recurrence and matches the verification built into the implementations.
Step 6: Interpret the modular division correctly
Mathematically, one may multiply by \(2n-1\) directly. The implementations, however, keep the factorized form \(n(2n-1)\) and then divide by \(n\) modulo \(M\). That division is carried out as multiplication by the modular inverse of \(n\):
$T_n\equiv n(2n-1)\left(\sum_{i=0}^{n-1} T_i T_{n-1-i}\right)n^{-1}\pmod{M}.$
For the required range \(1\le n\le 100\), those inverses exist modulo \(M\), so the modular computation is exactly equivalent to the simplified recurrence.
How the Code Works
The C++, Python, and Java implementations all follow the same numerical plan. They allocate a one-dimensional table of length \(101\), place the base value \(1\) in the first slot, and then iterate upward from \(n=1\) to \(n=100\).
At each step, the implementation runs one inner loop over all split positions \(i\). This produces the convolution sum
$\sum_{i=0}^{n-1} T_i T_{n-1-i}\pmod{M}.$
After that, it applies the weight \(n(2n-1)\) and multiplies by the modular inverse of \(n\), which is mathematically the same as multiplying by \(2n-1\).
The three language versions differ only in low-level arithmetic details. The C++ and Java implementations compute modular inverses with the extended Euclidean algorithm, while the Python implementation uses modular exponentiation. All three methods produce the same residue class modulo \(M\).
Complexity Analysis
Let \(N\) be the target index. The recurrence for \(T_n\) uses exactly \(n\) products, so the total work is
$\sum_{n=1}^{N} n=\frac{N(N+1)}{2}=O(N^2).$
The modular inverse step costs only \(O(\log M)\) per level, so across all levels it contributes \(O(N\log M)\), which is lower order than the quadratic convolution. The memory usage is \(O(N)\), because only the already computed sequence values need to be stored.
Footnotes and References
- Problem page: Project Euler 866
- Cauchy product: Wikipedia - Cauchy product
- Generating function: Wikipedia - Generating function
- Catalan numbers: Wikipedia - Catalan number
- Modular multiplicative inverse: Wikipedia - Modular multiplicative inverse