Problem 502: Counting Castles
View on Project EulerProject Euler Problem 502 Solution
EulerSolve provides an optimized solution for Project Euler Problem 502, Counting Castles, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary For each pair \((w,h)\), we must count castles of width \(w\) and exact height \(h\) that satisfy the parity condition built into the problem statement, with every result taken modulo \(10^9+7\). The final answer is the sum of three such evaluations. Direct enumeration is hopeless, so the implementation turns the construction into coefficient extraction from rational generating functions. Mathematical Approach The implementation works with cumulative counts up to a height bound and only at the end converts them into the exact-height quantity required by the problem. Step 1: Encode One Height Step by a Two-State Transfer Let \(x\) mark the width. The castle decomposition used by the implementation can be summarized by a two-state transfer, and there are two versions of that transfer: $M_{+}(x)=\begin{pmatrix}1+x & x\\ -x & 1-x\end{pmatrix},\qquad M_{-}(x)=\begin{pmatrix}-(1+x) & -x\\ -x & 1-x\end{pmatrix}.$ The first matrix counts every admissible object positively. The second matrix attaches a sign so that averaging the two channels later will keep only the even-parity contribution required by the problem....
Detailed mathematical approach
Problem Summary
For each pair \((w,h)\), we must count castles of width \(w\) and exact height \(h\) that satisfy the parity condition built into the problem statement, with every result taken modulo \(10^9+7\). The final answer is the sum of three such evaluations. Direct enumeration is hopeless, so the implementation turns the construction into coefficient extraction from rational generating functions.
Mathematical Approach
The implementation works with cumulative counts up to a height bound and only at the end converts them into the exact-height quantity required by the problem.
Step 1: Encode One Height Step by a Two-State Transfer
Let \(x\) mark the width. The castle decomposition used by the implementation can be summarized by a two-state transfer, and there are two versions of that transfer:
$M_{+}(x)=\begin{pmatrix}1+x & x\\ -x & 1-x\end{pmatrix},\qquad M_{-}(x)=\begin{pmatrix}-(1+x) & -x\\ -x & 1-x\end{pmatrix}.$
The first matrix counts every admissible object positively. The second matrix attaches a sign so that averaging the two channels later will keep only the even-parity contribution required by the problem.
Step 2: Raise the Transfer to Height \(h\)
Starting from the base vector
$v_0=\binom{0}{1},$
the effect of allowing height at most \(h\) is obtained from
$M_{\pm}(x)^h v_0=\binom{P_h^{(\pm)}(x)}{Q_h^{(\pm)}(x)}.$
The corresponding cumulative generating functions are
$R_h^{(+)}(x)=\frac{P_h^{(+)}(x)}{Q_h^{(+)}(x)},\qquad R_h^{(-)}(x)=\frac{P_h^{(-)}(x)}{Q_h^{(-)}(x)}.$
If we write
$S_h(w)=[x^w]R_h^{(+)}(x),\qquad D_h(w)=[x^w]R_h^{(-)}(x),$
then \(S_h(w)\) and \(D_h(w)\) are the two cumulative counts for width \(w\) and height at most \(h\).
Step 3: Project to the Required Parity and to Exact Height
The signed channel is designed so that odd-parity objects cancel and even-parity objects survive. Therefore the cumulative even count is
$E_h(w)=\frac{S_h(w)+D_h(w)}{2}\pmod{10^9+7}.$
But the problem asks for exact height \(h\), not height at most \(h\). So we subtract the previous cumulative level:
$F(w,h)=E_h(w)-E_{h-1}(w)\pmod{10^9+7}.$
This is the quantity reported by the implementation for each query.
Step 4: Large Width Uses Bostan-Mori Coefficient Extraction
Suppose we need one coefficient of a rational function \(P(x)/Q(x)\), namely \([x^k]P(x)/Q(x)\). For very large \(w\), the implementation avoids expanding the whole series. It defines
$\widetilde{Q}(x)=Q(-x),\qquad U(x)=P(x)\widetilde{Q}(x),\qquad V(x)=Q(x)\widetilde{Q}(x).$
The polynomial \(V(x)\) is even, so only even powers remain in the denominator. If \(k=2m\), we keep the even coefficients of \(U\); if \(k=2m+1\), we keep the odd coefficients. After that selection, the target coefficient becomes a coefficient of index \(m\) in a new rational function. Repeating this halves the index again and again until only a constant term remains. This is the Bostan-Mori step used when width is enormous and height is relatively small.
Step 5: Small Width Uses Truncated Formal Power Series
When \(w\) is moderate, only coefficients up to degree \(w\) matter. In that regime every polynomial multiplication is truncated modulo \(x^{w+1}\), so the matrix power never grows beyond degree \(w\). Once \(P(x)\) and \(Q(x)\) are known up to that degree, the implementation computes the inverse series of \(Q(x)\):
$Q(x)^{-1}\equiv I(x)\pmod{x^{w+1}},\qquad I(x)=\sum_{n=0}^{w} i_n x^n.$
Because the constant term of \(Q(x)\) is nonzero, the coefficients of \(I(x)\) are determined recursively by
$i_0=q_0^{-1},\qquad i_n=-q_0^{-1}\sum_{j=1}^{n} q_j\,i_{n-j}\pmod{10^9+7}.$
Then
$[x^w]\frac{P(x)}{Q(x)}=[x^w]\bigl(P(x)I(x)\bigr),$
so the desired count is obtained without ever expanding beyond the needed degree.
Worked Example: \((w,h)=(4,2)\)
The implementation validates itself on the checkpoint \(F(4,2)=10\). For height \(2\), the positive channel gives
$M_{+}(x)^2 v_0=\binom{2x}{1-2x},\qquad R_2^{(+)}(x)=\frac{2x}{1-2x}=2x+4x^2+8x^3+16x^4+\cdots.$
The signed channel gives
$M_{-}(x)^2 v_0=\binom{2x^2}{1-2x+2x^2},\qquad R_2^{(-)}(x)=\frac{2x^2}{1-2x+2x^2}=2x^2+4x^3+4x^4+\cdots.$
Therefore
$S_2(4)=16,\qquad D_2(4)=4,\qquad E_2(4)=\frac{16+4}{2}=10.$
For height \(1\), the two channels are
$R_1^{(+)}(x)=\frac{x}{1-x},\qquad R_1^{(-)}(x)=-\frac{x}{1-x},$
so their average contributes \(0\) at every width. Hence
$F(4,2)=E_2(4)-E_1(4)=10-0=10,$
exactly as required.
Step 6: Final Composition for the Whole Problem
After defining \(F(w,h)\) as above, the requested answer is
$F(10^{12},100)+F(10^4,10^4)+F(100,10^{12})\pmod{10^9+7}.$
The entire job is therefore reduced to evaluating the same exact-height routine three times.
How the Code Works
The C++, Python, and Java implementations all follow the same mathematical pipeline. For a given pair \((w,h)\), they compute the cumulative count for height at most \(h\) in the positive channel and in the signed channel, repeat the same computation for height at most \(h-1\), average the two channel results with the modular inverse of \(2\), and subtract the previous level to isolate exact height.
Inside one channel, the implementation first raises the relevant \(2\times 2\) polynomial matrix to the required height by binary exponentiation. It then extracts the coefficient of \(x^w\) from the resulting rational function. When width is huge, it uses repeated Bostan-Mori halving; when width is smaller, it truncates every polynomial to degree \(w\), computes a formal inverse for the denominator, and reads the target coefficient directly from the truncated product.
The Python implementation is only a thin launcher around the same arithmetic, while the C++ and Java implementations carry out the polynomial and matrix operations directly. The mathematical result is identical in all three languages.
Complexity Analysis
Let \(M(n)\) denote the cost of multiplying degree-\(n\) polynomials. In these implementations the underlying polynomial multiplication is quadratic, so \(M(n)=O(n^2)\).
In the Bostan-Mori branch, the matrix power for height \(h\) produces polynomials of degree \(O(h)\). The cost of building the rational function is therefore \(O(h^2\log h)\), and the repeated coefficient-halving stage adds \(O(h^2\log w)\). Memory usage is \(O(h)\).
In the truncated branch, all polynomial degrees are clipped at \(w\). Binary exponentiation of the matrix then costs \(O(w^2\log h)\), the inverse-series recursion costs \(O(w^2)\), and memory usage is \(O(w)\). The implementation switches between the two branches because the three required queries live in very different parameter regimes.
Footnotes and References
- Problem page: https://projecteuler.net/problem=502
- Generating function: Wikipedia - Generating function
- Formal power series: Wikipedia - Formal power series
- Polynomial coefficient extraction and Bostan-Mori style reduction: cp-algorithms - Polynomial operations