Problem 917: Minimal Path Using Additive Cost
View on Project EulerProject Euler Problem 917 Solution
EulerSolve provides an optimized solution for Project Euler Problem 917, Minimal Path Using Additive Cost, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary The sequence is defined by $s_1=102022661,\qquad s_{k+1}\equiv s_k^2 \pmod{998388889}.$ From it we take $a_i=s_{2i-1},\qquad b_i=s_{2i},$ so the cost of cell \((i,j)\) is $c_{i,j}=a_i+b_j.$ The goal is to minimize the sum of the visited cell values along a path from \((1,1)\) to \((n,n)\) using only right and down moves. A direct grid dynamic program would use \(n^2\) states, which is hopeless for the actual value \(n=10^7\). The implementations therefore exploit the additive form \(a_i+b_j\), rewrite the path by the columns where the downward moves occur, and then compress the remaining one-dimensional states with a geometric lower-hull argument. Mathematical Approach The solution has three layers. First, separate the part of the cost that every monotone path must pay. Second, encode a path by a nondecreasing sequence of column choices. Third, observe that the resulting dynamic program only needs the columns lying on the lower hull of the sampled graph \(t \mapsto b_{t+1}\). Separating the unavoidable base cost Let \(P(i,j)\) denote the minimum total cost of a path ending at \((i,j)\), including the value of that cell itself. The usual recurrence is $P(1,1)=a_1+b_1,$ $P(i,j)=a_i+b_j+\min\bigl(P(i,j-1),P(i-1,j)\bigr).$ Every monotone path from the top-left corner to the bottom-right corner visits each row at least once and each column at least once....
Detailed mathematical approach
Problem Summary
The sequence is defined by
$s_1=102022661,\qquad s_{k+1}\equiv s_k^2 \pmod{998388889}.$
From it we take
$a_i=s_{2i-1},\qquad b_i=s_{2i},$
so the cost of cell \((i,j)\) is
$c_{i,j}=a_i+b_j.$
The goal is to minimize the sum of the visited cell values along a path from \((1,1)\) to \((n,n)\) using only right and down moves. A direct grid dynamic program would use \(n^2\) states, which is hopeless for the actual value \(n=10^7\). The implementations therefore exploit the additive form \(a_i+b_j\), rewrite the path by the columns where the downward moves occur, and then compress the remaining one-dimensional states with a geometric lower-hull argument.
Mathematical Approach
The solution has three layers. First, separate the part of the cost that every monotone path must pay. Second, encode a path by a nondecreasing sequence of column choices. Third, observe that the resulting dynamic program only needs the columns lying on the lower hull of the sampled graph \(t \mapsto b_{t+1}\).
Separating the unavoidable base cost
Let \(P(i,j)\) denote the minimum total cost of a path ending at \((i,j)\), including the value of that cell itself. The usual recurrence is
$P(1,1)=a_1+b_1,$
$P(i,j)=a_i+b_j+\min\bigl(P(i,j-1),P(i-1,j)\bigr).$
Every monotone path from the top-left corner to the bottom-right corner visits each row at least once and each column at least once. That means the sum
$A_i=\sum_{r=1}^{i} a_r,\qquad B_j=\sum_{c=1}^{j} b_c$
contains an unavoidable one-time contribution from every visited row and column. Subtract it by defining
$D(i,j)=P(i,j)-A_i-B_j.$
Then \(D(1,1)=0\), and the recurrence becomes
$D(i,j)=\min\bigl(D(i,j-1)+a_i,\ D(i-1,j)+b_j\bigr).$
This form is important because it shows what is really being optimized. Moving right in row \(i\) repeats the row cost \(a_i\). Moving down in column \(j\) repeats the column cost \(b_j\). The final answer is therefore
$P(n,n)=A_n+B_n+D(n,n).$
Encoding a path by the columns of its downward moves
A path from \((1,1)\) to \((n,n)\) uses exactly \(n-1\) downward moves. Let \(t_i\) be the number of right moves already performed before the \(i\)-th downward move. Then
$0\le t_1\le t_2\le \cdots \le t_{n-1}\le n-1.$
This nondecreasing sequence determines the whole path. The \(i\)-th downward move occurs in column \(t_i+1\), so it contributes \(b_{t_i+1}\). If \(r_i\) is the number of right moves made in row \(i\), then
$r_1=t_1,\qquad r_i=t_i-t_{i-1}\quad(2\le i\le n-1),\qquad r_n=(n-1)-t_{n-1}.$
The extra cost beyond the base term is therefore
$D(n,n)=\sum_{i=1}^{n} r_i a_i+\sum_{i=1}^{n-1} b_{t_i+1}.$
Now telescope the row part:
$\sum_{i=1}^{n} r_i a_i=t_1a_1+\sum_{i=2}^{n-1}(t_i-t_{i-1})a_i+\bigl((n-1)-t_{n-1}\bigr)a_n,$
so
$\sum_{i=1}^{n} r_i a_i=(n-1)a_n+\sum_{i=1}^{n-1}(a_i-a_{i+1})t_i.$
Hence the optimization problem is exactly
$D(n,n)=(n-1)a_n+\sum_{i=1}^{n-1}\Bigl((a_i-a_{i+1})t_i+b_{t_i+1}\Bigr).$
The original grid problem has now been converted into a one-dimensional optimization over nondecreasing integer sequences.
A prefix-minimum dynamic program on the down-step sequence
Introduce
$u_i=a_i-a_{i+1},\qquad g(t)=b_{t+1}.$
Let \(F_i(t)\) be the minimum contribution of the first \(i\) downward moves under the condition that \(t_i=t\). Then
$F_i(t)=u_i t+g(t)+\min_{0\le s\le t}F_{i-1}(s),$
with the initial condition \(F_0(0)=0\) and \(F_0(t)=+\infty\) for \(t>0\). Once all \(n-1\) downward moves have been processed,
$D(n,n)=(n-1)a_n+\min_{0\le t\le n-1}F_{n-1}(t).$
The recurrence shows that the previous layer is needed only through the prefix-minimum function
$M_{i-1}(t)=\min_{0\le s\le t}F_{i-1}(s).$
If we evaluated this for every \(t\), we would still spend \(O(n^2)\) time. The crucial remaining question is why the implementations can keep only a much smaller subset of columns.
Why the lower hull of \((t,b_{t+1})\) is enough
Consider the sampled points
$\bigl(t,g(t)\bigr)=\bigl(t,b_{t+1}\bigr),\qquad 0\le t\le n-1.$
Let the vertices of their lower hull be
$0=h_0\lt h_1\lt \cdots \lt h_{m-1}\le n-1.$
They are built by a monotone scan. If three consecutive candidates \(t_1\lt t_2\lt t_3\) satisfy
$\bigl(t_2-t_1\bigr)\bigl(g(t_3)-g(t_1)\bigr)-\bigl(g(t_2)-g(t_1)\bigr)\bigl(t_3-t_1\bigr)\le 0,$
then \(t_2\) is removed, because its point lies on or above the segment joining the other two.
This hull is exactly what the fast solver needs. If a column \(t\) is not on the lower hull, then there exist hull vertices \(p\lt t\lt q\) and a number \(\lambda\in(0,1)\) with \(t=\lambda p+(1-\lambda)q\) such that
$g(t)\ge \lambda g(p)+(1-\lambda)g(q).$
Therefore, for every slope \(u\),
$u t+g(t)\ge \lambda\bigl(up+g(p)\bigr)+(1-\lambda)\bigl(uq+g(q)\bigr).$
So a non-hull column can never beat both neighboring hull columns for the tilted cost \(u t+g(t)\). Adding the linear term \(u t\) is an affine shear of the point set, and affine shears preserve which sampled columns lie on the lower hull.
Now apply this to the dynamic program. The prefix-minimum function \(M_{i-1}(t)\) is piecewise constant: it only changes when a new record low appears. If those record lows occur at hull columns, then on each interval where \(M_{i-1}\) is constant, minimizing
$F_i(t)=M_{i-1}(t)+u_i t+g(t)$
is the same as minimizing a constant plus a tilted copy of \(g\), so the minimum on that interval is again attained at a hull column. This gives an induction: every prefix minimum, every layer minimum, and in particular the final optimum can be read from hull columns alone.
Worked example: \(n=2\)
For \(n=2\), the generated values are
$a_1=102022661,\quad b_1=864751430,\quad a_2=661600260,\quad b_2=328965239.$
So the grid is
$\begin{pmatrix} a_1+b_1 & a_1+b_2\\ a_2+b_1 & a_2+b_2 \end{pmatrix} = \begin{pmatrix} 966774091 & 430987900\\ 1526351690 & 990565499 \end{pmatrix}.$
There is only one down-step parameter, \(t_1\). If \(t_1=0\), the path goes down first; if \(t_1=1\), it goes right first. The unavoidable base part is
$a_1+a_2+b_1+b_2=1957339590.$
The extra part is
$D(2,2)=a_2+\min\bigl(b_1,\ a_1-a_2+b_2\bigr),$
hence
$D(2,2)=661600260+\min(864751430,-230612360)=430987900.$
Therefore
$P(2,2)=1957339590+430987900=2388327490.$
This is the same checkpoint used by the implementations and it illustrates the full decomposition into base cost plus optimized extra cost.
How the Code Works
Sequence generation and base accumulation
The C++, Python, and Java implementations first generate the deterministic sequence by repeated modular squaring. From that stream they extract all \(a_i\) and \(b_i\), while simultaneously accumulating \(\sum a_i\) and \(\sum b_i\). If \(n=1\), the answer is just \(a_1+b_1\), so no hull or dynamic program is needed.
Lower-hull construction
For \(n\ge 2\), the implementation scans the sampled points \((t,b_{t+1})\) for \(0\le t\le n-1\) and keeps only the lower-hull columns. The geometric test is the cross-product inequality written above. This preprocessing is independent of the dynamic-programming layer, because each layer differs only by adding a linear term \(u_i t\), and that does not change which sampled columns are hull vertices.
Compressed dynamic program and verification
After the hull is known, the implementation stores the surviving column indices and their \(b\)-values, then runs the recurrence only on that compressed list. Each layer is processed from left to right while maintaining the running prefix minimum from the previous layer, so only two rolling arrays are needed. At the end, the best compressed state is combined with \((n-1)a_n\) and the base sum \(\sum a_i+\sum b_i\).
The C++ implementation also keeps the original quadratic grid DP for small-size validation. It checks explicit values for \(n=1\), \(n=2\), and \(n=10\), and it compares the fast and quadratic methods for all \(1\le n\le 120\). The Python and Java implementations keep only the fast solver, but they follow the same mathematics.
Complexity Analysis
Generating the sequence and the base sums costs \(O(n)\) time. Building the lower hull is another \(O(n)\) pass. If the hull contains \(m\) surviving columns, the compressed dynamic program takes \(O(nm)\) time, because there are \(n-1\) layers and each layer scans those \(m\) states once.
The implementations store the full arrays of \(a_i\) and \(b_i\) together with two rolling DP rows, so the memory usage is \(O(n+m)\). For comparison, the validation DP on the full grid needs \(O(n^2)\) time and \(O(n)\) memory with rolling rows. The entire point of the final method is to replace an impossible \(n^2\) grid computation at \(n=10^7\) by an \(O(n)\) preprocessing phase followed by a compressed one-dimensional DP.
Footnotes and References
- Project Euler Problem 917: https://projecteuler.net/problem=917
- Dynamic programming: Wikipedia - Dynamic programming
- Convex hull: Wikipedia - Convex hull
- Prefix sum: Wikipedia - Prefix sum
- Modular arithmetic: Wikipedia - Modular arithmetic
- Convex Hull Trick overview: cp-algorithms - Convex Hull Trick