Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 149: Searching for a Maximum-Sum Subsequence

View on Project Euler

Project Euler Problem 149 Solution

EulerSolve provides an optimized solution for Project Euler Problem 149, Searching for a Maximum-Sum Subsequence, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary A \(2000 \times 2000\) table is filled row by row with a pseudo-random sequence \(s_1,s_2,\dots,s_{4{,}000{,}000}\). Among every row, every column, and both diagonal families, we must find the contiguous segment whose sum is as large as possible. The difficulty is not generating the table; it is that the search space of all contiguous segments is enormous if approached naively. The crucial simplification is that each admissible segment lies on a single one-dimensional line, and each such line can be solved optimally by the standard maximum-subarray recurrence. Mathematical Approach Let \(n=2000\). The implementations first generate the centered lagged sequence, reinterpret it as an \(n \times n\) matrix, and then reduce the entire problem to repeated one-dimensional scans. The Lagged Sequence and the Grid For \(1 \le k \le 55\), the values are defined by $s_k=\left(100003-200003k+300007k^3 \bmod 10^6\right)-500000.$ For \(56 \le k \le n^2\), the recurrence becomes $s_k=\left(s_{k-24}+s_{k-55}+10^6 \bmod 10^6\right)-500000.$ After the reduction modulo \(10^6\), every entry lies in the interval \([-500000,499999]\)....

Detailed mathematical approach

Problem Summary

A \(2000 \times 2000\) table is filled row by row with a pseudo-random sequence \(s_1,s_2,\dots,s_{4{,}000{,}000}\). Among every row, every column, and both diagonal families, we must find the contiguous segment whose sum is as large as possible.

The difficulty is not generating the table; it is that the search space of all contiguous segments is enormous if approached naively. The crucial simplification is that each admissible segment lies on a single one-dimensional line, and each such line can be solved optimally by the standard maximum-subarray recurrence.

Mathematical Approach

Let \(n=2000\). The implementations first generate the centered lagged sequence, reinterpret it as an \(n \times n\) matrix, and then reduce the entire problem to repeated one-dimensional scans.

The Lagged Sequence and the Grid

For \(1 \le k \le 55\), the values are defined by

$s_k=\left(100003-200003k+300007k^3 \bmod 10^6\right)-500000.$

For \(56 \le k \le n^2\), the recurrence becomes

$s_k=\left(s_{k-24}+s_{k-55}+10^6 \bmod 10^6\right)-500000.$

After the reduction modulo \(10^6\), every entry lies in the interval \([-500000,499999]\). The matrix is then filled in row-major order:

$A_{i,j}=s_{(i-1)n+j}\qquad (1 \le i,j \le n).$

So the problem is really about the matrix \(A\), but the generator matters because it tells us there are exactly \(4{,}000{,}000\) values and that each cell can be addressed directly through one linear index.

Reducing the Two-Dimensional Search to Four Families of Lines

Any valid subsequence must be contiguous inside one row, one column, one down-right diagonal, or one down-left diagonal. It never bends, branches, or skips entries. That lets us define four line families:

$R_i=(A_{i,1},A_{i,2},\dots,A_{i,n}),\qquad 1 \le i \le n,$

$C_j=(A_{1,j},A_{2,j},\dots,A_{n,j}),\qquad 1 \le j \le n,$

$D_p=(A_{i,j}: j-i=p),\qquad -(n-1) \le p \le n-1,$

$E_q=(A_{i,j}: i+j=q),\qquad 2 \le q \le 2n.$

If \(M(L)\) denotes the maximum sum of a contiguous block inside a line \(L\), then the final answer is

$\max\bigl(\{M(R_i)\}\cup\{M(C_j)\}\cup\{M(D_p)\}\cup\{M(E_q)\}\bigr).$

The two-dimensional part of the problem is therefore only geometric bookkeeping: identify every relevant line exactly once, then solve the same one-dimensional optimization on each of them.

The Key Recurrence for One Line

Take any scanned line \(x_1,x_2,\dots,x_m\). Define \(B_t\) as the maximum sum of a contiguous segment that must end at position \(t\). Then

$B_1=x_1,\qquad B_t=\max(x_t,\;B_{t-1}+x_t)\quad (t \ge 2).$

This recurrence is exact because an optimal segment ending at \(t\) has only two possibilities: either it starts fresh at \(t\), or it extends the best segment that already ended at \(t-1\). No third case exists.

If we also define

$G_t=\max(G_{t-1},B_t),$

then \(G_t\) is the best contiguous sum seen in the prefix \(x_1,\dots,x_t\). By the time the line is finished, \(G_m=M(L)\). This is Kadane's algorithm in its dynamic-programming form, using only constant extra memory.

Why the Diagonal Traversals Cover Everything Exactly Once

Rows and columns are obvious, but the diagonal families need slightly more care. Every down-right diagonal is determined by the difference \(j-i\), so there are \(2n-1\) of them. The implementation starts these lines from the top edge \((1,c)\) for all \(c\), then from the left edge \((r,1)\) for \(r \ge 2\). That lists every constant-\(j-i\) diagonal once, with no duplicate start point.

Similarly, every down-left diagonal is determined by the sum \(i+j\). The implementation starts them from the top edge \((1,c)\) for all \(c\), then from the right edge \((r,n)\) for \(r \ge 2\). Again we get exactly one start point for each diagonal.

The length pattern of either diagonal family is

$1,2,\dots,n-1,n,n-1,\dots,2,1,$

whose sum is \(n^2\). So each diagonal family touches exactly \(n^2\) matrix entries in total, just as the row scan and the column scan do.

Worked Example on a Single Scanned Line

Suppose one row or diagonal contains the values

$L=(-8,3,5,-2,4,-10,6,7).$

Then the best sum ending at each position is

$B=(-8,3,8,6,10,0,6,13).$

The interpretation is the whole point of the recurrence. At the sixth entry, for example, the best segment ending there has sum \(0\), because extending the earlier best segment gives \(10-10=0\), which is better than starting a new segment at \(-10\). At the eighth entry, the recurrence reaches \(13\), so the best contiguous block in this line is \(6+7=13\).

The real problem simply repeats this computation across every row, every column, and every diagonal generated from the \(2000 \times 2000\) table.

How the Code Works

Generate and Store the Sequence

The C++, Python, and Java implementations first allocate a one-dimensional container of length \(4{,}000{,}001\) so that the lagged sequence can be stored with convenient 1-based indexing. The first 55 terms use the cubic formula, and all later terms use the \((24,55)\)-lag recurrence. Matrix entry \((r,c)\) is then read directly from the row-major position \(rn+c+1\) in zero-based loop coordinates.

Sweep the Four Direction Families

After generation, the implementation performs four independent sweeps: all rows, all columns, all down-right diagonals, and all down-left diagonals. Before each new line, it resets the current line-ending optimum to a very negative sentinel, then updates that value with the recurrence \( \max(v,\text{current}+v) \) as each new matrix element \(v\) is visited.

The diagonal loops are written by boundary starts rather than by building explicit diagonal arrays. That keeps the code simple: each line is consumed in place, immediately feeding the maximum-subarray update.

Maintain a Global Maximum

While scanning a line, the implementation also updates a global best value. Once every family has been processed, that global best is exactly the answer, because every admissible contiguous subsequence has appeared in one and only one of those scans.

The three language versions use the same mathematics and the same traversal order; they differ only in syntax and container types.

Complexity Analysis

Let \(n=2000\). Sequence generation costs \(n^2\) updates. The row sweep examines \(n^2\) entries, the column sweep examines another \(n^2\), and the two diagonal families contribute \(n^2\) entries each. So the total work is

$n^2+n^2+n^2+n^2+n^2=5n^2=20{,}000{,}000$

basic value updates, which is \(O(n^2)\).

Space usage is \(O(n^2)\) in the given implementations because they store the whole table through the generated sequence. The running state for Kadane's recurrence itself is only \(O(1)\). Also, since each entry is at most \(499999\) and each line has length at most \(2000\), any line sum stays below \(2000 \cdot 499999 < 10^9\), so fixed-width integer arithmetic is sufficient for the compiled versions.

Footnotes and References

  1. Problem page: Project Euler 149
  2. Maximum subarray problem: Wikipedia - Maximum subarray problem
  3. Kadane's algorithm overview: GeeksforGeeks - Largest Sum Contiguous Subarray
  4. Lagged Fibonacci generator: Wikipedia - Lagged Fibonacci generator
  5. Diagonal families in a matrix: Wikipedia - Main diagonal

Mathematical approach · C++ solution · Python solution · Java solution

Previous: Problem 148 · All Project Euler solutions · Next: Problem 150

Need help with a problem? Ask me! 💡
e
✦ Euler GLM 5.2
Hi! I'm Euler. Ask me anything about Project Euler problems, math concepts, or solution approaches! 🧮