Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 328: Lowest-cost Search

View on Project Euler

Project Euler Problem 328 Solution

EulerSolve provides an optimized solution for Project Euler Problem 328, Lowest-cost Search, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary An unknown integer lies in \([1,n]\). If we guess \(k\) and we are wrong, we pay a cost of \(k\), and we are only told whether the hidden number is smaller or larger. We then continue recursively on the surviving interval. Let \(C(n)\) be the minimum possible worst-case cost, and define $S(N)=\sum_{n=1}^{N}C(n).$ The task is to compute \(S(200000)\). Mathematical Approach 1) Canonical minimax interval DP. For a general interval \([l,r]\), define $F(l,r)=\min_{k\in[l,r]}\Bigl(k+\max(F(l,k-1),F(k+1,r))\Bigr),$ with base case $F(l,l)=0.$ The quantity we want is $C(n)=F(1,n).$ 2) Why the naive DP is hopeless. There are \(O(n^2)\) intervals, and for each one the naive transition tries all \(O(n)\) pivots. So the standard interval DP is cubic. That is completely infeasible for $n=200000.$ 3) Rewrite the first guess using the suffix length. Suppose the first guess for \([1,n]\) is \(k\). Write $m=n-k,$ so \(m\) is the length of the right suffix \([k+1,n]\). Then the left branch has length \(k-1=n-m-1\), so its cost is exactly $L_n(m)=k+C(k-1)=(n-m)+C(n-m-1).$ This part depends only on the already-known values \(C(1),C(2),\dots\). 4) The right branch is not translation-invariant, but it is almost affine. The right interval is \([k+1,n]\), not \([1,m]\)....

Detailed mathematical approach

Problem Summary

An unknown integer lies in \([1,n]\). If we guess \(k\) and we are wrong, we pay a cost of \(k\), and we are only told whether the hidden number is smaller or larger. We then continue recursively on the surviving interval. Let \(C(n)\) be the minimum possible worst-case cost, and define

$S(N)=\sum_{n=1}^{N}C(n).$

The task is to compute \(S(200000)\).

Mathematical Approach

1) Canonical minimax interval DP.

For a general interval \([l,r]\), define

$F(l,r)=\min_{k\in[l,r]}\Bigl(k+\max(F(l,k-1),F(k+1,r))\Bigr),$

with base case

$F(l,l)=0.$

The quantity we want is

$C(n)=F(1,n).$

2) Why the naive DP is hopeless.

There are \(O(n^2)\) intervals, and for each one the naive transition tries all \(O(n)\) pivots. So the standard interval DP is cubic. That is completely infeasible for

$n=200000.$

3) Rewrite the first guess using the suffix length.

Suppose the first guess for \([1,n]\) is \(k\). Write

$m=n-k,$

so \(m\) is the length of the right suffix \([k+1,n]\).

Then the left branch has length \(k-1=n-m-1\), so its cost is exactly

$L_n(m)=k+C(k-1)=(n-m)+C(n-m-1).$

This part depends only on the already-known values \(C(1),C(2),\dots\).

4) The right branch is not translation-invariant, but it is almost affine.

The right interval is \([k+1,n]\), not \([1,m]\). Every guess inside that interval is shifted upward by \(k=n-m\), so each wrong step on the right picks up an extra additive offset of \(k\).

The implementation exploits a structural fact: for a suffix of length \(m\), if

$d=\lfloor \log_2 m\rfloor,$

then the optimal right-branch envelope can be represented by two affine pieces in the shift \(k=n-m\):

$R_{n,d}^{(1)}(m)=(n-m)(d+1)+A(m),$

$R_{n,d}^{(2)}(m)=(n-m)d+B(m).$

Here \(A(m)\) and \(B(m)\) are precomputed one-dimensional arrays that summarize the additive constants for the two relevant depth layers.

5) Therefore each candidate pivot is evaluated by a max of three terms.

For fixed \(n\) and suffix length \(m\), the first-guess cost becomes

$\operatorname{Obj}(n,m)=\max\Bigl(L_n(m),R_{n,d}^{(1)}(m),R_{n,d}^{(2)}(m)\Bigr),$

where \(d=\lfloor\log_2 m\rfloor\).

So instead of solving a two-dimensional interval DP, the program only needs to minimize this objective over

$m=1,2,\dots,n-1.$

6) What the arrays \(A(m)\) and \(B(m)\) mean.

They encode the optimal worst-case additive costs of the right suffix after factoring out the linear dependence on the global shift \(k=n-m\). The code fills them in increasing order of \(m\), using the fact that intervals with the same

$d=\lfloor\log_2 m\rfloor$

belong to one depth band

$2^d\le m\le 2^{d+1}-1.$

Inside one band, the candidate formulas become simple linear transforms of \(A(m)\) and \(B(m)\), which makes range-minimum preprocessing possible.

7) Why sparse tables appear.

For each depth band \(d\), the code stores

$A(m)-m(d+1)\quad\text{and}\quad B(m)-md$

in sparse tables. That allows fast RMQ queries inside any subrange of that band, so the best candidate of each affine family can be recovered in

$O(1)$

time after preprocessing.

8) Why only a few candidates per band are tested.

The code also uses monotonicity tests. Define the left cost

$L_n(m)=(n-m)+C(n-m-1),$

and compare it with the right envelope. Their crossings move monotonically as \(m\) changes, so once the crossing location is bracketed, only a tiny neighborhood around that point can contain the minimizer. That is why the implementation checks only a handful of values near band boundaries, RMQ minima, and crossing points.

9) Small exact validation.

The program still computes the exact cubic interval DP for

$n\le 200$

and asserts that the accelerated values match it. This is the correctness safety net for the optimization.

10) Checkpoints.

The code validates

$C(1)=0,\qquad C(2)=1,\qquad C(3)=2,\qquad C(8)=12,\qquad C(100)=400,$

and also

$\sum_{n=1}^{100}C(n)=17575.$

These are strong consistency checks for both the exact and accelerated paths.

Algorithm

1) Build the helper envelopes \(A(m)\) and \(B(m)\) for all \(m\le N\).

2) For each depth band, build sparse-table RMQ structures.

3) Compute \(C(n)\) for \(n=2,3,\dots,N\) by scanning depth bands and checking only the relevant local candidates.

4) Accumulate the running sum \(S(N)\).

Complexity Analysis

The optimized method runs in roughly

$O(N\log N)$

time with linear arrays plus RMQ tables. This replaces the impossible cubic interval DP by a method that is fast enough for

$N=200000.$

Checks And Final Result

The checkpoints are

$C(1)=0,\quad C(2)=1,\quad C(3)=2,\quad C(8)=12,\quad C(100)=400,$

$\sum_{n=1}^{100}C(n)=17575.$

The final answer is

$S(200000)=260511850222.$

Further Reading

  1. Problem page: https://projecteuler.net/problem=328
  2. Minimax decision trees: https://en.wikipedia.org/wiki/Minimax
  3. Sparse table RMQ: https://cp-algorithms.com/data_structures/sparse-table.html

Mathematical approach · C++ solution · Python solution · Java solution

Previous: Problem 327 · All Project Euler solutions · Next: Problem 329

Need help with a problem? Ask me! 💡
e
✦ Euler GLM 5.2
Hi! I'm Euler. Ask me anything about Project Euler problems, math concepts, or solution approaches! 🧮