Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 328: Lowest-cost Search

Project Euler Problem 328 Solution

EulerSolve provides an optimized solution for Project Euler Problem 328, Lowest-cost Search, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary An unknown integer lies in $[1,n]$. If we guess $k$ and we are wrong, we pay a cost of $k$, and we are only told whether the hidden number is smaller or larger. We then continue recursively on the surviving interval. Let $C(n)$ be the minimum possible worst-case cost, and define $S(N)=\sum_{n=1}^{N}C(n).$ The task is to compute $S(200000)$. Mathematical Approach 1) Canonical minimax interval DP. For a general interval $[l,r]$, define $F(l,r)=\min_{k\in[l,r]}\Bigl(k+\max(F(l,k-1),F(k+1,r))\Bigr),$ with base case $F(l,l)=0.$ The quantity we want is $C(n)=F(1,n).$ 2) Why the naive DP is hopeless. There are $O(n^2)$ intervals, and for each one the naive transition tries all $O(n)$ pivots. So the standard interval DP is cubic. That is completely infeasible for $n=200000.$ 3) Rewrite the first guess using the suffix length. Suppose the first guess for $[1,n]$ is $k$. Write $m=n-k,$ so $m$ is the length of the right suffix $[k+1,n]$. Then the left branch has length $k-1=n-m-1$, so its cost is exactly $L_n(m)=k+C(k-1)=(n-m)+C(n-m-1).$ This part depends only on the already-known values $C(1),C(2),\dots$. 4) The right branch is not translation-invariant, but it is almost affine. The right interval is $[k+1,n]$, not $[1,m]$....

Detailed mathematical approach

Problem Summary

An unknown integer lies in $[1,n]$. If we guess $k$ and we are wrong, we pay a cost of $k$, and we are only told whether the hidden number is smaller or larger. We then continue recursively on the surviving interval. Let $C(n)$ be the minimum possible worst-case cost, and define

$S(N)=\sum_{n=1}^{N}C(n).$

The task is to compute $S(200000)$.

Mathematical Approach

1) Canonical minimax interval DP.

For a general interval $[l,r]$, define

$F(l,r)=\min_{k\in[l,r]}\Bigl(k+\max(F(l,k-1),F(k+1,r))\Bigr),$

with base case

$F(l,l)=0.$

The quantity we want is

$C(n)=F(1,n).$

2) Why the naive DP is hopeless.

There are $O(n^2)$ intervals, and for each one the naive transition tries all $O(n)$ pivots. So the standard interval DP is cubic. That is completely infeasible for

$n=200000.$

3) Rewrite the first guess using the suffix length.

Suppose the first guess for $[1,n]$ is $k$. Write

$m=n-k,$

so $m$ is the length of the right suffix $[k+1,n]$.

Then the left branch has length $k-1=n-m-1$, so its cost is exactly

$L_n(m)=k+C(k-1)=(n-m)+C(n-m-1).$

This part depends only on the already-known values $C(1),C(2),\dots$.

4) The right branch is not translation-invariant, but it is almost affine.

The right interval is $[k+1,n]$, not $[1,m]$. Every guess inside that interval is shifted upward by $k=n-m$, so each wrong step on the right picks up an extra additive offset of $k$.

The implementation exploits a structural fact: for a suffix of length $m$, if

$d=\lfloor \log_2 m\rfloor,$

then the optimal right-branch envelope can be represented by two affine pieces in the shift $k=n-m$:

$R_{n,d}^{(1)}(m)=(n-m)(d+1)+A(m),$

$R_{n,d}^{(2)}(m)=(n-m)d+B(m).$

Here $A(m)$ and $B(m)$ are precomputed one-dimensional arrays that summarize the additive constants for the two relevant depth layers.

5) Therefore each candidate pivot is evaluated by a max of three terms.

For fixed $n$ and suffix length $m$, the first-guess cost becomes

$\operatorname{Obj}(n,m)=\max\Bigl(L_n(m),R_{n,d}^{(1)}(m),R_{n,d}^{(2)}(m)\Bigr),$

where $d=\lfloor\log_2 m\rfloor$.

So instead of solving a two-dimensional interval DP, the program only needs to minimize this objective over

$m=1,2,\dots,n-1.$

6) What the arrays $A(m)$ and $B(m)$ mean.

They encode the optimal worst-case additive costs of the right suffix after factoring out the linear dependence on the global shift $k=n-m$. The code fills them in increasing order of $m$, using the fact that intervals with the same

$d=\lfloor\log_2 m\rfloor$

belong to one depth band

$2^d\le m\le 2^{d+1}-1.$

Inside one band, the candidate formulas become simple linear transforms of $A(m)$ and $B(m)$, which makes range-minimum preprocessing possible.

7) Why sparse tables appear.

For each depth band $d$, the code stores

$A(m)-m(d+1)\quad\text{and}\quad B(m)-md$

in sparse tables. That allows fast RMQ queries inside any subrange of that band, so the best candidate of each affine family can be recovered in

$O(1)$

time after preprocessing.

8) Why only a few candidates per band are tested.

The code also uses monotonicity tests. Define the left cost

$L_n(m)=(n-m)+C(n-m-1),$

and compare it with the right envelope. Their crossings move monotonically as $m$ changes, so once the crossing location is bracketed, only a tiny neighborhood around that point can contain the minimizer. That is why the implementation checks only a handful of values near band boundaries, RMQ minima, and crossing points.

9) Small exact validation.

The program still computes the exact cubic interval DP for

$n\le 200$

and asserts that the accelerated values match it. This is the correctness safety net for the optimization.

10) Checkpoints.

The code validates

$C(1)=0,\qquad C(2)=1,\qquad C(3)=2,\qquad C(8)=12,\qquad C(100)=400,$

and also

$\sum_{n=1}^{100}C(n)=17575.$

These are strong consistency checks for both the exact and accelerated paths.

Algorithm

1) Build the helper envelopes $A(m)$ and $B(m)$ for all $m\le N$.

2) For each depth band, build sparse-table RMQ structures.

3) Compute $C(n)$ for $n=2,3,\dots,N$ by scanning depth bands and checking only the relevant local candidates.

4) Accumulate the running sum $S(N)$.

Complexity Analysis

The optimized method runs in roughly

$O(N\log N)$

time with linear arrays plus RMQ tables. This replaces the impossible cubic interval DP by a method that is fast enough for

$N=200000.$

Checks And Final Result

The checkpoints are

$C(1)=0,\quad C(2)=1,\quad C(3)=2,\quad C(8)=12,\quad C(100)=400,$