Problem 887: Bounded Binary Search
View on Project EulerProject Euler Problem 887 Solution
EulerSolve provides an optimized solution for Project Euler Problem 887, Bounded Binary Search, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary For the bound \(L=7^{10}\), the bounded binary-search model is encoded by a capped capacity function \(T(m,d)\). Here \(m\) is the remaining number of search steps and \(d\) is the current bound parameter. For each fixed \(d\), the quantity of interest is the smallest \(m\) that lets the search distinguish at least \(N\) possibilities. Define that inverse threshold by $R_d(N)=\min\{m\ge 0: T(m,d)\ge N\}.$ The required total is $A=\sum_{N=1}^{L}R_0(N)+\sum_{d=1}^{7}\sum_{N=1}^{L}R_d(N),\qquad L=7^{10}.$ The difficulty is that \(T(m,d)\) is recursive, but the implementations exploit three facts: every value is capped at \(L\), the sequence is monotone in \(m\), and the inverse thresholds can be summed by jump intervals rather than one query at a time. Mathematical Approach The implementations work with the following capped recurrence: $T(m,d)=\min\left(L,\ T(m-1,d-1)+T\left(m-1,\ T(m-1,d-1)+d-1\right)\right),$ together with boundary rules that make the recursion finite and allow an efficient inversion. Step 1: Define the Capped Search Capacity The boundary conditions are $T(0,d)=1\quad(d\ge -1),\qquad T(m,-1)=1,\qquad T(m,d)=0\quad(d<-1).$ These rules say that once there is no remaining search depth, or once the parameter has reached the terminal boundary \(-1\), only one outcome is still distinguishable....
Detailed mathematical approach
Problem Summary
For the bound \(L=7^{10}\), the bounded binary-search model is encoded by a capped capacity function \(T(m,d)\). Here \(m\) is the remaining number of search steps and \(d\) is the current bound parameter. For each fixed \(d\), the quantity of interest is the smallest \(m\) that lets the search distinguish at least \(N\) possibilities.
Define that inverse threshold by
$R_d(N)=\min\{m\ge 0: T(m,d)\ge N\}.$
The required total is
$A=\sum_{N=1}^{L}R_0(N)+\sum_{d=1}^{7}\sum_{N=1}^{L}R_d(N),\qquad L=7^{10}.$
The difficulty is that \(T(m,d)\) is recursive, but the implementations exploit three facts: every value is capped at \(L\), the sequence is monotone in \(m\), and the inverse thresholds can be summed by jump intervals rather than one query at a time.
Mathematical Approach
The implementations work with the following capped recurrence:
$T(m,d)=\min\left(L,\ T(m-1,d-1)+T\left(m-1,\ T(m-1,d-1)+d-1\right)\right),$
together with boundary rules that make the recursion finite and allow an efficient inversion.
Step 1: Define the Capped Search Capacity
The boundary conditions are
$T(0,d)=1\quad(d\ge -1),\qquad T(m,-1)=1,\qquad T(m,d)=0\quad(d<-1).$
These rules say that once there is no remaining search depth, or once the parameter has reached the terminal boundary \(-1\), only one outcome is still distinguishable. If the parameter falls below that boundary, the branch contributes nothing.
Every value is truncated at \(L\). This cap is mathematically safe because the final sum only asks about thresholds \(1\le N\le L\), so any larger capacity is indistinguishable from \(L\).
Step 2: Recognize the Ordinary Binary-Search Regime
When the bound parameter is already large compared with the remaining depth, the restriction disappears and the search becomes a full binary tree. In that regime,
$T(m,d)=\min(2^m,L),\qquad d\ge m-1.$
This shortcut is one of the key optimizations. Since
$2^{29}=536{,}870{,}912>7^{10}=282{,}475{,}249,$
any completely unrestricted branch reaches the cap by depth \(29\).
Step 3: Invert the Monotone Capacity Function
For fixed \(d\), the sequence \(T(0,d),T(1,d),T(2,d),\dots\) is nondecreasing in \(m\). Intuitively, allowing one more search step cannot reduce the number of distinguishable outcomes, and the recurrence only combines previously computed nonnegative capacities.
Therefore the inverse threshold
$R_d(N)=\min\{m\ge 0:T(m,d)\ge N\}$
is well defined for every \(1\le N\le L\). The problem is thus a sum of inverse threshold values, not a sum of the capacities themselves.
Step 4: Simplify the Special Case \(d=0\)
Substituting \(d=0\) into the recurrence gives
$T(m,0)=T(m-1,-1)+T(m-1,0)=1+T(m-1,0),$
with initial value \(T(0,0)=1\). Hence
$T(m,0)=m+1.$
Inverting this linear sequence yields
$R_0(N)=N-1.$
That is why the \(d=0\) contribution is handled separately as the arithmetic series
$\sum_{N=1}^{L}(N-1)=\frac{L(L-1)}{2}.$
Step 5: Turn the Inverse Sum into a Jump Histogram
Suppose
$T(m-1,d)<N\le T(m,d).$
Then the smallest admissible depth for that \(N\) is exactly \(m\), so every \(N\) in the jump interval \((T(m-1,d),T(m,d)]\) contributes \(m\) to the total. If we define \(T(-1,d)=0\), then
$\sum_{N=1}^{L}R_d(N)=\sum_{m\ge 0} m\bigl(T(m,d)-T(m-1,d)\bigr).$
This identity is the main computational trick. Instead of asking for \(R_d(N)\) separately for every \(N\), one scans \(m=0,1,2,\dots\) until \(T(m,d)=L\) and adds one weighted jump per step.
Worked Example
The first two nontrivial parameters already illustrate the pattern.
For \(d=1\), the values are
$\begin{aligned} T(0,1)&=1,\\ T(1,1)&=2,\\ T(2,1)&=4,\\ T(3,1)&=T(2,0)+T(2,3)=3+4=7,\\ T(4,1)&=T(3,0)+T(3,4)=4+8=12. \end{aligned}$
So the thresholds break into intervals
$\begin{aligned} (0,1]&\Rightarrow R_1(N)=0,\\ (1,2]&\Rightarrow R_1(N)=1,\\ (2,4]&\Rightarrow R_1(N)=2,\\ (4,7]&\Rightarrow R_1(N)=3. \end{aligned}$
In particular, \(R_1(7)=3\).
For \(d=2\), once \(m\ge 4\), the second recursive branch is already in the unrestricted regime, so
$T(m,2)=T(m-1,1)+2^{m-1}\qquad(m\ge 4).$
Using \(T(9,1)=265\), we get
$T(10,2)=T(9,1)+2^9=265+512=777,$
hence \(R_2(777)=10\), matching the checkpoint used by the implementations.
How the Code Works
The C++, Python, and Java implementations memoize each previously computed state, indexed by the current depth and the effective bound parameter. Each lookup first applies the boundary rules above. If the unrestricted branch condition \(d\ge m-1\) is met, the value is returned immediately as \(\min(2^m,L)\); otherwise the program evaluates the two smaller recursive states, adds them, and caps the sum at \(L\).
After that memoized engine is available, the implementation handles \(d=0\) through the closed form \(\frac{L(L-1)}2\). For each \(d=1,\dots,7\), it increases \(m\) from \(0\) upward, keeps the previous capacity value, and accumulates
$m\bigl(T(m,d)-T(m-1,d)\bigr)$
until the cap \(L\) is reached. The final accumulation uses integer types large enough to avoid overflow in every language.
Complexity Analysis
Fix a parameter \(d\), and let \(M_d\) be the first depth for which \(T(M_d,d)=L\). Let \(S_d\) be the set of distinct memoized states visited while evaluating the capacities for depths \(0\) through \(M_d\). Because each state is computed at most once and later reused, the running time for that \(d\) is \(O(|S_d|)\), and the memo storage is also \(O(|S_d|)\).
Therefore the full program costs
$O\left(\sum_{d=1}^{7}|S_d|\right)$
time and
$O\left(\max_{1\le d\le 7}|S_d|\right)$
auxiliary memory, plus negligible overhead for the final big-integer style additions. The cap at \(L\) and the closed binary-search branch keep the explored state space far smaller than the naive recursion tree.
Footnotes and References
- Problem page: https://projecteuler.net/problem=887
- Binary search: Wikipedia - Binary search algorithm
- Memoization: Wikipedia - Memoization
- Monotonic function: Wikipedia - Monotonic function
- Recurrence relation: Wikipedia - Recurrence relation