Problem 503: Compromise or Persist
View on Project EulerProject Euler Problem 503 Solution
EulerSolve provides an optimized solution for Project Euler Problem 503, Compromise or Persist, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary There are \(n\) rounds, and the hidden absolute ranks \(1,2,\dots,n\) appear in uniformly random order, where rank \(1\) is best. At round \(t\), you do not know the absolute rank of the current item; you only know its relative rank among the first \(t\) items seen so far. After seeing that relative rank, you must choose between stopping immediately and accepting the current item, or continuing to the next round. If you reach the final round, you must accept the last remaining item. The goal is to minimize the expected absolute rank of the accepted item. Let \(F(n)\) denote that minimum expected value. A brute-force search over all decision trees is hopeless, so the solution uses backward dynamic programming and the fact that the stopping value depends only on the observed relative rank. Mathematical Approach Let \(V_t\) be the minimum expected final absolute rank when round \(t\) is about to begin, before the relative rank of the current item has been revealed. The desired answer is \(F(n)=V_1\). Step 1: Define the State and the Boundary Condition At the last round there is no decision left: the current item must be accepted....
Detailed mathematical approach
Problem Summary
There are \(n\) rounds, and the hidden absolute ranks \(1,2,\dots,n\) appear in uniformly random order, where rank \(1\) is best. At round \(t\), you do not know the absolute rank of the current item; you only know its relative rank among the first \(t\) items seen so far. After seeing that relative rank, you must choose between stopping immediately and accepting the current item, or continuing to the next round. If you reach the final round, you must accept the last remaining item.
The goal is to minimize the expected absolute rank of the accepted item. Let \(F(n)\) denote that minimum expected value. A brute-force search over all decision trees is hopeless, so the solution uses backward dynamic programming and the fact that the stopping value depends only on the observed relative rank.
Mathematical Approach
Let \(V_t\) be the minimum expected final absolute rank when round \(t\) is about to begin, before the relative rank of the current item has been revealed. The desired answer is \(F(n)=V_1\).
Step 1: Define the State and the Boundary Condition
At the last round there is no decision left: the current item must be accepted. Because every absolute rank from \(1\) to \(n\) is equally likely to be the final unseen item, the expected rank at round \(n\) is simply the average of all ranks:
$V_n=\frac{1+2+\cdots+n}{n}=\frac{n+1}{2}.$
This gives the base case for the backward recurrence. Every earlier round will compare the value of stopping now against the value \(V_{t+1}\) of postponing the decision.
Step 2: Convert Relative Rank into Expected Absolute Rank
Suppose that at round \(t\) the observed item is the \(r\)-th best among the first \(t\) items, with \(1\le r\le t\). We need the expected absolute rank of that item among all \(n\) items.
If the \(t\) revealed absolute ranks are written in increasing order as
$1\le a_1<a_2<\cdots<a_t\le n,$
then the observed item has absolute rank \(a_r\). Now adjoin the endpoints \(a_0=0\) and \(a_{t+1}=n+1\). The \(t+1\) gaps
$a_1-a_0,\ a_2-a_1,\ \dots,\ a_{t+1}-a_t$
are exchangeable and their sum is \(n+1\). Therefore each gap has expected size
$\frac{n+1}{t+1}.$
Adding the first \(r\) expected gaps gives
$\mathbb{E}[a_r]=\frac{r(n+1)}{t+1}.$
So if we stop after seeing relative rank \(r\) at round \(t\), the expected final absolute rank is
$S_t(r)=\frac{n+1}{t+1}\,r.$
It is convenient to write
$s_t=\frac{n+1}{t+1},\qquad S_t(r)=s_t r.$
Step 3: Prove that the Optimal Rule is a Threshold Rule
At round \(t\), once the relative rank \(r\) is known, there are only two possible costs: stopping costs \(s_t r\), while continuing costs \(V_{t+1}\).
Stopping is optimal exactly when
$s_t r\le V_{t+1}.$
Because \(s_t r\) increases linearly with \(r\), the set of ranks for which we stop is an initial segment \(r=1,2,\dots,k_t\). Hence the policy at every round is characterized by a single threshold
$k_t=\left\lfloor\frac{V_{t+1}}{s_t}\right\rfloor,$
with the obvious clipping
$0\le k_t\le t.$
If \(k_t=0\), we always continue. If \(k_t=t\), we always stop. Otherwise we stop precisely when the observed relative rank is at most \(k_t\).
Step 4: Average over the \(t\) Possible Relative Ranks
For a random permutation, the relative rank of the current item among the first \(t\) items is uniformly distributed on \(\{1,2,\dots,t\}\). Therefore each \(r\) occurs with probability \(1/t\).
Using the threshold rule, the Bellman update is
$V_t=\frac{1}{t}\left(\sum_{r=1}^{k_t} s_t r+\sum_{r=k_t+1}^{t} V_{t+1}\right).$
The second sum is simply \((t-k_t)V_{t+1}\), and the first uses the triangular-number identity
$\sum_{r=1}^{k_t} r=\frac{k_t(k_t+1)}{2}.$
Thus
$V_t=\frac{1}{t}\left(s_t\frac{k_t(k_t+1)}{2}+(t-k_t)V_{t+1}\right).$
This recurrence matches the program line for line.
Step 5: Evaluate the Recurrence Backward
Starting from \(V_n=(n+1)/2\), we compute
$V_{n-1},\ V_{n-2},\ \dots,\ V_1$
one round at a time. Each update needs only the value from the next round, so the entire dynamic program can be run with constant memory. No large table is required even when \(n=10^6\).
The implementations also protect the floor operation from roundoff drift by adding a tiny positive tolerance before taking \(\lfloor\cdot\rfloor\), then clamping the threshold into the valid interval \([0,t]\).
Worked Example: \(n=4\)
The checkpoint \(F(4)=15/8\) can be reproduced directly from the recurrence.
At the last round,
$V_4=\frac{5}{2}.$
For \(t=3\), we have \(s_3=5/4\), so
$k_3=\left\lfloor\frac{V_4}{s_3}\right\rfloor=\left\lfloor\frac{5/2}{5/4}\right\rfloor=\lfloor 2\rfloor=2.$
Therefore
$V_3=\frac{1}{3}\left(\frac{5}{4}(1+2)+1\cdot\frac{5}{2}\right)=\frac{25}{12}.$
For \(t=2\), we have \(s_2=5/3\), hence
$k_2=\left\lfloor\frac{V_3}{s_2}\right\rfloor=\left\lfloor\frac{25/12}{5/3}\right\rfloor=\left\lfloor\frac{5}{4}\right\rfloor=1,$
and so
$V_2=\frac{1}{2}\left(\frac{5}{3}\cdot 1+1\cdot\frac{25}{12}\right)=\frac{15}{8}.$
For \(t=1\), we get \(s_1=5/2\) and
$k_1=\left\lfloor\frac{15/8}{5/2}\right\rfloor=\left\lfloor\frac{3}{4}\right\rfloor=0.$
So the first round is never accepted immediately, and
$V_1=V_2=\frac{15}{8}.$
Hence
$F(4)=\frac{15}{8},$
which agrees with the checkpoint used by the implementation.
How the Code Works
The C++, Python, and Java implementations all follow the same compact loop. They begin with the forced-final-round value \((n+1)/2\), interpreted as the value for the next round. Then, for each \(t\) from \(n-1\) down to \(1\), they compute the slope \(s_t=(n+1)/(t+1)\), divide the next-round value by that slope to obtain the stopping threshold, round it down with a small tolerance, and clamp it to the interval \([0,t]\).
After that, the implementation evaluates the stopping contribution with the closed form \(1+2+\cdots+k_t=k_t(k_t+1)/2\), adds the continuation contribution \((t-k_t)V_{t+1}\), divides by \(t\), and overwrites the stored value with the newly computed \(V_t\). The loop therefore stores only one floating-point state instead of an array of all \(V_t\) values. One of the implementations also checks the small cases \(n=3\), \(n=4\), and \(n=10\) before printing the answer for \(n=1{,}000{,}000\).
Complexity Analysis
Each round performs only a constant number of arithmetic operations, and the triangular sum is replaced by a closed formula. Therefore the running time is
$O(n),$
and the memory usage is
$O(1).$
For the target input, the computation is simply a single backward pass from \(t=n-1\) down to \(t=1\).
Footnotes and References
- Problem page: https://projecteuler.net/problem=503
- Optimal stopping: Wikipedia — Optimal stopping
- Order statistic: Wikipedia — Order statistic
- Bellman equation: Wikipedia — Bellman equation