Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 503: Compromise or Persist

Project Euler Problem 503 Solution

EulerSolve provides an optimized solution for Project Euler Problem 503, Compromise or Persist, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary There are $n$ rounds, and the hidden absolute ranks $1,2,\dots,n$ appear in uniformly random order, where rank $1$ is best. At round $t$, you do not know the absolute rank of the current item; you only know its relative rank among the first $t$ items seen so far. After seeing that relative rank, you must choose between stopping immediately and accepting the current item, or continuing to the next round. If you reach the final round, you must accept the last remaining item. The goal is to minimize the expected absolute rank of the accepted item. Let $F(n)$ denote that minimum expected value. A brute-force search over all decision trees is hopeless, so the solution uses backward dynamic programming and the fact that the stopping value depends only on the observed relative rank. Mathematical Approach Let $V_t$ be the minimum expected final absolute rank when round $t$ is about to begin, before the relative rank of the current item has been revealed. The desired answer is $F(n)=V_1$. Step 1: Define the State and the Boundary Condition At the last round there is no decision left: the current item must be accepted....

Detailed mathematical approach

Problem Summary

There are $n$ rounds, and the hidden absolute ranks $1,2,\dots,n$ appear in uniformly random order, where rank $1$ is best. At round $t$, you do not know the absolute rank of the current item; you only know its relative rank among the first $t$ items seen so far. After seeing that relative rank, you must choose between stopping immediately and accepting the current item, or continuing to the next round. If you reach the final round, you must accept the last remaining item.

The goal is to minimize the expected absolute rank of the accepted item. Let $F(n)$ denote that minimum expected value. A brute-force search over all decision trees is hopeless, so the solution uses backward dynamic programming and the fact that the stopping value depends only on the observed relative rank.

Mathematical Approach

Let $V_t$ be the minimum expected final absolute rank when round $t$ is about to begin, before the relative rank of the current item has been revealed. The desired answer is $F(n)=V_1$.

Step 1: Define the State and the Boundary Condition

At the last round there is no decision left: the current item must be accepted. Because every absolute rank from $1$ to $n$ is equally likely to be the final unseen item, the expected rank at round $n$ is simply the average of all ranks:

$V_n=\frac{1+2+\cdots+n}{n}=\frac{n+1}{2}.$

This gives the base case for the backward recurrence. Every earlier round will compare the value of stopping now against the value $V_{t+1}$ of postponing the decision.

Step 2: Convert Relative Rank into Expected Absolute Rank

Suppose that at round $t$ the observed item is the $r$-th best among the first $t$ items, with $1\le r\le t$. We need the expected absolute rank of that item among all $n$ items.

If the $t$ revealed absolute ranks are written in increasing order as

$1\le a_1<a_2<\cdots<a_t\le n,$

then the observed item has absolute rank $a_r$. Now adjoin the endpoints $a_0=0$ and $a_{t+1}=n+1$. The $t+1$ gaps

$a_1-a_0,\ a_2-a_1,\ \dots,\ a_{t+1}-a_t$

are exchangeable and their sum is $n+1$. Therefore each gap has expected size

$\frac{n+1}{t+1}.$

Adding the first $r$ expected gaps gives

$\mathbb{E}[a_r]=\frac{r(n+1)}{t+1}.$

So if we stop after seeing relative rank $r$ at round $t$, the expected final absolute rank is

$S_t(r)=\frac{n+1}{t+1}\,r.$

It is convenient to write

$s_t=\frac{n+1}{t+1},\qquad S_t(r)=s_t r.$

Step 3: Prove that the Optimal Rule is a Threshold Rule

At round $t$, once the relative rank $r$ is known, there are only two possible costs: stopping costs $s_t r$, while continuing costs $V_{t+1}$.

Stopping is optimal exactly when

$s_t r\le V_{t+1}.$

Because $s_t r$ increases linearly with $r$, the set of ranks for which we stop is an initial segment $r=1,2,\dots,k_t$. Hence the policy at every round is characterized by a single threshold

$k_t=\left\lfloor\frac{V_{t+1}}{s_t}\right\rfloor,$

with the obvious clipping

$0\le k_t\le t.$

If $k_t=0$, we always continue. If $k_t=t$, we always stop. Otherwise we stop precisely when the observed relative rank is at most $k_t$.

Step 4: Average over the $t$ Possible Relative Ranks

For a random permutation, the relative rank of the current item among the first $t$ items is uniformly distributed on $\{1,2,\dots,t\}$. Therefore each $r$ occurs with probability $1/t$.

Using the threshold rule, the Bellman update is

$V_t=\frac{1}{t}\left(\sum_{r=1}^{k_t} s_t r+\sum_{r=k_t+1}^{t} V_{t+1}\right).$

The second sum is simply $(t-k_t)V_{t+1}$, and the first uses the triangular-number identity

$\sum_{r=1}^{k_t} r=\frac{k_t(k_t+1)}{2}.$

Thus

$V_t=\frac{1}{t}\left(s_t\frac{k_t(k_t+1)}{2}+(t-k_t)V_{t+1}\right).$

This recurrence matches the program line for line.

Step 5: Evaluate the Recurrence Backward

Starting from $V_n=(n+1)/2$, we compute

$V_{n-1},\ V_{n-2},\ \dots,\ V_1$

one round at a time. Each update needs only the value from the next round, so the entire dynamic program can be run with constant memory. No large table is required even when $n=10^6$.

The implementations also protect the floor operation from roundoff drift by adding a tiny positive tolerance before taking $\lfloor\cdot\rfloor$, then clamping the threshold into the valid interval $[0,t]$.

Worked Example: $n=4$

The checkpoint $F(4)=15/8$ can be reproduced directly from the recurrence.

At the last round,

$V_4=\frac{5}{2}.$

For $t=3$, we have $s_3=5/4$, so

$k_3=\left\lfloor\frac{V_4}{s_3}\right\rfloor=\left\lfloor\frac{5/2}{5/4}\right\rfloor=\lfloor 2\rfloor=2.$

Therefore

$V_3=\frac{1}{3}\left(\frac{5}{4}(1+2)+1\cdot\frac{5}{2}\right)=\frac{25}{12}.$

For $t=2$, we have $s_2=5/3$, hence

$k_2=\left\lfloor\frac{V_3}{s_2}\right\rfloor=\left\lfloor\frac{25/12}{5/3}\right\rfloor=\left\lfloor\frac{5}{4}\right\rfloor=1,$

and so

$V_2=\frac{1}{2}\left(\frac{5}{3}\cdot 1+1\cdot\frac{25}{12}\right)=\frac{15}{8}.$

For $t=1$, we get $s_1=5/2$ and

$k_1=\left\lfloor\frac{15/8}{5/2}\right\rfloor=\left\lfloor\frac{3}{4}\right\rfloor=0.$

So the first round is never accepted immediately, and

$V_1=V_2=\frac{15}{8}.$

Hence

$F(4)=\frac{15}{8},$

which agrees with the checkpoint used by the implementation.

How the Code Works

The C++, Python, and Java implementations all follow the same compact loop. They begin with the forced-final-round value $(n+1)/2$, interpreted as the value for the next round. Then, for each $t$ from $n-1$ down to $1$, they compute the slope $s_t=(n+1)/(t+1)$, divide the next-round value by that slope to obtain the stopping threshold, round it down with a small tolerance, and clamp it to the interval $[0,t]$.

After that, the implementation evaluates the stopping contribution with the closed form $1+2+\cdots+k_t=k_t(k_t+1)/2$, adds the continuation contribution $(t-k_t)V_{t+1}$, divides by $t$, and overwrites the stored value with the newly computed $V_t$. The loop therefore stores only one floating-point state instead of an array of all $V_t$ values. One of the implementations also checks the small cases $n=3$, $n=4$, and $n=10$ before printing the answer for $n=1{,}000{,}000$.

Complexity Analysis

Each round performs only a constant number of arithmetic operations, and the triangular sum is replaced by a closed formula. Therefore the running time is

$O(n),$

and the memory usage is

$O(1).$

For the target input, the computation is simply a single backward pass from $t=n-1$ down to $t=1$.

Footnotes and References

Problem page: https://projecteuler.net/problem=503
Optimal stopping: Wikipedia — Optimal stopping
Order statistic: Wikipedia — Order statistic
Bellman equation: Wikipedia — Bellman equation