Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 950: Pirate Treasure

Project Euler Problem 950 Solution

EulerSolve provides an optimized solution for Project Euler Problem 950, Pirate Treasure, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary Problem 950 asks for the quantity \(T(N,C,D)\) at an extreme scale. The implementations evaluate \[ \sum_{k=1}^{6} T(10^{16},10^k+1,10^k+1) \pmod{10^9}, \] so even one parameter set is far beyond direct simulation. What the code actually sums is a sequence \((c_n)_{n\ge 1}\) with \[ T(N,C,D)=\sum_{n=1}^{N} c_n, \qquad c_1=C. \] The hard part is that the next exceptional value does not depend only on the current term; it depends on a whole multiset of carried values. The successful strategy is to keep that multiset in compressed form, jump directly to the next index where a nontrivial update occurs, and sum the long linear stretches in between with a closed formula. Mathematical Approach The Event-State Invariant At a landing position \(s\), the algorithm maintains two pieces of information: the current sequence value \(c_s\), and a multiset \(\mathcal{M}_s\) of size \(s\) that controls every future transition. Only positive entries are stored explicitly. Zeros are tracked by a counter \(z_s\). Write the multiset as \[ \mathcal{M}_s=\{0^{z_s}\}\cup \{v_1^{m_1},v_2^{m_2},\dots,v_r^{m_r}\}, \qquad 0 \lt v_1 \lt \cdots \lt v_r. \] Here the exponent means multiplicity: \(v_i^{m_i}\) means that the value \(v_i\) occurs \(m_i\) times. The invariant is \[ z_s+\sum_{i=1}^{r} m_i=s....

Detailed mathematical approach

Problem Summary

Problem 950 asks for the quantity \(T(N,C,D)\) at an extreme scale. The implementations evaluate

\[ \sum_{k=1}^{6} T(10^{16},10^k+1,10^k+1) \pmod{10^9}, \]

so even one parameter set is far beyond direct simulation. What the code actually sums is a sequence \((c_n)_{n\ge 1}\) with

\[ T(N,C,D)=\sum_{n=1}^{N} c_n, \qquad c_1=C. \]

The hard part is that the next exceptional value does not depend only on the current term; it depends on a whole multiset of carried values. The successful strategy is to keep that multiset in compressed form, jump directly to the next index where a nontrivial update occurs, and sum the long linear stretches in between with a closed formula.

Mathematical Approach

The Event-State Invariant

At a landing position \(s\), the algorithm maintains two pieces of information: the current sequence value \(c_s\), and a multiset \(\mathcal{M}_s\) of size \(s\) that controls every future transition. Only positive entries are stored explicitly. Zeros are tracked by a counter \(z_s\).

Write the multiset as

\[ \mathcal{M}_s=\{0^{z_s}\}\cup \{v_1^{m_1},v_2^{m_2},\dots,v_r^{m_r}\}, \qquad 0 \lt v_1 \lt \cdots \lt v_r. \]

Here the exponent means multiplicity: \(v_i^{m_i}\) means that the value \(v_i\) occurs \(m_i\) times. The invariant is

\[ z_s+\sum_{i=1}^{r} m_i=s. \]

The key summary statistic derived from this state is

\[ P_s(b)=\text{the sum of the } b \text{ smallest elements of }\mathcal{M}_s. \]

Because zeros come first and the positive values are stored in increasing groups, \(P_s(b)\) can be read off by a short scan of the grouped state instead of by expanding all \(s\) entries.

Feasible Jumps and the Budget Test

From a landing at index \(s\), the next nontrivial landing is searched in the form \(s+d\) with \(1\le d\le s\). For each candidate distance \(d\), the implementations define

\[ b=\left\lfloor\frac{s-d+1}{2}\right\rfloor, \qquad g=\left\lceil\frac{d}{\sqrt D}\right\rceil. \]

The quantity \(b\) tells us how many of the smallest carried values must be consumed, while \(g\) is the common surcharge attached to a jump of length \(d\). The total cost of using that jump is

\[ \operatorname{cost}(d)=P_s(b)+b\,g. \]

The jump is feasible exactly when

\[ \operatorname{cost}(d)\le C. \]

The first feasible \(d\) is chosen. This is the decisive observation in the code: once the earliest feasible landing is known, every index before it follows the trivial linear rule and does not need to be processed one by one.

Why the Search Starts at \(s-2C\) and Always Terminates

The scan does not start from \(d=1\) blindly. Since \(g\ge 1\) and \(P_s(b)\ge 0\), any feasible jump must satisfy \(b\le C\). Using

\[ b=\left\lfloor\frac{s-d+1}{2}\right\rfloor, \]

this implies

\[ d\ge s-2C. \]

So the candidate range can be shortened to

\[ d\in[\max(1,s-2C),\,s]. \]

There is also a guaranteed fallback. When \(d=s\), we get

\[ b=\left\lfloor\frac{s-s+1}{2}\right\rfloor=0, \]

hence \(\operatorname{cost}(s)=0\), which is always feasible. Therefore every state has a next landing. This special case produces a full reset: the next landing is at \(2s\), the new sequence value becomes \(C\), and the positive part of the state collapses to a single copy of \(C\).

Exact Landing-State Update

Assume the chosen jump has \(b>0\). Let \(z_{\mathrm{sel}}=\min(b,z_s)\); these are the selected zeros among the \(b\) smallest elements. The remaining \(b-z_{\mathrm{sel}}\) selected elements are the smallest positive values of \(\mathcal{M}_s\), say

\[ x_1,\dots,x_{b-z_{\mathrm{sel}}}. \]

The landing value is then

\[ c_{s+d}=C-\operatorname{cost}(d). \]

The new positive entries are built as follows:

\[ \underbrace{\{g,\dots,g\}}_{z_{\mathrm{sel}}\text{ copies}} \cup \{x_1+g,\dots,x_{b-z_{\mathrm{sel}}}+g\}. \]

If \(c_{s+d}>0\), one further copy of \(c_{s+d}\) is inserted. Everything else is zero, so the new zero count is simply whatever is needed to make the total size equal to \(s+d\). Finally, equal positive values are merged back into grouped multiplicities. That normalization step is what keeps the state compact.

Arithmetic Blocks Between Landings

If the first feasible jump length is \(d\), then there is no special landing at any of the intermediate indices \(s+1,s+2,\dots,s+d-1\). Their values follow the linear pattern

\[ c_{s+t}=c_s+t \qquad (1\le t \lt d). \]

So the skipped contribution is a simple arithmetic progression:

\[ \sum_{t=1}^{d-1}(c_s+t) = (d-1)c_s+\frac{(d-1)d}{2}. \]

This is the reason the algorithm can jump over huge stretches of indices. Once the next landing is known, the entire block before it is summed in \(O(1)\) time.

Worked Example: \(C=D=3\)

The checkpoint case \(T(30,3,3)=190\) already shows the two essential behaviors. After a few updates the state at \(s=4\) is

\[ c_4=2, \qquad \mathcal{M}_4=\{0,0,1,2\}. \]

Try \(d=1\). Then

\[ b=\left\lfloor\frac{4-1+1}{2}\right\rfloor=2, \qquad g=\left\lceil\frac{1}{\sqrt 3}\right\rceil=1, \qquad P_4(2)=0, \]

because the two smallest multiset elements are both zero. Hence

\[ \operatorname{cost}(1)=2, \qquad c_5=3-2=1. \]

The two selected zeros become two copies of \(1\), and the landing value contributes one more \(1\). So the next grouped state is

\[ \mathcal{M}_5=\{0,0,1,1,1\}. \]

Later, at \(s=8\), every \(d<8\) fails the budget test, so the fallback \(d=8\) is used. That means indices \(9\) through \(15\) are just the arithmetic run \(1,2,3,4,5,6,7\) above \(c_8=0\), and index \(16\) resets to \(c_{16}=3\). This single trace explains why the code alternates between exact landing updates and long linear blocks.

How the Code Works

The C++ and Python implementations maintain the landing index \(s\), the current value \(c_s\), the zero count, and the sorted positive groups. For each landing they scan candidate distances \(d\) from \(\max(1,s-2C)\) upward. To compute \(g=\lceil d/\sqrt D\rceil\) safely, they start from a numerical estimate and then correct it with the exact integer inequality \(g^2D\ge d^2\). Since \(g\) is monotone in \(d\), it can then be updated incrementally.

For each candidate, the implementation evaluates \(P_s(b)\) by consuming zeros first and then walking through the positive groups from smallest to largest. The scan stops early as soon as the partial sum exceeds \(C\), because such a candidate can never be feasible. Once the first feasible \(d\) is found, the code either performs the reset case \(d=s\) or constructs the next grouped multiset from the selected smallest elements, then sorts and merges equal positive values again.

The outer accumulation loop never expands all \(N\) indices. It adds the skipped arithmetic block in closed form, adds the landing value \(c_{s+d}\), and repeats from the new state. The Python version is a direct compact transcription of the same logic. The Java version serves as a thin launcher around the same computation, compiling and invoking the C++ implementation so that all three languages produce the same numerical result.

Complexity Analysis

Let \(J\) be the number of landing states actually visited. At landing \(j\), let \(R_j\) be the number of candidate distances examined before the first feasible one is found, and let \(G_j\) be the number of distinct positive groups in the compressed state. Both the prefix-sum test and the landing update inspect at most \(G_j\) groups, so a safe worst-case bound is

\[ O\!\left(\sum_{j=1}^{J} R_j G_j\right). \]

The memory usage is \(O(\max_j G_j)\), because zeros are implicit and equal positive values are merged. That is the decisive compression: a state with enormous index \(s\) may still be represented by only a few grouped values. The final program repeats this computation for six parameter pairs, which is only a constant-factor multiplier.

The crucial point is qualitative rather than asymptotic. A naive method would need \(10^{16}\) sequence updates per parameter set, while this method processes only the landing states and treats the vast linear stretches analytically.

Footnotes and References

Problem page: Project Euler 950 - Pirate Treasure
Multiset: Wikipedia - Multiset
Arithmetic progression: Wikipedia - Arithmetic progression
Floor and ceiling functions: Wikipedia - Floor and ceiling functions