Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 398: Cutting Rope

View on Project Euler

Project Euler Problem 398 Solution

EulerSolve provides an optimized solution for Project Euler Problem 398, Cutting Rope, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary Choose uniformly from all ordered compositions of \(n\) into \(m\) positive integers: $\mathcal{C}_{n,m}=\left\{(x_1,\dots,x_m)\in \mathbb{Z}_{>0}^m : x_1+\cdots+x_m=n\right\}.$ After sorting one sampled composition into nondecreasing order, write $X_{(1)}\le X_{(2)}\le \cdots \le X_{(m)}.$ The task solved by the code is to compute the expected value of the second shortest piece, namely \(\mathbb{E}[X_{(2)}]\), for the Project Euler parameters \(n=10^7\) and \(m=100\). Mathematical Approach 1) Total number of equally likely cuts By the standard stars-and-bars argument, the number of ordered compositions of \(n\) into \(m\) positive parts is $\left|\mathcal{C}_{n,m}\right|=\binom{n-1}{m-1}.$ So every probability in the program is a composition count divided by \(\binom{n-1}{m-1}\). 2) Tail-sum identity for an integer-valued random variable Because \(X_{(2)}\) is a positive integer, we may use the tail-sum formula $\mathbb{E}[X_{(2)}]=\sum_{k\ge 1}\Pr(X_{(2)}\ge k).$ This reduces the whole problem to counting, for each \(k\), how many compositions have second smallest part at least \(k\). 3) Interpret the event \(X_{(2)}\ge k\) The condition \(X_{(2)}\ge k\) means that at most one part is smaller than \(k\). Equivalently, the composition belongs to one of two disjoint cases: Case A: all \(m\) parts are at least \(k\)....

Detailed mathematical approach

Problem Summary

Choose uniformly from all ordered compositions of \(n\) into \(m\) positive integers:

$\mathcal{C}_{n,m}=\left\{(x_1,\dots,x_m)\in \mathbb{Z}_{>0}^m : x_1+\cdots+x_m=n\right\}.$

After sorting one sampled composition into nondecreasing order, write

$X_{(1)}\le X_{(2)}\le \cdots \le X_{(m)}.$

The task solved by the code is to compute the expected value of the second shortest piece, namely \(\mathbb{E}[X_{(2)}]\), for the Project Euler parameters \(n=10^7\) and \(m=100\).

Mathematical Approach

1) Total number of equally likely cuts

By the standard stars-and-bars argument, the number of ordered compositions of \(n\) into \(m\) positive parts is

$\left|\mathcal{C}_{n,m}\right|=\binom{n-1}{m-1}.$

So every probability in the program is a composition count divided by \(\binom{n-1}{m-1}\).

2) Tail-sum identity for an integer-valued random variable

Because \(X_{(2)}\) is a positive integer, we may use the tail-sum formula

$\mathbb{E}[X_{(2)}]=\sum_{k\ge 1}\Pr(X_{(2)}\ge k).$

This reduces the whole problem to counting, for each \(k\), how many compositions have second smallest part at least \(k\).

3) Interpret the event \(X_{(2)}\ge k\)

The condition \(X_{(2)}\ge k\) means that at most one part is smaller than \(k\). Equivalently, the composition belongs to one of two disjoint cases:

Case A: all \(m\) parts are at least \(k\).

Case B: exactly one part is in \(\{1,2,\dots,k-1\}\), and the remaining \(m-1\) parts are at least \(k\).

4) Count Case A: every part is at least \(k\)

Write \(x_i=y_i+(k-1)\) for every part. Then each \(y_i\ge 1\) and

$y_1+\cdots+y_m=n-m(k-1).$

Therefore the number of compositions in Case A is

$A_k=\binom{n-m(k-1)-1}{m-1}.$

As usual, this is understood to be \(0\) whenever the top entry is smaller than \(m-1\).

5) Count Case B: exactly one short part

Choose the short position in \(m\) ways. Let that short part have length \(t\), where \(1\le t\le k-1\). The other \(m-1\) parts are at least \(k\), so after subtracting \(k-1\) from each of them we get

$z_1+\cdots+z_{m-1}=n-t-(m-1)(k-1),\qquad z_i\ge 1.$

For fixed \(t\), the number of such compositions is

$\binom{n-t-(m-1)(k-1)-1}{m-2}.$

Summing over \(t\) gives

$B_k=m\sum_{t=1}^{k-1}\binom{n-t-(m-1)(k-1)-1}{m-2}.$

Now apply the hockey-stick identity:

$\sum_{t=1}^{k-1}\binom{n-t-(m-1)(k-1)-1}{m-2}=\binom{n-(m-1)(k-1)-1}{m-1}-\binom{n-m(k-1)-1}{m-1}.$

Hence

$B_k=m\left(\binom{n-(m-1)(k-1)-1}{m-1}-\binom{n-m(k-1)-1}{m-1}\right).$

6) Closed formula for the tail probability

Adding the two disjoint cases yields

$A_k+B_k=m\binom{n-(m-1)(k-1)-1}{m-1}-(m-1)\binom{n-m(k-1)-1}{m-1}.$

Dividing by the total number of compositions gives the exact tail used by the solver:

$\Pr(X_{(2)}\ge k)=\frac{m\binom{n-(m-1)(k-1)-1}{m-1}-(m-1)\binom{n-m(k-1)-1}{m-1}}{\binom{n-1}{m-1}}.$

Substituting this into the tail-sum identity gives the full expectation. This is the formula implemented in C++, Python, and Java.

7) Small checkpoints from the source code

The C++ program verifies two test cases before printing the final value. For \((n,m)=(3,2)\), the only compositions are \((1,2)\) and \((2,1)\), so the second shortest piece is always \(2\), hence \(\mathbb{E}[X_{(2)}]=2\).

For \((n,m)=(8,3)\), the denominator is \(\binom{7}{2}=21\). The nonzero tail terms are

$\Pr(X_{(2)}\ge 1)=1,$

$\Pr(X_{(2)}\ge 2)=\frac{3\binom{5}{2}-2\binom{4}{2}}{\binom{7}{2}}=\frac{18}{21}=\frac{6}{7},$

$\Pr(X_{(2)}\ge 3)=\frac{3\binom{3}{2}-2\binom{1}{2}}{\binom{7}{2}}=\frac{9}{21}=\frac{3}{7}.$

All later terms are zero, so

$\mathbb{E}[X_{(2)}]=1+\frac{6}{7}+\frac{3}{7}=\frac{16}{7},$

which matches the checkpoint in the repository.

How the Code Works

The implementations avoid constructing huge binomial coefficients directly. They set

$r=m-1,$

$R_k=\frac{\binom{n-(m-1)(k-1)-1}{r}}{\binom{n-1}{r}},\qquad S_k=\frac{\binom{n-m(k-1)-1}{r}}{\binom{n-1}{r}}.$

Then the tail probability becomes

$\Pr(X_{(2)}\ge k)=mR_k-(m-1)S_k.$

If the current binomial top is \(a\) and the next step decreases it by \(d\), the ratio update is

$\frac{\binom{a-d}{r}}{\binom{a}{r}}=\prod_{j=0}^{r-1}\frac{a-d-j}{a-j}.$

This telescoping product is exactly what ratio_binom_drop computes. The variables a_r and a_s track the current top entries for the two numerator terms, while ratio_r and ratio_s hold the normalized values \(R_k\) and \(S_k\). Once a top entry falls below \(r\), that term is set to zero and the loop terminates naturally.

Complexity Analysis

Each loop iteration updates two products of length \(r=m-1\), so one iteration costs \(O(m)\) time. If \(T\) tail levels are nonzero, the full computation costs \(O(Tm)\) time and \(O(1)\) additional memory. Since the top parameters decrease by \(m-1\) or \(m\) at every step, \(T\) is on the order of \(n/(m-1)\). The method is therefore compact in memory and efficient enough for the Project Euler input.

References

  1. Problem page: https://projecteuler.net/problem=398
  2. Order statistics: Wikipedia — Order statistic
  3. Compositions and stars-and-bars: Wikipedia — Stars and bars
  4. Binomial coefficient identities: Wikipedia — Binomial coefficient

Mathematical approach · C++ solution · Python solution · Java solution

Previous: Problem 397 · All Project Euler solutions · Next: Problem 399

Need help with a problem? Ask me! 💡
e
✦ Euler GLM 5.2
Hi! I'm Euler. Ask me anything about Project Euler problems, math concepts, or solution approaches! 🧮