Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 398: Cutting Rope

Project Euler Problem 398 Solution

EulerSolve provides an optimized solution for Project Euler Problem 398, Cutting Rope, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary Choose uniformly from all ordered compositions of $n$ into $m$ positive integers: $\mathcal{C}_{n,m}=\left\{(x_1,\dots,x_m)\in \mathbb{Z}_{>0}^m : x_1+\cdots+x_m=n\right\}.$ After sorting one sampled composition into nondecreasing order, write $X_{(1)}\le X_{(2)}\le \cdots \le X_{(m)}.$ The task solved by the code is to compute the expected value of the second shortest piece, namely $\mathbb{E}[X_{(2)}]$, for the Project Euler parameters $n=10^7$ and $m=100$. Mathematical Approach 1) Total number of equally likely cuts By the standard stars-and-bars argument, the number of ordered compositions of $n$ into $m$ positive parts is $\left|\mathcal{C}_{n,m}\right|=\binom{n-1}{m-1}.$ So every probability in the program is a composition count divided by $\binom{n-1}{m-1}$. 2) Tail-sum identity for an integer-valued random variable Because $X_{(2)}$ is a positive integer, we may use the tail-sum formula $\mathbb{E}[X_{(2)}]=\sum_{k\ge 1}\Pr(X_{(2)}\ge k).$ This reduces the whole problem to counting, for each $k$, how many compositions have second smallest part at least $k$. 3) Interpret the event $X_{(2)}\ge k$ The condition $X_{(2)}\ge k$ means that at most one part is smaller than $k$. Equivalently, the composition belongs to one of two disjoint cases: Case A: all $m$ parts are at least $k$....

Detailed mathematical approach

Problem Summary

Choose uniformly from all ordered compositions of $n$ into $m$ positive integers:

$\mathcal{C}_{n,m}=\left\{(x_1,\dots,x_m)\in \mathbb{Z}_{>0}^m : x_1+\cdots+x_m=n\right\}.$

After sorting one sampled composition into nondecreasing order, write

$X_{(1)}\le X_{(2)}\le \cdots \le X_{(m)}.$

The task solved by the code is to compute the expected value of the second shortest piece, namely $\mathbb{E}[X_{(2)}]$, for the Project Euler parameters $n=10^7$ and $m=100$.

Mathematical Approach

1) Total number of equally likely cuts

By the standard stars-and-bars argument, the number of ordered compositions of $n$ into $m$ positive parts is

$\left|\mathcal{C}_{n,m}\right|=\binom{n-1}{m-1}.$

So every probability in the program is a composition count divided by $\binom{n-1}{m-1}$.

2) Tail-sum identity for an integer-valued random variable

Because $X_{(2)}$ is a positive integer, we may use the tail-sum formula

$\mathbb{E}[X_{(2)}]=\sum_{k\ge 1}\Pr(X_{(2)}\ge k).$

This reduces the whole problem to counting, for each $k$, how many compositions have second smallest part at least $k$.

3) Interpret the event $X_{(2)}\ge k$

The condition $X_{(2)}\ge k$ means that at most one part is smaller than $k$. Equivalently, the composition belongs to one of two disjoint cases:

Case A: all $m$ parts are at least $k$.

Case B: exactly one part is in $\{1,2,\dots,k-1\}$, and the remaining $m-1$ parts are at least $k$.

4) Count Case A: every part is at least $k$

Write $x_i=y_i+(k-1)$ for every part. Then each $y_i\ge 1$ and

$y_1+\cdots+y_m=n-m(k-1).$

Therefore the number of compositions in Case A is

$A_k=\binom{n-m(k-1)-1}{m-1}.$

As usual, this is understood to be $0$ whenever the top entry is smaller than $m-1$.

5) Count Case B: exactly one short part

Choose the short position in $m$ ways. Let that short part have length $t$, where $1\le t\le k-1$. The other $m-1$ parts are at least $k$, so after subtracting $k-1$ from each of them we get

$z_1+\cdots+z_{m-1}=n-t-(m-1)(k-1),\qquad z_i\ge 1.$

For fixed $t$, the number of such compositions is

$\binom{n-t-(m-1)(k-1)-1}{m-2}.$

Summing over $t$ gives

$B_k=m\sum_{t=1}^{k-1}\binom{n-t-(m-1)(k-1)-1}{m-2}.$

Now apply the hockey-stick identity:

$\sum_{t=1}^{k-1}\binom{n-t-(m-1)(k-1)-1}{m-2}=\binom{n-(m-1)(k-1)-1}{m-1}-\binom{n-m(k-1)-1}{m-1}.$

Hence

$B_k=m\left(\binom{n-(m-1)(k-1)-1}{m-1}-\binom{n-m(k-1)-1}{m-1}\right).$

6) Closed formula for the tail probability

Adding the two disjoint cases yields

$A_k+B_k=m\binom{n-(m-1)(k-1)-1}{m-1}-(m-1)\binom{n-m(k-1)-1}{m-1}.$

Dividing by the total number of compositions gives the exact tail used by the solver:

$\Pr(X_{(2)}\ge k)=\frac{m\binom{n-(m-1)(k-1)-1}{m-1}-(m-1)\binom{n-m(k-1)-1}{m-1}}{\binom{n-1}{m-1}}.$

Substituting this into the tail-sum identity gives the full expectation. This is the formula implemented in C++, Python, and Java.

7) Small checkpoints from the source code

The C++ program verifies two test cases before printing the final value. For $(n,m)=(3,2)$, the only compositions are $(1,2)$ and $(2,1)$, so the second shortest piece is always $2$, hence $\mathbb{E}[X_{(2)}]=2$.

For $(n,m)=(8,3)$, the denominator is $\binom{7}{2}=21$. The nonzero tail terms are

$\Pr(X_{(2)}\ge 1)=1,$

$\Pr(X_{(2)}\ge 2)=\frac{3\binom{5}{2}-2\binom{4}{2}}{\binom{7}{2}}=\frac{18}{21}=\frac{6}{7},$

$\Pr(X_{(2)}\ge 3)=\frac{3\binom{3}{2}-2\binom{1}{2}}{\binom{7}{2}}=\frac{9}{21}=\frac{3}{7}.$

All later terms are zero, so

$\mathbb{E}[X_{(2)}]=1+\frac{6}{7}+\frac{3}{7}=\frac{16}{7},$

which matches the checkpoint in the repository.

How the Code Works

The implementations avoid constructing huge binomial coefficients directly. They set

$r=m-1,$

$R_k=\frac{\binom{n-(m-1)(k-1)-1}{r}}{\binom{n-1}{r}},\qquad S_k=\frac{\binom{n-m(k-1)-1}{r}}{\binom{n-1}{r}}.$

Then the tail probability becomes

$\Pr(X_{(2)}\ge k)=mR_k-(m-1)S_k.$

If the current binomial top is $a$ and the next step decreases it by $d$, the ratio update is

$\frac{\binom{a-d}{r}}{\binom{a}{r}}=\prod_{j=0}^{r-1}\frac{a-d-j}{a-j}.$

This telescoping product is exactly what ratio_binom_drop computes. The variables a_r and a_s track the current top entries for the two numerator terms, while ratio_r and ratio_s hold the normalized values $R_k$ and $S_k$. Once a top entry falls below $r$, that term is set to zero and the loop terminates naturally.

Complexity Analysis

Each loop iteration updates two products of length $r=m-1$, so one iteration costs $O(m)$ time. If $T$ tail levels are nonzero, the full computation costs $O(Tm)$ time and $O(1)$ additional memory. Since the top parameters decrease by $m-1$ or $m$ at every step, $T$ is on the order of $n/(m-1)$. The method is therefore compact in memory and efficient enough for the Project Euler input.

References

Problem page: https://projecteuler.net/problem=398
Order statistics: Wikipedia — Order statistic
Compositions and stars-and-bars: Wikipedia — Stars and bars
Binomial coefficient identities: Wikipedia — Binomial coefficient