Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 103: Special Subset Sums: Optimum

Project Euler Problem 103 Solution

EulerSolve provides an optimized solution for Project Euler Problem 103, Special Subset Sums: Optimum, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary We seek an increasing seven-element set $A=\{a_1,a_2,\dots,a_7\}$ of positive integers with the smallest possible total sum such that two conditions hold. First, no two non-empty disjoint subsets of $A$ may have the same sum. Second, whenever two disjoint subsets have different sizes, the larger subset must also have the larger sum. The optimum set is $\{20,31,38,39,40,42,45\}$, so the required concatenated string is $20313839404245$. The supplied implementations do not use a closed-form formula. Instead, they combine a sharp structural lemma for the size condition, an explicit disjoint-subset-sum test, and a search seeded by the known optimum for six elements. Mathematical Approach Write $\sigma(B)=\sum_{x\in B}x$. Once the candidate set is sorted increasingly, Problem 103 becomes a finite search with strong pruning: one rule collapses to a few linear inequalities, and the other can be checked exactly by enumerating subset masks. Reducing the size rule to prefix-versus-suffix inequalities Assume $a_1<a_2<\cdots<a_n$. Among all subsets of size $m+1$, the smallest possible sum is $a_1+\cdots+a_{m+1}$. Among all subsets of size $m$, the largest possible sum is $a_{n-m+1}+\cdots+a_n$....

Detailed mathematical approach

Problem Summary

We seek an increasing seven-element set $A=\{a_1,a_2,\dots,a_7\}$ of positive integers with the smallest possible total sum such that two conditions hold. First, no two non-empty disjoint subsets of $A$ may have the same sum. Second, whenever two disjoint subsets have different sizes, the larger subset must also have the larger sum. The optimum set is $\{20,31,38,39,40,42,45\}$, so the required concatenated string is $20313839404245$.

The supplied implementations do not use a closed-form formula. Instead, they combine a sharp structural lemma for the size condition, an explicit disjoint-subset-sum test, and a search seeded by the known optimum for six elements.

Mathematical Approach

Write $\sigma(B)=\sum_{x\in B}x$. Once the candidate set is sorted increasingly, Problem 103 becomes a finite search with strong pruning: one rule collapses to a few linear inequalities, and the other can be checked exactly by enumerating subset masks.

Reducing the size rule to prefix-versus-suffix inequalities

Assume $a_1<a_2<\cdots<a_n$. Among all subsets of size $m+1$, the smallest possible sum is $a_1+\cdots+a_{m+1}$. Among all subsets of size $m$, the largest possible sum is $a_{n-m+1}+\cdots+a_n$. Therefore the condition

$|B|>|C| \Longrightarrow \sigma(B)>\sigma(C)$

is guaranteed once we verify

$a_1+\cdots+a_{m+1} > a_{n-m+1}+\cdots+a_n \qquad \left(m=1,2,\dots,\left\lfloor\frac{n-1}{2}\right\rfloor\right).$

For $n=7$, this becomes only three inequalities:

$a_1+a_2>a_7,\qquad a_1+a_2+a_3>a_6+a_7,\qquad a_1+a_2+a_3+a_4>a_5+a_6+a_7.$

This lemma is the main fast filter in every implementation.

The seven-element template from the optimum six-element set

The standard near-optimum construction starts from the optimum special sum set of size 6, namely $\{11,18,19,20,22,25\}$. Let the middle element be $b=20$. The usual template for the next size is

$\{b\}\cup\{b+x:x\in\{11,18,19,20,22,25\}\}=\{20,31,38,39,40,42,45\}.$

This already has total sum $255$, so it immediately provides an upper bound: any better solution would need sum strictly less than 255. The search procedures are organized around trying to beat that bound.

Worked Example: checking the template against the size rule

For $A=\{20,31,38,39,40,42,45\}$, the three required comparisons are

$20+31=51>45,\qquad 20+31+38=89>42+45=87,\qquad 20+31+38+39=128>40+42+45=127.$

So the larger-subset-larger-sum rule already holds. This is important because it means the candidate survives the cheapest structural test before any expensive subset enumeration is performed.

Testing unique sums only for disjoint subsets

The second rule is subtler. We do not need all subset sums to be distinct; we only need equality to be impossible for two disjoint non-empty subsets. For $n=7$, there are $2^7-1=127$ non-empty subsets, so the implementations enumerate masks $M$ and compute $\sigma(M)$. A candidate fails as soon as two masks satisfy

$M\cap N=\varnothing,\qquad \sigma(M)=\sigma(N).$

In the fastest implementation, subset sums are filled incrementally by removing the least significant chosen element, which gives the recurrence

$\sigma(M)=\sigma\!\bigl(M\setminus\{\ell\}\bigr)+a_\ell.$

At this problem size, exact enumeration is completely practical.

Branch-and-bound with a triangular lower bound

Suppose an increasing prefix $a_1<\cdots<a_k$ has already been chosen, its current sum is $S_{\mathrm{cur}}$, and the next value must be at least $v$. If $r=7-k$ elements remain, then the smallest possible completion is $v,v+1,\dots,v+r-1$, so every extension has total at least

$S_{\mathrm{cur}}+v+(v+1)+\cdots+(v+r-1)=S_{\mathrm{cur}}+rv+\frac{r(r-1)}{2}.$

If this lower bound is already at least the best known sum, the entire branch can be discarded. The exhaustive implementations also reuse the prefix-versus-suffix inequalities on partial prefixes: once a partial set violates one of those inequalities, adding larger future values cannot repair it.

Why the optimum is certified

Starting from the upper bound 255 supplied by the template, the exhaustive search enumerates all strictly increasing seven-element candidates that could possibly beat it, pruning branches that are already hopeless. Every full candidate is then checked against both special-sum conditions. Since no valid set with smaller total sum exists, the template itself is optimal, giving $\{20,31,38,39,40,42,45\}$ and the string $20313839404245$.

How the Code Works

The C++, Python, and Java implementations all treat candidates as increasing sequences and all use the same two mathematical validators: the prefix-suffix inequalities for the size rule and a disjoint-subset-sum check for the uniqueness rule.

The implementations differ mainly in the search layer. The C++ version performs a full branch-and-bound search with a strong arithmetic lower bound for the unfinished tail. The Java version also searches recursively from the template upper bound, but with a simpler next-value window. The Python version explores a tight neighborhood around the standard seven-element template built from the six-element optimum. In all three languages, the same special-sum conditions determine whether a candidate survives.

When a candidate reaches length 7, the implementation verifies that it is sorted, applies the size-rule filter, enumerates the non-empty subset masks, and rejects the candidate immediately if a disjoint equal-sum pair appears. Any valid set with a smaller total replaces the current best answer.

Complexity Analysis

For a fixed set of size $n$, the size-rule test costs $O(n)$. The subset stage needs $O(2^n)$ masks, and the straightforward disjoint-pair comparison used here is exponential as well; stated conservatively, the full validator is at most $O(4^n)$, which is still tiny for $n=7$.

The outer search is also exponential in principle, because it ranges over candidate sets. The reason the problem is still easy in practice is that the upper bound $255$ is already extremely tight, and the lower bound $S_{\mathrm{cur}}+rv+\frac{r(r-1)}{2}$ removes large parts of the tree before the expensive subset test is even reached. For this fixed seven-element case, the optimum is certified quickly.

Footnotes and References

Problem page: Project Euler 103
Subset sum problem: Wikipedia - Subset sum problem
Power set: Wikipedia - Power set
Branch and bound: Wikipedia - Branch and bound
Backtracking: Wikipedia - Backtracking