Problem 201: Subsets with a Unique Sum
View on Project EulerProject Euler Problem 201 Solution
EulerSolve provides an optimized solution for Project Euler Problem 201, Subsets with a Unique Sum, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary Let \(A=\{1^2,2^2,\dots,100^2\}\). Among all 50-element subsets of \(A\), some subset sums occur exactly once and others occur more than once. The task is to add precisely those sums that are realized by one and only one 50-element subset. Brute force is hopeless because the number of 50-subsets is \(\binom{100}{50}\). The important observation is that the final answer does not depend on the exact multiplicity of a sum once that multiplicity reaches 2. We only need to know whether a sum is impossible, unique, or already non-unique. Mathematical Approach Write \(v_i=i^2\). For each prefix \(\{v_1,\dots,v_t\}\), each subset size \(j\), and each sum \(s\), let \(C_t(j,s)\) be the number of \(j\)-element subsets of that prefix whose elements add up to \(s\). Then the quantity required by the problem is $\sum_{s: C_{100}(50,s)=1} s.$ The whole solution comes from turning this exact counting function into a three-state dynamic program. Fixed-Size Subset Counts The ordinary 0/1 subset-sum recurrence with a cardinality dimension is $C_t(j,s)=C_{t-1}(j,s)+C_{t-1}(j-1,s-v_t).$ The first term skips the new square \(v_t\); the second term uses it. The base condition is $C_0(0,0)=1,$ because the empty subset is the unique 0-element subset with sum 0, while every other state with \(t=0\) is impossible....
Detailed mathematical approach
Problem Summary
Let \(A=\{1^2,2^2,\dots,100^2\}\). Among all 50-element subsets of \(A\), some subset sums occur exactly once and others occur more than once. The task is to add precisely those sums that are realized by one and only one 50-element subset.
Brute force is hopeless because the number of 50-subsets is \(\binom{100}{50}\). The important observation is that the final answer does not depend on the exact multiplicity of a sum once that multiplicity reaches 2. We only need to know whether a sum is impossible, unique, or already non-unique.
Mathematical Approach
Write \(v_i=i^2\). For each prefix \(\{v_1,\dots,v_t\}\), each subset size \(j\), and each sum \(s\), let \(C_t(j,s)\) be the number of \(j\)-element subsets of that prefix whose elements add up to \(s\). Then the quantity required by the problem is
$\sum_{s: C_{100}(50,s)=1} s.$
The whole solution comes from turning this exact counting function into a three-state dynamic program.
Fixed-Size Subset Counts
The ordinary 0/1 subset-sum recurrence with a cardinality dimension is
$C_t(j,s)=C_{t-1}(j,s)+C_{t-1}(j-1,s-v_t).$
The first term skips the new square \(v_t\); the second term uses it. The base condition is
$C_0(0,0)=1,$
because the empty subset is the unique 0-element subset with sum 0, while every other state with \(t=0\) is impossible.
Why Three States Are Enough
The final answer only distinguishes the cases \(0\), \(1\), and \(\ge 2\). So define the saturated count
$M_t(j,s)=\min(2,C_t(j,s))\in\{0,1,2\}.$
Then the recurrence can be clipped after every update:
$M_t(j,s)=\min\bigl(2,M_{t-1}(j,s)+M_{t-1}(j-1,s-v_t)\bigr).$
This loses no relevant information. For example, among the first seven squares, the sum 62 already appears at least twice:
$62=1+25+36=4+9+49.$
Once a state has reached value 2, it can never become unique again, so the exact count beyond that threshold is irrelevant to the final sum.
0/1 Updates and Reachable Windows
Each square may be used at most once. Therefore, when the recurrence is implemented in place, subset sizes must be processed in descending order. That guarantees every new \(j\)-subset still reads from states representing only the first \(t-1\) squares.
The implementations also keep, for each subset size \(j\), the smallest and largest reachable sums seen so far. If the next square is \(x\), then every newly created \(j\)-subset sum that uses \(x\) must lie between
$L_{j-1}+x \quad \text{and} \quad H_{j-1}+x,$
where \(L_{j-1}\) and \(H_{j-1}\) are the current lower and upper bounds for \((j-1)\)-element sums. These bounds do not claim that every intermediate value is reachable; they merely bracket the active region so the inner loop skips sums that are certainly impossible.
The global table width is determined by the largest possible 50-subset sum, namely
$S_{\max}=\sum_{i=51}^{100} i^2.$
Worked Examples
A small example with no collisions is the checkpoint case \(n=5\), \(k=3\), based on the set \(\{1,4,9,16,25\}\). The 3-element subset sums are
$14,21,26,29,30,35,38,42,45,50.$
All ten are distinct, so every reachable sum has saturated value 1 and therefore
$U(5,3)=14+21+26+29+30+35+38+42+45+50=330.$
This example shows the “unique” state. The earlier identity for 62 shows the “non-unique” state. Those two examples together explain why the dynamic program only needs the values 0, 1, and 2.
Extracting the Final Quantity
After all 100 squares have been processed, the desired answer is simply
$\sum_{s: M_{100}(50,s)=1} s.$
No subset reconstruction is required. The dynamic program has already preserved exactly the information needed to decide whether each 50-subset sum is unique.
How the Code Works
The C++, Python, and Java implementations all realize the same mathematics. They first generate the list \(1^2,2^2,\dots,100^2\), compute the maximum possible 50-subset sum \(S_{\max}\), and allocate one flat table representing all states \((j,s)\) with \(0\le j\le 50\) and \(0\le s\le S_{\max}\).
Compact State Storage
Because every state is only 0, 1, or 2, one byte per cell is enough. For the Euler instance,
$S_{\max}=\sum_{i=51}^{100} i^2=295425,$
so the table contains \((50+1)(295425+1)\) cells, which is practical with byte-sized storage.
In-Place Dynamic Programming Sweep
For each square \(x\), the implementation walks the subset size downward. Inside the currently relevant sum window, it reads the state for \((j-1,s-x)\), adds that contribution into the state for \((j,s)\), and then clips the result back to 2. States whose predecessor value is 0 are skipped immediately.
The size loop must go downward to preserve the 0/1 meaning of the recurrence. The sum loop is also performed in descending order inside the active window, which fits the same in-place update discipline and keeps the implementation simple.
Maintaining Bounds and Finishing
After each square is processed, the lower and upper reachable sums for every subset size are widened only when the new square can actually create additional totals. At the end, the implementation scans the reachable window for subset size 50 and adds exactly those sums whose stored state is 1. States 0 and 2 are ignored for opposite reasons: one is impossible, the other is not unique.
Complexity Analysis
Let \(k=50\) and let \(S_{\max}\) be the largest possible \(k\)-subset sum. The worst-case running time is
$O(nkS_{\max}),$
which is the standard pseudo-polynomial complexity of fixed-cardinality subset-sum dynamic programming. The moving lower and upper bounds reduce the amount of work in practice, but they do not change the asymptotic upper bound.
The memory usage is
$O(kS_{\max}),$
because only one table over subset size and sum is stored. The method is efficient precisely because it replaces an impossible search over \(\binom{100}{50}\) subsets by a structured DP over reachable sums.
Footnotes and References
- Project Euler problem page: Project Euler - Problem 201
- Subset sum problem: Wikipedia - Subset sum problem
- 0/1 knapsack dynamic programming: cp-algorithms - Knapsack Problem
- Dynamic programming overview: Wikipedia - Dynamic programming