Problem 355: Maximal Coprime Subset
View on Project EulerProject Euler Problem 355 Solution
EulerSolve provides an optimized solution for Project Euler Problem 355, Maximal Coprime Subset, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary Let \(\operatorname{Co}(n)\) denote the maximum possible sum of a subset \(A\subseteq\{1,2,\dots,n\}\) such that any two distinct chosen numbers are coprime: $\operatorname{Co}(n)=\max\left\{\sum_{x\in A}x : A\subseteq\{1,\dots,n\},\ \gcd(a,b)=1\ \forall a\ne b\in A\right\}.$ The local C++, Python, and Java solutions all use the same structure: start from the best one-prime-per-slot baseline, then add only profitable replacements of one small-prime power and one large prime by a larger mixed value. Those replacements form a maximum-weight bipartite matching. Mathematical Approach Step 1: One Prime, One Slot If two selected numbers were both divisible by the same prime \(p\), their gcd would be at least \(p\), so they could not be pairwise coprime. Therefore each prime may appear in at most one chosen number. For a fixed prime \(p\le n\), the best number using only that prime is the largest prime power not exceeding \(n\): $P_p=\max\{p^k : k\ge 1,\ p^k\le n\}.$ This gives the baseline set $B_0=\{1\}\cup\{P_p : p\in\mathbb{P},\ p\le n\},$ Here \(\mathbb{P}\) denotes the set of prime numbers. which is pairwise coprime because different members are powers of different primes. Its sum is $S_0=1+\sum_{p\le n}P_p,$ where the sum runs over primes \(p\le n\)....
Detailed mathematical approach
Problem Summary
Let \(\operatorname{Co}(n)\) denote the maximum possible sum of a subset \(A\subseteq\{1,2,\dots,n\}\) such that any two distinct chosen numbers are coprime:
$\operatorname{Co}(n)=\max\left\{\sum_{x\in A}x : A\subseteq\{1,\dots,n\},\ \gcd(a,b)=1\ \forall a\ne b\in A\right\}.$
The local C++, Python, and Java solutions all use the same structure: start from the best one-prime-per-slot baseline, then add only profitable replacements of one small-prime power and one large prime by a larger mixed value. Those replacements form a maximum-weight bipartite matching.
Mathematical Approach
Step 1: One Prime, One Slot
If two selected numbers were both divisible by the same prime \(p\), their gcd would be at least \(p\), so they could not be pairwise coprime. Therefore each prime may appear in at most one chosen number.
For a fixed prime \(p\le n\), the best number using only that prime is the largest prime power not exceeding \(n\):
$P_p=\max\{p^k : k\ge 1,\ p^k\le n\}.$
This gives the baseline set
$B_0=\{1\}\cup\{P_p : p\in\mathbb{P},\ p\le n\},$
Here \(\mathbb{P}\) denotes the set of prime numbers.
which is pairwise coprime because different members are powers of different primes. Its sum is
$S_0=1+\sum_{p\le n}P_p,$
where the sum runs over primes \(p\le n\).
Step 2: Why Only Two-Prime Replacements Can Beat the Baseline
Suppose a candidate number \(x\le n\) has distinct prime divisors in a set \(T\). Choosing \(x\) forbids us from keeping the baseline representatives \(P_r\) for \(r\in T\).
If \(P_r=r^k\), then \(r^{k+1}>n\), hence
$P_r>\frac{n}{r}.$
Therefore
$\sum_{r\in T}P_r>n\sum_{r\in T}\frac{1}{r}.$
If \(x\) has at least three distinct prime factors, the smallest possible reciprocal sum is
$\frac{1}{2}+\frac{1}{3}+\frac{1}{5}>1,$
so \(\sum_{r\in T}P_r>n\ge x\). Such a number can never improve the baseline. This is why the implementations only search upgrades involving exactly two primes.
Step 3: Small and Large Primes
Split the primes at \(\lfloor\sqrt{n}\rfloor\): small primes satisfy \(p\le \sqrt{n}\), while large primes satisfy \(q>\sqrt{n}\).
Two large primes cannot appear in the same chosen number because then their product would exceed \(n\). Also, for a large prime \(q\), we have \(P_q=q\) since \(q^2>n\).
The local solver therefore encodes profitable mixed replacements as numbers of the form
$M(p,q)=q\cdot \max\{p^a : a\ge 1,\ p^a q\le n\},\qquad p\le \sqrt{n}\lt q.$
The helper highest_power_with_limit(p, n / q) computes the inner maximum. Replacing \(P_p\) and \(P_q\) by \(M(p,q)\) changes the sum by
$w(p,q)=M(p,q)-P_p-P_q.$
Only positive values of \(w(p,q)\) are useful, so only those become edges in the optimization graph.
Step 4: Reduction to Maximum-Weight Matching
Each small prime can be used in at most one mixed number, and each large prime can also be used at most once. That is exactly a matching constraint.
Create a bipartite graph with small primes on the left and large primes on the right. Add an edge \((p,q)\) with weight \(w(p,q)\) whenever \(w(p,q)>0\). The implementations add extra zero-weight dummy columns so that a small prime may remain unmatched and simply keep \(P_p\).
If \(\mathcal{M}\) is a matching, its total gain is
$G(\mathcal{M})=\sum_{(p,q)\in \mathcal{M}}w(p,q).$
The answer computed by the code is
$\boxed{\operatorname{Co}(n)=S_0+\max_{\mathcal{M}}G(\mathcal{M}).}$
The maximum is obtained with the Hungarian algorithm on a rectangular assignment matrix.
Worked Example: \(n=30\)
The baseline representatives are
$1,\ 16,\ 27,\ 25,\ 7,\ 11,\ 13,\ 17,\ 19,\ 23,\ 29,$
so
$S_0=188.$
Here \(\lfloor\sqrt{30}\rfloor=5\), so the small primes are \(2,3,5\). The only positive gain edge is \((2,7)\):
$M(2,7)=7\cdot 4=28,\qquad w(2,7)=28-16-7=5.$
Hence
$\operatorname{Co}(30)=188+5=193,$
which matches the checkpoint embedded in the C++ program.
Why Matching Is Needed: \(n=100\)
For \(n=100\), the baseline sum is \(S_0=1263\). Several small primes have multiple profitable partners, so a greedy choice is not enough. The maximum matching selected by the implementations uses
$64+11\to 88,\qquad 25+19\to 95,\qquad 49+13\to 91,$
with gains \(13\), \(51\), and \(29\). The total gain is \(93\), giving
$\operatorname{Co}(100)=1263+93=1356.$
How the Code Works
The three solution files implement the same mathematics. They sieve primes up to \(n\), compute every \(P_p\) with max_prime_power_leq_n, sum the baseline, and split primes at \(\lfloor\sqrt{n}\rfloor\).
For each feasible pair \((p,q)\) with \(p\) small, \(q\) large, and \(pq\le n\), they evaluate the best mixed value \(M(p,q)\). Positive gains are written into a weight matrix of size
$s\times(\ell+s),$
where \(s\) is the number of small primes and \(\ell\) is the number of large primes. The extra \(s\) columns are the dummy unmatched choices. The function hungarian_max_weight_bipartite returns the maximum additional gain.
The C++ version exposes this pipeline through prepare_solver_data, build_gain_edges, and solve_co_optimized. It also parallelizes edge generation over the small-prime side and verifies the checkpoints \(\operatorname{Co}(10)=30\), \(\operatorname{Co}(30)=193\), and \(\operatorname{Co}(100)=1356\). The Python and Java versions follow the same matrix construction in a single thread.
Complexity Analysis
Let \(s=\pi(\lfloor\sqrt{n}\rfloor)\) and \(\ell=\pi(n)-s\). The sieve costs \(O(n\log\log n)\) time and \(O(n)\) memory. Building the gain matrix costs \(O(E\log n)\), where \(E\) is the number of feasible pairs \((p,q)\) with \(p\le\sqrt{n}\lt q\) and \(pq\le n\); the logarithmic factor comes from multiplying \(p\) repeatedly up to the limit \(n/q\).
The Hungarian algorithm on an \(s\times(\ell+s)\) matrix costs \(O(s^2(\ell+s))\) time and \(O(s(\ell+s))\) memory. For the actual input \(n=200000\), we have \(s=\pi(447)=86\), so the matching stage is small compared with the full prime range.
References
- Problem page: https://projecteuler.net/problem=355
- Coprime integers: Wikipedia - Coprime integers
- Prime power: Wikipedia - Prime power
- Sieve of Eratosthenes: Wikipedia - Sieve of Eratosthenes
- Hungarian algorithm: Wikipedia - Hungarian algorithm