Problem 874: Maximal Prime Score
View on Project EulerProject Euler Problem 874 Solution
EulerSolve provides an optimized solution for Project Euler Problem 874, Maximal Prime Score, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary Let \(p_0=2,p_1=3,\dots,p_{k-1}\) be the first \(k\) primes. We choose \(n\) indices \(a_1,\dots,a_n\in\{0,\dots,k-1\}\) and define the score $S=\sum_{i=1}^{n} p_{a_i}.$ The admissibility condition is $\sum_{i=1}^{n} a_i\equiv 0 \pmod{k}.$ If the congruence were ignored, the best possible choice would be to take the largest available prime \(p_{k-1}\) every time. The real task is therefore to understand the smallest penalty needed to correct the residue class of the index sum while keeping the total score as large as possible. Mathematical Approach The C++, Python, and Java implementations all convert the constrained maximization into a minimum-loss problem on residue classes modulo \(k\). Step 1: Start from the unconstrained maximum Write $P=p_{k-1}.$ If every term were chosen to be \(P\), the score would simply be $nP.$ Any legal solution can only differ from this ideal by replacing some copies of \(P\) with smaller primes from the same list. Step 2: Encode each replacement by a deficit For every selected index define the deficit $d_i=(k-1)-a_i,\qquad 0\le d_i\le k-1.$ Then \(d_i=0\) means that the largest prime \(P\) was kept, while \(d_i>0\) means that we moved down by \(d_i\) positions in the prime list....
Detailed mathematical approach
Problem Summary
Let \(p_0=2,p_1=3,\dots,p_{k-1}\) be the first \(k\) primes. We choose \(n\) indices \(a_1,\dots,a_n\in\{0,\dots,k-1\}\) and define the score
$S=\sum_{i=1}^{n} p_{a_i}.$
The admissibility condition is
$\sum_{i=1}^{n} a_i\equiv 0 \pmod{k}.$
If the congruence were ignored, the best possible choice would be to take the largest available prime \(p_{k-1}\) every time. The real task is therefore to understand the smallest penalty needed to correct the residue class of the index sum while keeping the total score as large as possible.
Mathematical Approach
The C++, Python, and Java implementations all convert the constrained maximization into a minimum-loss problem on residue classes modulo \(k\).
Step 1: Start from the unconstrained maximum
Write
$P=p_{k-1}.$
If every term were chosen to be \(P\), the score would simply be
$nP.$
Any legal solution can only differ from this ideal by replacing some copies of \(P\) with smaller primes from the same list.
Step 2: Encode each replacement by a deficit
For every selected index define the deficit
$d_i=(k-1)-a_i,\qquad 0\le d_i\le k-1.$
Then \(d_i=0\) means that the largest prime \(P\) was kept, while \(d_i>0\) means that we moved down by \(d_i\) positions in the prime list. The score loss caused by one such move is
$\ell(d)=P-p_{k-1-d},\qquad \ell(0)=0.$
Hence the total score becomes
$S=\sum_{i=1}^{n} p_{a_i}=nP-\sum_{i=1}^{n}\ell(d_i).$
Maximizing \(S\) is therefore equivalent to minimizing the total loss \(\sum \ell(d_i)\).
Step 3: Rewrite the modular condition in deficit form
Since \(a_i=(k-1)-d_i\), we have
$\sum_{i=1}^{n} a_i=n(k-1)-\sum_{i=1}^{n} d_i.$
The congruence condition becomes
$n(k-1)-\sum_{i=1}^{n} d_i\equiv 0 \pmod{k}.$
Using \(k-1\equiv -1\pmod{k}\), this simplifies to
$\sum_{i=1}^{n} d_i\equiv -n \pmod{k}.$
Define the target residue
$R\equiv -n \pmod{k},\qquad R=(k-(n\bmod k))\bmod k.$
The original problem is now: choose deficits \(d_i\in\{0,\dots,k-1\}\) whose sum is congruent to \(R\) modulo \(k\), while making the total loss as small as possible.
Step 4: Separate the harmless zero deficits
The value \(d=0\) neither changes the residue class nor adds any loss. Such choices are useful only for padding the number of selected terms up to exactly \(n\). The real optimization is therefore to find the cheapest collection of nonzero deficits whose sum is congruent to \(R\) modulo \(k\).
At this point only the current residue matters, not the order in which the nonzero deficits were chosen. That observation leads directly to a shortest-path model.
Step 5: Build a shortest-path problem on residues
Create a directed graph with vertices \(0,1,\dots,k-1\), where each vertex represents the current residue of the accumulated deficit sum modulo \(k\).
For every nonzero deficit \(d\in\{1,\dots,k-1\}\), add the transition
$u\longrightarrow (u+d)\bmod k$
with edge weight \(\ell(d)\).
If \(D(r)\) denotes the minimum path cost from residue \(0\) to residue \(r\), then \(D(r)\) is exactly the minimum total loss needed to achieve residue \(r\). Therefore
$\boxed{S_{\max}=nP-D(R).}$
Step 6: Why the shortest path matches the fixed-length selection problem
Every edge corresponds to one nonzero deficit, so a path with \(t\) edges represents \(t\) genuine downgrades from the largest prime. Because every loss \(\ell(d)\) with \(d>0\) is positive, an optimal path never needs to repeat a vertex. If a residue appeared twice, the intermediate segment would be a positive-cost cycle, and deleting that cycle would keep the same final residue while lowering the total loss.
Thus a minimum-cost path can be taken to be simple, so it uses at most \(k-1\) edges. In the solved instance we have \(n\ge k-1\), so after finding the optimal path we can add enough zero deficits to reach exactly \(n\) chosen terms. Those extra terms do not change either the residue or the score. This proves that the residue shortest-path solution is exact for the original problem.
Worked Example: \((k,n)=(5,8)\)
The first five primes are \(2,3,5,7,11\), so \(P=11\). The nonzero deficits have losses
$\ell(1)=11-7=4,\quad \ell(2)=11-5=6,\quad \ell(3)=11-3=8,\quad \ell(4)=11-2=9.$
The target residue is
$R=(5-(8\bmod 5))\bmod 5=2.$
One way to reach residue \(2\) is to take a single deficit \(d=2\), giving total loss \(6\). Another way is \(d=1+d=1\), but that costs \(4+4=8\), which is worse. So the minimum loss is \(D(2)=6\).
Hence
$S_{\max}=8\cdot 11-6=82.$
An optimal selection is to take the prime \(11\) seven times and the prime \(5\) once. The index sum is \(7\cdot 4+2=30\), and indeed \(30\equiv 0\pmod{5}\).
How the Code Works
The implementations first generate enough primes with a sieve. They begin from a standard prime-number-theorem estimate for the search bound and enlarge that bound if it does not yet contain enough primes. This provides the first \(k\) primes used in the score, and also the next prime needed by the final input choice for \(n\).
They then compute the largest available prime \(P\), build the loss table \(\ell(d)\) for \(1\le d\le k-1\), and evaluate the target residue \(R=(-n)\bmod k\).
After that, the implementation runs Dijkstra's algorithm on the implicit residue graph. The distance array stores the best loss known so far for each residue. Whenever a residue is finalized, all \(k-1\) nonzero deficit transitions are relaxed.
Once the minimum loss \(D(R)\) is known, the result is returned as \(nP-D(R)\). One implementation also compares this fast method with a small exact dynamic program on tiny test cases, but the final computation uses only the shortest-path formulation.
Complexity Analysis
The residue graph has \(k\) vertices and \(k(k-1)\) directed edges, generated implicitly. Dijkstra on this dense graph costs
$O(k^2\log k)$
time with a binary heap.
The sieve stage uses an upper bound \(B\) of order \(k(\log k+\log\log k)\), so prime generation costs \(O(B\log\log B)\) time and \(O(B)\) memory. For the actual parameters, the shortest-path stage is the dominant part.
The permanent arrays for the primes, losses, and finalized distances are \(O(k)\). With a standard heap implementation that pushes a fresh candidate whenever a better distance is found, the heap itself can grow to \(O(k^2)\) entries in the worst case.
Footnotes and References
- Problem page: https://projecteuler.net/problem=874
- Dijkstra's algorithm: Wikipedia - Dijkstra's algorithm
- Modular arithmetic: Wikipedia - Modular arithmetic
- Prime number theorem: Wikipedia - Prime number theorem
- Shortest path problem: Wikipedia - Shortest path problem