Problem 95: Amicable Chains
View on Project EulerProject Euler Problem 95 Solution
EulerSolve provides an optimized solution for Project Euler Problem 95, Amicable Chains, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary Let \(s(n)\) be the sum of the proper divisors of \(n\): $s(n)=\sum_{\substack{d\mid n\\1\le d<n}} d.$ Starting from \(a_0=n\), define an aliquot sequence by \(a_{k+1}=s(a_k)\). The problem asks for the smallest member of the longest cycle produced by this iteration, under the condition that every element of the cycle is at most \(10^6\). Thus we are searching the directed cycles of the map \(n\mapsto s(n)\) inside the bounded interval \([1,10^6]\). A direct per-start factorization strategy would waste most of the work. The implementations instead precompute every value \(s(n)\) once, then analyze the resulting functional graph of outgoing edges. Mathematical Approach Write \(N=10^6\) and \(f(n)=s(n)\). Each integer \(n\in[1,N]\) has exactly one successor \(f(n)\), although that successor may lie outside the interval. The full problem therefore becomes a cycle search in a directed graph with out-degree 1. The sum-of-proper-divisors map Every proper divisor \(d\) of \(m\) contributes exactly once to \(s(m)\). Instead of factoring each \(m\) separately, we distribute each possible divisor to all of its multiples....
Detailed mathematical approach
Problem Summary
Let \(s(n)\) be the sum of the proper divisors of \(n\):
$s(n)=\sum_{\substack{d\mid n\\1\le d<n}} d.$
Starting from \(a_0=n\), define an aliquot sequence by \(a_{k+1}=s(a_k)\). The problem asks for the smallest member of the longest cycle produced by this iteration, under the condition that every element of the cycle is at most \(10^6\). Thus we are searching the directed cycles of the map \(n\mapsto s(n)\) inside the bounded interval \([1,10^6]\).
A direct per-start factorization strategy would waste most of the work. The implementations instead precompute every value \(s(n)\) once, then analyze the resulting functional graph of outgoing edges.
Mathematical Approach
Write \(N=10^6\) and \(f(n)=s(n)\). Each integer \(n\in[1,N]\) has exactly one successor \(f(n)\), although that successor may lie outside the interval. The full problem therefore becomes a cycle search in a directed graph with out-degree 1.
The sum-of-proper-divisors map
Every proper divisor \(d\) of \(m\) contributes exactly once to \(s(m)\). Instead of factoring each \(m\) separately, we distribute each possible divisor to all of its multiples. For a fixed \(d\), the affected numbers are \(2d,3d,\dots,\left\lfloor N/d\right\rfloor d\), so the sieve performs the updates
$s(kd)\leftarrow s(kd)+d\qquad\text{for }k=2,3,\dots,\left\lfloor\frac{N}{d}\right\rfloor.$
After all \(d\) have been processed, the array already contains the exact values of \(s(1),s(2),\dots,s(N)\). This is the real mathematical object used by the code: the aliquot map has been materialized as a lookup table.
Why the sieve is fast enough
The total number of updates is
$\sum_{d=1}^{N}\left(\left\lfloor\frac{N}{d}\right\rfloor-1\right),$
which is \(O(N\log N)\) by the harmonic-series estimate. That is dramatically smaller than refactoring every number from scratch, and it makes the later graph traversal cheap because each successor query becomes an \(O(1)\) array access.
The bounded functional graph
Once the table is known, every start value generates a deterministic walk
$n,\ f(n),\ f(f(n)),\ \dots.$
For a walk that stays inside \([1,N]\), exactly one of the following eventually happens: the next value leaves the interval, the walk enters a region explored earlier, or the walk repeats a value already seen on the current path. The third case is the only one that produces a new bounded cycle.
This viewpoint matters because a graph with out-degree 1 cannot branch forward: from any starting value there is only one possible future. That is the invariant that makes the global marking step correct.
Recovering the cycle from the first repeated value
Suppose the current walk has recorded distinct in-range values
$p_0,p_1,\dots,p_{m-1}.$
If the next value is \(p_r\) for some \(r\), then the suffix
$p_r,p_{r+1},\dots,p_{m-1}$
is exactly the cycle. Its length is
$\ell=m-r,$
and the candidate answer from this cycle is
$\min\{p_r,p_{r+1},\dots,p_{m-1}\}.$
The prefix \(p_0,\dots,p_{r-1}\) is only a tail leading into that cycle. This is why the implementations keep, alongside the current path, a table recording the first position where each value on that path appeared.
Why completed paths can be marked globally
After one walk stops, every in-range value on that walk can be marked as fully processed. The reason is structural: in a functional graph, starting again from any one of those values can only follow the same tail, reach the same cycle, leave the interval in the same way, or merge into a component that was already known. No new bounded cycle can be discovered through them.
That single invariant converts the second phase from repeated trial walks into an almost linear sweep: each in-range value is appended to an explicit path at most once before it becomes permanently classified.
Worked example: the small checkpoint
A concrete example appears already at the smaller bound \(N=300\). The sieve gives
$s(220)=1+2+4+5+10+11+20+22+44+55+110=284,$
$s(284)=1+2+4+71+142=220.$
So the walk from 220 is
$220\to 284\to 220.$
The repeated value 220 first appeared at position 0, so the whole recorded path is the cycle. Its length is 2 and its smallest member is 220. This is exactly the kind of cycle extraction the implementations use at the full one-million bound as well.
How the Code Works
The C++, Python, and Java implementations first build an array of proper-divisor sums for all values up to the limit. They then scan start values from 2 upward. If a start value has already been classified during an earlier walk, it is skipped immediately.
For each new start value, the implementation grows a current path and a position table for values on that path. The walk stops when the next value leaves \([1,N]\), reaches a globally processed value, or repeats inside the current path. In the repeat case, the repeated suffix yields one bounded cycle, from which the implementation computes the cycle length and its minimum member. The global best answer is updated by the rule: prefer the longer cycle, and on equal length prefer the smaller minimum element. After that, every value seen on the current path is marked as processed.
Self-loops, corresponding to perfect numbers, need no special branch in the implementation. They are simply cycles of length 1, and any longer cycle automatically beats them under the same update rule.
Complexity Analysis
The sieve phase costs
$\sum_{d=1}^{N}\left\lfloor\frac{N}{d}\right\rfloor=O(N\log N),$
which dominates the running time. The path-following phase is \(O(N)\) after the sieve, because each in-range value is inserted into an explicit path at most once before it becomes globally processed. Therefore the overall time complexity is \(O(N\log N)\).
Memory usage is linear. The divisor-sum table and the global processed marker both require \(O(N)\) space, and the temporary path plus its position table are also bounded by \(O(N)\) in the worst case.
Footnotes and References
- Problem page: https://projecteuler.net/problem=95
- Aliquot sequence: Wikipedia - Aliquot sequence
- Amicable numbers: Wikipedia - Amicable numbers
- Sociable number: Wikipedia - Sociable number
- Divisor function: Wikipedia - Divisor function
- Perfect number: Wikipedia - Perfect number