Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 871: Drifting Subsets

Project Euler Problem 871 Solution

EulerSolve provides an optimized solution for Project Euler Problem 871, Drifting Subsets, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary For a fixed $n$, define $f_n(x)\equiv x^3+x+1 \pmod{n}, \qquad V=\{0,1,\dots,n-1\}.$ The quantity $D(n)$ is the maximum size of a subset $S\subseteq V$ such that the map $f_n$ sends every selected element to a distinct unselected element. Equivalently, the set must satisfy $f_n(S)\cap S=\varnothing \qquad \text{and} \qquad |f_n(S)|=|S|.$ Since $V$ is finite, the second condition simply says that $f_n$ is injective on $S$. The overall Project Euler task is to evaluate $\sum_{i=1}^{100} D(100000+i).$ Mathematical Approach The map $f_n$ naturally defines a functional graph, and the whole problem becomes a graph optimization problem with a very local constraint. Step 1: Convert the congruence into a functional graph Draw one directed edge from each residue $x\in V$ to $f_n(x)$. Every vertex has out-degree exactly $1$, so each connected component of the graph consists of one directed cycle together with rooted in-trees feeding into that cycle. Let $P(u)=\{x\in V:f_n(x)=u\}$ be the set of predecessors of $u$. The drifting condition is equivalent to the local rule $|S\cap (\{u\}\cup P(u))|\le 1 \qquad \text{for every } u\in V.$ Indeed, if two chosen elements had the same image $u$, then two members of $P(u)$ would lie in $S$....

Detailed mathematical approach

Problem Summary

For a fixed $n$, define

$f_n(x)\equiv x^3+x+1 \pmod{n}, \qquad V=\{0,1,\dots,n-1\}.$

The quantity $D(n)$ is the maximum size of a subset $S\subseteq V$ such that the map $f_n$ sends every selected element to a distinct unselected element. Equivalently, the set must satisfy

$f_n(S)\cap S=\varnothing \qquad \text{and} \qquad |f_n(S)|=|S|.$

Since $V$ is finite, the second condition simply says that $f_n$ is injective on $S$. The overall Project Euler task is to evaluate

$\sum_{i=1}^{100} D(100000+i).$

Mathematical Approach

The map $f_n$ naturally defines a functional graph, and the whole problem becomes a graph optimization problem with a very local constraint.

Step 1: Convert the congruence into a functional graph

Draw one directed edge from each residue $x\in V$ to $f_n(x)$. Every vertex has out-degree exactly $1$, so each connected component of the graph consists of one directed cycle together with rooted in-trees feeding into that cycle.

Let

$P(u)=\{x\in V:f_n(x)=u\}$

be the set of predecessors of $u$. The drifting condition is equivalent to the local rule

$|S\cap (\{u\}\cup P(u))|\le 1 \qquad \text{for every } u\in V.$

Indeed, if two chosen elements had the same image $u$, then two members of $P(u)$ would lie in $S$. If a chosen element mapped to another chosen element, then both a vertex and one of its predecessors would lie in $S$. So the global condition reduces to a one-vertex-at-a-time exclusion rule.

Step 2: Peel away everything that is not on a cycle

Repeatedly remove vertices of in-degree $0$. This standard indegree-peeling process deletes every tree vertex and leaves exactly the directed cycles. The removed vertices appear in an order in which all predecessors of a peeled vertex are processed before the vertex itself, which is exactly what the dynamic program needs.

After this peeling step, each remaining cycle vertex becomes the root of an attached incoming tree formed by the removed vertices that eventually flow into it.

Step 3: Dynamic programming on a tree

Fix a peeled vertex $u$, and let $C(u)$ be its peeled predecessors, meaning the tree vertices $c$ with $f_n(c)=u$. Define:

$A_u=\text{best value in the tree of }u\text{ when }u\notin S,$

$B_u=\text{best value in the tree of }u\text{ when }u\in S.$

If $u$ is selected, then none of its predecessors may be selected, because they would either share the image $u$ or land inside the chosen set. Therefore

$B_u=1+\sum_{c\in C(u)} A_c.$

If $u$ is not selected, then at most one predecessor may be selected, because all predecessors map to the same target $u$. So we may take the baseline in which every predecessor stays unselected, and optionally upgrade one predecessor to its selected state:

$A_u=\sum_{c\in C(u)} A_c+\max\left(0,\max_{c\in C(u)}(B_c-A_c)\right).$

This recurrence is the key tree formula used by all three implementations.

Step 4: Compress each cycle vertex into a baseline and a gain

Now consider one directed cycle $c_0,c_1,\dots,c_{k-1}$ with

$f_n(c_i)=c_{i+1} \quad \text{for } i \pmod{k}.$

For each cycle vertex $c_i$, let $T_i$ be the non-cycle predecessors attached to it. Their tree contribution can be summarized by two numbers:

$a_i=\sum_{t\in T_i} A_t,$

$g_i=\max\left(0,\max_{t\in T_i}(B_t-A_t)\right).$

Here $a_i$ is the baseline if all attached predecessors stay unselected, and $g_i$ is the best optional bonus from selecting one attached predecessor. If we decide not to select any cycle vertex at all, then the entire component contributes

$C_0=\sum_{i=0}^{k-1}(a_i+g_i).$

Step 5: Reduce the cycle itself to a maximum-weight independent set

Suppose we now decide to select the cycle vertex $c_i$. Relative to the baseline $C_0$, this has three effects:

$+1 \text{ for selecting } c_i,$

$-g_i \text{ because no attached predecessor of } c_i \text{ may be selected},$

$-g_{i+1} \text{ because } c_i \text{ already occupies the unique chosen predecessor slot into } c_{i+1}.$

So the net weight of selecting $c_i$ is

$w_i=1-g_i-g_{i+1}.$

Also, two adjacent cycle vertices cannot both be selected: if $c_i\in S$, then $f_n(c_i)=c_{i+1}\notin S$. Hence we must choose a maximum-weight independent set on a cycle with vertex weights $w_i$.

For a path, the usual recurrence is

$P_j=\max(P_{j-1},P_{j-2}+u_j),$

which computes the best non-adjacent sum of weights $u_j$. For a cycle of length $k>1$, split into two cases:

$\text{Case A: do not select } c_0,$

$\text{Case B: select } c_0 \text{ and therefore forbid } c_1 \text{ and } c_{k-1}.$

This yields

$\text{best}=\max\Bigl(0,\ \operatorname{Path}(w_1,\dots,w_{k-1}),\ w_0+\operatorname{Path}(w_2,\dots,w_{k-2})\Bigr).$

The contribution of the whole component is then

$D_{\text{comp}}=C_0+\text{best}.$

If $k=1$, the cycle is a self-loop, so the cycle vertex maps to itself and can never be selected. In that special case the contribution is just $a_0+g_0$.

Step 6: Worked example for $n=10$

For $n=10$, the map is

$0\mapsto 1,\ 1\mapsto 3,\ 2\mapsto 1,\ 3\mapsto 1,\ 4\mapsto 9,\ 5\mapsto 1,\ 6\mapsto 3,\ 7\mapsto 1,\ 8\mapsto 1,\ 9\mapsto 9.$

So there are two cycle components: the 2-cycle $1\leftrightarrow 3$, and the self-loop $9\mapsto 9$.

The attached tree leaves are $\{0,2,5,7,8\}$ into $1$, $\{6\}$ into $3$, and $\{4\}$ into $9$. Every such leaf has

$A=0,\qquad B=1.$

For the cycle $1\leftrightarrow 3$, this gives

$a_1=0,\ g_1=1,\qquad a_2=0,\ g_2=1,$

so

$C_0=(0+1)+(0+1)=2,$

and both cycle weights are

$w=1-1-1=-1.$

Thus the best cycle correction is $0$, and this component contributes $2$.

For the self-loop at $9$, we have $a=0$ and $g=1$, and the loop vertex itself cannot be selected. So that component contributes $1$.

Therefore

$D(10)=2+1=3.$

One optimal subset is $\{0,4,6\}$, whose image is $\{1,9,3\}$; the image is disjoint from the subset and all three images are distinct.

How the Code Works

The C++, Python, and Java implementations all follow the same pipeline.

First, they build the successor of every residue $x$, count in-degrees, and store reverse predecessor lists. Second, they run indegree peeling with a queue, which identifies the tree vertices and leaves only the cycle vertices unpeeled.

Next, they process the peeled vertices in queue order and compute the two tree-DP states described above. Once that is done, every cycle vertex already knows the optimal contribution of its attached trees.

Finally, they walk through each remaining directed cycle, assemble the baseline $a_i+g_i$ values, translate the cycle choice into a maximum-weight independent-set problem, solve the two path cases, and add the resulting component contribution to $D(n)$.

After computing one $D(n)$, the implementation repeats the procedure for $n=100001,100002,\dots,100100$ and sums the answers.

Complexity Analysis

For one value of $n$, graph construction touches each vertex once and creates exactly one outgoing edge per vertex, so it costs $O(n)$ time and $O(n)$ memory. Indegree peeling is also $O(n)$, because every vertex enters the queue at most once and every edge is examined a constant number of times.

The tree dynamic program processes each peeled predecessor relation once, and the cycle stage visits every unpeeled vertex exactly once. Therefore the total cost for a single $D(n)$ remains

$O(n)\text{ time and }O(n)\text{ memory.}$

The final Project Euler sum covers 100 consecutive values of $n$, so the full runtime is linear in the total number of processed residues across those 100 instances.

Footnotes and References

Project Euler problem page: Project Euler 871 - Drifting Subsets
Functional graphs: Wikipedia - Functional graph
Indegree peeling and topological ideas: Wikipedia - Topological sorting
Independent sets in graphs: Wikipedia - Independent set
Dynamic programming background: Wikipedia - Dynamic programming