Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 977: Iterated Functions

View on Project Euler

Project Euler Problem 977 Solution

EulerSolve provides an optimized solution for Project Euler Problem 977, Iterated Functions, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary For a fixed \(n\), we count sequences \((a_1,a_2,\dots,a_n)\) with each \(a_i \in \{1,2,\dots,n\}\) and with the iterated-function condition $a_{i+1}=a_{a_i}\qquad (1 \le i < n).$ Project Euler 977 asks for this count when \(n=10^6\), reported modulo \(10^9+7\). The brute-force search space has size \(n^n\), so the only viable route is to understand the structure forced by the recurrence and then sum the resulting combinatorial classes in closed form. Mathematical Approach The key idea is that the recurrence does not describe arbitrary sequences: it describes the forward orbit of a single self-map, and every valid sequence eventually becomes periodic. The solution classifies sequences by the first place where that periodic regime begins. From the recurrence to an orbit Define a function \(f:\{1,\dots,n\}\to\{1,\dots,n\}\) by \(f(i)=a_i\). Then $a_{i+1}=a_{a_i}=f(a_i)=f(f(i)).$ Inductively this gives $a_i=f^i(1)\qquad (i \ge 1),$ so the sequence is exactly the orbit of 1 under repeated application of \(f\). More generally, the orbit of index \(m\) is just a shifted suffix: $f^t(m)=a_{m+t-1}\qquad (t \ge 1).$ This has an important consequence: if two positions carry the same value, then their entire future tails coincide. In particular, once a suffix starts repeating with some period, everything after that point is rigid....

Detailed mathematical approach

Problem Summary

For a fixed \(n\), we count sequences \((a_1,a_2,\dots,a_n)\) with each \(a_i \in \{1,2,\dots,n\}\) and with the iterated-function condition

$a_{i+1}=a_{a_i}\qquad (1 \le i < n).$

Project Euler 977 asks for this count when \(n=10^6\), reported modulo \(10^9+7\). The brute-force search space has size \(n^n\), so the only viable route is to understand the structure forced by the recurrence and then sum the resulting combinatorial classes in closed form.

Mathematical Approach

The key idea is that the recurrence does not describe arbitrary sequences: it describes the forward orbit of a single self-map, and every valid sequence eventually becomes periodic. The solution classifies sequences by the first place where that periodic regime begins.

From the recurrence to an orbit

Define a function \(f:\{1,\dots,n\}\to\{1,\dots,n\}\) by \(f(i)=a_i\). Then

$a_{i+1}=a_{a_i}=f(a_i)=f(f(i)).$

Inductively this gives

$a_i=f^i(1)\qquad (i \ge 1),$

so the sequence is exactly the orbit of 1 under repeated application of \(f\). More generally, the orbit of index \(m\) is just a shifted suffix:

$f^t(m)=a_{m+t-1}\qquad (t \ge 1).$

This has an important consequence: if two positions carry the same value, then their entire future tails coincide. In particular, once a suffix starts repeating with some period, everything after that point is rigid.

Classify by the first periodic suffix

Let \(s\) be the first index such that the suffix

$a_s,a_{s+1},\dots,a_n$

is periodic from its first term. Write

$N=n-s+1$

for the suffix length, and let its exact period be \(l\). Then the suffix positions split into residue classes modulo \(l\):

$S_u=\{\,s+u-1+ml : m \ge 0,\ s+u-1+ml \le n\,\}\qquad (1 \le u \le l).$

If \(N=ql+r\) with \(0 \le r < l\), then the first \(r\) classes have size \(q+1\) and the remaining \(l-r\) classes have size \(q\).

Count one periodic suffix

Because the suffix has period \(l\), every position in the same class \(S_u\) carries the same value; call it \(c_u\). The recurrence forces a simple rule:

$c_u \in S_{u+1}\quad (1 \le u < l),\qquad c_l \in S_1.$

So choosing the suffix means choosing one element from the next residue class for each \(u\). The number of choices is therefore

$P(N,l)=\prod_{u=1}^{l}|S_u|=q^{\,l-r}(q+1)^r,$

where

$q=\left\lfloor \frac{N}{l}\right\rfloor,\qquad r=N \bmod l.$

This is the basic factor that appears everywhere in the implementations.

Attach the non-periodic prefix

If the periodic part starts at the very beginning, so \(s=1\) and \(N=n\), there is no prefix to attach. Those sequences contribute

$A(n)=\sum_{l=1}^{n} P(n,l).$

Now assume \(s>1\). Then \(a_{s-1}\) must be chosen so that

$a_s=a_{a_{s-1}}=c_1.$

The positions whose value is \(c_1\) are exactly the elements of \(S_1\), so \(a_{s-1}\) must lie in \(S_1\). But one choice is forbidden: the special element \(c_l \in S_1\). If we took \(a_{s-1}=c_l\), then the same \(l\)-periodic pattern would already start at position \(s-1\), contradicting the minimality of \(s\).

Therefore the number of admissible attachments is

$|S_1|-1=\left\lceil \frac{N}{l}\right\rceil - 1,$

which is \(q-1\) when \(r=0\) and \(q\) when \(r>0\). Once \(a_{s-1}\) is fixed, every earlier position must be the tautological forward link

$a_i=i+1\qquad (1 \le i \le s-2),$

because any earlier nontrivial copy would make the periodic regime begin even sooner.

So the full count is

$F(n)=\sum_{l=1}^{n} P(n,l)+\sum_{N=1}^{n-1}\sum_{l=1}^{N}\left(\left\lceil \frac{N}{l}\right\rceil-1\right)P(N,l).$

Worked Example: \(n=7\), suffix length \(N=4\), period \(l=2\)

Take \(s=4\), so the periodic suffix is \(a_4,a_5,a_6,a_7\). The residue classes are

$S_1=\{4,6\},\qquad S_2=\{5,7\}.$

To build a 2-periodic suffix, choose \(c_1 \in S_2\) and \(c_2 \in S_1\). There are

$P(4,2)=2\cdot 2=4$

choices. For example, \(c_1=5\) and \(c_2=6\) give the suffix

$5,6,5,6.$

The previous term \(a_3\) must lie in \(S_1\), but it cannot equal \(c_2=6\), otherwise the 2-periodic pattern would already begin at position 3. So \(a_3=4\) is forced, and then the earlier prefix is the rigid chain \(a_1=2\), \(a_2=3\). One valid sequence is therefore

$ (2,3,4,5,6,5,6). $

All four suffix choices work in the same way, so this class contributes

$\left(\left\lceil \frac{4}{2}\right\rceil-1\right)P(4,2)=1 \cdot 4=4$

sequences, exactly as the formula predicts.

Regroup the double sum by quotient blocks

The direct formula above is already correct, and the slower validation routines evaluate it exactly. The fast solver reorganizes the second double sum. For fixed \(l\), write

$N=ql+r,\qquad 0 \le r < l,$

so \(q=\lfloor N/l \rfloor\). Then

$P(N,l)=q^{\,l-r}(q+1)^r.$

When \(N\) runs through the block \(ql,ql+1,\dots,(q+1)l-1\), the attachment factor is \(q-1\) at \(r=0\) and \(q\) for \(r \ge 1\). If

$R=\min(l-1,n-1-ql),$

the whole block contributes

$B_{l,q}=(q-1)q^l+\sum_{r=1}^{R} q^{\,l+1-r}(q+1)^r.$

The inner sum is telescoping:

$\sum_{r=1}^{R} q^{\,l+1-r}(q+1)^r=q^{\,l+1-R}(q+1)^{R+1}-q^{\,l+1}(q+1).$

This is exactly the closed form used by the production implementations.

How the Code Works

Power-table precomputation

The C++, Python, and Java implementations precompute all powers that can appear later. For a base \(b\), the largest exponent ever needed is at most

$\left\lfloor \frac{n-1}{b-1}\right\rfloor + 1,$

so every required value \(b^e \bmod (10^9+7)\) can be read in \(O(1)\) time from a packed table. This avoids an enormous number of repeated modular exponentiations inside the main summation.

Two layers of counting

The implementations first evaluate

$A(n)=\sum_{l=1}^{n}P(n,l),$

which counts sequences whose periodic regime begins at position 1. They then add the correction term for later starts, but not as a naive triple loop over \((N,l,r)\). Instead, for each \(l\) they iterate over quotient plateaus \(q=\lfloor N/l \rfloor\), compute the block tail length \(R\), and add the closed block sum \(B_{l,q}\). That is why the code matches the mathematical formula but runs far faster.

Validation and parallel execution

Each implementation also contains small checkpoints: exhaustive enumeration confirms that the count for \(n=7\) is 174, and a direct evaluation of the ungathered double sum confirms that the count for \(n=100\) is 305741269. After that, the large computation for \(n=10^6\) uses the precomputed power table and the regrouped block formulas. The C++ and Java implementations split the \(l\)-range across worker threads, while the Python implementation performs the same arithmetic serially.

Complexity Analysis

The dominant work is not exponential anymore. Building the packed power tables costs

$\sum_{b=2}^{n+1} O\!\left(\frac{n}{b-1}\right)=O(n \log n),$

and the block summation for the correction term has the same order because

$\sum_{l=1}^{n-1}\left\lfloor\frac{n-1}{l}\right\rfloor = O(n \log n).$

So the fast method runs in \(O(n \log n)\) time and uses \(O(n \log n)\) memory for the power tables. In practice it is efficient because every block contribution is reduced to a handful of modular multiplications and table lookups.

Footnotes and References

  1. Problem page: https://projecteuler.net/problem=977
  2. Iterated function: Wikipedia - Iterated function
  3. Functional graph: Wikipedia - Functional graph
  4. Eventually periodic sequence: Wikipedia - Eventually periodic points
  5. Floor and ceiling functions: Wikipedia - Floor and ceiling functions
  6. Geometric series: Wikipedia - Geometric series
  7. Modular arithmetic: Wikipedia - Modular arithmetic

Mathematical approach · C++ solution · Python solution · Java solution

Previous: Problem 976 · All Project Euler solutions · Next: Problem 978

Need help with a problem? Ask me! 💡
e
✦ Euler GLM 5.2
Hi! I'm Euler. Ask me anything about Project Euler problems, math concepts, or solution approaches! 🧮