Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 458: Permutations of Project

Project Euler Problem 458 Solution

EulerSolve provides an optimized solution for Project Euler Problem 458, Permutations of Project, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary Let $T(n)$ be the number of length-$n$ words over the seven distinct letters of “project” such that no contiguous block of length $7$ is a permutation of those seven letters. We must compute $T(10^{12}) \bmod 10^9$. Because the alphabet has size $7$, a forbidden block of length $7$ is exactly a block whose seven letters are all different. So the problem is to count words in which every length-$7$ window contains at least one repeated letter. Mathematical Approach Step 1: Only the newest 7-window can become invalid Suppose we already have a valid word and append one more letter. Every old window of length $7$ stays unchanged, so the only new constraint comes from the final window that ends at the appended position. Therefore, to know which letters may be appended, we only need information about the previous $6$ characters. This gives the problem a finite-state structure: the future depends only on the suffix of length $6$, not on the entire earlier prefix. Step 2: Replace concrete letters by a canonical equality pattern The actual names of the letters do not matter; only the equality pattern inside the last six positions matters. Two suffixes are equivalent if equal positions match equal positions after consistently renaming letters....

Detailed mathematical approach

Problem Summary

Let $T(n)$ be the number of length-$n$ words over the seven distinct letters of “project” such that no contiguous block of length $7$ is a permutation of those seven letters. We must compute $T(10^{12}) \bmod 10^9$.

Because the alphabet has size $7$, a forbidden block of length $7$ is exactly a block whose seven letters are all different. So the problem is to count words in which every length-$7$ window contains at least one repeated letter.

Mathematical Approach

Step 1: Only the newest 7-window can become invalid

Suppose we already have a valid word and append one more letter. Every old window of length $7$ stays unchanged, so the only new constraint comes from the final window that ends at the appended position. Therefore, to know which letters may be appended, we only need information about the previous $6$ characters.

This gives the problem a finite-state structure: the future depends only on the suffix of length $6$, not on the entire earlier prefix.

Step 2: Replace concrete letters by a canonical equality pattern

The actual names of the letters do not matter; only the equality pattern inside the last six positions matters. Two suffixes are equivalent if equal positions match equal positions after consistently renaming letters. For example, the suffix $a\,b\,a\,c\,b\,d$ becomes the canonical pattern

$\left(0,1,0,2,1,3\right).$

Reading left to right, the first new letter receives label $0$, the next unseen letter receives label $1$, and so on. These canonical patterns are exactly the set partitions of a 6-element set, so the number of states is the Bell number

$B_6 = 203.$

That is why the implementations work with a $203$-state transfer matrix instead of all $7^6$ concrete suffixes.

Step 3: Weighted transitions between states

Take a canonical suffix pattern using $k$ distinct labels. This means the current suffix contains exactly $k$ distinct letters.

If we append a letter already present in the suffix, the new 7-window still contains at most $k \le 6$ distinct letters, so it is always valid. There are exactly $k$ such concrete choices. After dropping the oldest character and canonicalizing the new suffix again, each of those choices contributes weight $1$ to some next state.

If we append a letter not present in the suffix, then the new 7-window contains $k+1$ distinct letters. This is allowed only when $k \le 5$. There are $7-k$ such letters, and after canonical renaming they all lead to the same next pattern, so that transition receives weight $7-k$.

When $k=6$, no new letter may be appended, because that would create a window with all seven letters distinct, which is exactly the forbidden case.

Therefore, if $v_m$ is the column vector whose entries count valid words of length $m$ ending in each canonical state, then

$v_{m+1}=A\,v_m \pmod{10^9},$

where $A$ is the $203\times203$ weighted transition matrix.

Step 4: Initial vector at length 6

For length $6$, every word is valid because no length-$7$ window exists yet. If a canonical pattern uses $k$ labels, then we must assign $k$ distinct actual letters to those labels. The number of such assignments is the falling factorial

$ (7)_k = 7\cdot 6\cdot 5\cdots(7-k+1)=\frac{7!}{(7-k)!}. $

So the entry of $v_6$ corresponding to a state with $k$ labels is exactly $(7)_k$.

Step 5: Fast exponentiation in the word length

For $n \le 6$, we simply have

$T(n)=7^n.$

For $n \ge 6$, repeated application of the transition matrix gives

$v_n=A^{n-6}v_6 \pmod{10^9},\qquad T(n)=\sum_{s}(v_n)_s \pmod{10^9}.$

The huge exponent $n-6$ is handled by exponentiation by squaring, so the dependence on $n$ is only logarithmic.

Checkpoint Example

For $n=7$, the only invalid words are the permutations of the seven distinct letters, one for each ordering of the alphabet. Hence

$T(7)=7^7-7!=823543-5040=818503.$

Applying one more transfer step yields

$T(8)=5699281,$

which matches the checkpoints verified by the implementations.

How the Code Works

The C++, Python, and Java implementations generate all 203 canonical suffix patterns, index them, and build the weighted transition matrix described above. They then create the length-$6$ start vector using the falling-factorial count $(7)_k$, raise the matrix to the power $n-6$ by repeated squaring modulo $10^9$, and finally sum the entries of the resulting state vector.

Complexity Analysis

Let $S=203$ be the number of states. State generation and transition construction are fixed-size preprocessing for this problem. The dominant cost is matrix exponentiation, which requires $O(S^3\log n)$ time and $O(S^2)$ memory. Since $S$ is small and fixed, this easily handles $n=10^{12}$.

Footnotes and References

Problem page: https://projecteuler.net/problem=458
Bell numbers and set partitions: Wikipedia — Bell number
Falling factorials: Wikipedia — Falling and rising factorials
Transfer-matrix method: Wikipedia — Transfer-matrix method
Exponentiation by squaring: Wikipedia — Exponentiation by squaring