Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 305: Reflexive Position

View on Project Euler

Project Euler Problem 305 Solution

EulerSolve provides an optimized solution for Project Euler Problem 305, Reflexive Position, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary Consider the infinite concatenation $C=12345678910111213\dots$ For a given \(n\), let \(P=\text{str}(n)\). Define \(f(n)\) as the starting position of the \(n\)-th occurrence of \(P\) inside \(C\). The Project Euler sum is taken over \(n=3^k\). Mathematical Approach 1) Why this is a string-search problem on one continuous stream The pattern \(P\) can appear: 1. entirely inside one integer, 2. across the boundary between consecutive integers, 3. overlapping with itself. So we must search in the continuous digit stream \(123456789101112\dots\), not number by number. Resetting the match state at integer boundaries would miss valid occurrences. 2) KMP automaton for \(P=\text{str}(n)\) The code builds a Knuth-Morris-Pratt automaton for the decimal pattern \(P\). A state \(q\in\{0,\dots,L\}\), where \(L=|P|\), means: “the last processed digits currently match the first \(q\) digits of \(P\)”. For every state \(q\) and digit \(d\in\{0,\dots,9\}\), the automaton stores: $q'=\delta(q,d),$ and an output bit $o(q,d)\in\{0,1\},$ which tells whether reading digit \(d\) completes one new occurrence of \(P\). Because the KMP fallback is applied after a full match, overlapping occurrences are counted correctly. 3) Operator view of a digit block For any finite digit string \(w\), define an operator \(T_w\)....

Detailed mathematical approach

Problem Summary

Consider the infinite concatenation

$C=12345678910111213\dots$

For a given \(n\), let \(P=\text{str}(n)\). Define \(f(n)\) as the starting position of the \(n\)-th occurrence of \(P\) inside \(C\). The Project Euler sum is taken over \(n=3^k\).

Mathematical Approach

1) Why this is a string-search problem on one continuous stream

The pattern \(P\) can appear:

1. entirely inside one integer,

2. across the boundary between consecutive integers,

3. overlapping with itself.

So we must search in the continuous digit stream \(123456789101112\dots\), not number by number. Resetting the match state at integer boundaries would miss valid occurrences.

2) KMP automaton for \(P=\text{str}(n)\)

The code builds a Knuth-Morris-Pratt automaton for the decimal pattern \(P\). A state \(q\in\{0,\dots,L\}\), where \(L=|P|\), means: “the last processed digits currently match the first \(q\) digits of \(P\)”.

For every state \(q\) and digit \(d\in\{0,\dots,9\}\), the automaton stores:

$q'=\delta(q,d),$

and an output bit

$o(q,d)\in\{0,1\},$

which tells whether reading digit \(d\) completes one new occurrence of \(P\).

Because the KMP fallback is applied after a full match, overlapping occurrences are counted correctly.

3) Operator view of a digit block

For any finite digit string \(w\), define an operator \(T_w\). Starting from automaton state \(q\), this operator tells us two things after scanning \(w\):

$T_w(q)=\bigl(q_{\text{end}}(q),c_w(q)\bigr),$

where \(q_{\text{end}}(q)\) is the ending state and \(c_w(q)\) is the number of new matches found while reading \(w\).

This is exactly what the code stores as end_state and add_count.

4) Why operators compose so cleanly

If a block is split as \(w=uv\), then scanning \(u\) first and \(v\) second gives

$T_{uv}(q)=\Bigl(q_v(q_u(q)),\,c_u(q)+c_v\bigl(q_u(q)\bigr)\Bigr).$

So composition is associative. This is the central algebraic fact used by the solver: once we know the operator of a block, we never need to rescan its digits.

5) Full blocks and partial blocks

Let a decimal prefix already be fixed. Then:

1. full_block(prefix, rem_digits) means the concatenation of all numbers with that prefix followed by all \(10^{\text{rem\_digits}}\) possible suffixes;

2. partial_block(prefix, rem_digits, upper_suffix) means the same, but only up to a given suffix bound.

These operators are memoized. Since many queries reuse the same prefix-and-length structure, caching removes the exponential blowup.

6) Building the operator for \([1..N]\)

To process all integers from \(1\) to \(N\), the code does not concatenate them explicitly. Instead it composes operators in this order:

1. all complete digit-length blocks with fewer digits than \(N\);

2. within the top digit length, all complete leading-digit blocks smaller than the first digit of \(N\);

3. one final partial block for the prefix that leads exactly up to \(N\).

The result is an operator we may denote by

$T_{[1..N]}.$

Applying it to automaton state \(0\) yields the total number of occurrences of \(P\) in the finite prefix \(123\dots N\).

7) The monotone counting function \(A(N)\)

Define

$A(N)=\text{number of occurrences of }P\text{ in }123\dots N.$

Clearly \(A(N)\) is monotone nondecreasing in \(N\). Therefore we can binary-search the smallest \(N\) such that

$A(N)\ge n.$

That identifies the exact integer in which the \(n\)-th occurrence finishes.

8) Locating the occurrence inside the final number

After binary search finds this minimal \(N\), the code computes the automaton state after scanning \(123\dots(N-1)\). Then it scans the digits of \(N\) once more, counting only the matches produced inside that last number.

If the desired occurrence ends at digit index \(i\) inside \(\text{str}(N)\), and \(L=|P|\), then its global starting position is

$\text{digits\_before}(N-1)+i-L+2.$

Here

$\text{digits\_before}(N)=\sum_{d=1}^{D-1}9\cdot 10^{d-1}\cdot d+\bigl(N-10^{D-1}+1\bigr)D,$

where \(D\) is the number of digits of \(N\).

9) Checkpoints

The implementation verifies several known values:

$f(1)=1,\qquad f(5)=81,\qquad f(12)=271,\qquad f(7780)=111111365.$

It also brute-forces all cases \(1\le n\le 30\) against a direct stream simulation, which is a strong validation of the operator approach.

How the Code Works

The class PatternEngine does all heavy lifting. It builds the KMP automaton, interns operators so identical operators are reused, memoizes operator composition, and provides:

1. range_operator(N) for the whole prefix \(123\dots N\),

2. occ_and_state(N) for total match count and ending automaton state,

3. nth_occurrence_position(target) for the final binary search and local scan.

The outer solver simply evaluates this engine for patterns \(3,9,27,\dots,3^{13}\) and sums the returned positions.

Complexity Analysis

If \(L=|P|\), one operator composition costs \(O(L)\). Thanks to memoized full/partial blocks, evaluating \(A(N)\) is much closer to polylogarithmic in \(N\) than to linear in the total digit stream length. That is the whole reason the problem is feasible: the code never generates the massive prefix of \(C\) explicitly.

Further Reading

  1. Problem page: https://projecteuler.net/problem=305
  2. Knuth-Morris-Pratt algorithm: https://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm
  3. Champernowne constant: https://en.wikipedia.org/wiki/Champernowne_constant

Mathematical approach · C++ solution · Python solution · Java solution

Previous: Problem 304 · All Project Euler solutions · Next: Problem 306

Need help with a problem? Ask me! 💡
e
✦ Euler GLM 5.2
Hi! I'm Euler. Ask me anything about Project Euler problems, math concepts, or solution approaches! 🧮