Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 308: An Amazing Prime-generating Automaton

Project Euler Problem 308 Solution

EulerSolve provides an optimized solution for Project Euler Problem 308, An Amazing Prime-generating Automaton, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary The automaton in Problem 308 is a graphical encoding of Conway's prime-generating FRACTRAN program. If we simulate it literally, the number of steps explodes long before we reach the $10001$-st prime output. The key is to understand one complete candidate cycle analytically and count steps without walking through every transition. Mathematical Approach 1) State interpretation: prime exponents, not raw integers The FRACTRAN machine can be represented by exponent counts of the primes $2,3,5,7,11,13,17,19,23,29.$ So a machine state is equivalent to a vector of exponents, or to the integer $N=2^a3^b5^c7^d11^e13^f17^g19^h23^i29^j.$ The automaton “prints” a value exactly when the state is a pure power of two: $N=2^m.$ Then the printed number is the exponent $m$. The brute-force checkpoint in the C++ code confirms that the first outputs are $2,3,5,7,11,13,17,19,23,29,\dots$ with step indices $19,69,281,710,2375,3893,8102,11361,19268,36981,\dots$ 2) Canonical start-state for candidate $n$ The program does not test primes by trial division in the usual sense....

Detailed mathematical approach

Problem Summary

The automaton in Problem 308 is a graphical encoding of Conway's prime-generating FRACTRAN program. If we simulate it literally, the number of steps explodes long before we reach the $10001$-st prime output. The key is to understand one complete candidate cycle analytically and count steps without walking through every transition.

Mathematical Approach

1) State interpretation: prime exponents, not raw integers

The FRACTRAN machine can be represented by exponent counts of the primes

$2,3,5,7,11,13,17,19,23,29.$

So a machine state is equivalent to a vector of exponents, or to the integer

$N=2^a3^b5^c7^d11^e13^f17^g19^h23^i29^j.$

The automaton “prints” a value exactly when the state is a pure power of two:

$N=2^m.$

Then the printed number is the exponent $m$. The brute-force checkpoint in the C++ code confirms that the first outputs are

$2,3,5,7,11,13,17,19,23,29,\dots$

with step indices

$19,69,281,710,2375,3893,8102,11361,19268,36981,\dots$

2) Canonical start-state for candidate $n$

The program does not test primes by trial division in the usual sense. Instead, the FRACTRAN dynamics repeatedly reaches a canonical state that encodes “we are now testing candidate $n$.” In the optimized derivation, the first step index of that canonical state is denoted by

$T_n.$

The source code uses

$T_2=2,$

meaning that after two initial transitions from the seed, the machine has entered the standard start configuration for candidate $2$.

3) What one candidate cycle does

Once candidate $n$ starts, the automaton behaves like a deterministic divisor-testing routine.

Setup: it builds the control structure for the current candidate.

Middle phase: it tests divisors $d$ in descending order.

Exit: if a divisor is found, the machine moves on to candidate $n+1$; if no divisor exists, it follows a special prime branch and emits $2^n$.

This is why the whole process can be summarized by two exact arithmetic quantities:

$\Delta(n)=T_{n+1}-T_n,$

the total step count from the start of candidate $n$ to the start of candidate $n+1$, and

$H(p),$

the offset from $T_p$ to the moment where prime candidate $p$ is actually emitted.

4) Why the largest proper divisor appears

Let

$s=\operatorname{spf}(n)$

be the smallest prime factor of $n$. Then the first divisor encountered while scanning downward is the largest proper divisor

$d_*=\frac{n}{s}.$

All integers

$d=n-1,n-2,\dots,d_*+1$

are guaranteed to be non-divisors, and the scan stops exactly at $d_*$. That is why the composite-case formula naturally splits into “all non-divisor tests above $d_*$” plus “the final successful divisor branch at $d_*$.”

5) Exact cost formulas

The code has already flattened the entire state graph of one candidate cycle into exact transition counts.

For a general candidate $n$, the total jump to the next candidate is

$\Delta(n)=(2n-1)+\sum_{d=d_*+1}^{n-1}\bigl(6n+2+2\lfloor n/d\rfloor\bigr)+\bigl(5n+d_*+2\lfloor n/d_*\rfloor+2\bigr).$

The three parts are:

Setup: $2n-1$.

Rejected divisor tests: one term $6n+2+2\lfloor n/d\rfloor$ for every non-divisor above $d_*$.

Terminal composite branch: the final successful branch at $d_*$.

If $p$ is prime, there is no divisor in $2,\dots,p-1$, so the machine instead reaches the prime-output branch:

$H(p)=(2p-1)+\sum_{d=2}^{p-1}\bigl(6p+2+2\lfloor p/d\rfloor\bigr)+(6p+2).$

The last term $6p+2$ is the prime branch itself.

6) Accumulating global time

Once $\Delta(n)$ is known, the candidate-start times satisfy the prefix recurrence

$T_{n+1}=T_n+\Delta(n),\qquad T_2=2.$

If $p_k$ is the $k$-th prime, then the answer to the Project Euler problem is simply

$T_{p_k}+H(p_k).$

For example, candidate $2$ starts at step $T_2=2$, and

$H(2)=17,$

so the first prime hit occurs at

$2+17=19.$

Likewise, $\Delta(2)=20$, so $T_3=22$; then $H(3)=47$, giving the second prime hit at

$22+47=69.$

These match the brute-force checkpoints exactly.

7) Fast evaluation of the floor sums

The expensive-looking part is

$\sum \lfloor n/d\rfloor.$

The implementation evaluates such sums in quotient blocks: if several consecutive divisors have the same quotient $q=\lfloor n/d\rfloor$, they are added at once. This is the standard harmonic-grouping trick and makes each floor-sum sublinear instead of linear.

How the Code Works

1) Prime bound. The solver first estimates an upper bound for the $k$-th prime using $n(\log n+\log\log n)$-type asymptotics, then enlarges it if necessary.

2) Linear sieve. A sieve produces both the prime list and the smallest-prime-factor table $\operatorname{spf}(n)$.

3) Floor-sum helpers. floor_sum_upto() and floor_sum_range() implement quotient-block summation.

4) Closed formulas. candidate_delta() computes $\Delta(n)$, and prime_hit_offset() computes $H(p)$.

5) Prefix walk. The main loop advances $T_n$ candidate by candidate until it reaches the target prime $p_k$, then returns $T_{p_k}+H(p_k)$.

6) Strong validation. The C++ code also contains a literal automaton simulator and checks that the first 10 prime hits and several small target indices agree with the closed formulas.

Complexity Analysis

The sieve costs $O(B)$ time and memory up to the chosen prime bound $B$. Each candidate then needs only a constant amount of algebra plus a few harmonic floor-sums, each taking about $O(\sqrt n)$ block steps. This is enormously faster than literal FRACTRAN simulation, which would need to traverse about $1.5\times10^{15}$ transitions for the final answer.