Problem 399: Squarefree Fibonacci Numbers
View on Project EulerProject Euler Problem 399 Solution
EulerSolve provides an optimized solution for Project Euler Problem 399, Squarefree Fibonacci Numbers, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary Let \(F_1=F_2=1\) and \(F_{n+2}=F_{n+1}+F_n\). The task is to locate the index \(n\) of the \(N\)-th Fibonacci number that is squarefree, meaning that no prime square divides \(F_n\). After that index is known, the program prints \(F_n\) in a compressed decimal format: the last 16 digits and a scientific-notation leading value with one decimal place. The crucial observation is that the search is carried out on the index line \(n=1,2,3,\dots\), not on the enormous Fibonacci values themselves. Mathematical Approach 1. Move the Squarefree Test to the Index Line A positive integer is squarefree exactly when it is not divisible by \(p^2\) for any prime \(p\). Therefore $\forall p,\ p^2 \nmid F_n.$ So instead of examining the factorization of \(F_n\) directly, we ask a sharper question: for a fixed prime \(p\), which indices \(n\) force \(p^2\mid F_n\)? Once those forbidden indices are understood, the problem becomes a sieve over the natural numbers. 2....
Detailed mathematical approach
Problem Summary
Let \(F_1=F_2=1\) and \(F_{n+2}=F_{n+1}+F_n\). The task is to locate the index \(n\) of the \(N\)-th Fibonacci number that is squarefree, meaning that no prime square divides \(F_n\). After that index is known, the program prints \(F_n\) in a compressed decimal format: the last 16 digits and a scientific-notation leading value with one decimal place. The crucial observation is that the search is carried out on the index line \(n=1,2,3,\dots\), not on the enormous Fibonacci values themselves.
Mathematical Approach
1. Move the Squarefree Test to the Index Line
A positive integer is squarefree exactly when it is not divisible by \(p^2\) for any prime \(p\). Therefore
$\forall p,\ p^2 \nmid F_n.$
So instead of examining the factorization of \(F_n\) directly, we ask a sharper question: for a fixed prime \(p\), which indices \(n\) force \(p^2\mid F_n\)? Once those forbidden indices are understood, the problem becomes a sieve over the natural numbers.
2. Rank of Apparition and the Bad Period \(p\,z(p)\)
For a prime \(p\), define the rank of apparition
$z(p)=\min\{m\ge 1 : p\mid F_m\}.$
A standard divisibility property of Fibonacci numbers says that
$p\mid F_n \iff z(p)\mid n.$
The implementation then uses the Fibonacci valuation law
$v_p(F_{kz(p)})=v_p(F_{z(p)})+v_p(k),$
which turns a divisibility question about \(F_n\) into a divisibility question about the index \(n\). In the standard situation used by the sieve, the first index where a second factor of \(p\) appears is \(p\,z(p)\). That leads to the bad period
$b_p=p\,z(p),$
and every multiple of \(b_p\) is marked as non-squarefree.
Small examples make this concrete:
$p=2:\ z(2)=3,\ b_2=6,\qquad F_6=8,$
$p=3:\ z(3)=4,\ b_3=12,\qquad F_{12}=144,$
$p=5:\ z(5)=5,\ b_5=25,\qquad 25\mid F_{25}.$
3. Efficient Computation of \(z(p)\)
Directly searching for the first Fibonacci number divisible by \(p\) would be too slow. The implementation instead uses the classical theorem
$z(p)\mid p-\left(\frac{5}{p}\right)\qquad (p\ne 5),$
where \(\left(\frac{5}{p}\right)\) is the Legendre symbol. Euler's criterion provides that symbol via
$5^{(p-1)/2}\equiv \left(\frac{5}{p}\right)\pmod p.$
Thus \(z(p)\) must divide the known candidate \(r=p-\left(\frac{5}{p}\right)\). The algorithm factors \(r\), then tries to remove each prime factor \(q\) as many times as possible. Whenever
$F_{r/q}\equiv 0 \pmod p,$
the candidate can be reduced from \(r\) to \(r/q\). After all possible reductions, the remaining value is exactly \(z(p)\). The special primes \(2\) and \(5\) are handled directly with \(z(2)=3\) and \(z(5)=5\).
4. Fast Doubling Makes the Residue Tests Practical
All Fibonacci evaluations in the implementation are performed with fast doubling. If \((F_m,F_{m+1})\) is known, then
$F_{2m}=F_m(2F_{m+1}-F_m),$
$F_{2m+1}=F_m^2+F_{m+1}^2.$
These identities reduce the computation of \(F_t\bmod M\) to \(O(\log t)\) modular operations. That is essential both while shrinking the candidate for \(z(p)\) and later when computing the last 16 digits of the final Fibonacci number.
5. Reduce Redundant Periods and Sieve the Indices
After every bad period \(b_p\le K\) has been generated for a search bound \(K\), the periods are sorted, deduplicated, and reduced. If one period divides another, the larger one is redundant because all its multiples are already covered. For example, \(12\) adds nothing once \(6\) is present.
With the reduced set \(B\), the program marks
$n\in \bigcup_{b\in B}\{b,2b,3b,\dots\}\qquad (1\le n\le K).$
The \(N\)-th unmarked index is exactly the index of the \(N\)-th squarefree Fibonacci number.
Only primes \(p\le K/3\) need to be examined. Indeed \(z(p)\ge 3\) for every prime, so \(p\,z(p)\ge 3p\). If \(3p>K\), that bad period cannot affect the search window at all.
6. Recover the Requested Decimal Output
Once the desired index \(n\) is known, the last 16 digits come from
$F_n\bmod 10^{16}.$
For the leading scientific notation, the implementation uses Binet's formula
$F_n=\frac{\varphi^n-\psi^n}{\sqrt{5}},\qquad \varphi=\frac{1+\sqrt{5}}{2},\qquad \psi=\frac{1-\sqrt{5}}{2}.$
Because \(|\psi|<1\), large \(n\) satisfy
$\log_{10}(F_n)\approx n\log_{10}\varphi-\log_{10}\sqrt{5}.$
If \(x=\log_{10}(F_n)\), then
$e=\lfloor x\rfloor,\qquad m=10^{x-e}\in [1,10).$
Rounding \(m\) to one decimal place gives the displayed mantissa. If that rounding produces \(10.0\), the mantissa is renormalized to \(1.0\) and the exponent is increased by 1.
How the Code Works
The C++, Python, and Java implementations follow the same algorithmic pipeline. First they build a smallest-prime-factor sieve up to the prime limit implied by the search bound. Then they compute bad periods \(p\,z(p)\), remove divisibility-redundant periods, and mark every multiple of the remaining periods until the target count of unmarked indices is reached.
The Fibonacci values themselves are never expanded as giant exact decimal strings. Modular fast doubling gives the last 16 digits directly, while high-precision logarithms provide the scientific-notation mantissa and exponent. The C++ implementation additionally parallelizes the prime loop across hardware threads, but the underlying mathematics is the same in all three languages.
A built-in checkpoint validates the workflow on a smaller case: the 200th surviving index produces the formatted output 1608739584170445,9.7e53.
Complexity Analysis
Let \(K\) be the search bound and let \(B\) be the reduced set of bad periods. Building the smallest-prime-factor sieve up to about \(K/3\) is linear or near-linear in practice and uses \(O(K)\) auxiliary space up to constant factors. Computing all ranks of apparition requires modular Fibonacci tests that cost \(O(\log K)\) each, after which the marking phase costs
$O\!\left(\sum_{b\in B}\frac{K}{b}\right)$
operations, followed by one linear scan to locate the \(N\)-th unmarked index. The dominant memory usage is the boolean mark array of size \(K+1\) together with the sieve tables and the reduced period list.
References
- Problem page: https://projecteuler.net/problem=399
- Wall, D. D. (1960). Fibonacci series modulo m. American Mathematical Monthly, 67(6), 525-532.
- Fibonacci divisibility and rank of apparition: Wikipedia — Fibonacci number
- Legendre symbol and Euler's criterion: Wikipedia — Legendre symbol
- Fast doubling identities: cp-algorithms — Fibonacci numbers
- Closed-form approximation: Wikipedia — Fibonacci number, closed-form expression