Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 771: Pseudo Geometric Sequences

View on Project Euler

Project Euler Problem 771 Solution

EulerSolve provides an optimized solution for Project Euler Problem 771, Pseudo Geometric Sequences, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary Let \(G(N)\) be the number of increasing positive-integer sequences $a_1\lt a_2\lt \cdots\lt a_t\le N,\qquad t\ge 5.$ such that every interior triple is almost geometric in the sense that $\left|a_i^2-a_{i-1}a_{i+1}\right|\le 2\qquad (2\le i\le t-1).$ The required answer is \(G(10^{18})\bmod (10^9+7)\). Direct enumeration is hopeless, so the implementations rely on a structural classification: every admissible sequence belongs either to a simple infinite family or to a short finite exception list. Mathematical Approach The quantity $\Delta_i=a_i^2-a_{i-1}a_{i+1}$ measures how far a triple is from being exactly geometric. True geometric progressions have \(\Delta_i=0\), while the pseudo-geometric condition only allows the five values \(-2,-1,0,1,2\). Step 1: Rewrite the condition in terms of local defect For a sequence to be admissible, every interior index must satisfy $\Delta_i\in\{-2,-1,0,1,2\}.$ This is an extremely rigid condition. The mathematical classification used by the implementations shows that long solutions do not wander arbitrarily: they fall into a few deterministic families. That is why the program counts families, not individual sequences. Once a maximal family contains \(t\) valid terms, the number of contiguous subsequences of length at least \(5\) is $C(t)=\begin{cases} \dfrac{(t-4)(t-3)}{2}, & t\ge 5,\\ 0, & t\lt 5....

Detailed mathematical approach

Problem Summary

Let \(G(N)\) be the number of increasing positive-integer sequences

$a_1\lt a_2\lt \cdots\lt a_t\le N,\qquad t\ge 5.$

such that every interior triple is almost geometric in the sense that

$\left|a_i^2-a_{i-1}a_{i+1}\right|\le 2\qquad (2\le i\le t-1).$

The required answer is \(G(10^{18})\bmod (10^9+7)\). Direct enumeration is hopeless, so the implementations rely on a structural classification: every admissible sequence belongs either to a simple infinite family or to a short finite exception list.

Mathematical Approach

The quantity

$\Delta_i=a_i^2-a_{i-1}a_{i+1}$

measures how far a triple is from being exactly geometric. True geometric progressions have \(\Delta_i=0\), while the pseudo-geometric condition only allows the five values \(-2,-1,0,1,2\).

Step 1: Rewrite the condition in terms of local defect

For a sequence to be admissible, every interior index must satisfy

$\Delta_i\in\{-2,-1,0,1,2\}.$

This is an extremely rigid condition. The mathematical classification used by the implementations shows that long solutions do not wander arbitrarily: they fall into a few deterministic families. That is why the program counts families, not individual sequences.

Once a maximal family contains \(t\) valid terms, the number of contiguous subsequences of length at least \(5\) is

$C(t)=\begin{cases} \dfrac{(t-4)(t-3)}{2}, & t\ge 5,\\ 0, & t\lt 5. \end{cases}$

This identity is just the sum

$C(t)=\sum_{\ell=5}^{t}(t-\ell+1),$

because for each allowed length \(\ell\), there are \(t-\ell+1\) windows.

Step 2: Consecutive integers form one master family

If we take any consecutive run

$s,s+1,s+2,\dots,s+\ell-1,$

then for every interior triple we get

$ (x+1)^2-x(x+2)=1. $

So every consecutive block is admissible. All such blocks are contained in the master chain

$1,2,3,\dots,N,$

which has \(N\) terms, hence its total contribution is simply

$C(N)=\frac{(N-4)(N-3)}{2}\qquad (N\ge 5).$

This explains the quadratic baseline visible in the implementations.

Step 3: Genuine geometric progressions are counted with totients

If the defect is always \(0\), then the sequence is an exact geometric progression. Every increasing integer geometric progression of length \(e+1\) can be written uniquely as

$c\,q^e,\ c\,q^{e-1}u,\ c\,q^{e-2}u^2,\ \dots,\ c\,u^e,$

where

$u\gt q\ge 1,\qquad \gcd(u,q)=1,\qquad e\ge 4,\qquad c\ge 1.$

For a fixed numerator \(u\), there are exactly \(\varphi(u)\) admissible denominators \(q\) with \(1\le q\lt u\) and \(\gcd(u,q)=1\). The last term is \(c\,u^e\), so the bound \(a_t\le N\) gives

$1\le c\le \left\lfloor \frac{N}{u^e}\right\rfloor.$

Therefore the total geometric contribution is

$G_{\mathrm{geo}}(N)=\sum_{u=2}^{\lfloor N^{1/4}\rfloor}\varphi(u)\sum_{\substack{e\ge 4\\u^e\le N}}\left\lfloor\frac{N}{u^e}\right\rfloor.$

The fourth-root cutoff appears because even the shortest admissible geometric sequence has length \(5\), so its numerator base must satisfy \(u^4\le N\).

Step 4: The remaining infinite families satisfy linear recurrences

After removing the exact geometric case, the remaining long families are generated by second-order recurrences of one of the two forms

$a_{n+1}=m a_n-a_{n-1},$

or

$a_{n+1}=m a_n+a_{n-1}.$

The generic branches start from \((1,m)\). Two additional branches of the minus type start from \((1,2)\) with coefficient \(3\) and from \((1,3)\) with coefficient \(4\). Two additional branches of the plus type start from \((1,2)\) with coefficient \(1\) and from \((1,3)\) with coefficient \(2\).

Each choice produces one increasing family until the next term either fails to increase or exceeds \(N\). If that family has \(t\) terms, then it contributes \(C(t)\).

Examples include

$1,2,3,5,8,\dots$

from the plus branch with coefficient \(1\), and

$1,m,m^2-1,m^3-2m,\dots$

from the minus branch with seed \((1,m)\).

Step 5: One hybrid family and ten sporadic exceptions remain

Besides the recurrence families, the implementations add one hybrid chain

$1,2,6,18,54,\dots$

whose tail is geometric with ratio \(3\). If this chain has \(t\) terms up to \(N\), only the subsequences that still contain the leading \(1\) are new, because every later window is already an ordinary geometric progression. Hence the hybrid contribution is

$t-4$

rather than \(C(t)\).

Finally there are ten isolated sporadic sequences of length exactly \(5\). Each contributes \(1\) once its last term is at most \(N\). That is why the implementations only need the ten terminal values of those exceptions.

Worked Example: \(N=10\)

The check value \(G(10)=26\) can be reconstructed directly from the family decomposition.

The consecutive family contributes

$C(10)=\frac{(10-4)(10-3)}{2}=21.$

The geometric block contributes nothing because the shortest possible nontrivial numerator base already gives \(2^4=16\gt 10\).

Among the recurrence families, only

$1,2,3,5,8$

reaches length \(5\) under the bound \(10\), so this contributes \(1\).

The hybrid chain starts with \(1,2,6,18,\dots\), so it is still too short at \(N=10\).

Among the ten sporadic exceptions, four have last term at most \(10\). Therefore

$G(10)=21+0+1+0+4=26,$

which matches the implementation check exactly.

How the Code Works

The C++, Python, and Java implementations all follow the same pipeline. First they compute \(R=\lfloor N^{1/4}\rfloor\) carefully, correcting the floating-point estimate so that \(R^4\le N\lt (R+1)^4\). Next they add the consecutive-family contribution \(C(N)\).

Then they build a totient table up to \(R\) and evaluate the geometric sum

$\sum_{u=2}^{R}\varphi(u)\sum_{\substack{e\ge 4\\u^e\le N}}\left\lfloor\frac{N}{u^e}\right\rfloor.$

After that, they iterate through every admissible recurrence coefficient and seed pair, generate terms until monotonic growth stops or the next term exceeds \(N\), and add \(C(t)\) for the resulting family length \(t\). The hybrid chain and the ten sporadic corrections are appended at the end. All arithmetic is reduced modulo \(10^9+7\) after each accumulation step. The C++ implementation parallelizes the geometric summation across numerator ranges, while the Python and Java implementations evaluate the same formula serially.

Complexity Analysis

Let \(R=\lfloor N^{1/4}\rfloor\). Computing the totient table costs \(O(R\log\log R)\) time and \(O(R)\) memory. The geometric double sum visits all powers \(u^e\le N\), so its total work is also near \(O(R\log\log R)\). The recurrence families are enumerated only for coefficients up to \(R\), and each family grows exponentially, so the total recurrence work is well bounded by \(O(R\log N)\) in practice. The hybrid and sporadic parts are constant-time. Overall the method uses \(O(R)\) memory and runs close to fourth-root complexity rather than depending linearly on \(N\).

Footnotes and References

  1. Project Euler problem page: https://projecteuler.net/problem=771
  2. Geometric progression: Wikipedia - Geometric progression
  3. Euler's totient function: Wikipedia - Euler's totient function
  4. Linear recurrence with constant coefficients: Wikipedia - Linear recurrence with constant coefficients
  5. Lucas sequence: Wikipedia - Lucas sequence

Mathematical approach · C++ solution · Python solution · Java solution

Previous: Problem 770 · All Project Euler solutions · Next: Problem 772

Need help with a problem? Ask me! 💡
e
✦ Euler GLM 5.2
Hi! I'm Euler. Ask me anything about Project Euler problems, math concepts, or solution approaches! 🧮