Problem 844: $k$-Markov Numbers
View on Project EulerProject Euler Problem 844 Solution
EulerSolve provides an optimized solution for Project Euler Problem 844, $k$-Markov Numbers, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary The problem studies the \(k\)-variable Markov-type equation $x_1^2+x_2^2+\cdots+x_k^2=kx_1x_2\cdots x_k.$ A Vieta-style move chooses one coordinate and replaces it by $x_i'=k\prod_{j\ne i}x_j-x_i,$ which preserves the equation. Starting from \((1,1,\dots,1)\), let \(\mathcal{V}_k(N)\) be the set of distinct integers \(x\le N\) that appear in at least one reachable positive tuple, and define $M_k(N)=\sum_{x\in \mathcal{V}_k(N)}x.$ The required quantity is $S(K,N)=\sum_{k=3}^{K} M_k(N)\pmod{1405695061}.$ The key difficulty is that small \(k\) produce a branching state graph, while large \(k\) collapse into a short deterministic chain. Mathematical Approach The implementation uses a hybrid argument: exact breadth-first exploration while genuine branching is still possible, and a closed-form tail once the bound \(N\) rules out tuples with three nontrivial entries. Step 1: The Vieta move preserves the Diophantine surface Fix all coordinates except \(x_i\). Then the defining equation becomes a quadratic in one variable: $X^2-k\left(\prod_{j\ne i}x_j\right)X+\sum_{j\ne i}x_j^2=0.$ If one root is \(X=x_i\), Vieta's formulas show that the other root is $X'=k\prod_{j\ne i}x_j-x_i.$ So replacing \(x_i\) by \(x_i'\) keeps the tuple on the same equation. The start tuple \((1,\dots,1)\) is valid because both sides equal \(k\)....
Detailed mathematical approach
Problem Summary
The problem studies the \(k\)-variable Markov-type equation
$x_1^2+x_2^2+\cdots+x_k^2=kx_1x_2\cdots x_k.$
A Vieta-style move chooses one coordinate and replaces it by
$x_i'=k\prod_{j\ne i}x_j-x_i,$
which preserves the equation. Starting from \((1,1,\dots,1)\), let \(\mathcal{V}_k(N)\) be the set of distinct integers \(x\le N\) that appear in at least one reachable positive tuple, and define
$M_k(N)=\sum_{x\in \mathcal{V}_k(N)}x.$
The required quantity is
$S(K,N)=\sum_{k=3}^{K} M_k(N)\pmod{1405695061}.$
The key difficulty is that small \(k\) produce a branching state graph, while large \(k\) collapse into a short deterministic chain.
Mathematical Approach
The implementation uses a hybrid argument: exact breadth-first exploration while genuine branching is still possible, and a closed-form tail once the bound \(N\) rules out tuples with three nontrivial entries.
Step 1: The Vieta move preserves the Diophantine surface
Fix all coordinates except \(x_i\). Then the defining equation becomes a quadratic in one variable:
$X^2-k\left(\prod_{j\ne i}x_j\right)X+\sum_{j\ne i}x_j^2=0.$
If one root is \(X=x_i\), Vieta's formulas show that the other root is
$X'=k\prod_{j\ne i}x_j-x_i.$
So replacing \(x_i\) by \(x_i'\) keeps the tuple on the same equation. The start tuple \((1,\dots,1)\) is valid because both sides equal \(k\). Every tuple examined by the exact search is generated from this seed by repeated applications of the same invariant move.
Step 2: Compress states by hiding all coordinates equal to \(1\)
The equation and the move are symmetric in the coordinates, so order does not matter. The implementation therefore stores only the entries greater than \(1\), sorted increasingly. A stored state
$a_1\le a_2\le \cdots \le a_t,\qquad a_r>1,$
represents the full \(k\)-tuple with \(k-t\) hidden ones. This compression is exact because the empty product is \(1\), and the move formulas become:
If a hidden \(1\) is replaced, the new value is
$y=k\prod_{r=1}^{t}a_r-1.$
If the stored value \(a_i\) is replaced, the new value is
$y=k\prod_{r\ne i}a_r-a_i.$
If \(y=1\), that coordinate simply disappears from the stored state. If \(y>1\), it is inserted back and the state is sorted again. This turns a \(k\)-tuple problem into a search over small sorted multisets.
Step 3: Exact exploration for the genuinely branching range
For fixed \(k\), a breadth-first search starts from the empty stored state, which represents \((1,\dots,1)\). Each reachable state with all entries at most \(N\) is visited once, and every distinct value appearing in any visited state is inserted into \(\mathcal{V}_k(N)\). Because states are stored canonically as sorted multisets, permuting coordinates does not create duplicates.
This phase is exact: it does not rely on heuristics, only on the move formulas above and on the bound \(x\le N\). The sum \(M_k(N)\) is therefore the sum of distinct reachable values, not the sum over states.
Step 4: Find the threshold where three nontrivial coordinates become impossible
Let
$a_0=1,\qquad a_1=k-1.$
The smallest possible state with one stored entry is therefore \((a_1)\). Replacing one more hidden \(1\) gives the smallest possible two-entry state, whose new value is
$a_2=ka_1-a_0=k^2-k-1.$
Now ask for the smallest possible state with three stored entries. The minimal way to create it is to start from the minimal two-entry state \((a_1,a_2)\) and replace yet another hidden \(1\). That produces
$b_3=ka_1a_2-1=k(k-1)(k^2-k-1)-1.$
Hence if \(b_3>N\), then no reachable tuple contributing to \(M_k(N)\) can contain three coordinates greater than \(1\). This gives the exact cutoff
$\kappa_4=\max\left\{k\ge 3: k(k-1)(k^2-k-1)-1\le N\right\}.$
The implementation explores \(k\le \kappa_4\) exactly and handles \(k>\kappa_4\) by formulas.
Step 5: Above the cutoff, the state graph collapses to a linear chain
Once three stored entries are impossible, every relevant state has at most two coordinates greater than \(1\). In that regime the reachable values follow the recurrence
$a_{n+1}=ka_n-a_{n-1},\qquad a_0=1,\qquad a_1=k-1.$
The first terms are
$a_2=k^2-k-1,\qquad a_3=k^3-k^2-2k+1.$
For \(k\ge 3\), the sequence is strictly increasing: if \(a_n>a_{n-1}\), then
$a_{n+1}=ka_n-a_{n-1}>2a_n-a_{n-1}>a_n.$
The next term is
$a_4=ka_3-a_2=k^4-k^3-3k^2+2k+1,$
and it satisfies
$a_4-b_3=(k-2)(k^2-k-1)>0.$
So if \(k>\kappa_4\), then \(b_3>N\), hence \(a_4>N\) and every later chain term is also \(>N\). Therefore only \(1\), \(a_1\), \(a_2\), and \(a_3\) can contribute in the large-\(k\) range.
Define the two remaining thresholds
$\kappa_2=\max\left\{k\ge 3: k^2-k-1\le N\right\},$
$\kappa_3=\max\left\{k\ge 3: k^3-k^2-2k+1\le N\right\}.$
Then for every \(k>\kappa_4\),
$M_k(N)=k+\begin{cases} k^2-k-1,&k\le \kappa_2,\\ 0,&k>\kappa_2, \end{cases} +\begin{cases} k^3-k^2-2k+1,&k\le \kappa_3,\\ 0,&k>\kappa_3. \end{cases}$
The term \(k\) is simply \(1+(k-1)\), the guaranteed contribution of the seed value and the first nontrivial value.
Worked Example: \(k=4\) and \(N=100\)
For \(k=4\), the chain begins with
$a_0=1,\qquad a_1=3,\qquad a_2=11,\qquad a_3=41.$
The smallest possible third stored entry is
$b_3=4\cdot 3\cdot 11-1=131>100.$
So no tuple with three entries greater than \(1\) can contribute. The only reachable values at or below \(100\) are therefore
$1,\ 3,\ 11,\ 41,$
which gives
$M_4(100)=1+3+11+41=56.$
Together with the \(k=3\) contribution, the full check value is
$S(4,100)=229,$
matching the program's built-in sanity test.
How the Code Works
The C++, Python, and Java implementations all follow the same structure. First, they compute the quartic cutoff \(\kappa_4\) and run an exact breadth-first search separately for each \(k\) from \(3\) up to \(\min(K,\kappa_4)\). A queue stores canonical states, a visited set prevents repeats, and a second set records the distinct integers that have appeared so they are counted only once in \(M_k(N)\).
During each transition, the implementation multiplies only up to the cap needed to decide whether the next value could still be at most \(N\). That prevents unnecessary big intermediate products and prunes branches as soon as they are certainly too large. When several equal entries appear in a state, only one representative replacement needs to be tried, because replacing identical coordinates yields the same canonical successor.
After the exact range, the implementation switches to the chain formulas. Three binary searches locate \(\kappa_2\), \(\kappa_3\), and \(\kappa_4\). The remaining contribution is then written as a sum of piecewise polynomials in \(k\), so it can be accumulated from closed forms for
$\sum k,\qquad \sum k^2,\qquad \sum k^3$
modulo \(1405695061\). Modular inverses of \(2\) and \(6\) are used to evaluate the usual prefix-sum formulas inside the modulus.
Complexity Analysis
Let \(T_k(N)\) be the number of canonical states visited in the exact search for a fixed \(k\). The exact phase costs
$O\left(\sum_{k=3}^{\min(K,\kappa_4)} T_k(N)\cdot c_k\right),$
where \(c_k\) is the average work per state for capped products, successor construction, and set operations. The memory usage of that phase is \(O(T_k(N))\) per active \(k\), dominated by the visited states and the set of distinct numbers.
The important structural fact is that \(\kappa_4\) is defined by a quartic inequality, so the exact outer loop covers only \(O(N^{1/4})\) values of \(k\). The large-\(k\) tail needs only three binary searches, hence \(O(\log N)\) time, plus \(O(1)\) arithmetic for the closed-form interval sums. In practice the runtime is dominated by the exact searches near the cutoff, and the tail is negligible.
Footnotes and References
- Problem page: https://projecteuler.net/problem=844
- Markov numbers and the classical three-variable case: Wikipedia — Markov number
- Vieta jumping and root replacement arguments: Wikipedia — Vieta jumping
- Breadth-first search: Wikipedia — Breadth-first search
- Linear recurrences: Wikipedia — Recurrence relation