Problem 705: Total Inversion Count of Divided Sequences
View on Project EulerProject Euler Problem 705 Solution
EulerSolve provides an optimized solution for Project Euler Problem 705, Total Inversion Count of Divided Sequences, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary Concatenate the decimal expansions of all primes below \(N\). The implementations only branch on nonzero digits, so let \(d_1,\dots,d_L\in\{1,\dots,9\}\) be that resulting stream in left-to-right order. At each position \(i\), we choose one positive divisor of \(d_i\), obtaining a generated sequence \(a_1,\dots,a_L\) with \(a_i\in D(d_i)\). For one generated sequence, its inversion count is $\operatorname{inv}(a_1,\dots,a_L)=\#\{(i,j):1\le i<j\le L,\ a_i>a_j\}.$ The goal is to sum this inversion count over all divisor choices and return the total modulo $M=10^9+7.$ Mathematical Approach Directly enumerating every generated sequence is impossible, because each new digit multiplies the number of branches by its divisor count. The key observation is that we do not need the sequences themselves; we only need aggregate information that is sufficient to update the global inversion total when one more digit is appended. Step 1: Fix the Local Divisor Sets For each nonzero digit \(d\), define $D(d)=\{x\in\{1,\dots,9\}: x\mid d\},\qquad \tau(d)=|D(d)|.$ These sets are fixed once and for all: $\begin{aligned} D(1)&=\{1\}, & D(2)&=\{1,2\}, & D(3)&=\{1,3\},\\ D(4)&=\{1,2,4\}, & D(5)&=\{1,5\}, & D(6)&=\{1,2,3,6\},\\ D(7)&=\{1,7\}, & D(8)&=\{1,2,4,8\}, & D(9)&=\{1,3,9\}....
Detailed mathematical approach
Problem Summary
Concatenate the decimal expansions of all primes below \(N\). The implementations only branch on nonzero digits, so let \(d_1,\dots,d_L\in\{1,\dots,9\}\) be that resulting stream in left-to-right order. At each position \(i\), we choose one positive divisor of \(d_i\), obtaining a generated sequence \(a_1,\dots,a_L\) with \(a_i\in D(d_i)\).
For one generated sequence, its inversion count is
$\operatorname{inv}(a_1,\dots,a_L)=\#\{(i,j):1\le i<j\le L,\ a_i>a_j\}.$
The goal is to sum this inversion count over all divisor choices and return the total modulo
$M=10^9+7.$
Mathematical Approach
Directly enumerating every generated sequence is impossible, because each new digit multiplies the number of branches by its divisor count. The key observation is that we do not need the sequences themselves; we only need aggregate information that is sufficient to update the global inversion total when one more digit is appended.
Step 1: Fix the Local Divisor Sets
For each nonzero digit \(d\), define
$D(d)=\{x\in\{1,\dots,9\}: x\mid d\},\qquad \tau(d)=|D(d)|.$
These sets are fixed once and for all:
$\begin{aligned} D(1)&=\{1\}, & D(2)&=\{1,2\}, & D(3)&=\{1,3\},\\ D(4)&=\{1,2,4\}, & D(5)&=\{1,5\}, & D(6)&=\{1,2,3,6\},\\ D(7)&=\{1,7\}, & D(8)&=\{1,2,4,8\}, & D(9)&=\{1,3,9\}. \end{aligned}$
So every processed digit contributes only a tiny local choice set, even though the number of full sequences grows multiplicatively.
Step 2: Track the Right Global Aggregates
After processing the first \(i\) digits, let
$T_i=\prod_{k=1}^{i}\tau(d_k)$
be the number of generated prefixes. Let \(I_i\) be the sum of inversion counts over all those prefixes. For each value \(t\in\{1,\dots,9\}\), define
$A_i(t)=\sum_{\text{all generated prefixes }a_1,\dots,a_i}\#\{k\le i:a_k=t\}.$
This is the total number of occurrences of \(t\) across all prefixes, not merely the number of prefixes ending in \(t\). That distinction matters: when a new value is appended, it can form inversions with any earlier larger value, regardless of position.
Step 3: Derive the Inversion Recurrence
Suppose the next processed digit is \(d_i\), with divisor set \(D(d_i)\). Every old prefix branches into \(\tau(d_i)\) continuations, so every old inversion is copied that many times:
$\text{old contribution}=\tau(d_i)\,I_{i-1}.$
New inversions are exactly the pairs whose right endpoint is the newly appended value. If we append \(v\in D(d_i)\), then the number of new inversions contributed over all old prefixes equals the number of earlier entries larger than \(v\):
$G_{i-1}(v)=\sum_{t=v+1}^{9}A_{i-1}(t).$
Summing over all allowed appended values yields
$I_i=\tau(d_i)\,I_{i-1}+\sum_{v\in D(d_i)}G_{i-1}(v).$
Step 4: Derive the Occurrence Recurrence
For a fixed value \(t\in\{1,\dots,9\}\), every old occurrence of \(t\) is duplicated in each of the \(\tau(d_i)\) branches, contributing \(\tau(d_i)A_{i-1}(t)\).
There is also a fresh last-position contribution. If \(t\in D(d_i)\), then each old prefix creates exactly one new branch ending with \(t\), so we gain \(T_{i-1}\) additional occurrences of \(t\). Hence
$A_i(t)=\tau(d_i)\,A_{i-1}(t)+\mathbf{1}_{t\in D(d_i)}\,T_{i-1}.$
The branch count obeys the equally simple recurrence
$T_i=\tau(d_i)\,T_{i-1},\qquad T_0=1,\qquad I_0=0,\qquad A_0(t)=0.$
Together, these formulas completely describe the evolution of the total inversion sum.
Step 5: Why the State Stays Tiny
The chosen values always lie in \(\{1,\dots,9\}\), and every divisor set has size at most \(4\). Therefore the quantities \(G_{i-1}(v)\) can be obtained from a descending suffix sum over only nine counters \(A_{i-1}(1),\dots,A_{i-1}(9)\).
The implementations skip digit \(0\), so only digits \(1\) through \(9\) enter the dynamic program. This means the combinatorial explosion of \(\prod \tau(d_i)\) is absorbed into modular arithmetic on a constant-size state rather than explicit branching.
Step 6: Worked Example for \(N=20\)
The primes below \(20\) are
$2,3,5,7,11,13,17,19,$
so the processed nonzero digit stream is
$2,3,5,7,1,1,1,3,1,7,1,9.$
The first four digits have divisor sets \(\{1,2\}\), \(\{1,3\}\), \(\{1,5\}\), and \(\{1,7\}\). After these four steps there are
$T_4=2^4=16$
generated prefixes, and the inversion totals are
$I_1=0,\qquad I_2=1,\qquad I_3=6,\qquad I_4=24.$
At that point the total occurrence counts are
$A_4(1)=32,\qquad A_4(2)=A_4(3)=A_4(5)=A_4(7)=8,$
with all other \(A_4(t)\) equal to \(0\). Each following digit \(1\) has only one allowed value, so the cross term is always
$A_4(2)+A_4(3)+A_4(5)+A_4(7)=32,$
which gives
$I_5=56,\qquad I_6=88,\qquad I_7=120.$
The next digit is \(3\), so the old total doubles and the new cross term is \(48\), yielding
$I_8=2\cdot 120+48=288.$
The next digit \(1\) gives \(I_9=368\), the next digit \(7\) gives
$I_{10}=2\cdot 368+80=816,$
and the next digit \(1\) gives \(I_{11}=1008\). Before the final digit \(9\), the suffix totals are
$\sum_{t>1}A_{11}(t)=192,\qquad \sum_{t>3}A_{11}(t)=96,\qquad \sum_{t>9}A_{11}(t)=0.$
Since \(D(9)=\{1,3,9\}\), the last cross term is \(192+96+0=288\), so
$I_{12}=3\cdot 1008+288=3312.$
This matches the checkpoint used by the implementations.
How the Code Works
The C++, Python, and Java implementations all use the same mathematical state. They precompute the divisor choices for digits \(1\) through \(9\), then generate all primes below \(N\) with an odd-only sieve while handling prime \(2\) explicitly. Each prime is decomposed into decimal digits from most significant to least significant so the dynamic program sees the correct left-to-right stream.
For each nonzero digit, the implementation first builds the descending sums \(\sum_{t>v}A(t)\), then updates the global inversion total with the recurrence for \(I_i\), then updates the occurrence totals with the recurrence for \(A_i(t)\), and finally multiplies the sequence count by \(\tau(d_i)\). All arithmetic is reduced modulo \(10^9+7\).
The C++ and Java implementations perform this dynamic program directly. The Python implementation is a thin wrapper that makes sure the compiled solver is available, runs it, and returns the same numeric answer.
Complexity Analysis
Let \(L\) be the number of processed nonzero digits appearing in the decimal expansions of primes below \(N\). In the given implementations, prime generation costs \(O(N\log\log N)\) time and \(O(N)\) memory. The digit dynamic program performs constant work per processed digit, because the alphabet has size \(9\) and every divisor set has size at most \(4\). Therefore the DP contributes \(O(L)\) time and \(O(1)\) extra memory.
Overall, the method runs in
$O(N\log\log N+L)$
time and uses \(O(N)\) memory, while avoiding explicit construction of the exponentially many generated sequences.
Footnotes and References
- Problem page: https://projecteuler.net/problem=705
- Inversion in discrete mathematics: Wikipedia — Inversion (discrete mathematics)
- Divisor function and divisibility background: Wikipedia — Divisor
- Sieve of Eratosthenes: Wikipedia — Sieve of Eratosthenes
- Dynamic programming: Wikipedia — Dynamic programming