Problem 912: Where are the Odds?
View on Project EulerProject Euler Problem 912 Solution
EulerSolve provides an optimized solution for Project Euler Problem 912, Where are the Odds?, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary Let \(a_1 \lt a_2 \lt \cdots\) be the increasing sequence of positive integers whose binary expansion never contains the substring \(111\). The problem asks for the value of \(F(10^{16})\), where $F(N)=\sum_{i=1}^{N} i^2\,\mathbf{1}_{\{a_i\equiv 1\pmod 2\}} \pmod{10^9+7}.$ The square is applied to the rank \(i\), not to the admissible integer \(a_i\). So the real question is: among the first \(N\) valid binary strings, which positions correspond to odd values? A brute-force scan is hopeless at \(N=10^{16}\), so the solution has to count whole families of valid strings at once. Mathematical Approach The key idea is to order the admissible integers by bit-length, describe allowed suffixes with a tiny automaton, and attach enough statistics to each suffix block that an entire block can be added to the answer in one formula. Bit-length blocks already match numerical order Every positive integer with \(L\) bits is larger than every positive integer with fewer than \(L\) bits. Therefore the global ordering of the sequence naturally splits into consecutive blocks: first all admissible 1-bit numbers, then all admissible 2-bit numbers, then all admissible 3-bit numbers, and so on. Inside a fixed bit-length, ordinary numerical order is exactly lexicographic order on the remaining bits once the leading \(1\) is fixed....
Detailed mathematical approach
Problem Summary
Let \(a_1 \lt a_2 \lt \cdots\) be the increasing sequence of positive integers whose binary expansion never contains the substring \(111\). The problem asks for the value of \(F(10^{16})\), where
$F(N)=\sum_{i=1}^{N} i^2\,\mathbf{1}_{\{a_i\equiv 1\pmod 2\}} \pmod{10^9+7}.$
The square is applied to the rank \(i\), not to the admissible integer \(a_i\). So the real question is: among the first \(N\) valid binary strings, which positions correspond to odd values? A brute-force scan is hopeless at \(N=10^{16}\), so the solution has to count whole families of valid strings at once.
Mathematical Approach
The key idea is to order the admissible integers by bit-length, describe allowed suffixes with a tiny automaton, and attach enough statistics to each suffix block that an entire block can be added to the answer in one formula.
Bit-length blocks already match numerical order
Every positive integer with \(L\) bits is larger than every positive integer with fewer than \(L\) bits. Therefore the global ordering of the sequence naturally splits into consecutive blocks: first all admissible 1-bit numbers, then all admissible 2-bit numbers, then all admissible 3-bit numbers, and so on.
Inside a fixed bit-length, ordinary numerical order is exactly lexicographic order on the remaining bits once the leading \(1\) is fixed. That observation is what makes a left-child/right-child recursion possible: all completions beginning with the next bit \(0\) come before all completions beginning with the next bit \(1\).
The trailing-ones automaton
When a binary word is being built from left to right, the only information needed to avoid the forbidden pattern \(111\) is the current length of the trailing run of \(1\)'s. Let
$c\in\{0,1,2\}$
be that run length. Appending a \(0\) is always legal and resets the state to \(0\). Appending a \(1\) is legal only when \(c\lt 2\), and then moves the state to \(c+1\):
$c\xrightarrow{0}0,\qquad c\xrightarrow{1}c+1\quad(c=0,1).$
For each state \(c\) and each remaining length \(r\), define \(\mathcal{B}_{c,r}\) to be the ordered block of all valid \(r\)-bit suffixes that can be appended from state \(c\). The full block of admissible \(L\)-bit positive integers is therefore \(\mathcal{B}_{1,L-1}\), because the leading bit is forced to be \(1\), so before the suffix starts the trailing-ones count is already \(1\).
When \(r=0\), the block contains exactly one finished number. That number is odd precisely when its final bit is \(1\), which in this state description is equivalent to \(c\gt 0\).
Four statistics attached to one block
For a block \(\mathcal{B}_{c,r}\), let \(J_{c,r}\) be the set of local positions occupied by odd elements inside that block, using positions \(1,2,\dots,|\mathcal{B}_{c,r}|\). The implementations keep four aggregates:
$C_{c,r}=|\mathcal{B}_{c,r}|,$
$Q_{c,r}=|J_{c,r}|,$
$A_{c,r}=\sum_{j\in J_{c,r}} j,$
$B_{c,r}=\sum_{j\in J_{c,r}} j^2.$
These are exactly the data needed later. \(C_{c,r}\) tells us how large the block is, \(Q_{c,r}\) counts its odd elements, and \(A_{c,r},B_{c,r}\) are the first and second moments of their local positions.
The base case is immediate:
$C_{c,0}=1,\qquad Q_{c,0}=\mathbf{1}_{\{c\gt 0\}},\qquad A_{c,0}=Q_{c,0},\qquad B_{c,0}=Q_{c,0}.$
If the finished number is odd, its unique local position is \(1\), so both the first and second moments are also \(1\).
Merging the 0-child and the 1-child
For \(r\gt 0\), the block \(\mathcal{B}_{c,r}\) splits according to the next bit. The left child is always \(\mathcal{B}_{0,r-1}\). The right child exists only when \(c\lt 2\), in which case it is \(\mathcal{B}_{c+1,r-1}\). Let
$\Delta=C_{0,r-1}$
be the size of the left child. Since every left-child completion is numerically smaller than every right-child completion, all local positions from the right child are shifted upward by \(\Delta\) when the two children are concatenated.
That gives the block-size and odd-count recurrences
$C_{c,r}=C_{0,r-1}+\mathbf{1}_{\{c\lt 2\}}\,C_{c+1,r-1},$
$Q_{c,r}=Q_{0,r-1}+\mathbf{1}_{\{c\lt 2\}}\,Q_{c+1,r-1}.$
For the position moments, each odd local position \(j\) in the right child becomes \(\Delta+j\), so
$A_{c,r}=A_{0,r-1}+\mathbf{1}_{\{c\lt 2\}}\left(A_{c+1,r-1}+\Delta\,Q_{c+1,r-1}\right),$
$B_{c,r}=B_{0,r-1}+\mathbf{1}_{\{c\lt 2\}}\left(B_{c+1,r-1}+2\Delta\,A_{c+1,r-1}+\Delta^2Q_{c+1,r-1}\right).$
As a side consequence, the total sizes \(T_L=C_{1,L-1}\) begin
$1,\,2,\,3,\,6,\,11,\,20,\dots,$
and satisfy a tribonacci-type recurrence. The code does not need that closed form, but it explains why the number of relevant bit-lengths grows only logarithmically with \(N\).
Prefix statistics for the final partial block
When we sum the first \(N\) admissible integers, every bit-length block before the last one is taken completely, but the last block may be needed only partially. So the same statistics must also be available for the first \(k\) elements of a block. Denote this prefix aggregate by
$\operatorname{Pref}(c,r,k).$
If \(\Lambda=C_{0,r-1}\) is the left-child size, then the prefix follows the block structure exactly:
$\operatorname{Pref}(c,r,k)= \begin{cases} (0,0,0,0),&k=0,\\ (C_{c,r},Q_{c,r},A_{c,r},B_{c,r}),&k\ge C_{c,r},\\ \operatorname{Pref}(0,r-1,k),&1\le k\le \Lambda,\\ (C_{0,r-1},Q_{0,r-1},A_{0,r-1},B_{0,r-1})\oplus \operatorname{Pref}(c+1,r-1,k-\Lambda),&k\gt \Lambda, \end{cases}$
where \(\oplus\) means “put the whole left block before the chosen prefix of the right block” and use the same index shift by \(\Lambda\) as above. This is the crucial step that avoids enumerating the elements inside the final partial block.
Turning local odd positions into the global answer
Suppose all shorter bit-length blocks together contain \(P\) admissible numbers, and suppose the chosen part of the current block has local odd positions \(j\). Their global ranks are then \(P+j\). If that chosen part has statistics \((C,Q,A,B)\), its total contribution is
$\sum (P+j)^2=QP^2+2PA+B.$
This identity is the reason for storing the first and second moments of the odd positions. Once \((Q,A,B)\) are known for a whole block or for a prefix block, the entire contribution of that piece is obtained immediately.
Worked example: the first ten admissible integers
The first admissible positive integers are
$1,\ 10,\ 11,\ 100,\ 101,\ 110,\ 1000,\ 1001,\ 1010,\ 1011,\dots$
in binary, that is
$1,\ 2,\ 3,\ 4,\ 5,\ 6,\ 8,\ 9,\ 10,\ 11,\dots$
in decimal. Among these first ten terms, the odd ones occur at ranks
$1,\ 3,\ 5,\ 8,\ 10.$
Therefore
$F(10)=1^2+3^2+5^2+8^2+10^2=199.$
The block view reproduces this exactly. The 1-bit block is just \(1\), so it contributes the global rank \(1\). The 2-bit block is \(10,11\); its only odd local position is \(2\), which shifts to global rank \(3\). The 3-bit block is \(100,101,110\); again the only odd local position is \(2\), which shifts to global rank \(5\).
The 4-bit block is \(1000,1001,1010,1011,1100,1101\), whose odd local positions are \(2,4,6\). But to reach the first ten admissible integers we need only the first four members of this block, namely \(1000,1001,1010,1011\). Their odd local positions are \(2\) and \(4\), and after shifting by the previous six admissible numbers they become global ranks \(8\) and \(10\). That miniature example is exactly what the prefix routine does at large scale.
How the Code Works
Memoizing complete blocks
The C++, Python, and Java implementations memoize every state \((c,r)\) exactly once. For each state they keep the exact block size, the exact odd count, and modular versions of the odd count and the first two position moments. Exact sizes are needed to decide how a prefix splits between the two children; modular moments are enough for the final sum modulo \(10^9+7\).
Descending through the final prefix
If the requested \(N\) stops in the middle of some bit-length block, the implementation recursively descends through that block. At each level it compares the desired prefix length with the left-child size. If the prefix fits entirely inside the left child, it keeps descending left. Otherwise it takes the whole left child, subtracts its size from the prefix length, and continues inside the right child before merging the two pieces with the shift formulas above.
Accumulating the answer block by block
The outer loop scans bit-lengths in increasing order. It keeps a running value \(P\) for how many admissible integers have already been accounted for. For each complete block, or for the one partial block at the end, it reads the statistics \((Q,A,B)\) and adds
$QP^2+2PA+B \pmod{10^9+7}.$
No implementation enumerates the admissible integers up to rank \(10^{16}\); they all work only with block sizes and position moments. One implementation also verifies small cases such as \(F(10)=199\) against brute force before evaluating the large target.
Complexity Analysis
Let \(L_\ast\) be the largest bit-length needed before the first \(N\) admissible integers are covered. There are only \(3(L_\ast+1)\) full automaton states, and each one is computed once. The sweep over complete bit-length blocks costs \(O(L_\ast)\), and the recursive descent for the final partial block also costs \(O(L_\ast)\).
So the total running time is \(O(L_\ast)\), and the memory usage is also \(O(L_\ast)\). Because the block sizes grow exponentially with \(L\) in tribonacci fashion, we have \(L_\ast=O(\log N)\). For the actual problem, that means both time and space are \(O(\log N)\).
Footnotes and References
- Project Euler, Problem 912: https://projecteuler.net/problem=912
- Finite-state machine: Wikipedia - Finite-state machine
- Regular language: Wikipedia - Regular language
- Tribonacci number: Wikipedia - Tribonacci number
- Combinatorics on words: Wikipedia - Combinatorics on words