Problem 259: Reachable Numbers
View on Project EulerProject Euler Problem 259 Solution
EulerSolve provides an optimized solution for Project Euler Problem 259, Reachable Numbers, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary A positive integer is called reachable if it can be obtained from the digits 1,2,3,4,5,6,7,8,9 used in this exact order and exactly once each, under the following rules: 1. Between two consecutive digits we may place one of the four binary operations + , - , * , / , or we may concatenate the digits. 2. Parentheses may be inserted arbitrarily. 3. Unary minus is not allowed. The goal is to find the sum of all distinct positive integers that are reachable exactly. Mathematical Approach 1. Encoding Every Admissible Expression There are 8 gaps between the 9 digits. In each gap we choose one symbol from $\{+,-,\times,\div,\#\},$ where \(\#\) means decimal concatenation. Therefore the total number of raw operator patterns is $5^8=390625.$ For example, the pattern 1#2 + 3#4 * 5 - 6#7#8 / 9 collapses to the token list $12,\ 34,\ 5,\ 678,\ 9$ with operator list $+,\ \times,\ -,\ \div.$ So after concatenations are resolved, every pattern becomes a sequence $v_0,v_1,\dots,v_{m-1}\in \mathbb Z_{>0}$ together with operators $o_0,o_1,\dots,o_{m-2}\in\{+,-,\times,\div\},$ where \(1\le m\le 9\). 2. Why Parentheses Lead to Interval Dynamic Programming Once the token list is fixed, the only remaining freedom is the parenthesization....
Detailed mathematical approach
Problem Summary
A positive integer is called reachable if it can be obtained from the digits
1,2,3,4,5,6,7,8,9
used in this exact order and exactly once each, under the following rules:
1. Between two consecutive digits we may place one of the four binary operations +, -, *, /, or we may concatenate the digits.
2. Parentheses may be inserted arbitrarily.
3. Unary minus is not allowed.
The goal is to find the sum of all distinct positive integers that are reachable exactly.
Mathematical Approach
1. Encoding Every Admissible Expression
There are 8 gaps between the 9 digits. In each gap we choose one symbol from
$\{+,-,\times,\div,\#\},$
where \(\#\) means decimal concatenation.
Therefore the total number of raw operator patterns is
$5^8=390625.$
For example, the pattern
1#2 + 3#4 * 5 - 6#7#8 / 9
collapses to the token list
$12,\ 34,\ 5,\ 678,\ 9$
with operator list
$+,\ \times,\ -,\ \div.$
So after concatenations are resolved, every pattern becomes a sequence
$v_0,v_1,\dots,v_{m-1}\in \mathbb Z_{>0}$
together with operators
$o_0,o_1,\dots,o_{m-2}\in\{+,-,\times,\div\},$
where \(1\le m\le 9\).
2. Why Parentheses Lead to Interval Dynamic Programming
Once the token list is fixed, the only remaining freedom is the parenthesization.
Any fully parenthesized binary expression over
$v_l,v_{l+1},\dots,v_r$
has a last operation performed at some split position \(k\), where
$l\le k<r.$
That last step combines:
1. a fully parenthesized expression on the left interval \([l,k]\),
2. a fully parenthesized expression on the right interval \([k+1,r]\),
3. the fixed operator \(o_k\) between those two intervals.
This is exactly the same structural idea as matrix-chain dynamic programming: every binary tree has a final split.
3. The Interval DP State
For every interval \([l,r]\), define
$R_{l,r}=\{\text{all exact rational values obtainable from }v_l,\dots,v_r\}.$
The base case is immediate:
$R_{i,i}=\{v_i\}.$
For longer intervals, we split at every possible \(k\):
$R_{l,r}=\bigcup_{k=l}^{r-1}\{a\;o_k\;b\mid a\in R_{l,k},\ b\in R_{k+1,r}\}.$
The only forbidden operation is division by zero, so in the division case we keep only pairs with
$b\ne 0.$
This recurrence covers every legal parenthesization exactly because every binary expression tree has one unique last split.
4. Why Exact Rational Arithmetic Is Necessary
Intermediate values do not have to be integers. In fact, some reachable positive integers can only be built by passing through fractions.
The official statement gives the example
$(1/23)\times((4\times 5)-6)\times(78-9)=42.$
Indeed,
$((4\times 5)-6)=14,\qquad (78-9)=69,$
so the expression becomes
$\frac{1}{23}\times 14\times 69=\frac{966}{23}=42.$
If we stored only integers, or if we used floating-point approximations, we would lose valid results. Therefore the solver stores every intermediate value as an exact reduced rational number.
5. A Small DP Example
Consider the short token list
$1,\ 2,\ 3$
with operators
$+,\ \times.$
Then
$R_{0,0}=\{1\},\qquad R_{1,1}=\{2\},\qquad R_{2,2}=\{3\}.$
For the interval \([0,1]\), we get
$R_{0,1}=\{1+2\}=\{3\}.$
For \([1,2]\),
$R_{1,2}=\{2\times 3\}=\{6\}.$
Finally, interval \([0,2]\) has two splits:
$((1+2)\times 3)=9,$
$1+(2\times 3)=7.$
Hence
$R_{0,2}=\{7,9\}.$
This small example shows exactly why a single left-to-right evaluation is not enough: different parenthesizations genuinely create different results.
6. Deduplication Inside Each Interval
Different splits can lead to the same rational number. Also, the same result can appear through different parenthesizations. So every \(R_{l,r}\) must be stored as a set of unique exact values.
The code uses reduced fractions, so for example
$\frac12=\frac24$
is recognized automatically as one value, not two.
This local deduplication is important because otherwise the number of intermediate objects would explode much faster.
7. Which Values Contribute to the Final Sum
The DP for one token pattern returns every reachable rational number for the full interval \([0,m-1]\).
A result contributes to the final answer if and only if:
1. its denominator is 1,
2. its numerator is positive.
So the acceptance condition is exactly
$q=1\qquad\text{and}\qquad p>0$
for a reduced fraction \(p/q\).
All such integers from all 390625 patterns are inserted into one global set, and only then are they summed. This prevents counting the same reachable integer more than once when it arises from different operator patterns.
8. Why Concatenation Is Handled First
Concatenation is not a binary arithmetic operation on two already-evaluated subexpressions. It is a lexical operation on adjacent digits before the arithmetic tree is built.
That is why the code first converts each base-5 pattern into a token list of integers. Only after this preprocessing step does the interval DP begin.
Mathematically, this separates the problem into two layers:
1. choose how digits are grouped into decimal numbers,
2. choose how those numbers are combined by arithmetic and parentheses.
9. Catalan Structure Without Explicit Tree Generation
If a token pattern has \(m\) numbers, then the number of full binary parenthesizations is the Catalan number
$C_{m-1}=\frac{1}{m}\binom{2m-2}{m-1}.$
In the worst case \(m=9\), this gives
$C_8=1430.$
The DP does not explicitly generate these 1430 trees one by one. Instead, it memoizes interval results, so every interval is solved once and then reused wherever needed.
10. How the Code Organizes the Global Search
The function collect_for_range processes a block of pattern codes. Each pattern code is read in base 5, which tells the program what to place in each of the 8 gaps.
After tokenization, all_results performs the interval DP and returns all exact reachable rationals for that pattern.
Each thread keeps a local vector of positive integers found in its assigned range. Those vectors are sorted and deduplicated locally, then merged globally at the end.
The final sum uses cpp_int rather than 64-bit integers, because the total sum of all reachable positive integers is too large to assume a small fixed machine type.
11. Why Threading Is Safe Here
The 390625 patterns are completely independent. No pattern needs information from another. So the full search space can be split into disjoint ranges and processed in parallel without changing the mathematical result.
Each worker computes exact reachable values for its own code range, deduplicates locally, and the main thread performs one final union.
How the Code Works
The helper op_from_code converts base-5 digits into one of the five gap choices. The function collect_for_range decodes a pattern, resolves concatenations into a vector of integers values, and builds the matching operator list ops.
The function all_results(values, ops) allocates a 2D DP table where entry \([l][r]\) stores every exact fraction reachable on that token interval. For each split \(k\), it combines every left value with every right value using operator ops[k], skipping division by zero, and deduplicates with a set.
After all threads finish, the main routine merges their integer outputs, removes duplicates once more globally, and sums the remaining values with arbitrary-precision integer arithmetic.
Complexity Analysis
The total number of token patterns is fixed at
$5^8=390625.$
For one pattern with \(m\) numeric tokens, interval DP has
$O(m^2)$
subintervals and up to
$O(m)$
split points per interval. The real cost is dominated by the sizes of the intermediate rational sets and the cartesian products formed when combining them.
Since \(m\le 9\), the token dimension is very small; what makes the problem nontrivial is the huge number of patterns and the need for exact deduplication. Memory usage is proportional to the total number of fractions stored across all DP intervals for one pattern, plus the thread-local and global sets of reachable integers.
Footnotes and References
- Problem page: https://projecteuler.net/problem=259
- Dynamic programming and interval splitting: Wikipedia - Dynamic programming
- Matrix-chain style split recurrence: Wikipedia - Matrix chain multiplication
- Rational numbers and exact fractions: Wikipedia - Rational number
- Catalan numbers: Wikipedia - Catalan number