Problem 93: Arithmetic Expressions
View on Project EulerProject Euler Problem 93 Solution
EulerSolve provides an optimized solution for Project Euler Problem 93, Arithmetic Expressions, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary Choose four distinct digits from \(\{1,2,\dots,9\}\). Using each chosen digit exactly once, together with the binary operations \(+\), \(-\), \(\times\), and \(\div\), and allowing parentheses, we ask which four-digit set can generate the longest consecutive run of positive integers \(1,2,3,\dots,n\). For a fixed digit set \(D=\{a,b,c,d\}\), let \(I(D)\) be the set of positive integers obtainable from valid expressions. The score of \(D\) is $L(D)=\max\{n\ge 0:\{1,2,\dots,n\}\subseteq I(D)\}.$ The goal is to maximize \(L(D)\) over all four-element subsets of \(\{1,\dots,9\}\), then output the winning digits in ascending order as one four-digit string. Mathematical Approach The key point is that the problem is exhaustive but finite. Every legal expression can be described by a small collection of discrete choices: an ordering of the digits, a choice of three operators, and a choice of one full binary-tree shape. Once that state space is written down explicitly, the search becomes a complete enumeration rather than symbolic guesswork. Every valid expression is a full binary tree Fix an ordered tuple \((x_1,x_2,x_3,x_4)\) of the four chosen digits. Any legal expression using these four numbers exactly once and only binary operations has three internal operation nodes and four leaves, so it is a full binary tree with four leaves....
Detailed mathematical approach
Problem Summary
Choose four distinct digits from \(\{1,2,\dots,9\}\). Using each chosen digit exactly once, together with the binary operations \(+\), \(-\), \(\times\), and \(\div\), and allowing parentheses, we ask which four-digit set can generate the longest consecutive run of positive integers \(1,2,3,\dots,n\).
For a fixed digit set \(D=\{a,b,c,d\}\), let \(I(D)\) be the set of positive integers obtainable from valid expressions. The score of \(D\) is
$L(D)=\max\{n\ge 0:\{1,2,\dots,n\}\subseteq I(D)\}.$
The goal is to maximize \(L(D)\) over all four-element subsets of \(\{1,\dots,9\}\), then output the winning digits in ascending order as one four-digit string.
Mathematical Approach
The key point is that the problem is exhaustive but finite. Every legal expression can be described by a small collection of discrete choices: an ordering of the digits, a choice of three operators, and a choice of one full binary-tree shape. Once that state space is written down explicitly, the search becomes a complete enumeration rather than symbolic guesswork.
Every valid expression is a full binary tree
Fix an ordered tuple \((x_1,x_2,x_3,x_4)\) of the four chosen digits. Any legal expression using these four numbers exactly once and only binary operations has three internal operation nodes and four leaves, so it is a full binary tree with four leaves.
For four leaves there are exactly five such shapes, corresponding to the five fully parenthesized forms
$(((x_1 \circ_1 x_2)\circ_2 x_3)\circ_3 x_4),$
$((x_1\circ_1 (x_2\circ_2 x_3))\circ_3 x_4),$
$((x_1\circ_1 x_2)\circ_2 (x_3\circ_3 x_4)),$
$x_1\circ_1 ((x_2\circ_2 x_3)\circ_3 x_4),$
$x_1\circ_1 (x_2\circ_2 (x_3\circ_3 x_4)).$
The number \(5\) is the Catalan number \(C_3\). This is the mathematical reason there are exactly five parenthesization patterns to test and no others.
Count the finite search space exactly
For one unordered digit set \(D\), the leaves can be arranged in \(4!=24\) orders. Each of the three internal nodes can carry any of the four operations, giving \(4^3=64\) operator triples. Combining these with the five tree shapes yields
$N_{\text{expr}}=4!\cdot 4^3\cdot 5=24\cdot 64\cdot 5=7680$
candidate expressions for one digit set.
The outer search runs over all
$\binom{9}{4}=126$
choices of four distinct digits from \(\{1,\dots,9\}\), so the entire problem requires only
$126\cdot 7680=967680$
candidate evaluations. That count is small enough for brute force, but it is also complete: every admissible expression belongs to exactly one leaf order, one operator assignment, and one of the five binary-tree shapes above. Different descriptions can still collapse to the same numerical value, but duplicates do not matter because we keep results in a set.
Work over rational numbers, then keep positive integers
Division means intermediate results need not be integers, so the natural domain is \(\mathbb{Q}\), not \(\mathbb{Z}\). If
$x=\frac{p}{q},\qquad y=\frac{r}{s},\qquad q\ne 0,\ s\ne 0,$
then the four operations are
$x+y=\frac{ps+rq}{qs},\qquad x-y=\frac{ps-rq}{qs},\qquad xy=\frac{pr}{qs},$
$\frac{x}{y}=\frac{ps}{qr}\qquad\text{provided }r\ne 0.$
Any branch with division by zero is invalid and is discarded. A final outcome contributes only when it lies in \(\mathbb{Z}_{>0}\).
This viewpoint explains the implementations cleanly. The C++ implementation represents intermediates as reduced fractions, so integrality is exact. The Python and Java implementations explore the same expression space numerically and accept a value only if it is positive and numerically integral at the end.
Reachable-value sets and the consecutive-prefix invariant
Let \(R(D)\subseteq \mathbb{Q}\) be the set of all valid outcomes for the digit set \(D\), and define
$I(D)=R(D)\cap \mathbb{Z}_{>0}.$
The score is the length of the initial positive-integer prefix contained in \(I(D)\):
$L(D)=\max\{n\ge 0:\{1,2,\dots,n\}\subseteq I(D)\}.$
Only the first gap matters. If \(m\notin I(D)\), then the run cannot extend beyond \(m-1\), regardless of how many larger integers are also representable. So once all outcomes for \(D\) are collected, the score is recovered by checking \(1,2,3,\dots\) until the first missing value appears.
Worked Example: why parenthesization is part of the state
Take the ordered digits \((1,2,3,4)\) and the operator triple \((+,\times,-)\). The five binary-tree shapes give
$((1+2)\times 3)-4=5,$
$\bigl(1+(2\times 3)\bigr)-4=3,$
$((1+2)\times(3-4))=-3,$
$1+\bigl((2\times 3)-4\bigr)=3,$
$1+\bigl(2\times(3-4)\bigr)=-1.$
The digits are the same and the multiset of operations is the same, yet the results differ because the tree shape changes the evaluation order. That is why the five Catalan shapes are not optional boilerplate; they are the actual mathematical cases to enumerate. One implementation also uses the known fact that the digits \(\{1,2,3,4\}\) achieve the consecutive run \(1\) through \(28\) as a checkpoint.
How the Code Works
Enumerating the candidate digit sets
The C++, Python, and Java implementations iterate over all combinations \(a<b<c<d\) from \(\{1,\dots,9\}\). This covers each four-digit set exactly once and keeps the eventual answer in ascending order automatically.
Evaluating every legal expression form
For each digit set, the implementation loops over all 24 digit permutations, all 64 operator triples, and the five parenthesization patterns listed above. Each pattern performs exactly three binary operations. Any attempt to divide by zero terminates that branch immediately.
The C++ version carries reduced fractions through the whole computation. The Python and Java versions use floating-point arithmetic, but they follow the same five-shape enumeration and keep only final values that are positive integers.
Scoring one set and tracking the best one
Every accepted positive integer is inserted into a set attached to the current digits. After the enumeration finishes, the implementation scans upward from \(1\) until it finds the first integer not present. That missing value determines \(L(D)\). If the new score exceeds the best score seen so far, the current digits become the new best answer.
Complexity Analysis
For one digit set, the algorithm performs exactly \(24\cdot 64\cdot 5=7680\) candidate evaluations. Across all \(\binom{9}{4}=126\) digit sets, that is \(967680\) evaluations in total. Each candidate uses only three arithmetic operations plus constant-time validity checks, so the overall running time is effectively linear in that explicit count.
Auxiliary space is small. At any moment the implementation stores only the set of distinct positive integers reachable from the current digit set, together with a few temporary rational or floating-point values. In asymptotic terms this is \(O(U)\), where \(U\) is the number of distinct positive integers generated for one four-digit choice, and \(U\) remains modest in practice.
Footnotes and References
- Problem page: https://projecteuler.net/problem=93
- Binary expression tree: Wikipedia - Binary expression tree
- Catalan number: Wikipedia - Catalan number
- Rational number: Wikipedia - Rational number
- Permutation: Wikipedia - Permutation