Problem 175: Fractions Involving the Number of Different Ways a Number Can Be Expressed as a Sum of Powers of 2
View on Project EulerProject Euler Problem 175 Solution
EulerSolve provides an optimized solution for Project Euler Problem 175, Fractions Involving the Number of Different Ways a Number Can Be Expressed as a Sum of Powers of 2, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary Let \(f(n)\) be the number of ways to write \(n\) as a sum of powers of 2 when each power may be used at most twice. These are the hyperbinary representations of \(n\). The Project Euler instance asks for the smallest positive integer \(n\) such that $\frac{f(n)}{f(n-1)}=\frac{123456789}{987654321},$ and it wants the answer in shortened binary form: if the ordinary binary expansion of \(n\) is split into maximal runs of equal bits, we output the run lengths from left to right. For example, \(241=11110001_2\) becomes \([4,3,1]\). The crucial point is that the implementations never search for \(n\) and never build a huge table of values for the real input. They turn the target fraction directly into those binary run lengths. Mathematical Approach The solution rests on two facts: the counting function \(f(n)\) has a very rigid parity recurrence, and the quotient \(r(n)=f(n)/f(n-1)\) therefore moves through the positive rationals by simple deterministic rules. Hyperbinary Representations and the Basic Recurrence Look at the coefficient of \(2^0=1\) in a representation of \(n\). If \(n=2m+1\) is odd, that coefficient must be exactly 1. After removing that 1 and dividing every remaining term by 2, we obtain a bijection with representations of \(m\). Hence $f(2m+1)=f(m).$ If \(n=2m\) is even, the coefficient of 1 can only be 0 or 2....
Detailed mathematical approach
Problem Summary
Let \(f(n)\) be the number of ways to write \(n\) as a sum of powers of 2 when each power may be used at most twice. These are the hyperbinary representations of \(n\). The Project Euler instance asks for the smallest positive integer \(n\) such that
$\frac{f(n)}{f(n-1)}=\frac{123456789}{987654321},$
and it wants the answer in shortened binary form: if the ordinary binary expansion of \(n\) is split into maximal runs of equal bits, we output the run lengths from left to right. For example, \(241=11110001_2\) becomes \([4,3,1]\).
The crucial point is that the implementations never search for \(n\) and never build a huge table of values for the real input. They turn the target fraction directly into those binary run lengths.
Mathematical Approach
The solution rests on two facts: the counting function \(f(n)\) has a very rigid parity recurrence, and the quotient \(r(n)=f(n)/f(n-1)\) therefore moves through the positive rationals by simple deterministic rules.
Hyperbinary Representations and the Basic Recurrence
Look at the coefficient of \(2^0=1\) in a representation of \(n\).
If \(n=2m+1\) is odd, that coefficient must be exactly 1. After removing that 1 and dividing every remaining term by 2, we obtain a bijection with representations of \(m\). Hence
$f(2m+1)=f(m).$
If \(n=2m\) is even, the coefficient of 1 can only be 0 or 2. In the first case, dividing by 2 gives a representation of \(m\). In the second case, removing the two 1s and dividing by 2 gives a representation of \(m-1\). Therefore
$f(2m)=f(m)+f(m-1)\qquad (m\ge 1).$
Together with \(f(0)=1\), this is exactly the recurrence used in the implementations:
$f(0)=1,\qquad f(2m+1)=f(m),\qquad f(2m)=f(m)+f(m-1).$
The Ratio Is Controlled by Parity
Define
$r(n)=\frac{f(n)}{f(n-1)}\qquad (n\ge 1).$
Apply the recurrence to the pair \((f(n),f(n-1))\). For an even index \(n=2m\),
$r(2m)=\frac{f(m)+f(m-1)}{f(m-1)}=r(m)+1.$
For an odd index \(n=2m+1\),
$r(2m+1)=\frac{f(m)}{f(m)+f(m-1)}=\frac{r(m)}{r(m)+1}.$
These formulas already tell us something important about the binary expansion of the unknown \(n\). If \(n\) is even, then \(r(n)>1\). If \(n\) is odd and \(n>1\), then \(0<r(n)<1\). So by comparing the numerator and denominator of the target fraction, we can tell whether the last binary digit is 0 or 1.
Why the Output Is a List of Run Lengths
Binary expansion and parity recursion fit together perfectly. Starting from \(r(1)=1\), appending a binary digit to the right updates the ratio by one of two maps:
$0:\ x\mapsto x+1,\qquad 1:\ x\mapsto \frac{x}{x+1}.$
So if the binary expansion of \(n\) is read from left to right, every 0 means “add 1”, and every 1 after the leading bit means “replace \(x\) by \(x/(x+1)\)”. Consecutive equal bits therefore correspond to repeating the same transformation many times in a row.
This is exactly why the problem asks for the shortened binary expansion. The natural object recovered by the inverse process is not the full integer \(n\), but the lengths of alternating runs of 1s and 0s.
Grouped Euclidean Reduction of the Target Fraction
Now run the recurrence backward on a reduced positive fraction \(p/q\).
If \(p>q\), then the last step must have come from an even index, because only the map \(x\mapsto x+1\) produces a value greater than 1. Undoing one such step sends
$\frac{p}{q}\longmapsto \frac{p-q}{q}.$
If the inequality \(p>q\) continues to hold, we can strip several trailing 0s at once. Write
$p=a q+p',\qquad a=\left\lfloor\frac{p-1}{q}\right\rfloor,\qquad 1\le p'\le q.$
Then \(a\) is the length of the final run of 0s, and the smaller subproblem is \(p'/q\).
If \(p\le q\), then the last step must have come from an odd index, because the map \(x\mapsto x/(x+1)\) produces a value in \((0,1]\). Undoing one such step sends
$\frac{p}{q}\longmapsto \frac{p}{q-p}.$
Again we group repeated identical steps. Write
$q=b p+q',\qquad b=\left\lfloor\frac{q}{p}\right\rfloor,\qquad 0\le q'<p.$
Then \(b\) is the length of the final run of 1s, and the smaller subproblem is \(p/q'\). If \(q'=0\), we have reached the initial block of 1s at the far left of the binary expansion. In that sense, the algorithm is a Euclidean algorithm whose quotients are exactly the answer we want to print.
Worked Example: Recovering \([4,3,1]\) from \(13/17\)
Take the checkpoint fraction \(13/17\).
Since \(13\le 17\), the binary expansion ends with a run of 1s. Divide \(17\) by \(13\):
$17=1\cdot 13+4.$
So the last run length is 1, and we continue with \(13/4\).
Now \(13>4\), so the previous run was a run of 0s. Divide \(13\) by \(4\) in the grouped form used above:
$13=3\cdot 4+1.$
This contributes a run length 3 and reduces the problem to \(1/4\).
Finally,
$4=4\cdot 1+0,$
so the leading run of 1s has length 4. Reading the recorded lengths from left to right gives
$[4,3,1].$
Indeed, \([4,3,1]\) is the shortened binary expansion of \(241=11110001_2\), and the checkpoint computation confirms
$\frac{f(241)}{f(240)}=\frac{13}{17}.$
How the Code Works
The implementations first reduce the input fraction by its greatest common divisor. This is mathematically harmless, because the ratio \(f(n)/f(n-1)\) depends only on the reduced rational number.
After that, the main routine repeatedly compares the numerator and denominator. If the numerator is larger, it extracts one grouped quotient for a 0-run and replaces the numerator by the adjusted remainder in the interval \(1,\dots,q\). If the denominator is larger or equal, it extracts one grouped quotient for a 1-run and replaces the denominator by the ordinary remainder modulo the numerator. Those quotients are exactly the run lengths of the shortened binary expansion.
The C++ version emits the run lengths by recursive unwinding, so they appear directly in left-to-right order. The Python and Java versions collect them while descending through the Euclidean reductions and reverse the list once at the end. The C++ implementation also includes small checkpoints: it verifies the recurrence on tiny values, checks the sample fraction \(13/17\), and confirms that the minimal corresponding integer has shortened binary expansion \([4,3,1]\).
For the real input, none of the implementations constructs the enormous integer \(n\) itself. They stop as soon as the fraction has been fully decomposed into its alternating run lengths, then print those lengths as a comma-separated list.
Complexity Analysis
The reduction steps are Euclidean steps, only grouped by runs instead of performed one subtraction at a time. Therefore the number of iterations is \(O(\log \max(p,q))\).
The output list has the same asymptotic size, so the memory usage is \(O(\log \max(p,q))\) as well. For the concrete Project Euler input, 64-bit integer arithmetic is sufficient throughout, and the method is extremely fast because it avoids both brute-force search over \(n\) and explicit computation of large values of \(f(n)\).
Footnotes and References
- Project Euler problem page: https://projecteuler.net/problem=175
- Euclidean algorithm: Wikipedia - Euclidean algorithm
- Stern's diatomic series: Wikipedia - Stern's diatomic series
- Calkin-Wilf tree: Wikipedia - Calkin-Wilf tree
- Continued fraction: Wikipedia - Continued fraction