Problem 682: $5$-Smooth Pairs
View on Project EulerProject Euler Problem 682 Solution
EulerSolve provides an optimized solution for Project Euler Problem 682, $5$-Smooth Pairs, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary A \(5\)-smooth number has the form \(2^a3^b5^c\) with \(a,b,c\ge 0\). If we write $\Omega(2^a3^b5^c)=a+b+c,\qquad \operatorname{sopfr}(2^a3^b5^c)=2a+3b+5c,$ then the counting problem solved by the implementations can be expressed through $C_k(s)=\#\left\{(a,b,c)\in \mathbb{Z}_{\ge 0}^3:\ a+b+c=k,\ 2a+3b+5c=s\right\}.$ The target sequence is $f(n)=\sum_{k\ge 0}\sum_{s=0}^{n} C_k(s)\,C_k(n-s).$ So \(f(n)\) counts ordered pairs of \(5\)-smooth numbers with the same total number of prime factors, while their weighted prime-factor sums add up to \(n\). The goal is to compute \(f(10^7)\bmod (10^9+7)\). Mathematical Approach The brute-force definition is useful for understanding the sequence, but it is far too slow at \(n=10^7\). The key is to convert the double count into one rational generating function and then read off a sparse linear recurrence. Step 1: Encode One 5-Smooth Layer For fixed \(k\), define the ordinary generating polynomial $P_k(q)=\sum_{s\ge 0} C_k(s)\,q^s=\sum_{a+b+c=k} q^{2a+3b+5c}.$ Introduce a second marker \(x\) for the value of \(k\). Summing over all triples gives $\sum_{k\ge 0} P_k(q)x^k=\sum_{a,b,c\ge 0} x^{a+b+c}q^{2a+3b+5c}=\frac{1}{(1-xq^2)(1-xq^3)(1-xq^5)}.$ This compactly stores every allowed exponent triple, with \(x\) recording the total exponent count and \(q\) recording the weighted sum \(2a+3b+5c\)....
Detailed mathematical approach
Problem Summary
A \(5\)-smooth number has the form \(2^a3^b5^c\) with \(a,b,c\ge 0\). If we write
$\Omega(2^a3^b5^c)=a+b+c,\qquad \operatorname{sopfr}(2^a3^b5^c)=2a+3b+5c,$
then the counting problem solved by the implementations can be expressed through
$C_k(s)=\#\left\{(a,b,c)\in \mathbb{Z}_{\ge 0}^3:\ a+b+c=k,\ 2a+3b+5c=s\right\}.$
The target sequence is
$f(n)=\sum_{k\ge 0}\sum_{s=0}^{n} C_k(s)\,C_k(n-s).$
So \(f(n)\) counts ordered pairs of \(5\)-smooth numbers with the same total number of prime factors, while their weighted prime-factor sums add up to \(n\). The goal is to compute \(f(10^7)\bmod (10^9+7)\).
Mathematical Approach
The brute-force definition is useful for understanding the sequence, but it is far too slow at \(n=10^7\). The key is to convert the double count into one rational generating function and then read off a sparse linear recurrence.
Step 1: Encode One 5-Smooth Layer
For fixed \(k\), define the ordinary generating polynomial
$P_k(q)=\sum_{s\ge 0} C_k(s)\,q^s=\sum_{a+b+c=k} q^{2a+3b+5c}.$
Introduce a second marker \(x\) for the value of \(k\). Summing over all triples gives
$\sum_{k\ge 0} P_k(q)x^k=\sum_{a,b,c\ge 0} x^{a+b+c}q^{2a+3b+5c}=\frac{1}{(1-xq^2)(1-xq^3)(1-xq^5)}.$
This compactly stores every allowed exponent triple, with \(x\) recording the total exponent count and \(q\) recording the weighted sum \(2a+3b+5c\).
Step 2: Enforce the Equal-\(k\) Pairing Condition
For one fixed \(k\), the coefficient of \(q^n\) in \(P_k(q)^2\) is exactly
$\sum_{s=0}^{n} C_k(s)\,C_k(n-s).$
Therefore the full generating function of the target sequence is
$F(q)=\sum_{n\ge 0} f(n)q^n=\sum_{k\ge 0} P_k(q)^2.$
To sum the squares while forcing the same value of \(k\) on both sides, use constant-term extraction:
$F(q)=\mathrm{CT}_x\!\left(\sum_{k\ge 0} P_k(q)x^k\right)\!\left(\sum_{m\ge 0} P_m(q)x^{-m}\right).$
Substituting the bivariate generating function from Step 1 yields
$F(q)=\mathrm{CT}_x\!\left(\frac{1}{(1-xq^2)(1-xq^3)(1-xq^5)(1-q^2/x)(1-q^3/x)(1-q^5/x)}\right).$
The coefficient of \(x^0\) is what keeps only the terms with identical total exponent count on both sides of the pair.
Step 3: Collapse the Constant Term to a Rational Function
For \(|q|<1\), the poles in the \(x\)-plane that lie inside the unit circle are \(x=q^2\), \(x=q^3\), and \(x=q^5\). Summing the three corresponding residues gives the exact rational form
$F(q)=\frac{1-q+q^4+q^5-2q^6+q^7+q^8-q^{11}+q^{12}}{1-q-q^6+q^9-q^{10}+q^{11}+q^{13}-q^{19}-q^{21}+q^{22}-q^{23}+q^{26}+q^{31}-q^{32}}.$
This single fraction is the decisive step: once \(F(q)\) is rational, the sequence \(f(n)\) must satisfy a linear recurrence with constant coefficients.
Step 4: Read Off the Sparse Recurrence
Write
$N(q)=1-q+q^4+q^5-2q^6+q^7+q^8-q^{11}+q^{12},$
$D(q)=1-q-q^6+q^9-q^{10}+q^{11}+q^{13}-q^{19}-q^{21}+q^{22}-q^{23}+q^{26}+q^{31}-q^{32}.$
Since \(D(q)F(q)=N(q)\), coefficient comparison gives
$\begin{aligned} f(n)=\nu_n&+f(n-1)+f(n-6)-f(n-9)+f(n-10)-f(n-11)-f(n-13)\\ &+f(n-19)+f(n-21)-f(n-22)+f(n-23)-f(n-26)-f(n-31)+f(n-32), \end{aligned}$
where \(f(m)=0\) for \(m<0\), and \(\nu_n\) comes from the numerator \(N(q)\):
$\nu_n=\begin{cases} 1, & n\in\{0,4,5,7,8,12\},\\ -1, & n\in\{1,11\},\\ -2, & n=6,\\ 0, & \text{otherwise.} \end{cases}$
The implementations use exactly this recurrence, always reducing modulo \(10^9+7\).
Worked Example: \(f(10)=4\)
At \(n=10\), the count is small enough to inspect directly.
For \(k=1\), the only possible weighted sum is \(5\), coming from \(5=5^1\). This contributes one ordered pair: \((5,5)\).
For \(k=2\), the available \(5\)-smooth numbers are
$2^2=4,\qquad 2\cdot 3=6,\qquad 3^2=9,$
with weighted sums \(4\), \(5\), and \(6\), respectively. To total \(10\), the ordered pairs are
$ (4,9),\qquad (6,6),\qquad (9,4). $
No other value of \(k\) works, so
$f(10)=1+3=4,$
which matches the small checkpoint used by the implementations.
How the Code Works
The C++, Python, and Java implementations do not enumerate exponent triples once \(n\) becomes large. Instead, they build the sequence from left to right using the recurrence above. At each index they start from the forcing term \(\nu_n\), add or subtract the 13 lagged values dictated by the denominator polynomial, and normalize the result modulo \(10^9+7\).
The current implementations store all values from \(0\) through \(n\), so looking up each lagged term is immediate. One implementation also performs a small direct enumeration up to \(120\) and checks the sample values \(f(10)=4\) and \(f(100)=3629\) before computing the final answer.
Complexity Analysis
The direct combinatorial definition is far too expensive for \(n=10^7\), because it repeatedly scans many exponent triples and many convolutions. The optimized method performs only a constant amount of work per index: one forcing lookup, 13 additions or subtractions, and one modular reduction. Therefore the running time is \(O(n)\).
As written, the implementations keep the whole sequence array, so the memory usage is \(O(n)\). Since the largest lag is \(32\), the same recurrence could be implemented with a rolling window and \(O(1)\) memory, but that optimization is not used here.
Footnotes and References
- Problem page: https://projecteuler.net/problem=682
- Smooth numbers: Wikipedia - Smooth number
- Generating functions: Wikipedia - Generating function
- Residue theorem: Wikipedia - Residue theorem
- Linear recurrences with constant coefficients: Wikipedia - Linear recurrence with constant coefficients