Problem 633: Square Prime Factors II
View on Project EulerProject Euler Problem 633 Solution
EulerSolve provides an optimized solution for Project Euler Problem 633, Square Prime Factors II, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary For each nonnegative integer \(k\), let \(c_k\) be the natural density of positive integers \(n\) for which exactly \(k\) distinct primes satisfy \(p^2 \mid n\). Equivalently, if $\Omega_2(n)=\#\{p \text{ prime} : p^2 \mid n\},$ then \(c_k\) is the density of the set \(\{n\ge 1 : \Omega_2(n)=k\}\). The problem asks for \(c_7\), written in normalized scientific notation with five significant digits. Mathematical Approach The implementations treat each prime square event as one independent density contribution and accumulate the coefficient of degree \(7\) in an infinite Euler product. Step 1: Model One Prime at a Time Fix a prime \(p\). Among all positive integers, the density of multiples of \(p^2\) is $\frac{1}{p^2},$ because exactly one residue class modulo \(p^2\) is divisible by \(p^2\). Therefore the density of integers not divisible by \(p^2\) is $1-\frac{1}{p^2}.$ If we mark the event \(p^2 \mid n\) by a factor of \(z\), then prime \(p\) contributes the linear factor $\left(1-\frac{1}{p^2}\right)+\frac{z}{p^2}.$ The constant term means “this prime does not contribute to \(\Omega_2(n)\),” while the coefficient of \(z\) means “this prime does contribute once.” Step 2: Multiply Over Distinct Primes For a finite set of distinct primes \(S\), divisibility conditions by the squares \(\{p^2 : p\in S\}\) are independent at the density level....
Detailed mathematical approach
Problem Summary
For each nonnegative integer \(k\), let \(c_k\) be the natural density of positive integers \(n\) for which exactly \(k\) distinct primes satisfy \(p^2 \mid n\). Equivalently, if
$\Omega_2(n)=\#\{p \text{ prime} : p^2 \mid n\},$
then \(c_k\) is the density of the set \(\{n\ge 1 : \Omega_2(n)=k\}\). The problem asks for \(c_7\), written in normalized scientific notation with five significant digits.
Mathematical Approach
The implementations treat each prime square event as one independent density contribution and accumulate the coefficient of degree \(7\) in an infinite Euler product.
Step 1: Model One Prime at a Time
Fix a prime \(p\). Among all positive integers, the density of multiples of \(p^2\) is
$\frac{1}{p^2},$
because exactly one residue class modulo \(p^2\) is divisible by \(p^2\). Therefore the density of integers not divisible by \(p^2\) is
$1-\frac{1}{p^2}.$
If we mark the event \(p^2 \mid n\) by a factor of \(z\), then prime \(p\) contributes the linear factor
$\left(1-\frac{1}{p^2}\right)+\frac{z}{p^2}.$
The constant term means “this prime does not contribute to \(\Omega_2(n)\),” while the coefficient of \(z\) means “this prime does contribute once.”
Step 2: Multiply Over Distinct Primes
For a finite set of distinct primes \(S\), divisibility conditions by the squares \(\{p^2 : p\in S\}\) are independent at the density level. The reason is the Chinese remainder theorem: residue classes modulo
$\prod_{p\in S} p^2$
split cleanly into independent choices for each prime square modulus. Hence the truncated generating function is
$F_S(z)=\prod_{p\in S}\left(1-\frac{1}{p^2}+\frac{z}{p^2}\right).$
When this product is expanded, the coefficient of \(z^k\) is exactly the density of integers for which precisely \(k\) primes in \(S\) satisfy \(p^2 \mid n\).
Step 3: Pass to the Infinite Euler Product
Letting \(S\) grow through all primes gives
$F(z)=\prod_p\left(1-\frac{1}{p^2}+\frac{z}{p^2}\right)=\sum_{k\ge 0} c_k z^k.$
This product converges because
$\sum_p \frac{1}{p^2}$
converges. So each coefficient \(c_k\) is well defined. A useful sanity check is the degree-\(0\) coefficient:
$c_0=\prod_p\left(1-\frac{1}{p^2}\right)=\frac{1}{\zeta(2)}=\frac{6}{\pi^2},$
which is the classical density of squarefree integers. Problem 633 asks for the degree-\(7\) coefficient instead.
Step 4: Derive the Streaming Coefficient Recurrence
Suppose after processing some primes we have the truncated polynomial
$C(z)=d_0+d_1 z+\cdots+d_7 z^7.$
For the next prime \(p\), define
$a_p=\frac{1}{p^2}, \qquad b_p=1-a_p.$
Multiplying by the new factor \(b_p+a_p z\) gives
$C(z)(b_p+a_p z).$
Matching coefficients up to degree \(7\) yields
$d_0' = d_0 b_p,$
$d_k' = d_k b_p + d_{k-1} a_p \qquad (1 \le k \le 7).$
This is exactly the dynamic-programming update used in the implementations. The update must run from high degree down to low degree so that the old value of \(d_{k-1}\) is still available when degree \(k\) is updated.
Step 5: Justify the Prime Cutoff
The programs do not multiply over all primes literally. Instead they use every prime up to
$P_{\max}=10^7.$
This is numerically safe because the omitted tail is governed by the convergent series
$\sum_{p>P_{\max}} \frac{1}{p^2}.$
Each skipped prime contributes a factor extremely close to \(1\), so the resulting change in coefficients through degree \(7\) is far smaller than the requested five-significant-digit output. In other words, the infinite product is replaced by a finite product whose truncation error is below the display precision.
Worked Example: Only the Primes \(2\) and \(3\)
To see the mechanism concretely, pretend we only keep the primes \(2\) and \(3\). Then
$\left(1-\frac{1}{2^2}+\frac{z}{2^2}\right)\left(1-\frac{1}{3^2}+\frac{z}{3^2}\right)=\left(\frac{3}{4}+\frac{z}{4}\right)\left(\frac{8}{9}+\frac{z}{9}\right).$
Expanding gives
$\frac{2}{3}+\frac{11}{36}z+\frac{1}{36}z^2.$
So, with respect to the square divisibility events from only \(2^2\) and \(3^2\):
$c_0^{(2,3)}=\frac{2}{3}, \qquad c_1^{(2,3)}=\frac{11}{36}, \qquad c_2^{(2,3)}=\frac{1}{36}.$
The middle coefficient means that density \(\frac{11}{36}\) of integers are divisible by exactly one of \(4\) or \(9\). The full problem repeats this same multiplication across all primes and then reads off the coefficient of \(z^7\).
How the Code Works
The C++, Python, and Java implementations all follow the same structure. First they generate every prime up to \(10^7\) using an odd-only sieve, which stores compositeness information only for odd numbers to reduce memory usage. Then they keep an array of eight floating-point coefficients representing the truncated polynomial
$d_0+d_1 z+\cdots+d_7 z^7.$
The array starts as
$1+0z+0z^2+\cdots+0z^7,$
because before processing any primes, the empty product equals \(1\).
For each prime \(p\), the implementation computes
$a_p=\frac{1}{p^2}, \qquad b_p=1-\frac{1}{p^2},$
and updates the coefficient array from degree \(7\) down to degree \(1\) using
$d_k \leftarrow d_k b_p + d_{k-1} a_p,$
followed by
$d_0 \leftarrow d_0 b_p.$
After the last prime, the entry at degree \(7\) is the numerical approximation to \(c_7\). The final step rescales that positive number into the form
$m \times 10^e, \qquad 1 \le m < 10,$
and rounds the mantissa to four digits after the decimal point, which is five significant digits overall.
Complexity Analysis
Let \(P=10^7\) and \(K=7\). The prime sieve costs \(O(P \log\log P)\) time. The subsequent coefficient update costs \(O(K)\) per prime, so the dynamic-programming stage is
$O(\pi(P)\,K).$
Since \(K\) is fixed at \(7\), that stage is effectively linear in the number of generated primes. The coefficient array itself uses constant space \(O(K)\), while the sieve uses \(O(P)\) boolean storage, reduced in practice by storing only odd indices.
Footnotes and References
- Problem page: https://projecteuler.net/problem=633
- Euler product: Wikipedia - Euler product
- Natural density: Wikipedia - Natural density
- Squarefree integer: Wikipedia - Square-free integer
- Chinese remainder theorem: Wikipedia - Chinese remainder theorem
- Riemann zeta function at \(2\): Wikipedia - Riemann zeta function