Problem 347: Largest Integer Divisible by Two Primes
View on Project EulerProject Euler Problem 347 Solution
EulerSolve provides an optimized solution for Project Euler Problem 347, Largest Integer Divisible by Two Primes, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary For each pair of distinct primes \(p \lt q\), define \(M(p,q,N)\) as the largest integer not exceeding \(N\) whose only prime divisors are exactly \(p\) and \(q\), with both primes appearing at least once. In other words, valid numbers have the form \(p^a q^b\) with \(a,b \ge 1\). The goal is to compute $S(N)=\sum_{p \lt q} M(p,q,N).$ The key difficulty is that we must optimize over exponents for every prime pair, but we must do so without factoring every integer up to \(N\). Mathematical Approach Fix distinct primes \(p\) and \(q\). Then $M(p,q,N)=\max\{p^a q^b : a,b \ge 1,\ p^a q^b \le N\},$ with the convention that \(M(p,q,N)=0\) when no such exponents exist. Structure of the Admissible Numbers By the Fundamental Theorem of Arithmetic, every positive integer has a unique prime factorization. Therefore a number counted for the pair \((p,q)\) must be of the form \(p^a q^b\) with no third prime factor. The smallest such number is \(pq\), so immediately: $pq \gt N \implies M(p,q,N)=0.$ This simple observation is what allows both nested prime loops in the code to terminate early. Optimizing the Exponents for a Fixed Pair Suppose \(p^a\) is fixed....
Detailed mathematical approach
Problem Summary
For each pair of distinct primes \(p \lt q\), define \(M(p,q,N)\) as the largest integer not exceeding \(N\) whose only prime divisors are exactly \(p\) and \(q\), with both primes appearing at least once. In other words, valid numbers have the form \(p^a q^b\) with \(a,b \ge 1\). The goal is to compute
$S(N)=\sum_{p \lt q} M(p,q,N).$
The key difficulty is that we must optimize over exponents for every prime pair, but we must do so without factoring every integer up to \(N\).
Mathematical Approach
Fix distinct primes \(p\) and \(q\). Then
$M(p,q,N)=\max\{p^a q^b : a,b \ge 1,\ p^a q^b \le N\},$
with the convention that \(M(p,q,N)=0\) when no such exponents exist.
Structure of the Admissible Numbers
By the Fundamental Theorem of Arithmetic, every positive integer has a unique prime factorization. Therefore a number counted for the pair \((p,q)\) must be of the form \(p^a q^b\) with no third prime factor. The smallest such number is \(pq\), so immediately:
$pq \gt N \implies M(p,q,N)=0.$
This simple observation is what allows both nested prime loops in the code to terminate early.
Optimizing the Exponents for a Fixed Pair
Suppose \(p^a\) is fixed. Then the largest admissible exponent of \(q\) is the greatest integer \(b\) such that
$q^b \le \frac{N}{p^a}.$
Equivalently,
$b_{\max}(a)=\left\lfloor \log_q\!\left(\frac{N}{p^a}\right)\right\rfloor,$
whenever \(p^a q \le N\). So for this fixed \(a\), the best candidate is
$n_a = p^a q^{\,b_{\max}(a)}.$
Hence
$M(p,q,N)=\max_{1 \le a \le \lfloor \log_p(N/q)\rfloor} n_a.$
The code does not actually use floating-point logarithms. Instead, it builds powers of \(p\) and \(q\) by repeated multiplication, which is exact and avoids rounding issues.
Why the Exponent Scan Is Sufficient
For a fixed value of \(p^a\), every feasible number has the form \(p^a q^b\). Since \(q \gt 1\), increasing \(b\) strictly increases the value. Therefore, once \(a\) is fixed, the best choice is always the largest admissible power of \(q\). There is no need to keep smaller \(q\)-powers for the same \(a\); only the maximum matters.
This turns a two-dimensional search over all exponent pairs into the following finite scan:
$p,\ p^2,\ p^3,\dots,\qquad\text{and for each one, }q,\ q^2,\ q^3,\dots$
stopping as soon as the product would exceed \(N\).
Prime-Pair Iteration
After generating all primes up to \(N\) with the Sieve of Eratosthenes, the program iterates over pairs \(p \lt q\). The condition \(pq \le N\) is enough to decide whether the pair can contribute anything at all. Thus:
$S(N)=\sum_{\substack{p \lt q\\ pq \le N}} M(p,q,N).$
The inner loop breaks once \(pq \gt N\), and the outer loop breaks once \(2p \gt N\), because \(q \ge 2\) for every prime. These are mathematically valid pruning rules, not merely implementation heuristics.
Worked Example: \(N = 100\)
The problem statement gives three useful checkpoints, and they explain the method well.
For \((p,q)=(2,3)\), the admissible values are
$6,\ 12,\ 18,\ 24,\ 36,\ 48,\ 54,\ 72,\ 96,$
so
$M(2,3,100)=96.$
For \((p,q)=(3,5)\), the admissible values are
$15,\ 45,\ 75,$
hence
$M(3,5,100)=75.$
For \((p,q)=(2,73)\), the smallest possible product is \(2\cdot 73=146 \gt 100\), so
$M(2,73,100)=0.$
Summing \(M(p,q,100)\) over all prime pairs gives the checkpoint
$S(100)=2262.$
Deriving the Loop Structure Used in Code
The helper routine max_for_pair(p, q, limit) follows the mathematics directly. It first loops over
$p,\ p^2,\ p^3,\dots,\ p^a \le \frac{N}{q},$
because if \(p^a \gt N/q\), then even \(p^a q\) is already too large. For each such power \(p^a\), it starts with \(p^a q\) and keeps multiplying by \(q\) while staying within the limit. The best value seen over the whole scan is precisely \(M(p,q,N)\).
The overflow-safe tests in code, such as value > limit / q before multiplying by \(q\), are simply
integer forms of the inequality \(value \cdot q \le N\).
How the Code Works
The C++, Java, and Python solutions all follow the same three-stage structure.
First, sieve_primes(limit) computes the prime list up to \(N\). Second,
max_for_pair(p, q, limit) evaluates \(M(p,q,N)\) by exponent scanning. Third, the main solver loops
over all prime pairs with \(pq \le N\) and accumulates the corresponding maxima.
The implementation also keeps a seen array before adding a value to the sum. Mathematically, distinct
prime pairs cannot produce the same nonzero number because prime factorization is unique, so this guard is
defensive rather than essential. It does not change the result.
Complexity Analysis
Generating primes up to \(N\) by sieve costs \(O(N\log\log N)\) time and \(O(N)\) memory.
For a fixed pair \((p,q)\), the exponent scan performs at most \(O(\log_p N \cdot \log_q N)\) loop iterations in the worst case, and usually far fewer because the loops stop as soon as the product exceeds \(N\).
Therefore the total running time is
$O\!\left(N\log\log N + \sum_{\substack{p \lt q\\ pq \le N}} \log_p N \cdot \log_q N\right),$
with \(O(N)\) memory for the sieve and the marking array. In practice this is dramatically faster than factoring every integer from \(1\) to \(N\).
Footnotes and References
- Problem page: https://projecteuler.net/problem=347
- Sieve of Eratosthenes: Wikipedia — Sieve of Eratosthenes
- Fundamental theorem of arithmetic: Wikipedia — Fundamental theorem of arithmetic
- Prime counting and asymptotic background: G. H. Hardy, E. M. Wright, An Introduction to the Theory of Numbers.
- Integer exponent bounds and logarithms: Wikipedia — Logarithm