Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 187: Semiprimes

View on Project Euler

Project Euler Problem 187 Solution

EulerSolve provides an optimized solution for Project Euler Problem 187, Semiprimes, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary Project Euler 187 asks for the number of composite integers \(n \lt 10^8\) that have exactly two prime factors when multiplicity is counted. In other words, we must count the semiprimes below the limit, including prime squares such as \(4=2^2\), \(9=3^2\), and \(25=5^2\). Instead of factoring every integer below the limit, the solution counts prime pairs directly. If a number is semiprime, then it can be written as \(n=pq\) with primes \(p \le q\). Once that order is enforced, every valid semiprime corresponds to exactly one pair \((p,q)\). Mathematical Approach Let \(L\) denote the upper bound. The target quantity is $S(L)=\#\{n \lt L : n=pq,\ p \le q,\ p,q\text{ prime}\}.$ The key idea is to choose the smaller prime first and then count how many larger primes can be paired with it. Ordered prime pairs give a one-to-one model Every semiprime has a factorization \(n=pq\) with primes \(p \le q\). Because the smaller factor is placed first, the representation is unique. So the whole problem becomes a counting problem on ordered prime pairs satisfying $pq \lt L,\qquad p \le q.$ This removes the need for individual factorization. We never inspect a candidate integer and ask whether it is semiprime; we build exactly the semiprimes we want by counting admissible pairs....

Detailed mathematical approach

Problem Summary

Project Euler 187 asks for the number of composite integers \(n \lt 10^8\) that have exactly two prime factors when multiplicity is counted. In other words, we must count the semiprimes below the limit, including prime squares such as \(4=2^2\), \(9=3^2\), and \(25=5^2\).

Instead of factoring every integer below the limit, the solution counts prime pairs directly. If a number is semiprime, then it can be written as \(n=pq\) with primes \(p \le q\). Once that order is enforced, every valid semiprime corresponds to exactly one pair \((p,q)\).

Mathematical Approach

Let \(L\) denote the upper bound. The target quantity is

$S(L)=\#\{n \lt L : n=pq,\ p \le q,\ p,q\text{ prime}\}.$

The key idea is to choose the smaller prime first and then count how many larger primes can be paired with it.

Ordered prime pairs give a one-to-one model

Every semiprime has a factorization \(n=pq\) with primes \(p \le q\). Because the smaller factor is placed first, the representation is unique. So the whole problem becomes a counting problem on ordered prime pairs satisfying

$pq \lt L,\qquad p \le q.$

This removes the need for individual factorization. We never inspect a candidate integer and ask whether it is semiprime; we build exactly the semiprimes we want by counting admissible pairs.

Why the sieve only has to reach \(\left\lfloor\frac{L-1}{2}\right\rfloor\)

In every valid pair, the smaller prime satisfies \(p \ge 2\). Therefore the larger prime must satisfy

$q \le \left\lfloor\frac{L-1}{2}\right\rfloor.$

Indeed, if \(q\) were larger than \((L-1)/2\), then even the smallest possible partner \(p=2\) would give \(2q \ge L\). So every prime that can appear in any valid semiprime lies in the interval up to

$Q_{\max}=\left\lfloor\frac{L-1}{2}\right\rfloor.$

That is exactly why the implementations sieve primes only up to \(Q_{\max}\) rather than all the way to \(L\).

Counting the partner primes for a fixed \(p\)

Fix a prime \(p\) and insist that it is the smaller factor. Then \(q\) must satisfy two conditions: it must be at least \(p\), and the product must stay below \(L\). So the admissible range is

$p \le q \le \left\lfloor\frac{L-1}{p}\right\rfloor.$

If \(\pi(x)\) denotes the prime-counting function, then the number of valid partners for this one \(p\) is

$\pi\!\left(\left\lfloor\frac{L-1}{p}\right\rfloor\right)-\pi(p)+1.$

The implementations do not construct a separate array of \(\pi(x)\) values. They get the same count by binary-searching the sorted prime list and measuring how many primes in the suffix beginning at \(p\) stay below the upper bound.

Why the loop stops once \(p^2 \ge L\)

For a fixed smaller factor \(p\), the smallest possible partner is \(q=p\). If even \(p^2\) is already too large, then no larger prime \(q\) can work. Therefore only primes satisfying

$p^2 \lt L$

need to be processed. This explains the early termination condition used by all three implementations.

Worked example: the sample limit \(L=30\)

The problem statement says there are 10 semiprimes below 30. The counting method reproduces that result immediately.

Here

$Q_{\max}=\left\lfloor\frac{29}{2}\right\rfloor=14,$

so the relevant primes are \(2,3,5,7,11,13\).

For \(p=2\), we get \(q \le \lfloor 29/2 \rfloor = 14\), so the valid partners are \(2,3,5,7,11,13\). That contributes 6 semiprimes:

$2\cdot2,\ 2\cdot3,\ 2\cdot5,\ 2\cdot7,\ 2\cdot11,\ 2\cdot13.$

For \(p=3\), the bound is \(q \le \lfloor 29/3 \rfloor = 9\), so the valid partners are \(3,5,7\). That contributes 3 more:

$3\cdot3,\ 3\cdot5,\ 3\cdot7.$

For \(p=5\), the bound is \(q \le \lfloor 29/5 \rfloor = 5\), so only \(5\cdot5\) remains. At \(p=7\), we already have \(7^2=49 \ge 30\), so the process stops. The total is

$6+3+1=10,$

matching the sample list \(4,6,9,10,14,15,21,22,25,26\).

Final summation formula

If \(p_1 \lt p_2 \lt \cdots \lt p_m\) are the primes up to \(Q_{\max}\), then the answer can be written as

$S(L)=\sum_{\substack{1 \le i \le m\\ p_i^2 \lt L}} \#\left\{j \ge i : p_j \le \left\lfloor\frac{L-1}{p_i}\right\rfloor\right\}.$

This formula is exactly what the implementations evaluate. The only tasks are generating the ordered prime list once and then locating the right endpoint for each \(p_i\).

How the Code Works

The C++, Python, and Java implementations all begin with a sieve of Eratosthenes up to \(Q_{\max}\). The sieve invariant is the standard one: when the algorithm reaches a prime \(r\), every composite smaller than \(r^2\) has already been marked, so it is enough to strike out multiples starting from \(r^2\).

Once the prime list is available, the implementation scans that list from left to right. The current prime is treated as the smaller factor \(p\). If \(p^2 \ge L\), the loop ends immediately; otherwise the code computes the bound \(\left\lfloor\frac{L-1}{p}\right\rfloor\) and performs an upper-bound search in the sorted prime list.

Starting that search from the current index rather than from the beginning is the implementation-level version of enforcing \(q \ge p\). That single detail ensures that mixed products such as \(2\cdot13\) and \(13\cdot2\) are not both counted, while squares such as \(5\cdot5\) are still included. The three languages differ only in surface syntax: one uses a standard library upper-bound operation, one uses built-in list bisection, and one performs the same upper-bound search manually over an array.

Complexity Analysis

Let \(Q_{\max}=\left\lfloor\frac{L-1}{2}\right\rfloor\). The sieve phase costs

$O\!\left(Q_{\max}\log\log Q_{\max}\right)$

time and

$O(Q_{\max})$

memory.

The second phase processes only the primes below \(\sqrt{L}\), so it performs \(\pi(\sqrt{L})\) iterations, each with one binary search in the prime list. That contributes

$O\!\left(\pi(\sqrt{L})\log \pi(Q_{\max})\right)$

additional time. For the Euler limit \(L=10^8\), this post-sieve work is much smaller than factoring every integer below the limit, and the sieve remains the dominant cost.

Footnotes and References

  1. Project Euler problem 187: https://projecteuler.net/problem=187
  2. Semiprime: Wikipedia - Semiprime
  3. Sieve of Eratosthenes: Wikipedia - Sieve of Eratosthenes
  4. Prime-counting function: Wikipedia - Prime-counting function
  5. Binary search: Wikipedia - Binary search algorithm

Mathematical approach · C++ solution · Python solution · Java solution

Previous: Problem 186 · All Project Euler solutions · Next: Problem 188

Need help with a problem? Ask me! 💡
e
✦ Euler GLM 5.2
Hi! I'm Euler. Ask me anything about Project Euler problems, math concepts, or solution approaches! 🧮