Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 615: The Millionth Number with at Least One Million Prime Factors

View on Project Euler

Project Euler Problem 615 Solution

EulerSolve provides an optimized solution for Project Euler Problem 615, The Millionth Number with at Least One Million Prime Factors, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary Let \(\Omega(n)\) denote the number of prime factors of \(n\), counted with multiplicity. For a fixed rank \(R\), define $a_t(R)=\min\left\{x:\#\{n\le x:\Omega(n)\ge t\}\ge R\right\}.$ This means that \(a_t(R)\) is the \(R\)-th smallest positive integer whose prime-factor count is at least \(t\). The problem asks for $a_{10^6}(10^6)\pmod{123454321}.$ A direct scan through the integers is hopeless, because the target threshold and the target rank are both one million. The implementations therefore search at a much smaller auxiliary level and then lift the answer to the final threshold by a rigorous doubling argument. Mathematical Approach The central idea is to work with an intermediate threshold \(c\), find the millionth number with at least \(c\) prime factors, and then prove that once this number lies below \(3^c\), every later threshold is obtained by repeated multiplication by \(2\). Step 1: Enumerate Integers by Nondecreasing Prime Factors Every integer \(n\ge 2\) has a unique factorization $n=p_1p_2\cdots p_r,\qquad p_1\le p_2\le \cdots \le p_r,$ where \(r=\Omega(n)\). Writing the prime factors in nondecreasing order removes duplicates automatically. For example, $72=2\cdot 2\cdot 2\cdot 3\cdot 3$ appears only once when the search is constrained to stay nondecreasing....

Detailed mathematical approach

Problem Summary

Let \(\Omega(n)\) denote the number of prime factors of \(n\), counted with multiplicity. For a fixed rank \(R\), define

$a_t(R)=\min\left\{x:\#\{n\le x:\Omega(n)\ge t\}\ge R\right\}.$

This means that \(a_t(R)\) is the \(R\)-th smallest positive integer whose prime-factor count is at least \(t\). The problem asks for

$a_{10^6}(10^6)\pmod{123454321}.$

A direct scan through the integers is hopeless, because the target threshold and the target rank are both one million. The implementations therefore search at a much smaller auxiliary level and then lift the answer to the final threshold by a rigorous doubling argument.

Mathematical Approach

The central idea is to work with an intermediate threshold \(c\), find the millionth number with at least \(c\) prime factors, and then prove that once this number lies below \(3^c\), every later threshold is obtained by repeated multiplication by \(2\).

Step 1: Enumerate Integers by Nondecreasing Prime Factors

Every integer \(n\ge 2\) has a unique factorization

$n=p_1p_2\cdots p_r,\qquad p_1\le p_2\le \cdots \le p_r,$

where \(r=\Omega(n)\). Writing the prime factors in nondecreasing order removes duplicates automatically. For example,

$72=2\cdot 2\cdot 2\cdot 3\cdot 3$

appears only once when the search is constrained to stay nondecreasing. A depth-first search over such ordered prime sequences therefore enumerates every integer exactly once, while the recursion depth records \(\Omega(n)\).

Step 2: Search at a Working Threshold \(c\) Under the Barrier \(3^c\)

For a temporary threshold \(c\), consider the set

$S_c(X)=\{n\le X:\Omega(n)\ge c\}.$

The implementations raise \(c\) step by step and, at the same time, set the search limit to

$X=3^c.$

They stop as soon as

$\#S_c(3^c)\ge 10^6.$

At that moment, the millionth element for threshold \(c\) satisfies

$a_c(10^6)\le 3^c.$

This inequality is the exact condition needed for the doubling lemma in the next step.

Step 3: Prove the Doubling Lemma

Assume that for some \(c\) and fixed rank \(R\) we already know

$a_c(R)\le 3^c.$

We claim that

$a_{c+1}(R)=2\,a_c(R).$

First, there are at least \(R\) numbers with \(\Omega(n)\ge c\) and \(n\le a_c(R)\). Doubling each of them gives at least \(R\) numbers with \(\Omega(n)\ge c+1\) and value at most \(2a_c(R)\). Hence

$a_{c+1}(R)\le 2a_c(R).$

For the reverse direction, suppose \(m<2a_c(R)\) and \(\Omega(m)\ge c+1\). If \(m\) were odd, then all of its prime factors would be at least \(3\), so

$m\ge 3^{c+1}>2\cdot 3^c\ge 2a_c(R),$

a contradiction. Therefore every such \(m\) must be even, and then

$\Omega\!\left(\frac{m}{2}\right)\ge c,\qquad \frac{m}{2}<a_c(R).$

So the map \(m\mapsto m/2\) sends all numbers counted below \(2a_c(R)\) at level \(c+1\) into the numbers counted below \(a_c(R)\) at level \(c\). There are at most \(R-1\) of those. Hence fewer than \(R\) numbers with \(\Omega(n)\ge c+1\) lie below \(2a_c(R)\), and together with the previous inequality this proves

$a_{c+1}(R)=2a_c(R).$

Step 4: Iterate the Lemma Up to the Final Threshold

The problem uses

$R=T=10^6.$

Once the search finds a level \(c\) with \(a_c(R)\le 3^c\), Step 3 can be applied repeatedly:

$a_{c+1}(R)=2a_c(R),\quad a_{c+2}(R)=2a_{c+1}(R),\quad \dots,\quad a_T(R)=2^{T-c}a_c(R).$

Therefore the final answer is

$a_{10^6}(10^6)\equiv a_c(10^6)\cdot 2^{10^6-c}\pmod{123454321}.$

This is why the search only needs to determine the millionth value at the smaller threshold \(c\).

Step 5: Use a Logarithmic Lower Bound to Prune the Search Tree

Suppose the recursion is currently at a product \(n\), has already used \(u\) prime factors, and the next allowed prime is at least \(p_i\). To reach threshold \(c\), we still need \(c-u\) more prime factors. The smallest possible completion therefore has size at least

$n\,p_i^{\,c-u}.$

If this lower bound already exceeds \(3^c\), then no descendant of the current state can contribute a valid value. The pruning condition is

$n\,p_i^{\,c-u}>3^c.$

The implementations evaluate the same test in logarithmic form to avoid overflow:

$\log n+(c-u)\log p_i>c\log 3.$

Because the next prime only grows as the loop advances, once the bound fails for one choice of prime it fails for all later choices too, so the loop can stop immediately.

Worked Example: Rank \(R=5\) and Threshold \(T=5\)

This small example shows the same mechanism on manageable numbers. Start with \(c=3\), so the barrier is

$3^3=27.$

The integers \(n\le 27\) with \(\Omega(n)\ge 3\) are

$8,\ 12,\ 16,\ 18,\ 20,\ 24,\ 27.$

Therefore

$a_3(5)=20\le 27.$

Now the doubling lemma applies. Hence

$a_4(5)=40,\qquad a_5(5)=80.$

So the fifth number with at least five prime factors is \(80\). This is exactly the kind of checkpoint used to confirm that the enumeration and the lifting rule agree.

How the Code Works

The C++, Python, and Java implementations follow the same pipeline. First they build a prime table up to a fixed ceiling and also store the logarithm of each prime, because the pruning rule is checked thousands of times during the depth-first search.

For a current threshold \(c\), the search limit is set to \(3^c\). The recursion then enumerates products of primes in nondecreasing order. Every time the current product already has at least \(c\) prime factors, that product is recorded as a candidate. The recursion still continues deeper, because numbers with more than \(c\) prime factors also qualify as long as they stay under the same limit.

The logarithmic lower bound from Step 5 cuts off most branches very early. Moreover, once a branch fails for a certain next prime, the implementation stops trying larger next primes from the same state, because the lower bound can only get worse.

After each run, the candidate list is checked. If it still contains fewer than one million values, the working threshold is increased by \(1\) and the search limit is multiplied by \(3\). When the list is finally large enough, it is sorted, the millionth value is extracted as \(a_c(10^6)\), and then the implementation multiplies by \(2\) exactly \(10^6-c\) times modulo \(123454321\). That last phase is equivalent to multiplying by \(2^{10^6-c}\) modulo the same number.

Complexity Analysis

Let \(B\) be the prime-sieve ceiling, let \(V\) be the number of recursive states visited at the final successful threshold, and let \(M\) be the number of collected candidates at that threshold. Building the prime table costs \(O(B\log\log B)\) time and \(O(B)\) memory. The depth-first enumeration costs \(O(V)\) time, because each surviving search state is processed once. Sorting the candidate list costs \(O(M\log M)\) time and \(O(M)\) additional storage for the list itself.

The final lifting phase performs \(10^6-c\) modular doublings, so it adds \(O(10^6-c)\) time and \(O(1)\) extra memory. Overall, the running time is

$O(B\log\log B+V+M\log M+10^6-c),$

and the memory usage is

$O(B+M).$

The dominant practical cost is the search tree size \(V\), but the lower bound from Step 5 keeps it manageable.

Footnotes and References

  1. Project Euler problem page: https://projecteuler.net/problem=615
  2. Prime omega function: Wikipedia — Prime omega function
  3. Prime factorization: Wikipedia — Prime factorization
  4. Sieve of Eratosthenes: Wikipedia — Sieve of Eratosthenes
  5. Branch and bound: Wikipedia — Branch and bound

Mathematical approach · C++ solution · Python solution · Java solution

Previous: Problem 614 · All Project Euler solutions · Next: Problem 616

Need help with a problem? Ask me! 💡
e
✦ Euler GLM 5.2
Hi! I'm Euler. Ask me anything about Project Euler problems, math concepts, or solution approaches! 🧮