Problem 920: Tau Numbers
View on Project EulerProject Euler Problem 920 Solution
EulerSolve provides an optimized solution for Project Euler Problem 920, Tau Numbers, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary A tau number is a positive integer \(n\) for which the divisor count \(\tau(n)\) divides \(n\). With the fixed limit \(L=10^{16}\), the problem asks us to look at every divisor-count value \(k\) for which there exists at least one tau number \(n\le L\) satisfying \(\tau(n)=k\), and then define $m(k)=\min\{n\le L:\tau(n)=k,\ k\mid n\}.$ The required answer is the sum of all existing values \(m(k)\). This includes the trivial case \(m(1)=1\), but the real difficulty is global: for every reachable divisor count \(k\), we must identify the smallest admissible integer rather than just produce one example. Mathematical Approach The implementations do not scan integers. They search over prime-exponent patterns, because both the divisor function and the condition \(\tau(n)\mid n\) become rigid and testable in that language. Exponent vectors are the natural state space Write a candidate number as $n=\prod_{i=1}^{r} b_i^{e_i},\qquad b_1<b_2<\cdots<b_r,\qquad e_1\ge e_2\ge \cdots\ge e_r\ge 1.$ The exponents are arranged in non-increasing order because, once the multiset of exponents is fixed, the smallest possible integer is obtained by placing larger exponents on smaller primes. In this representation, the divisor count is $\tau(n)=\prod_{i=1}^{r}(e_i+1)=:T.$ So every exponent vector determines one divisor-count value \(T\)....
Detailed mathematical approach
Problem Summary
A tau number is a positive integer \(n\) for which the divisor count \(\tau(n)\) divides \(n\). With the fixed limit \(L=10^{16}\), the problem asks us to look at every divisor-count value \(k\) for which there exists at least one tau number \(n\le L\) satisfying \(\tau(n)=k\), and then define
$m(k)=\min\{n\le L:\tau(n)=k,\ k\mid n\}.$
The required answer is the sum of all existing values \(m(k)\). This includes the trivial case \(m(1)=1\), but the real difficulty is global: for every reachable divisor count \(k\), we must identify the smallest admissible integer rather than just produce one example.
Mathematical Approach
The implementations do not scan integers. They search over prime-exponent patterns, because both the divisor function and the condition \(\tau(n)\mid n\) become rigid and testable in that language.
Exponent vectors are the natural state space
Write a candidate number as
$n=\prod_{i=1}^{r} b_i^{e_i},\qquad b_1<b_2<\cdots<b_r,\qquad e_1\ge e_2\ge \cdots\ge e_r\ge 1.$
The exponents are arranged in non-increasing order because, once the multiset of exponents is fixed, the smallest possible integer is obtained by placing larger exponents on smaller primes. In this representation, the divisor count is
$\tau(n)=\prod_{i=1}^{r}(e_i+1)=:T.$
So every exponent vector determines one divisor-count value \(T\). Different vectors can yield the same \(T\), which is why the search must keep the best number found for each divisor-count key.
The canonical lower bound that drives pruning
For the same exponent vector, the smallest possible realization is obtained by using the first \(r\) primes:
$n_{\min}(e_1,\dots,e_r)=\prod_{i=1}^{r} p_i^{e_i},$
where \(p_i\) is the \(i\)-th prime. This number does not necessarily satisfy \(\tau(n)\mid n\); it is only the minimal number having that exponent multiset. But it is still crucial, because if even \(n_{\min}(e_1,\dots,e_r)>L\), then every other realization of the same vector is larger, so the entire branch can be discarded immediately.
This lower bound is exactly what the outer depth-first search maintains incrementally. It is the main invariant that makes the global enumeration finite.
What \(\tau(n)\mid n\) forces on the prime bases
Now factor the divisor count itself:
$T=\prod_{j=1}^{t} q_j^{\nu_j},$
with distinct primes \(q_j\). Since \(T\mid n\), every prime divisor \(q_j\) of \(T\) must also occur among the prime bases of \(n\), and its exponent inside \(n\) must be at least \(\nu_j\).
For a fixed exponent vector, this turns into an injective placement problem. Each mandatory prime \(q_j\) has to be assigned to a distinct exponent slot \(i\) satisfying
$e_i\ge \nu_j.$
If there are more mandatory primes than slots, so \(t>r\), the vector is impossible at once. Likewise, if the largest exponent is still smaller than one of the required values \(\nu_j\), no assignment can work.
Minimizing one fixed exponent pattern
Suppose a feasible assignment \(\sigma\) of the mandatory primes has been chosen. Then the resulting number has the form
$N_\sigma=\left(\prod_{j=1}^{t} q_j^{e_{\sigma(j)}}\right)\left(\prod_{i\notin \operatorname{Im}(\sigma)} s_i^{e_i}\right),$
where the remaining bases \(s_i\) are distinct primes not dividing \(T\). Once the mandatory primes are placed, the rest of the completion is forced greedily: fill the unused slots with the smallest available extra primes, matched from largest remaining exponent to smallest remaining exponent.
The justification is the swap inequality
$a^x b^y\le a^y b^x\qquad(a<b,\ x\ge y),$
which says that smaller primes belong on larger exponents. Therefore the only real branching for one vector is the placement of the mandatory primes. After that, the optimal completion is deterministic.
The best value attached to one exponent vector is therefore
$M(e_1,\dots,e_r)=\min_{\sigma} N_\sigma,$
and the desired minimum for a divisor-count value \(k\) is
$m(k)=\min_{\prod_{i=1}^{r}(e_i+1)=k} M(e_1,\dots,e_r).$
Why the global search is complete
The outer DFS builds exponent vectors in non-increasing order. Starting from the smallest prime, each new exponent is constrained not to exceed the previous one, so every exponent multiset is visited exactly once in canonical form.
At each node, two quantities are already known:
$\prod_{i=1}^{r} p_i^{e_i},\qquad T=\prod_{i=1}^{r}(e_i+1).$
The first is the pruning bound just discussed; the second is the divisor-count value whose current minimum may be improved. Because many different exponent vectors can lead to the same \(T\), the algorithm stores the best candidate found so far for each \(T\). It also caches the prime factorization of \(T\), since the same divisor count appears repeatedly during the search.
Worked Example: \(k=12\)
Take \(k=12\). Since
$12=2^2\cdot 3,$
any tau number \(n\) with \(\tau(n)=12\) must contain the prime \(2\) with exponent at least \(2\), and it must also contain the prime \(3\).
The exponent vectors satisfying
$\prod(e_i+1)=12$
are
$[11],\qquad [5,1],\qquad [3,2],\qquad [2,1,1].$
The vector \([11]\) is impossible because one slot cannot host both mandatory primes. For \([5,1]\), the smallest feasible number is
$2^5\cdot 3=96.$
For \([3,2]\), the best placement gives
$2^3\cdot 3^2=72.$
For \([2,1,1]\), the mandatory primes use the exponents \(2\) and \(1\), and the remaining slot receives the smallest extra prime not dividing \(12\), namely \(5\), producing
$2^2\cdot 3\cdot 5=60.$
Hence
$m(12)=60.$
This is exactly the comparison performed by the algorithm for every reachable divisor-count value.
How the Code Works
Outer enumeration and cached divisor factorizations
The C++, Python, and Java implementations first generate a sufficiently long prime table. Then they run an outer DFS over exponent vectors, maintaining the canonical product \(\prod p_i^{e_i}\) and the current divisor count \(T\) incrementally. Every visited node already represents a complete exponent pattern, so it is evaluated immediately as a candidate for its current \(T\).
Whenever a node is visited, the implementation looks up or computes the prime factorization of \(T\) and reuses that data through a cache. Capped multiplication is used throughout this stage so that any branch whose product would exceed \(L\) stops immediately without overflow.
Inner assignment search, greedy completion, and symmetry pruning
For the current exponent vector, a second DFS assigns the mandatory primes from the factorization of \(T\) to eligible slots. Those mandatory primes are tried in descending prime order so that expensive branches fail earlier, which strengthens pruning. If two unused slots have the same exponent, exploring both would only repeat a symmetric case, so the implementation keeps just one representative.
Once all mandatory primes have been placed, the unused slots are filled greedily with the smallest primes that do not divide \(T\). The resulting number is compared against the best value already stored for that divisor-count key, and the map is updated only if an improvement has been found.
Complexity Analysis
There is no simple closed form for the total running time, because the outer search size depends on the number of non-increasing exponent vectors whose canonical realization satisfies
$\prod_{i=1}^{r} p_i^{e_i}\le L.$
If we denote that set by \(V(L)\), then the outer DFS visits each vector in \(V(L)\) exactly once.
For one vector with \(r\) exponent slots and \(t\) distinct prime factors in \(T\), the inner search is a bounded backtracking procedure. Its naive worst case is exponential in \(t\), but in practice the tree is cut down sharply by the slot-feasibility test, the limit \(L\), the current best completion for the vector, the greedy tail fill, and the elimination of equal-exponent symmetries. Memory usage stays moderate: a prime table, a cache of divisor-count factorizations, recursion stacks, and the map of best values \(m(k)\).
Footnotes and References
- Project Euler problem page: https://projecteuler.net/problem=920
- Divisor function: Wikipedia - Divisor function
- Fundamental theorem of arithmetic: Wikipedia - Fundamental theorem of arithmetic
- Backtracking: Wikipedia - Backtracking
- Branch and bound: Wikipedia - Branch and bound