Problem 29: Distinct Powers
View on Project EulerProject Euler Problem 29 Solution
EulerSolve provides an optimized solution for Project Euler Problem 29, Distinct Powers, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary Problem 29 asks for the number of distinct terms among \(a^b\) with \(2 \le a \le 100\) and \(2 \le b \le 100\). More generally, if we write $P(A,B)=\{a^b \mid 2 \le a \le A,\ 2 \le b \le B\},$ the task is to determine the cardinality \(|P(A,B)|\). The difficulty is duplication. Different pairs \((a,b)\) can represent the same integer, such as \(2^4=4^2\) and \(2^6=4^3=8^2\). A correct solution therefore needs an exact way to recognize when two powers are equal. Mathematical Approach The mathematics has two complementary parts. Unique factorization explains exactly why collisions occur, and the small search range explains why the implementation can simply generate every power exactly and let a set remove duplicates automatically. Canonical Form of a Base Write the prime factorization of a base \(a\) as $a=\prod_{i=1}^r p_i^{e_i}.$ If $g=\gcd(e_1,e_2,\dots,e_r),$ then we can factor out the common exponent: $a=\left(\prod_{i=1}^r p_i^{e_i/g}\right)^g=c^g.$ The number \(c\) is not a perfect power, because the exponents in its own prime factorization now have gcd \(1\). This gives a unique representation \(a=c^g\) with a minimal base \(c\) and an exponent \(g \ge 1\). Examples are \(4=2^2\), \(8=2^3\), \(16=2^4\), \(27=3^3\), while \(6\) is already in minimal form and must be written as \(6^1\)....
Detailed mathematical approach
Problem Summary
Problem 29 asks for the number of distinct terms among \(a^b\) with \(2 \le a \le 100\) and \(2 \le b \le 100\). More generally, if we write
$P(A,B)=\{a^b \mid 2 \le a \le A,\ 2 \le b \le B\},$
the task is to determine the cardinality \(|P(A,B)|\).
The difficulty is duplication. Different pairs \((a,b)\) can represent the same integer, such as \(2^4=4^2\) and \(2^6=4^3=8^2\). A correct solution therefore needs an exact way to recognize when two powers are equal.
Mathematical Approach
The mathematics has two complementary parts. Unique factorization explains exactly why collisions occur, and the small search range explains why the implementation can simply generate every power exactly and let a set remove duplicates automatically.
Canonical Form of a Base
Write the prime factorization of a base \(a\) as
$a=\prod_{i=1}^r p_i^{e_i}.$
If
$g=\gcd(e_1,e_2,\dots,e_r),$
then we can factor out the common exponent:
$a=\left(\prod_{i=1}^r p_i^{e_i/g}\right)^g=c^g.$
The number \(c\) is not a perfect power, because the exponents in its own prime factorization now have gcd \(1\). This gives a unique representation \(a=c^g\) with a minimal base \(c\) and an exponent \(g \ge 1\).
Examples are \(4=2^2\), \(8=2^3\), \(16=2^4\), \(27=3^3\), while \(6\) is already in minimal form and must be written as \(6^1\).
When Two Powers Coincide
Suppose two bases are written in minimal form:
$a_1=c_1^{k_1},\qquad a_2=c_2^{k_2}.$
Then
$a_1^{b_1}=c_1^{k_1b_1},\qquad a_2^{b_2}=c_2^{k_2b_2}.$
By the fundamental theorem of arithmetic, these two values are equal if and only if the primitive bases agree and the total exponents agree:
$a_1^{b_1}=a_2^{b_2}\iff c_1=c_2,\ k_1b_1=k_2b_2.$
So collisions never mix different minimal bases. They occur only inside the same perfect-power family. The family generated by minimal base \(2\), for instance, contains \(2,4,8,16,32,64\), and every one of those bases contributes values of the form \(2^m\).
A Worked Example: \(A=B=5\)
For \(2 \le a,b \le 5\), there are initially \(4 \cdot 4 = 16\) candidate pairs. Among the bases, \(2\) and \(4=2^2\) share minimal base \(2\), while \(3\) and \(5\) are already minimal.
The values coming from the minimal base \(2\) are
$2^2,2^3,2^4,2^5,\qquad 4^2=2^4,\ 4^3=2^6,\ 4^4=2^8,\ 4^5=2^{10}.$
The distinct exponents here are \(\{2,3,4,5,6,8,10\}\), so this family contributes \(7\) distinct values. The minimal base \(3\) contributes \(3^2,3^3,3^4,3^5\), and the minimal base \(5\) contributes \(5^2,5^3,5^4,5^5\). Those two families are disjoint from everything else, so
$|P(5,5)|=7+4+4=15.$
At the pair-count level, the only duplication is the collision \(2^4=4^2\).
Why Exact Enumeration Is Enough Here
One could count these families analytically, but the given bounds are small. The full Project Euler instance has only
$99 \cdot 99 = 9801$
candidate powers. The largest of them is
$100^{100}=10^{200},$
which is large but still routine for arbitrary-precision integer types. Because integer equality is exact, storing every computed power in a set gives the correct cardinality without any special duplicate-removal formula.
How the Code Works
Incremental Power Generation
For each base \(a\), the implementation starts from \(1\) and repeatedly multiplies by \(a\). After one multiplication the current value is \(a\), after two it is \(a^2\), and if we denote the stored value after \(b\) multiplications by \(v_b\), the loop invariant is
$v_b=a^b.$
This avoids recomputing powers from scratch. Once \(b \ge 2\), the current value belongs to \(P(A,B)\), so it is inserted into the deduplication container.
Deduplication by Exact Integer Comparison
The C++, Python, and Java implementations all use exact integer arithmetic, so a collision such as \(4^5=2^{10}\) produces literally the same stored integer in every language. The container therefore keeps only one copy of each value, and after all pairs \((a,b)\) have been processed, its size is exactly \(|P(A,B)|\).
The three versions differ only in the concrete container type: the C++ implementation uses an ordered set of arbitrary-precision integers, while the Python and Java implementations use exact big integers inside hash-based sets. They also include small checkpoints that verify \(|P(5,5)|=15\) and \(|P(10,10)|=69\).
Complexity Analysis
Let \(N=(A-1)(B-1)\) be the number of generated powers. The value \(a^b\) has \(\Theta(b\log a)\) bits, so the cost of multiplying and storing it grows with the exponent. A high-level digit-cost model gives
$O\!\left(\sum_{a=2}^{A}\sum_{b=2}^{B} b\log a\right)=O(AB^2\log A)$
for the arithmetic work needed to build all values incrementally.
On top of that, each insertion pays container overhead: an ordered-set lookup in the C++ version and expected constant-time hashing in the Python and Java versions, all applied to arbitrary-precision integers. Space usage is proportional to the number of distinct powers kept in the set. For the Euler bounds this remains modest: at most 9801 values are generated, and even the largest one has only 201 decimal digits.
Footnotes and References
- Project Euler problem page: https://projecteuler.net/problem=29
- Perfect power: Wikipedia - Perfect power
- Fundamental theorem of arithmetic: Wikipedia - Fundamental theorem of arithmetic
- Arbitrary-precision arithmetic: Wikipedia - Arbitrary-precision arithmetic
- OEIS sequence for distinct powers: OEIS A117805