Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 231: Prime Factorisation of Binomial Coefficients

View on Project Euler

Project Euler Problem 231 Solution

EulerSolve provides an optimized solution for Project Euler Problem 231, Prime Factorisation of Binomial Coefficients, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary The target quantity is the sum of the prime factors of the binomial coefficient \(\binom{20{,}000{,}000}{15{,}000{,}000}\), where each prime is counted with multiplicity. If a positive integer \(N\) has prime factorisation $N=\prod_{p \text{ prime}} p^{e_p},$ then the required weighted sum is $S(N)=\sum_{p \text{ prime}} p\,e_p.$ So for this problem we must compute $S\!\left(\binom{20{,}000{,}000}{15{,}000{,}000}\right)=\sum_{p \le 20{,}000{,}000} p\,v_p\!\left(\binom{20{,}000{,}000}{15{,}000{,}000}\right),$ where \(v_p(x)\) is the exponent of the prime \(p\) in \(x\). Directly expanding or factoring the binomial coefficient is impossible in practice, so the solution works entirely with prime exponents. Mathematical Approach The whole method rests on one idea: instead of building the enormous integer \(\binom{n}{k}\), determine how many times each prime occurs in it and add the weighted contributions \(p \cdot v_p\). Prime Exponents Inside a Binomial Coefficient Write $\binom{n}{k}=\frac{n!}{k!(n-k)!}.$ Prime exponents are additive across multiplication and subtractive across division, so for every prime \(p\), $v_p\!\left(\binom{n}{k}\right)=v_p(n!) - v_p(k!) - v_p((n-k)!).$ This is the exact cancellation pattern used by the implementations. It turns the problem into three factorial-valuation queries per prime....

Detailed mathematical approach

Problem Summary

The target quantity is the sum of the prime factors of the binomial coefficient \(\binom{20{,}000{,}000}{15{,}000{,}000}\), where each prime is counted with multiplicity. If a positive integer \(N\) has prime factorisation

$N=\prod_{p \text{ prime}} p^{e_p},$

then the required weighted sum is

$S(N)=\sum_{p \text{ prime}} p\,e_p.$

So for this problem we must compute

$S\!\left(\binom{20{,}000{,}000}{15{,}000{,}000}\right)=\sum_{p \le 20{,}000{,}000} p\,v_p\!\left(\binom{20{,}000{,}000}{15{,}000{,}000}\right),$

where \(v_p(x)\) is the exponent of the prime \(p\) in \(x\). Directly expanding or factoring the binomial coefficient is impossible in practice, so the solution works entirely with prime exponents.

Mathematical Approach

The whole method rests on one idea: instead of building the enormous integer \(\binom{n}{k}\), determine how many times each prime occurs in it and add the weighted contributions \(p \cdot v_p\).

Prime Exponents Inside a Binomial Coefficient

Write

$\binom{n}{k}=\frac{n!}{k!(n-k)!}.$

Prime exponents are additive across multiplication and subtractive across division, so for every prime \(p\),

$v_p\!\left(\binom{n}{k}\right)=v_p(n!) - v_p(k!) - v_p((n-k)!).$

This is the exact cancellation pattern used by the implementations. It turns the problem into three factorial-valuation queries per prime. The complement \(n-k\) matters naturally because the denominator contains both \(k!\) and \((n-k)!\), and the identity \(\binom{n}{k}=\binom{n}{n-k}\) explains why the two smaller factorials play symmetric roles.

Legendre's Formula for Factorials

To evaluate \(v_p(m!)\), the code uses Legendre's formula:

$v_p(m!)=\sum_{j \ge 1}\left\lfloor \frac{m}{p^j}\right\rfloor.$

Each term has a clean meaning. The term \(\left\lfloor m/p \right\rfloor\) counts the multiples of \(p\) in \(1,2,\dots,m\); the term \(\left\lfloor m/p^2 \right\rfloor\) counts the extra copies contributed by multiples of \(p^2\); the next term does the same for \(p^3\), and so on. The sum is finite because once \(p^j > m\), every later floor is zero.

Computationally, this means we can start with \(x=m\), repeatedly divide by \(p\), and add the quotients:

$v_p(m!)=\left\lfloor \frac{m}{p}\right\rfloor+\left\lfloor \frac{m}{p^2}\right\rfloor+\left\lfloor \frac{m}{p^3}\right\rfloor+\cdots.$

Turning Valuations into the Required Sum

Substituting Legendre's formula into the binomial identity gives

$v_p\!\left(\binom{n}{k}\right)=\sum_{j \ge 1}\left(\left\lfloor \frac{n}{p^j}\right\rfloor-\left\lfloor \frac{k}{p^j}\right\rfloor-\left\lfloor \frac{n-k}{p^j}\right\rfloor\right).$

Therefore

$S\!\left(\binom{n}{k}\right)=\sum_{p \le n} p \sum_{j \ge 1}\left(\left\lfloor \frac{n}{p^j}\right\rfloor-\left\lfloor \frac{k}{p^j}\right\rfloor-\left\lfloor \frac{n-k}{p^j}\right\rfloor\right).$

Only primes up to \(n\) can contribute, because no prime larger than \(n\) appears in \(n!\). That is why the first phase of the algorithm is simply to list all primes \(p \le n\) with a sieve.

Once those primes are known, every contribution is independent. For each prime \(p\), compute its exponent in \(n!\), subtract the two denominator exponents, multiply by \(p\), and add the result to the total.

Worked Example: \(\binom{10}{3}\)

The small checkpoint used by the implementations is

$\binom{10}{3}=120=2^3 \cdot 3 \cdot 5,$

so the weighted prime-factor sum should be

$S\!\left(\binom{10}{3}\right)=2 \cdot 3 + 3 \cdot 1 + 5 \cdot 1 = 14.$

Legendre's formula reproduces that value directly. For \(p=2\),

$v_2(10!)=5+2+1=8,\qquad v_2(3!)=1,\qquad v_2(7!)=3+1=4,$

hence

$v_2\!\left(\binom{10}{3}\right)=8-1-4=3.$

For \(p=3\),

$v_3(10!)=3+1=4,\qquad v_3(3!)=1,\qquad v_3(7!)=2,$

so

$v_3\!\left(\binom{10}{3}\right)=4-1-2=1.$

For \(p=5\),

$v_5(10!)=2,\qquad v_5(3!)=0,\qquad v_5(7!)=1,$

thus

$v_5\!\left(\binom{10}{3}\right)=2-0-1=1.$

And for \(p=7\), the cancellation is complete:

$v_7(10!)=1,\qquad v_7(3!)=0,\qquad v_7(7!)=1,\qquad v_7\!\left(\binom{10}{3}\right)=0.$

This example shows exactly why the method is reliable: the binomial coefficient never has to be formed explicitly, yet every prime exponent is recovered correctly.

How the Code Works

The C++, Python, and Java implementations all follow the same three-stage computation.

Generate All Relevant Primes

First, the implementation runs a sieve of Eratosthenes up to \(n\). A byte or boolean array marks which integers are still prime, composite multiples are crossed out, and the surviving indices are collected into a prime list. Because the target instance uses \(n=20{,}000{,}000\), this precomputation is large but still straightforward.

Evaluate Factorial Valuations by Repeated Division

For each prime \(p\), the implementation computes \(v_p(m!)\) by repeatedly dividing \(m\) by \(p\) and accumulating the quotients. The same routine is applied to \(n\), to \(k\), and to the complementary value \(n-k\). This is a direct implementation of Legendre's formula and avoids any factorisation of individual integers.

Accumulate the Weighted Contribution

After the three valuations are known, the exponent of \(p\) in the binomial coefficient is obtained by subtraction:

$v_p\!\left(\binom{n}{k}\right)=v_p(n!) - v_p(k!) - v_p((n-k)!).$

The implementation then adds

$p \cdot v_p\!\left(\binom{n}{k}\right)$

to a running integer total. The C++ implementation also includes a small checkpoint for \(\binom{10}{3}\), verifying that the method returns 14 before the main instance is processed. The Python and Java versions perform the same mathematics with the same default parameters.

Complexity Analysis

The sieve up to \(n\) costs \(O(n \log \log n)\) time and \(O(n)\) memory. That is the dominant memory expense, because the implementation stores one primality mark per integer up to \(n\).

After the sieve, each prime \(p\) requires \(O(\log_p n)\) divisions to evaluate Legendre's sum, so the valuation phase costs

$O\!\left(\sum_{p \le n}\log_p n\right).$

Combining both parts gives total running time

$O\!\left(n \log \log n + \sum_{p \le n}\log_p n\right),$

with memory usage \(O(n)\). For the actual problem size, the algorithm is efficient because it never manipulates the gigantic binomial coefficient itself; it only works with prime counts.

Footnotes and References

  1. Project Euler 231
  2. Binomial coefficient
  3. Legendre's formula
  4. \(p\)-adic valuation
  5. Sieve of Eratosthenes

Mathematical approach · C++ solution · Python solution · Java solution

Previous: Problem 230 · All Project Euler solutions · Next: Problem 232

Need help with a problem? Ask me! 💡
e
✦ Euler GLM 5.2
Hi! I'm Euler. Ask me anything about Project Euler problems, math concepts, or solution approaches! 🧮