Problem 365: A Huge Binomial Coefficient
View on Project EulerProject Euler Problem 365 Solution
EulerSolve provides an optimized solution for Project Euler Problem 365, A Huge Binomial Coefficient, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary Define $N=10^{18},\qquad K=10^9,\qquad B=\binom{N}{K}.$ For every prime \(p\) with \(1000 \lt p \lt 5000\), we need the residue \(B \bmod p\). Then, for every triple of distinct primes \(p \lt q \lt r\) in that interval, we reconstruct the unique number \(x_{pqr}\) with $0 \le x_{pqr} \lt pqr,\qquad x_{pqr}\equiv B \pmod{p},\qquad x_{pqr}\equiv B \pmod{q},\qquad x_{pqr}\equiv B \pmod{r}.$ The required answer is the sum of all these reconstructed values. The sieve used in the code finds exactly \(501\) primes in the interval, so the outer summation runs over $\binom{501}{3}=20833250$ prime triples. Mathematical Approach Step 1: Lucas's Theorem Reduces the Huge Binomial For a fixed prime \(p\), write \(N\) and \(K\) in base \(p\): $N=\sum_{t=0}^{s} N_t p^t,\qquad K=\sum_{t=0}^{s} K_t p^t,\qquad 0 \le N_t,K_t \lt p.$ Lucas's theorem gives the congruence $\binom{N}{K}\equiv \prod_{t=0}^{s}\binom{N_t}{K_t}\pmod{p}.$ If some digit satisfies \(K_t \gt N_t\), then \(\binom{N_t}{K_t}=0\), so the whole product is \(0\pmod p\). This is exactly why the implementation stops immediately and returns zero in that case. The interval \(1000 \lt p \lt 5000\) makes the digit loop very short. Because \(p \gt 1000\), the number \(10^{18}\) has at most six base-\(p\) digits, and \(10^9\) has at most three....
Detailed mathematical approach
Problem Summary
Define
$N=10^{18},\qquad K=10^9,\qquad B=\binom{N}{K}.$
For every prime \(p\) with \(1000 \lt p \lt 5000\), we need the residue \(B \bmod p\). Then, for every triple of distinct primes \(p \lt q \lt r\) in that interval, we reconstruct the unique number \(x_{pqr}\) with
$0 \le x_{pqr} \lt pqr,\qquad x_{pqr}\equiv B \pmod{p},\qquad x_{pqr}\equiv B \pmod{q},\qquad x_{pqr}\equiv B \pmod{r}.$
The required answer is the sum of all these reconstructed values. The sieve used in the code finds exactly \(501\) primes in the interval, so the outer summation runs over
$\binom{501}{3}=20833250$
prime triples.
Mathematical Approach
Step 1: Lucas's Theorem Reduces the Huge Binomial
For a fixed prime \(p\), write \(N\) and \(K\) in base \(p\):
$N=\sum_{t=0}^{s} N_t p^t,\qquad K=\sum_{t=0}^{s} K_t p^t,\qquad 0 \le N_t,K_t \lt p.$
Lucas's theorem gives the congruence
$\binom{N}{K}\equiv \prod_{t=0}^{s}\binom{N_t}{K_t}\pmod{p}.$
If some digit satisfies \(K_t \gt N_t\), then \(\binom{N_t}{K_t}=0\), so the whole product is \(0\pmod p\). This is exactly why the implementation stops immediately and returns zero in that case.
The interval \(1000 \lt p \lt 5000\) makes the digit loop very short. Because \(p \gt 1000\), the number \(10^{18}\) has at most six base-\(p\) digits, and \(10^9\) has at most three. So each residue \(B \bmod p\) is computed from only a handful of digit-level binomials.
Step 2: Each Digit-Level Binomial Is Small
For \(0 \le b \le a \lt p\), we evaluate
$\binom{a}{b}=\frac{a!}{b!(a-b)!}\pmod{p}.$
Because \(p\) is prime and \(a,b \lt p\), the denominator is invertible modulo \(p\). The code therefore precomputes
$\text{fact}[i]=i!\pmod p,\qquad \text{invFact}[i]=(i!)^{-1}\pmod p,$
and then uses
$\binom{a}{b}\equiv \text{fact}[a]\cdot \text{invFact}[b]\cdot \text{invFact}[a-b]\pmod p.$
The modular inverses come from Fermat's little theorem:
$x^{p-1}\equiv 1\pmod p\quad\Longrightarrow\quad x^{-1}\equiv x^{p-2}\pmod p,$
which is why all three implementations contain a fast modular exponentiation routine.
Worked Lucas Checkpoint
The C++ code verifies Lucas's theorem on the smaller example \(\binom{30}{12}\). Take \(p=11\). In base \(11\),
$30=2\cdot 11+8,\qquad 12=1\cdot 11+1.$
Lucas gives
$\binom{30}{12}\equiv \binom{8}{1}\binom{2}{1}=8\cdot 2=16\equiv 5\pmod{11}.$
The exact value is \(\binom{30}{12}=86493225\), and indeed \(86493225 \equiv 5 \pmod{11}\). The checkpoint in the local source repeats this comparison for several primes.
Step 3: Reconstruct One Triple by the Chinese Remainder Theorem
Suppose we already know
$a\equiv B\pmod p,\qquad b\equiv B\pmod q,\qquad c\equiv B\pmod r,$
with \(p,q,r\) distinct primes. Since these moduli are pairwise coprime, the Chinese Remainder Theorem guarantees a unique residue modulo \(pqr\).
The code uses a Garner-style two-stage reconstruction. First write
$x_{pq}=a+p\,t.$
To make this also congruent to \(b\pmod q\), we need
$a+p\,t\equiv b\pmod q,$
so
$t\equiv (b-a)\,p^{-1}\pmod q.$
Hence
$x_{pq}=a+p\left((b-a)\,p^{-1}\bmod q\right),\qquad 0 \le x_{pq} \lt pq.$
Now lift once more by writing
$x=x_{pq}+pq\,u.$
Imposing \(x\equiv c\pmod r\) yields
$u\equiv (c-x_{pq})(pq)^{-1}\pmod r,$
and therefore
$x=x_{pq}+pq\left((c-x_{pq})(pq)^{-1}\bmod r\right),\qquad 0 \le x \lt pqr.$
This final \(x\) is the value denoted \(x_{pqr}\) in the statement.
Step 4: Why the Precomputed Inverse Table Is Enough
The implementations store every pairwise inverse
$\text{inv}[i][j]\equiv p_i^{-1}\pmod{p_j}.$
Then the inverse of a product is obtained for free:
$ (pq)^{-1}\equiv p^{-1}q^{-1}\pmod r.$
So once the table of pairwise inverses has been built, each triple reconstruction uses only a few modular additions and multiplications. There is no extended Euclidean algorithm inside the cubic loop.
Worked CRT Checkpoint
The local C++ checkpoints also test the CRT stage on \(\binom{20}{8}=125970\) and the tiny prime set \(\{7,11,13,17\}\). For the triple \((7,11,13)\), the exact residues are
$125970\equiv 5\pmod 7,\qquad 125970\equiv 9\pmod{11},\qquad 125970\equiv 0\pmod{13}.$
First combine \(7\) and \(11\):
$t\equiv (9-5)\cdot 7^{-1}\equiv 4\cdot 8\equiv 10\pmod{11},$
so
$x_{7,11}=5+7\cdot 10=75.$
Next combine with \(13\): since \(75\equiv 10\pmod{13}\) and \(77\equiv 12\pmod{13}\),
$u\equiv (0-10)\cdot 12^{-1}\equiv 3\cdot 12\equiv 10\pmod{13},$
which gives
$x_{7,11,13}=75+77\cdot 10=845.$
This matches the direct reduction \(125970\bmod(7\cdot 11\cdot 13)=845\). Summing the four tiny triples \((7,11,13)\), \((7,11,17)\), \((7,13,17)\), and \((11,13,17)\) gives the checkpoint total \(3803\) used by the source.
Step 5: Final Summation
If \(\mathcal P\) denotes the set of primes between \(1000\) and \(5000\), the target quantity is
$\boxed{S=\sum_{p \lt q \lt r,\; p,q,r\in\mathcal P} x_{pqr}.}$
Running the supplied implementations yields
$S=162619462356610313.$
How the Code Works
The three language versions follow the same structure. They first generate the prime list with a sieve, then compute every Lucas residue \(B \bmod p\), then precompute all pairwise inverses \(p_i^{-1}\bmod p_j\), and finally iterate over all prime triples. The C++ version accumulates the sum in unsigned __int128, the Java version uses BigInteger, and the Python version relies on Python's built-in arbitrary-precision integers.
Complexity Analysis
Let \(m=501\) be the number of primes in the interval. Sieve construction up to \(5000\) is negligible, roughly \(O(5000\log\log 5000)\). The Lucas preprocessing costs
$\sum_{p\in\mathcal P} O(p+\log_p N),$
because each prime builds factorial and inverse-factorial tables of size \(p\), while the digit loop is tiny. Precomputing pairwise inverses costs \(O(m^2)\) time and memory. The dominant stage is the triple loop, which performs
$\binom{m}{3}=\binom{501}{3}=20833250$
constant-time CRT merges. Therefore the overall running time is \(O(m^3)\), and the memory usage is \(O(m^2)\) because of the inverse matrix.
Footnotes and References
- Problem page: https://projecteuler.net/problem=365
- Lucas's theorem: Wikipedia — Lucas's theorem
- Chinese remainder theorem: Wikipedia — Chinese remainder theorem
- Fermat's little theorem: Wikipedia — Fermat's little theorem
- Garner-style reconstruction: cp-algorithms — Chinese remainder theorem