Problem 704: Factors of Two in Binomial Coefficients
View on Project EulerProject Euler Problem 704 Solution
EulerSolve provides an optimized solution for Project Euler Problem 704, Factors of Two in Binomial Coefficients, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary For each integer \(n\ge 1\), define $F(n)=\max_{0\le m\le n} v_2\binom{n}{m},$ where \(v_2(x)\) is the exponent of 2 in the factorization of \(x\). The required quantity is $S(N)=\sum_{n=1}^{N}F(n).$ A direct search over all binomial coefficients is far too slow when \(N\) is huge, so the solution replaces the optimization over \(m\) by a binary carry argument and then sums entire dyadic blocks at once. Mathematical Approach Let \(s_2(x)\) be the number of 1-bits in the binary expansion of \(x\). For the prime \(2\), Kummer's theorem gives $v_2\binom{n}{m}=s_2(m)+s_2(n-m)-s_2(n),$ which is also the number of carries produced when \(m\) and \(n-m\) are added in base 2. Therefore \(F(n)\) is the largest carry count obtainable by writing \(n\) as a sum of two nonnegative integers. Step 1: Convert the problem into a carry-counting question Fix \(n\ge 1\) and write $k=\lfloor\log_2 n\rfloor,$ so \(2^k\le n<2^{k+1}\). Also define $r=v_2(n+1).$ The number \(r\) is exactly the number of trailing 1s in the binary expansion of \(n\). When we add \(m\) and \(n-m\), no carry can start in those lowest \(r\) positions. At bit 0 there is no incoming carry, and at any bit whose output digit is 1, a carry-out is only possible if a carry-in already existed. By induction, the entire trailing block of 1s is carry-free....
Detailed mathematical approach
Problem Summary
For each integer \(n\ge 1\), define
$F(n)=\max_{0\le m\le n} v_2\binom{n}{m},$
where \(v_2(x)\) is the exponent of 2 in the factorization of \(x\). The required quantity is
$S(N)=\sum_{n=1}^{N}F(n).$
A direct search over all binomial coefficients is far too slow when \(N\) is huge, so the solution replaces the optimization over \(m\) by a binary carry argument and then sums entire dyadic blocks at once.
Mathematical Approach
Let \(s_2(x)\) be the number of 1-bits in the binary expansion of \(x\). For the prime \(2\), Kummer's theorem gives
$v_2\binom{n}{m}=s_2(m)+s_2(n-m)-s_2(n),$
which is also the number of carries produced when \(m\) and \(n-m\) are added in base 2. Therefore \(F(n)\) is the largest carry count obtainable by writing \(n\) as a sum of two nonnegative integers.
Step 1: Convert the problem into a carry-counting question
Fix \(n\ge 1\) and write
$k=\lfloor\log_2 n\rfloor,$
so \(2^k\le n<2^{k+1}\). Also define
$r=v_2(n+1).$
The number \(r\) is exactly the number of trailing 1s in the binary expansion of \(n\). When we add \(m\) and \(n-m\), no carry can start in those lowest \(r\) positions. At bit 0 there is no incoming carry, and at any bit whose output digit is 1, a carry-out is only possible if a carry-in already existed. By induction, the entire trailing block of 1s is carry-free.
That leaves only the positions \(r,r+1,\dots,k-1\), so
$F(n)\le k-\min(r,k).$
If \(n=2^{k+1}-1\), then \(r=k+1\) and this upper bound already says \(F(n)\le 0\), hence \(F(n)=0\).
Step 2: Build a choice of \(m\) that reaches the upper bound
Assume now that \(n\neq 2^{k+1}-1\), so \(r\le k\). Choose
$m=2^k-1.$
This number has its lower \(k\) bits equal to 1, and
$n-m=n+1-2^k.$
In binary subtraction, subtracting \(2^k-1\) from \(n\) forces a borrow to start exactly at the first 0 above the trailing block of \(r\) ones, and that borrow propagates through every higher position up to bit \(k-1\). So the number of borrows, and equivalently the number of carries in the matching addition, is exactly \(k-r\).
The same conclusion follows from the digit-sum identity:
$\begin{aligned} v_2\binom{n}{2^k-1} &=s_2(2^k-1)+s_2(n+1-2^k)-s_2(n)\\ &=k+\bigl(s_2(n)-r\bigr)-s_2(n)=k-r. \end{aligned}$
Combining this construction with the upper bound gives the exact closed form
$\boxed{F(n)=k-\min\bigl(v_2(n+1),k\bigr).}$
Step 3: Sum a complete dyadic block
Now consider one full block
$n\in[2^k,2^{k+1}-1].$
Set \(x=n+1\), so \(x\in[2^k+1,2^{k+1}]\). Then
$F(n)=k-\min(v_2(x),k).$
Therefore
$\sum_{n=2^k}^{2^{k+1}-1}F(n)=k2^k-\sum_{x=2^k+1}^{2^{k+1}}\min(v_2(x),k).$
Use the standard valuation-counting identity
$\sum_{x=A}^{B}\min(v_2(x),k)=\sum_{t=1}^{k}\#\{x\in[A,B]:2^t\mid x\}.$
Inside \([2^k+1,2^{k+1}]\), the number of multiples of \(2^t\) is exactly \(2^{k-t}\). Hence
$\sum_{x=2^k+1}^{2^{k+1}}\min(v_2(x),k)=\sum_{t=1}^{k}2^{k-t}=2^k-1,$
and the block total becomes
$\boxed{\sum_{n=2^k}^{2^{k+1}-1}F(n)=(k-1)2^k+1.}$
Step 4: Deal with the final partial block
Let
$K=\lfloor\log_2 N\rfloor.$
All blocks with indices \(1,2,\dots,K-1\) are complete. The last block is
$n\in[2^K,N],$
whose length is
$M=N-2^K+1.$
Again put \(x=n+1\), so \(x\in[2^K+1,N+1]\). Then
$\sum_{n=2^K}^{N}F(n)=KM-\sum_{x=2^K+1}^{N+1}\min(v_2(x),K).$
Counting multiples of \(2^t\) gives
$\sum_{x=2^K+1}^{N+1}\min(v_2(x),K)=\sum_{t=1}^{K}\left(\left\lfloor\frac{N+1}{2^t}\right\rfloor-\left\lfloor\frac{2^K}{2^t}\right\rfloor\right).$
Since \(\left\lfloor 2^K/2^t\right\rfloor=2^{K-t}\) for \(1\le t\le K\), we obtain the exact formula
$\boxed{S(N)=\sum_{k=1}^{K-1}\bigl((k-1)2^k+1\bigr)+K(N-2^K+1)-\sum_{t=1}^{K}\left(\left\lfloor\frac{N+1}{2^t}\right\rfloor-2^{K-t}\right).}$
No approximation is involved; every term comes from exact carry counting.
Worked Example: \(N=100\)
Here \(K=6\) because \(64\le 100<128\). The complete blocks are \(k=1\) through \(5\), so their contributions are
$\begin{aligned} k=1&:&&(1-1)2^1+1=1,\\ k=2&:&&(2-1)2^2+1=5,\\ k=3&:&&(3-1)2^3+1=17,\\ k=4&:&&(4-1)2^4+1=49,\\ k=5&:&&(5-1)2^5+1=129. \end{aligned}$
These add up to \(201\).
The final partial block is \(n\in[64,100]\), so \(M=37\) and \(x\in[65,101]\). In that interval, the counts of multiples of powers of 2 are
$\begin{aligned} 2^1&:&&18,\\ 2^2&:&&9,\\ 2^3&:&&4,\\ 2^4&:&&2,\\ 2^5&:&&1,\\ 2^6&:&&0. \end{aligned}$
So
$\sum_{x=65}^{101}\min(v_2(x),6)=18+9+4+2+1=34,$
and the tail contributes
$6\cdot 37-34=188.$
Therefore
$S(100)=201+188=389,$
which matches the check used by the implementation.
How the Code Works
The C++, Python, and Java implementations all follow the same structure. First they determine the index \(K\) of the highest set bit of \(N\). Then they sum every complete dyadic block with the closed formula \((k-1)2^k+1\).
After that, they process the last partial block \([2^K,N]\). Instead of evaluating \(F(n)\) one by one, the implementation counts how many numbers in \(x\in[2^K+1,N+1]\) are divisible by \(2,4,8,\dots,2^K\). The sum of those counts is exactly \(\sum \min(v_2(x),K)\), so subtracting it from \(K(N-2^K+1)\) yields the remaining contribution.
The arithmetic is exact in all three languages. Python uses arbitrary-precision integers automatically, Java uses arbitrary-precision arithmetic for the final accumulation, and the C++ version uses a wide integer accumulator for the same purpose.
Complexity Analysis
Let \(K=\lfloor\log_2 N\rfloor\). The complete-block summation takes \(K-1\) iterations, and the valuation-counting loop for the tail takes \(K\) more. So the total running time is
$O(\log N),$
and the extra memory usage is
$O(1).$
Footnotes and References
- Problem page: https://projecteuler.net/problem=704
- Kummer's theorem: Wikipedia — Kummer's theorem
- \(p\)-adic valuation: Wikipedia — \(p\)-adic valuation
- Binomial coefficient: Wikipedia — Binomial coefficient
- Hamming weight: Wikipedia — Hamming weight