Problem 689: Binary Series
View on Project EulerProject Euler Problem 689 Solution
EulerSolve provides an optimized solution for Project Euler Problem 689, Binary Series, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary Let $Z=\sum_{n\ge 1}\frac{B_n}{n^2},\qquad B_n\in\{0,1\},\qquad \Pr(B_n=0)=\Pr(B_n=1)=\tfrac12.$ The problem asks for the tail probability $p(a)=\Pr(Z\gt a)$ at \(a=0.5\), to high precision. Since \(0\le Z\le \zeta(2)=\pi^2/6\) and the series contains infinitely many independent bits, direct enumeration is hopeless. The implementation instead rewrites the distribution in a symmetric form, derives its characteristic function, and then recovers the probability by Fourier inversion. Mathematical Approach The whole method is a clean chain: center the random series, compute the characteristic function term by term, convert the tail probability into a one-dimensional integral, and then approximate that integral numerically. Step 1: Center the Series and Identify the Range Each random bit contributes either \(0\) or \(1/n^2\), so its mean is \(1/(2n^2)\). Therefore $\mu=\mathbb E[Z]=\sum_{n\ge 1}\frac{1}{2n^2}=\frac{\zeta(2)}{2}=\frac{\pi^2}{12}.$ Write \(B_n=\frac{1+\varepsilon_n}{2}\), where \(\varepsilon_n\in\{-1,+1\}\) with equal probability. Then $Z=\mu+\sum_{n\ge 1}\frac{\varepsilon_n}{2n^2}.$ So the centered variable is $Y=Z-\mu=\sum_{n\ge 1}\frac{\varepsilon_n}{2n^2},$ which is symmetric around \(0\)....
Detailed mathematical approach
Problem Summary
Let
$Z=\sum_{n\ge 1}\frac{B_n}{n^2},\qquad B_n\in\{0,1\},\qquad \Pr(B_n=0)=\Pr(B_n=1)=\tfrac12.$
The problem asks for the tail probability
$p(a)=\Pr(Z\gt a)$
at \(a=0.5\), to high precision. Since \(0\le Z\le \zeta(2)=\pi^2/6\) and the series contains infinitely many independent bits, direct enumeration is hopeless. The implementation instead rewrites the distribution in a symmetric form, derives its characteristic function, and then recovers the probability by Fourier inversion.
Mathematical Approach
The whole method is a clean chain: center the random series, compute the characteristic function term by term, convert the tail probability into a one-dimensional integral, and then approximate that integral numerically.
Step 1: Center the Series and Identify the Range
Each random bit contributes either \(0\) or \(1/n^2\), so its mean is \(1/(2n^2)\). Therefore
$\mu=\mathbb E[Z]=\sum_{n\ge 1}\frac{1}{2n^2}=\frac{\zeta(2)}{2}=\frac{\pi^2}{12}.$
Write \(B_n=\frac{1+\varepsilon_n}{2}\), where \(\varepsilon_n\in\{-1,+1\}\) with equal probability. Then
$Z=\mu+\sum_{n\ge 1}\frac{\varepsilon_n}{2n^2}.$
So the centered variable is
$Y=Z-\mu=\sum_{n\ge 1}\frac{\varepsilon_n}{2n^2},$
which is symmetric around \(0\). Also every term is nonnegative and the largest possible sum is \(\zeta(2)\), so
$a\lt 0 \implies p(a)=1,\qquad a\ge \zeta(2)\implies p(a)=0.$
Step 2: Derive the Characteristic Function
For a fixed \(n\), the centered summand has characteristic function
$\mathbb E\left[e^{it\varepsilon_n/(2n^2)}\right]=\frac{e^{it/(2n^2)}+e^{-it/(2n^2)}}{2}=\cos\left(\frac{t}{2n^2}\right).$
The summands are independent, so the characteristic function of the full series factorizes into an infinite product:
$\varphi_Y(t)=\prod_{n\ge 1}\cos\left(\frac{t}{2n^2}\right).$
This function is real and even, reflecting the symmetry \(Y\overset{d}= -Y\). That symmetry is the reason the final inversion formula contains only a sine factor and no imaginary part to evaluate separately.
Step 3: Convert the Tail Probability into an Integral
Set
$x=\mu-a.$
Then \(p(a)=\Pr(Y\gt -x)\). Gil-Pelaez inversion gives
$p(a)=\frac12+\frac{1}{\pi}\int_0^{\infty}\frac{\sin(tx)}{t}\,\varphi_Y(t)\,dt.$
So an infinite random sum has been reduced to a deterministic one-dimensional integral. The hard combinatorics disappear; the remaining work is numerical analysis.
Step 4: Truncate the Infinite Product and the Infinite Interval
The implementation cannot keep infinitely many cosine factors, so it replaces \(\varphi_Y(t)\) by
$\varphi_{Y,N}(t)=\prod_{n=1}^{N}\cos\left(\frac{t}{2n^2}\right),$
with a large cutoff \(N\). It also replaces the interval \([0,\infty)\) by a finite interval \([0,T]\). This is effective because the omitted cosine factors are extremely close to \(1\) once \(n\) is large, and because the oscillatory integral contributes less and less beyond a sufficiently large upper limit.
Step 5: Apply Simpson's Rule
If \(T=mh\) with even \(m\), the numerical quadrature uses
$\int_0^T f(t)\,dt\approx \frac{h}{3}\left(f(0)+f(T)+4\sum_{j=1,3,\dots,m-1}f(jh)+2\sum_{j=2,4,\dots,m-2}f(jh)\right),$
where
$f(t)=\frac{\sin(tx)}{t}\,\varphi_{Y,N}(t).$
At \(t=0\), the expression \(\sin(tx)/t\) is interpreted by continuity:
$\lim_{t\to 0}\frac{\sin(tx)}{t}=x.$
That removes the apparent singularity at the origin and makes Simpson's rule stable there.
Worked Example: Symmetry and the Main Numerical Value
If every bit \(B_n\) is replaced by \(1-B_n\), then
$\sum_{n\ge 1}\frac{1-B_n}{n^2}=\zeta(2)-Z.$
So the distribution of \(Z\) is symmetric around \(\mu=\zeta(2)/2\). It follows that
$p(\mu)=\Pr(Z\gt \mu)=\frac12,$
and more generally
$p(\mu-u)+p(\mu+u)=1.$
In particular, the thresholds \(1\) and \(\zeta(2)-1\) are symmetric around \(\mu\), so both have probability \(1/2\). With the finer numerical settings used by the implementation, the target quantity comes out as
$p(0.5)\approx 0.565654540708545,$
which is the value checked before the final rounded answer is printed.
How the Code Works
The C++, Python, and Java implementations all compute the same truncated Fourier integral. The C++ implementation performs the numerical work directly: it precomputes the coefficients \(1/(2n^2)\) up to the cutoff, evaluates the truncated cosine product at each Simpson node, and accumulates the weighted node values over the chosen interval.
The interior Simpson nodes are divided across available threads, so the wall-clock time improves on multi-core hardware while the mathematical result stays unchanged. The endpoint at \(t=0\) is handled through the limit above, and the number of subintervals is adjusted so Simpson's rule always uses an even count.
The Python and Java implementations are thin front ends that delegate to the same compiled numerical core, so all three languages share the same checkpoints, the same truncation strategy, and the same final probability once their numerical parameters agree.
Complexity Analysis
If the integration limit is \(T\), the step size is \(h\), and the cosine-product cutoff is \(N\), then the number of Simpson subintervals is about \(T/h\). Each node evaluation multiplies \(N\) cosine factors, so the total running time is
$O\left(\frac{T}{h}\,N\right).$
The memory cost is \(O(N)\) for the precomputed coefficients, plus a small extra buffer for per-thread partial sums. Multithreading changes the constant factor in runtime, not the asymptotic order.
Footnotes and References
- Project Euler problem page: https://projecteuler.net/problem=689
- Characteristic function: Wikipedia - Characteristic function
- Gil-Pelaez inversion theorem: Wikipedia - Gil-Pelaez theorem
- Simpson's rule: Wikipedia - Simpson's rule
- Basel problem and \(\zeta(2)=\pi^2/6\): Wikipedia - Basel problem