Problem 645: Every Day Is a Holiday
View on Project EulerProject Euler Problem 645 Solution
EulerSolve provides an optimized solution for Project Euler Problem 645, Every Day Is a Holiday, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary Place the \(D\) days on a cycle. One random draw chooses a day uniformly, and that draw covers two consecutive days: the chosen day and its cyclic successor. Let \(T_D\) be the number of draws needed until every day has been covered at least once. The task is to compute the expectation $E(D)=\mathbb{E}[T_D]$ for the large target value \(D=10000\). The C++, Python, and Java implementations do not simulate the process. Instead, they turn the expectation into a one-dimensional integral whose integrand comes from inclusion-exclusion on uncovered days and a \(2\times2\) transfer matrix on the cycle. Mathematical Approach Number the days modulo \(D\), and let a draw on day \(i\) cover the pair \((i,i+1)\). The stopping time is \(T_D\), the first moment when all days are covered. Step 1: Rewrite the expectation as a tail sum For any nonnegative integer-valued stopping time, $E(D)=\sum_{n=0}^{\infty}\Pr(T_D>n).$ So the whole problem becomes: after \(n\) draws, what is the probability that at least one day is still uncovered? A day \(j\) is uncovered exactly when no draw landed on \(j\) and no draw landed on \(j-1\), because those are the only two starting points that could cover \(j\)....
Detailed mathematical approach
Problem Summary
Place the \(D\) days on a cycle. One random draw chooses a day uniformly, and that draw covers two consecutive days: the chosen day and its cyclic successor. Let \(T_D\) be the number of draws needed until every day has been covered at least once. The task is to compute the expectation
$E(D)=\mathbb{E}[T_D]$
for the large target value \(D=10000\).
The C++, Python, and Java implementations do not simulate the process. Instead, they turn the expectation into a one-dimensional integral whose integrand comes from inclusion-exclusion on uncovered days and a \(2\times2\) transfer matrix on the cycle.
Mathematical Approach
Number the days modulo \(D\), and let a draw on day \(i\) cover the pair \((i,i+1)\). The stopping time is \(T_D\), the first moment when all days are covered.
Step 1: Rewrite the expectation as a tail sum
For any nonnegative integer-valued stopping time,
$E(D)=\sum_{n=0}^{\infty}\Pr(T_D>n).$
So the whole problem becomes: after \(n\) draws, what is the probability that at least one day is still uncovered?
A day \(j\) is uncovered exactly when no draw landed on \(j\) and no draw landed on \(j-1\), because those are the only two starting points that could cover \(j\).
Step 2: Apply inclusion-exclusion to uncovered sets
For a subset \(S\) of days, define
$N(S)=S\cup(S-1), \qquad S-1=\{i-1 \pmod D : i\in S\}.$
If every day in \(S\) is still uncovered after \(n\) draws, then every draw must avoid the set \(N(S)\). Since each draw is uniform over the \(D\) starting days,
$\Pr(A_S(n))=\left(1-\frac{|N(S)|}{D}\right)^n,$
where \(A_S(n)\) is the event that all days in \(S\) remain uncovered after \(n\) draws.
Now inclusion-exclusion over the uncovered days gives
$\Pr(T_D>n)=\sum_{\emptyset\ne S\subseteq C_D}(-1)^{|S|+1}\left(1-\frac{|N(S)|}{D}\right)^n,$
where \(C_D\) denotes the cycle of \(D\) days.
Step 3: Sum the geometric series and convert to an integral
Insert the previous formula into the tail sum and interchange the order of summation:
$\frac{E(D)}{D}=\sum_{\emptyset\ne S\subseteq C_D}(-1)^{|S|+1}\frac{1}{|N(S)|}.$
The reciprocal is converted with
$\frac{1}{m}=\int_0^1 x^{m-1}\,dx.$
Therefore
$\frac{E(D)}{D}=\int_0^1\frac{1-Z_D(x)}{x}\,dx,$
where
$Z_D(x)=\sum_{S\subseteq C_D}(-1)^{|S|}x^{|N(S)|}.$
The empty set contributes the leading \(1\), which is why the numerator becomes \(1-Z_D(x)\).
Step 4: Encode the subsets with a transfer matrix
Represent a subset \(S\) by a cyclic binary word \(s_1,\dots,s_D\), where \(s_i=1\) means day \(i\) is required to remain uncovered. Then
$|N(S)|=\sum_{i=1}^{D}s_i+\sum_{i=1}^{D}(1-s_i)s_{i+1}, \qquad s_{D+1}=s_1.$
The first sum counts the chosen days themselves, and the second sum counts one extra predecessor for each run of consecutive \(1\)'s around the cycle.
This makes the weight factorize locally:
$(-1)^{|S|}x^{|N(S)|}=\prod_{i=1}^{D} M_{s_i,s_{i+1}}(x),$
with transfer matrix
$M(x)=\begin{pmatrix}1 & -x^2 \\ 1 & -x\end{pmatrix}.$
Summing over all cyclic binary words is exactly the trace of the \(D\)-th matrix power, so
$Z_D(x)=\operatorname{tr}(M(x)^D).$
Step 5: Diagonalize the matrix
The characteristic polynomial of \(M(x)\) is
$\lambda^2-(1-x)\lambda-x(1-x)=0.$
Its two eigenvalues are
$\lambda_{1,2}(x)=\frac{1-x\pm\sqrt{(1-x)(1+3x)}}{2}.$
Hence
$Z_D(x)=\lambda_1(x)^D+\lambda_2(x)^D,$
and the expectation becomes
$\boxed{\frac{E(D)}{D}=\int_0^1\frac{1-\lambda_1(x)^D-\lambda_2(x)^D}{x}\,dx.}$
This is the exact formula used by the implementation.
Step 6: Endpoint behavior and numerical shape
The integrand looks singular because of the division by \(x\), but the singularity at \(x=0\) is removable. Expanding the eigenvalues near zero gives
$\lambda_1(x)=1-x^2+O(x^3), \qquad \lambda_2(x)=-x+O(x^2),$
so
$1-\lambda_1(x)^D-\lambda_2(x)^D = D x^2 + O(x^3),$
and therefore the integrand is \(Dx+O(x^2)\), which tends to \(0\). At \(x=1\), both eigenvalues are \(0\), so the integrand tends to \(1\).
Worked Example: \(D=5\)
For five days, the transfer-matrix trace simplifies to
$Z_5(x)=1-5x^2+5x^3-x^5.$
Thus
$\frac{E(5)}{5}=\int_0^1\frac{5x^2-5x^3+x^5}{x}\,dx=\int_0^1(5x-5x^2+x^4)\,dx.$
Integrating term by term gives
$\frac{E(5)}{5}=\frac{5}{2}-\frac{5}{3}+\frac{1}{5}=\frac{31}{30},$
so
$E(5)=\frac{31}{6}.$
This matches the exact checkpoint used by the implementations. For the smallest nontrivial cycle, \(Z_2(x)=1-x^2\), giving \(E(2)=1\).
How the Code Works
The C++, Python, and Java implementations all evaluate the same integrand
$I_D(x)=\frac{1-\lambda_1(x)^D-\lambda_2(x)^D}{x}.$
They first compute the discriminant \(\sqrt{(1-x)(1+3x)}\), then the two eigenvalues, and then the numerator. The endpoints are handled explicitly: the limiting value at \(x=0\) is \(0\), and the value at \(x=1\) is \(1\).
The delicate part is the term \(1-\lambda_1(x)^D\). When \(\lambda_1(x)\) is very close to \(1\), direct subtraction loses significant digits. To avoid that, the implementation rewrites it as
$1-\lambda_1(x)^D=-\operatorname{expm1}\!\bigl(D\log \lambda_1(x)\bigr)$
whenever \(D\log \lambda_1(x)\) is close to \(0\). The second eigenvalue is smaller in magnitude and is raised directly to the \(D\)-th power.
The integral on \([0,1]\) is then evaluated with Simpson's rule using an even number of intervals. Starting from an initial mesh, the code repeatedly doubles the interval count and compares two consecutive Simpson estimates:
$|S_{2N}-S_N|\le \text{rtol}\cdot \max(1,|S_{2N}|).$
Once this relative criterion is satisfied, the refined estimate is returned. The C++ implementation also supports parallel accumulation of the interior Simpson nodes when the mesh is large, and it verifies the derivation against the known values \(E(2)=1\), \(E(5)=31/6\), and \(E(365)\approx 1174.3501\).
Complexity Analysis
If the final Simpson mesh uses \(N\) subintervals, one pass costs \(O(N)\) evaluations of the integrand. Because the mesh doubles geometrically, the total work up to the accepted mesh is still \(O(N)\). The single-threaded implementations use \(O(1)\) auxiliary memory, while the parallel C++ version uses \(O(T)\) extra memory for \(T\) partial sums.
Footnotes and References
- Problem page: https://projecteuler.net/problem=645
- Inclusion-exclusion principle: Wikipedia - Inclusion-exclusion principle
- Transfer-matrix method: Wikipedia - Transfer-matrix method
- Eigenvalues and eigenvectors: Wikipedia - Eigenvalues and eigenvectors
- Simpson's rule: Wikipedia - Simpson's rule