Problem 898: Claire Voyant
View on Project EulerProject Euler Problem 898 Solution
EulerSolve provides an optimized solution for Project Euler Problem 898, Claire Voyant, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary We must compute the success probability of the optimal decision rule for 25 independent probability pairs. For each \(k=25,26,\dots,49\), the parameter is \(q_k=k/100\), so each component compares \(q_k\) against its complement \(1-q_k\). After compressing one component into its net evidence, every component has three possible effects on the global decision, so a naive enumeration would still require \(3^{25}\) combined outcomes. The quantity we want is not only \(\Pr(S>0)\). If the total evidence is exactly balanced, the two global interpretations are equally likely, so a tie contributes half credit. The target value is therefore $A=\Pr(S>0)+\frac{1}{2}\Pr(S=0).$ Mathematical Approach The implementations turn the original probabilistic comparison into a sum of independent log-likelihood contributions and then evaluate the exact tie rule without enumerating all \(3^{25}\) states at once. Step 1: Compress One Pair into a Ternary Random Variable For each \(k\in\{25,\dots,49\}\), define $q_k=\frac{k}{100},\qquad w_k=2\log\frac{1-q_k}{q_k}=2\log\frac{100-k}{k}.$ One component contributes a signed amount \(X_k\) to the total log-likelihood ratio....
Detailed mathematical approach
Problem Summary
We must compute the success probability of the optimal decision rule for 25 independent probability pairs. For each \(k=25,26,\dots,49\), the parameter is \(q_k=k/100\), so each component compares \(q_k\) against its complement \(1-q_k\). After compressing one component into its net evidence, every component has three possible effects on the global decision, so a naive enumeration would still require \(3^{25}\) combined outcomes.
The quantity we want is not only \(\Pr(S>0)\). If the total evidence is exactly balanced, the two global interpretations are equally likely, so a tie contributes half credit. The target value is therefore
$A=\Pr(S>0)+\frac{1}{2}\Pr(S=0).$
Mathematical Approach
The implementations turn the original probabilistic comparison into a sum of independent log-likelihood contributions and then evaluate the exact tie rule without enumerating all \(3^{25}\) states at once.
Step 1: Compress One Pair into a Ternary Random Variable
For each \(k\in\{25,\dots,49\}\), define
$q_k=\frac{k}{100},\qquad w_k=2\log\frac{1-q_k}{q_k}=2\log\frac{100-k}{k}.$
One component contributes a signed amount \(X_k\) to the total log-likelihood ratio. The three possible values are
$X_k\in\{+w_k,\,0,\,-w_k\},$
with probabilities
$\Pr(X_k=+w_k)=(1-q_k)^2,\qquad \Pr(X_k=0)=2q_k(1-q_k),\qquad \Pr(X_k=-w_k)=q_k^2.$
This already explains why every component is ternary after aggregation: it can favor the correct global interpretation, remain neutral, or favor the opposite interpretation.
Step 2: Write the Global Decision Rule
The 25 components are independent, so the total evidence is
$S=\sum_{k=25}^{49} X_k.$
The optimal decision is to choose the correct global interpretation when \(S>0\), to choose the wrong one when \(S<0\), and to split the mass evenly when \(S=0\). Hence
$A=\Pr(S>0)+\frac{1}{2}\Pr(S=0).$
The direct state space has size \(3^{25}\), so the main task is to evaluate this probability exactly enough without brute force over all global combinations at once.
Step 3: Remove the Logarithms for Exact Comparisons
For one concrete global state, let \(\sigma_k\in\{-1,0,1\}\) indicate whether the \(k\)-th component contributes \(-w_k\), \(0\), or \(+w_k\). Then
$S=\sum_{k=25}^{49} \sigma_k\,2\log\frac{100-k}{k}=\log\left(\prod_{k=25}^{49}\left(\frac{100-k}{k}\right)^{2\sigma_k}\right).$
Therefore the sign of \(S\) is exactly the same as the comparison
$\prod_{k=25}^{49}\left(\frac{100-k}{k}\right)^{2\sigma_k}\mathop{\gtrless}_{=}\,1.$
This reformulation is crucial. Floating-point logarithms are convenient for sorting and thresholding, but ties and near-ties must be decided by exact rational arithmetic, not by rounded decimal values.
Step 4: Encode Every Ratio by Prime Exponents
All numbers involved come from the fixed set \(k\) and \(100-k\) for \(25\le k\le 49\), so only a fixed finite set of primes can appear. For each \(k\), write
$\left(\frac{100-k}{k}\right)^2=\prod_{p} p^{e_{k,p}}.$
If a global state uses the signs \(\sigma_k\), then the total exponent of a prime \(p\) becomes
$E_p=\sum_{k=25}^{49}\sigma_k e_{k,p}.$
The exact product is then
$\prod_{p} p^{E_p}=\frac{\prod_{E_p>0} p^{E_p}}{\prod_{E_p<0} p^{-E_p}}.$
So checking whether \(S\) is positive, zero, or negative reduces to an exact integer comparison between one numerator and one denominator built from these exponents.
Step 5: Meet in the Middle
Split the 25 components into two groups of sizes \(12\) and \(13\). For each half, enumerate all ternary states. A half-state stores three pieces of information:
$\text{approximate score},\qquad \text{probability mass},\qquad \text{exact prime-exponent vector}.$
After sorting the right-half states by approximate score, a left-half state with score \(s_L\) only needs right-half states with score near \(-s_L\) for possible ties. All right states with clearly larger score make the total positive automatically and can be summed at once by prefix sums.
This is why the method is fast: exact arithmetic is used only in a very narrow band around the decision boundary, not for every pair of states.
Worked Example: The First Pair Alone
Take \(k=25\). Then \(q=\frac{1}{4}\) and
$w=2\log\frac{3}{1}=2\log 3.$
The three local outcomes are
$+w\text{ with probability }\left(\frac{3}{4}\right)^2=\frac{9}{16},\qquad 0\text{ with probability }2\cdot\frac{1}{4}\cdot\frac{3}{4}=\frac{6}{16},\qquad -w\text{ with probability }\left(\frac{1}{4}\right)^2=\frac{1}{16}.$
For this one-pair toy case, the optimal accuracy is
$\frac{9}{16}+\frac{1}{2}\cdot\frac{6}{16}=\frac{12}{16}=\frac{3}{4}.$
The full problem uses the same rule, but now all 25 ternary contributions are combined.
How the Code Works
The C++, Python, and Java implementations use the same pipeline. They first build the fixed prime basis coming from the integers \(k\) and \(100-k\) for \(25\le k\le 49\). For each of the 25 components they then precompute the floating-point weight \(w_k\), the three local probabilities \((1-q_k)^2\), \(2q_k(1-q_k)\), and \(q_k^2\), and the exact prime-exponent representation of \(\left(\frac{100-k}{k}\right)^2\).
Next, the implementation enumerates all ternary states in a left half and a right half. Every stored state contains an approximate score, its probability mass, and its exact exponent vector. The right-half states are sorted by approximate score, and a prefix-sum array lets the program add the probability of all definitely winning combinations immediately after a binary search.
Only combinations whose approximate score falls inside a tiny neighborhood of zero need extra care. For that narrow band, the implementation combines the two exponent vectors, reconstructs an exact rational comparison with arbitrary-precision integers, and decides whether the product is greater than \(1\), equal to \(1\), or less than \(1\). That exact sign determines whether the combination contributes full probability, half probability, or nothing.
Complexity Analysis
Let \(L=3^{12}\) and \(R=3^{13}\). Enumerating the two halves costs \(O(L+R)\). Sorting the right-half states costs \(O(R\log R)\). For each left-half state, the merge phase performs two binary searches, so the main combination work is \(O(L\log R)\), plus a smaller extra term for the exact checks inside the near-tie window. Memory usage is \(O(L+R)\) states, with only a fixed small prime-exponent vector attached to each state.
Footnotes and References
- Problem page: https://projecteuler.net/problem=898
- Likelihood-ratio test: Wikipedia - Likelihood-ratio test
- Prime factorization: Wikipedia - Prime factorization
- Meet-in-the-middle: Wikipedia - Meet-in-the-middle
- Arbitrary-precision arithmetic: Wikipedia - Arbitrary-precision arithmetic