Problem 848: Guessing with Sets
View on Project EulerProject Euler Problem 848 Solution
EulerSolve provides an optimized solution for Project Euler Problem 848, Guessing with Sets, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary Two players each choose a secret number uniformly at random. The first player chooses from \(\{1,\dots,n\}\), the second from \(\{1,\dots,m\}\). They then alternate asking yes-or-no questions of the form “is your number in this set?”, and both players choose their questions optimally. If \(p(m,n)\) denotes the first player’s winning probability, the required value is $\sum_{i=0}^{20}\sum_{j=0}^{20} p(7^i,5^j).$ The crucial observation used by the implementations is that the game value has an exact staircase structure. Because of that, each probability can be computed by a short block sum rather than by exploring a full game tree. Mathematical Approach The solution does not simulate actual question sequences. Instead, it counts which initial secret pairs are winning for the first player and then divides by the total number of equally likely pairs. Step 1: Turn the Probability into a Counting Problem There are exactly \(mn\) possible initial secret pairs, all with the same probability. For each pair, optimal play determines whether the first player wins or not. Let \(W(m,n)\) be the number of winning initial pairs for the first player. Then the probability is simply $p(m,n)=\frac{W(m,n)}{mn}.$ So the probabilistic problem is reduced to computing the integer quantity \(W(m,n)\)....
Detailed mathematical approach
Problem Summary
Two players each choose a secret number uniformly at random. The first player chooses from \(\{1,\dots,n\}\), the second from \(\{1,\dots,m\}\). They then alternate asking yes-or-no questions of the form “is your number in this set?”, and both players choose their questions optimally. If \(p(m,n)\) denotes the first player’s winning probability, the required value is
$\sum_{i=0}^{20}\sum_{j=0}^{20} p(7^i,5^j).$
The crucial observation used by the implementations is that the game value has an exact staircase structure. Because of that, each probability can be computed by a short block sum rather than by exploring a full game tree.
Mathematical Approach
The solution does not simulate actual question sequences. Instead, it counts which initial secret pairs are winning for the first player and then divides by the total number of equally likely pairs.
Step 1: Turn the Probability into a Counting Problem
There are exactly \(mn\) possible initial secret pairs, all with the same probability. For each pair, optimal play determines whether the first player wins or not. Let \(W(m,n)\) be the number of winning initial pairs for the first player.
Then the probability is simply
$p(m,n)=\frac{W(m,n)}{mn}.$
So the probabilistic problem is reduced to computing the integer quantity \(W(m,n)\).
Step 2: The Winning States Form a Staircase
Place the \(mn\) initial states in an \(m\times n\) grid. The exact formula used by the implementations says that in column \(t\), the number of winning cells is
$\min(m,b_t),$
where the boundary sequence \((b_t)\) begins as
$b_1=1,\qquad b_2=2,\qquad b_3=3.$
After that, the sequence is constant on doubling blocks. For every integer \(r\ge 0\), if
$3\cdot 2^r+1\le t\le 3\cdot 2^{r+1},$
then
$b_t=3\cdot 2^{r+1}.$
So the blocks are \([4,6]\), \([7,12]\), \([13,24]\), and so on, with corresponding heights \(6,12,24,\dots\). That staircase is the central mathematical structure behind the whole computation.
Step 3: Sum the Staircase Column by Column
Since column \(t\) contributes \(\min(m,b_t)\) winning states, the total count is
$W(m,n)=\sum_{t=1}^{n}\min(m,b_t).$
This already gives an exact closed form for one game size. However, evaluating it literally one column at a time would be wasteful, because the sequence \((b_t)\) stays constant over long ranges.
Step 4: Compress the Sum into Doubling Blocks
For \(r\ge 0\), the block starting at \(3\cdot 2^r+1\) and ending at \(3\cdot 2^{r+1}\) has constant height \(3\cdot 2^{r+1}\). Let
$R_r(n)=\max\left(0,\min\left(n,3\cdot 2^{r+1}\right)-\left(3\cdot 2^r+1\right)+1\right).$
This counts how many columns from that block are still present among the first \(n\) columns. Therefore
$W(m,n)=\min(m,1)+\min(m,2)+\min(m,3)+\sum_{r\ge 0} R_r(n)\,\min\left(m,3\cdot 2^{r+1}\right).$
Only finitely many terms are nonzero, and the number of relevant blocks grows like \(O(\log n)\).
Step 5: Worked Example \((m,n)=(7,5)\)
For \(n=5\), the first five staircase heights are
$b_1,b_2,b_3,b_4,b_5=1,2,3,6,6.$
Because \(m=7\), the clipping \(\min(7,b_t)\) leaves these values unchanged, so
$W(7,5)=1+2+3+6+6=18.$
Hence
$p(7,5)=\frac{18}{7\cdot 5}=\frac{18}{35}=0.51428571\dots$
This is the same checkpoint used by the implementations.
Step 6: Evaluate the Required Double Sum
The final target is
$\sum_{i=0}^{20}\sum_{j=0}^{20} p(7^i,5^j)=\sum_{i=0}^{20}\sum_{j=0}^{20}\frac{W(7^i,5^j)}{7^i5^j}.$
So the entire problem becomes 441 evaluations of the same staircase formula. The implementations also verify the natural edge cases \(p(1,n)=1\) and \(p(m,1)=1/m\), which fit the same counting rule.
How the Code Works
The C++, Python, and Java implementations all use the same algorithm. They evaluate one pair \((m,n)\) by handling the first three columns separately, because the staircase begins with the explicit values \(1,2,3\).
After that, the implementation walks through blocks whose lengths are
$3,6,12,24,\dots$
and whose heights are
$6,12,24,48,\dots$
For each block, it takes only the part that still lies within the first \(n\) columns and adds the number of surviving columns multiplied by the clipped block height \(\min(m,\text{height})\). This directly produces the exact integer \(W(m,n)\).
Once \(W(m,n)\) is known, the implementation divides by \(mn\) using exact or high-precision arithmetic and accumulates the result into the outer double sum. High precision matters because the final answer is printed to eight decimal places after adding 441 rational values.
Complexity Analysis
For one pair \((m,n)\), the number of processed blocks is proportional to the number of doublings needed to exceed \(n\), so the running time is \(O(\log n)\). Extra memory is \(O(1)\) apart from the exact-integer or high-precision decimal objects used to keep the arithmetic safe. Since the full problem needs only \(21\times 21=441\) evaluations, the overall computation is very small.
Footnotes and References
- Problem page: https://projecteuler.net/problem=848
- Minimax theorem: Wikipedia - Minimax theorem
- Twenty Questions: Wikipedia - Twenty Questions
- Decision tree: Wikipedia - Decision tree
- Binary search: Wikipedia - Binary search