Open Source · 7 Languages · 900+ Problems

Project Euler Solutions

Complete solutions in C++, Python & Java — with step-by-step mathematical explanations

All Problems

Problem 982: The Third Dice

Project Euler Problem 982 Solution

EulerSolve provides an optimized solution for Project Euler Problem 982, The Third Dice, with C++, Python, Java, and a step-by-step mathematical explanation.

Problem Summary Three fair six-sided dice are rolled. A player who sees all three faces chooses one die to hide. The second player then sees the two remaining faces and chooses the option with the larger expected payoff: either take the hidden die, whose value is uncertain, or take the better of the two visible dice, whose value is known exactly. The task is to determine the optimal expected payoff when the hider plays as well as possible. The local implementations do not guess a simple rule such as “always hide the largest die”. They solve the game exactly. The key observation is that once a visible pair has been fixed, the second player only needs two quantities: the conditional mean of the hidden die and the larger visible face. That turns the full game into a compact linear program. Mathematical Approach Write an ordered state as $s=(d_1,d_2,d_3)\in\{1,\ldots,6\}^3$, and let $a\in\{1,2,3\}$ denote which die is hidden. All $216$ ordered states are equally likely, so $p_s=1/216$. The observations that actually matter The state space is ordered, but the observation is not: after one die is hidden, the second player only sees the unordered pair of visible faces....

Detailed mathematical approach

Problem Summary

Three fair six-sided dice are rolled. A player who sees all three faces chooses one die to hide. The second player then sees the two remaining faces and chooses the option with the larger expected payoff: either take the hidden die, whose value is uncertain, or take the better of the two visible dice, whose value is known exactly. The task is to determine the optimal expected payoff when the hider plays as well as possible.

The local implementations do not guess a simple rule such as “always hide the largest die”. They solve the game exactly. The key observation is that once a visible pair has been fixed, the second player only needs two quantities: the conditional mean of the hidden die and the larger visible face. That turns the full game into a compact linear program.

Mathematical Approach

Write an ordered state as $s=(d_1,d_2,d_3)\in\{1,\ldots,6\}^3$, and let $a\in\{1,2,3\}$ denote which die is hidden. All $216$ ordered states are equally likely, so $p_s=1/216$.

The observations that actually matter

The state space is ordered, but the observation is not: after one die is hidden, the second player only sees the unordered pair of visible faces. Therefore the observation space is

$\mathcal{O}=\{\{u,v\}:1\le u\le v\le 6\},\qquad |\mathcal{O}|=\binom{7}{2}=21.$

For each state-action pair $(s,a)$, let $o(s,a)\in\mathcal{O}$ be that visible multiset, let $h(s,a)$ be the hidden face, and define

$m(o)=\max(o)$

as the best visible die the second player could take immediately. The smaller visible die is irrelevant to optimal play, because choosing it is always dominated by choosing the larger visible die.

A hiding strategy written as probabilities

The hider may randomize. Let $x_{s,a}\ge 0$ be the probability of hiding die $a$ in state $s$. For every state, these probabilities must sum to $1$:

$\sum_{a=1}^3 x_{s,a}=1\qquad(s\in\{1,\ldots,6\}^3).$

For a fixed observation $o$, the total probability of producing that observation is

$q_o=\sum_{\substack{(s,a)\\o(s,a)=o}} p_s\,x_{s,a}.$

Conditioned on seeing $o$, the expected value of the hidden die is

$\mu_o=\frac{1}{q_o}\sum_{\substack{(s,a)\\o(s,a)=o}} p_s\,h(s,a)\,x_{s,a}\qquad(q_o>0).$

So after observation $o$, the second player compares the sure visible value $m(o)$ with the conditional hidden expectation $\mu_o$. The contribution of $o$ to the overall expected payoff is therefore

$q_o\max(\mu_o,m(o)).$

Linearizing the opponent's best response

The expression above is piecewise linear, but it can be written exactly with one auxiliary variable $T_o$ for each observation class. First note that

$q_o\mu_o=\sum_{\substack{(s,a)\\o(s,a)=o}} p_s\,h(s,a)\,x_{s,a},$

and also

$q_om(o)=m(o)\sum_{\substack{(s,a)\\o(s,a)=o}} p_s\,x_{s,a}.$

Hence it is enough to impose

$T_o\ge \sum_{\substack{(s,a)\\o(s,a)=o}} p_s\,h(s,a)\,x_{s,a},$

$T_o\ge m(o)\sum_{\substack{(s,a)\\o(s,a)=o}} p_s\,x_{s,a}.$

If the objective is to minimize $\sum_{o\in\mathcal{O}}T_o$, every $T_o$ is forced down to the larger of those two quantities. Therefore

$\sum_{o\in\mathcal{O}}T_o=\sum_{o\in\mathcal{O}} q_o\max(\mu_o,m(o)),$

which is exactly the second player’s expected payoff. The whole problem is thus the linear program

$\min \sum_{o\in\mathcal{O}}T_o$

subject to the state-normalization equations and the two inequalities above for every observation.

Worked example: seeing $\{2,5\}$

Take the observation $o=\{2,5\}$. Then $m(o)=5$. This observation can come from many ordered state-action pairs, for example by hiding the first die in $(4,2,5)$ or by hiding the second die in $(2,6,5)$. The strategy decides how much probability mass each such pair contributes.

If the resulting conditional mean of the hidden die is $\mu_o=29/6$, then the sure visible $5$ is better, so this observation contributes $5q_o$. If instead $\mu_o=16/3$, then taking the hidden die is better, and the contribution becomes $(16/3)q_o$. The two inequalities for $T_o$ cover both situations without any case split inside the solver.

How the Code Works

Building the tableau

The C++, Python, and Java implementations enumerate all $216$ ordered three-die states and all $3$ hide choices, giving $648$ state-action columns. For each column they precompute the observation class, the hidden face, and the larger visible face. They also keep the smaller two-dice version of the same model, which has $36$ states, $2$ hide choices, and $6$ observation classes.

The linear program uses one normalization row per state and two rows per observation class. For the target instance that is $216+2\cdot 21=258$ rows. It also uses $21$ auxiliary observation variables, $42$ slack variables for the inequalities, and $216$ artificial variables for Phase I.

Phase I, Phase II, and the validation case

The state-normalization equations do not come with an obvious feasible basis, so the implementation starts with a standard two-phase simplex. Phase I minimizes the sum of artificial variables; feasibility is certified when that optimum reaches $0$. The artificial columns are then removed, any row that has become redundant is dropped, and Phase II minimizes $\sum T_o$ by maximizing its negative in tableau form.

Before solving the three-dice target, the same machinery is run on the two-dice instance and checked against the known value $145/36$. The C++ implementation also parallelizes the coefficient fill for the observation rows, while the Python and Java implementations perform the same arithmetic serially.

Complexity Analysis

Tableau construction is linear in the number of state-action pairs, because each column contributes to exactly one hidden-value row and one visible-value row. For the target case this means $648$ such contributions, plus fixed row and column setup.

The main cost is the simplex pivoting. In the three-dice instance, Phase I starts from a $258\times 927$ tableau, and Phase II continues with $711$ columns after the artificial variables are removed. As with any simplex implementation, the theoretical worst case is exponential in the number of pivots, but at this matrix size the exact solve is practical. Memory usage is $O(mn)$ for the tableau.

Footnotes and References

Problem page: https://projecteuler.net/problem=982
Linear programming: Wikipedia - Linear programming
Simplex algorithm: Wikipedia - Simplex algorithm
Expected value: Wikipedia - Expected value
Conditional expectation: Wikipedia - Conditional expectation