Problem 298: Selective Amnesia
View on Project EulerProject Euler Problem 298 Solution
EulerSolve provides an optimized solution for Project Euler Problem 298, Selective Amnesia, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary Larry and Robin observe the same random sequence of calls from an alphabet of size \(10\). Both memories have capacity \(5\), but they update differently. After \(50\) turns we want the exact value of $\mathbb{E}(|L-R|),$ where \(L\) and \(R\) are the total hit counts of Larry and Robin. Mathematical Approach 1. Exact state and scoring rules A state is the ordered pair of memories $\sigma=(M_L,M_R), \qquad |M_L|,|M_R|\le 5.$ Both memories keep order from oldest to newest. The update rules are: Larry. On a hit, the called label is removed from its current position and moved to the back. On a miss, the label is appended, and if the memory is full the oldest label is evicted first. Robin. On a hit, the score increases but the memory order does not change. On a miss, Robin also appends and evicts the oldest label if necessary. For one call, the score increment is $\Delta=(\text{Larry hit})-(\text{Robin hit})\in\{-1,0,+1\}.$ 2. Why canonical relabeling is valid The actual digit names do not matter. Only equality relations such as "present in Larry only", "present in both", and relative order inside each memory affect future behavior. Therefore two states that differ only by a permutation of labels have exactly the same transition structure....
Detailed mathematical approach
Problem Summary
Larry and Robin observe the same random sequence of calls from an alphabet of size \(10\). Both memories have capacity \(5\), but they update differently. After \(50\) turns we want the exact value of
$\mathbb{E}(|L-R|),$
where \(L\) and \(R\) are the total hit counts of Larry and Robin.
Mathematical Approach
1. Exact state and scoring rules
A state is the ordered pair of memories
$\sigma=(M_L,M_R), \qquad |M_L|,|M_R|\le 5.$
Both memories keep order from oldest to newest. The update rules are:
Larry. On a hit, the called label is removed from its current position and moved to the back. On a miss, the label is appended, and if the memory is full the oldest label is evicted first.
Robin. On a hit, the score increases but the memory order does not change. On a miss, Robin also appends and evicts the oldest label if necessary.
For one call, the score increment is
$\Delta=(\text{Larry hit})-(\text{Robin hit})\in\{-1,0,+1\}.$
2. Why canonical relabeling is valid
The actual digit names do not matter. Only equality relations such as "present in Larry only", "present in both", and relative order inside each memory affect future behavior. Therefore two states that differ only by a permutation of labels have exactly the same transition structure.
The code canonicalizes a state by scanning Larry's memory from left to right, then Robin's memory, and renaming first-seen labels to \(0,1,2,\dots\). For example,
$M_L=(6,1,8,10,2),\qquad M_R=(2,4,6,8,10)$
becomes
$M_L=(0,1,2,3,4),\qquad M_R=(4,5,0,2,3).$
Any raw state with the same equality pattern collapses to this same canonical representative. This is why the graph stays finite and small.
3. Weighted transitions: known labels versus unseen labels
Suppose a canonical state contains the set \(U\) of labels that currently appear in at least one memory. Calls split into two classes:
Known labels. Every \(x\in U\) is processed individually with weight \(1\).
Unseen labels. Every label outside \(U\) behaves the same up to canonical relabeling, because it is absent from both memories and enters as one fresh symbol. So the code simulates one representative unseen label and assigns it weight
$10-|U|.$
After canonicalization, multiple calls may lead to the same \((\sigma',\Delta)\), so their weights are merged. For every state, the outgoing weights sum to the alphabet size \(10\). This is one of the graph-structure checkpoints in the C++ code.
4. DP over time and score difference
Let \(P_t(\sigma,d)\) be the probability that after \(t\) turns we are in canonical state \(\sigma\) and the hit difference is
$d=L-R.$
Initially only the empty state with \(d=0\) has probability \(1\). If a weighted edge \(\sigma\to\sigma'\) has score change \(\Delta\) and weight \(w\), then its probability is \(w/10\), so
$P_{t+1}(\sigma',d+\Delta)=\sum_{\sigma,d} P_t(\sigma,d)\cdot \frac{w(\sigma\to\sigma',\Delta)}{10}.$
Since each turn changes the difference by at most \(1\), after \(t\) turns we only need the band
$-t \le d \le t,$
which is why the code stores arrays of width \(2T+1\).
At the end, the required expectation is
$\mathbb{E}(|L-R|)=\sum_{\sigma,d}P_T(\sigma,d)\,|d|.$
5. Worked example from the checkpoint sequence
The C++ checkpoint uses the calls
$1,2,4,6,1,8,10,2,4,1.$
The cumulative hit counts become
$L:\ 0,0,0,0,1,1,1,1,1,2,$
$R:\ 0,0,0,0,1,1,1,2,3,3.$
So the final difference is
$L-R=-1.$
After the ten calls, the ordered memories are
$M_L=(8,10,2,4,1),\qquad M_R=(4,6,8,10,1).$
This example makes the rule difference concrete: Larry keeps refreshing hits, Robin does not.
6. Checkpoints and scale of the compressed graph
For the default parameters \((10,5,50)\), the canonical state graph has only
$439$
states. The exact expectation is
$\mathbb{E}(|L-R|)=1.768822942380093\ldots$
The code also verifies two brute-force cross-checks on smaller instances:
$\text{alphabet}=3,\ \text{capacity}=2,\ \text{turns}=7 \Rightarrow 0.224965706447188,$
$\text{alphabet}=4,\ \text{capacity}=2,\ \text{turns}=8 \Rightarrow 0.259277343750000.$
These are compared against exhaustive enumeration of all sequences, so they strongly validate the Markov-DP formulation.
How the Code Works
build_graph(...) performs a BFS from the empty state, canonicalizes every newly reached state, and stores weighted transitions \((\text{next state},\Delta,\text{weight})\). Then solve_exact(...) runs a probability DP over turns and score differences. The C++ version optionally parallelizes the transition sweep, and then checks that threaded and single-threaded answers agree.
Complexity Analysis
If \(S\) is the number of canonical states, \(B\) the average number of outgoing transitions, and \(T\) the number of turns, then the DP costs about
$O\!\bigl(T\cdot S\cdot B\cdot (2T+1)\bigr)$
time and
$O\!\bigl(S\cdot (2T+1)\bigr)$
memory. For the actual problem, \(S=439\), so the exact computation is easily manageable and avoids any Monte Carlo error.
Further Reading
- Problem page: https://projecteuler.net/problem=298
- Markov chains: https://en.wikipedia.org/wiki/Markov_chain
- State compression and canonicalization: https://cp-algorithms.com/dynamic_programming/intro-to-dp.html