Problem 191: Prize Strings
View on Project EulerProject Euler Problem 191 Solution
EulerSolve provides an optimized solution for Project Euler Problem 191, Prize Strings, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary We want the number of length-30 attendance records over \(\{O,A,L\}\), where \(O\) means on time, \(A\) absent, and \(L\) late. A record is prize-eligible if it uses at most one \(L\) in total and never contains three consecutive \(A\)'s. The two restrictions have different flavors. The late condition is global, because a single \(L\) can be used only once anywhere in the string. The absence condition is local, because only the current trailing run of \(A\)'s matters when we decide whether another \(A\) is legal. Mathematical Approach The implementations exploit exactly that structure. Instead of remembering the whole prefix, they remember only the pieces of information that affect the next day. The six relevant states For \(n \ge 0\), let \(S_n^{\ell,a}\) denote the number of valid length-\(n\) prefixes such that \(\ell \in \{0,1\}\) records whether a late day has already appeared, and \(a \in \{0,1,2\}\) is the number of consecutive trailing absences. These two quantities are sufficient. Any earlier part of the string becomes irrelevant once we know whether the unique \(L\) has already been spent and whether the current suffix ends with 0, 1, or 2 copies of \(A\). That gives exactly \(2 \times 3 = 6\) states....
Detailed mathematical approach
Problem Summary
We want the number of length-30 attendance records over \(\{O,A,L\}\), where \(O\) means on time, \(A\) absent, and \(L\) late. A record is prize-eligible if it uses at most one \(L\) in total and never contains three consecutive \(A\)'s.
The two restrictions have different flavors. The late condition is global, because a single \(L\) can be used only once anywhere in the string. The absence condition is local, because only the current trailing run of \(A\)'s matters when we decide whether another \(A\) is legal.
Mathematical Approach
The implementations exploit exactly that structure. Instead of remembering the whole prefix, they remember only the pieces of information that affect the next day.
The six relevant states
For \(n \ge 0\), let \(S_n^{\ell,a}\) denote the number of valid length-\(n\) prefixes such that \(\ell \in \{0,1\}\) records whether a late day has already appeared, and \(a \in \{0,1,2\}\) is the number of consecutive trailing absences.
These two quantities are sufficient. Any earlier part of the string becomes irrelevant once we know whether the unique \(L\) has already been spent and whether the current suffix ends with 0, 1, or 2 copies of \(A\). That gives exactly \(2 \times 3 = 6\) states.
The initial condition is
$S_0^{0,0}=1,$
with every other \(S_0^{\ell,a}\) equal to 0, because the empty record is valid, has used no late day, and has no trailing absences.
How one more day changes the state
From a valid state, there are at most three legal extensions.
If we append \(O\), the absence streak resets, so for every \(\ell\) and \(a\),
$S_{n+1}^{\ell,0}\leftarrow S_{n+1}^{\ell,0}+S_n^{\ell,a}.$
If we append \(A\), then the current streak must be shorter than 2, hence
$S_{n+1}^{\ell,a+1}\leftarrow S_{n+1}^{\ell,a+1}+S_n^{\ell,a}\qquad (a=0,1).$
If we append \(L\), then we must not have used a late day before, and the trailing absence count resets because the last symbol is no longer \(A\):
$S_{n+1}^{1,0}\leftarrow S_{n+1}^{1,0}+S_n^{0,a}\qquad (a=0,1,2).$
After \(n\) days, the total number of prize strings is
$T_n=\sum_{\ell=0}^{1}\sum_{a=0}^{2} S_n^{\ell,a}.$
This is the complete recurrence used by the implementations. Every loop iteration is just a redistribution of six counters.
Separating the cases with and without a late day
There is also a convenient closed counting interpretation. Let \(F_n\) be the number of valid length-\(n\) strings that contain no \(L\) at all. Then only the trailing run of absences matters, and one obtains the Tribonacci-type recurrence
$F_n=F_{n-1}+F_{n-2}+F_{n-3}\qquad (n\ge 3),$
with initial values
$F_0=1,\qquad F_1=2,\qquad F_2=4.$
Now consider a prize string of length \(n\) that contains exactly one \(L\). If that \(L\) sits in position \(r\), then the prefix of length \(r-1\) and the suffix of length \(n-r\) must both be \(L\)-free valid strings, and they are independent because the \(L\) breaks any absence streak. Therefore
$T_n=F_n+\sum_{r=1}^{n} F_{r-1}F_{n-r}.$
The first term counts strings with no \(L\); the sum counts strings with exactly one \(L\). The code does not iterate with this formula directly, but it is a clean proof that the six-state DP is counting the right objects.
Worked example: \(n=4\)
The no-late sequence begins
$F_0=1,\qquad F_1=2,\qquad F_2=4,\qquad F_3=7,\qquad F_4=13.$
So the strings of length 4 with exactly one \(L\) contribute
$F_0F_3+F_1F_2+F_2F_1+F_3F_0=1\cdot 7+2\cdot 4+4\cdot 2+7\cdot 1=30.$
Adding the \(L\)-free strings gives
$T_4=13+30=43.$
This matches the standard checkpoint for the problem and shows both constraints in action: the lone \(L\) can be placed anywhere, but the two surrounding pieces must independently avoid the forbidden pattern \(AAA\).
How the Code Works
The C++, Python, and Java implementations store the six state counts in a \(2\times 3\) table. One dimension records whether a late day has been used, and the other records the current trailing absence run.
For each day, the implementation creates a fresh \(2\times 3\) table for the next layer, scans the current six counts, and distributes each count through the legal transitions for \(O\), \(A\), and \(L\). Once that redistribution is complete, the new table becomes the current table and the process continues.
After 30 iterations, the implementation sums all six terminal states. The update rule can be checked against the smaller case \(T_4=43\). Because the number of states is fixed and the target length is only 30, the whole computation fits comfortably in ordinary 64-bit integer arithmetic.
Complexity Analysis
There are always exactly 6 states, and each state performs only constant work per day. For a record length of \(D\), the running time is therefore \(O(D)\).
The memory usage is \(O(1)\), because the implementation keeps only the current and next \(2\times 3\) tables. It does not need to store all earlier layers of the dynamic program.
Footnotes and References
- Problem page: Project Euler 191
- Dynamic programming: Wikipedia - Dynamic programming
- Finite-state machine: Wikipedia - Finite-state machine
- Tribonacci numbers: Wikipedia - Tribonacci number