Problem 54: Poker Hands
View on Project EulerProject Euler Problem 54 Solution
EulerSolve provides an optimized solution for Project Euler Problem 54, Poker Hands, with C++, Python, Java, and a step-by-step mathematical explanation.
Problem Summary Each line of the dataset contains ten cards. The first five belong to Player 1, the next five belong to Player 2. The task is to evaluate both 5-card poker hands under the standard ranking rules and count how many lines are won by Player 1. This is not a search problem and it is not a probability problem. The whole job is to turn each hand into a mathematically ordered descriptor and then compare the two descriptors. Mathematical Approach For a hand \(H=\{(r_i,s_i)\}_{i=1}^5\), the relevant data are the five ranks \(r_i\in\{2,\dots,14\}\), the suit pattern, and the multiplicities of equal ranks. Once those objects are extracted, poker comparison becomes a total lexicographic order. Ranks, suits, and multiplicity pattern Write the sorted ranks as \(a_1\le a_2\le a_3\le a_4\le a_5\). Let \(m(v)\) denote the number of cards of rank \(v\). The multiplicity pattern must be one of the partitions of 5: $4+1,\quad 3+2,\quad 3+1+1,\quad 2+2+1,\quad 2+1+1+1,\quad 1+1+1+1+1.$ These patterns already identify all frequency-based hand types: four of a kind, full house, three of a kind, two pairs, one pair, and high card....
Detailed mathematical approach
Problem Summary
Each line of the dataset contains ten cards. The first five belong to Player 1, the next five belong to Player 2. The task is to evaluate both 5-card poker hands under the standard ranking rules and count how many lines are won by Player 1.
This is not a search problem and it is not a probability problem. The whole job is to turn each hand into a mathematically ordered descriptor and then compare the two descriptors.
Mathematical Approach
For a hand \(H=\{(r_i,s_i)\}_{i=1}^5\), the relevant data are the five ranks \(r_i\in\{2,\dots,14\}\), the suit pattern, and the multiplicities of equal ranks. Once those objects are extracted, poker comparison becomes a total lexicographic order.
Ranks, suits, and multiplicity pattern
Write the sorted ranks as \(a_1\le a_2\le a_3\le a_4\le a_5\). Let \(m(v)\) denote the number of cards of rank \(v\). The multiplicity pattern must be one of the partitions of 5:
$4+1,\quad 3+2,\quad 3+1+1,\quad 2+2+1,\quad 2+1+1+1,\quad 1+1+1+1+1.$
These patterns already identify all frequency-based hand types: four of a kind, full house, three of a kind, two pairs, one pair, and high card. Two further structural tests are needed:
$\operatorname{flush}(H)\iff s_1=s_2=s_3=s_4=s_5,$
$\operatorname{straight}(H)\iff a_{i+1}=a_i+1\text{ for }i=1,2,3,4.$
An ace-high straight flush is simply the largest straight flush, so the implementations do not need a separate royal-flush category. The Python implementation also includes the usual normalization \(A,2,3,4,5\mapsto 1,2,3,4,5\) when detecting the special 5-high straight; the ranking framework is otherwise shared.
A canonical evaluation map
The key idea is to encode every hand as
$E(H)=(c(H),t(H)),$
where \(c(H)\) is a category number and \(t(H)\) is an ordered tie-break vector. The category order used by the implementations is
$8=\text{straight flush},\ 7=\text{four of a kind},\ 6=\text{full house},\ 5=\text{flush},\ 4=\text{straight},\ 3=\text{three of a kind},\ 2=\text{two pairs},\ 1=\text{one pair},\ 0=\text{high card}.$
Now collect the distinct ranks as pairs \((m(v),v)\) and sort these pairs first by descending multiplicity and then by descending rank. Let the sorted list be
$((c_1,v_1),(c_2,v_2),\dots).$
This ordering invariant is what makes the solution compact: the same sorted group list automatically places the most important repeated ranks before the less important ones, and among kickers it automatically uses descending card values.
With that convention, the evaluation map is
$E(H)= \begin{cases} (8,(a_5)) & \text{if }\operatorname{straight}(H)\land \operatorname{flush}(H),\\ (7,(v_1,v_2)) & \text{if }c_1=4,\\ (6,(v_1,v_2)) & \text{if }(c_1,c_2)=(3,2),\\ (5,(a_5,a_4,a_3,a_2,a_1)) & \text{if }\operatorname{flush}(H),\\ (4,(a_5)) & \text{if }\operatorname{straight}(H),\\ (3,(v_1,v_2,v_3)) & \text{if }c_1=3,\\ (2,(v_1,v_2,v_3)) & \text{if }c_1=c_2=2,\\ (1,(v_1,v_2,v_3,v_4)) & \text{if }c_1=2,\\ (0,(a_5,a_4,a_3,a_2,a_1)) & \text{otherwise.} \end{cases}$
For two pairs, this means \(v_1\) is the higher pair, \(v_2\) is the lower pair, and \(v_3\) is the kicker. For one pair, \(v_1\) is the pair and the remaining entries are the kickers from largest to smallest.
Why lexicographic comparison works
Once both hands have been mapped to \(E(H)\), comparison is reduced to ordinary lexicographic order:
$H_1\text{ beats }H_2\iff E(H_1)>_{\mathrm{lex}}E(H_2).$
This works because poker comparison is hierarchical. First compare the hand classes; if the classes tie, compare the most important repeated rank; if that still ties, continue through the remaining kickers in descending priority. The vector \(t(H)\) stores exactly that order of importance.
That is the main invariant used by the code. Instead of writing a different comparison routine for every hand type, the implementations normalize every hand into the same mathematical object: category plus ordered tie-break vector.
Worked examples
Take
$H_1=\{5H,5C,6S,7S,KD\},\qquad H_2=\{2C,3S,8S,8D,TD\}.$
Both hands are one pair, so both have category \(1\). Their evaluation vectors are
$E(H_1)=(1,(5,13,7,6)),\qquad E(H_2)=(1,(8,10,3,2)).$
The categories match, but the pair rank \(8\) is larger than \(5\), so the second hand wins immediately.
Now consider
$H_1=\{4D,6S,9H,QH,QC\},\qquad H_2=\{3D,6D,7H,QD,QS\}.$
Again both hands are one pair, and both pairs are queens. Their vectors are
$E(H_1)=(1,(12,9,6,4)),\qquad E(H_2)=(1,(12,7,6,3)).$
The pair value ties at \(12\), so the comparison moves to the largest kicker. Since \(9>7\), the first hand wins. This example shows why storing the kickers in the correct order is essential.
How the Code Works
Parsing the cards
The C++, Python, and Java implementations read one line at a time, split it into ten 2-character tokens, convert \(2,3,\dots,9,T,J,Q,K,A\) into integers \(2\) through \(14\), and separate the first five cards from the last five.
Evaluating one hand
For each hand, the implementation counts equal ranks, checks whether all suits are the same, sorts the ranks, and tests whether the sorted ranks are consecutive. It also builds the sorted multiplicity list \((m(v),v)\). Using these ingredients, it returns the category number together with the correctly ordered tie-break vector.
Comparing the players and counting wins
After both hands are evaluated, the implementation compares categories first and tie-break vectors second. If Player 1 has the larger evaluation, the running total is incremented. The dataset is constructed so that each line has a definite winner, so no extra tie-resolution layer is needed beyond the lexicographic comparison already described.
Complexity Analysis
Every line contains only two 5-card hands, so the per-line cost is constant. Sorting five ranks, counting frequencies of five cards, and checking for a straight or a flush are all \(O(1)\) operations with very small constants.
If there are \(N\) lines, the total running time is \(O(N)\) and the extra memory usage is \(O(1)\). For the given file of 1000 hands, the computation is effectively instantaneous.
Footnotes and References
- Problem page: Project Euler 54 - Poker Hands
- Poker hand rankings: Wikipedia - List of poker hands
- Poker probabilities: Wikipedia - Poker probability
- Lexicographic order: Wikipedia - Lexicographic order