Launch notebook with ipython notebook --matplotlib=inline
numba 0.11.1 is compulsary - later versions are significantly different
This post is a personal exploration of the Texas Hold'em Poker game. It is based on SpecialK's very elegant, a bit lucky (see why below), and extremely efficient hand evaluator algorithm. I am using the IPython notebook as an experiment, and all the code is available in this github repo.
Here is a brief description of the algorithm.
But before I get into the details, I want to show the image it evoked in my mind when I first read it, as elegant and fast as it immediately appeared.
The objective is to evaluate the rank of a full 7 card poker hand: 2 card in the hand, 5 on the board. Each card has a face, from 0 to 12 (from 2 to Ace), and a suit, from 0 to 3 (spades, hearts, diamonds, clubs). Each face and suit is associated to 2 integers, the first one will be used if the hand is not a flush (card face key), the second if the hand is a flush (card flush key). Each suit is associated to an integer (suit key).
Two sets of keys are created for a five card hand for a seven card hand
For a five card hand: The suit keys are such that any 2 sums of different 7 suit keys (corresponding to the 7 cards of the evaluated hand) are different. The face keys are such that any 2 sums of 5 face keys, each present at most 4 times in the sum, are different. The flush keys are such that any 2 sums of 5 flush keys, each present at most once in the sum, are different.
For a seven card hand: The same suit keys as determined for five card hands are used. The face keys are such that any 2 sums of 7 face keys, each present at most 4 times in the sum, are different. The flush keys are such that any 2 sums of 5 to 7 flush keys, each present at most once in the sum, are different.
ough so there is no need to explore further.
For suit keys, an exhaustive search is feasible. The best result is [0, 1, 29, 37] for [spades, hearts, diamonds, clubs]. The flush keys and face keys are determined by brute force, through a lexicographic search: Solve for n keys, then start from this solution to try and determine the solution for n+1 keys, all the way up to 13. It takes less than a few seconds to find valid keys in most cases, but several hours to find valid face keys for seven card hands. The keys found by this method are small enough (see below) so there is no need to dig further.
The properties of the suit/face/flush keys are such that a 5 (resp. 7) card hand can be uniquely associated to a set of 2 keys, (i) the sum of the suit keys, (ii-1) the sum of the face keys if the hand is not a flush or straight flush, or (ii-2) the sum of the flush keys if the hand is a flush or straight flush.
Next step is to construct lookup tables for five hand hand cards: table of hand ranks (except flush and straight flush) indexed by sum of face keys. table of hand ranks (exclusively flush and straight flush) indexed by sum of flush keys.
and for seven hand cards: table of flush suit indexed by sum of suit keys. table of hand ranks (except flush and straight flush) indexed by sum of face keys. table of hand ranks (exclusively flush and straight flush) indexed by sum of flush keys.
The five hand card tables are constructed by going through all possible 5 card hands.
Based on these precalculated tables, here is how the rank of a five card hand is determined. Add up all 5 suit keys. This sum, uniquely associated to this hand, enables to determine whether the hand is a flush or not. If it is not a flush, add up all 5 face keys. By construction, the sum of the 5 face keys is uniquely associated to this hand. Look up the sum of the face keys in the face key table and read the hand rank. If it is a flush, add up all 5 flush keys. By construction, the sum of the flush keys is uniquely associated to this hand. Look up the sum of the flush keys in the flush key table and read the hand rank.
The seven hand card tables are constructed by going through all possible 7 card hands, and evaluating each as the best among the 21 candidate 5 card hands, using the five card hand evaluator thus created.
Based on these precalculated tables, here is how the rank of a seven card hand is determined. Add up all 7 suit keys. This sum, uniquely associated to this hand, enables to determine whether the hand is a flush or not. If it is not a flush, add up all 7 face keys. By construction, the sum of the 7 face keys is uniquely associated to this hand. Look up the sum of the face keys in the face key table and read the hand rank. If it is a flush, lookup the flush suit, then add up all flush keys (5, 6, or 7 depending of how many cards are the same suit). By construction, the sum of the flush keys is uniquely associated to this hand. Look up the sum of the flush keys in the flush key table and read the hand rank.
After the construction phase, only the seven card hand evaluator will be used.
Let us now examine the keys.
2^8 < 7 x the largest suit key < 2^9
2^22 < 4 x the largest face key + 3 x the second largest face key < 2^23
(The flush keys are significantly smaller than the face key, which is unsurprising given the much stronger constraint applied to the face keys)
So it is possible to encode and process a seven card hand within 32 bits i.e. a machine integer.
There is the magic and the unbelievable luck of this algorithm, I would even say the divine touch !
Practically, the information for each card is encoded as follows: the card face key contains the suit key (9 less significant bits) and the face key (24 most significant bits) the card flush key is the flush key
Based on this convention, we can clarify further how the seven card hand evaluator works: (1) sum the 7 card keys (2) extract the sum of the suit keys using a bit mask/shift (3) if the hand is not a flush, extract sum of the face keys using a bit mask/shift then lookup the hand rank (4) if the hand is a flush, determine which card participate in the flush, and add the card flush key of all participants then lookup the hand rank
The algo is very efficient: If the hand is not a flush (97.11% probability) the method determines the hand rank in 6 additions, 2 bit mask/shift, and 2 table lookups. If the hand is a flush (2.88% probability), add 7 tests and 7 additions. All operations on 32 bit integers (NB: unsigned for the hand key to negative sum).
The flip side of this speed is that the algo needs a large table, essentially empty, in memory: 0.63% of the face rank table is non zero. But is can be store efficiently as a sparse array.
Below I compute and display the main results.
Read the code to delve into the details
import os
import numpy as np
import pandas as pd
import matplotlib.pylab as pl
import itertools as it
import cPickle
from jinja2 import Environment
from IPython.display import HTML, display, Math
from timeit import default_timer as timer
import EvalKeys as keys
import EvalFive, EvalSeven, EvalGenerateHands, EvalAnalysis
np.set_printoptions(linewidth=250)
np.set_printoptions(threshold=100)
np.set_printoptions(edgeitems=5)
np.set_printoptions(suppress=True)
pd.set_option('display.height', 100)
pd.set_option('display.max_rows', 300)
pd.set_option('display.max_columns', 200)
pd.set_option('display.width', 500)
# pd.describe_option('display')
keys.show_keys()
EvalFive.compute_tables()
EvalFive.test()
EvalSeven.compute_tables()
EvalSeven.test()
Below is the representation of the 52 deck cards.
import PIL
card_files = [[os.path.join('Cards','Full', str(1+s+4*f)+'.png')
for f in range(13)]
for s in range(4)]
jinja_template = """
<table style="border:0px solid black;">
{% for row in array %}
<tr style="border:0px solid black;">
{% for col in row %}
<td style="background-color:white;
border-collapse:collapse;
text-align:center;
border:0px solid black;">
<img border="10" src="{{ col }}" alt="N/A">
</td>
{% endfor %}
</tr>
{% endfor %}
</table>"""
HTML_content = Environment().from_string(jinja_template).render(array=card_files)
HTML(HTML_content)
Below are all possible five card hands stored in a dataframe.
t0 = timer()
df_hand_five = EvalAnalysis.create_df_hand_five()
t1 = timer()
print 'df_hand_five time = {:4.2f} s'.format(t1-t0)
HTML('''<style>
.df tbody tr:last-child { background-color: #FFAAAA; }
.df { font-size: 12px; }
</style>''' + df_hand_five.to_html(classes='df'))
Below are the exact formulae for the number of five card hands by type. See wikipedia for the calculations.
for k, h in enumerate(EvalAnalysis.nb_occurence_formula_hand_five):
print '{} : '.format(h[0])
display(Math(h[1]))
print
Below are all the possible seven card hands stored in a dataframe.
Somewhat counterintuitively, there are less than five hand cards.
On closer examination, this is explained by the fact that the 2 extra cards cannot be ignored and tend to increase its value.
t0 = timer()
df_hand_seven = EvalAnalysis.create_df_hand_seven()
t1 = timer()
print 'df_hand_seven time = {:4.2f} s'.format(t1-t0)
HTML('''<style>
.df tbody tr:last-child { background-color: #FFAAAA; }
.df tbody { font-size: 12px; }
</style>''' + df_hand_seven.to_html(classes='df'))
Below are the exact formulae for the number of seven hands by type. See wikipedia for the calculations.
for k, h in enumerate(EvalAnalysis.nb_occurence_formula_hand_seven):
print '{} : '.format(h[0])
display(Math(h[1]))
print
There are 169 different possible pre flop hands: C(13,2) = 78 suited hands C(13,2) = 78 offsuited hands, excluding pairs 13 pairs
face = EvalAnalysis.face_symbol
choice = EvalAnalysis.choice
all_preflop_hands = EvalAnalysis.all_preflop_hands
pairs = [[{'hand' : face[i]+face[j]+choice(i, j, ['o', 's', 'o']),
'color' : choice(i, j, ['#8dd3c7', '#ffffb3', '#bebada'])}
for i in range(len(face))[::-1]]
for j in range(len(face))[::-1]]
jinja_template = """
<table>
{% for row in array %}
<tr>
{% for col in row %}
<td style="background-color:{{ col.color }};
color:black;
border-collapse:collapse;
text-align:center;
font-family:Arial;
border:1px solid black;">
{{ col.hand }}</td>
{% endfor %}
</tr>
{% endfor %}
</table>"""
HTML_content = Environment().from_string(jinja_template).render(array=pairs)
print 'nb of preflop hands = {}'.format(len(all_preflop_hands))
HTML(HTML_content)
For a given pre flop hand, there are C(50,5) = 2,118,760 different tables. Assuming each set of table cards is equally likely, i.e. neglecting the other players' cards, we compute the distribution of hand ranks for a given pre flop hand.
df_preflop_hand_distrib = EvalAnalysis.create_df_preflop_hand_distrib()
Below is displayed a fraction of the distribution of hand ranks per preflop hand.
#df_preflop_hand_distrib = pd.read_pickle(os.path.join('Tables', 'df_preflop_hand_distrib.pd'))
HTML('''<style>
.df1 { font-size: 11px; }
</style>''' + df_preflop_hand_distrib.ix[:, :30].tail(5).to_html(classes='df1'))
Below is plotted the distribution of hand ranks for one particular preflop hand, compared with the native distribution (all preflop hands being equally probable).
hand = ('A5s')
idx = EvalAnalysis.preflop_hand_str_order(hand, return_index=True)
s = df_preflop_hand_distrib.ix[:, idx]
s_ref = df_preflop_hand_distrib.ix[:, 0]
s_ref = s_ref*1.0*s.sum()/s_ref.sum()
# exact plot
fig=pl.figure(figsize=(10, 8))
s_ref.plot(color="#6495ED", alpha=0.3)
s.plot(color="#F08080", alpha=0.5)
pl.title('Distribution of hand ranks for a preflop hand')
pl.xlabel('Hand Rank')
pl.ylabel('Nb of Hands')
pl.legend()
pl.show()
# plot by hand type buckets
bins = np.r_[df_hand_five['MinRank'].values[:-1][::-1],
df_hand_five['MaxRank'].values[0]+1]
s_bin = np.zeros([bins.size-1], dtype=np.float32)
for k in range(bins.size-1):
s_bin[k] = s[bins[k]:bins[k+1]].sum()
s_bin = pd.Series(s_bin, index=df_hand_five['HandType'][:-1][::-1], name=s.name)
s_ref_bin = np.zeros([bins.size-1], dtype=np.float32)
for k in range(bins.size-1):
s_ref_bin[k] = s_ref[bins[k]:bins[k+1]].sum()
s_ref_bin = pd.Series(s_ref_bin, index=df_hand_five['HandType'][:-1][::-1], name=s_ref.name)
fig=pl.figure(figsize=(10, 8))
s_ref_bin.plot(kind='bar', color="#6495ED", alpha=0.3)
s_bin.plot(kind='bar', color="#F08080", alpha=0.5)
pl.title('Distribution of hand ranks for a preflop hand')
pl.ylabel('Nb of Hands')
pl.legend()
pl.show()
# plot by same sized hand rank buckets
bins_1 = np.array(np.linspace(1, 7462, 50), dtype=np.int32)
s_bin_1 = np.zeros([bins_1.size-1], dtype=np.float32)
for k in range(bins_1.size-1):
s_bin_1[k] = s[bins_1[k]:bins_1[k+1]].sum()
s_bin_1 = pd.Series(s_bin_1, name=s.name)
s_ref_bin_1 = np.zeros([bins_1.size-1], dtype=np.float32)
for k in range(bins_1.size-1):
s_ref_bin_1[k] = s_ref[bins_1[k]:bins_1[k+1]].sum()
s_ref_bin_1 = pd.Series(s_ref_bin_1, name=s_ref.name)
fig=pl.figure(figsize=(10, 8))
s_ref_bin_1.plot(kind='bar', color="#6495ED", alpha=0.3)
s_bin_1.plot(kind='bar', color="#F08080", alpha=0.5)
pl.title('Distribution of hand ranks for a preflop hand')
pl.xlabel('Hand Rank Bucket Nb')
pl.ylabel('Nb of Hands')
pl.legend()
pl.show()
Below is the visualisation of pre flop hand rank distributions for all preflop hands.
Here is the native page.
(iframes inside the notebook often don't look as good as in native pages)
from IPython.display import HTML
HTML("""<div class="wrapper" style="height: 850px; overflow: hidden; padding: 0; text-align: center; width: 850px;">
<iframe scrolling="no" src="http://oscar6echo.github.io/Poker2/viz/one_preflop_hand/"
style="-moz-transform-origin: 0 0;
-moz-transform: scale(1.0);
-o-transform-origin: 0 0;
-o-transform: scale(1.0);
-webkit-transform-origin: 0 0;
-webkit-transform: scale(1.0);
border: 0px black solid;
height: 850px;
overflow: hidden;
width: 850px;
zoom: 1.0;">
</iframe>
</div>'""")
Now, instead of considering a single pre flop hand, we compare all pairs of pre flop hands.
First we determine all possible pairs.
Provisionally there are C(169,2)+169 = 14,365 possible pairs. But on closer inspection, for each such pair, there may be several combinations of suits in both hands. Specifically, the combinations of suits depends on whether each hand is suited/offsuited, a pair/not a pair. The exhaustive range of combinations is in the tables below.
all_preflop_two_hands = EvalAnalysis.all_preflop_two_hands
n = len(all_preflop_hands)
print 'nb of preflop two hands = {}'.format(EvalAnalysis.C(n, 2)+n)
(all_preflop_two_hands[:10], all_preflop_two_hands[-10:])
pfh_combin = EvalAnalysis.pfh_combin
suit_combin = EvalAnalysis.suit_combin
suit_combin_freq = EvalAnalysis.suit_combin_freq
suit_symbol_html = EvalAnalysis.suit_symbol_html
top_row = ['']+range(8)
rng_row = range(len(pfh_combin))
rng_col = range(max([len(e) for e in suit_combin]))
jinja_template = """
<table style="border:0px;">
<tr style="border:0px; border-bottom:1px solid #999;">
<td style="border-width:0px; border-right:1px solid #999;"></td>
{% for col in rngcol %}
<td style="border:0px; text-align:center;">{{ col+1 }}</td>
{% endfor %}
</tr>
{% for row in pfhcombin %}
{% set outer_loop = loop %}
<tr style="border:0px;">
<td style="border:0px; border-right:1px solid #999; width:260px; font-size:90%;">{{row}}</td>
{% for col in suitcombin[outer_loop.index0] %}
<td style="border:0px; width:60px; text-align:center; font-size:90%">
{{freq[outer_loop.index0][loop.index0]}}x[{{symbol[col[0]]}}{{symbol[col[1]]}}-{{symbol[col[2]]}}{{symbol[col[3]]}}]
</td>
{% endfor %}
</tr>
{% endfor %}
</table>"""
HTML_content = Environment().from_string(jinja_template).render(toprow=top_row,
pfhcombin=pfh_combin,
suitcombin = suit_combin,
freq = suit_combin_freq,
symbol=suit_symbol_html,
rngrow=rng_row,
rngcol=rng_col)
HTML(HTML_content)
The number of suit combinations ranges from 2 to 7 depending on the pair characteristics (suited/offsuited, pair/no pair) considered. Note that these combinations represent the maximum diversity in terms of suit possibilities for a pre flop hand pair. In several cases these combinations collapse to a narrower range of possibilities. For example, 98o vs. 98o are two offsuited hands, and none is a pair. The table mentions 7 suit combinations, but because the faces overlap, 3 of these combinations disappear, or rather merge into the 4 remaining.
Below are displayed the exact statistics for the pair of preflop hands {98o, 98o}.
# all_preflop_two_hand_equity = create_all_preflop_two_hand_equity(verbose=False)
all_preflop_two_hand_equity = cPickle.load(
open(os.path.join('Tables','all_preflop_two_hand_equity_full.pk'), 'rb'))
pair = ('89o', '98o')
pair_ordered = EvalAnalysis.preflop_two_hand_str_order(pair, return_index=False)
EvalAnalysis.preflop_two_hand_equity(pair_ordered, verbose=True);
Going though all pairs of pre flop hands and suit combinations, we can determine the exact number of distinct (in terms of hand ranks) pairs of pre flop hands: 47,086. Which means the average number of suits combination for each pair of pre flop hands is 47086/14365~3.28.
Note that this takes several hours.
print 'number of distinct (rankwise) pairs of preflop hands = {}'.format(
np.array([len(e) for e in all_preflop_two_hand_equity]).sum())
(all_preflop_two_hand_equity[:3], all_preflop_two_hand_equity[-3:])
Below are plotted the exact statistics for a pair of preflop hands in all its possible suit combinations.
no_to_char = EvalAnalysis.one_hand_no_to_char
str_to_no = EvalAnalysis.hand_str_to_no
all_preflop_two_hands = EvalAnalysis.all_preflop_two_hands
pair = ('A5o', 'JJo')
p = EvalAnalysis.preflop_two_hand_str_order(pair, return_index=False)
idx = all_preflop_two_hands.index(p)
h1, h2 = p
for k in range(len(all_preflop_two_hand_equity[idx])):
arr = all_preflop_two_hand_equity[idx][k, :3]
arr = np.array(1.0*arr/arr.sum(), dtype=np.float32)
freq = all_preflop_two_hand_equity[idx][:, 3]
freq = np.array(1.0*freq/freq.sum(), dtype=np.float32)
s = pd.Series(arr, index=['A Wins', 'Ties', 'B Wins'])
print 'suit combination #{}'.format(1+k)
print 'Hand A / Hand B = ', no_to_char(str_to_no(p[0]))+' - '+no_to_char(str_to_no(p[1]))
print 'frequency = {:5.3f} %'.format(100.0*freq[k])
fig=pl.figure(figsize=(5, 4))
s.plot(kind='bar', color="#6495ED", alpha=0.8)
for q in range(3):
pl.text(0.5+q, max(0.1, min(s[q]+0.05, 0.9)), '{:4.2f}%'.format(100*s[q]), fontsize=10)
pl.ylim(0, 1)
pl.show()
After an exhaustive comparison of all pre flop hands, we create an overview of the equity distribution for all pairs of pre flop hands, for better visualisation. In order to do so we aggregate all suit combinations and keep the frequency weighted average of the equity distribution.
t0 = timer()
df_equity_two_hands = EvalAnalysis.create_all_preflop_two_hand_equity_aggregate(
all_preflop_two_hand_equity,
save=True)
t1 = timer()
print 'df_equity_two_hands time = {:6.4f} s'.format(t1-t0)
df_equity_two_hands.head(10)
df_equity_two_hands.tail(10)
print 'average of all suit combinations weighted by frequency'
print 'Hand A / Hand B = {}'.format(p[0]+' - '+p[1])
s = pd.Series(data=df_equity_two_hands.ix[idx][4:].values, index=['A Wins', 'Ties', 'B Wins'])
fig=pl.figure(figsize=(5, 4))
s.plot(kind='bar', color="#6495ED", alpha=0.8)
for q in range(3):
pl.text(0.5+q, max(0.1, min(s[q]+0.05, 0.9)), '{:4.2f}%'.format(100*s[q]), fontsize=10)
pl.ylim(0, 1)
pl.show()
Below is the visualisation of preflop hand pairs equity distributions for all preflop hand pairs.
Here is the native page.
from IPython.display import HTML
HTML("""<div class="wrapper" style="height: 1200px; overflow: hidden; padding: 0; text-align: center; width: 850px;">
<iframe scrolling="no" src="http://oscar6echo.github.io/Poker2/viz/two_preflop_hand/"
style="-moz-transform-origin: 0 0;
-moz-transform: scale(1.0);
-o-transform-origin: 0 0;
-o-transform: scale(1.0);
-webkit-transform-origin: 0 0;
-webkit-transform: scale(1.0);
border: 0px black solid;
height: 1200px;
overflow: hidden;
width: 850px;
zoom: 1.0;">
</iframe>
</div>'""")
Computing the strenth of preflop hands when there are more than 2 players (almost always the case in practise) is intractable as the number of different games to evaluate grows to gigantic proportions very quickly.
Below is the exact number, p being the number of opponents (from 1 to 9).
Math(r"""\binom{52-2(p+1))}{5}\prod_{i=1}^{p}\binom{52-2i}{2}""")
Fortunately the monte carlo convergence is fast and 100,000 to 200,000 random games seem enough to give the odds of the preflop hands with a 0.10% precision (more than enough in any practical context), meaning running the simulation several times generally gives the same result within the precision. Note that is only an observation and has not theoretical basis whatsoever. Anyway this fact is convenient and makes possible a practical the online odd evaluator (see further down).
Now in order to compute the table of preflop hand equity per number of opponents, I draw 300 million random games for each preflophand.
The black line is the total equity, the gray line is the total equity minus the contribution of ties to the total equity.
Below is the visualisation of the table of preflop hand equity per number of opponents.
Here is the native page.
from IPython.display import HTML
HTML("""<div class="wrapper" style="height: 1000px; overflow: hidden; padding: 0; text-align: center; width: 750px;">
<iframe scrolling="no" src="http://oscar6echo.github.io/Poker2/viz/one_preflop_hand_montecarlo/"
style="-moz-transform-origin: 0 0;
-moz-transform: scale(1.0);
-o-transform-origin: 0 0;
-o-transform: scale(1.0);
-webkit-transform-origin: 0 0;
-webkit-transform: scale(1.0);
border: 0px black solid;
height: 1000px;
overflow: hidden;
width: 750px;
zoom: 1.0;">
</iframe>
</div>'""")
Below is a Texas Hold'em odd calculator implementing in javascript the algorithm prototyped in Python in this notebook. Javascript has become fast enough to run this evaluator.
The evaluation is exhaustive if all players cards are known, and monte carlo if only the main player's cards are known. In the latter case other player cards are randomly drawn just like the table cards. If one card of the other player cards shows then the monte carlo simulation assumes this card is determined. This is to take into account the main player 'guess' if he has one.
Here is the native page.
from IPython.display import HTML
HTML("""<div class="wrapper" style="height: 800px; overflow: hidden; padding: 0; text-align: center; width: 750px;">
<iframe scrolling="no" src="http://oscar6echo.github.io/Poker2/viz/game/"
style="-moz-transform-origin: 0 0;
-moz-transform: scale(1.0);
-o-transform-origin: 0 0;
-o-transform: scale(1.0);
-webkit-transform-origin: 0 0;
-webkit-transform: scale(1.0);
border: 0px black solid;
height: 800px;
overflow: hidden;
width: 750px;
zoom: 1.0;">
</iframe>
</div>'""")
Et voilà !
I did this summary partly to experiment with the IPython notebook and I have to say I'm impressed.
It is fantastically convenient. Notebooks existed before IPython, but (it seems to me) the real stroke of genius was the design decision to build it on top of modern browsers, instead of starting from scratch again. In other words, to stand on the shoulders of giants...