
Using Probabilistic Knowledge And Simulation To Play Poker (Darse Billings ++ 1999 …)

Using Probabilistic Knowledge And Simulation To Play Poker (Darse Billings ++ 1999 …). Presented by Brett Borghetti, 7 Jan 2007. Contributions of the work: a new betting strategy using probability that propagates a “Probability Triple” knowledge representation, <P(fold), P(call), P(raise)>.


Presentation Transcript


  1. Using Probabilistic Knowledge And Simulation To Play Poker (Darse Billings++ 1999 …). Presented by Brett Borghetti, 7 Jan 2007

  2. Contributions of the work:
     • New betting strategy using probability:
       • Propagates a “Probability Triple” knowledge representation, <P(fold), P(call), P(raise)>
       • The triple is an atomic unit stating the likelihood of each action occurring in a given situation
     • Uses real-time simulations to generate a selective sample of the possible outcomes while a hand is in progress

  3. Old Loki (Loki-1)
     • Only carried the most likely action or the probability of playing the hand
     • Uses “expert knowledge”:
       • Initial tables of income rates
       • Initial weight probabilities of opponent hands (how likely they are to play with these cards)
       • Re-weighting rules for opponent model updates
       • Hand evaluator: strength and potential
       • Rule-based betting module

  4. New Loki (Loki-2)
     • All tables store probability triples
     • Propagating distributions allows distributed decision-making in components
     • Simulator estimates the expected value from a selective sample of the ways the hand might play out
     • Eliminates some of the required “expert knowledge”

  5. Probability Triples
     • Stores the probability of 3 actions
     • [f, c, r] such that f + c + r = 1.0
     • Used in 3 locations in Loki-2:
       • Triple Generator: evaluates 2-card hands in the current context
       • Opponent Modeler: updates the weight tables in the opponent model
       • Action Selector: chooses our next action, and can adjust the selection based on the desired play style
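The triple and its use in action selection can be illustrated with a minimal Python sketch. The names `Triple`, `normalize`, `select_action`, and `shift_style` are hypothetical, and the "play style" knob (moving part of the call probability into raise) is only one assumed way such an adjustment might work, not Loki-2's actual rule.

```python
import random
from typing import NamedTuple


class Triple(NamedTuple):
    """Probability triple [f, c, r] (hypothetical helper type)."""
    fold: float
    call: float
    raise_: float


def normalize(f: float, c: float, r: float) -> Triple:
    """Scale three non-negative scores so that f + c + r = 1.0."""
    total = f + c + r
    if total <= 0:
        raise ValueError("at least one score must be positive")
    return Triple(f / total, c / total, r / total)


def select_action(triple: Triple, rng) -> str:
    """Pick an action at random in proportion to the triple."""
    x = rng.random()  # uniform in [0, 1)
    if x < triple.fold:
        return "fold"
    if x < triple.fold + triple.call:
        return "call"
    return "raise"


def shift_style(triple: Triple, aggression: float) -> Triple:
    """Assumed style knob: move a fraction of P(call) into P(raise)."""
    moved = triple.call * aggression
    return Triple(triple.fold, triple.call - moved, triple.raise_ + moved)


if __name__ == "__main__":
    t = normalize(1, 2, 1)
    print(t, select_action(t, random.Random(7)))
    print(shift_style(t, 0.5))
```

Because the action is sampled rather than always taking the most likely entry, the same situation can produce different plays, which is what makes the selector hard to predict.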

  6. Simulation-Based Betting Strategy
     • Calculates an approximate expected value (EV) of the return for each possible betting action
     • Since folding has EV = 0, only call and raise are considered from the current position, and the game tree is expanded from there
     • Since the entire game tree would be intractable to search, selective sampling is used
     • Simulated opponent actions are biased by their weight tables; a random number selects the actual action in each simulated hand
     • The authors claim this approach should be better than the static approach
       • [brett] That would depend on how accurately the weighting scheme detects the true behavior of the opponent
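The sampling idea can be sketched in Python under heavy simplifying assumptions: one opponent, a single remaining decision, hands reduced to a numeric strength, an opponent re-raise treated as a call, and ties counted as break-even. The functions `simulate_ev` and `best_action` and all their parameters are hypothetical; the real simulator plays each sampled hand out through the remaining betting rounds.

```python
import random


def simulate_ev(action, our_strength, pot, bet, opp_hands, opp_triple, trials, rng):
    """Estimate the EV of 'call' or 'raise' by sampling opponent hands.

    opp_hands:  list of (strength, weight) pairs, a stand-in for the weight table
    opp_triple: function strength -> (f, c, r), the opponent's response to a raise
    """
    strengths = [s for s, _ in opp_hands]
    weights = [w for _, w in opp_hands]
    total = 0.0
    for _ in range(trials):
        # Selective sampling: likelier holdings are drawn more often.
        opp = rng.choices(strengths, weights=weights, k=1)[0]
        if action == "raise":
            fold_p, _call_p, _raise_p = opp_triple(opp)
            if rng.random() < fold_p:
                total += pot  # opponent folds, we take the pot
                continue
            invested, winnings = 2 * bet, pot + bet
        else:  # call
            invested, winnings = bet, pot
        if our_strength > opp:
            total += winnings
        elif our_strength < opp:
            total -= invested
        # equal strength: treated as break-even in this toy model
    return total / trials


def best_action(our_strength, pot, bet, opp_hands, opp_triple, trials, rng):
    """Compare simulated EVs for call and raise against fold (EV = 0)."""
    evs = {a: simulate_ev(a, our_strength, pot, bet, opp_hands, opp_triple, trials, rng)
           for a in ("call", "raise")}
    evs["fold"] = 0.0
    return max(evs, key=evs.get), evs


if __name__ == "__main__":
    table = [(0.3, 5.0), (0.6, 3.0), (0.9, 1.0)]
    respond = lambda s: (0.6, 0.3, 0.1) if s < 0.5 else (0.1, 0.6, 0.3)
    print(best_action(0.7, 16.0, 4.0, table, respond, 10_000, random.Random(1)))
```

The presenter's caveat shows up directly here: the estimate is only as good as `opp_hands` and `opp_triple`, which stand in for the opponent model.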

  7. Comparing Performance
     • Single measurement: small bets per hand (sb/hand)
     • In a $10/20 game the small bet is $10, so an improvement of +0.10 sb/hand means winning an extra $1.00 per hand; over 30 hands that is an extra $30
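The conversion from sb/hand to dollars is simple arithmetic; a small sketch makes the units explicit. The function name `dollars_won` is hypothetical.

```python
def dollars_won(sb_per_hand: float, small_bet: float, hands: int) -> float:
    """Convert a win rate in small bets per hand into dollars over `hands` hands."""
    return sb_per_hand * small_bet * hands


if __name__ == "__main__":
    # $10/20 game: the small bet is $10.
    print(dollars_won(0.10, 10, 1))   # extra dollars per hand
    print(dollars_won(0.10, 10, 30))  # extra dollars over 30 hands
```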

  8. Experiments
     • Examines each change from Loki-1 separately:
       • R: changing the reweighting
       • B: changing from rule-based betting to the “action selector” with randomizing
       • S: incorporating the simulation to compute EV in the action selection decision

  9. Experiments (continued)
     • Self-play in a 10-seat game: components were added one at a time and performance compared
     • B~R, B<<S, R<<S
       • B alone and R alone are roughly equivalent and provide the least improvement; S alone provides the most
     • B+R+S > S

  10. Experiments (continued)
     • Player type comparisons in a 10-seat game
     • Number of hands played to the flop:
       • T = Tight
       • L = Loose
     • How frequently the player bets and raises after the flop:
       • A = Aggressive
       • C = Conservative

  11. Issues [Brett]
     • At the core of Loki-2 is the weighting system that models the opponent
     • Is this system flexible and adaptable to rapid changes in opponent strategy, or do the weights have some kind of inertia that prevents the model from incorporating changes as quickly as they happen?
     • Do the weight updates (belief updates) make sense?

  12. Background Information

  13. Texas Hold’em Heads-up Limit Poker Basics
     • 2 players
     • 4 betting rounds per hand: Preflop (2 hole cards), Flop (3 community cards), Turn (1 cc), River (1 cc)
     • Action set = {fold, call (check), raise (bet)}
     • Up to 3 raises allowed per round
     • A round is over when either:
       • All players are even in the pot via a final call and each player has had at least one opportunity to act [go on to next round]
       • One player folds [other player wins]

  14. Requirements for a World Class Poker Player
     • Able to assess:
       • Hand strength
       • Hand potential
       • Opponent’s betting strategy (opponent model)
     • Has a strong:
       • Betting strategy
       • Ability to play deceptively [bluff vs. slow play*]
       • Ability to play unpredictably

  15. Optimal vs Maximal play
     • Optimal player makes decisions based on game-theoretic probabilities without regard to specific context (opponent’s plays)
     • Maximal player takes into account the opponent’s sub-optimal tendencies and adjusts its play to exploit perceived weaknesses

  16. Hand Assessment (Hand Strength = HS)
     • Pre-flop HS is determined from the “income rate” of each of the 169 equivalence classes, estimated from 1M simulated poker hands
     • Flop HS is determined by comparing each of the 1081 possible opponent hands with ours and counting how many wins each player has
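Both counts on this slide can be checked by enumeration. The sketch below is hypothetical helper code, not Loki's evaluator: it groups the two-card starting hands into classes (pairs, suited, offsuit) and lists the opponent holdings left once our two hole cards and the three flop cards are known. The `hand_strength` ratio, with ties counted as half a win, is an assumed convention for turning the win counts into one number.

```python
from itertools import combinations
from math import comb

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [r + s for r in RANKS for s in SUITS]


def preflop_class(card_a: str, card_b: str) -> str:
    """Map two hole cards to a class label such as 'AKs', 'AKo', or 'TT'."""
    hi, lo = sorted((card_a, card_b), key=lambda c: RANKS.index(c[0]), reverse=True)
    if hi[0] == lo[0]:
        return hi[0] + lo[0]
    return hi[0] + lo[0] + ("s" if hi[1] == lo[1] else "o")


def opponent_holdings(known_cards):
    """All two-card hands the opponent could hold given the cards we can see."""
    remaining = [c for c in DECK if c not in known_cards]
    return list(combinations(remaining, 2))


def hand_strength(ahead: int, tied: int, behind: int) -> float:
    """Fraction of opponent holdings we beat; ties count half (assumed convention)."""
    return (ahead + tied / 2) / (ahead + tied + behind)


if __name__ == "__main__":
    classes = {preflop_class(a, b) for a, b in combinations(DECK, 2)}
    print(len(classes))  # number of preflop equivalence classes
    seen = ["As", "Kd", "2c", "7h", "Td"]  # our hole cards plus the flop
    print(len(opponent_holdings(seen)), comb(47, 2))
```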

  17. Hand Potential (HP) at the Flop
     • PPot1 = likelihood that our hand will improve with one card (the turn card)
     • PPot2 = likelihood that our hand will improve with two cards (turn and river)
     • NPot1 and NPot2 = the equivalent likelihoods that our opponent’s hand will become better than ours on the turn and/or river

  18. Effective Hand Strength & Pot Odds
     • EHS = HSn + (1 - HSn) × PPotn
       • The chance that we either are ahead now or could pull ahead by the end of n = 1 or n = 2 cards from now
     • Pot odds: compare P(win) with the expected return from the pot
       • Example: if your chance of winning is 25%, you would call a $4 bet to win a $16 pot, because the call is 4/(16 + 4) = 0.20 of the final pot and 0.25 > 0.20. Your expected return is 0.25 × $20 = $5 for every $4 you pay, an expected net gain of $1.00 per play.
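The two formulas on this slide translate directly into a few lines of Python. The function names are hypothetical; `pot` means the money already in the pot before the call, as in the $16 example.

```python
def effective_hand_strength(hs: float, ppot: float) -> float:
    """EHS = HS + (1 - HS) * PPot: ahead now, or behind now but improving."""
    return hs + (1.0 - hs) * ppot


def pot_odds(pot: float, to_call: float) -> float:
    """Share of the final pot that the call represents."""
    return to_call / (pot + to_call)


def call_ev(p_win: float, pot: float, to_call: float) -> float:
    """Expected net gain of calling: win the final pot with p_win, pay to_call."""
    return p_win * (pot + to_call) - to_call


if __name__ == "__main__":
    print(pot_odds(16, 4))       # break-even winning chance for the slide's example
    print(call_ev(0.25, 16, 4))  # expected net gain per call in dollars
    print(effective_hand_strength(0.5, 0.2))
```

Calling is worthwhile when P(win) exceeds `pot_odds`, which is the same condition as `call_ev` being positive.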

  19. Opponent Modeling
     • Uses an initial weighting scheme based on the original income rates of the 169 preflop card equivalence classes
     • Updates the weights generically on each hand, based on the betting used during that hand
     • Updates the weights specifically, based on the total betting history over all hands with this opponent
     • Weight updates are based on the mean and variance of call vs. raise vs. fold actions

  20. Using the opponent model
     • Calculate a new weight for all possible starting card combos (1081) of the opponent, based on initial weights, HS, EHS and opponent actions (generic and specific)
     • The weights for each possible hole card tuple provide an ordering over the possible hands
     • Usually greatly reduces the uncertainty about which hands the opponent is playing… assuming they are not playing deceptively
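One way to picture the re-weighting is the following Python sketch. The update rule used here, multiplying each hand's weight by the probability that the hand would have produced the observed action, is an assumption chosen for illustration and is not taken from the slides; `reweight`, `ranked`, and the example triples are hypothetical.

```python
ACTION_INDEX = {"fold": 0, "call": 1, "raise": 2}


def reweight(weights, action, triple_for):
    """Return new weights after observing `action`.

    weights:    dict mapping a hole-card label to its current weight
    triple_for: function label -> (f, c, r) for that hand in the current context
    """
    i = ACTION_INDEX[action]
    return {hand: w * triple_for(hand)[i] for hand, w in weights.items()}


def ranked(weights):
    """Order hands from most to least likely under the current weights."""
    return sorted(weights, key=weights.get, reverse=True)


if __name__ == "__main__":
    weights = {"AA": 1.0, "T9s": 1.0, "72o": 1.0}
    triples = {"AA": (0.0, 0.2, 0.8), "T9s": (0.2, 0.6, 0.2), "72o": (0.8, 0.15, 0.05)}
    after_raise = reweight(weights, "raise", triples.get)
    print(ranked(after_raise), after_raise)
```

A deceptive opponent breaks the assumption behind `triple_for`, which is the caveat in the last bullet above.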
