
DNA Starts to Learn Poker David Harlan Wood 4 * Hong Bi 1 Steven O. Kimbrough 2 Dongjun Wu 3



Presentation Transcript


  1. DNA Starts to Learn Poker. David Harlan Wood (4)*, Hong Bi (1), Steven O. Kimbrough (2), Dongjun Wu (3), Junghuei Chen (1)*. Departments of (1) Chemistry & Biochemistry and (4) Computer & Information Sciences, University of Delaware; (2) The Wharton School, University of Pennsylvania; (3) Bennett S. LeBow College of Business, Drexel University.

  2. [Game tree: Player dealt an Ace] Deal: Ace. Player: Say Ace (adds $1). Dealer: Call (adds $1) or Fold. Outcome labels: Loses $1, Loses $2.

  3. [Game tree: Player dealt a 2] Deal: 2. Player: Say 2 or Say Ace (adds $1). Dealer: Fold or Call (adds $1). Outcome labels: Loses $1, Loses $1, Wins $2.

  4. [Game tree: Player dealt an Ace or a 2] Deal: Ace or 2. Player: Say Ace (adds $1), Say 2, or Say Ace (adds $1). Dealer: Call (adds $1) or Fold on each Say Ace branch. Outcome labels: Loses $1, Loses $1, Loses $1, Loses $2, Wins $2. OBJECTIVE: To Obtain Probabilistic Strategies. Each player wants to obtain a strategy for the game. A strategy prescribes an action in every possible situation, that is, at each node, raising as a function of the hand dealt.
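The idea of a probabilistic strategy can be illustrated with a small expected-payoff calculation. This is only a sketch under stated assumptions, not the authors' method: payoffs are read from the game tree as the dealer's winnings (fold: -$1; call against an Ace: -$2; call against a bluffed 2: +$2); the +$1 to the dealer when the player says 2 and the equal chance of each deal are assumptions; and the names `dealer_expected_payoff`, `p_bluff`, and `q_call` are hypothetical.

```python
def dealer_expected_payoff(p_bluff, q_call, p_ace=0.5):
    """Expected dealer winnings per game (assumed payoffs, see lead-in).

    p_bluff: probability the player says Ace when dealt a 2.
    q_call:  probability the dealer calls after the player says Ace.
    p_ace:   probability the deal is an Ace (assumed 0.5).
    """
    # Player holds an Ace and says Ace: dealer loses $2 on a call, $1 on a fold.
    ace_branch = q_call * (-2) + (1 - q_call) * (-1)
    # Player holds a 2 and bluffs: dealer wins $2 on a call, loses $1 on a fold.
    bluff_branch = q_call * 2 + (1 - q_call) * (-1)
    # Player holds a 2: says 2 (assumed +$1 to dealer) or bluffs.
    two_branch = (1 - p_bluff) * 1 + p_bluff * bluff_branch
    return p_ace * ace_branch + (1 - p_ace) * two_branch
```

For example, a player who always bluffs against a dealer who always folds costs the dealer $1 per game on average, which is the kind of weakness an adapting opponent could exploit.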

  5. [Diagram: Poker] Labels: Deals; New Dealer Strategies; New Player Strategies; Assemble; Play New Game.

  6. [Diagram: Learning] Labels: Recover & Cut Play Histories for Player’s & Dealer’s Strategies; Dealer’s Strategies; Player’s Strategies; Separate by Payoffs; Recover & Distribute Strategies. Dealer’s Adaptation: Programmable Selection of Recovered Dealer Strategies; Amplify, Mutate, Crossover. Player’s Adaptation: Programmable Selection of Recovered Player Strategies; Amplify, Mutate, Crossover.

  7. [Diagram: Learning and Poker combined] Poker labels: Deals; New Dealer Strategies; New Player Strategies; Assemble; Play New Game. Learning labels: Recover & Cut Play Histories for Player’s & Dealer’s Strategies; Dealer’s Strategies; Player’s Strategies; Separate by Payoffs; Recover & Distribute Strategies. Dealer’s Adaptation: Programmable Selection of Recovered Dealer Strategies; Amplify, Mutate, Crossover. Player’s Adaptation: Programmable Selection of Recovered Player Strategies; Amplify, Mutate, Crossover.

  8. [Diagram: the game tree of slide 4 shown with DNA encodings] Strand groups: Player’s Strategies; Dealer’s Strategies; deals (Dealt 2, Dealt A). Region labels: Say 2’, 2’, Say A’, SAY2’, A’, Fold’, Call’, FOLD’, Error, Stopper, R.E. 1, R.E. 2. Sequences from: Sakamoto et al., DNA4 (1997).

  9. Dealer’s Strategies

  10. Player’s Strategies

  11. Deals

  12. Two Strategies and a Deal Define a Game. [Diagram] Deal: Ace (Dealt A), with R.E. 2. Player’s Strategy regions: Say 2’, 2’, Say A’, SAY2’, Say A’, A’, Error, Fold’, R.E. 1. Dealer’s Strategy regions: Fold’, Say A’, Call’, FOLD’, R.E. 1, R.E. 2.

  13. Cut with R.E. 1 & R.E. 2 and Assemble a Game. [Diagram] Before cutting, the strands carry R.E. 1 and R.E. 2 sites. Assembled game: Deal (A); Player’s Strategy; Dealer’s Strategy. Region labels: Say A’, Call’, Fold’, FOLD’, Error, 2’, Say 2’, A’, SAY 2’.

  14. [Diagram: slides 12 and 13 combined] Two Strategies and a Deal Define a Game; Cut with R.E. 1 & R.E. 2 and Assemble a Game. Deal: Ace (Dealt A). Strands: Deal; Player’s Strategy; Dealer’s Strategy. Region labels: Say A’, Call’, Fold’, FOLD’, Error, 2’, Say 2’, A’, SAY 2’, R.E. 1, R.E. 2.

  15. [Diagram and gel image: assembled game] Strands: Deal; Player’s Strategy; Dealer’s Strategy. Region labels: Say 2’, 2’, Say A’, A’, Error, Fold’, Call’, FOLD’, A, SAY 2’. Oligonucleotides: S1 (74-mer), S2 (57-mer), S3 (48-mer), S4 (53-mer), L1 (25-mer), L2 (28-mer), L3 (28-mer). Gel lanes: S1, S2, S3, S4, R1, R2, M. Sizes marked: 232, 225, 200, 150, 100, 75, 50. R1: Ligation Reaction. R2: Purified Ligation Product.

  16. [Game tree: Player dealt an Ace] Deal: Ace. Player: Say Ace (adds $1) or Say 2. Dealer: Call (adds $1) or Fold. Outcome labels: Loses $1, Loses $2. Annotations: Player Says A; Dealer Folds; Dealer MIGHT Change to Call.

  17. [Diagram: play by strand extension, Player dealt an Ace] Strands: Deal; Player’s Strategy; Dealer’s Strategy. Steps: Player Says Ace: Extend (Say A) on the Player’s Strategy (A, Say A’, A’). Dealer Folds: Extend (Fold) on the Dealer’s Strategy (Say A, Fold’, Say A’). Dealer MIGHT Change to Call: Extend (Call) on the Dealer’s Strategy, with a Fold Preventer (Call’, FOLD’, Fold’, Error).

  18. [Diagram and gel image: extension products] Steps: Player Says Ace: Extend (Say A). Dealer Folds: Extend (Fold). Dealer MIGHT Change to Call: Extend (Call), with a Fold Preventer. Product sizes labeled: 232-mer, 247-mer, 262-mer, 282-mer. Size markers: 300, 275, 250, 225, 200.

  19. [Game tree: Player dealt a 2] Deal: 2. Player: Say 2 or Say Ace (adds $1). Dealer: Fold or Call (adds $1). Outcome labels: Loses $1, Loses $1, Wins $2. Annotations: Player Says 2; Player Changes to Say A (Block Say 2); Dealer Folds; Dealer Changes to Call.

  20. [Diagram: play by strand extension, Player dealt a 2] Strands: Deal; Player’s Strategy; Dealer’s Strategy. Steps: Player Says 2: Extend (Say 2) on the Player’s Strategy (2, 2’, Say 2’). Player MIGHT Change to Say Ace: Extend (Say A) on the Player’s Strategy (Say 2, Say A’, SAY 2’). Dealer Folds: Extend (Fold) on the Dealer’s Strategy (Say A, Say A’, Fold’). Dealer MIGHT Change to Call: Extend (Call) on the Dealer’s Strategy, with a Fold Preventer (Call’, FOLD’, Error, Fold’).

  21. [Game tree: Player dealt an Ace or a 2] Deal: Ace or 2. Player: Say Ace (adds $1), Say 2, or Say Ace (adds $1). Dealer: Call (adds $1) or Fold on each Say Ace branch. Outcome labels: Loses $1, Loses $1, Loses $1, Loses $2, Wins $2. Annotations, Player dealt a 2: Player Says 2; Player MIGHT Change to Say Ace; Dealer Folds; Dealer MIGHT Change to Call. Annotations, Player dealt an Ace: Player Says A; Dealer Folds; Dealer MIGHT Change to Call.

  22. [Diagram: Learning and Poker combined, as on slide 7] Poker labels: Deals; New Dealer Strategies; New Player Strategies; Assemble; Play New Game. Learning labels: Recover & Cut Play Histories for Player’s & Dealer’s Strategies; Dealer’s Strategies; Player’s Strategies; Separate by Payoffs; Recover & Distribute Strategies. Dealer’s Adaptation: Programmable Selection of Recovered Dealer Strategies; Amplify, Mutate, Crossover. Player’s Adaptation: Programmable Selection of Recovered Player Strategies; Amplify, Mutate, Crossover.

  23. [Diagram: Learning, Dealer’s Adaptation highlighted] Labels: Recover & Cut Play Histories for Player’s & Dealer’s Strategies; Dealer’s Strategies; Player’s Strategies; Separate by Payoffs; Recover & Distribute Strategies; Programmable Selection of Recovered Dealer Strategies; Amplify, Mutate, Crossover. • Strategies are returned grouped by outcomes: -$2, -$1, +$1, +$2. • Select the Dealer’s own preferred mix of strategies to be bred. • Breed by using PCR to restore population size, using a variable mutation rate. • Crossover by pairwise recombining of “change your mind” regions.
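The select, amplify, mutate, and crossover steps listed above resemble an ordinary evolutionary algorithm, which can be sketched in software. This is an in-silico analogy only, not the wet-lab protocol: a strategy is assumed to be a list of probabilities, one per “change your mind” region, and the function names (`select`, `amplify`, `crossover`), the mutation step size of 0.1, and the data layout are all hypothetical.

```python
import random

def select(groups, preferred_mix):
    """Pick strategies from payoff groups.

    groups: dict mapping payoff -> list of strategies with that outcome.
    preferred_mix: dict mapping payoff -> how many strategies to keep.
    """
    chosen = []
    for payoff, count in preferred_mix.items():
        chosen.extend(groups.get(payoff, [])[:count])
    return chosen

def amplify(survivors, size, mutation_rate, rng):
    """Copy survivors round-robin until the population has `size` members.

    Each probability is perturbed with chance `mutation_rate`
    (a stand-in for error-prone copying) and clamped to [0, 1].
    """
    population = []
    for i in range(size):
        parent = survivors[i % len(survivors)]
        child = [
            min(1.0, max(0.0, g + rng.uniform(-0.1, 0.1)))
            if rng.random() < mutation_rate else g
            for g in parent
        ]
        population.append(child)
    return population

def crossover(a, b, rng):
    """Swap the tails of two strategies at a random cut point."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]
```

A caller would group strategies by payoff after each round, call `select` with its own preferred mix, then `amplify` and `crossover` to produce the next population. Passing a seeded `random.Random` keeps runs repeatable.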

  24. [Game tree: Player dealt an Ace or a 2, as on slide 4] Deal: Ace or 2. Player: Say Ace (adds $1), Say 2, or Say Ace (adds $1). Dealer: Call (adds $1) or Fold on each Say Ace branch. Outcome labels: Loses $1, Loses $1, Loses $1, Loses $2, Wins $2. OBJECTIVE: To Obtain Probabilistic Strategies. Each player wants to obtain a strategy for the game. A strategy prescribes an action in every possible situation, that is, at each node, raising as a function of the hand dealt.

  25. Complexity. Our complexity is linear in the number of nodes in the tree, where # nodes in tree = 2^(players + betting rounds). At each node, we need a probability distribution giving “level of bet” as a function of “dealt hand”. For us, the probability distribution is replaced by probabilistic hybridization of the DNA-encoded “dealt hand” to an adapting “change your mind about folding” region of the strategy. The output (if generated) is an adapting “level of bet” region of the strategy. [Diagram labels: Extend; next hand; bet; next hand’; bet’; next’; bet generator; hand evaluator.]
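The per-node requirement, a probability distribution over “level of bet” as a function of “dealt hand”, can be pictured in software as a lookup table that is sampled once per game. This is a minimal sketch, not the DNA mechanism: the table `NODE_TABLE`, its probabilities, and the function `sample_bet` are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical table for the player's node: hand -> {action: probability}.
NODE_TABLE = {
    "A": {"say_A": 1.0},
    "2": {"say_2": 0.7, "say_A": 0.3},
}

def sample_bet(table, hand, rng):
    """Draw one action for `hand` according to the node's distribution."""
    actions = list(table[hand])
    weights = [table[hand][a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]
```

One such table per node is enough to play the game, which is why the storage grows linearly with the number of nodes.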

  26. Comparison. Koller and Pfeffer derive equilibrium mixed strategies with complexity polynomial in # nodes * # possible deals * 2^(betting levels). • Two-player games only. • Don’t exploit weakness of opponent. • No dynamics, only equilibrium. Reference: “Representations and Solutions for Game-Theoretic Problems,” Artificial Intelligence (1997).

  27. 3-Player Poker: All Possible Deals. [Game tree] P1, P2, and P3 each choose Pass or Bet $a; later nodes for P1, P2, and P3 choose F or C. Deal rows are shown for Player 1, Player 2, and Player 3. Course of Play: C: Call (add $b); F: Fold.

  28. [Diagram: Learning and Poker combined] Labels: Deals; Assemble; Play New Game; Separate by Payoffs; Recover Dealer’s & Player’s Strategies. Dealer’s Adaptation: Programmable Selection of Recovered Dealer Strategies; Amplify, Mutate, Crossover; New Dealer Strategies. Player’s Adaptation: Programmable Selection of Recovered Player Strategies; Amplify, Mutate, Crossover; New Player Strategies.

  29. [Figure: grid of cards labeled 2 and A]

  30. [Figure: grid of cards labeled 2 and A]

  31. [Figure: grid of cards labeled 3]

  32. Player Says 2 Player MIGHT Change to Say Ace Dealer Folds Dealer MIGHT Change to Call
