
Multi-Player Games: Overview and Recent Research

Multi-Player Games: Overview and Recent Research. Spencer Polk COMP 4106 February 24, 2014. Overview. Game Playing history and introduction Two-Player Games (brief review) Mini-Max style Multi-Player algorithms Monte Carlo style Multi-Player algorithms





Presentation Transcript


  1. Multi-Player Games: Overview and Recent Research Spencer Polk COMP 4106 February 24, 2014

  2. Overview • Game Playing history and introduction • Two-Player Games (brief review) • Mini-Max style Multi-Player algorithms • Monte Carlo style Multi-Player algorithms • Current research at Carleton in Multi-Player games

  3. Game Playing: Introduction • Can machines outplay man? • Legends and Greek Mythology • “The Turk” (left) was presented to nobility as a chess-playing automaton • Dream has just come true! The Turk (1770)

  4. Game Playing: Introduction • The Turk: Played Napoleon, Benjamin Franklin, and Edgar Allan Poe • The Turk was, of course, a fraud • King vs Rook Strategies: Solved by automaton – 1914 • True AI game playing – Claude Shannon: 1950 • Before AI even named as a field – 1956 Dartmouth Conference • Shannon did not first propose Mini-Max theorem, but did first propose Mini-Max algorithm

  5. Game Playing: Introduction • Focus was on chess – still very focused today • Shannon studied chess in 1950 paper • Shannon saw as academic exercise only • Saw no practical purpose; no available hardware • 1970s: First commercially available chess playing programs • 1980s: Chess programs playing at expert levels • Still far to go to Grandmaster level

  6. Game Playing: Introduction • 1997: Deep Blue defeats Kasparov • First match defeat of a reigning world chess champion by a computer • Field branched out • Poker • Go Kasparov vs Deep Blue

  7. Mini-Max Algorithm Sample Mini-Max Tree

  8. Mini-Max Algorithm
     function integer minimax(node, depth):
       if node is terminal or depth <= 0 then
         return heuristic value of node
       else
         if node is max then
           val = −∞
           for all child of node do
             val = max(val, minimax(child, depth − 1))
           end for
         else
           val = ∞
           for all child of node do
             val = min(val, minimax(child, depth − 1))
           end for
         end if
         return val
       end if
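The pseudocode on this slide can be sketched in Python roughly as follows. The tree encoding is an assumption made for illustration only: an internal node is a list of children, a leaf is a number that is already its heuristic value, and the hypothetical `heuristic` helper scores a cut-off internal node as 0.

```python
def heuristic(node):
    # Hypothetical evaluation: a leaf stores its own value; an internal
    # node reached at the depth limit is scored 0 for lack of a real heuristic.
    return node if not isinstance(node, list) else 0


def minimax(node, depth, is_max):
    """Return the Mini-Max value of a nested-list game tree."""
    if not isinstance(node, list) or depth <= 0:
        return heuristic(node)
    # MAX and MIN levels alternate on the way down the tree.
    values = [minimax(child, depth - 1, not is_max) for child in node]
    return max(values) if is_max else min(values)


tree = [[3, 5], [2, 9]]          # MAX root, two MIN children, four leaves
print(minimax(tree, 2, True))    # 3: MAX picks the larger of min(3,5) and min(2,9)
```

In a real game the nested list would be replaced by a move generator, and `heuristic` by a board evaluation function.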

  9. Alpha-Beta Pruning Alpha-cutoff at blue 4 (MIN node)

  10. Alpha-Beta Pruning • Creates bounds on maximum and minimum values • Alpha Cutoff – Already guaranteed more • Beta Cutoff – Opponent can guarantee less • Tighter bounds = More pruning! • How to improve bounds? – Move ordering! • Expert knowledge (game histories) • Ordering heuristics • Iterative Deepening
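A minimal sketch of the alpha and beta bounds described above, using the same assumed encoding as a toy example (lists are internal nodes, numbers are leaves). The `visited` list is a hypothetical addition that records which leaves were actually evaluated, so the effect of a cutoff is visible.

```python
from math import inf


def alphabeta(node, depth, alpha, beta, is_max, visited):
    """Mini-Max with alpha-beta bounds over a nested-list tree."""
    if not isinstance(node, list) or depth <= 0:
        visited.append(node)  # record every node that gets evaluated
        return node if not isinstance(node, list) else 0
    val = -inf if is_max else inf
    for child in node:
        score = alphabeta(child, depth - 1, alpha, beta, not is_max, visited)
        if is_max:
            val = max(val, score)
            alpha = max(alpha, val)   # MAX is already guaranteed at least alpha
        else:
            val = min(val, score)
            beta = min(beta, val)     # MIN can already hold MAX to at most beta
        if alpha >= beta:
            break                     # remaining children cannot change the result
    return val


visited = []
print(alphabeta([[3, 5], [2, 9]], 2, -inf, inf, True, visited))  # 3
print(visited)  # [3, 5, 2]: the leaf 9 is pruned
```

Swapping the two subtrees would remove the cutoff in this example, which is why move ordering matters for tightening the bounds.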

  11. Extending to Multi-Player Games • Mini-Max developed for Chess • Exclusively two-player, zero-sum game • Now, want to play multi-player games • Chinese Checkers • Multi-player Othello • Need to extend Mini-Max to multi-player games • Many ways to do this…

  12. Extending to Multi-Player Games • Problem: Mini-Max holds a single value for score • For two players, this is fine • Game is zero sum, so second player’s score is negation • Single value is very valuable – Pruning • Multi-player needs a way to do this • Simple solution: ALL opponents are negation • MAX-MIN-MIN, etc • Called Paranoid Algorithm

  13. Paranoid Algorithm Sample Paranoid Tree

  14. Paranoid Algorithm
     function integer paranoid(node, depth):
       if node is terminal or depth <= 0 then
         return heuristic value of node
       else
         if node is max then
           val = −∞
           for all child of node do
             val = max(val, paranoid(child, depth − 1))
           end for
         else
           val = ∞
           for all child of node do
             val = min(val, paranoid(child, depth − 1))
           end for
         end if
         return val
       end if

  15. Paranoid Algorithm • Algorithm exact same as Mini-Max in many cases • Pros: • Very simple to implement • Subject to Alpha-Beta Pruning (on MAX/MIN border) • Cons: • Sees all players as coalition – bad play • Limited look-ahead for perspective player
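As a rough illustration of the MAX-MIN-MIN pattern, the sketch below assumes a fixed turn order in which the player to move at depth d is d modulo the number of players, and leaf values are scores from the perspective player's point of view. The parameter names (`player`, `num_players`, `root_player`) are illustrative and do not come from the slides.

```python
def paranoid(node, depth, player=0, num_players=3, root_player=0):
    """Paranoid search: the root player maximizes, every opponent minimizes."""
    if not isinstance(node, list) or depth <= 0:
        # Hypothetical heuristic: leaves carry their value, cut-off nodes score 0.
        return node if not isinstance(node, list) else 0
    next_player = (player + 1) % num_players
    values = [paranoid(child, depth - 1, next_player, num_players, root_player)
              for child in node]
    return max(values) if player == root_player else min(values)


# Three players: level 0 is MAX (player 0), levels 1 and 2 are MIN (players 1, 2).
tree = [[[4, 7], [1, 8]], [[6, 5], [9, 3]]]
print(paranoid(tree, 3))  # 3
```

Because both opponent levels minimize the same single value, the two consecutive MIN levels behave like one coalition, which is the weakness listed under Cons.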

  16. Max-N Algorithm • 1986: Luckhardt and Irani • Attempt to address coalition problem • Keeps a tuple of scores, not a single value • Assumption: Player maximizes their own score • No consideration for other scores • Heuristic returns value for all N players: eg [5, 2, 11] • Nth player maximizes Nth score

  17. Max-N Algorithm Sample Max-N Tree

  18. Max-N Algorithm
     function integer[] max-n(node, depth):
       if node is terminal or depth <= 0 then
         return heuristic value of node
       else
         val = −∞
         tuple = []
         for all child of node do
           childTuple = max-n(child, depth − 1)
           if childTuple[node.player] > val then
             val = childTuple[node.player]
             tuple = childTuple
           end if
         end for
         return tuple
       end if

  19. Max-N Algorithm • In terms of raw Mini-Max: Very simple extension • Pros: • Players “look out for number one” • More realistic play • Perspective player can see more opportunities • Cons: • Pruning is very complicated – not as good • Can wind up worse than Paranoid again
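A small sketch of the tuple-based idea, under assumed conventions: leaves are tuples holding one score per player, internal nodes are lists, and the player to move rotates with depth. Ties are broken in favour of the first child found, which is an arbitrary choice made here for simplicity.

```python
def max_n(node, depth, player=0, num_players=3):
    """Max-N: each player picks the child whose tuple is best for themselves."""
    if isinstance(node, tuple) or depth <= 0:
        # Hypothetical heuristic: leaves carry their tuple, cut-off nodes score all zeros.
        return node if isinstance(node, tuple) else (0,) * num_players
    best = None
    for child in node:
        scores = max_n(child, depth - 1, (player + 1) % num_players, num_players)
        # Only this player's own component is compared; other scores are ignored.
        if best is None or scores[player] > best[player]:
            best = scores
    return best


# Player 0 moves at the root, player 1 at the next level.
tree = [[(5, 2, 11), (3, 6, 4)], [(7, 1, 2), (4, 8, 0)]]
print(max_n(tree, 2))  # (4, 8, 0)
```

Player 0 would like to reach (7, 1, 2), but player 1 prefers (4, 8, 0) in that subtree, so the root ends up comparing (3, 6, 4) with (4, 8, 0).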

  20. Best-Reply Search • Relatively new: February 2011 • All opponents considered to be one player • They only get one turn • Only opponent with best move gets to act • Return to MAX-MIN-MAX-MIN… • Essentially the same algorithm as Mini-Max… Again

  21. Best-Reply Search BRS (One level)

  22. Best-Reply Search
     function integer best-reply(node, depth):
       if node is terminal or depth <= 0 then
         return heuristic value of node
       else
         if node is max then
           val = −∞
           for all child of node do
             val = max(val, best-reply(child, depth − 1))
           end for
         else
           val = ∞
           for all opponents do
             for all opponent’s child at node do
               val = min(val, best-reply(child, depth − 1))
             end for
           end for
         end if
         return val
       end if

  23. Best-Reply Search • Attempt to get “best of both worlds” • Pros: • Balance between coalition and free-for-all • Allows Alpha-Beta pruning • Significant look-ahead for perspective player • Cons: • Illegal game states analyzed • Not valid for some games
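One way to picture Best-Reply Search is with a toy tree in which a MAX node is a list of children and a MIN node is a dictionary mapping each opponent to the children reachable by that opponent's moves. This encoding is an assumption for illustration; a real implementation would generate each opponent's moves from the game state.

```python
def best_reply(node, depth):
    """Best-Reply Search over a toy tree (list = MAX node, dict = MIN node)."""
    if isinstance(node, (int, float)):
        return node          # leaf: value for the perspective player
    if depth <= 0:
        return 0             # hypothetical heuristic for a cut-off internal node
    if isinstance(node, list):
        return max(best_reply(child, depth - 1) for child in node)
    # MIN node: all opponents are pooled, and only the single most damaging
    # move among all of them is played.
    return min(best_reply(child, depth - 1)
               for children in node.values()
               for child in children)


tree = [
    {1: [4, 7], 2: [6, 2]},   # replies available to opponents 1 and 2
    {1: [5, 8], 2: [9, 3]},
]
print(best_reply(tree, 2))    # 3
```

Since only one opponent moves per MIN level, the other opponents are effectively skipped, which is how the illegal game states mentioned under Cons arise.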

  24. Monte-Carlo Methods • Entirely different way of looking at game playing • No heuristics or searching • Driven by random game playing • Good when there is no natural heuristic • Example: Go • Very simple example: Play 50 random games after each move, pick one with most wins
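The "play 50 random games per move" idea can be sketched on a deliberately tiny game. The game used here is an assumption chosen only because it fits in a few lines: a single pile of stones, each player removes 1 to 3 stones, and whoever takes the last stone wins.

```python
import random


def random_playout(pile, my_turn, rng):
    """Play uniformly random moves; return True if 'we' take the last stone."""
    while True:
        pile -= rng.randint(1, min(3, pile))
        if pile == 0:
            return my_turn
        my_turn = not my_turn


def flat_monte_carlo(pile, rng, games_per_move=50):
    """Try every legal move, play random games after it, keep the best scorer."""
    wins = {}
    for move in range(1, min(3, pile) + 1):
        remaining = pile - move
        if remaining == 0:
            wins[move] = games_per_move      # taking the last stone wins outright
        else:
            # After our move the opponent is to play, hence my_turn=False.
            wins[move] = sum(random_playout(remaining, False, rng)
                             for _ in range(games_per_move))
    return max(wins, key=wins.get)


print(flat_monte_carlo(2, random.Random(0)))  # 2: taking both stones wins immediately
```

No heuristic is involved anywhere: the only game knowledge is the rules used to generate moves and detect the end of a game.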

  25. UCT • Stands for Upper-Confidence bounds applied to Trees • Monte Carlo method used with trees • Navigate from root to leaf • Navigation method is key – leads tree expansion • Play random game(s) at leaf from that position • Propagate win/loss rate back up the tree • After time elapsed – pick move with best win rate

  26. UCT • From Root: Pick explored leaf that maximizes UCTValue • UCTValue = winrate + sqrt(ln(parent.visits) / visits) • ALWAYS explore unexplored leaf first • Continue until an unexplored leaf is reached • Propagate win or loss back up – usually single value in UCT • Multi-Player and Two-Player are exactly the same

  27. UCT
     function move uct(root):
       for time-steps do
         position = root
         while position is explored do
           val = −∞
           for child of position do
             if child is unexplored then    !-- Unexplored node check --!
               best = child
               break
             end if
             if UCTValue(child) > val then
               val = UCTValue(child)
               best = child
             end if
           end for
           position = best
         end while
         Play random game(s) at position
         while position is not root do
           update win-rate for player at node
           position = position.parent
         end while
       end for
       return move to child of root with best win rate
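The loop above might look as follows in Python. This is a sketch under several assumptions that are not in the slides: the game is a toy pile game (remove 1 to 3 stones, last stone wins), the exploration weight `c` defaults to 1.0, and each node stores wins for the player who moved into it, which lets the same code run for two or more players.

```python
import math
import random


class Node:
    def __init__(self, pile, player, parent=None, move=None):
        self.pile, self.player = pile, player    # stones left, player to move
        self.parent, self.move = parent, move
        self.children, self.visits, self.wins = [], 0, 0


def uct_value(child, c=1.0):
    # Win rate plus an exploration bonus that shrinks as the child is visited.
    return (child.wins / child.visits
            + c * math.sqrt(math.log(child.parent.visits) / child.visits))


def playout(pile, to_move, last_mover, rng, num_players):
    """Random game from this position; the winner took the last stone."""
    while pile > 0:
        pile -= rng.randint(1, min(3, pile))
        last_mover, to_move = to_move, (to_move + 1) % num_players
    return last_mover


def uct_search(pile, iterations, rng, num_players=2):
    root = Node(pile, player=0)
    for _ in range(iterations):
        node = root
        while node.pile > 0:                     # navigate from root to a leaf
            if not node.children:
                nxt = (node.player + 1) % num_players
                node.children = [Node(node.pile - m, nxt, node, m)
                                 for m in range(1, min(3, node.pile) + 1)]
            unexplored = [ch for ch in node.children if ch.visits == 0]
            if unexplored:                       # always explore unexplored first
                node = unexplored[0]
                break
            node = max(node.children, key=uct_value)
        winner = playout(node.pile, node.player,
                         (node.player - 1) % num_players, rng, num_players)
        while node is not None:                  # propagate the result upward
            node.visits += 1
            if node.parent is not None and node.parent.player == winner:
                node.wins += 1
            node = node.parent
    return max(root.children, key=lambda ch: ch.wins / ch.visits).move


print(uct_search(3, 200, random.Random(0)))      # 3: take the whole pile
```

Setting `num_players` to 3 or more changes nothing in the search itself, matching the remark that the multi-player and two-player cases are the same.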

  28. Adaptive Data Structures • Other, completely unrelated field • Concerned with record access frequency • Problem: Elements in data structure accessed with different frequency • Solution: Change the structure of the data structure as elements are accessed • Can use list, tree or other • We will use it here to order players

  29. ADS – Move to Front Order of access: 3, 1, 1, 4, …

  30. ADS – Transposition Rule Order of access: R3, R1, R1, …
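Both list-update rules named on these two slides fit in a few lines. The sketch below assumes the adaptive structure is a plain Python list and that the starting order is [1, 2, 3, 4]; the function names are illustrative.

```python
def move_to_front(items, key):
    """Move-to-Front: the accessed element jumps to the head of the list."""
    items.insert(0, items.pop(items.index(key)))


def transpose(items, key):
    """Transposition: the accessed element swaps with its predecessor."""
    i = items.index(key)
    if i > 0:
        items[i - 1], items[i] = items[i], items[i - 1]


mtf = [1, 2, 3, 4]
for accessed in [3, 1, 1, 4]:
    move_to_front(mtf, accessed)
print(mtf)   # [4, 1, 3, 2]

tr = [1, 2, 3, 4]
for accessed in [3, 1, 1]:
    transpose(tr, accessed)
print(tr)    # [1, 3, 2, 4]
```

Move-to-Front reacts strongly to a single access, while Transposition changes the order more gradually.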

  31. Threat-ADS Heuristic BRS with Threat-ADS (one level)

  32. Threat-ADS Heuristic • Our contribution • ADS operations are constant, and small • ADS updated to move player with most threatening move forward • Achieves move ordering for Alpha-Beta Pruning

  33. BRS with Threat-ADS
     function integer brs_threat_ads(node, depth):
       if node is terminal or depth <= 0 then
         return heuristic value of node
       else
         if node is max then
           val = −∞
           for all child of node do
             val = max(val, brs_threat_ads(child, depth − 1))
           end for
         else
           val = ∞
           for all opponents in ADS do
             for all opponent’s child at node do
               val = min(val, brs_threat_ads(child, depth − 1))
             end for
           end for
           ADS.update(opponent that produced val)
         end if
         return val
       end if
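To see how ordering opponents through an ADS can help pruning, the sketch below combines Best-Reply Search with alpha-beta bounds on a toy tree (list = MAX node, dict mapping opponent to children = MIN node). Several details are assumptions rather than the published method: the ADS is a plain list updated with Move-to-Front, the `adapt` flag exists only to compare against a fixed order, and `counter` is a hypothetical one-element list used to count expanded nodes.

```python
from math import inf


def brs_threat_ads(node, depth, alpha, beta, ads, counter, adapt=True):
    counter[0] += 1                                   # count every node expanded
    if isinstance(node, (int, float)):
        return node
    if depth <= 0:
        return 0                                      # hypothetical cut-off heuristic
    if isinstance(node, list):                        # MAX node
        val = -inf
        for child in node:
            val = max(val, brs_threat_ads(child, depth - 1, alpha, beta,
                                          ads, counter, adapt))
            alpha = max(alpha, val)
            if alpha >= beta:
                break
        return val
    val, threat = inf, None                           # MIN node
    for opponent in list(ads):                        # opponents in ADS order
        for child in node.get(opponent, []):
            score = brs_threat_ads(child, depth - 1, alpha, beta,
                                   ads, counter, adapt)
            if score < val:
                val, threat = score, opponent
            beta = min(beta, val)
            if alpha >= beta:
                break
        if alpha >= beta:
            break
    if adapt and threat is not None:                  # most threatening opponent first
        ads.remove(threat)
        ads.insert(0, threat)
    return val


tree = [{1: [4, 7], 2: [6, 2]}, {1: [5, 8], 2: [1, 9]}]
with_ads, fixed = [0], [0]
print(brs_threat_ads(tree, 2, -inf, inf, [1, 2], with_ads))       # 2
print(brs_threat_ads(tree, 2, -inf, inf, [1, 2], fixed, False))   # 2
print(with_ads[0], fixed[0])                                      # 8 10
```

Both runs return the same value; the adaptive order simply reaches the cutoff in the second MIN node sooner because opponent 2 was the threat in the first one.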

  34. Experimental Framework • Game needed to test Threat-ADS heuristic • Needs: • BRS must be applicable • Game should be simple to implement • Use established games Focus and Chinese Checkers • Also develop the Virus Game

  35. Virus Game • Turn based game with N players • Played on 2D board • Goal is to eliminate all other players • Turn: Player “infects” a square they are adjacent to • All nearby squares, according to a configured pattern, are given to that player

  36. Virus Game

  37. Experimental Setup • One player: BRS with Threat-ADS • Others: Random (Interested in tree pruning) • Take Node Count over first few turns of the game • Count each node expanded, but not those pruned • Average over 200 games • Run for each of three games mentioned

  38. Results (Node Count)

  39. Discussion • See improvement in NC in all games • Improvement from 6% to 10% reduction in size • Represents hundreds of thousands of nodes • All results statistically significant to 95% certainty • Focus shows the strongest results • Shows the benefit of applying ADS to game playing, even in a simple capacity

  40. Research Conclusions and Future Work • Threat-ADS cannot worsen the BRS • ADS operations: O(1) • Threat-ADS only relies on basic BRS structure • Opens up new connections between ADS and MPG • Many possibilities for future work being explored now

  41. Project Ideas • Paranoid, Max-N, BRS game playing • UCT in two player game • UCT in simple multi-player game • Other Monte Carlo game playing algorithms you find • Creative application of ADS to algorithm discussed in class
