Combining Tactical Search and Monte-Carlo in the Game of Go by Tristan Cazenave & Bernard Helmstetter Presenter: Ling Zhao University of Alberta November 1, 2005
Outline • Monte-Carlo Go • Motivations • Tactical search • Gather statistics • Combining search and Monte-Carlo • Experimental results
Monte-Carlo Go • Invented in 1993 by Bruegmann, using simulated annealing. • Based on Abramson’s expected-outcome model (1990). • Achieved moderate success on 9x9 boards.
Basic idea • Play a number of random games. • Choose a move by 1-ply search, maximizing the expected score. • The only domain-dependent knowledge is the notion of an eye.
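The basic idea can be sketched in a few lines. This is a minimal illustration, not the authors' program: `choose_move`, `toy_playout`, and the move names are hypothetical, and the "random game" is replaced by a noisy toy score so the example runs on its own. It also assumes each candidate move gets the same number of random games, which the slides do not specify.

```python
import random


def choose_move(moves, playout, n_games, rng):
    """1-ply Monte-Carlo search: return the move with the best mean playout score."""
    best_move, best_mean = None, float("-inf")
    for move in moves:
        # Play n_games random games that start with this move and average the scores.
        total = sum(playout(move, rng) for _ in range(n_games))
        mean = total / n_games
        if mean > best_mean:
            best_move, best_mean = move, mean
    return best_move, best_mean


def toy_playout(move, rng):
    """Hypothetical stand-in for a random Go game: hidden move value plus noise."""
    true_value = {"A": 1.0, "B": 5.0, "C": -2.0}[move]
    return true_value + rng.gauss(0.0, 3.0)


best, mean = choose_move(["A", "B", "C"], toy_playout, 2000, random.Random(0))
print(best, round(mean, 2))
```

In a real program `playout` would play a full random game from the position after `move` and return the final score; everything else stays the same.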
Weaknesses • Poor scalability because of the heavy computation. • Blunders due to lack of knowledge.
Framework • Tactical search: capture, connection, eye, life and death. • Play random games and gather statistics for goals. • Goal evaluation: mean score of the random games in which the goal is achieved minus the mean score of those in which it fails. • Pick the move associated with the best goal.
Tactical search • Capture search: for any string, find if it can be captured or saved. • Connection search: for any two strings, find if they can be connected. • Empty connection search: find if a string can be connected to an empty point. • Eye search: find if an eye can be made on an empty point or its neighbors. • Life and death search: use generalized widening for groups of strings.
Statistics on random games • Compute the mean for the random games where a goal is achieved and the mean for those where a goal has failed. • Two new goals for intersections: 1. The goal of playing first on an intersection. 2. The goal of owning an intersection at the end of a game.
Selecting problems • Strings that cannot be disconnected form groups. • Select the simplest problem for a goal, to avoid over-estimating goals.
Gather statistics • Play random games. • For each selected goal, find the mean score of the games in which it succeeds and the mean score of the games in which it fails. • Score of a life problem of a string: mean score of the games in which an intersection of the string keeps its color.
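One way to keep these per-goal statistics is a pair of running sums per goal, one for games where the goal succeeded and one for games where it failed. The sketch below is an assumption about bookkeeping only: `GoalStats`, `gather`, the goal names, and the sample scores are all hypothetical, and each random game is represented as a final score plus a map from goal name to whether the goal was achieved.

```python
from collections import defaultdict


class GoalStats:
    """Running score sums for one goal, split by whether the goal was achieved."""

    def __init__(self):
        self.sums = {True: 0.0, False: 0.0}
        self.counts = {True: 0, False: 0}

    def add(self, achieved, score):
        self.sums[achieved] += score
        self.counts[achieved] += 1

    def mean(self, achieved):
        n = self.counts[achieved]
        return self.sums[achieved] / n if n else None  # None: no such game yet


def gather(games):
    """games: iterable of (final_score, {goal_name: achieved_bool})."""
    stats = defaultdict(GoalStats)
    for score, outcomes in games:
        for goal, achieved in outcomes.items():
            stats[goal].add(achieved, score)
    return stats


# Four made-up random-game results.
games = [
    (10.0, {"capture_X": True, "connect_Y": False}),
    (4.0, {"capture_X": True, "connect_Y": True}),
    (-6.0, {"capture_X": False, "connect_Y": True}),
    (-2.0, {"capture_X": False, "connect_Y": False}),
]
stats = gather(games)
print(stats["capture_X"].mean(True), stats["capture_X"].mean(False))
```

Returning `None` for an empty bucket is a design choice of this sketch: a goal that never failed (or never succeeded) in the sample has no defined difference of means and can be skipped by the caller.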
Choose a move • Find the goal with the largest difference between its two mean scores. • Choose the move associated with that goal.
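The selection rule amounts to an argmax over goals. In this minimal sketch the function name `choose_goal`, the goal names, the moves, and the mean scores are all made up for illustration; each goal is assumed to come with the move that achieves it and its two mean scores.

```python
def choose_goal(goal_means):
    """goal_means: {goal: (move, mean_if_achieved, mean_if_failed)}.

    Returns the goal with the largest difference of means, its move, and the difference.
    """
    best_goal = max(goal_means, key=lambda g: goal_means[g][1] - goal_means[g][2])
    move, mean_ok, mean_fail = goal_means[best_goal]
    return best_goal, move, mean_ok - mean_fail


# Hypothetical goals with their associated moves and mean scores.
goal_means = {
    "capture_X": ("D4", 7.0, -4.0),  # difference 11.0
    "connect_Y": ("E5", 3.0, 1.0),   # difference 2.0
    "eye_Z": ("C3", 5.5, -1.5),      # difference 7.0
}
goal, move, gain = choose_goal(goal_means)
print(goal, move, gain)
```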
Why is it useful? • High-level classification of points on the board. • Successful incorporation of tactical search.
Positive and negative goals • Positive goals: high confidence in the search results. • Negative goals: less confidence. Example: saving a string (a string is considered safe when it has more than 4 liberties). This fixes over-estimation.
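The safety rule in the example (more than 4 liberties) is easy to state in code. The sketch below assumes a hypothetical board encoding, a list of strings with `X` and `O` for stones and `.` for empty points; the function names are illustrative and not taken from the authors' program.

```python
def liberties(board, row, col):
    """Count the liberties of the string containing the stone at (row, col)."""
    color = board[row][col]
    if color not in "XO":
        raise ValueError("no stone at this point")
    n_rows, n_cols = len(board), len(board[0])
    seen, libs, stack = {(row, col)}, set(), [(row, col)]
    while stack:  # flood fill over same-colored neighbors
        r, c = stack.pop()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < n_rows and 0 <= nc < n_cols):
                continue
            if board[nr][nc] == ".":
                libs.add((nr, nc))  # a set, so shared liberties count once
            elif board[nr][nc] == color and (nr, nc) not in seen:
                seen.add((nr, nc))
                stack.append((nr, nc))
    return len(libs)


def is_considered_safe(board, row, col, threshold=4):
    """Safe in the sense of the slide: strictly more than `threshold` liberties."""
    return liberties(board, row, col) > threshold


board = [
    ".....",
    ".XX..",
    ".OX..",
    ".....",
    ".....",
]
print(liberties(board, 1, 1), liberties(board, 2, 1))
```

Here the three-stone X string has six liberties and counts as safe, while the single O stone has two and does not.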
Experimental results • New enhancement vs. standard MC: each plays 10,000 random games to choose a move, over 20 games on 9x9 boards. Result: 52.1 (±34.2). • New enhancement with 1,000 random games per move vs. standard MC with 10,000 random games per move. Result: 24.6 (±40).
Experimental results (cont’d) • New enhancement vs. Golois: both use the same tactical search; Golois uses global search and hand-tuned heuristics. 40 games were played. Result: 26 points.
Conclusions • A creative idea for combining tactical search with Monte-Carlo. • A nice extension of the authors’ previous work. • The experimental results are good. • The program should be tested against the strongest programs.