Combining Tactical Search and Monte-Carlo in the Game of Go

Combining Tactical Search and Monte-Carlo in the Game of Go, by Tristan Cazenave & Bernard Helmstetter. Presenter: Ling Zhao, University of Alberta. November 1, 2005

Outline • Monte-Carlo Go • Motivations • Tactical search • Gather statistics • Combining search and Monte-Carlo • Experimental results

Monte-Carlo Go • Invented in 1993 by Bruegmann, using simulated annealing. • Based on Abramson’s expected-outcome model (1990). • Achieved moderate success on 9x9 boards.

Basic idea • Play a number of random games • Choose a move by 1-ply search, maximizing the expected score • The only domain-dependent knowledge is the notion of an eye.
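The basic idea can be illustrated with a minimal sketch. It is not the authors' program: the function `choose_move`, the toy `toy_playout`, and the table `TRUE_VALUE` are hypothetical names, and the toy playout only stands in for a real random Go game by returning a hidden move value plus noise.

```python
import random

def choose_move(legal_moves, playout, n_games, rng):
    """1-ply Monte-Carlo selection: return the move with the best mean score.

    playout(move, rng) is assumed to play one random game that starts
    with `move` and to return its final score.
    """
    means = {}
    for move in legal_moves:
        total = 0.0
        for _ in range(n_games):
            total += playout(move, rng)
        means[move] = total / n_games
    best = max(means, key=means.get)
    return best, means

# Toy stand-in for random games (assumption): each move has a hidden
# true value and a playout returns that value plus uniform noise.
TRUE_VALUE = {"A": 1.0, "B": 5.0, "C": -2.0}

def toy_playout(move, rng):
    return TRUE_VALUE[move] + rng.uniform(-3.0, 3.0)

best, means = choose_move(list(TRUE_VALUE), toy_playout, 2000, random.Random(0))
print(best, means)
```

In a real program the playout would be a full random game that avoids filling eyes, which is where most of the computation goes.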

Weaknesses • Poor scalability due to heavy computation • Blunders due to lack of knowledge

Framework • Tactical search: capture, connection, eye, life and death. • Play random games and gather statistics for goals. • Goal evaluation: mean score of the games in which the goal is achieved minus the mean score of the games in which it fails. • Pick the move associated with the best goal.

Tactical search • Capture search: for any string, find if it can be captured or saved. • Connection search: for any two strings, find if they can be connected. • Empty connection search: find if a string can be connected to an empty point. • Eye search: find if an eye can be made on an empty point or its neighbors. • Life and death search: use generalized widening for groups of strings.

Statistics on random games • Compute the mean for the random games where a goal is achieved and the mean for those where a goal has failed. • Two new goals for intersections: 1. The goal of playing first on an intersection. 2. The goal of owning an intersection at the end of a game.

Selecting problems • Strings that cannot be disconnected form groups. • Select the simplest problem for a goal, to avoid over-estimating goals.

Gather statistics • Play random games • For each selected goal, find the mean score of the games in which it succeeds and the mean score of the games in which it fails. • Score of a life problem of a string: mean score of the games in which an intersection of the string keeps its color.
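As a rough sketch of this bookkeeping, assume each random game is recorded as its final score plus a map saying which goals were achieved. The function `goal_means`, the record layout, and the goal names are all hypothetical; the slides do not specify a data structure.

```python
from collections import defaultdict

def goal_means(games):
    """Per goal, return (mean score if achieved, mean score if failed).

    games: iterable of (final_score, {goal_name: achieved_bool}).
    A mean is None when no game fell into that case.
    """
    # Per goal: [sum_achieved, count_achieved, sum_failed, count_failed]
    sums = defaultdict(lambda: [0.0, 0, 0.0, 0])
    for score, outcomes in games:
        for goal, achieved in outcomes.items():
            entry = sums[goal]
            if achieved:
                entry[0] += score
                entry[1] += 1
            else:
                entry[2] += score
                entry[3] += 1
    result = {}
    for goal, (s_ok, n_ok, s_fail, n_fail) in sums.items():
        mean_ok = s_ok / n_ok if n_ok else None
        mean_fail = s_fail / n_fail if n_fail else None
        result[goal] = (mean_ok, mean_fail)
    return result

# Four made-up game records, for illustration only.
games = [
    (10.0, {"capture_X": True, "connect_Y": False}),
    (-4.0, {"capture_X": False, "connect_Y": False}),
    (6.0, {"capture_X": True, "connect_Y": True}),
    (-2.0, {"capture_X": False, "connect_Y": True}),
]
stats = goal_means(games)
print(stats)
```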

Choose a move • Find the goal with the maximum difference of two mean scores. • Choose the move associated with the goal.
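A minimal sketch of this selection step, assuming the two means per goal are already available and that tactical search has supplied one move per goal. The function `pick_goal`, the goal names, and the move labels are hypothetical.

```python
def pick_goal(goal_stats, goal_moves):
    """Return (goal, move, difference) for the goal whose two means differ most.

    goal_stats: {goal: (mean_if_achieved, mean_if_failed)}
    goal_moves: {goal: move associated with the goal}
    Goals with a missing mean (None) are skipped.
    """
    best_goal = None
    best_diff = float("-inf")
    for goal, (mean_ok, mean_fail) in goal_stats.items():
        if mean_ok is None or mean_fail is None:
            continue
        diff = mean_ok - mean_fail
        if diff > best_diff:
            best_goal, best_diff = goal, diff
    if best_goal is None:
        return None
    return best_goal, goal_moves[best_goal], best_diff

# Made-up statistics and moves, for illustration only.
goal_stats = {
    "capture_X": (8.0, -3.0),
    "connect_Y": (2.0, 3.0),
    "eye_Z": (5.0, 1.0),
}
goal_moves = {"capture_X": "D4", "connect_Y": "F6", "eye_Z": "B2"}
choice = pick_goal(goal_stats, goal_moves)
print(choice)
```

A goal whose success barely changes the mean score gets a small difference, so its move is not preferred even if the goal is easy to achieve.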

Why is it useful? • High level classifications of points on the board. • Successful incorporation of tactical search.

Positive and negative goals • Positive goals: the search results can be trusted with confidence. • Negative goals: less confidence. Example: saving a string (a string is considered safe when it has more than 4 liberties). Fix over-estimation.
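The liberty threshold in the example can be checked with a simple flood fill. This is only a sketch: the board encoding (a list of strings with "." for empty, "X" and "O" for stones) and the function names `string_liberties` and `is_considered_safe` are assumptions made for illustration.

```python
def string_liberties(board, row, col):
    """Count the liberties of the string containing the stone at (row, col)."""
    color = board[row][col]
    if color == ".":
        raise ValueError("no stone at this point")
    n_rows, n_cols = len(board), len(board[0])
    stack = [(row, col)]
    stones = {(row, col)}
    liberties = set()
    while stack:
        r, c = stack.pop()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < n_rows and 0 <= nc < n_cols):
                continue
            if board[nr][nc] == ".":
                liberties.add((nr, nc))  # empty neighbor is a liberty
            elif board[nr][nc] == color and (nr, nc) not in stones:
                stones.add((nr, nc))  # same-colored neighbor joins the string
                stack.append((nr, nc))
    return len(liberties)

def is_considered_safe(board, row, col, threshold=4):
    """Safe in the sense of the slide: more than `threshold` liberties."""
    return string_liberties(board, row, col) > threshold

board = [
    ".....",
    ".XX..",
    ".O...",
    ".....",
    ".....",
]
print(string_liberties(board, 1, 1), is_considered_safe(board, 1, 1))
print(string_liberties(board, 2, 1), is_considered_safe(board, 2, 1))
```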

Experimental results • New enhancement vs. standard MC: each plays 10,000 random games to choose a move, over 20 games on 9x9. Result: 52.1 (±34.2). • The first (new enhancement) plays 1,000 random games and the second (standard MC) plays 10,000. Result: 24.6 (±40).

Experimental results (cont’d) • New enhancement vs. Golois: both use the same tactical search; Golois uses global search and hand-tuned heuristics. 40 games were played. Result: 26 points.

Conclusions • A creative idea for combining tactical search with Monte-Carlo. • A nice extension of the authors’ previous work. • The experimental results are good. • The program should be tested against the strongest programs.
