
Monte Carlo Go Has a Way to Go

Adapted from the slides presented at AAAI 2006

Haruhiro Yoshimoto (*1)

Kazuki Yoshizoe (*1)

Tomoyuki Kaneko (*1)

Akihiro Kishimoto (*2)

Kenjiro Taura (*1)

(*1) University of Tokyo

(*2) Future University Hakodate

Games in AI
  • Ideal test bed for AI research
    • Clear results
    • Clear motivation
    • Good challenge
  • Success with search-based approaches
    • Chess (Deep Blue, 1997)
    • and others
  • Not successful in the game of Go
    • Go is to Chess as Poetry is to Double-entry accounting
    • It goes to the core of artificial intelligence, which involves the study of learning and decision-making, strategic thinking, knowledge representation, pattern recognition and, perhaps most intriguingly, intuition
The game of Go
  • A 4,000-year-old board game from China
  • Standard board size is 19×19
  • Two players, Black and White, place stones in alternating turns
  • Stones cannot be moved, but they can be captured and taken off the board
  • The player with the larger territory wins
Playing Strength

  • A $1.2M prize was offered for beating a professional player with no handicap (now expired)
  • In 1997, Handtalk claimed $7,700 for winning an 11-stone handicap match against an 8-9-year-old master

Difficulties in Computer Go
  • Large search space
    • the game becomes progressively more complex, at least for the first 100 ply
Difficulties in Computer Go
  • Lack of a good evaluation function
    • A material advantage does not imply a simple path to victory; it may only mean that short-term gain was given priority
    • Positions have roughly 150-250 legal moves, of which usually fewer than 50 (often fewer than 10) are acceptable, yet computers have a hard time distinguishing them
  • Very high degree of pattern recognition involved in human capacity to play well.
Why Monte Carlo Go?

  • Replace the evaluation function by random sampling [Brugmann 1993; Bouzy 2003]
  • Success in other domains
    • Bridge [Ginsberg 1999], Poker [Billings et al. 2002]
  • Reasonable position evaluation based on sampling
    • Cost drops from O(b^d) to O(N·b·d), where N is the number of sample games, b the branching factor, and d the game length
  • Easy to parallelize
  • Can win against search-based approaches
    • Crazy Stone won the 11th Computer Olympiad in 9x9 Go
    • MoGo won the 19th and 20th KGS 9x9 tournaments and is rated highest on CGOS
Basic idea of Monte Carlo Go
  • Generate the next moves by a 1-ply search
  • For each move, play a number of random games and compute the expected score
  • Choose the move with the maximal expected score
  • The only domain-dependent knowledge is the notion of an eye (random playouts do not fill a player's own eyes); a code sketch follows below
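
To make this concrete, here is a minimal Python sketch of the 1-ply Monte Carlo move selection described above. It is only an illustration: the Position class and its copy(), legal_moves(), fills_own_eye(), play(), pass_move(), is_terminal(), and score() methods are hypothetical names, not taken from the paper.

    import random

    def random_playout(position, color):
        """Play uniformly random moves to the end of the game and return
        the final score from `color`'s point of view."""
        pos = position.copy()
        while not pos.is_terminal():
            moves = [m for m in pos.legal_moves() if not pos.fills_own_eye(m)]
            if moves:
                pos.play(random.choice(moves))
            else:
                pos.pass_move()  # two consecutive passes are assumed to end the game
        return pos.score(color)

    def monte_carlo_move(position, color, n_samples=100):
        """1-ply search: score each candidate move by the average result of
        `n_samples` random games and return the move with the best average."""
        best_move, best_value = None, float("-inf")
        for move in position.legal_moves():
            child = position.copy()
            child.play(move)
            value = sum(random_playout(child, color)
                        for _ in range(n_samples)) / n_samples
            if value > best_value:
                best_move, best_value = move, value
        return best_move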
Terminal Position of Go

  • The player with the larger territory wins
  • Territory = surrounded area + stones on the board
  • In the example position, Black's territory (▲) is 36 points and White's territory (×) is 45 points, so White wins by 9 points (a scoring sketch follows below)
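
As an illustration of this area-scoring rule, the following Python sketch counts stones plus surrounded empty regions. It assumes a simplified setting that the slides do not spell out: the board is a square list of lists containing 'B', 'W', or None, every stone left on the board is alive, and an empty region counts as territory only if it borders stones of a single colour.

    def score_terminal(board):
        """Area scoring: each colour gets its stones plus the empty regions
        that border only that colour.  Returns (black_points, white_points)."""
        size = len(board)
        seen = set()
        counts = {"B": 0, "W": 0}

        # Stones count toward their own colour.
        for row in board:
            for cell in row:
                if cell in counts:
                    counts[cell] += 1

        # Flood-fill each empty region and record which colours border it.
        for x in range(size):
            for y in range(size):
                if board[x][y] is not None or (x, y) in seen:
                    continue
                region, borders, stack = 0, set(), [(x, y)]
                while stack:
                    cx, cy = stack.pop()
                    if (cx, cy) in seen:
                        continue
                    seen.add((cx, cy))
                    region += 1
                    for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                        if 0 <= nx < size and 0 <= ny < size:
                            if board[nx][ny] is None:
                                stack.append((nx, ny))
                            else:
                                borders.add(board[nx][ny])
                if len(borders) == 1:  # region surrounded by one colour only
                    counts[borders.pop()] += region
        return counts["B"], counts["W"]

With those assumptions it should reproduce the 36-to-45 count from the example position.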

Example

  • Play many sample games in which each player moves randomly
  • Compute the average score for each candidate move and select the move with the highest average
  • For move A, play the rest of the game randomly: one sample game is a 5-point win for Black, another is a 9-point win for Black
  • Average score of move A: (5 + 9) / 2 = 7 points

Monte Carlo Go and Sample Size

  • Monte Carlo with 1,000 sample games is stronger than Monte Carlo with 100 sample games
  • Statistical error can be reduced with additional samples
    • Sampling error ∝ 1/√N, where N is the number of random games
  • The relationship between sample size and playing strength has not yet been investigated
  • Diminishing returns must appear eventually (a small illustration follows below)
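
A tiny illustration of the 1/√N behaviour, assuming playout scores are independent samples and using the sample standard deviation to estimate the error; the toy playout below is invented for the example, not taken from the paper.

    import random
    import statistics

    def estimate_with_error(playout, n):
        """Run `playout` n times; return (mean score, standard error).
        The standard error shrinks like 1/sqrt(n)."""
        scores = [playout() for _ in range(n)]
        return statistics.fmean(scores), statistics.stdev(scores) / n ** 0.5

    # Toy playout: pretend the true expected score is 7 points, with noise.
    toy_playout = lambda: 7 + random.gauss(0, 20)

    for n in (100, 1000, 10000):
        mean, err = estimate_with_error(toy_playout, n)
        print(f"N={n:>6}: mean={mean:6.2f}  standard error={err:5.2f}")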

Our Monte Carlo Go Implementation
  • Basic Monte Carlo Go
  • Atari-50 enhancement: utilization of simple Go knowledge in move selection
  • Progressive pruning [Bouzy 2003]: statistical pruning of moves during simulation
Atari-50 Enhancement
  • Basic Monte Carlo: assign a uniform probability to every move in a sample game (no filling of one's own eyes)
  • Atari-50: assign a higher probability to capture moves
    • Capturing is "mostly" a good move
    • Capture moves are chosen with 50% probability, hence the name (a playout sketch follows below)
  • In the example diagram, move A captures the black stones
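
A minimal sketch of such a biased playout policy, assuming the interpretation above (capture moves are chosen with 50% probability when any exist) and hypothetical capture_moves(), legal_moves(), and fills_own_eye() helpers; the paper's exact weighting may differ.

    import random

    def atari50_move(position):
        """Biased playout move selection: with probability 0.5 play a
        capturing move when one exists, otherwise play uniformly at random
        among the non-eye-filling legal moves."""
        captures = position.capture_moves()          # hypothetical helper
        others = [m for m in position.legal_moves()
                  if not position.fills_own_eye(m)]  # hypothetical helper
        if captures and (not others or random.random() < 0.5):
            return random.choice(captures)
        if others:
            return random.choice(others)
        return None  # no move available: pass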

Progressive Pruning [Bouzy2003]
  • Start sampling with a small number of sample games per move
  • Prune moves that are statistically inferior, i.e. whose score is unlikely to reach the current best move's score
  • The sample games saved in this way can be assigned to the promising moves (a sketch of the pruning test follows below)
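
A rough sketch of the pruning test under one common formulation: a move is pruned when its mean score plus a confidence margin falls below the best move's mean minus the same margin. The thresholds and the exact statistical test used by Bouzy's progressive pruning may differ.

    import statistics

    def surviving_moves(move_scores, z=2.0, min_games=30):
        """move_scores maps each candidate move to its list of playout scores.
        A move is pruned once its upper confidence bound falls below the best
        lower confidence bound; moves with too few games are always kept."""
        bounds = {}
        for move, scores in move_scores.items():
            if len(scores) < min_games:
                bounds[move] = None                  # not enough data yet
                continue
            mean = statistics.fmean(scores)
            margin = z * statistics.stdev(scores) / len(scores) ** 0.5
            bounds[move] = (mean - margin, mean + margin)

        lowers = [b[0] for b in bounds.values() if b is not None]
        if not lowers:
            return list(move_scores)
        best_lower = max(lowers)

        return [move for move, b in bounds.items()
                if b is None or b[1] >= best_lower]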

Experimental Design
  • Machine
    • Dual Intel Xeon CPUs at 2.40 GHz with 2 GB of memory per node
    • 64 PCs (128 processors) connected by a 1 GB/s network
  • Three versions of programs
    • BASIC: Basic Monte Carlo Go
    • ATARI: BASIC + Atari-50 enhancement
    • ATARIPP: ATARI + Progressive Pruning
  • Experiments
    • 200 self-play games
    • Analysis of decision quality from 58 professional games
Decision Quality of Each Move

Oracle evaluation scores (computed with 64 million sample games) for the nine candidate moves on a 3×3 grid:

         a    b    c
    1   20   17   10
    2   25   30   15
    3   12   21    7

Over ten trials, the Monte Carlo player using 100 sample games selected move 2b nine times and move 2c once. Measured against the oracle's best score of 30, the average error of one move is

  ((30 - 30) * 9 + (30 - 15) * 1) / 10 = 1.5 points

Summary of Experimental Results
  • The additional enhancements improve the strength of Monte Carlo Go
  • Diminishing returns appear eventually
  • The additional enhancements make the diminishing returns appear sooner
  • More samples need to be collected in the early stage of a 9x9 game
Conclusions and Future Work
  • Conclusions
    • Additional samples achieve only small improvements
      • Unlike search algorithms, e.g. for chess
    • Good at strategy, not tactics
      • Blunders occur due to the lack of domain knowledge
    • Easy to evaluate
    • Easy to parallelize
    • The way for Monte Carlo Go to go: a small number of sample games combined with many enhancements looks promising

  • Future Work
    • Adjust probability with pattern matching
    • Learning
    • Search + Monte Carlo Go
      • MoGo (exploration-exploitation in the search tree using UCT)
    • Scale to 19×19

Questions?

References:

  • Go wiki http://en.wikipedia.org/wiki/Go_(board_game)
  • Gnu Go http://www.gnu.org/software/gnugo/
  • KGS Go Server http://www.gokgs.com
  • CGOS 9x9 Computer Go Server http://cgos.boardspace.net