Evolution and Coevolution of ANNs playing Go

Peter Mayer, 2004


Outline

  • Computers and Games

  • The game of Go

  • Experimental Setup

  • Training of Go playing ANNs

  • Evolution of Go Playing ANNs

  • Summary and Outlook


Games

  • Game-playing algorithms have been designed since AI’s onset

    • Clearly defined rules

    • Still complex

  • Chess received the most attention

    • More researched than Go

  • Two main approaches

    • Rely on expertise – directly programmed weighted features; Extensive knowledge

    • Use evolution – less knowledge; more versatility


The game of Go

  • Oldest (unaltered) strategic board game in the world

  • 10,000,000 players in Japan “alone”

  • Fairly simple rules

  • BUT difficult to master

    • Immense game tree (~200 options per move)

    • Complex structures

    • Many concurrent goals


Go Rules

  • 19x19 board

    • Empty in the beginning

  • Black & White “stones”

  • Black starts

  • Each turn

    • Place 1 stone

    • At an intersection

    • Never move stones

    • OR pass


Go Rules [2]

  • Objective - Get the most points !

  • Points are acquired by:

    • Securing Territories

    • Capturing opponent’s pieces


Go Rules [3]

  • Stones of the same color at vertically or horizontally adjacent intersections form a group

  • An empty intersection adjacent to a stone or group is called a "liberty" of that group

  • 1 Liberty = group in “atari”

  • No liberties -> CAPTURE ! Group is removed

  • Example – Black places stone in X resulting in right figure
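The group and liberty definitions above can be sketched as a flood fill. This is only an illustration, not code from the presentation: the board encoding (a list of strings with 'B', 'W' and '.') and the function name are assumptions.

```python
def group_and_liberties(board, row, col):
    """Return (stones, liberties) of the group containing the stone at (row, col).

    board is a square list of equal-length strings using 'B', 'W' and '.'
    (empty). This encoding is an assumption made for the sketch.
    """
    color = board[row][col]
    if color == '.':
        raise ValueError("no stone at the given intersection")
    size = len(board)
    stones, liberties = set(), set()
    stack = [(row, col)]
    while stack:
        r, c = stack.pop()
        if (r, c) in stones:
            continue
        stones.add((r, c))
        # Only vertical and horizontal neighbours count, never diagonals.
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < size and 0 <= nc < size:
                if board[nr][nc] == '.':
                    liberties.add((nr, nc))
                elif board[nr][nc] == color:
                    stack.append((nr, nc))
    return stones, liberties


board = [
    ".....",
    ".BW..",
    ".BW..",
    "..B..",
    ".....",
]
white_stones, white_libs = group_and_liberties(board, 1, 2)
in_atari = len(white_libs) == 1   # atari means exactly one liberty left
captured = len(white_libs) == 0   # no liberties means the group is removed
```

Here the two white stones form one group with three liberties, so the group is neither in atari nor captured.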


Go Rules [4]

  • Stones can be placed anywhere, but cannot commit suicide (except Chinese)

  • Legal if stone simultaneously captures opponent’s group (2 right figures)

Suicide – white cannot place at X

White CAN place at X

Result: capture


Go Rules [5]

  • Same position cannot occur more than once

  • Endless repetitions:

    • Black can capture at upper figure by placing at X

    • White - same by placing at Y

    • Black – repeat…

  • Ko rule

    • White may not place at Y before playing somewhere else first

    • Avoid any repetitions


Go Rules – Live and Dead groups

  • “Dead” groups if impossible to prevent capture

    • It is not necessary to do so

    • Group remains on board

    • At end of game, removed and added to captured stones

  • “Living” groups are impossible to capture

    • Group with 2 “eyes” – even if white surrounds it, playing at X or Y is suicide

    • Opponent must play elsewhere


Go Basics – End game

  • Play continues until both players pass

  • Players then alternately play stones at “neutral” points – adjacent to both White and Black

    • Also known as “dame” (DAH-MAY)

  • Dead stones are removed from the board and counted with other prisoners (1 point per prisoner)

  • Also - 1 point for each intersection surrounded by player’s stones (“territory”)


Go Basics – End game example

  • Prisoners were removed already

  • All 4 points marked X are dame – worthless

  • Black has

    • 7 points in UR (territory); 2 points in LL

    • 1 removed prisoner

    • TOTAL = 10 points

  • White has

    • 5 in UL; 2 in LR

    • 2 prisoners

    • TOTAL = 9 points

  • Black wins unless komi (5.5 pts compensation) is due
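The arithmetic of this example can be written out as a small sketch. The function name is hypothetical; the numbers are the ones on the slide, and komi is added to White's total because it compensates for Black moving first.

```python
def final_score(territory, prisoners, komi=0.0):
    """One point per surrounded intersection plus one per prisoner, plus komi."""
    return territory + prisoners + komi


black = final_score(territory=7 + 2, prisoners=1)
white_without_komi = final_score(territory=5 + 2, prisoners=2)
white_with_komi = final_score(territory=5 + 2, prisoners=2, komi=5.5)

black_wins_without_komi = black > white_without_komi
black_wins_with_komi = black > white_with_komi
```

Without komi Black leads 10 to 9; with a komi of 5.5 White reaches 14.5 and wins instead.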


Ranking and Handicaps

  • Determine Go players’ strength

  • Resemblance to martial arts

  • Both amateur and professional ranking system

  • Amateur

    • 35 kyu to 1 kyu

    • THEN 1 dan to 7 dan

  • Pro

    • 1 dan to 9 dan

    • Awarded only by Go institutions

  • Pro dans are much stronger than amateur dans


Ranking and Handicaps (2)

  • Handicaps

    • Weaker player starts with several stones on the board

    • Placed at specific places

    • Helps make games more even

  • Difference in ranks ~ number of handicap stones needed to win

    • 2 stones to even 2 dan against 4 dan

    • 4 to even 3 kyu and 2 dan

  • The most powerful Go programs reach only …

  • … 10 kyu!


Outline

  • Computers and Games

  • The game of Go

  • Experimental Setup

  • Training of Go playing ANNs

  • Evolution of Go Playing ANNs

  • Summary and Outlook


Experimental Setup

  • Opponent Go players

  • ANN player

  • Go board (input) representations

  • Move (output) representations

  • Coevolution

    • Hall of Fame coevolution

    • Cultural coevolution

  • General evolution setup


Go Players - Random

  • No strategy

  • Pass move also

  • “Knows” only the rules of go

    • legality of moves

  • Usually weakest opponent


Go Players – Naïve Player

  • Roughly human-beginner level

  • Able to save and capture stones

  • Knows about

    • Lost stones

    • Saving - connecting stones to living groups

    • Weak stones (not savable)


Go Players – Naïve Strategy

  • A subset of JaGo’s (main opponent) strategy

  • Outline (arranged by priority):

    • Attempt to save

    • Try to put opponent into atari

    • Connect weak stones

    • Capture opponent groups in atari

    • Check intersections for placing stones

      • In random order

      • Make sure no (own) liberties decrease below 2 as a result

    • Perform Random move


Go Players – JaGo Player

  • Java based program

  • Best computer player used

    • Not a strong player ~16 kyu

  • Knows standard techniques

    • Mainly save & capture

  • Uses pattern matching

    • Looks at entire board

    • 32 patterns, with rotations and mirrors


Go Players – JaGo Strategy (1)

  • Save stones in atari

  • Try to decrease liberties of large groups

  • Find own savable larger groups

  • Attack opponent’s groups (decreasing order:)

    • With 2 or more liberties and attackable

    • With 2 or more stones & less than 3 liberties

    • With 2 or fewer liberties


Go Players – JaGo Strategy (2)

  • Save own groups with few liberties if savable

  • Start pattern matching – Response; Center

    • Random move order

  • Seek opponent’s groups to capture in 2 moves

  • Perform random move which isn’t of a bad pattern

  • Capture opponent’s single liberties

  • Connect own weak stones

  • PASS




Go Players – GNU Go

  • Advantages

    • 5x5 to 19x19 boards

    • Handles handicaps well

    • Rated 10 kyu

  • Problems

    • 5x5 is solved – open at C3 for 18.5 points (komi=5.5) – Black always wins

    • GNU Go passes on B3, C2-4, D3 (only correct at C3)

    • Premature convergence of evolution


ANN Player

  • Inform ANN about actual position

  • Evaluate ANN output to receive next move

  • Representation is important!

  • Intention maps

    • For each Go move (including PASS) – a value in [0,1]

    • High value – high intention to make the move (and vice versa)

    • Select legal move with highest value

      • To avoid predictability – also consider suboptimal moves (“creativity factor”)
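A minimal sketch of reading a move off an intention map. The slides do not say how the creativity factor is applied, so the rule used here (play the second-best legal move with probability `creativity`) is an assumption, as are all names.

```python
import random

PASS = "pass"


def select_move(intentions, legal_moves, creativity=0.0, rng=random):
    """Pick a move from an intention map.

    intentions: dict mapping each move (including PASS) to a value in [0, 1].
    legal_moves: iterable of moves that are legal in the current position.
    creativity: hypothetical probability of playing the second-best legal move.
    """
    ranked = sorted(legal_moves, key=lambda m: intentions[m], reverse=True)
    if not ranked:
        return PASS
    if len(ranked) > 1 and rng.random() < creativity:
        return ranked[1]
    return ranked[0]


intentions = {(0, 0): 0.2, (1, 1): 0.95, (2, 2): 0.9, PASS: 0.1}
legal = [(0, 0), (2, 2), PASS]            # (1, 1) is assumed illegal here
best = select_move(intentions, legal)     # highest-valued legal move
```

The illegal move at (1, 1) has the highest intention but is skipped, so the player answers at (2, 2).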


Player Strength

  • To receive a rating, unrated Go players commonly play against rated players (same as in chess)

  • The strength s of a player is determined by

    • The score of 1000 double games

    • Against each of 3 opponents: R, N, JaGo

    • Divided by the number of games (6,000)

  • 1 is perfect strength

  • 3 opponents help resist over-fitting
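The strength measure can be sketched as follows, assuming that the "score" is simply the number of games won and that a double game means one game with each color. The win counts below are made up for illustration and are not results from the presentation.

```python
def strength(wins_per_opponent, double_games=1000):
    """Fraction of games won over all opponents (1.0 would be perfect strength)."""
    games_per_opponent = 2 * double_games          # one game per color
    total_games = games_per_opponent * len(wins_per_opponent)
    return sum(wins_per_opponent.values()) / total_games


# Hypothetical win counts out of 2,000 games against each opponent.
wins = {"Random": 1800, "Naive": 1000, "JaGo": 200}
s = strength(wins)   # 3,000 wins out of 6,000 games
```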


Player Competence

  • Strength is not understanding of rules (legality)

    • E.g. 2 players receive same score but only one always tried legal moves first

  • The competence C of a player is defined as follows:

  • bi = games; i = moves; tij = #tried illegal moves; kij = #possible illegal moves

  • C is averaged over all games


Board Representations

  • 19x19 boards

    • far too large

    • Even for evolved agents

  • Use only 5x5 boards


Board Representations

  • Should preprocess the position to make the ANN’s life easier

  • Tested in training experiments

  • Standard Input Representation (SIR)

    • 2 neurons per intersection:

    • 1 per player’s piece; 1 per opponent’s

    • No distinction between B and W stones

    • Optional – 1 neuron to tell if B or W

    • (2*b^2) neurons (where b is board size) = 50
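A sketch of the SIR encoding, assuming a board given as strings of 'B', 'W' and '.' and a `player` argument naming the side to move. The optional color neuron is left out, and the order of the two neurons per intersection is an assumption.

```python
def encode_sir(board, player):
    """Two inputs per intersection: (own stone, opponent stone)."""
    opponent = 'W' if player == 'B' else 'B'
    inputs = []
    for row in board:
        for cell in row:
            inputs.append(1.0 if cell == player else 0.0)
            inputs.append(1.0 if cell == opponent else 0.0)
    return inputs


board = [
    "B....",
    ".W...",
    ".....",
    ".....",
    ".....",
]
sir = encode_sir(board, player='B')   # 2 * 5^2 = 50 values
```

Because stones are coded as own or opponent's rather than Black or White, the same net can play either color.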


Representations - NIR

  • Naïve Input Representation

  • More compact

  • 1 neuron per intersection

  • Set to -1 (player’s stone) or 1 (opponent’s)

  • 0 if empty

  • Uses half of SIR’s neurons = 25
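A sketch of the NIR encoding under the same assumed board format (strings of 'B', 'W' and '.', with `player` naming the side to move).

```python
def encode_nir(board, player):
    """One input per intersection: -1 own stone, 1 opponent stone, 0 empty."""
    opponent = 'W' if player == 'B' else 'B'
    values = {player: -1.0, opponent: 1.0, '.': 0.0}
    return [values[cell] for row in board for cell in row]


board = [
    "B....",
    ".W...",
    ".....",
    ".....",
    ".....",
]
nir = encode_nir(board, player='B')   # 5^2 = 25 values
```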


Representations - LVIR

  • Limited View Input Representation

  • Splits the Go board into several square areas of size 3x3

  • Idea – simplest way of capturing stones works within this area

    • E.g. capture of 1 stone by surrounding it

  • Areas overlap at middle row and middle column

  • Coding – similar to SIR

    • w is the number of areas (= 4)

    • 2 * 9 * w = 72 neurons

    • Could also be Naïve
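A sketch of how the 3x3 areas and the 72 inputs could be produced on a 5x5 board. The window order and the SIR-style coding of each cell are assumptions; only the window size, the overlap at the middle row and column, and the neuron count come from the slide.

```python
def lvir_windows(board, window=3, step=2):
    """Cut the board into overlapping window x window areas."""
    size = len(board)
    areas = []
    for top in range(0, size - window + 1, step):
        for left in range(0, size - window + 1, step):
            areas.append([board[r][left:left + window]
                          for r in range(top, top + window)])
    return areas


def encode_lvir(board, player):
    """SIR-style coding (own, opponent) of every cell of every area."""
    opponent = 'W' if player == 'B' else 'B'
    inputs = []
    for area in lvir_windows(board):
        for row in area:
            for cell in row:
                inputs.append(1.0 if cell == player else 0.0)
                inputs.append(1.0 if cell == opponent else 0.0)
    return inputs


board = [
    ".....",
    ".....",
    "..B..",
    ".....",
    ".....",
]
areas = lvir_windows(board)          # 4 areas on a 5x5 board
lvir = encode_lvir(board, 'B')       # 2 * 9 * 4 = 72 values
```

The centre stone lies in the shared middle row and column, so it shows up in all four areas.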


Clever Representations

  • Based on image processing and circuits

  • We want fewer raw inputs to allow the ANN to concentrate more on features

  • Manhattan distance

    • Used in integrated circuits where wires run parallel to X or Y axis

    • Got its name from Manhattan NY, where streets are aligned in grid

    • P1 = (x1, x2)

    • P2 = (y1, y2)

    • d(P1, P2) = |x1 - y1| + |x2 - y2|


Clever Representations

  • Manhattan distance is related to distance of Go stones (no diagonals)

    • distance = [#(separating intersections) + 1]

    • 1 if next to each other

    • 2 if separated by one stone

    • 3 for knight’s move or two separating stones
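The three cases listed above can be checked with a short sketch. Points are (row, column) pairs; the function name is illustrative.

```python
def manhattan(p1, p2):
    """Manhattan distance between two board intersections."""
    return abs(p1[0] - p2[0]) + abs(p1[1] - p2[1])


adjacent = manhattan((2, 2), (2, 3))        # next to each other
one_between = manhattan((2, 2), (2, 4))     # one intersection in between
knights_move = manhattan((2, 2), (3, 4))    # knight's move
```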


Representations: c-o-Matrix

  • Co-occurrence-matrices

  • Used in image processing

  • Many parameters are derived from it

    • Mean, Sd, energy, contrast, homogeneity, …

  • Square matrix

  • Based on a relation p between image positions (symmetric if p is)


Representations: c-o-Matrix

  • Element C[i][j] =

    • Number of times a pixel of value (color) i occurs in the image

    • In the relation specified by p

    • To a pixel of value (color) j

    • Matrix size is the number of different colors


Representations: c-o-Matrix

  • An actual go board is an “image” with 3 different colors (including empty)

  • Example

    • p1: Manhattan distance of 1 between 2 points

    • First matrix row:

    • B near B 16 times

    • B near W 3 times

    • B near empty 11 times
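A sketch of a co-occurrence matrix for a Go position, with the relation p taken to be "Manhattan distance equals d". The slides do not say whether pairs are counted once or in both directions; this sketch counts ordered pairs, which makes the matrix symmetric. The board format and names are assumptions.

```python
COLORS = "BW."   # row and column order: Black, White, empty


def cooccurrence(board, distance=1):
    """C[i][j] = number of ordered point pairs (p, q) at the given Manhattan
    distance where p has color i and q has color j."""
    size = len(board)
    index = {color: i for i, color in enumerate(COLORS)}
    matrix = [[0] * len(COLORS) for _ in COLORS]
    points = [(r, c) for r in range(size) for c in range(size)]
    for r1, c1 in points:
        for r2, c2 in points:
            if abs(r1 - r2) + abs(c1 - c2) == distance:
                matrix[index[board[r1][c1]]][index[board[r2][c2]]] += 1
    return matrix


board = [
    "BB.",
    "W..",
    "...",
]
c1 = cooccurrence(board, distance=1)
black_row = c1[0]   # B near B, B near W, B near empty
```

On this small 3x3 example the first row reads 2, 1, 2: Black is next to Black twice, next to White once and next to an empty point twice.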


Representations: c-o-Matrix

  • Does not say much about absolute positions – must combine

  • SIR and C for whole board

  • NIR and C for whole board

  • NIR and Cs for 3x3 areas

  • sLVIR and Cs for 3x3 areas

  • NLVIR and Cs for 3x3 areas


Output Representations

  • Only 2 output representations

  • Standard Output Representation (SOR)

    • Each intersection is represented by 1 neuron

    • 1 for PASS

    • (b^2 + 1) neurons


Output Representations

  • Row Column Output Representation (RCOR)

    • Used to decrease ANN size

    • 5 neurons for columns; 5 for rows

    • 1 for PASS

    • (2b + 1) neurons

    • Intention more complicated:

    • PASS intention is square of relevant neuron

  • RCOR Limits intention map:

    • v1>v2  y1>y2  v4>v3

    • All values positive, non-zero


Coevolution

  • Derives non-static fitness, as in nature

  • 1 or more populations; interacting

  • Competitive [battle] vs. Cooperative [subtasks]

  • Advantages

    • “Who needs enemies when you got friends like these?” – saves finding opponents; Especially in Go where no strong program exists

    • Variety in fitness – adaptive opponents

    • No upper bound for improvement


Coevolution Methods Applied

  • Based on work by Lubberts & Miikkulainen [2001]

  • Hall of Fame

    • Host population and Master population

    • Maintaining the ability of host population to beat opponents of previous generations

    • Each generation, the best individual is added to HoF

    • The whole population competes against a sample of the HoF


Coevolution - HoF

  • Applied in this research

  • HoF initially filled without competition

  • Individuals get their fitness by competing against the masters

  • When full - host with highest win rates (against masters) joins HoF

    • Replace first Master to lose all games

  • Coevolutionary progress cannot be directly seen

    • Both populations constantly changing


Cultural Coevolution

  • A new approach!

  • Maintains “culture” of masters resembling HoF

  • To enter culture, host must defeat all masters

    • Masters never replaced – unlimited culture size

  • Every individual receives a fitness score by competing against all masters

  • Culture growth rate decreases rapidly

    • Every new master is the strongest found (yet)
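The culture rules above can be sketched as follows. The `beats(a, b)` callback stands in for playing Go games and is hypothetical, as is the choice to admit at most one host per call; the presentation does not spell out either detail.

```python
def fitness(host, culture, beats):
    """Share of the masters that the host defeats."""
    if not culture:
        return 0.0
    return sum(1 for master in culture if beats(host, master)) / len(culture)


def update_culture(culture, hosts, beats):
    """Admit the first host that defeats every current master.

    Masters are never removed, so the culture can only grow.
    """
    for host in hosts:
        if all(beats(host, master) for master in culture):
            culture.append(host)
            return host
    return None


# Toy stand-in: a player is a number and the larger number always wins.
def toy_beats(a, b):
    return a > b


culture = [3]
admitted = update_culture(culture, hosts=[2, 5, 7], beats=toy_beats)
again = update_culture(culture, hosts=[5], beats=toy_beats)
```

In the toy run player 5 enters the culture, and a second attempt by the same player fails because it would have to defeat itself.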


Cultural Coevolution [2]

  • Numerous advantages

  • Maintains ability to defeat weak players

    • Keeps good solutions found

  • Same player cannot enter twice

    • Needs to defeat itself

  • Culture’s performance never decreases

  • Avoid focusing on a specific player’s weakness

    • As soon as any master is immune, the hosts have to find another way

    • More masters -> less likely to remember all weaknesses


General Evolution Setup

  • Opponents – Random; Naïve; JaGo

  • Fitness = strength

    • Rate of wins against all 3 opponents

    • 6,000 games of both colors

  • Not using scores, only win rates

    • Defeating more opponents is better

  • Generalized Multi-Layer Perceptrons (GMLPs)

    • All non-loop connections are permitted

  • Evolving

    • Hidden neurons; connections; weights; bias (for non-input)


General Evolution Setup [2]

  • 2 binary Chromosomes used

    • 1 for connections : 0-no 1-yes

    • 1 for hidden neurons (if 0, no connections also)

    • Number of possible connections:

    • ni, nh, no – number of input, hidden and output neurons

    • Determines size of chromosome

  • Real-Chromosome

    • Weights & Bias values (seen as weights)

    • Size is number of connections + number of bias vals (for non-input)
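The formula for the number of possible connections did not survive in this transcript. The sketch below shows one counting that fits the description "all non-loop connections are permitted"; the specific rules are assumptions. With 50 inputs, 26 outputs and an assumed cap of 20 hidden neurons it gives 3,010, which matches the maximum quoted on the Evolution Setup slide, but the cap of 20 is inferred rather than stated.

```python
def max_connections(n_in, n_hidden, n_out):
    """Count feed-forward connections under one possible GMLP ordering.

    Assumed (not stated on the slides): inputs feed every hidden and output
    neuron, each hidden neuron feeds every later hidden neuron and every
    output neuron, and output neurons do not feed each other.
    """
    input_links = n_in * (n_hidden + n_out)
    hidden_links = n_hidden * (n_hidden - 1) // 2
    output_links = n_hidden * n_out
    return input_links + hidden_links + output_links


def real_chromosome_length(n_connections, n_hidden, n_out):
    """One weight per connection plus one bias per non-input neuron."""
    return n_connections + n_hidden + n_out


connections = max_connections(n_in=50, n_hidden=20, n_out=26)
real_genes = real_chromosome_length(connections, n_hidden=20, n_out=26)
```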


General Evolution Setup [3]

  • Tournament selection (size 2)

  • 2 point crossover

  • Binary mutation

    • Flip bits with 1/L probability

  • Real-Chromosome Mutation

    • Multiple-σ self-adaptation (SA)

    • Each object maintains altering “strategy” params which alter distribution of “object” params

    • Normal distributions used for both
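The operators for the binary chromosomes can be sketched with textbook versions. These are standard formulations, not code from the presentation, and the self-adaptive mutation of the real-valued chromosome is deliberately left out because its exact update rule is not given.

```python
import random


def tournament_select(population, fitness, rng=random):
    """Tournament of size 2: draw two individuals, keep the fitter one."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b


def two_point_crossover(parent_a, parent_b, rng=random):
    """Swap the segment between two random cut points."""
    i, j = sorted(rng.sample(range(len(parent_a) + 1), 2))
    child_a = parent_a[:i] + parent_b[i:j] + parent_a[j:]
    child_b = parent_b[:i] + parent_a[i:j] + parent_b[j:]
    return child_a, child_b


def bit_flip_mutation(bits, rng=random):
    """Flip each bit independently with probability 1/L."""
    rate = 1.0 / len(bits)
    return [1 - b if rng.random() < rate else b for b in bits]


rng = random.Random(0)
zeros, ones = [0] * 8, [1] * 8
child_a, child_b = two_point_crossover(zeros, ones, rng)
mutant = bit_flip_mutation(zeros, rng)
```

With a rate of 1/L, one bit per chromosome is flipped on average.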


Setup – Recurrent Nets

  • Difficult to learn Go without structured input

  • Experiments with recurrent nets included

  • Allow loops for input Ns

    • Naturally represent adjacent board intersections

  • No hidden Ns

  • Played against JaGo

  • Typically output changes without input change due to feedback loops

    • Computed output only once!

    • Only 2 directly connected Ns influence each other

    • Evolutions should connect only close Ns


Outline

  • Computers and Games

  • The game of Go

  • Experimental Setup

  • Training of Go playing ANNs

  • Evolution of Go Playing ANNs

  • Summary and Outlook


Training ANNs – Setup

  • Testing IRs mentioned previously

  • No Go-specific knowledge used

  • Each experiment was repeated 20 times

  • Nets, same as Richards [1998]

    • 3 layers; Fully connected; Feed forward

    • Linear activation for input Ns; Sigmoid for rest

    • 50 input; 26 output; 100 hidden - 7600 connections

  • Patterns:

    • JaGo vs JaGo; 5x5 board

  • Rprop – resilient variant of Backprop
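The size of the trained net can be checked with a small forward-pass sketch. Biases and the Rprop training loop are omitted, and the random initial weights are an assumption; only the layer sizes and activations come from the slide.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_layer(n_in, n_out, rng):
    """One weight row per target neuron, fully connected to the layer before."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def forward(inputs, hidden_w, output_w):
    """Input neurons are linear (identity); hidden and output use the sigmoid."""
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in hidden_w]
    return [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in output_w]


rng = random.Random(0)
hidden_w = make_layer(50, 100, rng)    # 50 inputs -> 100 hidden
output_w = make_layer(100, 26, rng)    # 100 hidden -> 26 outputs
n_connections = sum(len(r) for r in hidden_w) + sum(len(r) for r in output_w)
outputs = forward([0.0] * 50, hidden_w, output_w)
```

The count works out to 50 * 100 + 100 * 26 = 7,600 connections, and every output lies strictly between 0 and 1 as an intention map requires.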


Training ANNs – Experiment 1

  • Determine number of training cycles

    • Too few cycles -> weights not adjusted properly

    • Too many -> over-fitting

  • Determine training pattern set

    • Limit the level a Go player can reach

    • Should include all 3 game stages

    • Both expert and novice moves

  • JaGo vs JaGo

    • All game stages

    • No distinction between winner and loser moves

  • 1,000 .. 5,000 Cycles; 50/100/200 Games


Training ANNs – Results 1

  • Average of 20 runs

  • 100&200 games better than 50

  • 3000/5000 cycles don’t add strength

  • Best – 200 games; 2000 cycles

    • Used hereafter


Training ANNs – Experiment 2

  • Determine number of hidden Ns

  • Many

    • Diverse features

  • Few

    • Few stronger features (perhaps better ones)

    • Less time-consuming

  • 100 Ns yielded best results -> selected


Training ANNs – Experiment 3

  • Output representations

  • Standard (SOR) vs Row-Column (RCOR)

  • 200 games; 2000 cycles; 100 hidden Ns

  • Similar strength; RCOR competence slightly lower

  • RCOR is still expensive and adds constraints

  • SOR is used in the following experiments


Training ANNs – IR Experiments

  • Various input representations

  • Used reference-ANN (RANN)

    • SIR & SOR; 100 hidden; 7,600 connections

    • Strength = 0.2908; Competence = 0.8467

  • 200 games; 2,000 cycles

  • NIR (half input size) & SOR

    • Strength = 0.2093; Competence = 0.8031

    • Naïve input makes it difficult to learn Go

  • LVIR (3x3 windows) & SOR

    • Strength = 0.2755; Competence = 0.8258

    • Slightly lower; LVIR doesn’t add input difficulty


Training ANNs – IRs [2]

  • Whole Co-occur-matrix (dist=1,2,3); SIR&SOR

    • Found better Strength & Competence!

    • Knight’s-Move matrix adds relevant information

  • Whole matrix (dist=1,2,3); NIR&SOR

    • 21% fewer connections due to NIR

    • Better than standard NIR, but still low


Training ANNs – IRs [3]

  • 3x3 matrices (dist=1,2,3) ; NIR&SOR

    • Low but ~20% better than previous (whole matrix) NIR

  • 3x3 matrices (dist=1,2,3) ; LVIR/NLVIR

    • Both matrices and board views use 3x3 windows

    • No improvement; Huge number of Ns not necessary



Training ANNs – IRs Summary

  • Trained ANNs play better against JaGo than against Naïve

    • Although JaGo is better

    • Some over-fitting for good players

    • Against Naïve outputs close to zero – no response

  • NIR ANNs generally weaker than SIR

  • Manhattan distance of 2 good against Random

  • IR + whole matrix (dist=2) was strongest

  • RANN is still best; Selected for evolution


Outline

  • Computers and Games

  • The game of Go

  • Experimental Setup

  • Training of Go playing ANNs

  • Evolution of Go Playing ANNs

  • Summary and Outlook


Evolving Go ANNs

  • Setup of Evolution experiments

  • Evolution of ANNs against Computer Players

    • Random Player; Naïve; JaGo

    • Recurrent against JaGo

  • Coevolution

    • Cultural

    • Hall of Fame

  • Training Evolved ANNs


Evolution Setup

  • 5x5 boards; Komi of 5.5

  • 50 Individuals

    • Described previously (3 chromosomes)

  • GMLPs with SIR and SOR

    • Max 3,010 connections

  • Recurrent ANNs

    • Using NIR (25 Ns) and SOR (26)

    • Max 2,601 connections

  • Same strength measure as training (6k games)


Evolution Against Random

  • Empirically 64 games to determine fitness

  • Best ANN evolved {Str=0.4005; Comp=0.48}

    • After 47 gens; 929 connections

  • Evolved ANNs hardly reacted to different positions

    • Always in the middle; Never in corners – creates eyes

    • Unnecessary to “think” against Random

  • Occasionally Random places at strategic intersection and then usually wins

  • Only 3 of 20 best ANNs open at optimal C3


Evolution Against Naive

  • Better player; ANNs develop better strategies

  • Same setting

  • 200 gens for ALL population to win ½ of games – fast learning

  • Best {Str=0.69; Comp=0.487} after 2915 gens

  • High strength and only 10 hidden !!

  • Win rates

    • Same against Naïve and Random

    • Low against JaGo (~0.2)

  • 25% use optimal opening move (still low)

  • Exploit Naïve’s weaknesses at endgames


Evolution Against JaGo

  • Far stronger than Naïve (85% wins)

  • Takes significantly more time for each move

    • Used distributed computing

    • 64 games would take 32 hours per run

    • Only 32 games for fitness - empirically sufficient

  • Best {Str=0.772; Comp=0.476} after 1909 gens

    • Scores 100% wins

    • 1k gens to score 0.4;

    • In 4 runs 100% wins in 3k gens!!!

  • Sd twice as large – harder for evolution

  • Weak against Naïve ~0.4;Strong against Random


Evolution Against JaGo

  • Again, low competence ~0.5

  • Evolved strategies

    • Still connecting stones but faster (responsive)

    • Tenuki (abandon & play elsewhere) to distract JaGo

    • 9 open optimally; All in 3x3 area around center

    • Strength depends heavily on opening move

    • Mid games sometimes show standard Go sequences!

    • Take advantage of JaGo’s weakness – capturing weak stones


Recurrent Nets Evolution

  • Natural representation on Go board

    • Inputs are connected

  • More time consuming

    • Only 2 runs; 32 games; setting described previously

  • 100% win rate within 1k generations!!!

  • Both nets open at C3

  • Strategies

    • 1 aggressive;1 distractive

    • Protect; Create living groups; Bad Endgames

  • Very high relative strength

    • 0.94 Random; 0.49 Naïve (never played before)


Cultural Coevolution

  • Until now much over-fitting was observed

  • Fitness

    • 8 games against all masters (4 each color)

    • Few because games are quite similar

  • Results of typical run – host population

    • 3,500 gens

    • 90% wins at 500 gens

    • Stagnation around 1k

    • Last master added at 462

    • After 2k Mean fitness decreases


Cultural Coevolution [2]

  • Masters

    • 21 ANNs

    • After number 8 all have R>0.8

    • Last obtained Strength of 0.365

  • Strategy (both populations)

    • Many random move selections

    • Due to many saturated Ns (output=1)

    • Games usually similar but multiple random moves are hard to defeat

    • May be caused by the mutation operator (multiple-σ self-adaptation)


Cultural Coevolution [3]

  • Strategy (cont.)

    • Coevolution found easy solution

    • Computer players are very difficult to beat with saturated neurons

  • New extremely long experiment (60k gens!) was performed with different mutation (single-SA)

    • Similar results, Except:

    • Now most culture growth until gen 10k (last at 40k)

    • Now less saturated Neurons

    • Less fitness decrease despite increasing culture Strength


Cultural Coevolution [4]

  • Culture Summary

    • 80 members

    • After #16 Random>0.94

    • After #29 all opened optimally

    • After #57 all Strength>0.4

    • Wins against JaGo ~0.5 Naïve

    • ~15 hidden Ns – fluctuate between successive


Recurrent & Cultural

  • 10k gens

  • Faster learning but basically same results

    • R>0.9 at C11 (compared to C14)

    • N>0.2 at 14 (compared to C37)

  • Strategy

    • Still bad against JaGo

    • Bad openings! (only 2% optimal)

    • Only last 5 masters close to center

    • Learned not to capture dead groups


Hall of Fame Coevolution

  • Compared to Cultural

  • Parameters

    • Important parameter is HoF size={1,2,4,8,16}

    • Eight games against each master

    • 3k gens were coevolved

    • After coevolution all HoF ANNs were evaluated

    • Every 100 gens the best ANN was evaluated


Hall of Fame Coevolution [2]

  • Results – HoF size 1

    • Masters – low strength of 0.3625

    • In gen 1k – one ANN had 0.4 -> lost solution

    • HoF changed every generation -> cycles

  • Results – HoF size 16

    • Master 5 – highest strength of 0.4403 in gen 400

    • Strength of 0.5057 was obtained and lost

    • One master was replaced in every generation!

    • Somehow weak masters remained in the HoF

    • Host population stagnates (cycles)


Hall of Fame Coevolution [3]

  • Strategies

    • All place first stone at D4!

    • HoF coevolution does not encourage diversity among ANNs


Training Evolved ANNs

  • Evolution against JaGo –

    • Strength ~0.77

    • 4-16 hidden Ns

  • Training

    • Strength ~0.3

    • 100 hidden Ns

  • Check whether evolved structure is good

    • Train after evolution

    • Train without evolution only using structure


Training Evolved ANNs[2]

  • Used best 2 evolved ANNs against JaGo

    • Taken from runs 11 & 17

    • ANN11 – 10 hidden; 1178 connections

    • ANN17 – 14 hidden; 1162 connections

  • Trained with 200 games; 2,000 cycles

  • Experiment 1 (post-evolution) Results

    • Bad! Strength of 0.11 and 0.10

    • Lower than any trained ANN (RANN has 0.29)

    • High competence 0.89


Training Evolved ANNs[3]

  • Experiment 2 – keep only evolved structure

    • Strength below 0.152 (RANN is 0.29)

    • Weakest against JaGo (0.05) although trained with JaGo

    • Against Naïve 0.11 (same as RANN)

  • Evolution creates efficient structures

    • Few hidden Ns

    • Difficult to learn with training

  • High competence because they seldom responded with the same move to different positions


Summary

  • Training could not achieve high Go playing skills

  • Evolved ANNs specialized in the opponent which was used during evolution

  • Cultural coevolution generated strong players

    • Strength increasing throughout the process

    • Perhaps an ANN stronger than amateurs can be coevolved

  • Recurrent nets learned faster


Summary [2]

  • 2 coevolved ANNs (recurrent and feed-forward) won the grand tournament

  • Coevolution proved better than evolution for developing Go strategies

  • Recurrent ANNs would provide a field for further research

    • More natural board representation

    • Could contain a fixed input layer representing the board

