- 221 Views
- Uploaded on
- Presentation posted in: General

Game Playing

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Human Game Playing

- Intellectual Activity
- Skill Comparison

Computer Game Playing

- Testbed for AI
- Limitations

General Game Players are systems able to play arbitrary games effectively based solely on formal descriptions supplied at “runtime”.

Translation: They don’t know the rules until the game starts.

Additional Complexities:

uncertainty (moves of other players)

resource bounds (game clock)

Unlike specialized game players (e.g. Deep Blue), they do not use algorithms designed in advance for specific games.

Artificial Intelligence Technologies

knowledge representation

reasoning, reformulation, learning

rational behavior w/ uncertainty, resource bounds

Annual AAAI Competition

Held at AAAI conference

Administered by Stanford

(Stanford folks not eligible to participate)

Reward

2005 - Jim Clune

2006 - Stephan Schiffel and Michael Thielscher

2007 - Yngvi Bjornsson and Hilmar Finsson

2008 - Yngvi Bjornsson and Hilmar Finsson

Come cheer on your favorite.

The battle continues at AAAI-2007.

General Game Description

a

b

s2

s5

s8

a

a

b

a

a

b

d

b

b

s1

s3

s6

s9

s11

d

a

c

a

a

c

b

b

a

s7

s4

s10

aa

ab

s2

s5

s8

aa

bb

ab

aa

aa

ab

bb

ab

ab

s1

s3

s6

s9

s11

ba

aa

ba

ba

ab

bb

bb

aa

aa

s4

s7

s10

Since all of the games that we are considering are finite, it is possible in principle to communicate game information in the form of transition graphs.

Problem:Size of description. Even though everything is finite, the graphs can be large.

Ideas:

1. Different Conceptualization

2. Compact Encoding

s1

Decompose states into “propositions”.

Benefit - n propositions can represent 2n states.

p

q

s

r

Propositions

Connectives

Transitions

p

q

r

p

r

q

x31

o11

x21

o21

x11

o31

o32

x22

x12

x32

o22

o12

or1

xr2

xr1

or3

xr3

or2

o33

x13

o23

x23

x33

o13

row1o :- row1x :-

true(11o), true(11x),

true(12o), true(12x),

true(13o) true(13x)

row2o :- row1x :-

true(21o), true(21x),

true(22o), true(22x),

true(23o) true(23x)

row3o :- row1x :-

true(31o), true(31x),

true(32o), true(32x),

true(33o) true(33x)

- row(M,P) :-
- true(cell(M,1,P)) &
- true(cell(M,2,P)) &
- true(cell(M,3,P))

init(cell(1,1,b))

init(cell(1,2,b))

init(cell(1,3,b))

init(cell(2,1,b))

init(cell(2,2,b))

init(cell(2,3,b))

init(cell(3,1,b))

init(cell(3,2,b))

init(cell(3,3,b))

init(control(x))

legal(P,mark(X,Y)) :-

true(cell(X,Y,b)) &

true(control(P))

legal(x,noop) :-

true(control(o))

legal(o,noop) :-

true(control(x))

next(cell(M,N,P)) :-

does(P,mark(M,N))

next(cell(M,N,Z)) :-

does(P,mark(M,N)) &

true(cell(M,N,Z)) & Z#b

next(cell(M,N,b)) :-

does(P,mark(J,K)) &

true(cell(M,N,b)) &

(M#J | N#K)

next(control(x)) :-

true(control(o))

next(control(o)) :-

true(control(x))

terminal:- line(P)

terminal:- ~open

goal(x,100) :- line(x)

goal(x,50) :- draw

goal(x,0) :- line(o)

goal(o,100) :- line(o)

goal(o,50) :- draw

goal(o,0) :- line(x)

row(M,P) :-

true(cell(M,1,P)) &

true(cell(M,2,P)) &

true(cell(M,3,P))

column(N,P) :-

true(cell(1,N,P)) &

true(cell(2,N,P)) &

true(cell(3,N,P))

diagonal(P) :-

true(cell(1,1,P)) &

true(cell(2,2,P)) &

true(cell(3,3,P))

diagonal(P) :-

true(cell(1,3,P)) &

true(cell(2,2,P)) &

true(cell(3,1,P))

line(P) :- row(M,P)

line(P) :- column(N,P)

line(P) :- diagonal(P)

open :-true(cell(M,N,b))

draw :- ~line(x) &

~line(o)

- What we see:
- next(cell(M,N,x)) :-
- does(white,mark(M,N)) &
- true(cell(M,N,b))

- What the player sees:
- next(welcoul(M,N,himenoing)) :-
- does(himenoing,dukepse(M,N)) &
- true(welcoul(M,N,lorenchise))

General Game Playing

Possible to use logical reasoning for everything

Computing legality of moves

Computing consequences of actions

Computing goal achievement

Computing termination

Easy to convert from logic to other representations

Simplicity of logical formulation

Simple, widely published algorithms

3-4 orders or magnitude speedup

no asymptotic change

X

O

X

cell(1,1,x)

cell(1,2,b)

cell(1,3,b)

cell(2,1,b)

cell(2,2,o)

cell(2,3,b)

cell(3,1,b)

cell(3,2,b)

cell(3,3,x)

control(black)

legal(W,mark(X,Y)) :-cell(1,1,x)

true(cell(X,Y,b)) cell(1,2,b)

true(control(W))cell(1,3,b)

cell(2,1,b)

legal(white,noop) :-cell(2,2,o)

true(cell(X,Y,b)) cell(2,3,b)

true(control(black))cell(3,1,b)

cell(3,2,b)

legal(black,noop) :-cell(3,3,x)

true(cell(X,Y,b)) control(black)

true(control(white))

Conclusions:

legal(white,noop) legal(black,mark(1,2))

…

legal(black,mark(3,2))

cell(1,1,x)cell(1,1,x)

cell(1,2,b)cell(1,2,b)

cell(1,3,b)cell(1,3,o)

cell(2,1,b) blackcell(2,1,b)

cell(2,2,o)cell(2,2,o)

cell(2,3,b) mark(1,3)cell(2,3,b)

cell(3,1,b)cell(3,1,b)

cell(3,2,b)cell(3,2,b)

cell(3,3,x)cell(3,3,x)

control(black)control(white)

next(cell(M,N,o)) :-

does(black,mark(M,N)) &

true(cell(M,N,b))

X

O

X

O

O

X

X

X

X

O

X

O

X

O

X

X

X

O

O

O

O

X

O

X

O

X

X

X

O

X

O

X

X

X

X

X

O

X

O

X

O

X

O

O

X

X

X

X

X

O

O

X

O

O

O

X

X

O

X

O

O

X

X

X

X

O

X

O

X

O

X

X

X

O

O

O

O

X

O

X

O

X

X

Large state spaces

~5000 states in Tic-Tac-Toe

~1044 states in Chess

Limited Resources

Memory

Time (start clock, move clock)

Incremental Search

Expand Game Graph incrementally

As much as time allows

Minimax/etc. evaluation on non-terminal states

using an evaluation function of some sort

But how do we evaluate non-terminal states?

In traditional game playing, the rules are known in advance, and the programmer can invent an evaluation function. Not possible with GGP.

Ideas

Goal proximity (everyone)

Maximize mobility (Barney Pell)

Minimize opponent’s mobility (Jim Clune)

Depth Charge (random probe)

<insert your good idea here>

Basic Idea

(1) Explore game graph to some level

storing generated states

(2) Beyond this, explore to end of game

making random choices for moves of all players

not storing states (to limit space growth)

(3) Assign expected utilities to states by

summing utilities and dividing by number of trials

Features

Fast because no search

Small space because nothing stored

100

0

0

0

0

100

100

0

0

0

0

0

100

0

100

100

25

50

75

0

100

0

0

0

0

100

100

0

0

0

0

0

100

0

100

100

Metagaming

Metagaming is the process of analyzing and/or modifying a game offline (e.g. during the startclock or between moves or in a separate thread) so as to improve performance online (once the game begins, recommences, etc.).

Examples of Metagaming Techniques:

Headstart

End Game Book

Factoring

Dead State Removal

And so forth

Hodgepodge = Chess + Othello

Branching factor: aBranching factor: b

Analysis of joint game:

Branching factor as given to players: a*b

Fringe of tree at depth n as given: (a*b)n

Fringe of tree at depth n factored: an+bn

Examples

Symmetry, e.g. Tic-Tac-Toe

Independence, e.g Hodgepodge

Bottlenecks, e.g. Triathalon

Techniques:

Graph Analysis (factoring, branching factor, …)

Proofs (esp those using mathematical induction)

Trade-off - cost of finding structure vs savings

Sometimes cost varies with size of description

rather than size of expanded game graph

Use of startclock versus playclock

p1

p2

a1

a2

r1

r2

q1

q2

s1

n1

n2

s2

t1

t2

l

n3

n4

l1

l2

o1

a3

o2

a4

g

Method:

Compute factors

Use factors to generate “subplans”

Reassemble overall plan from subplans

Patterns:

Disjunctive Factors

* Conjunctive Factors

* Interleaved Factors

Sequential Factors

Overall factors versus conditional factors

Conjunctive Goals

Delete conjunctive goal

Make conjuncts goals for each factor

Conjunctive Legality

Delete legality conjunction

Make conjuncts goals for each factor

Solve problem, conjoin solutions.

Can delete disjunctions too, but treating disjunctions as conjunctions might destroy completeness.

p1

p2

a1

a2

r1

r2

q1

q2

s1

n1

n2

s2

t1

t2

l

n3

n4

l1

l2

o1

o3

o2

a3

g

n3

n4

l1

l2

o1

o2

p1

p2

a1

a2

r1

r2

q1

q2

g1

n1

n2

g2

t1

t2

? (time (genplan propcompbuttons))

407 states

1,118 milliseconds.

605,728 bytes of memory allocated.

(PROG A B A D E D)

? (time (multiplan propcompbuttons))

14 states

53 milliseconds

22,320 bytes of memory allocated.

(PROG A B A D E D)

Partition time: 1 millisecond.

Conclusion

Critique: Game playing is frivolous.

Reply: General game playing is

serious business.

AI: general problem solving

multiple agents, resource bounds

DB: Update through views with dynamic

constraints and long-running transactions

Systems: Automatic Programming redux

Motivating Applications:

Enterprise Management, Computational Law

Funding by Darpa, SAP, etc.

Like General Game Theory, traditional Game Theory is concerned with games in general.

Game Theory is concerned with the analysis of game trees. There is little concern for how those trees are communicated. There is little concern for trade-offs between deliberation and action.

In General Game Playing, the game “tree” is communicated at runtime. Hence there is an emphasis on description and use of this description. There is the possibility of new forms of rationality.

Planning:

Inputs: State machine, initial state, goal states

Output: Plan that transforms initial to goal state

General Game Playing:

Inputs: State machine, initial state, goal states

Objective: Reach a goal state (plan/execute?)

Difference:

Presence of execution environment

with time bounds on combined deliberation/action

Study of interleaved planning and execution

Critique: Human Intelligence is arguably the product of eons of evolution. We are, to some extent at least, “wired” to function well in this world. General Game Players have nothing at their disposal but mathematics.

Reply 1: A key indicator of intelligence is the ability of an individual to function in radically new environments.

Reply 2: A track of the competition in which the job is to figure out the rules.

Critique: Real intelligence requires the ability to figure out new environments.

Reply: Well, there is no question that that ability is essential. However, real intelligence also requires the ability to use theories once they are formed.

This is the domain of specification-based systems / declarative systems / and so forth; and interest in this problem dates to the beginning of the field.

The main advantage we expect the advice taker to have is that its behavior will be improvable merely by making statements to it, telling it about its … environment and what is wanted from it. To make these statements will require little, if any, knowledge of the program or the previous knowledge of the advice taker.

The potential use of computers by people to accomplish tasks can be “one-dimensionalized” into a spectrum representing the nature of the instruction that must be given the computer to do its job. Call it the what-to-how spectrum. At one extreme of the spectrum, the user supplies his intelligence to instruct the machine with precision exactly how to do his job step-by-step. ... At the other end of the spectrum is the user with his real problem. ... He aspires to communicate what he wants done ... without having to lay out in detail all necessary subgoals for adequate performance.

computer/robot

v

A human being should be able to change a diaper, plan an invasion, butcher a hog, conn a ship, design a building, write a sonnet, balance accounts, build a wall, set a bone, comfort the dying, take orders, give orders, cooperate, act alone, solve equations, analyze a new problem, pitch manure, program a computer, cook a tasty meal, fight efficiently, die gallantly. Specialization is for insects.

Another approach to merging plans is to interleave the actions of one plan with actions of the other, for each operational action performing a non-operational action in the other factors.

Show that the propositions in a factor do not change unless one of the actions in the factor is executed.

Note that we must be able to prove this for every proposition in the factor.

Ideas

* Novelty with reversibility

* Goal-monotonic observables

* Bad states, useless moves

<insert your good idea here>

s1

a

d

b

d

e

c

a

d

g

g

b

d

h

c

e

d

g

e

h

f

h

Decompose states into “relations”.

Use relational operators to capture behavior.

q

p

r

s

p

r

2,1

q

a

b

c

- Relational Net Fragment
- View Definitions Physics
- true(r(X,Y,Z)) :- next(r(X,Y,Z)) :-
- true(p(X,Y)) & true(p(X,Y)) &
- true(q(Y,Z)) true(q(Y,Z))

p

r

2,1

q

X

O

X

X

O

O

X

X

X

O

X

O

O

O

O

X

O

X

Modeling other players

Intra-game modeling useful only for long games

Inter-game modeling possible but

Unfortunately, identity information not available

Must recognize style of play

Assumptions

Rationality?

Aidriven.learner Zhenzhong Xu (AiDriven)

Apple-rt Xinxin Sheng, Paul Breimyer (NCSU)

Cluneplayer* Jim Clune (UCLA)

Entropy Richard Holcombe (private citizen)

Fluxplayer S. Schiffel, M. Thielscher (Dresden)

Ggp_remote_agent Mohammed Shafiei (Sharif U, Iran)

Jigsawbot Raghuvar Nadig (IIT Mumbai)

Luckylemming S. Diestelhorst, M. Günther (Dresden)

Nnrg.hazel J. Resinger, I. Karpov,E. Bahceci (UTA)

Ogre David Kaiser (Florida Intl University)

Pires5600 G. Kuhlmann, Peter Stone (UT Austin)

The Michael Marlen (Kansas State)

* 2005 World Champion

Preliminary Rounds over 3 months

4 Rounds in all

72 matches per player (478 matches in total)

42 different games

Final at AAAI-06

1 on 1 Showdown between top two scorers

Winner take all

Fluxplayer

Stephan Schiffel and Michael Thielscher

Dresden University