Game playing
This presentation is the property of its rightful owner.
Sponsored Links
1 / 78

Game Playing PowerPoint PPT Presentation


  • 182 Views
  • Uploaded on
  • Presentation posted in: General

Game Playing. Human Game Playing Intellectual Activity Skill Comparison. Computer Game Playing Testbed for AI Limitations. General Game Playing. General Game Players are systems able to play arbitrary games effectively based solely on formal descriptions supplied at “runtime”.

Download Presentation

Game Playing

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Game playing

Game Playing

Human Game Playing

  • Intellectual Activity

  • Skill Comparison

Computer Game Playing

  • Testbed for AI

  • Limitations


General game playing

General Game Playing

General Game Players are systems able to play arbitrary games effectively based solely on formal descriptions supplied at “runtime”.

Translation: They don’t know the rules until the game starts.

Additional Complexities:

uncertainty (moves of other players)

resource bounds (game clock)


Technology

Technology

Unlike specialized game players (e.g. Deep Blue), they do not use algorithms designed in advance for specific games.

Artificial Intelligence Technologies

knowledge representation

reasoning, reformulation, learning

rational behavior w/ uncertainty, resource bounds


Knight zone chess

Knight-Zone Chess


Bughouse chess

Bughouse Chess


Variety of games

Variety of Games


Annual aaai ggp competition

Annual AAAI GGP Competition

Annual AAAI Competition

Held at AAAI conference

Administered by Stanford

(Stanford folks not eligible to participate)

Reward

2005 - Jim Clune

2006 - Stephan Schiffel and Michael Thielscher

2007 - Yngvi Bjornsson and Hilmar Finsson

2008 - Yngvi Bjornsson and Hilmar Finsson


Aaai 05 winner jim clune

AAAI-05 Winner Jim Clune


Aaai 06 winners

AAAI-06 Winners


Other games other winners

Other Games, Other Winners


Carbon versus silicon

Carbon versus Silicon

Come cheer on your favorite.

The battle continues at AAAI-2007.


Game playing

General Game Description


Single player game as a state machine

Single Player Game as a State Machine

a

b

s2

s5

s8

a

a

b

a

a

b

d

b

b

s1

s3

s6

s9

s11

d

a

c

a

a

c

b

b

a

s7

s4

s10


Composite actions for multiple players

Composite Actions for Multiple Players

aa

ab

s2

s5

s8

aa

bb

ab

aa

aa

ab

bb

ab

ab

s1

s3

s6

s9

s11

ba

aa

ba

ba

ab

bb

bb

aa

aa

s4

s7

s10


Direct description

Direct Description

Since all of the games that we are considering are finite, it is possible in principle to communicate game information in the form of transition graphs.

Problem:Size of description. Even though everything is finite, the graphs can be large.

Ideas:

1. Different Conceptualization

2. Compact Encoding


Propositions

s1

Propositions

Decompose states into “propositions”.

Benefit - n propositions can represent 2n states.

p

q

s

r


Propositional net components

Propositional Net Components

Propositions

Connectives

Transitions

p

q

r


Propositional net

p

r

q

Propositional Net


Propositional net fragment

x31

o11

x21

o21

x11

o31

o32

x22

x12

x32

o22

o12

or1

xr2

xr1

or3

xr3

or2

o33

x13

o23

x23

x33

o13

Propositional Net Fragment


Propositional gdl

Propositional GDL

row1o :- row1x :-

true(11o), true(11x),

true(12o), true(12x),

true(13o) true(13x)

row2o :- row1x :-

true(21o), true(21x),

true(22o), true(22x),

true(23o) true(23x)

row3o :- row1x :-

true(31o), true(31x),

true(32o), true(32x),

true(33o) true(33x)


Relational gdl

Relational GDL

  • row(M,P) :-

  • true(cell(M,1,P)) &

  • true(cell(M,2,P)) &

  • true(cell(M,3,P))


Tic tac toe

Tic-Tac-Toe

init(cell(1,1,b))

init(cell(1,2,b))

init(cell(1,3,b))

init(cell(2,1,b))

init(cell(2,2,b))

init(cell(2,3,b))

init(cell(3,1,b))

init(cell(3,2,b))

init(cell(3,3,b))

init(control(x))

legal(P,mark(X,Y)) :-

true(cell(X,Y,b)) &

true(control(P))

legal(x,noop) :-

true(control(o))

legal(o,noop) :-

true(control(x))

next(cell(M,N,P)) :-

does(P,mark(M,N))

next(cell(M,N,Z)) :-

does(P,mark(M,N)) &

true(cell(M,N,Z)) & Z#b

next(cell(M,N,b)) :-

does(P,mark(J,K)) &

true(cell(M,N,b)) &

(M#J | N#K)

next(control(x)) :-

true(control(o))

next(control(o)) :-

true(control(x))

terminal:- line(P)

terminal:- ~open

goal(x,100) :- line(x)

goal(x,50) :- draw

goal(x,0) :- line(o)

goal(o,100) :- line(o)

goal(o,50) :- draw

goal(o,0) :- line(x)

row(M,P) :-

true(cell(M,1,P)) &

true(cell(M,2,P)) &

true(cell(M,3,P))

column(N,P) :-

true(cell(1,N,P)) &

true(cell(2,N,P)) &

true(cell(3,N,P))

diagonal(P) :-

true(cell(1,1,P)) &

true(cell(2,2,P)) &

true(cell(3,3,P))

diagonal(P) :-

true(cell(1,3,P)) &

true(cell(2,2,P)) &

true(cell(3,1,P))

line(P) :- row(M,P)

line(P) :- column(N,P)

line(P) :- diagonal(P)

open :-true(cell(M,N,b))

draw :- ~line(x) &

~line(o)


No built in assumptions

No Built-in Assumptions

  • What we see:

    • next(cell(M,N,x)) :-

    • does(white,mark(M,N)) &

    • true(cell(M,N,b))

  • What the player sees:

    • next(welcoul(M,N,himenoing)) :-

    • does(himenoing,dukepse(M,N)) &

    • true(welcoul(M,N,lorenchise))


Game playing

General Game Playing


Logic

Logic

Possible to use logical reasoning for everything

Computing legality of moves

Computing consequences of actions

Computing goal achievement

Computing termination

Easy to convert from logic to other representations

Simplicity of logical formulation

Simple, widely published algorithms

3-4 orders or magnitude speedup

no asymptotic change


State representation

X

O

X

State Representation

cell(1,1,x)

cell(1,2,b)

cell(1,3,b)

cell(2,1,b)

cell(2,2,o)

cell(2,3,b)

cell(3,1,b)

cell(3,2,b)

cell(3,3,x)

control(black)


Legality computation

Legality Computation

legal(W,mark(X,Y)) :-cell(1,1,x)

true(cell(X,Y,b)) cell(1,2,b)

true(control(W))cell(1,3,b)

cell(2,1,b)

legal(white,noop) :-cell(2,2,o)

true(cell(X,Y,b)) cell(2,3,b)

true(control(black))cell(3,1,b)

cell(3,2,b)

legal(black,noop) :-cell(3,3,x)

true(cell(X,Y,b)) control(black)

true(control(white))

Conclusions:

legal(white,noop) legal(black,mark(1,2))

legal(black,mark(3,2))


Update computation

Update Computation

cell(1,1,x)cell(1,1,x)

cell(1,2,b)cell(1,2,b)

cell(1,3,b)cell(1,3,o)

cell(2,1,b) blackcell(2,1,b)

cell(2,2,o)cell(2,2,o)

cell(2,3,b) mark(1,3)cell(2,3,b)

cell(3,1,b)cell(3,1,b)

cell(3,2,b)cell(3,2,b)

cell(3,3,x)cell(3,3,x)

control(black)control(white)

next(cell(M,N,o)) :-

does(black,mark(M,N)) &

true(cell(M,N,b))


Game tree expansion

X

O

X

O

O

X

X

X

X

O

X

O

X

O

X

X

X

O

O

O

O

X

O

X

O

X

X

Game Tree Expansion


Game tree search

X

O

X

O

X

X

X

X

X

O

X

O

X

O

X

O

O

X

X

X

X

X

O

O

X

O

O

O

X

X

O

X

O

O

X

X

X

X

O

X

O

X

O

X

X

X

O

O

O

O

X

O

X

O

X

X

Game Tree Search


Resource limitations

Resource Limitations

Large state spaces

~5000 states in Tic-Tac-Toe

~1044 states in Chess

Limited Resources

Memory

Time (start clock, move clock)


Incremental search

Incremental Search

Incremental Search

Expand Game Graph incrementally

As much as time allows

Minimax/etc. evaluation on non-terminal states

using an evaluation function of some sort

But how do we evaluate non-terminal states?

In traditional game playing, the rules are known in advance, and the programmer can invent an evaluation function. Not possible with GGP.


Non guaranteed evaluation functions

Non-Guaranteed Evaluation Functions

Ideas

Goal proximity (everyone)

Maximize mobility (Barney Pell)

Minimize opponent’s mobility (Jim Clune)

Depth Charge (random probe)

<insert your good idea here>


Aaai 06 final cylinder checkers

AAAI-06 Final - Cylinder Checkers


Monte carlo method depth charge

Monte Carlo Method / Depth Charge

Basic Idea

(1) Explore game graph to some level

storing generated states

(2) Beyond this, explore to end of game

making random choices for moves of all players

not storing states (to limit space growth)

(3) Assign expected utilities to states by

summing utilities and dividing by number of trials

Features

Fast because no search

Small space because nothing stored


Monte carlo

Monte Carlo


Monte carlo1

Monte Carlo

100

0

0

0

0

100

100

0

0

0

0

0

100

0

100

100


Monte carlo2

Monte Carlo

25

50

75

0

100

0

0

0

0

100

100

0

0

0

0

0

100

0

100

100


Game playing

Metagaming


Definition

Definition

Metagaming is the process of analyzing and/or modifying a game offline (e.g. during the startclock or between moves or in a separate thread) so as to improve performance online (once the game begins, recommences, etc.).

Examples of Metagaming Techniques:

Headstart

End Game Book

Factoring

Dead State Removal

And so forth


Hodgepodge

Hodgepodge

Hodgepodge = Chess + Othello

Branching factor: aBranching factor: b

Analysis of joint game:

Branching factor as given to players: a*b

Fringe of tree at depth n as given: (a*b)n

Fringe of tree at depth n factored: an+bn


Game reformulation

Game Reformulation

Examples

Symmetry, e.g. Tic-Tac-Toe

Independence, e.g Hodgepodge

Bottlenecks, e.g. Triathalon

Techniques:

Graph Analysis (factoring, branching factor, …)

Proofs (esp those using mathematical induction)

Trade-off - cost of finding structure vs savings

Sometimes cost varies with size of description

rather than size of expanded game graph

Use of startclock versus playclock


Example

p1

p2

a1

a2

r1

r2

q1

q2

s1

n1

n2

s2

t1

t2

Example

l

n3

n4

l1

l2

o1

a3

o2

a4

g


Game factoring

Game Factoring

Method:

Compute factors

Use factors to generate “subplans”

Reassemble overall plan from subplans

Patterns:

Disjunctive Factors

* Conjunctive Factors

* Interleaved Factors

Sequential Factors

Overall factors versus conditional factors


Conjunctive factoring

Conjunctive Factoring

Conjunctive Goals

Delete conjunctive goal

Make conjuncts goals for each factor

Conjunctive Legality

Delete legality conjunction

Make conjuncts goals for each factor

Solve problem, conjoin solutions.

Can delete disjunctions too, but treating disjunctions as conjunctions might destroy completeness.


Example1

p1

p2

a1

a2

r1

r2

q1

q2

s1

n1

n2

s2

t1

t2

Example

l

n3

n4

l1

l2

o1

o3

o2

a3

g


Example2

Example

n3

n4

l1

l2

o1

o2

p1

p2

a1

a2

r1

r2

q1

q2

g1

n1

n2

g2

t1

t2


Performance

Performance

? (time (genplan propcompbuttons))

407 states

1,118 milliseconds.

605,728 bytes of memory allocated.

(PROG A B A D E D)

? (time (multiplan propcompbuttons))

14 states

53 milliseconds

22,320 bytes of memory allocated.

(PROG A B A D E D)

Partition time: 1 millisecond.


Game playing

Conclusion


General game playing is not a game

General Game Playing is not a game

Critique: Game playing is frivolous.

Reply: General game playing is

serious business.

AI: general problem solving

multiple agents, resource bounds

DB: Update through views with dynamic

constraints and long-running transactions

Systems: Automatic Programming redux

Motivating Applications:

Enterprise Management, Computational Law

Funding by Darpa, SAP, etc.


General game playing and game theory

General Game Playing and Game Theory

Like General Game Theory, traditional Game Theory is concerned with games in general.

Game Theory is concerned with the analysis of game trees. There is little concern for how those trees are communicated. There is little concern for trade-offs between deliberation and action.

In General Game Playing, the game “tree” is communicated at runtime. Hence there is an emphasis on description and use of this description. There is the possibility of new forms of rationality.


General game playing and planning

General Game Playing and Planning

Planning:

Inputs: State machine, initial state, goal states

Output: Plan that transforms initial to goal state

General Game Playing:

Inputs: State machine, initial state, goal states

Objective: Reach a goal state (plan/execute?)

Difference:

Presence of execution environment

with time bounds on combined deliberation/action

Study of interleaved planning and execution


Tabula rasa

Tabula Rasa

Critique: Human Intelligence is arguably the product of eons of evolution. We are, to some extent at least, “wired” to function well in this world. General Game Players have nothing at their disposal but mathematics.

Reply 1: A key indicator of intelligence is the ability of an individual to function in radically new environments.

Reply 2: A track of the competition in which the job is to figure out the rules.


Using what we learn

Using What We Learn

Critique: Real intelligence requires the ability to figure out new environments.

Reply: Well, there is no question that that ability is essential. However, real intelligence also requires the ability to use theories once they are formed.

This is the domain of specification-based systems / declarative systems / and so forth; and interest in this problem dates to the beginning of the field.


John mccarthy

John McCarthy

The main advantage we expect the advice taker to have is that its behavior will be improvable merely by making statements to it, telling it about its … environment and what is wanted from it. To make these statements will require little, if any, knowledge of the program or the previous knowledge of the advice taker.


Ed feigenbaum

Ed Feigenbaum

The potential use of computers by people to accomplish tasks can be “one-dimensionalized” into a spectrum representing the nature of the instruction that must be given the computer to do its job. Call it the what-to-how spectrum. At one extreme of the spectrum, the user supplies his intelligence to instruct the machine with precision exactly how to do his job step-by-step. ... At the other end of the spectrum is the user with his real problem. ... He aspires to communicate what he wants done ... without having to lay out in detail all necessary subgoals for adequate performance.


Robert heinlein

computer/robot

v

Robert Heinlein

A human being should be able to change a diaper, plan an invasion, butcher a hog, conn a ship, design a building, write a sonnet, balance accounts, build a wall, set a bone, comfort the dying, take orders, give orders, cooperate, act alone, solve equations, analyze a new problem, pitch manure, program a computer, cook a tasty meal, fight efficiently, die gallantly. Specialization is for insects.


Relative inertia

Relative Inertia

Another approach to merging plans is to interleave the actions of one plan with actions of the other, for each operational action performing a non-operational action in the other factors.

Show that the propositions in a factor do not change unless one of the actions in the factor is executed.

Note that we must be able to prove this for every proposition in the factor.


Guaranteed evaluation functions

Guaranteed Evaluation Functions

Ideas

* Novelty with reversibility

* Goal-monotonic observables

* Bad states, useless moves

<insert your good idea here>


Relational nets

s1

a

d

b

d

e

c

a

d

g

g

b

d

h

c

e

d

g

e

h

f

h

Relational Nets

Decompose states into “relations”.

Use relational operators to capture behavior.

q

p

r

s

p

r

2,1

q


Competitive games

Competitive Games


Single player games

a

b

c

Single Player “Games”


Aaai 05 racetrack corridor

AAAI-05 - Racetrack Corridor


Aaai 05 runner up david kaiser

AAAI-05 Runner-up David Kaiser


Relational net in relational logic

Relational Net in Relational Logic

  • Relational Net Fragment

  • View Definitions Physics

    • true(r(X,Y,Z)) :- next(r(X,Y,Z)) :-

    • true(p(X,Y)) & true(p(X,Y)) &

    • true(q(Y,Z)) true(q(Y,Z))

p

r

2,1

q


Delayed gratification

X

O

X

X

O

O

Delayed Gratification


Double tic tac toe

X

X

X

O

X

O

O

O

O

X

O

X

Double Tic-Tac-Toe


Modeling other players

Modeling Other Players

Modeling other players

Intra-game modeling useful only for long games

Inter-game modeling possible but

Unfortunately, identity information not available

Must recognize style of play

Assumptions

Rationality?


Dominance examples

Dominance Examples


Aaai 06 final competitors

AAAI-06 Final Competitors

Aidriven.learner Zhenzhong Xu (AiDriven)

Apple-rt Xinxin Sheng, Paul Breimyer (NCSU)

Cluneplayer* Jim Clune (UCLA)

Entropy Richard Holcombe (private citizen)

Fluxplayer S. Schiffel, M. Thielscher (Dresden)

Ggp_remote_agent Mohammed Shafiei (Sharif U, Iran)

Jigsawbot Raghuvar Nadig (IIT Mumbai)

Luckylemming S. Diestelhorst, M. Günther (Dresden)

Nnrg.hazel J. Resinger, I. Karpov,E. Bahceci (UTA)

Ogre David Kaiser (Florida Intl University)

Pires5600 G. Kuhlmann, Peter Stone (UT Austin)

The Michael Marlen (Kansas State)

* 2005 World Champion


Aaai 06 competition logistics

AAAI-06 Competition Logistics

Preliminary Rounds over 3 months

4 Rounds in all

72 matches per player (478 matches in total)

42 different games

Final at AAAI-06

1 on 1 Showdown between top two scorers

Winner take all


Aaai 06 leaderboard

AAAI-06 Leaderboard


Aaai 06 final game cylinder checkers

AAAI-06 Final Game - Cylinder Checkers


Aaai 06 winner

AAAI-06 Winner

Fluxplayer

Stephan Schiffel and Michael Thielscher

Dresden University


  • Login