1 / 98

# Game Description - PowerPoint PPT Presentation

General Game Playing Lecture 3. Game Description. Michael Genesereth / Nat Love Spring 2006. Finite Synchronous Games. Finite environment Environment with finitely many states One initial state and one or more terminal states Finite Players

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

## PowerPoint Slideshow about ' Game Description' - fritz-dale

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

### Game Description

Michael Genesereth / Nat Love Spring 2006

Finite environment

Environment with finitely many states

One initial state and one or more terminal states

Finite Players

Fixed finite number of players

Each with finitely many “actions”

Each with one or more goal states

Synchronous Update

All players move on all steps (some no ops)

Environment changes only in response to moves

aa

ab

s2

s5

s8

aa

bb

ab

aa

aa

ab

bb

ab

ab

s1

s3

s6

s9

s11

ba

aa

ba

ba

ab

bb

bb

aa

aa

s4

s7

s10

S ={s1, s2, s3, s4, s5, s6, s7, s8, s9, s10, s11}

I1={a, b} I2={a, b}

u(s1,a,a,s2) u(s4,a,a,s7) u(s7,a,a,s10) u(s10,b,a,s9)

u(s1,a,b,s3) u(s4,a,b,s3)u(s8,a,b,s5) u(s10,b,b,s11)

u(s1,b,a,s4) u(s6,a,a,s5) u(s8,b,b,s11) u(s11,a,a,s11)

u(s1,b,b,s1) u(s6,a,b,s9) u(s9,a,a,s8)

u(s2,a,a,s5) u(s6,b,a,s7) u(s9,a,b,s11)

u(s2,a,b,s3) u(s6,b,b,s3)

i=s1

T ={s3, s8, s11}

G1={s8, s11}G2={s3, s7}

An n-player game is a structure with components:

S - finite set of states

I1, …, In - finite sets of actions, one for each player

u  S  I1 ... In S - update relation

i  S - initial game state

T  S - the terminal states

G1, ..., Gn - where Gi S - the goal relations

Good News: Since all of the games that we are considering are finite, it is possible in principle to communicate game information in the form of sets of objects and tuples of objects.

Problem: Size of description. Even though everything is finite, these sets can be large.

In many cases, worlds are best thought of in terms of propositions, e.g. whether a particular light is on or off. Actions often affect a subset of propositions.

States represent all possible ways the world can be. As such, the number of states is exponential in the number of such propositions, and the action tables are correspondingly large.

Idea - represent propositions directly and describe actions in terms of their effects on indidvidual propositions rather than entire states.

X

O

X

cell(1,1,x)

cell(1,2,b)

cell(1,3,b)

cell(2,1,b)

cell(2,2,o)

cell(2,3,b)

cell(3,1,b)

cell(3,2,b)

cell(3,3,x)

control(black)

cell(1,1,x) cell(1,1,x)

cell(1,2,b) cell(1,2,b)

cell(1,3,b)cell(1,3,o)

cell(2,1,b) cell(2,1,b)

cell(2,2,o) cell(2,2,o)

cell(2,3,b) cell(2,3,b)

cell(3,1,b) cell(3,1,b)

cell(3,2,b) cell(3,2,b)

cell(3,3,x) cell(3,3,x)

control(black)control(white)

noop

mark(1,3)

init(cell(1,1,b))

init(cell(1,2,b))

init(cell(1,3,b))

init(cell(2,1,b))

init(cell(2,2,b))

init(cell(2,3,b))

init(cell(3,1,b))

init(cell(3,2,b))

init(cell(3,3,b))

init(control(x))

legal(W,mark(X,Y)) :-

true(cell(X,Y,b)) &

true(control(W))

legal(white,noop) :-

true(cell(X,Y,b)) &

true(control(black))

legal(black,noop) :-

true(cell(X,Y,b)) &

true(control(white))

Vocabulary: Object Variables: X, Y, Z

Object Constants: a, b, c

Function Constants: f, g, h

Relation Constants:p, q, r

Logical Operators:~, &, |, :-, #

Terms:

Variables: X, Y, Z

Object Constants: a, b, c

Functional Terms: f(a), g(a,b), h(a,b,c)

Sentences:

Simple Sentences: p(a,g(a,b),c)

Logical Sentences: r(X,Y) :- p(X,Y) & ~q(Y)

A rule is safe if and only if every variable in the head appears in some positive subgoal in the body.

Safe Rule:

Unsafe Rule:

In GDL, we require all rules to be safe.

The dependency graph for a set of rules is a directed graph in which (1) the nodes are the relations mentioned in the head and bodies of the rules and (2) there is an arc from a node p to a node q whenever p occurs with the body of a rule in which q is in the head.

t

s

r

p

q

A set of rules is recursive if it contains a cycle. Otherwise, it is non-recursive.

A set of rules is recursive if it contains a cycle. Otherwise, it is non-recursive.

t

s

r

p

q

The negation in a set of rules is said to be stratified if there is no recursive cycle in the dependency graph involving a negation.

Stratified Negation:

Negation that is not stratified:

In GDL, we require that all negations be stratified.

Database applications start with a partial database, i.e. sentences for some relations (extensional relations) and not others (intensional relations). Rules are then written to define the intensional relations in terms of the extensional relations.

rules

Extensional Intensional

Given an extensional database and a set of rules, we can obtain the database’s closure as follows.

Database applications start with a partial database, i.e. sentences for some relations (extensional relations) and not others (intensional relations). Rules are then written to define the intensional relations in terms of the extensional relations.

Given an extensional database and a set of rules, we can obtain the database’s closure as follows.

The value of a single non-recursive rule on a database D is the set of all rule heads obtained by consistently substituting ground terms from D for variables in such a way that the substituted subgoals are all in D.

Sample Rule:

Database:

Extension:

The value of a set ofrules with a common relation on a database D is the union of the values on the individual rules.

Sample Rules:

Sample Database:

Value:

The value of a set of non-recursive rules with different head relations is obtained by evaluating rules in order in which their head relations appear in the corresponding dependency graph.

Sample Rules:

Value Computation:

To compute the value of a recursive rule, start with the empty relation. Compute the value using multiple rule computation. Iterate till no new tuples are added.

Sample Rules:

Value Computation:

There are various ways to compute the value of negative rules.

In classical negation, a negation is true only if the negated sentence is known to be false (i.e. there must be rules concluding negated sentences). This is the norm in computational logic systems. In GDL, we do not have such rules.

In negation as failure, a negation is true if and only if the negated sentence is not known to be true. This is the norm in database systems.

Definition:

Value Computation:

Object Constants:

0, 1, 2, 3, … - numbers

Relation Constants:

role(player)

init(proposition)

true(proposition)

next(proposition)

legal(player,action)

does(player,action)

goal(proposition)

terminal

Object constants:

white, black - players

x, o, b - marks

Function Constants:

mark(number,number) --> action

cell(number,number,mark) --> proposition

control(player) --> proposition

RelationConstants:

row(number,player)

column(number,player)

diagonal(player)

line(player)

open

Extensional Relations:

does(player,action)

true(proposition)

Intensional Relations:

role(player)

init(proposition)

legal(player,action)

next(proposition)

goal(proposition,score)

terminal

role(white)

role(black)

init(cell(1,1,b))

init(cell(1,2,b))

init(cell(1,3,b))

init(cell(2,1,b))

init(cell(2,2,b))

init(cell(2,3,b))

init(cell(3,1,b))

init(cell(3,2,b))

init(cell(3,3,b))

init(control(x))

legal(W,mark(X,Y)) :-

true(cell(X,Y,b)) &

true(control(W))

legal(white,noop) :-

true(cell(X,Y,b)) &

true(control(black))

legal(black,noop) :-

true(cell(X,Y,b)) &

true(control(white))

next(cell(M,N,x)) :-

does(white,mark(M,N))

next(cell(M,N,o)) :-

does(black,mark(M,N))

next(cell(M,N,Z)) :-

does(W,mark(M,N)) &

true(cell(M,N,Z)) & Z#b

next(cell(M,N,b)) :-

does(W,mark(J,K)) &

true(cell(M,N,b)) & (M#J | N#K)

next(control(white)) :-

true(control(black))

next(control(black)) :-

true(control(white))

row(M,W) :- diagonal(W) :-

true(cell(M,1,W)) & true(cell(1,1,W)) &

true(cell(M,2,W)) & true(cell(2,2,W)) &

true(cell(M,3,W)) true(cell(3,3,W))

column(N,W) :- diagonal(W) :-

true(cell(1,N,W)) & true(cell(1,3,W)) &

true(cell(2,N,W)) & true(cell(2,2,W)) &

true(cell(3,N,W)) true(cell(3,1,W))

line(W) :- row(M,W)

line(W) :- column(N,W)

line(W) :- diagonal(W)

open :-true(cell(M,N,b))

goal(white,100) :- line(x)

goal(white,50) :- ~line(x) & ~line(o) & ~open

goal(white,0) :- line(o)

goal(black,100) :- line(o)

goal(white,50) :- ~line(x) & ~line(o) & ~open

goal(white,0) :- line(x)

terminal:- line(W)

terminal:- ~open

• What we see:

• next(cell(M,N,x)) :-

• does(white,mark(M,N)) &

• true(cell(M,N,b))

• What the player sees:

• next(welcoul(M,N,himenoing)) :-

• does(himenoing,dukepse(M,N)) &

• true(welcoul(M,N,lorenchise))

Knowledge Interchange Format is a standard for programmatic exchange of knowledge represented in relational logic.

Syntax is prefix version of standard syntax.

Some operators are renamed: not, and, or.

Case-independent. Variables are prefixed with ?.

r(X,Y) <= p(X,Y) & ~q(Y)

(<= (r ?x ?y) (and (p ?x ?y) (not (q ?y))))

(<= (r ?x ?y) (p ?x ?y) (not (q ?y)))

Semantics is the same.

Start Message

(start idrole (s1 … sn) startclockplayclock)

Play Message

(play id (a1 ... ak))

Stop Message

(stop id (a1 ... ak))

p

q

r

a

b

c

p

p

q

r

p

q

p

p

q

r

p

q

init(q)

legal(robot,a)

legal(robot,b)

legal(robot,c)

next(p) :- does(robot,a) & -true(p)

next(p) :- does(robot,b) & true(q)

next(p) :- does(robot,c) & true(p)

next(q) :- does(robot,a) & true(q)

next(q) :- does(robot,b) & true(p)

next(q) :- does(robot,c) & true(q)

next(r) :- does(robot,a) & true(r)

next(r) :- does(robot,b) & true(r)

next(r) :- does(robot,c) & true(q)

goal :- true(p) & -true(q) & true(r)

term :- true(p) & -true(q) & true(r)

S ={s1, s2, s3, s4, s5, s6, s7, s8}

I = {a, b, c}

u(s1,a,s5) u(s000,b,s010) u(s000,c,s001)

u(s2,a,s2) u(s000,b,s010) u(s000,c,s001)

u(s3,a,s3) u(s000,b,s010) u(s000,c,s001)

u(s4,a,s4) u(s000,b,s010) u(s000,c,s001)

I = s1

T = {s8}

G = {s8}

P ={p, q, r}

I = {a, b, c}

u(s1,a,s5) u(s000,b,s010) u(s000,c,s001)

u(s2,a,s2) u(s000,b,s010) u(s000,c,s001)

u(s3,a,s3) u(s000,b,s010) u(s000,c,s001)

u(s4,a,s4) u(s000,b,s010) u(s000,c,s001)

I = s1

T = {s8}

G = {s8}

cell(1,1,x) cell(1,1,x)

cell(1,2,b) cell(1,2,b)

cell(1,3,b) cell(1,3,o)

cell(2,1,b) cell(2,1,b)

cell(2,2,o) cell(2,2,o)

cell(2,3,b) cell(2,3,b)

cell(3,1,b) cell(3,1,b)

cell(3,2,b) cell(3,2,b)

cell(3,3,x) cell(3,3,x)

control(black) control(white)

noop

mark(1,3)

p

q

r

s

a

b

c

d

S ={s000, s001, s010, s011, s100, s101, s110, s111}

I = {a, b, c}

u(s000,a,s100) u(s000,b,s010) u(s000,c,s001)

u(s001,a,s001) u(s000,b,s010) u(s000,c,s001)

u(s010,a,s010) u(s000,b,s010) u(s000,c,s001)

u(s011,a,s011) u(s000,b,s010) u(s000,c,s001)

I = s0

T = {sF}

G = {sF}

In many cases, worlds are best thought of in terms of features, e.g. red or green, left or right, high or low. Actions often affect subset of features.

States represent all possible ways the world can be. As such, the number of states is exponential in the number of “features” of the world, and the action tables are correspondingly large.

Idea - represent features directly and describe how actions change individual features rather than entire states.

Propositions

Connectives

Transitions

p

q

r

Pressing button a toggles p.

Pressing button b interchanges p and q.

p

q

a

b

s110

Propositional Nets as State Machines

State Machines as Propositional Nets

One proposition per state

Only one proposition is true at each point in time

Propositional Nets vs State Machines

Expressively equivalent and interconvertible

State Machines can be exponentially larger

e.g. state machine for Tic-Tac-Toe has 5478 states

propositional net has 45 propositions

Propositional Nets vs Petri Nets

Propositional Nets are computable

(equivalent to Petri nets with finitely many tokens)

Propositional Nets are composable

without revealing inner details of components

p

r

2,1

1.3

q

x31

o11

x21

o21

x11

o31

o12

o22

x32

x12

x22

o32

or3

xr3

or2

xr1

xr2

or1

o13

x13

o23

x23

x33

o33

s1

a

d

b

d

p

c

e

a

g

r

b

g

2,1

1.3

c

h

d

g

q

e

h

f

h

Decompose states into “relations”.

Use relational operators to capture behavior.

q

p

r

s

Relational Nets vs Propositional Nets

Expressively equivalent and interconvertible

Number of Tuples = Number of Propositions

Fewer Relations than propositions

Fewer connectives

Relational Nets vaguely related to RMDPs

p

r

2,1

1.3

q

a

d

b

d

p

c

e

a

g

r

b

g

2,1

1.3

c

h

d

g

q

e

h

f

h

Relational Net Fragment

Encoding

r(X,Z) :- p(X,Y) & q(Y,Z)

p

r

2,1

1.3

q

• Relational Net Fragment

• Encoding without delay Encoding with delay

• true(r(X,Z)) :- next(r(X,Z)) :-

• true(p(X,Y)) & true(p(X,Y)) &

• true(q(Y,Z)) true(q(Y,Z))

X

O

X

mark(1,1)

cell(1,1,x)

cell(1,1,b)

mark(1,2)

cell(1,2,x)

cell(1,2,b)

mark(1,3)

cell(1,3,x)

cell(1,3,b)

• Direct encoding in relational logic:

• next(cell(1,1,x)) <=

• does(mark(1,1)) &

• true(cell(1,1,b))

• Use of variables to compact description:

• next(cell(M,N,x)) <=

• does(mark(M,N)) &

• true(cell(M,N,b))

• Game-specific “views” / “macros”:

• row(M,W) <=

• true(cell(M,1,W)) &

• true(cell(M,2,W)) &

• true(cell(M,3,W))

• Object Variables: X, Y, Z

Object Constants: a, b, c

Function Constants: f, g, h

Relation Constants:p, q, r

Logical Operators:~, &, |, :-, distinct

Terms: X, Y, Z, a, b, c, f(a), g(a,b), h(a,b,c)

Relational Sentences: p(a,b)

Logical Sentences: r(X,Y) <= p(X,Y) & ~q(Y)

An expression is ground iff it contains no variables.

The Herbrand base is the set of all ground relational sentences.

legal(W,mark(X,Y)) 

true(cell(X,Y,b)) 

true(control(W))

legal(white,noop) 

true(cell(X,Y,b)) 

true(control(o))

legal(black,noop) 

true(cell(X,Y,b)) 

true(control(x))

next(cell(M,N,x)) 

does(white,mark(M,N)) 

true(cell(M,N,b))

next(cell(M,N,o)) 

does(black,mark(M,N)) 

true(cell(M,N,b))

next(cell(M,N,W)) 

true(cell(M,N,W)) 

distinct(W,b)

next(cell(M,N,b)) 

does(W,mark(J,K)) 

true(cell(M,N,b)) 

(distinct(M,J) | distinct(N,K))

next(control(x)) 

true(control(o))

next(control(o)) 

true(control(x))

goal(white,100)  line(x)

goal(white,0)  line(o)

goal(black,100)  line(o)

goal(white,0)  line(x)

line(W)  row(M,W)

line(W)  column(N,W)

line(W)  diagonal(W)

row(M,W) 

true(cell(M,1,W)) 

true(cell(M,2,W)) 

true(cell(M,3,W))

column(N,W) 

true(cell(1,N,W)) 

true(cell(2,N,W)) 

true(cell(3,N,W))

diagonal(W) 

true(cell(1,1,W)) 

true(cell(2,2,W)) 

true(cell(3,3,W))

diagonal(W) 

true(cell(1,3,W)) 

true(cell(2,2,W)) 

true(cell(3,1,W))

terminal  line(W)

terminal  ~open

open  true(cell(M,N,b))

Of necessity, game descriptions are logically incomplete in that they do not uniquely specify the moves of the players.

Every game description contains complete definitions for legality, termination, goalhood, and update in terms of the primitive moves and the does relation.

The upshot is that in every state every player can determine legality, termination, goalhood and, given a joint move, can update the state.

A game is playable if and only if every player has at least one legal move in every non-terminal state.

Note that in chess, if a player cannot move, it is a stalemate. Fortunately, this is a terminal state.

In GGP, we guarantee that every game is playable.

A game is strongly winnable if and only if, for some player, there is a sequence of individual moves of that player that leads to a terminating goal state for that player.

A game is weakly winnable if and only if, for every player, there is a sequence of joint moves of the players that leads to a terminating goal state for that player.

In GGP, every game is weakly winnable, and all single player games are strongly winnable.

In Extensive Normal Form, a game is modeled as a tree with actions of one player at each node.

In State Machine Form, a game is modeled as a graph and players’ moves are all synchronous.

In GGP, a game must be described formally. While ENF and SMF are expressively equivalent for finite games, SMF descriptions are simpler.

Some players may create game trees from game descriptions; however, searching game graphs can be more efficient.

State Machines

Propositional Nets

Relational Nets

Tabular Encoding

Logical Encoding

An n-player game is a structure with components:

S - finite set of states

I1, …, In - finite sets of actions, one for each player

l1, ..., ln - where li  Ii  S - the legality relations

u  S  I1 ... In S - update relation

i  S - initial game state

T  S - the terminal states

G1, ..., Gn - where Gi S - the goal relations

s110

p

r

q

Define states in terms of propositions.

Use propositional connectives to capture behavior.

p

q

s

r

A marking for a propositional net is a function from the propositions to boolean values.

m: P  {true,false}

A marking is acceptable iff it obeys the logical properties of all connectives.

Negation with input x and output y:

m(y)=true  m(x)=false

Conjunction with inputs x and y and output z:

m(z)=true  m(x)=true  m(y)=true

Disjunction with inputs x and y and output z:

m(z)=true  m(x)=true  m(y)=true

A transition is enabled by a marking m iff all of its inputs are marked true.

The update for a marking m is the partial marking m* that assigns true to the outputs of all transitions enabled by m and false to the outputs of all other transitions.

A successor m’ of a marking m is any complete, acceptable marking consistent with m*.

p

r

q

cell(1,1,x) cell(1,1,x)

cell(1,2,b) cell(1,2,b)

cell(1,3,b) cell(1,3,o)

cell(2,1,b) cell(2,1,b)

cell(2,2,o) cell(2,2,o)

cell(2,3,b) cell(2,3,b)

cell(3,1,b) cell(3,1,b)

cell(3,2,b) cell(3,2,b)

cell(3,3,x) cell(3,3,x)

control(black) control(white)

noop

mark(1,3)

Assume evaluation function f partitions states into n categories S1, …, Sn.

Consider probabilities p1, …, pn of winning in each category. (More generally, consider expected utilities u1, …, un.) Use these probabilities (utilities) as evaluation function values for the corresponding categories.

Choosing a move that leads to a category with maximal value maximizes chances of winning.

An ideal evaluation function is one that reflects the expected utility of each state. (In the case of win-lose games, it is the probability of winning.)

For each terminal state, it is the payoff in that state.

For each nonterminal state, it is the maximum of the expected utilities of the legal actions in that state. (The expected utility of an action in a state is the sum of the expected values of the states resulting from that action weighted by probabilities of the opponents’ actions.)

Choosing moves that maximize expected value

NB: Different priors possible. Random is common.