- 266 Views
- Uploaded on

Download Presentation
## Experience-Based Chess Play

**An Image/Link below is provided (as is) to download presentation**

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript

### Experience-Based Chess Play

March 16, 2005

Robert Levinson

Machine Intelligence Lab

University of California,

Santa Cruz

Outline

- Review of State-of-the-art in Games
- Review of Computer Chess Method
- Blindspots and Unsolved Issues

==================

4. Morph

4a. Philosophy and Results

4b. Patterns/Evaluation

4c. Learning

* Td-learning

* Neural Nets

* Genetic Algorithms

Why Chess?

- Human/computer approaches very different
- Most studied game
- Well-known internationally and by public
- Cognitive studies available
- Accurate, well-defined rating system !
- Complex and Non-Uniform
- John McCarthy, Alan Turing, Claude Shannon,

Herb Simon and Ken Thompson and….

- Game theoretic value unknown
- Active Research Community/Journal

Kasparov vs. Deep Blue

- 1. Deep Blue can examine and evaluate up to 200,000,000 chess positions per second
- Garry Kasparov can examine and evaluate up to three chess positions per second
- 2. Deep Blue has a small amount of chess knowledge and an enormous amount of calculation ability.
- Garry Kasparov has a large amount of chess knowledge and a somewhat smaller amount of calculation ability.
- 3. Garry Kasparov uses his tremendous sense of feeling and intuition to play world champion-calibre chess.
- Deep Blue is a machine that is incapable of feeling or intuition.
- 4. Deep Blue has benefitted from the guidance of five IBM research scientists and one international grandmaster.
- Garry Kasparov is guided by his coach Yuri Dokhoian and by his own driving passion to play the finest chess in the world.
- 5. Garry Kasparov is able to learn and adapt very quickly from his own successes and mistakes.
- Deep Blue, as it stands today, is not a "learning system." It is therefore not capable of utilizing artificial intelligence to either learn from its opponent or "think" about the current position of the chessboard.
- 6. Deep Blue can never forget, be distracted or feel intimidated by external forces (such as Kasparov's infamous "stare").
- Garry Kasparov is an intense competitor, but he is still susceptible to human frailties such as fatigue, boredom and loss of concentration.

Recent Man vs. Machine Matches

- Garry Kasparov versus Deep Junior, January 26 - February 7, 2003

in New York City, USA. Result: 3 - 3 draw.

- Evgeny Bareev versus Hiarcs-X, January 28 - 31 , 2003 in Maastricht, Netherlands. Result: 2 - 2 draw.
- Vladimir Kramnik versus Deep Fritz, October 2 - 22, 2002 in Manama, Bahrain. Result: 4 - 4 draw.

Minimax Example

terminal nodes: values calculated from some evaluation function

Max

Min

4

7

9

6

9

8

8

5

6

7

5

2

3

2

5

4

9

3

other nodes: values calculated via minimax algorithm

Min

Max

4

7

6

2

6

3

4

5

1

2

5

4

1

2

6

3

4

3

Min

4

7

9

6

9

8

8

5

6

7

5

2

3

2

5

4

9

3

5

Max

Max makes first move down the left hand side of the tree, in the expectation that countermove by Min is now predicted. After Min makes move, tree will need to be regenerated and the minimax procedure re-applied to new tree

Possible later moves

5

3

4

Min

7

6

5

5

6

4

Max

4

7

6

2

6

3

4

5

1

2

5

4

1

2

6

3

4

3

Min

4

7

9

6

9

8

8

5

6

7

5

2

3

2

5

4

9

3

Computer chess

- Let's say you start with a chess board set up for the start of a game. Each player has 16 pieces. Let's say that white starts. White has 20 possible moves:
- The white player can move any pawn forward one or two positions.
- The white player can move either knight in two different ways.
- The white player chooses one of those 20 moves and makes it. For the black player, the options are the same: 20 possible moves. So black chooses a move.
- Now white can move again. This next move depends on the first move that white chose to make, but there are about 20 or so moves white can make given the current board position, and then black has 20 or so moves it can make, and so on.

Chess complexity

There are 20 possible moves for white. There are 20 * 20 = 400 possible moves for black, depending on what white does. Then there are 400 * 20 = 8,000 for white. Then there are 8,000 * 20 = 160,000 for black, and so on.

If you were to fully develop the entire tree for all possible chess moves, the total number of board positions is about 1,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000, or 10120.

There have only been 1026 nanoseconds since the Big Bang. There are thought to be only 1075 atoms in the entire universe.

How computer chess works

No computer is ever going to calculate the entire tree. What a chess computer tries to do is generate the board-position tree five or 10 or 20 moves into the future.

Assuming that there are about 20 possible moves for any board position, a five-level tree contains 3,200,000 board positions.

A 10-level tree contains about 10,000,000,000,000 (10 trillion) positions. The depth of the tree that a computer can calculate is controlled by the speed of the computer playing the game.

Is computer chess intelligent?

The minimax algorithm alternates between the maximums and minimums as it moves up the tree.

This process is completely mechanical and involves no insight. It is simply a brute force calculation that applies an evaluation function to all possible board positions in a tree of a certain depth.

What is interesting is that this sort of technique works pretty well. On a fast-enough computer, the algorithm can look far enough ahead to play a very good game. If you add in learning techniques that modify the evaluation function based on past games, the machine can even improve over time.

The key thing to keep in mind, however, is that this is nothing like human thought. But does it have to be?

Search Depth

Hmmm. What to do?

- Black is to move:

…Ra5!

- White to move:

What is strategy?

- the art of devising or employing plans or stratagems toward a goal where favorable.

Science

Would you say a traditional chess program's

chess strength is based mainly on a firm axiomatic theory about the strengths and balances of competing differential semiotic trajectory units in

a multiagent hypergeoemetric topological manifold-like zero-sum dynamic environment

or simply because

it is an accurate short-term calculator??!

#define DOUBLED_PAWN_PENALTY 10

#define ISOLATED_PAWN_PENALTY 20

#define BACKWARDS_PAWN_PENALTY 8

#define PASSED_PAWN_BONUS 20

#define ROOK_SEMI_OPEN_FILE_BONUS 10

#define ROOK_OPEN_FILE_BONUS 15

#define ROOK_ON_SEVENTH_BONUS 20

/* the values of the pieces */

int piece_value[6] = 100,300,350,500,900,0

Cheating!Goal: ELO 3000+

2800+ World Champion

2600+ GM

2400+ IM

2200+ FM (> 99 percent)

- Expert

1600 Median (> 50 percent)

1543 Morph

1000 Novice (beginning tournament player)

555 Random

Milestones

- MorphI: 1995. Graph Matching

Draws GnuChess every 10 games. 800 rating

- Morph II: 1998. Plays any game, compiles

neural net based on First Order Logic

rules. Optimal Tic-Tac-Toe, NIM.

- Morph IV : 2003. Neural Neighborhoods + Improved Eval

Reaches 1036.

- Summer 2004: 1450 (improved implementation)
- Today : 1558 (neural net +pattern variety + genetic alg.)
- Next: genetically selected patterns and nets.

Total Information = Diversity + Symmetry

- Diversity corresponds to Comp Sci “Complexity” = resources required.
- Diversity can often only be resolved with Combinatorial Search

Exploit Symmetry !!

“Invariant with respect to transformation.”

“Shared information between objects

or systems or their representations.”

AB+AC = A(B+C).

Symmetry Synonyms

- similarity
- commonality
- structure
- mutual information
- relationship
- pattern
- redundancy

Experts

Levels of Learning

Levels:

0. None – Brute Force It

1. Empirically –Statistical Understanding

- Inductive

1a. Supervise

b. Reward/Punish + TD Learning

c. Imitate

2. Analytically – Mathematical - Deductive

But Must Be Efficient!

Chess and Stock Trading

Patterns and their interpretation

- Technical Analysis: Forecasting market activity based on the study of price charts, trends and other technical data.
- Fundamental Analysis: Forecasting based on analysis of economic and geopolitical data.
- A Trading plan is formulated and executed only after a thorough and systematic Technical and Fundamental analysis.
- All trading plans incorporate Risk Management procedures.

THE MORPH ARCHITECTURE

Pattern-based!

CHESS GAMES

NEW POSITION

INTUITIVELY

BEST

MOVE

COMPILED MEMORY

Pattern Processor

& Matcher

4-6 PLY

Search

WEIGHTS

ADB

Morph patterns

- graph patterns
- both nodes, edge labeled
- direct attack, indirect attack, discovered attack
- material patterns
- Piece/square tables
- (later : graph patterns realized as neural

networks)

The game of Chess – example

- White has over 30 possible moves
- If black’s turn – can capture pawn at c3 and check (also fork)

Pattern weight formulation of search knowledge

- weights
- real number within the reinforcement value range, e.g {0,1}
- expected value of reinforcement given that the current state satisfied the pattern
- pws : <p1, .7>
- the states that have p1 as a feature are more likely to lead to a win than a loss
- advantage
- low-level of granularity and uniformity

Morph - evaluation asproduct

x1 x2 x1*x2

.5 .5 0.25

.2 .8 0.16

.2 .5 0.10

.8 .5 0.40

.8 .8 0.64

.2 .2 0.04

.1 .8 0.08

.2 .9 0.18

1 .8 0.80

0 .2 0.00

Learning Ingredients

- Graph Patterns
- Neural Networks
- Temporal Difference Learning
- Simulated Annealing
- Representation Change
- Genetic Algorithms

Modifying the weight of patterns

1. each state in the sequence of states that proceeded the

reinforcement value is assigned a new value using temporal

difference learning:

W B W B W (assume W won)

old: 0.6 0.35 0.70 0.3 0.8

new: 0.675 0.25 0.85 0.0* 1.0*

2. the new value assigned to each state is propagated down to the

patterns that matched the state

Difference between Morph and traditional TD learning:

1. the feature set changes throughout the learning process

2. use a simulated annealing type scheme

to give the weight of each pattern its own learning rate

3. the more a pattern gets updated,

the slower its learning rate becomes…

DB comprises a complex system of many particles

optimal configuration : each weight has its proper value

The average error serves as a the objective evaluation

function

average error : the difference between APS's

prediction of a state's value and that

provided by temporal-difference learning

Biological inspirations

- Some numbers…
- The human brain contains about 10 billion nerve cells (neurons)
- Each neuron is connected to the others through 10000 synapses
- Properties of the brain
- It can learn, reorganize itself from experience
- It adapts to the environment
- It is robust and fault tolerant

Biological neuron

- A neuron has
- A branching input (dendrites)
- A branching output (the axon)
- The information circulates from the dendrites to the axon via the cell body
- Axon connects to dendrites via synapses
- Synapses vary in strength
- Synapses may be excitatory or inhibitory

What is an artificial neuron ?

- Definition : Non linear, parameterized function with restricted output range

y

w0

x1

x2

x3

The information is propagated from the inputs to the outputs

Computations ofnon linear functions from input variables by compositions of algebraic functions

Time has no role (NO cycle between outputs and inputs)

Feed Forward Neural NetworksOutput layer

2nd hidden

layer

1st hidden

layer

x1

…..

x2

xn

Can have arbitrary topologies

Can model systems with internal states (dynamic ones)

Delays are associated to a specific weight

Training is more difficult

Performance may be problematic

Stable Outputs may be more difficult to evaluate

Unexpected behavior (oscillation, chaos, …)

Recurrent Neural Networks0

1

0

0

1

0

0

1

x1

x2

Properties of Neural Networks

- Supervised networks are universal approximators (Non recurrent networks)
- Theorem : Any limited function can be approximated by a neural network with a finite number of hidden neurons to an arbitrary precision

One or more hidden layers

Sigmoid activations functions

Multi-Layer PerceptronOutput layer

2nd hidden

layer

1st hidden

layer

Input data

B

B

A

B

A

A

B

B

A

B

A

A

B

B

A

B

A

Different non linearly separable problemsTypes of

Decision Regions

Exclusive-OR

Problem

Classes with

Meshed regions

Most General

Region Shapes

Structure

Single-Layer

Half Plane

Bounded By

Hyperplane

Two-Layer

Convex Open

Or

Closed Regions

Abitrary

(Complexity

Limited by No.

of Nodes)

Three-Layer

Neural Networks – An Introduction Dr. Andrew Hunter

Perceptron Update

We employ :

- Gradient descent: w0+w1x1+w2x2 +…w17x17
- Exponential gradient: w0+w1e^-x1+..w17e^-x17
- Non-linear terms: w0+w12x1x2 +…
- Triples: w0+w123x1x2x3 + …
- Soon: some higher order terms:

w0 + w1457x1x4x5x7 +….

***genetically selected***…

Genetic Algorithms

May Help Supervise Training and Development of the Neural Net structure!

BONUS:

We believe use of diversity in weight updating is underated!!

CharlesDarwin (1809-1882)

- Sailed on the Beagle for about 5 years.

- 1859 Wrote “Origins of Species” a very popular scientific treatise

- Botanist, Zoologist, Geologist, General Man of Science

- Goal: Develop an A.I. learning method using principles of Darwin’s amazing Beagle trip and subsequent theories of biological evolution

Genetic Algorithms (GA)

- John Holland, father of Genetic Algorithms: “Computer programs that "evolve" in ways that resemble natural selection can solve complex problems even their creators do not fully understand “
- GA allows A.I. knowledge discovery not yet known to humans or machines
- Adaptable to complex domains where human knowledge is limited, or sufficient domain knowledge cannot be easily provided
- New hypotheses are generated by mutating and recombining current hypotheses
- At each step we have a population of hypotheses from which we select the most fit
- GA does parallel search over different parts of the hypothesis space

Blondie24: A GA Rock Star Success ??

- David Fogel’s Neural Net/GA Checker Playing Program
- Final rating: 2045.85 (Master Level) Better than 95% of all checkers players
- Neural Network: 2 hidden layers with 40 and 10 nodes respectively, fully connected
- Initial: 15 randomly-weighted NN
- Each of the 15 NNs produce 1 offspring (using mutation), total of 30 NNs
- Each player plays against 5 randomly-selected opponents using depth 4 alphabeta
- Top 15 performers retained
- 250 generations (time taken about 1 month)

Morph GA Fitness, Selection & Variation

- Initial Experiment: 20 Nan (2-Tuple) Brains play Round Robin each Generation, for several hundred Generations
- Fitness: +1 point given for game won, -1 for game lost. Draw receives no points. Points totalled for each Nan-Brain at Generation end
- Selection: Ten highest scoring Brains are selected for survival, remaining Brains eliminated .
- Variation: 10-50% of wgts are randomly mutated using a Box Mueller algorithm for Normal Distribution & a Sigmoid Function
- Initial Results: 300+ ELO Points gained from initial 555 (Random) ELO using Nan (2-Tuple) Neighborhood Knowledge Representation Schema

Future Experiments

- We are confident we can achieve higher ELO rating levels using GA but it is important we do not cheat (provide specific chess knowledge)
- Variations upon the Knowledge Representation, including X-Tuple Neighborhood Networks
- Evolving meta-parameters using Genetic Algorithms e.g. range of mutating weights, size of knowledge representation, # ofbrains competing
- Lamarckian Evolution: Learning during a generation (not just selection/mutation alone) i.e., TD updating
- Using GA with alternative Knowledge Representations including various Neural Net configurations

Grandmaster Database

Database: all (507,734 games)

Report: 1.d4 Nf6 2.Bg5 Ne4 3.h4 (173 games)

ECO: A45s [Trompowsky: Raptor Variation]

Generated by Scid 3.0, 2001.11.15

1. STATISTICS AND HISTORY

-------------------------

1.1 Statistics

Games 1-0 =-= 0-1 Score

-----------------------------------------------------------

All report games 173 61 44 68 47.9%

Both rated 2600+ 0 0 0 0 0.0%

Both rated 2500+ 17 8 5 4 61.7%

Both rated 2400+ 33 16 9 8 62.1%

Both rated 2300+ 67 26 21 20 54.4%

Download Presentation

Connecting to Server..