Recent Progress in the Design and Analysis of Admissible Heuristic Functions

Richard E. Korf

Computer Science Department

University of California, Los Angeles


Thanks

  • Victoria Cortessis

  • Ariel Felner

  • Rob Holte

  • Kaoru Mulvihill

  • Nathan Sturtevant


Heuristics come from Abstractions

  • Admissible (lower bound) heuristic evaluation functions are often the cost of exact solutions to abstract problems.

  • E.g. If we relax the Fifteen Puzzle to allow a tile to move to any adjacent position, the cost of an exact solution to this simplified problem is Manhattan distance.
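This relaxation can be sketched in a few lines (a minimal illustration; the function names and the 0-for-blank encoding are my assumptions, not from the talk):

```python
def manhattan_distance(state, goal):
    """Sum, over all tiles, of each tile's grid distance from its goal
    position -- the exact solution cost of the relaxed puzzle in which
    a tile may move to any adjacent position."""
    side = 4                      # Fifteen Puzzle is a 4x4 grid
    total = 0
    for pos, tile in enumerate(state):
        if tile == 0:             # 0 encodes the blank; not counted
            continue
        goal_pos = goal.index(tile)
        total += (abs(pos // side - goal_pos // side)
                  + abs(pos % side - goal_pos % side))
    return total

goal = tuple(range(1, 16)) + (0,)   # tiles 1..15, blank last
```

Because each relaxed move reduces at most one tile's displacement by one, the value is a lower bound on the true solution length.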


Outline of Talk

  • New methods for the design of more accurate heuristic evaluation functions.

  • A new method to predict the running time of admissible heuristic search algorithms


Fifteen Puzzle

[Figure: the Fifteen Puzzle goal state, tiles 1-15 in a 4x4 grid]


Fifteen Puzzle

  • Invented by Sam Loyd in 1870s

  • “...engaged the attention of nine out of ten persons of both sexes and of all ages and conditions of the community.”

  • $1000 prize to swap positions of two tiles


Swap Two Tiles

[Figure: two Fifteen Puzzle states, identical except that tiles 14 and 15 are swapped]

(Johnson & Story, 1879) proved it's impossible.


Twenty-Four Puzzle

[Figure: the Twenty-Four Puzzle goal state, tiles 1-24 in a 5x5 grid]


Rubik's Cube

[Figure: a Rubik's Cube]

  • Invented in 1974 by Erno Rubik of Hungary

  • Over 100 million sold worldwide

  • Most famous combinatorial puzzle ever


Finding Optimal Solutions

  • Input: A random solvable initial state

  • Output: A shortest sequence of moves that maps the initial state to the goal state

  • Generalized sliding-tile puzzle is NP-complete (Ratner and Warmuth, 1986)

  • People can’t find optimal solutions.

  • Progress measured by size of problems that can be solved optimally.


Sizes of Problem Spaces

Problem              | Nodes  | Brute-Force Search Time (10 million nodes/second)
8 Puzzle             | 10^5   | .01 seconds
2x2x2 Rubik's Cube   | 10^6   | .2 seconds
15 Puzzle            | 10^13  | 6 days
3x3x3 Rubik's Cube   | 10^19  | 68,000 years
24 Puzzle            | 10^25  | 12 billion years


A* Algorithm

  • Hart, Nilsson, and Raphael, 1968

  • Best-first search with cost function f(n)=g(n)+h(n)

  • If h(n) is admissible (a lower bound estimate of optimal cost), A* guarantees optimal solutions if they exist.

  • A* stores all the nodes it generates, exhausting available memory in minutes.


Iterative-Deepening-A* (IDA*)

  • IDA* (Korf, 1985) is a linear-space version of A*, using the same cost function.

  • Each iteration searches depth-first for solutions of a given length.

  • IDA* is simpler, and often faster than A*, due to less overhead per node.

  • Key to performance is accurate heuristic evaluation function.
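The iterations described above can be sketched compactly (shown here on the Eight Puzzle so it runs quickly; the helper names and puzzle plumbing are illustrative assumptions, not code from the talk):

```python
def ida_star(start, h, successors, is_goal):
    """IDA*: depth-first iterations with an increasing bound on
    f(n) = g(n) + h(n).  Memory is linear in the solution depth."""
    def dfs(node, g, bound, parent):
        f = g + h(node)
        if f > bound:
            return f, None               # cutoff: smallest f past the bound
        if is_goal(node):
            return f, g                  # g is the solution cost
        smallest = float('inf')
        for child in successors(node):
            if child == parent:          # eliminate inverse of the last move
                continue
            t, cost = dfs(child, g + 1, bound, node)
            if cost is not None:
                return t, cost
            smallest = min(smallest, t)
        return smallest, None

    bound = h(start)
    while True:                          # one iteration per f-bound
        bound, cost = dfs(start, 0, bound, None)
        if cost is not None:
            return cost

# Eight Puzzle plumbing for a quick demonstration.
SIDE = 3
GOAL = tuple(range(1, SIDE * SIDE)) + (0,)

def successors(state):
    b = state.index(0)
    r, c = divmod(b, SIDE)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < SIDE and 0 <= nc < SIDE:
            s = list(state)
            n = nr * SIDE + nc
            s[b], s[n] = s[n], s[b]
            yield tuple(s)

def manhattan(state):
    return sum(abs(i // SIDE - (t - 1) // SIDE)
               + abs(i % SIDE - (t - 1) % SIDE)
               for i, t in enumerate(state) if t)
```

Each failed iteration returns the smallest f-value that exceeded the bound, which becomes the next bound.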


Manhattan Distance Heuristic

[Figure: a scrambled Fifteen Puzzle state next to the goal state]

Manhattan distance is 6+3=9 moves


Performance on 15 Puzzle

  • Random 15 puzzle instances were first solved optimally using IDA* with Manhattan distance heuristic (Korf, 1985).

  • Optimal solution lengths average 53 moves.

  • 400 million nodes generated on average.

  • Average solution time is about 50 seconds on current machines.


Limitation of Manhattan Distance

  • To solve a 24-Puzzle instance, IDA* with Manhattan distance would take about 65,000 years on average.

  • Assumes that each tile moves independently

  • In fact, tiles interfere with each other.

  • Accounting for these interactions is the key to more accurate heuristic functions.


Example: Linear Conflict

[Figure: animation of two tiles, 1 and 3, in their goal row but in reversed order; one tile must move out of the row to let the other pass, then move back]

Manhattan distance is 2+2=4 moves, but linear conflict adds 2 additional moves.


Linear Conflict Heuristic

  • Hansson, Mayer, and Yung, 1991

  • Given two tiles in their goal row, but reversed in position, additional vertical moves can be added to Manhattan distance.

  • Still not accurate enough to solve 24-Puzzle

  • We can generalize this idea further.
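One way to realize the generalized idea is sketched below for an n-by-n puzzle; the greedy pair-resolution and all names are my assumptions, not code from the talk:

```python
def linear_conflict(state, goal, side=4):
    """Manhattan distance plus 2 extra moves for each tile that must
    temporarily leave its goal row or column to let another tile pass."""
    goal_pos = {t: i for i, t in enumerate(goal) if t != 0}

    md = sum(abs(i // side - goal_pos[t] // side)
             + abs(i % side - goal_pos[t] % side)
             for i, t in enumerate(state) if t != 0)

    def line_penalty(goal_indices):
        # goal_indices: each tile's goal offset along the line, listed
        # in the order the tiles currently appear; an out-of-order pair
        # is a conflict.  Greedily remove the most-conflicted tile,
        # charging 2 moves (out of the line and back) per removal.
        n = len(goal_indices)
        conflicts = {i: set() for i in range(n)}
        for i in range(n):
            for j in range(i + 1, n):
                if goal_indices[i] > goal_indices[j]:
                    conflicts[i].add(j)
                    conflicts[j].add(i)
        extra = 0
        while any(conflicts.values()):
            worst = max(conflicts, key=lambda k: len(conflicts[k]))
            for other in conflicts[worst]:
                conflicts[other].discard(worst)
            conflicts[worst].clear()
            extra += 2
        return extra

    extra = 0
    for r in range(side):          # tiles already in their goal row
        extra += line_penalty([goal_pos[t] % side
                               for t in state[r * side:(r + 1) * side]
                               if t != 0 and goal_pos[t] // side == r])
    for c in range(side):          # tiles already in their goal column
        extra += line_penalty([goal_pos[t] // side
                               for t in state[c::side]
                               if t != 0 and goal_pos[t] % side == c])
    return md + extra

goal = tuple(range(1, 9)) + (0,)        # Eight Puzzle goal for a quick check
state = (2, 1, 3, 4, 5, 6, 7, 8, 0)    # tiles 1 and 2 reversed in their goal row
```

The extra vertical moves never overlap with the Manhattan counts, so the sum remains admissible.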


More Complex Tile Interactions

[Figure: three Fifteen Puzzle states, each shown with its goal configuration]

  • M.d. is 20 moves, but 28 moves are needed.

  • M.d. is 17 moves, but 27 moves are needed.

  • M.d. is 19 moves, but 31 moves are needed.


Pattern Database Heuristics

  • Culberson and Schaeffer, 1996

  • A pattern database is a complete set of such positions, with the associated number of moves needed to solve each.

  • e.g. a 7-tile pattern database for the Fifteen Puzzle contains 519 million entries.


Heuristics from Pattern Databases

[Figure: a scrambled Fifteen Puzzle state with the pattern tiles highlighted, next to the goal state]

31 moves is a lower bound on the total number of moves needed to solve this particular state.


Precomputing Pattern Databases

  • Entire database is computed with one backward breadth-first search from goal.

  • All non-pattern tiles are indistinguishable, but all tile moves are counted.

  • The first time each state is encountered, the total number of moves made so far is stored.

  • Once computed, the same table is used for all problems with the same goal state.
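The backward breadth-first construction can be sketched on the Eight Puzzle with a small pattern (the three-tile pattern and names are illustrative; real Fifteen Puzzle databases are built the same way, just over far more states):

```python
from collections import deque

def build_pattern_db(goal, pattern, side=3):
    """Backward breadth-first search from the goal.  Non-pattern tiles
    are mapped to an indistinguishable marker, but every tile move is
    counted.  The first time each abstract state is reached, its depth
    is stored as the heuristic value."""
    def abstract(state):
        return tuple(t if t in pattern or t == 0 else -1 for t in state)

    start = abstract(goal)
    db = {start: 0}
    frontier = deque([start])
    while frontier:
        state = frontier.popleft()
        b = state.index(0)
        r, c = divmod(b, side)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < side and 0 <= nc < side:
                s = list(state)
                n = nr * side + nc
                s[b], s[n] = s[n], s[b]
                child = tuple(s)
                if child not in db:        # first visit = shortest distance
                    db[child] = db[state] + 1
                    frontier.append(child)
    return db, abstract

goal = (1, 2, 3, 4, 5, 6, 7, 8, 0)
db, abstract = build_pattern_db(goal, pattern={1, 2, 3})
```

During problem solving, the lookup is simply h(s) = db[abstract(s)].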


What About the Non-Pattern Tiles?

  • Given more memory, we can compute additional pattern databases from the remaining tiles.

  • In fact, we can compute multiple pattern databases from overlapping sets of tiles.

  • The only limit is the amount of memory available to store the pattern databases.


Combining Multiple Databases

[Figure: the same scrambled state, with two different sets of pattern tiles highlighted in red and blue]

31 moves needed to solve red tiles; 22 moves needed to solve blue tiles. The overall heuristic is the maximum: 31 moves.


Applications of Pattern Databases

  • On 15 puzzle, IDA* with pattern database heuristics is about 10 times faster than with Manhattan distance (Culberson and Schaeffer, 1996).

  • Pattern databases can also be applied to Rubik’s Cube.


Corner Cubie Pattern Database

This database contains 88 million entries


6-Edge Cubie Pattern Database

This database contains 43 million values


Remaining 6-Edge Cubie Database

This database also contains 43 million values


Rubik's Cube Heuristic

  • All three databases are precomputed in less than an hour, and use less than 100 Mbytes.

  • During problem solving, for each state, the different sets of cubies are used to compute indices into their pattern databases.

  • The overall heuristic value is the maximum of the three different database values.


Performance on Rubik's Cube

  • IDA* with this heuristic found the first optimal solutions to randomly scrambled Rubik’s Cubes (Korf, 1997).

  • Median optimal solution length is 18 moves

  • Most problems can be solved in about a day now, using larger pattern databases.


Related Work

  • Culberson and Schaeffer’s 1996 work on pattern databases in the Fifteen Puzzle.

  • Armand Prieditis’ ABSOLVER program discovered a “center-corner” heuristic for Rubik’s Cube in 1993.

  • Herbert Kociemba independently developed a powerful Rubik’s Cube program in 1990s.

  • Mike Reid has built faster Cube programs.


Memory Adds Speed

  • More memory can hold larger databases.

  • Larger databases provide more accurate heuristic values.

  • More accurate heuristics speed up search.

  • E.g. Doubling the memory almost doubles the search speed.

  • See (Holte and Hernadvolgyi, 1999, 2000) for more details.


Limitation of General Databases

  • The only way to admissibly combine values from multiple general pattern databases is to take the maximum of their values.

  • For more accuracy, we would like to add heuristic values from different databases.

  • We can do this on the tile puzzles because each move only moves one tile at a time.

  • Joint work with Ariel Felner


Additive Pattern Databases

  • Culberson and Schaeffer counted all moves needed to correctly position the pattern tiles.

  • In contrast, we count only moves of the pattern tiles, ignoring non-pattern moves.

  • If no tile belongs to more than one pattern, then we can add their heuristic values.

  • Manhattan distance is a special case of this, where each pattern contains a single tile.
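A sketch of the difference on the Eight Puzzle. The 0/1-cost backward search below is my illustrative realization of "count only moves of the pattern tiles", not code from the talk:

```python
from collections import deque

def build_additive_db(goal, pattern, side=3):
    """Backward search from the goal in which moving a pattern tile
    costs 1 and moving any other (indistinguishable) tile costs 0,
    implemented as a 0-1 BFS.  The database is keyed by the pattern
    tiles' positions alone."""
    def abstract(state):
        return tuple(t if t in pattern or t == 0 else -1 for t in state)

    def key(state):                        # pattern-tile positions only
        return tuple(state.index(t) for t in sorted(pattern))

    start = abstract(goal)
    dist = {start: 0}
    dq = deque([start])
    while dq:
        state = dq.popleft()
        b = state.index(0)
        r, c = divmod(b, side)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < side and 0 <= nc < side):
                continue
            s = list(state)
            n = nr * side + nc
            s[b], s[n] = s[n], s[b]
            child = tuple(s)
            cost = 1 if state[n] in pattern else 0   # charge pattern moves only
            if child not in dist or dist[state] + cost < dist[child]:
                dist[child] = dist[state] + cost
                (dq.appendleft if cost == 0 else dq.append)(child)
    db = {}
    for state, d in dist.items():          # minimize over blank positions
        k = key(state)
        db[k] = min(db.get(k, d), d)
    return db, key
```

With two disjoint patterns, e.g. {1, 2, 3} and {4, 5, 6, 7, 8}, the overall heuristic is db1[key1(s)] + db2[key2(s)]; Manhattan distance is the special case where every pattern holds a single tile.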


Example Additive Databases

[Figure: the Fifteen Puzzle goal state partitioned into a 7-tile pattern and an 8-tile pattern]

The 7-tile database contains 58 million entries. The 8-tile database contains 519 million entries.


Computing the Heuristic

[Figure: a scrambled state with the two disjoint sets of pattern tiles highlighted in red and blue]

20 moves needed to solve red tiles; 25 moves needed to solve blue tiles. The overall heuristic is the sum: 20+25=45 moves.


Performance on 15 Puzzle

  • IDA* with a heuristic based on these additive pattern databases can optimally solve random 15 puzzle instances in less than 29 milliseconds on average.

  • This is about 1700 times faster than with Manhattan distance on the same machine.


24 Puzzle Additive Databases

[Figure: the Twenty-Four Puzzle goal state partitioned into disjoint tile patterns]


Performance on 24 Puzzle

  • Each database contains 128 million entries.

  • IDA* using these databases can optimally solve random 24-puzzle problems.

  • Optimal solutions average 100 moves.

  • Billions to trillions of nodes generated.

  • Over 2 million nodes per second.

  • Running times from 18 seconds to 10 days.

  • Average is a day and a half.


Moral of the Story (so far)

  • Our implementation of disjoint additive pattern databases is tailored to tile puzzles.

  • In general, most heuristics assume that subproblems are independent.

  • Capturing some of the interactions between subproblems yields more accurate heuristics.

  • We've also applied this to graph partitioning.


Time Complexity of Admissible Heuristic Search Algorithms

Joint work with Michael Reid (Brown University)


Previous Work on This Problem

  • Pohl 1977, Gaschnig 1979, Pearl 1984

  • Assumes an abstract analytic model of the problem space

  • Characterizes heuristic by its accuracy as an estimate of optimal solution cost

  • Produces asymptotic results.

  • Doesn’t predict runtime on real problems.


How Long Does a Heuristic Search Algorithm Take to Run?

Depends on:

  • Branching factor of problem space

  • Solution depth of problem instance

  • Heuristic evaluation function


Branching Factor: average number of children of a node

  • In tile puzzles, a cell has 2, 3, or 4 neighbors.

  • Eliminating inverse of last move gives a node branching factor of 1, 2, or 3.

  • Exact asymptotic branching factor depends on relative frequency of each type of node, in limit of large depth.

  • Joint work with Stefan Edelkamp


One Wrong Way to Do It

  • 15 puzzle has 4 center cells (b=3), 4 corner cells (b=1), and 8 side cells (b=2).

  • Therefore, b = (4*3 + 4*1 + 8*2)/16 = 2.

  • Assumes all blank positions equally likely


The Correct Answer

  • The asymptotic branching factor of the Fifteen Puzzle is 2.13040

  • The derivation is left as an exercise.
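The derivation is left as an exercise, but the value can be approximated numerically by counting blank-move sequences with no immediate reversal, since each node in the search tree corresponds to one such sequence (a sketch under that assumption; all names are illustrative):

```python
def asymptotic_branching_factor(side=4, depth=200):
    """Estimate the asymptotic branching factor: count move sequences
    of the blank with no immediate reversal, and take the growth rate
    of consecutive level sizes in the limit of large depth."""
    def neighbors(cell):
        r, c = divmod(cell, side)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < side and 0 <= nc < side:
                yield nr * side + nc

    # counts[(cell, prev)] = number of sequences ending with the blank
    # at `cell`, having just arrived from `prev`
    counts = {}
    for start in range(side * side):
        for n in neighbors(start):
            counts[(n, start)] = counts.get((n, start), 0) + 1
    totals = [sum(counts.values())]
    for _ in range(depth):
        new = {}
        for (cell, prev), k in counts.items():
            for n in neighbors(cell):
                if n != prev:                  # never undo the previous move
                    new[(n, cell)] = new.get((n, cell), 0) + k
        counts = new
        totals.append(sum(counts.values()))
    # geometric mean over two levels smooths any odd/even alternation
    return (totals[-1] / totals[-3]) ** 0.5
```

For the Fifteen Puzzle this converges to roughly 2.1304, in line with the value quoted above.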


Derivation of Time Complexity

  • First, consider brute-force search.

  • Then, consider heuristic search complexity.

  • Illustrate result with example search tree.


Brute-Force Search Tree

[Figure: a search tree, built up one level per animation frame. The table below gives the number of nodes at each depth g(n) with each heuristic value h(n); each depth-d row is labeled b^d, and the heuristic shifts by at most one per move, so each row spreads out from the one above:]

g(n)   nodes   h=0   h=1   h=2   h=3   h=4
 0     b^0                   1
 1     b^1             1     1     1
 2     b^2       1     2     3     2     1
 3     b^3       3     6     7     6     3
 4     b^4       9    16    19    16     9
 5     b^5      25    44    51    44    25
 6     b^6      69   120   139   120    69

Heuristic Distribution Function

  • The distribution of heuristic values converges, independent of initial state.

  • Let P(x) be the proportion of states with heuristic value <= x, in limit of large depth.

  • For pattern database heuristics, P(x) can be computed exactly from the databases.

  • For general heuristics, P(x) can be approximated by random sampling.
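The sampling approach can be sketched as follows (an illustration using Manhattan distance on the Eight Puzzle; random permutations stand in for random states here, whereas a real experiment would sample the reachable problem space):

```python
import random

def estimate_P(h, samples, max_h):
    """P(x): fraction of sampled states with heuristic value <= x."""
    values = [h(s) for s in samples]
    n = len(values)
    return [sum(v <= x for v in values) / n for x in range(max_h + 1)]

# Illustration on the Eight Puzzle with Manhattan distance.
SIDE = 3

def manhattan(state):
    return sum(abs(i // SIDE - (t - 1) // SIDE)
               + abs(i % SIDE - (t - 1) % SIDE)
               for i, t in enumerate(state) if t)

def random_state():
    tiles = list(range(SIDE * SIDE))
    random.shuffle(tiles)
    return tuple(tiles)

random.seed(0)                    # reproducible sample
# 32 comfortably exceeds the maximum Manhattan distance on a 3x3 board
P = estimate_P(manhattan, [random_state() for _ in range(10000)], 32)
```

The resulting list is a nondecreasing cumulative distribution that rises from near 0 to exactly 1.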


Nodes Expanded by A* or IDA*

  • Running time is proportional to number of node expansions.

  • If C* is the cost of an optimal solution, A* or IDA* will expand all nodes n for which f(n) = g(n) + h(n) <= C*, in the worst case.




IDA* Iteration with Depth Limit 6

[Figure: the same tree pruned by the bound, one animation frame per cut. The table gives the number of nodes at each depth g(n) with each heuristic value h(n); nodes with f(n) = g(n) + h(n) > 6 are cut off, and the cuts cascade to their descendants. The nodes expanded at depth i number roughly b^i P(d-i):]

g(n)   expanded    h=0   h=1   h=2   h=3   h=4
 0     b^0                       1
 1     b^1                 1     1     1
 2     b^2           1     2     3     2     1
 3     b^3 P(3)      3     6     7     6     3
 4     b^4 P(2)      9    16    19    13     6
 5     b^5 P(1)     25    44    35    19     0
 6     b^6 P(0)     69    69    44     0     0


The Main Result

  • b is the branching factor,

  • d is the optimal solution depth,

  • P(x) is the heuristic distribution function.

  • The number of nodes expanded by the last iteration of IDA* in the worst case is:

      b^0 P(d) + b^1 P(d-1) + ... + b^d P(0)  =  sum from i=0 to d of b^i P(d-i)
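The formula is directly computable once b, d, and P are in hand (a sketch; P here is any function implementing the heuristic distribution):

```python
def predicted_expansions(b, d, P):
    """Worst-case nodes expanded in IDA*'s final iteration:
    b^0 P(d) + b^1 P(d-1) + ... + b^d P(0)."""
    return sum(b ** i * P(d - i) for i in range(d + 1))
```

With no heuristic, P(x) = 1 everywhere and the sum collapses to the brute-force total 1 + b + ... + b^d.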


Assumptions of the Analysis

  • h is consistent, meaning that in any move, h never decreases more than g increases.

  • The graph is searched as a tree (e.g. IDA*).

  • Goal nodes are ignored.

  • The distribution of heuristic values at a given depth approximates the equilibrium heuristic distribution at large depth.

  • We don’t assume that costs are uniform.


Experiments on Rubik's Cube

  • Heuristic function is based on corner cubie and two edge cubie pattern databases.

  • Heuristic distribution is computed exactly from databases, assuming database values are independent of each other.

  • Theory predicts average node expansions by IDA* to within 1%.


Experiments on Fifteen Puzzle

  • Heuristic is Manhattan distance.

  • Heuristic distribution is approximated by 10 billion random samples of the problem space.

  • Theory predicts average node expansions within 2.5% at typical solution depths.

  • Accuracy of theory increases with depth.


Experiments on Eight Puzzle

  • Heuristic is Manhattan distance.

  • Heuristic distribution is computed exactly by exhaustive search of entire space.

  • Average node expansions by IDA* is computed exactly by running every solvable initial state as deep as we need to.

  • Theory predicts the experimental results exactly.


Practical Importance

  • Ability to predict performance of heuristic search from heuristic distribution allows us to choose among alternative heuristics.

  • More efficient than running large numbers of searches in the problem domain.

  • Can predict performance even if we can't run any problem instances (e.g. the Twenty-Four Puzzle with Manhattan distance).


Complexity of Heuristic Search

  • Complexity of brute-force search is O(b^d).

  • Previous results predicted O((b-k)^d) for the complexity of heuristic search, reducing the effective branching factor.

  • Our theory predicts O(b^(d-k)), reducing the effective depth of search by a constant.

  • This is confirmed by our experiments.

  • k is roughly the expected value of the heuristic.


Summary

  • More powerful admissible heuristics can be automatically computed by capturing some of the interactions between subproblems.

  • The time complexity of heuristic search algorithms can be accurately predicted from the branching factor, search depth, and heuristic distribution function.


Conclusions

  • Recent progress in this area has come from more accurate heuristic functions.

  • Admissible heuristics can also be used to speed up searches for sub-optimal solutions.

  • New methods have emerged for constructing such heuristic functions.

  • Applying these methods to new problems still requires work.

