CSM6120 Introduction to Intelligent Systems

1 / 56

# CSM6120 Introduction to Intelligent Systems - PowerPoint PPT Presentation

CSM6120 Introduction to Intelligent Systems. Search 3. Quick review. Uninformed techniques. Informed search. What we’ll look at: Heuristics Hill-climbing Best-first search Greedy search A* search. Heuristics. A heuristic is a rule or principle used to guide a search

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

## PowerPoint Slideshow about 'CSM6120 Introduction to Intelligent Systems' - derron

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

### CSM6120Introduction to Intelligent Systems

Search 3

Quick review
• Uninformed techniques
Informed search
• What we’ll look at:
• Heuristics
• Hill-climbing
• Best-first search
• Greedy search
• A* search
Heuristics
• A heuristic is a rule or principle used to guide a search
• It provides a way of giving additional knowledge of the problem to the search algorithm
• Must provide a reasonably reliable estimate of how far a state is from a goal, or the cost of reaching the goal via that state
• A heuristic evaluation function is a way of calculating or estimating such distances/cost
Heuristics and algorithms
• A correct algorithm will find you the best solution given good data and enough time
• It is precisely specified
• A heuristic gives you a workable solution in a reasonable time
• It gives a guided or directed solution
Evaluation function
• There are an infinite number of possible heuristics
• Criteria is that it returns an assessment of the point in the search
• If an evaluation function is accurate, it will lead directly to the goal
• More realistically, this usually ends up as “seemingly-best-search”
• Traditionally, the lowest value after evaluation is chosen as we usually want the lowest cost or nearest
Heuristic evaluation functions
• Estimate of expected utility value from a current position
• E.g. value for pieces left in chess
• Way of judging the value of a position
• Humans have to do this as we do not evaluate all possible alternatives
• These heuristics usually come from years of human experience
• Performance of the game playing program is very dependent on the quality of the function
Heuristic evaluation functions
• Must agree with the ordering a utility function would give at terminal states (leaf nodes)
• Computation must not take long
• For non-terminal states, the evaluation function must strongly correlate with the actual chance of ‘winning’
• The values can be learned using machine learning techniques
Heuristics for the 8-puzzle
• Number of tiles out of place (h1)
• Manhattan distance (h2)
• Sum of the distance of each tile from its goal position
• Tiles can only move up or down  city blocks
The 8-puzzle
• Using a heuristic evaluation function:
• h(n) = sum of the distance each tile is from its goal position
Goal state

Current state

Current state

h1=5

h2=1+1+1+2+2=7

h1=1

h2=1

Search algorithms
• Hill climbing
• Best-first search
• Greedy best-first search
• A*
Iterative improvement
• Consider all states laid out on the surface of a landscape
• The height at any point corresponds to the result of the evaluation function
Hill-climbing (greedy local)
• Until current-state = goal-state OR there is no change in current-state do:
• a) Get the children of current-state and apply evaluation function to each child
• b) If one of the children has a better score, then set current-state to the child with the best score
• Loop that moves in the direction of decreasing (increasing) value
• Terminates when a “dip” (or “peak”) is reached
• If more than one best direction, the algorithm can choose at random
Hill-climbing drawbacks
• Local minima (maxima)
• Local, rather than global minima (maxima)
• Plateau
• Area of state space where the evaluation function is essentially flat
• The search will conduct a random walk
• Ridges
• Causes problems when states along the ridge are not directly connected - the only choice at each point on the ridge requires uphill (downhill) movement
Best-first search
• Like hill climbing, but eventually tries all paths as it uses list of nodes yet to be explored
• While priority-queue not empty do:
• a) Remove best node from the priority-queue
• b) If it is the goal node, return success. Otherwise find its successors
• c) Apply evaluation function to successors and add to priority-queue
Best-first search
• Different best-first strategies have different evaluation functions
• Some use heuristics only, others also use cost functions: f(n) = g(n) + h(n)
• For Greedy and A*, our heuristic is:
• Heuristic function h(n) = estimated cost of the cheapest path from node n to a goal node
• For now, we will introduce the constraint that if n is a goal node, then h(n) = 0
Greedy best-first search
• Greedy BFS tries to expand the node that is ‘closest’ to the goal assuming it will lead to a solution quickly
• f(n) = h(n)
• aka “greedy search”
• Differs from hill-climbing – allows backtracking
• Implementation
• Expand the “most desirable” node into the frontier queue
• Sort the queue in decreasing order of desirability
Greedy best-first search
• Complete
• No, GBFS can get stuck in loops (e.g. bouncing back and forth between cities)
• Time complexity
• O(bm) but a good heuristic can have dramatic improvement
• Space complexity
• O(bm) – keeps all the nodes in memory
• Optimal
• No! (A – S – F – B = 450, shorter journey is possible)
Practical 2
• Implement greedy best-first search for pathfinding
• Look at code for AStarPathFinder.java
A* search
• A* (A star) is the most widely known form of Best-First search
• It evaluates nodes by combining g(n) and h(n)
• f(n) = g(n) + h(n)
• Where
• g(n) = cost so far to reach n
• h(n) = estimated cost to goal from n
• f(n) = estimated total cost of path through n

start

g(n)

n

h(n)

goal

A* search
• When h(n) = actual cost to goal, h*(n)
• Only nodes in the correct path are expanded
• Optimal solution is found
• When h(n) < h*(n)
• Optimal solution is found
• When h(n) > h*(n)
• Optimal solution can be overlooked
A* search
• Complete and optimal if h(n) does not overestimate the true cost of a solution through n
• Time complexity
• Exponential in [relative error of h x length of solution]
• The better the heuristic, the better the time
• Best case h is perfect, O(d)
• Worst case h = 0, O(bd) same as BFS
• Space complexity
• Keeps all nodes in memory and save in case of repetition
• This is O(bd) or worse
• A* usually runs out of space before it runs out of time
A* exercise

NodeCoordinatesSL Distance to K

A (5,9) 8.0

B (3,8) 7.3

C (8,8) 7.6

D (5,7) 6.0

E (7,6) 5.4

F (4,5) 4.1

G (6,5) 4.1

H (3,3) 2.8

I (5,3) 2.0

J (7,2) 2.2

K (5,1) 0.0

GBFS exercise

NodeCoordinatesDistance

A (5,9) 8.0

B (3,8) 7.3

C (8,8) 7.6

D (5,7) 6.0

E (7,6) 5.4

F (4,5) 4.1

G (6,5) 4.1

H (3,3) 2.8

I (5,3) 2.0

J (7,2) 2.2

K (5,1) 0.0

f(n) = g(n) + h(n)

• What algorithm does A* emulate if we set
• h(n) = -2*g(n)
• h(n) = 0
• Can you make A* behave like Breadth-First Search?
• A heuristic h(n) is admissible if for every node n,

h(n) ≤ h*(n), where h*(n) is the true cost to reach the goal state from n

• An admissible heuristic never overestimates the cost to reach the goal
• Example: hSLD(n) (never overestimates the actual road distance)
• Theorem: If h(n) is admissible, A* is optimal (for tree-search)
Optimality of A* (proof)
• Suppose some suboptimal goal G2has been generated and is in the frontier. Let n be an unexpanded node in the frontier such that n is on a shortest path to an optimal goal G
• f(G2) = g(G2) since h(G2) = 0 (true for any goal state)
• g(G2) > g(G) since G2 is suboptimal
• f(G) = g(G) since h(G) = 0
• f(G2) > f(G) from above
Optimality of A* (proof)
• f(G2) > f(G)
• h(n) ≤ h*(n) since h is admissible
• g(n) + h(n) ≤ g(n) + h*(n)
• f(n) ≤ f(G)
• Hence f(G2) > f(n), and A* will never select G2 for expansion
Heuristic functions
• Admissible heuristic example: for the 8-puzzleh1(n) = number of misplaced tilesh2(n) = total Manhattan distance i.e. no of squares from desired location of each tileh1(S) = ??h2(S) = ??
Heuristic functions
• Admissible heuristic example: for the 8-puzzle h1(n) = number of misplaced tilesh2(n) = total Manhattan distance i.e. no of squares from desired location of each tileh1(S) = 6h2(S) = 4+0+3+3+1+0+2+1 = 14
Heuristic functions
• Dominance/Informedness
• if h1(n)  h2(n) for all n (both admissible)then h2 dominates h1 and is better for searchTypical search costs:
• d = 12 IDS = 3,644,035 nodes A*(h1) = 227 nodes A*(h2) = 73 nodes
• d = 24 IDS  54,000,000,000 nodes A*(h1) = 39,135 nodes A*(h2) = 1,641 nodes
Heuristic functions
• Admissible heuristicexample: for the 8-puzzleh1(n) = number of misplaced tilesh2(n) = total Manhattan distance i.e. no of squares from desired location of each tileh1(S) = 6h2(S) = 4+0+3+3+1+0+2+1 = 14

But how do we come up with a heuristic?

Relaxed problems
• Admissible heuristics can be derived from the exact solution cost of a relaxed version of the problem
• If the rules of the 8-puzzle are relaxed so that a tile can move anywhere, then h1(n) gives the shortest solution
• If the rules are relaxed so that a tile can move to any adjacent square, then h2(n) gives the shortest solution
• Key point: the optimal solution cost of a relaxed problem is no greater than the optimal solution cost of the real problem
Choosing a strategy
• What sort of search problem is this?
• How big is the search space?
• What is known about the search space?
• What methods are available for this kind of search?
• How well do each of the methods work for each kind of problem?
Which method?
• Exhaustive search for small finite spaces when it is essential that the optimal solution is found
• A* for medium-sized spaces if heuristic knowledge is available
• Random search for large evenly distributed homogeneous spaces
• Hill climbing for discrete spaces where a sub-optimal solution is acceptable
Summary
• What is search for?
• How do we define/represent a problem?
• How do we find a solution to a problem?
• Are we doing this in the best way possible?
Pathfinding applet
• http://www.stefan-baur.de/cs.web.mashup.pathfinding.html