Search Techniques

Search Techniques MSc AI module

Search • In order to build a system to solve a problem we need to: • Define and analyse the problem • Acquire the knowledge • Represent the knowledge • Choose the best problem-solving technique • This is where SEARCH TECHNIQUES are important

SEARCH • A search explores a problem space, not for a particular piece of data, but for a PATH that joins the initial problem description to the desired (or GOAL) state. • The path represents the solution steps. • Searching for a solution develops a SOLUTION SPACE. • A problem space is developed procedurally as the search progresses (it is not pre-defined like a data structure). • We are defining a problem as a STATE SPACE SEARCH, and this forms the basis of many AI problem-solving techniques.

SEARCH • For some problems we’re only interested in the solution, e.g. crosswords. • For others, we’re interested in the PATH, i.e. how we arrived at the solution, e.g. finding the shortest distance between two places, or the Towers of Hanoi puzzle. • We may also be interested in the optimal solution, i.e. both the final state and the cost of reaching it.

Generate & Test • This is the simplest form of search. • List each possible candidate solution in turn and check whether it satisfies the constraints. • Either stop at the first solution or keep going to find the next. • e.g. a 3x3 magic square: arrange the numbers 1 to 9 so that every row, column and diagonal adds up to 15. • There are 9! = 362,880 candidate arrangements, so generating and checking them all runs into combinatorial explosion.
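The magic square example can be sketched directly in code. This is a minimal illustration rather than a recommended method; the names `is_magic` and `generate_and_test` are made up for this sketch, and the square is assumed to be stored as a tuple of nine numbers read row by row.

```python
from itertools import permutations

# Index triples for every line that must add up to 15.
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
    (0, 4, 8), (2, 4, 6),              # diagonals
]

def is_magic(square):
    """Test step: does every row, column and diagonal sum to 15?"""
    return all(square[a] + square[b] + square[c] == 15 for a, b, c in LINES)

def generate_and_test(first_only=False):
    """Generate step: try every arrangement of 1..9 in turn."""
    solutions = []
    for candidate in permutations(range(1, 10)):   # 9! = 362,880 candidates
        if is_magic(candidate):
            solutions.append(candidate)
            if first_only:                         # stop at the first solution
                break
    return solutions

magic_squares = generate_and_test()
print(len(magic_squares), "solutions, e.g.", magic_squares[0])
```

Run to completion, the loop checks all 362,880 candidates even though only a handful pass the test, which is exactly the cost the slide warns about.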

Generate & Test (fig 3.4, Finlay & Dix) • Representing the solution process as a tree makes it possible to eliminate whole branches below a certain level.
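One possible reading of this idea, sketched for the magic square: fill the cells one at a time, so that each partial square is a node in a tree, and abandon a branch as soon as a completed row, column or diagonal fails to add up to 15. The function `count_nodes` and its `prune` flag are hypothetical names for this sketch and are not taken from Finlay & Dix.

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def count_nodes(prune):
    """Walk the tree of partial squares depth first.

    Returns (nodes visited, solutions found). With prune=True a branch
    is cut as soon as a fully filled line does not sum to 15.
    """
    nodes = 0
    solutions = 0

    def consistent(cells):
        # Only lines whose three cells are all filled can be checked.
        return all(sum(cells[i] for i in line) == 15
                   for line in LINES if max(line) < len(cells))

    def expand(cells, unused):
        nonlocal nodes, solutions
        nodes += 1
        if prune and not consistent(cells):
            return                      # eliminate this whole branch
        if len(cells) == 9:
            if consistent(cells):
                solutions += 1
            return
        for n in sorted(unused):
            expand(cells + [n], unused - {n})

    expand([], frozenset(range(1, 10)))
    return nodes, solutions

pruned_nodes, pruned_solutions = count_nodes(prune=True)
print(pruned_nodes, pruned_solutions)
```

Calling `count_nodes(prune=False)` walks the full tree for comparison. Both runs find the same solutions; the pruned run simply visits far fewer nodes.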

Generate & Test (fig 3.5, Finlay & Dix) • Another useful representation is a graph: we can represent states and the moves between them. • A graph can be ‘translated’ into tree notation to represent the search space.

Towers of Hanoi Search Tree

SEARCH • Methods for pruning search trees are integral to many search algorithms. • Pruning means ‘unfruitful’ branches are not expanded. • More sophisticated methods use heuristics to select the best branches. • These often use a heuristic evaluation function: a number calculated for each node/path that says how good or bad it is likely to be. • N.B. entire trees aren’t necessarily constructed in the computer’s memory. • The trees represent the solution space, i.e. the possible solutions.

Depth First Search and Breadth First Search • These are the simplest types of search. • Depth first visits the nodes in the order: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 • Breadth first visits the same nodes in the order: 1 2 8 11 3 5 9 12 14 4 6 7 10 13 15 16 • Depending on where the goal state is, one may reach a solution more quickly than the other. • These are known as Blind Searches.
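The two visiting orders can be reproduced with a short sketch. The tree below is an assumption: it is one shape that is consistent with both orderings on the slide, inferred from the numbers rather than copied from the original figure. The names `TREE`, `depth_first` and `breadth_first` are illustrative only.

```python
from collections import deque

# Assumed tree (node -> children, left to right), inferred from the
# two visiting orders, not taken from the original figure.
TREE = {
    1: [2, 8, 11],
    2: [3, 5], 8: [9], 11: [12, 14],
    3: [4], 5: [6, 7], 9: [10], 12: [13], 14: [15, 16],
}

def depth_first(root):
    """Follow one branch to the bottom before backing up (uses a stack)."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        # Push children reversed so the leftmost child is expanded next.
        stack.extend(reversed(TREE.get(node, [])))
    return order

def breadth_first(root):
    """Visit every node on one level before going deeper (uses a queue)."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(TREE.get(node, []))
    return order

print(depth_first(1))
print(breadth_first(1))
```

The only difference between the two functions is the data structure holding the nodes still to be explored: a stack gives depth first, a queue gives breadth first.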

Depth First Search and Breadth First Search • Advantages of Depth First Search: • Requires less memory, since essentially only the nodes on the current path are stored (breadth first stores every node on the current level). • May find a solution very quickly (e.g. if the solution is at the end of the first branch). • Advantages of Breadth First Search: • Will not get trapped exploring a blind alley (e.g. if the first branch goes on forever). • Guaranteed to find a solution if one exists, and if there are several solutions, the one needing the fewest steps is found first.

Branch & Bound • Used when we are searching for a solution and want the best one based on cost (cost can be anything: distance, time, etc.). • A branch can be ignored below a certain level if the cost of the path so far already exceeds the cost of the best solution found so far. • There are variants of this based on depth first and breadth first search.

Branch & Bound • e.g. looking for the shortest path between Gainesville and Key West. • [Figure: road map with a cost on each link, joining Gainesville, Jacksonville, Orlando, Tampa, Ft. Pierce, Miami and Key West]
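A depth-first variant of branch and bound might look like the sketch below. The road costs in `ROADS` are hypothetical: the link costs on the slide's map could not be recovered, so these numbers are assumptions chosen only to show the pruning step. The function name `branch_and_bound` is likewise illustrative.

```python
# Hypothetical road costs (assumed, not taken from the slide's map).
ROADS = {
    "Gainesville": {"Jacksonville": 1, "Orlando": 2, "Tampa": 5},
    "Jacksonville": {"Gainesville": 1, "Orlando": 4},
    "Orlando": {"Gainesville": 2, "Jacksonville": 4, "Tampa": 2, "Ft. Pierce": 3},
    "Tampa": {"Gainesville": 5, "Orlando": 2, "Miami": 4, "Key West": 5},
    "Ft. Pierce": {"Orlando": 3, "Miami": 3},
    "Miami": {"Tampa": 4, "Ft. Pierce": 3, "Key West": 3},
    "Key West": {"Tampa": 5, "Miami": 3},
}

def branch_and_bound(start, goal):
    """Depth-first search that abandons any path costing at least as
    much as the best complete solution found so far."""
    best_cost, best_path = float("inf"), None
    pruned = 0

    def expand(node, path, cost):
        nonlocal best_cost, best_path, pruned
        if cost >= best_cost:
            pruned += 1                 # bound exceeded: ignore this branch
            return
        if node == goal:
            best_cost, best_path = cost, path
            return
        for nxt, step in ROADS[node].items():
            if nxt not in path:         # do not revisit a town on this path
                expand(nxt, path + [nxt], cost + step)

    expand(start, [start], 0)
    return best_path, best_cost, pruned

path, cost, pruned = branch_and_bound("Gainesville", "Key West")
print(path, cost, "branches pruned:", pruned)
```

The first complete route found is usually not the cheapest; its cost simply becomes the bound that lets later, more expensive branches be cut early.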

Branch and Bound – step by step development of a search tree

Branch and Bound (cont.)

Branch and Bound (cont.)

Heuristic Search • Consider wanting to find the best route to Glasgow from Leicester. • The search space is too big to consider every possibility. • Construct a scoring function that estimates which paths/nodes are most promising. • Promising ones are explored first, i.e. try paths that seem to be getting nearer the goal state. • This uses an evaluation function, such as the ‘as the crow flies’ distance between the current town and the target town.

Hill Climbing • Imagine we’re at the hospital, heading for the church. • There is no through road from the park.

Hill Climbing • Using the ‘as the crow flies’ evaluation function as the heuristic, we would opt for the route to the shop rather than the route to the park. • The Hill Climbing algorithm is as follows: • Start with current-state = initial-state • Until current-state = goal-state OR there is no change in current-state DO • 1. Get the successors of current-state and use the evaluation function to assign a score to each successor. • 2. If one of the successors has a better score than current-state, then set the new current-state to be the successor with the best score.

Hill Climbing • This terminates when there are no successor states that are better than the current state. • Problem: it can reach a dead end (a local maximum). • If we were trying to get from the library to the university, hill climbing would take us from the library to the hospital and then to the park, which is a dead end: no new state would bring us nearer, so it stops.
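The loop above, and the way it gets stuck, can be sketched as follows. The street map and the straight-line distances are hypothetical: the slide's map is not reproduced here, so `NEIGHBOURS` and `CROW_FLIES` are assumed values chosen to match the library-to-university story. A lower score means nearer the goal.

```python
# Hypothetical map and straight-line distances to the university
# (assumed values; the original street map is not available).
NEIGHBOURS = {
    "library": ["hospital", "school"],
    "school": ["library"],
    "hospital": ["library", "park", "shop"],
    "park": ["hospital"],                  # no through road
    "shop": ["hospital", "university"],
    "university": ["shop"],
}
CROW_FLIES = {"library": 5, "school": 6, "hospital": 4,
              "park": 2, "shop": 3, "university": 0}

def hill_climb(start, goal, successors, score):
    """Move to the best-scoring successor until none improves."""
    current = start
    path = [current]
    while current != goal:
        best = min(successors[current], key=score)   # lowest distance wins
        if score(best) >= score(current):
            break                                    # no successor is better
        current = best
        path.append(current)
    return path

route = hill_climb("library", "university", NEIGHBOURS, CROW_FLIES.get)
print(route)
```

With these assumed distances the park looks closer to the university than the shop does, so the search walks into the dead end and stops there without ever trying the shop.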

Hill Climbing • See also the example below: if we’re aiming for the park from the library, we would go to the school, and then no new node improves on it, so the algorithm stops.

Best First Search • This is like Hill Climbing but is exhaustive and will eventually search all possible paths. It therefore avoids the local maximum problem of hill climbing. • To do this it keeps a list of all the nodes that are still to be explored. • Start with agenda (i.e. list of nodes) = initial state • While agenda is not empty do: • Remove the best node from the agenda • If it is a goal then return with success, otherwise find its successors • Assign the successor nodes a score using the evaluation function and add the scored nodes to the agenda

Best First Search • Consider the above tree. The number at each node is the estimated cost to the solution (so lowest is best). • Breadth first: A B C D E F G • Depth first: A B D E G • Hill climbing: A C, then gets stuck at a local maximum • Best first: A C B E G (this needs a good evaluation function to get the best results.)
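The agenda loop can be sketched with a priority queue. The tree shape and the estimates below are assumptions: the figure is not reproduced, so `CHILDREN` is one shape consistent with the four visiting orders listed above, and the numbers in `ESTIMATE` are made up so that C looks better than B but leads nowhere.

```python
import heapq

# Assumed tree and cost-to-goal estimates (lowest is best); inferred
# from the visiting orders, not copied from the original figure.
CHILDREN = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"], "E": ["G"]}
ESTIMATE = {"A": 6, "B": 3, "C": 2, "D": 4, "E": 1, "F": 5, "G": 0}

def best_first(start, goal):
    """Always expand the node on the agenda with the best estimate."""
    agenda = [(ESTIMATE[start], start)]
    visited = []
    while agenda:
        _, node = heapq.heappop(agenda)        # remove best node from agenda
        visited.append(node)
        if node == goal:
            return visited
        for child in CHILDREN.get(node, []):   # score and add the successors
            heapq.heappush(agenda, (ESTIMATE[child], child))
    return None                                # agenda empty: no goal found

print(best_first("A", "G"))
```

Because B stays on the agenda while C is explored, the search can fall back to it once F turns out to be worse, which is the step hill climbing cannot take. This sketch relies on the search space being a tree; a graph with loops would also need a record of nodes already expanded.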

A* Algorithm • Best first uses the estimated cost to the goal, but ignores the cost so far when choosing the node to search next. • A* attempts to minimise the total cost of the solution path. • Evaluation function = Cost (estimated to goal) + Cost (from start)

Comparison of A* and Best First • A* is guaranteed to find the shortest path provided the evaluation function is suitable. • ‘Suitable’ here means the cost-to-goal estimate must never overestimate the true remaining cost.

A* • Looking for the shortest path between Gainesville and Key West. • [Figure: road map with a cost on each link, joining Gainesville, Jacksonville, Orlando, Tampa, Ft. Pierce, Miami and Key West] • Underestimates of the cost to Key West: Gainesville 8, Jacksonville 9, Orlando 6, Tampa 4, Ft. Pierce 5, Miami 2, Key West 0
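A minimal A* sketch for this example follows. The underestimates are the ones given on the slide, but the road costs in `ROADS` are hypothetical: the link costs on the map could not be recovered, so they are assumptions chosen so that no underestimate exceeds the true remaining cost. The evaluation function is written in the common notation f = g + h, where g is the cost from the start and h is the estimated cost to the goal.

```python
import heapq

# Hypothetical road costs (assumed, not taken from the slide's map).
ROADS = {
    "Gainesville": {"Jacksonville": 1, "Orlando": 2, "Tampa": 5},
    "Jacksonville": {"Gainesville": 1, "Orlando": 4},
    "Orlando": {"Gainesville": 2, "Jacksonville": 4, "Tampa": 2, "Ft. Pierce": 3},
    "Tampa": {"Gainesville": 5, "Orlando": 2, "Miami": 4, "Key West": 5},
    "Ft. Pierce": {"Orlando": 3, "Miami": 3},
    "Miami": {"Tampa": 4, "Ft. Pierce": 3, "Key West": 3},
    "Key West": {"Tampa": 5, "Miami": 3},
}
# Underestimates of the cost to Key West, as listed on the slide.
UNDERESTIMATE = {"Gainesville": 8, "Jacksonville": 9, "Orlando": 6,
                 "Tampa": 4, "Ft. Pierce": 5, "Miami": 2, "Key West": 0}

def a_star(start, goal):
    """Expand the node with the lowest f = g (cost so far) + h (estimate)."""
    agenda = [(UNDERESTIMATE[start], 0, start, [start])]
    best_g = {start: 0}                  # cheapest known cost to each town
    while agenda:
        f, g, node, path = heapq.heappop(agenda)
        if node == goal:
            return path, g
        for nxt, step in ROADS[node].items():
            new_g = g + step
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(
                    agenda,
                    (new_g + UNDERESTIMATE[nxt], new_g, nxt, path + [nxt]))
    return None

print(a_star("Gainesville", "Key West"))
```

Unlike best first, a town reached cheaply but with a poor estimate can still beat a town with a good estimate reached expensively, because both terms are added before nodes are compared.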

A* - Solution

A* - solution
