Intro to Planning

Or, how to represent the planning problem in logic

The Planning Problem
Input:
• An “initial state”
• A “goal state”
• A set of actions, each of which can take you from one state to another
Output: A sequence of actions that, when executed in order starting in the initial state, is guaranteed to reach the goal state

Sound Familiar?

Graph Traversal as a Planning Problem
• The “initial state” is the start node in the graph.
• The “goal state” is the goal node in the graph.
• Each “action” is a traversal of one of the edges in the graph, which takes you from one state (a node in the graph) to another state (another node in the graph).
The output is a sequence of actions (edges) that takes an agent from the start state to the goal state.
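The mapping above can be sketched in a few lines of Python. This is only an illustration: the floor plan and the helper name `plan_on_graph` are made up for the example, and breadth-first search stands in for whichever graph search algorithm you prefer.

```python
from collections import deque

def plan_on_graph(graph, start, goal):
    """Breadth-first search. Returns the plan as a list of edges (u, v), or None."""
    frontier = deque([start])
    parent = {start: None}          # also serves as the visited set
    while frontier:
        node = frontier.popleft()
        if node == goal:
            edges = []
            while parent[node] is not None:   # walk back to the start node
                edges.append((parent[node], node))
                node = parent[node]
            return edges[::-1]
        for nxt in graph.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                frontier.append(nxt)
    return None

# Hypothetical building layout (not from the slides).
graph = {
    "entrance": ["lobby"],
    "lobby": ["stairs", "elevator"],
    "stairs": ["hall3"],
    "elevator": ["hall3"],
    "hall3": ["room305"],
}
plan = plan_on_graph(graph, "entrance", "room305")
print(plan)
```

Each edge in the returned list is one “action”, and the nodes it connects are the states before and after that action.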

Problems with Graphs as Representations
The algorithms we used for search in graphs work great: they are efficient, and they find optimal paths. However, some planning problems are difficult to represent as graphs. For example,
1. Uncertainty: the agent may not be omniscient (all-knowing), so it doesn’t know the whole graph at each time step. We’ve talked about some sources of uncertainty before:
• Partial observability (the agent doesn’t perceive the world fully or accurately)
• Stochasticity (actions can have multiple outcomes)
• Multi-agent (other intelligent agents operate in the environment)
• Dynamism (the world changes over time, without the agent doing anything)
• Computational limits, ignorance, laziness, storage limits, etc.
Consider the problem of planning a traversal of Tuttleman Hall to get from the entrance to room 305. If you didn’t know the building, you’d need to include actions for looking around the hall for the right room number, or for determining whether there are stairs or elevators, and where they are. Your subsequent actions would depend on the outcomes of these actions, so you can’t represent them in the graph at the beginning (partial observability).
Even if you know the building well, you still can’t plan your whole route from the beginning, since you don’t know if people will be in the way (multi-agent/stochasticity).

Problems with Graphs as Representations
2. Complexity: the complete graph might be enormous (or infinite), so it’s unrealistic to assume that the whole thing is given as an input.
For example, consider the problem of corralling 100 sheep (s1 through s100) into 10 pens (p1-p10). All sheep start in an open field (F). The objective is to get s1-s10 into p1, s11-s20 into p2, etc. The allowed actions are moving one sheep from one location (F or p1-p10) to another location.
Quiz: If we represent this as a graph, how many total nodes would there be? How many total edges?

Answer: Problems with Graphs as Representations
The number of nodes: A node represents a position for each of the 100 sheep. There are 11 possible places for s1, 11 for s2, 11 for s3, …, and 11 for s100. So there are 11 × 11 × … × 11 (100 times) = 11^100 ≈ 1.4 × 10^104 nodes, which is more than a googol (10^100).
The number of edges: For every node, there are 100 possible sheep to move, and 11 − 1 = 10 possible places to move it to, so 1000 edges per node. So there are a total of 11^100 × 1000 ≈ 1.4 × 10^107 edges (counting each move as its own directed edge).
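The arithmetic is easy to check with Python’s arbitrary-precision integers. The variable names below are just for this sketch.

```python
import math

locations = 11   # the field F plus pens p1..p10
sheep = 100

nodes = locations ** sheep                  # one location choice per sheep
edges_per_node = sheep * (locations - 1)    # pick a sheep, move it somewhere else
edges = nodes * edges_per_node              # each move counted as a directed edge

print(f"nodes ~ 10^{math.log10(nodes):.2f}")
print(f"edges ~ 10^{math.log10(edges):.2f}")
print("more than a googol:", nodes > 10 ** 100)
```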

Planning Generalizes Graph Search
Planning lets us consider problems with more complexity and uncertainty than graph search. The main difference is that the input includes “states” and “actions” rather than nodes and edges. In very simple cases, these are the same thing, but not always.
Note: The main difference is in representation, rather than inference or learning.

Handling Complexity with Better Representations
We’ll start by talking about representations that don’t suffer (as much) from combinatorial explosions. Later, we’ll talk about handling partial observability, stochasticity, and other causes of uncertainty.

Example Planning Problem
Initial state: the sheep are in the field, as is the robot.
Goal: get the sheep into the corral.
Actions:
• L: fly left, from corral to field.
• R: fly right, from field to corral.
• G: grab a sheep.
• U: ungrab, or let go of, a sheep.

Quiz: Planning Problem
(Figure: the initial state and the goal state.)
Which of the following is a plan? And which of the plans actually achieves the goal, starting from the initial state?
• [L, L, L]
• [U, G, U, G, M, K, Z]
• [G, R, U]
• [L, G, R, U, L, G, R, U, L, R]
• [G, R, U, L, G, R, U, L]

Answers: Planning Problem
(Figure: the initial state and the goal state.)
Which of the following is a plan? And which of the plans actually achieves the goal, starting from the initial state?
• [L, L, L]: plan, unsuccessful
• [U, G, U, G, M, K, Z]: not a plan (M, K, and Z are not actions in this planning problem)
• [G, R, U]: plan, unsuccessful
• [L, G, R, U, L, G, R, U, L, R]: plan, successful
• [G, R, U, L, G, R, U, L]: plan, unsuccessful (the robot ends in the wrong spot)

Quiz: Describe States in Logic
Using the following boolean variables, come up with PL formulas to describe the initial state and goal state:
• Robot_has_sheep_1
• Robot_has_sheep_2
• Robot_in_field
• Sheep_1_in_field
• Sheep_2_in_field

Answer: Describe States in Logic
Initial: Robot_in_field ∧ Sheep_1_in_field ∧ Sheep_2_in_field
Goal: ¬Robot_in_field ∧ ¬Sheep_1_in_field ∧ ¬Sheep_2_in_field

Generalizing with PL
Suppose we don’t actually care where the robot ends up, just that the sheep are in the corral. We can describe this goal just by removing the variable Robot_in_field from the goal description.
New Goal: ¬Sheep_1_in_field ∧ ¬Sheep_2_in_field
So long as Sheep_1_in_field and Sheep_2_in_field are both false, any assignment of T or F to Robot_in_field will make the goal formula true.
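To see that one PL goal formula can stand for many possible worlds, here is a small sketch that enumerates every truth assignment and keeps the ones satisfying the new goal, taken here to be that both sheep variables are false. The function name `new_goal` is just for illustration.

```python
from itertools import product

VARIABLES = [
    "Robot_has_sheep_1", "Robot_has_sheep_2",
    "Robot_in_field", "Sheep_1_in_field", "Sheep_2_in_field",
]

def new_goal(world):
    # not Sheep_1_in_field AND not Sheep_2_in_field; Robot_in_field is unconstrained
    return not world["Sheep_1_in_field"] and not world["Sheep_2_in_field"]

# Every possible world is one assignment of True/False to the five variables.
worlds = [dict(zip(VARIABLES, values))
          for values in product([False, True], repeat=len(VARIABLES))]
goal_worlds = [w for w in worlds if new_goal(w)]

print(len(worlds), "possible worlds,", len(goal_worlds), "satisfy the goal")
print({w["Robot_in_field"] for w in goal_worlds})   # both truth values appear
```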

Quiz: Describe States in FOL
Using the following constants and relations, write FOL sentences to describe the initial and goal states.
Constants:
• B (robot)
• S1, S2 (sheep)
• F (field)
• C (corral)
Relations:
• Sheep(x) – true if x is a sheep
• Holding(x, y) – true if x is holding y
• At(x, y) – true if x is at location y

Answer: Describe States in FOL
Initial state: At(S1, F) ∧ At(S2, F) ∧ At(B, F)
Goal state: At(S1, C) ∧ At(S2, C)

Quiz: Generalizing with FOL
Like PL, FOL lets us describe goal states that include multiple possible worlds. Unlike PL, it also has convenient ways of generalizing further.
Suppose there were 100 sheep instead of 2. Write an FOL statement that describes the goal that all of the sheep are in the corral.

Answer: Generalizing with FOL
Answer: ∀s. Sheep(s) ⇒ At(s, C)
This formula succinctly captures the goal state, regardless of how many sheep are involved.
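Over a finite set of constants, the quantified goal can be checked with a single `all(...)`. The sketch below assumes a domain of 100 sheep constants plus B, F, and C, and stores the At relation as a set of pairs; these representation choices are illustrative.

```python
sheep = {f"S{i}" for i in range(1, 101)}    # Sheep(s) holds exactly for these constants
domain = sheep | {"B", "F", "C"}

def goal_holds(at):
    """Check 'for all s: Sheep(s) implies At(s, C)' over the finite domain."""
    return all((s not in sheep) or ((s, "C") in at) for s in domain)

all_in_corral = {(s, "C") for s in sheep} | {("B", "F")}
one_left_behind = (all_in_corral - {("S42", "C")}) | {("S42", "F")}

print(goal_holds(all_in_corral))     # True
print(goal_holds(one_left_behind))   # False
```

The check itself never mentions how many sheep there are, which is the point of the quantifier.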

Describing Actions
We’ve talked a bunch about how to represent the start and goal states. What about actions? Let’s go over two commonly used approaches.

STRIPS Actions
STRIPS is a language for representing the meaning of actions. Here are some examples:
Move Left:
  Pre: At(B, C)
  Eff: At(B, F) ∧ ¬At(B, C)
Ungrab(x, y):
  Pre: Holding(B, x) ∧ At(B, y)
  Eff: At(x, y) ∧ ¬Holding(B, x)
Each action has a list of arguments, a description of preconditions (what must be true before the action can take place), and a list of effects (what is true after the action takes place). Notice that the effects include things that become true, and things that become false.
Preconditions and effects CANNOT use quantifiers (in STRIPS).

Quiz: STRIPS Actions
Write STRIPS action descriptions for the Move Right and Grab actions.

Answer: STRIPS Actions
Write STRIPS action descriptions for the Move Right and Grab actions. Make sure that the robot can’t grab something if it’s already holding something.
Move Right:
  Pre: At(B, F)
  Eff: At(B, C) ∧ ¬At(B, F)
Grab(x, y):
  Pre: ¬Holding(B, S1) ∧ ¬Holding(B, S2) ∧ At(B, y) ∧ At(x, y)
  Eff: ¬At(x, y) ∧ Holding(B, x)
Note: You need to modify At(sheep, location), either in the Grab/Ungrab actions’ effects, or in the Move Right/Move Left actions’ effects. My version here modifies them in the Grab/Ungrab actions.
Note 2: If you want to avoid adding a conjunct to the precondition of Grab for each sheep in the world, you can create a new boolean variable called handsFull. The precondition for Grab would require this to be false, and the effects would make it true. The precondition for Ungrab would require handsFull to be true, and the effects would make it false. The only other change is that the initial condition would need to specify the value of handsFull (false, since the robot starts out holding nothing).
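Here is one way the handsFull variant from Note 2 might look in code. This is a sketch under assumptions, not a standard STRIPS library: the `Action` class, the split into positive/negative preconditions and add/delete sets, and the closed-world convention (a fact missing from the state is false, so handsFull starts out false by being absent) are all choices made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    pre_true: frozenset    # facts that must hold
    pre_false: frozenset   # facts that must not hold
    add: frozenset         # facts that become true
    delete: frozenset      # facts that become false

    def applicable(self, state):
        return self.pre_true <= state and not (self.pre_false & state)

    def apply(self, state):
        return (state - self.delete) | self.add

HANDS_FULL = ("HandsFull",)

def grab(x, y):
    return Action(f"Grab({x},{y})",
                  pre_true=frozenset({("At", "B", y), ("At", x, y)}),
                  pre_false=frozenset({HANDS_FULL}),
                  add=frozenset({("Holding", "B", x), HANDS_FULL}),
                  delete=frozenset({("At", x, y)}))

def ungrab(x, y):
    return Action(f"Ungrab({x},{y})",
                  pre_true=frozenset({("Holding", "B", x), ("At", "B", y), HANDS_FULL}),
                  pre_false=frozenset(),
                  add=frozenset({("At", x, y)}),
                  delete=frozenset({("Holding", "B", x), HANDS_FULL}))

state0 = frozenset({("At", "S1", "F"), ("At", "S2", "F"), ("At", "B", "F")})
state1 = grab("S1", "F").apply(state0)

print(grab("S1", "F").applicable(state0))   # hands are empty, sheep and robot share a location
print(grab("S2", "F").applicable(state1))   # blocked: handsFull is now true
```

With handsFull, the precondition of Grab stays the same size no matter how many sheep exist.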

Quiz: State Changes with STRIPS
Initial state: At(S1, F) ∧ At(S2, F) ∧ At(B, F)
Given the initial state above, describe the state of the world after each of the following actions takes place, in order:
• G(S1, F)
• R
• U(S1, C)
• L
• U(S1, F)

Answer: State Changes with STRIPS
Initial state: At(S1, F) ∧ At(S2, F) ∧ At(B, F)
Each action adds its positive effects to the state and removes the facts its negative effects make false:
• After G(S1, F): At(S2, F) ∧ At(B, F) ∧ Holding(B, S1) (At(S1, F) is removed)
• After R: At(S2, F) ∧ Holding(B, S1) ∧ At(B, C) (At(B, F) is removed)
• After U(S1, C): At(S2, F) ∧ At(B, C) ∧ At(S1, C) (Holding(B, S1) is removed)
• After L: At(S2, F) ∧ At(S1, C) ∧ At(B, F) (At(B, C) is removed)
• After U(S1, F): The precondition Holding(B, S1) isn’t met, so this action can’t be taken in the current state.
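The same trace can be reproduced mechanically. In this sketch, states are sets of facts and each action is a tuple (positive preconditions, negative preconditions, add set, delete set). That encoding, and the names `step` and `trace`, are assumptions for illustration; the action definitions follow the STRIPS descriptions of Grab, Ungrab, Move Right, and Move Left.

```python
SHEEP = ["S1", "S2"]

def at(x, y):
    return ("At", x, y)

def holding(x):
    return ("Holding", "B", x)

def G(x, y):   # Grab
    return ({at("B", y), at(x, y)}, {holding(s) for s in SHEEP}, {holding(x)}, {at(x, y)})

def U(x, y):   # Ungrab
    return ({holding(x), at("B", y)}, set(), {at(x, y)}, {holding(x)})

R = ({at("B", "F")}, set(), {at("B", "C")}, {at("B", "F")})   # Move Right
L = ({at("B", "C")}, set(), {at("B", "F")}, {at("B", "C")})   # Move Left

def step(state, action):
    """Return the successor state, or None if the preconditions are not met."""
    pre_true, pre_false, add, delete = action
    if not (pre_true <= state) or (pre_false & state):
        return None
    return (state - delete) | add

state = {at("S1", "F"), at("S2", "F"), at("B", "F")}
trace = []
for name, action in [("G(S1,F)", G("S1", "F")), ("R", R), ("U(S1,C)", U("S1", "C")),
                     ("L", L), ("U(S1,F)", U("S1", "F"))]:
    nxt = step(state, action)
    trace.append((name, nxt))
    if nxt is None:
        print(name, "-> preconditions not met")
        break
    state = nxt
    print(name, "->", sorted(state))
```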

Search Strategies for Finding a Plan
Strategy 1: Forward (or progression) search
• Keep a priority queue of states (each described by FOL or PL).
• When it’s time to explore a node, apply all actions whose preconditions are met, and add the resulting states to the priority queue.
• Stop when a state is taken from the queue that matches the goal state.
(Figure: search tree growing from the initial state toward the goal, with branches G(s1), R, and G(s2).)

Search Strategies for Finding a Plan
Strategy 1: Forward (or progression) search
Notice: this algorithm is very similar to our graph search algorithms, but it doesn’t require the complete graph as input.
Also notice: I haven’t (yet) specified how to compute the priorities for the priority queue. But you can use cost (e.g., number of actions), or heuristics, or a combination of the two.
(Figure: search tree growing from the initial state toward the goal, with branches G(s2), G(s1), and R.)
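A minimal sketch of forward search on the two-sheep problem, using the number of actions so far as the priority (so it behaves like uniform-cost search). The tuple encoding of actions and the helper names are assumptions for this example; states are generated on demand, so no complete graph is ever built.

```python
import heapq
from itertools import count

SHEEP, PLACES = ["S1", "S2"], ["F", "C"]

def ground_actions():
    """name -> (positive preconditions, negative preconditions, add set, delete set)"""
    acts = {
        "R": ({("At", "B", "F")}, set(), {("At", "B", "C")}, {("At", "B", "F")}),
        "L": ({("At", "B", "C")}, set(), {("At", "B", "F")}, {("At", "B", "C")}),
    }
    for s in SHEEP:
        for p in PLACES:
            acts[f"G({s},{p})"] = ({("At", "B", p), ("At", s, p)},
                                   {("Holding", "B", t) for t in SHEEP},
                                   {("Holding", "B", s)}, {("At", s, p)})
            acts[f"U({s},{p})"] = ({("Holding", "B", s), ("At", "B", p)}, set(),
                                   {("At", s, p)}, {("Holding", "B", s)})
    return acts

def forward_search(initial, goal, actions):
    tie = count()   # tie-breaker so the heap never has to compare states
    frontier = [(0, next(tie), frozenset(initial), [])]
    seen = set()
    while frontier:
        cost, _, state, plan = heapq.heappop(frontier)
        if goal <= state:                      # every goal fact holds
            return plan
        if state in seen:
            continue
        seen.add(state)
        for name, (pre_t, pre_f, add, delete) in actions.items():
            if pre_t <= state and not (pre_f & state):
                nxt = frozenset((state - delete) | add)
                if nxt not in seen:
                    heapq.heappush(frontier, (cost + 1, next(tie), nxt, plan + [name]))
    return None

initial = {("At", "S1", "F"), ("At", "S2", "F"), ("At", "B", "F")}
goal = {("At", "S1", "C"), ("At", "S2", "C")}
plan = forward_search(initial, goal, ground_actions())
print(plan)
```

Swapping the priority `cost + 1` for cost plus a heuristic estimate would turn this into an A*-style search.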

Search Strategies for Finding a Plan
Strategy 2: Backward (or regression) search
• Start by adding the goal state to the priority queue, instead of the initial state.
• At each iteration, find all actions whose effects match the current node, and add the previous states (before the action) to the queue.
• Stop when you get a node that matches the initial state.
(Figure: search tree growing backward from the goal toward the initial state, with branches U(s2) and R.)

Search Strategies for Finding a Plan
Strategy 2: Backward (or regression) search
Note: this is basically the same as forward search, but there are cases when it’s a lot more efficient. Consider the case of 1000 sheep, where the goal is to get s457 into the corral. Forward search has 1001 possible actions to consider in the initial state (grab any of the 1000 sheep, or move right), while backward search only has to consider a small number.
(Figure: search tree growing backward from the goal toward the initial state, with branches U(s2) and R.)
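The 1000-sheep comparison can be checked directly. The sketch below counts the actions applicable in the initial state (forward) and the actions relevant to the goal (backward), where “relevant” is taken to mean that the action adds some goal fact and deletes none. That definition, the tuple encoding, and the regression formula in the last lines are assumptions of this sketch.

```python
SHEEP = [f"S{i}" for i in range(1, 1001)]
# Shared negative precondition of Grab: the robot holds no sheep.
NOT_HOLDING = frozenset(("Holding", "B", s) for s in SHEEP)

def at(x, y):
    return ("At", x, y)

# name -> (positive preconditions, negative preconditions, add set, delete set)
actions = {
    "R": ({at("B", "F")}, frozenset(), {at("B", "C")}, {at("B", "F")}),
    "L": ({at("B", "C")}, frozenset(), {at("B", "F")}, {at("B", "C")}),
}
for s in SHEEP:
    for p in ("F", "C"):
        actions[f"G({s},{p})"] = ({at("B", p), at(s, p)}, NOT_HOLDING,
                                  {("Holding", "B", s)}, {at(s, p)})
        actions[f"U({s},{p})"] = ({("Holding", "B", s), at("B", p)}, frozenset(),
                                  {at(s, p)}, {("Holding", "B", s)})

initial = {at(s, "F") for s in SHEEP} | {at("B", "F")}
goal = {at("S457", "C")}

# Forward: every action whose preconditions hold in the initial state.
forward = [n for n, (pt, pf, add, dele) in actions.items()
           if pt <= initial and not (pf & initial)]
# Backward: every action that achieves part of the goal without undoing any of it.
backward = [n for n, (pt, pf, add, dele) in actions.items()
            if (add & goal) and not (dele & goal)]

print(len(forward), "applicable actions going forward")
print(backward, "relevant going backward")

# Regressing the goal through U(S457, C) gives the subgoal to achieve before it.
pt, pf, add, dele = actions["U(S457,C)"]
subgoal = (goal - add) | pt
print(sorted(subgoal))
```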

Heuristics for Planning
A popular strategy is to automatically generate heuristics for a planning problem from the descriptions of the actions. Here’s the general idea:
1. Create a relaxed planning problem by simplifying all of the actions.
2. For each node, use a depth-first or breadth-first search to solve the relaxed planning problem.
3. Use the path cost for the plan from the relaxed problem as the heuristic value for the node in the full planning problem.
To make this work out, we need to make sure that the relaxed planning problem is much, much easier to solve than the original planning problem, since we need to solve the relaxed planning problem many times (each time we explore a node).

Heuristics for Planning
Here’s an example of a strategy for generating a relaxed planning problem from STRIPS action descriptions. Start with your existing actions, e.g.:
Grab(x, y):
  Pre: ¬Holding(B, S1) ∧ ¬Holding(B, S2) ∧ At(B, y) ∧ At(x, y)
  Eff: ¬At(x, y) ∧ Holding(B, x)
Start removing preconditions, to get relaxed action descriptions for a strictly easier planning problem:
Grab(x, y):
  Pre: At(B, y) ∧ At(x, y)
  Eff: ¬At(x, y) ∧ Holding(B, x)
In this version, the robot can hold as many sheep as it likes.

Heuristics for Planning
Alternatively, or in addition, you can remove negative effects. For example, removing all of Grab’s preconditions and its negative effect gives:
Grab(x):
  Pre: (none)
  Eff: Holding(B, x)
In this version, the robot can hold as many sheep as it wants, and it doesn’t have to be in the same location as the sheep.
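Here is a sketch of the whole recipe using the most aggressive relaxation described above: every action loses all of its preconditions and all of its negative effects, so only its positive effects remain. Breadth-first search on the relaxed problem then gives the heuristic value. The tuple encoding and the names `relax` and `relaxed_plan_length` are assumptions for this example.

```python
from collections import deque

SHEEP, PLACES = ["S1", "S2"], ["F", "C"]

# Full problem: name -> (positive preconditions, negative preconditions, add set, delete set)
actions = {
    "R": ({("At", "B", "F")}, set(), {("At", "B", "C")}, {("At", "B", "F")}),
    "L": ({("At", "B", "C")}, set(), {("At", "B", "F")}, {("At", "B", "C")}),
}
for s in SHEEP:
    for p in PLACES:
        actions[f"G({s},{p})"] = ({("At", "B", p), ("At", s, p)},
                                  {("Holding", "B", t) for t in SHEEP},
                                  {("Holding", "B", s)}, {("At", s, p)})
        actions[f"U({s},{p})"] = ({("Holding", "B", s), ("At", "B", p)}, set(),
                                  {("At", s, p)}, {("Holding", "B", s)})

def relax(actions):
    """Drop all preconditions and negative effects; keep only what each action adds."""
    return {name: frozenset(add) for name, (_pt, _pf, add, _del) in actions.items()}

def relaxed_plan_length(state, goal, relaxed):
    """Breadth-first search in the relaxed problem; the plan length is the heuristic."""
    start = frozenset(state)
    frontier = deque([(start, 0)])
    seen = {start}
    while frontier:
        s, depth = frontier.popleft()
        if goal <= s:
            return depth
        for add in relaxed.values():
            nxt = s | add            # facts only accumulate in the relaxed problem
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return float("inf")

relaxed = relax(actions)
goal = {("At", "S1", "C"), ("At", "S2", "C")}
initial = {("At", "S1", "F"), ("At", "S2", "F"), ("At", "B", "F")}
halfway = {("At", "S1", "C"), ("At", "S2", "F"), ("At", "B", "F")}

print(relaxed_plan_length(initial, goal, relaxed))   # two relaxed Ungrab actions suffice
print(relaxed_plan_length(halfway, goal, relaxed))
```

Because the relaxed problem only removes restrictions, its plan length never exceeds the true plan length, so the estimate does not overestimate the remaining cost. Here the estimate for the initial state is 2, compared with 7 actions in the full problem.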
