CISC453 Winter 2010

Presentation Transcript

  1. CISC453 Winter 2010 • Planning & Acting in the Real World (AIMA 3e, Ch 11) • Time & Resources • Hierarchical Techniques • Relaxing Environmental Assumptions

  2. Overview • extending planning language & algorithms • 1. allow actions that have durations & resource constraints • yields a new "scheduling problem" paradigm • incorporating action durations & timing, required resources • 2. hierarchical planning techniques • control the complexity of large scale plans by hierarchical structuring of actions • 3. uncertain environments • non-deterministic domains • 4. multiagent environments

  3. Scheduling versus Planning • recall from classical planning (Ch 10) • PDDL representations only allowed us to decide the relative ordering among planning actions • up till now we've concentrated on what actions to do, given their PRECONDs & EFFECTs • in the real world, other properties must be considered • actions occur at particular moments in time, have a beginning and an end, occupy or require a certain amount of time • for a new category of Scheduling Problems we need to consider the absolute times when an event or action will occur & the durations of the events or actions • typically these are solved in 2 phases: planning then scheduling • a planning phase selects actions, respecting ordering constraints • this might be done by a human expert, and automated planners are suitable if they yield minimal ordering constraints • then a scheduling phase incorporates temporal information so that the result meets resource & deadline constraints

  4. Time, Schedules & Resources • the Job-Shop Scheduling (JSS) paradigm includes • the requirement to complete a set of jobs • each job consists of a sequence of actions with ordering constraints • each action • has a given duration and may also require some resources • resource constraints indicate the type of resource, how many units of it are required, and whether the resource is consumed in the action or is reusable • the goal is to determine a schedule • one that minimizes the total time required to complete all jobs (the makespan) • while respecting resource requirements & constraints

  5. Job-Shop Scheduling Problem (JSSP) • JSSP involves a list of jobs to do • where a job is a fixed sequence of actions • actions have quantitative time durations & ordering constraints • actions use resources (which may be shared among jobs) • to solve the JSSP: find a schedule that • determines a start time for each action • 1. that obeys all hard constraints • e.g. no temporal overlap between mutex actions (those using the same one-action-at-a-time resource) • 2. for our purposes, we'll operationalize cost as the total time to perform all actions and jobs • note that the cost function could be more complex (it could include the resources used, time delays incurred, ...) • our example: automobile assembly scheduling • the jobs: assemble two cars • each job has 3 actions: add the engine, add the wheels, inspect the whole car • a resource constraint is that we do the engine & wheel actions at a special one-car-only work station

  6. Ex: Car Construction Scheduling • the job shop scheduling problem of assembling 2 cars • includes required times & resource constraints • notation: A < B indicates action A must precede action B
     Jobs({AddEngine1 < AddWheels1 < Inspect1}, {AddEngine2 < AddWheels2 < Inspect2})
     Resources(EngineHoists(1), WheelStations(1), Inspectors(2), LugNuts(500))
     Action(AddEngine1, DURATION: 30, USE: EngineHoists(1))
     Action(AddEngine2, DURATION: 60, USE: EngineHoists(1))
     Action(AddWheels1, DURATION: 30, CONSUME: LugNuts(20), USE: WheelStations(1))
     Action(AddWheels2, DURATION: 15, CONSUME: LugNuts(20), USE: WheelStations(1))
     Action(Inspect_i, DURATION: 10, USE: Inspectors(1))

  7. Car Construction Scheduling • note that the action schemas • list resources as numerical quantities, not named entities • so Inspectors(2), rather than Inspector(I1) & Inspector(I2) • this process of aggregation is a general one • it groups objects that are indistinguishable with respect to the current purpose • this can help reduce complexity of the solution • for example, a candidate schedule that requires (concurrently) more than the number of aggregated resources can be rejected without having to exhaustively try assignments of individuals to actions

  8. Planning + Scheduling for JSSP • Planning + Scheduling for Job-Shop Problems • scheduling differs from the standard planning problem • it considers when an action starts and when it ends • so in addition to order (planning), duration is also considered • we begin by ignoring the resource constraints, solving the temporal domain issues to minimize the makespan • this requires finding the earliest start times for all actions consistent with the problem's ordering constraints • we create a partially-ordered plan, representing ordering constraints in a directed graph of actions • then we apply the critical path method to determine the start and end times for each action

  9. Graph of POP + Critical Path • the critical path is the path with longest total duration • it is "critical" in that it sets the duration for the whole plan and delaying the start of any action on it extends the whole plan • it is the sequence of actions, each of which has no slack • each must begin at a particular time, otherwise the whole plan is delayed • actions off the critical path have a window of time given by the earliest possible start time ES & the latest possible start time LS • the illustrated solution assumes no resource constraints • note that the 2 engines are being added simultaneously • the figure shows [ES, LS] for each action, & slack is LS - ES • the time required is indicated below the action name & bold links mark the critical path

  10. JSSP: (1) Temporal Constraints • schedule for the problem • is given by ES & LS times for all actions • note the 15 minutes slack for each action in the top job, versus 0 (by definition) in the critical path job • formulas for ES & LS also outline a dynamic-programming algorithm for computing them • A, B are actions, A < B indicates A must come before B
     $ES(Start) = 0$
     $ES(B) = \max_{A < B} \left[ ES(A) + Duration(A) \right]$
     $LS(Finish) = ES(Finish)$
     $LS(A) = \min_{B > A} LS(B) - Duration(A)$
     • complexity is $O(Nb)$ where N is the number of actions and b is the maximum branching factor into or out of an action • so without resource constraints, given a partial ordering of actions, finding the minimum duration schedule is (a pleasant surprise!) computationally easy
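The ES and LS recurrences above can be traced with a short Python sketch. It is only an illustration under stated assumptions: the durations and ordering constraints come from the car-assembly slide, while the dummy zero-length `Start` and `Finish` actions, the function name `critical_path`, and the use of Kahn's algorithm for the topological order are choices made for this sketch.

```python
from collections import defaultdict

# Durations (minutes) from the car-assembly example.
# "Start" and "Finish" are zero-length dummy actions assumed for this sketch.
DURATION = {
    "Start": 0, "AddEngine1": 30, "AddWheels1": 30, "Inspect1": 10,
    "AddEngine2": 60, "AddWheels2": 15, "Inspect2": 10, "Finish": 0,
}
# Ordering constraints (A, B), meaning A < B.
ORDER = [
    ("Start", "AddEngine1"), ("AddEngine1", "AddWheels1"),
    ("AddWheels1", "Inspect1"), ("Inspect1", "Finish"),
    ("Start", "AddEngine2"), ("AddEngine2", "AddWheels2"),
    ("AddWheels2", "Inspect2"), ("Inspect2", "Finish"),
]


def critical_path(duration, order):
    """Return (ES, LS) dictionaries for a partially ordered plan."""
    preds, succs = defaultdict(list), defaultdict(list)
    for a, b in order:
        preds[b].append(a)
        succs[a].append(b)

    # Topological order (Kahn's algorithm): each action is visited
    # only after all of its predecessors.
    indegree = {a: len(preds[a]) for a in duration}
    ready = [a for a in duration if indegree[a] == 0]
    topo = []
    while ready:
        a = ready.pop()
        topo.append(a)
        for b in succs[a]:
            indegree[b] -= 1
            if indegree[b] == 0:
                ready.append(b)

    # Forward pass: ES(B) = max over A < B of ES(A) + Duration(A).
    es = {}
    for b in topo:
        es[b] = max((es[a] + duration[a] for a in preds[b]), default=0)

    # Backward pass: LS(Finish) = ES(Finish);
    # LS(A) = min over B > A of LS(B) - Duration(A).
    ls = {}
    for a in reversed(topo):
        if succs[a]:
            ls[a] = min(ls[b] for b in succs[a]) - duration[a]
        else:
            ls[a] = es[a]
    return es, ls


ES, LS = critical_path(DURATION, ORDER)
SLACK = {a: LS[a] - ES[a] for a in DURATION}
for action in DURATION:
    print(f"{action:12s} ES={ES[action]:3d} LS={LS[action]:3d} slack={SLACK[action]}")
```

Each action is touched once per incoming and outgoing link, which is where the O(Nb) bound comes from. The sketch assumes a single final action with no successors.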

  11. JSSP: (1) Temporal Constraints • timeline for the solution • grey rectangles give intervals for actions • empty portions show slack

  12. Solution from POP + Critical Path • 1. the partially-ordered plan (above) • 2. the schedule from the critical-path method (below) • notice that this solution still omits resource constraints • for example, the 2 engines are being added simultaneously

  13. Scheduling with Resources • including resource constraints • critical path calculations involve conjunctions of linear inequalities over action start & end times • they become more complicated when resource constraints are included (for example, each AddEngine action requires the 1 EngineHoist, so they cannot overlap) • they introduce disjunctions of linear inequalities for possible orderings & as a result, complexity becomes NP-hard!! • here's a solution accounting for resource constraints • reusable resources are in the left column, actions align with resources • this shortest solution schedule requires 115 minutes

  14. Scheduling with Resources • including resource constraints • notice • that the shortest solution is 30 minutes longer than the critical path without resource constraints • that multiple inspector resource units are not needed for this job, indicating the possibility for reallocation of this resource • that the "critical path" now is: AddEngine1, AddEngine2, AddWheels2, Inspect2. • the remaining actions have considerable slack time, they can begin much later without affecting the total plan time
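To make the effect of the resource constraints concrete, here is a hedged Python sketch that checks a candidate schedule against ordering and reusable-resource constraints and reports its makespan. The durations, resources, and capacities come from the car-assembly slide; the helper names (`violations`, `makespan`) and the specific start times in `resource_aware` are assumptions chosen to be consistent with the 115-minute solution and the critical path listed above.

```python
DURATION = {
    "AddEngine1": 30, "AddWheels1": 30, "Inspect1": 10,
    "AddEngine2": 60, "AddWheels2": 15, "Inspect2": 10,
}
JOBS = [
    ["AddEngine1", "AddWheels1", "Inspect1"],
    ["AddEngine2", "AddWheels2", "Inspect2"],
]
# Reusable resource needed by each action (one unit each) and its capacity.
USES = {
    "AddEngine1": "EngineHoists", "AddEngine2": "EngineHoists",
    "AddWheels1": "WheelStations", "AddWheels2": "WheelStations",
    "Inspect1": "Inspectors", "Inspect2": "Inspectors",
}
CAPACITY = {"EngineHoists": 1, "WheelStations": 1, "Inspectors": 2}


def violations(start):
    """List the ordering and capacity constraints broken by a schedule."""
    problems = []
    # Ordering: within a job, each action must finish before the next starts.
    for job in JOBS:
        for a, b in zip(job, job[1:]):
            if start[a] + DURATION[a] > start[b]:
                problems.append(f"{b} starts before {a} ends")
    # Capacity: concurrent use can only rise at a start time, so it is
    # enough to count the users active at each start time.
    for resource, capacity in CAPACITY.items():
        users = [a for a in start if USES[a] == resource]
        for a in users:
            t = start[a]
            busy = sum(1 for u in users if start[u] <= t < start[u] + DURATION[u])
            if busy > capacity:
                problems.append(f"{resource} over capacity at t={t}")
    return problems


def makespan(start):
    return max(start[a] + DURATION[a] for a in start)


# Earliest start times from the critical path method (no resource constraints).
unconstrained = {"AddEngine1": 0, "AddWheels1": 30, "Inspect1": 60,
                 "AddEngine2": 0, "AddWheels2": 60, "Inspect2": 75}
# An assumed schedule in which the two AddEngine actions no longer overlap.
resource_aware = {"AddEngine1": 0, "AddWheels1": 30, "Inspect1": 60,
                  "AddEngine2": 30, "AddWheels2": 90, "Inspect2": 105}

print(makespan(unconstrained), violations(unconstrained))
print(makespan(resource_aware), violations(resource_aware))
```

The 85-minute schedule is rejected because both engines need the single hoist at time 0, while the 115-minute schedule passes every check. The consumable LugNuts resource is left out of this sketch, since 40 of the 500 available are used whatever the timing.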

  15. Scheduling with Resources • for including resource constraints • a variety of solution techniques have been tested • one simple approach uses the minimum slack heuristic • at each step schedule next the unscheduled action that has its predecessors scheduled & has the least slack • update ES & LS for impacted actions & repeat • note the similarity to minimum-remaining values (MRV) heuristic of CSPs • applied to this example, it yields a 130 minute solution • 15 minutes longer than the optimal solution • difficult scheduling problems may require a different approach • they may involve reconsidering actions & constraints, integrating the planning & scheduling phases by including durations & overlaps in constructing the POP • this approach is a focus of current research interest

  16. Time & Resource Constraints • summary • alternative approaches to planning with time & resource constraints • 1. serial: plan, then schedule • use a partial or full-order planner • then schedule to determine actual start times • 2. interleaved: mix planning and scheduling • for example, include resource constraints during partial planning • these can determine conflicts between actions • notes: • remember that so far we are still working in classical planning environments • so, fully observable, deterministic, static and discrete

  17. Hierarchical Planning • next • we add techniques to handle the plan complexity issue • HTN: hierarchical task network planning • this works in a top-down fashion • similar to the stepwise refinement approach to programming • plans that are built from a fixed set of small atomic actions will become unwieldy as the planning problem grows large • we need to plan at a higher level of abstraction • reduce complexity by hierarchical decomposition of plan steps • at each level of the hierarchy a planning task is reduced to a small number of activities at the next lower level • the low number of activities • means the computational cost of arranging these activities can be lowered

  18. Hierarchical Planning • an example: the Hawaiian vacation plan • recall: the AIMA authors live/work in San Francisco Bay area • go to SFO airport • take flight to Honolulu • do vacation stuff for 2 weeks • take flight back to SFO • go Home • each action in this plan actually embodies another planning task • for example: the go to SFO airport action might be expanded • drive to long term parking at SFO • park • take shuttle to passenger terminal • & each action can be decomposed until the level consists of actions that can be executed without deliberation • note: some component actions might not be refined until plan execution time (interleaving: a somewhat different topic)

  19. Hierarchical Planning • basic approach • at each level, each component is reduced to a small number of activities at the next lower level • this keeps the computational cost of arranging them low • otherwise, there are too many individual atomic actions for non-trivial problems (yielding high branching factor & depth) • the formalism is HTN planning • Hierarchical Task Network planning • notes • we retain the basic environmental assumptions as for classical planning • what we previously simply called actions are now "primitive actions" • we add HLAs: High Level Actions (like go to SFO airport) • each has 1 or more possible refinements • refinements are sequences of actions, either HLAs or primitive actions

  20. Hierarchical Task Network • alternative refinements: notation • for the HLA: Go(Home, SFO)
     Refinement(Go(Home, SFO), STEPS: [Drive(Home, SFOLongTermParking), Shuttle(SFOLongTermParking, SFO)])
     Refinement(Go(Home, SFO), STEPS: [Taxi(Home, SFO)])
     • the HLAs and their refinements • capture knowledge about how to do things • terminology: if the HLA refines to only primitive actions • it is called an implementation • the implementation of a high-level plan (sequence of HLAs) • concatenates the implementations for each HLA • the preconditions/effects representation of primitive action schemas allows a decision about whether an implementation of a high-level plan achieves the goal

  21. Hierarchical Task Network • HLAs & refinements & plan goals • in the HTN approach, the goal is achieved if any implementation achieves it • this is the case since an agent may choose the implementation to execute (unlike non-deterministic environments where "nature" chooses) • in the simplest case there's a single implementation of an HLA • we get preconds/effects from the implementation, and then treat the HLA as a primitive action • where there are multiple implementations, either • 1. search over implementations for 1 that solves the problem • OR • 2. reason over HLAs directly • derive provably correct abstract plans independent of the specific implementations

  22. Search Over Implementations • 1. the search approach • this involves generation of refinements by replacing an HLA in the current plan with a candidate refinement until the plan achieves the goal • the algorithm on the next slide shows a version using breadth-first tree search, considering plans in the order of the depth of nesting of refinements • note that other search versions (graph-search) and strategies (depth-first, iterative deepening) may be formulated by re-designing the algorithm • explores the space of sequences derived from knowledge in the HLA library re: how things should be done • the action sequences of refinements & their preconditions code knowledge about the planning domain • HTN planners can generate very large plans with little search

  23. Search Over Implementations • the search algorithm for refinements of HLAs
     function HIERARCHICAL-SEARCH(problem, hierarchy) returns a solution or failure
       frontier ← a FIFO queue with [Act] as the only element
       loop do
         if EMPTY?(frontier) then return failure
         plan ← POP(frontier) /* chooses the shallowest plan in frontier */
         hla ← the first HLA in plan, or null if none
         prefix, suffix ← the action subsequences before and after hla in plan
         outcome ← RESULT(problem.INITIAL-STATE, prefix)
         if hla is null then /* so plan is primitive & outcome is its result */
           if outcome satisfies problem.GOAL then return plan
         /* insert all refinements of the current hla into the queue */
         else for each sequence in REFINEMENTS(hla, outcome, hierarchy) do
           frontier ← INSERT(APPEND(prefix, sequence, suffix), frontier)
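A runnable Python sketch of the breadth-first search over implementations is given below. It is a simplification under stated assumptions, not the textbook algorithm verbatim: states are sets of ground fluents, the fluents `Have(Car)` and `Have(Cash)` and the add/delete lists are hypothetical details invented for the example, and refinements are not filtered against the outcome of the prefix as REFINEMENTS(hla, outcome, hierarchy) does.

```python
from collections import deque

# Primitive actions: name -> (preconditions, add list, delete list).
# The fluents are hypothetical, chosen only to make the example concrete.
PRIMITIVES = {
    "Drive(Home,SFOLongTermParking)": (
        {"At(Home)", "Have(Car)"}, {"At(SFOLongTermParking)"}, {"At(Home)"}),
    "Shuttle(SFOLongTermParking,SFO)": (
        {"At(SFOLongTermParking)"}, {"At(SFO)"}, {"At(SFOLongTermParking)"}),
    "Taxi(Home,SFO)": (
        {"At(Home)", "Have(Cash)"}, {"At(SFO)"}, {"At(Home)", "Have(Cash)"}),
}
# HLA library: each high-level action maps to its alternative refinements.
REFINEMENTS = {
    "Go(Home,SFO)": [
        ["Drive(Home,SFOLongTermParking)", "Shuttle(SFOLongTermParking,SFO)"],
        ["Taxi(Home,SFO)"],
    ],
}


def result(state, actions):
    """Apply primitive actions in order; None if a precondition fails."""
    for name in actions:
        precond, add, delete = PRIMITIVES[name]
        if not precond <= state:
            return None
        state = (state - delete) | add
    return state


def hierarchical_search(initial, goal, top_level_plan):
    """Breadth-first search over refinements; returns a primitive plan or None."""
    frontier = deque([list(top_level_plan)])
    while frontier:
        plan = frontier.popleft()  # shallowest plan first
        index = next((i for i, a in enumerate(plan) if a in REFINEMENTS), None)
        if index is None:
            # The plan is primitive, so it can be tested directly.
            outcome = result(initial, plan)
            if outcome is not None and goal <= outcome:
                return plan
        else:
            # Replace the first HLA by each of its refinements.
            prefix, suffix = plan[:index], plan[index + 1:]
            for sequence in REFINEMENTS[plan[index]]:
                frontier.append(prefix + sequence + suffix)
    return None


no_car = frozenset({"At(Home)", "Have(Cash)"})
with_car = frozenset({"At(Home)", "Have(Car)"})
print(hierarchical_search(no_car, {"At(SFO)"}, ["Go(Home,SFO)"]))
print(hierarchical_search(with_car, {"At(SFO)"}, ["Go(Home,SFO)"]))
```

Because this is a tree search, a library with recursive refinements and no solution would loop forever; a depth limit or graph search would be needed in that case.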

  24. HTN Examples • O-PLAN: an example of a real-world system • the O-PLAN system does both planning & scheduling, commercially for the Hitachi company • one specific sample problem concerns a product line of 350 items involving 35 machines and 2000+ different operations • for this problem, the planner produces a 30-day schedule of three 8-hour shifts per day, with tens of millions of steps • a major benefit of the hierarchical structure with the HTN approach is that the results are often easily understood by humans • abstracting away from excessive detail • (1) makes large scale planning/scheduling feasible • (2) enhances comprehensibility

  25. HTN Efficiency • computational comparisons for a hypothetical domain • assumption 1: the problem has a solution with d primitive actions; a non-hierarchical progression planner with b possibilities at each state costs $O(b^d)$ • assumption 2: an HTN planner with r refinements of each non-primitive, each with k actions at each level • how many different refinement trees does this yield? • depth: number of levels below the root = $\log_k d$ • then the number of internal refinement nodes = $1 + k + k^2 + \dots + k^{\log_k d - 1} = (d - 1)/(k - 1)$ • each internal node has r possible refinements, so $r^{(d - 1)/(k - 1)}$ possible regular decomposition trees • the message: keeping r small & k large yields big savings (roughly kth root of non-hierarchical cost if b & r are comparable) • nice as a goal, but long action sequences that are useful over a range of problems are rare
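A small worked calculation may help make the comparison concrete. The Python sketch below evaluates both counts for illustrative values (b = r = 3, d = 16, k = 4); these numbers and the function names are assumptions chosen for the example, and d is assumed to be an exact power of k so that the decomposition tree is regular.

```python
def internal_nodes(d: int, k: int) -> int:
    """Number of internal refinement nodes: 1 + k + k^2 + ... + k^(log_k d - 1).

    Assumes k >= 2 and that d is an exact power of k.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    total, width = 0, 1
    while width < d:
        total += width
        width *= k
    if width != d:
        raise ValueError("d must be an exact power of k")
    return total


def flat_search_space(b: int, d: int) -> int:
    """Size of the non-hierarchical search space, b^d."""
    return b ** d


def decomposition_trees(r: int, d: int, k: int) -> int:
    """Number of regular decomposition trees, r^((d - 1) / (k - 1))."""
    return r ** internal_nodes(d, k)


b = r = 3
d, k = 16, 4
print(internal_nodes(d, k))          # same value as (d - 1) // (k - 1)
print(flat_search_space(b, d))
print(decomposition_trees(r, d, k))
```

For these values the closed form gives (16 - 1)/(4 - 1) = 5 internal nodes, so the hierarchical count is 3^5 = 243 against 3^16 = 43,046,721 for the flat search.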

  26. HTN Efficiency • HTN computational efficiency • building the plan library is critically important to achieving efficiency gains in HTN planning • so, might the refinements be learned? • as one example, an agent could build plans conventionally then save them as a refinement of an HLA defined as the current task/problem • one goal is "generalizing" the methods that are built, eliminating problem-instance specific detail, keeping only key plan components

  27. Hierarchical Planning • we've just looked at the approach of searching over fully refined plans • that is, full implementations • the algorithm refines plans to primitive actions in order to check whether they achieve the problem goal • now we move on to searching for abstract solutions • the checking occurs at the level of HLAs • possibly with preconditions/effects descriptions for HLAs • the result is that search is in the much smaller HLA space, after which we refine the resulting plan

  28. Hierarchical Planning • searching for abstract solutions • this approach will require that HLA descriptions have the downward refinement property • every high level plan that apparently solves the problem (from the description of its steps) has at least 1 implementation that achieves the goal • since search is not at the level of sequences of primitive actions, a core issue is describing the effects of actions (HLAs) with multiple implementations • assuming a problem description with only +ve preconds & goals, we might describe an HLA's +ve effects in terms of those achieved by every implementation, and its -ve effects in terms of those resulting from any implementation • this would satisfy the downward refinement property • however, requiring an effect to be true for every implementation is too restrictive: it amounts to assuming that an adversary chooses the implementation (an underlying non-deterministic model)

  29. Plan Search in HLA Space • plan search in HLA space • there are alternative models for which implementation is chosen, either • (1) demonic non-determinism where some adversary makes the choice • (2) angelic non-determinism, where the agent chooses • if we adopt angelic semantics for HLA descriptions • the resulting notation uses simple set operations/notation • the key concept is that of the reachable set for some HLA h & state s, notation: Reach(s, h) • this is the set of states reachable by any implementation of h (since under angelic semantics, the agent gets to choose) • for a sequence of HLAs [h1, h2] the reachable set is the union of all reachable sets from applying h2 in each state in the reachable set of h1 (for notation details see p 411) • a sequence of HLAs forming a high level plan is a solution if its reachable set intersects the set of goal states
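The set operations behind reachable sets are simple enough to sketch directly in Python. The table `REACH`, the state names `s0` to `s6`, and the goal set are hypothetical; only the union-over-intermediate-states rule and the intersection test for a solution come from the slide.

```python
# Hypothetical reachability table: REACH[(state, hla)] is the set of states
# that some implementation of `hla` can reach from `state`.
REACH = {
    ("s0", "h1"): {"s1", "s2"},
    ("s1", "h2"): {"s3"},
    ("s2", "h2"): {"s4", "s5"},
}
GOAL = {"s5", "s6"}


def reach(state, hla):
    """Reach(s, h): states reachable by any implementation of h from s."""
    return REACH.get((state, hla), set())


def reach_sequence(state, hlas):
    """Reachable set of a sequence: union of Reach(s', h) over every s'
    in the reachable set of the preceding steps."""
    states = {state}
    for hla in hlas:
        states = set().union(*(reach(s, hla) for s in states))
    return states


def is_solution(state, hlas, goal):
    """A high-level plan is a solution if its reachable set meets the goal."""
    return bool(reach_sequence(state, hlas) & goal)


print(reach_sequence("s0", ["h1", "h2"]))
print(is_solution("s0", ["h1", "h2"], GOAL))
print(is_solution("s0", ["h1"], GOAL))
```

Under angelic semantics a single goal state in the reachable set is enough, because the agent is the one choosing which implementation to run.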

  30. Plan Search in HLA Space • illustration of reachable sets, sequences of HLAs • dots are states, shaded areas = goal states • darker arrows: possible implementations of h1 • lighter arrows: possible implementations of h2 • (a) reachable set for HLA h1 • (b) reachable set for the sequence [h1, h2] • circled dots show the sequence achieving the goal

  31. Planning in HLA Space • using this model • planning consists of searching in HLA space for a sequence with a reachable set that intersects the goal, then refining that abstract plan • note: we haven't considered yet the issue of representing reachable sets as the effects of HLAs • our basic planning model has states as conjunctions of fluents • if we treat the fluents of a planning problem as state variables, then under angelic semantics an HLA controls the values of these variables, depending on which implementation is actually selected • an HLA may have 9 different effects on a given variable • if it starts true, it can always keep it true, always make it false, or have a choice & similarly for a variable that is initially false • any combination of the 3 choices for each case is possible, yielding $3^2 = 9$ effects
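The count of nine can be checked by enumerating the combinations. This Python fragment is only an illustration; the labels in `CHOICES` are paraphrases introduced here, not textbook notation.

```python
from itertools import product

# What an HLA can do to one variable, for a given initial value.
CHOICES = ("always true afterwards", "always false afterwards", "agent's choice")

# Each effect pairs a choice for "variable starts true"
# with a choice for "variable starts false".
effects = list(product(CHOICES, repeat=2))

for if_true, if_false in effects:
    print(f"starts true: {if_true:24s} | starts false: {if_false}")
print(len(effects))
```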

  32. Planning in HLA Space • using this model • so there are 9 possible combinations of choices for the effects on variables • we introduce some additional notation to capture this idea • note some slight formatting differences between the details of the notation used here versus in the textbook • ~ indicates possibility, the dependence on the agent's choice of implementation • ~+A indicates the possibility of adding A • ~-A represents the possible deleting of A • ~±A stands for possibly adding or deleting A

  33. Planning in HLA Space • possible effects of HLAs • a simple example uses the HLA for going to the airport Go(Home, SFO)
     Refinement(Go(Home, SFO), STEPS: [Drive(Home, SFOLongTermParking), Shuttle(SFOLongTermParking, SFO)])
     Refinement(Go(Home, SFO), STEPS: [Taxi(Home, SFO)])
     • this HLA has ~-Cash as a possible effect, since the agent may choose the refinement of going by taxi & have to pay • we can use this notation & angelic reachable state semantics to illustrate how an HLA sequence [h1, h2] reaches a goal • it's often the case that an HLA's effects can only be approximated (since it may have infinitely many implementations & produce arbitrarily "wiggly" reachable sets) • we use approximate descriptions of result states of HLAs that are • optimistic: REACH+(s, h) or pessimistic: REACH-(s, h) • the optimistic description may overestimate the reachable set, the pessimistic one may underestimate it • here's the definition of the relationship • $REACH^-(s, h) \subseteq REACH(s, h) \subseteq REACH^+(s, h)$

  34. Planning in HLA Space • possible effects of HLAs using approximate descriptions of result states • with approximate descriptions, we need to reconsider how to apply/interpret the goal test • (1) if the optimistic reachable set for a plan does not intersect the goal, then the plan is not a solution • (2) if the pessimistic reachable set for a plan intersects the goal, then the plan is a solution • (3) if the optimistic set intersects but the pessimistic set does not, the goal test is not decided & we need to refine the plan to resolve residual ambiguity
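The three-way goal test can be written as a small Python function over sets. This is a minimal sketch, assuming the caller supplies the optimistic and pessimistic reachable sets with the pessimistic set contained in the optimistic one; the function name, the returned labels, and the example sets are illustrative.

```python
def angelic_goal_test(optimistic, pessimistic, goal):
    """Classify a high-level plan from its approximate reachable sets.

    Assumes pessimistic <= true reachable set <= optimistic.
    """
    if not (optimistic & goal):
        return "not a solution"   # case (1): even the overestimate misses the goal
    if pessimistic & goal:
        return "solution"         # case (2): the underestimate already hits the goal
    return "refine"               # case (3): undecided, refine the plan further


goal = {"g1", "g2"}
print(angelic_goal_test({"a", "b"}, {"a"}, goal))
print(angelic_goal_test({"a", "g1"}, {"g1"}, goal))
print(angelic_goal_test({"a", "g1"}, {"a"}, goal))
```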

  35. Planning in HLA Space • illustration • shading shows the set of goal states • reachable sets: R+ (optimistic) shown by dashed boundary, R- (pessimistic) by solid boundary • in (a) the plan shown by a dark arrow achieves the goal & the plan shown by the lighter arrow does not • in (b), the plan needs further refinement since the R+ (optimistic) set intersects the goal but the R- (pessimistic) does not

  36. Planning in HLA Space • the algorithm • hierarchical planning with approximate angelic descriptions
     function ANGELIC-SEARCH(problem, hierarchy, initialPlan) returns solution or fail
       frontier ← a FIFO queue with initialPlan as the only element
       loop do
         if EMPTY?(frontier) then return fail
         plan ← POP(frontier) /* chooses shallowest node in frontier */
         if REACH+(problem.INITIAL-STATE, plan) intersects problem.GOAL then /* optimistic */
           if plan is primitive then return plan /* REACH+ is exact for primitive plans */
           guaranteed ← REACH-(problem.INITIAL-STATE, plan) ∩ problem.GOAL /* pessimistic */
           /* pessimistic set includes a goal state & we're not in infinite regress of refinements */
           if guaranteed ≠ {} and MAKING-PROGRESS(plan, initialPlan) then
             finalState ← any element of guaranteed
             return DECOMPOSE(hierarchy, problem.INITIAL-STATE, plan, finalState)
           hla ← some HLA in plan
           prefix, suffix ← the action subsequences before & after hla in plan
           for each sequence in REFINEMENTS(hla, outcome, hierarchy) do
             frontier ← INSERT(APPEND(prefix, sequence, suffix), frontier)

  37. Planning in HLA Space • the decompose function • mutually recursive with ANGELIC-SEARCH • regress from goal to generate successful plan at next level of refinement
     function DECOMPOSE(hierarchy, s0, plan, sf) returns a solution
       solution ← an empty plan
       while plan is not empty do
         action ← REMOVE-LAST(plan)
         si ← a state in REACH-(s0, plan) such that sf ∈ REACH-(si, action)
         problem ← a problem with INITIAL-STATE = si and GOAL = sf
         solution ← APPEND(ANGELIC-SEARCH(problem, hierarchy, action), solution)
         sf ← si
       return solution

  38. Planning in HLA Space • notes • ANGELIC-SEARCH has the same basic structure as the previous algorithm (BFS in space of refinements) • the algorithm detects plans that are or aren't solutions by checking intersections of optimistic & pessimistic reachable sets with the goal • when it finds a workable abstract plan, it decomposes the original problem into subproblems, one for each step of the plan • the initial state & goal for each subproblem are derived by regressing the guaranteed reachable goal state through the action schemas for each step of the plan • angelic-search has a computational advantage over the previous hierarchical search algorithm, which in turn may have a large advantage over plain old exhaustive search

  39. Least Cost & Angelic Search • the same approach can be adapted to find a least cost solution • this generalizes the reachable set concept so that a state, instead of being reachable or not, has a cost for the most efficient way of getting to it (∞ for unreachable states) • then optimistic & pessimistic descriptions bound the costs • the holy grail of hierarchical planning • this revision may allow finding a provably optimal abstract plan without checking all implementations • extensions: the approach can also be applied to online search in the form of hierarchical lookahead algorithms (recall LRTA*) • the resulting algorithm resembles the human approach to problems like the vacation plan • initially consider alternatives at the abstract level, over long time scales • leave parts of the plan abstract until execution time, though other parts are expanded into detail (flights, lodging) to guarantee feasibility of the plan

  40. Nondeterministic Domains • finally, we'll relax some of the environment assumptions of the classical planning model • in part, these parallel the extensions of our earlier (CISC352) discussions of search • we'll consider the issues in 3 sub-categories • (1) sensorless planning (conformant planning) • completely drop the observability property for the environment • (2) contingency planning • for partially observable & nondeterministic environments • (3) online planning & replanning • for unknown environments • however, we begin with some background

  41. BKGD: Nondeterministic Domains • note some distinct differences from the search paradigms • the factored representation of states allows an alternative belief state representation • plus, we have the availability of the domain-independent heuristics developed for classical planning • as usual, we explore issues using a prototype problem • this time it's the task of painting a chair & table so that their colors match • in the initial state, the agent has 2 cans of paint, colors unknown, likewise the chair & table colors are unknown, & only the table is visible • plus there are actions to remove the lid of a can, & to paint from an open can (see the next slide)

  42. The Furniture Painting Problem • the furniture painting problem
     Init(Object(Table) ∧ Object(Chair) ∧ Can(C1) ∧ Can(C2) ∧ InView(Table))
     Goal(Color(Chair, c) ∧ Color(Table, c))
     Action(RemoveLid(can), PRECOND: Can(can) EFFECT: Open(can))
     Action(Paint(x, can), PRECOND: Object(x) ∧ Can(can) ∧ Color(can, c) ∧ Open(can) EFFECT: Color(x, c))

  43. BKGD: Nondeterministic Domains • the environment • since it may not be fully observable, we'll allow action schemas to have variables in preconditions & effects that aren't in the action's variable list • Paint(x, can) omits the variable c representing the color of the paint in can • the agent may not know what color is in a can • in some variants, the agent will have to use percepts it gets while executing the plan, so planning needs to model sensors • the mechanism: Percept Schemas
     Percept(Color(x, c), PRECOND: Object(x) ∧ InView(x))
     Percept(Color(can, c), PRECOND: Can(can) ∧ InView(can) ∧ Open(can))
     • when an object is in view, the agent will perceive its color • if an open can is in view, the agent will perceive the paint color

  44. BKGD: Nondeterministic Domains • we still need an Action Schema for inspecting objects
     Action(LookAt(x), PRECOND: InView(y) ∧ (x ≠ y) EFFECT: InView(x) ∧ ¬InView(y))
     • in a fully observable environment, we include a percept axiom with no preconds for each fluent • of course, a sensorless agent has no percept axioms • note: it can still coerce the table & chair to the same color to solve the problem (though it won't know what color that is) • a contingent planning agent with sensors can do better • inspect the objects, & if they're the same color, done • otherwise check the paint cans & if one is the same color as an object, paint the other object with it • otherwise paint both objects any color • an online agent produces contingent plans with few branches • handling problems as they occur by replanning

  45. BKGD: Nondeterministic Domains • a contingent planner assumes that the effects of an action always succeed • a replanning agent checks results, generating new plans to fix any detected flaws • in the real world we find combinations of approaches

  46. Sensorless Planning Belief States • unobservable environment = Sensorless Planning • these problems are belief state planning problems with physical transitions represented by action schemas • we assume a deterministic environment • we represent belief states as logical formulas rather than the explicit sets of atomic states we saw for sensorless search • for the prototype planning problem: furniture painting • 1. we omit the InView fluents • 2. some fluents hold in all belief states, so we can omit them for brevity: (Object(Table), Object(Chair), Can(C1), Can(C2)) • 3. the agent knows things have a color (∀x ∃c Color(x, c)), but doesn't know the color of anything or the open vs closed state of cans • 4. yields an initial belief state b0 = Color(x, C(x)), where C(x) is a Skolem function to replace the existentially quantified variable • 5. we drop the closed-world assumption of classical planning, so states may contain +ve & -ve fluents & if a fluent does not appear, its value is unknown

  47. Sensorless Planning Belief States • belief states • specify how the world could be • they are represented as logical formulas • each is a set of possible worlds that satisfy the formula • in a belief state b, actions available to the agent are those with their preconds satisfied in b • given the initial belief state b0 = Color(x, C(x)), a simple solution plan for the painting problem is: [RemoveLid(Can1), Paint(Chair, Can1), Paint(Table, Can1)] • we'll update belief states as actions are taken, using the rule • b' = RESULT(b, a) = {s' : s' = RESULT_P(s, a) and s ∈ b} • where RESULT_P defines the physical transition model

  48. Sensorless Planning Belief States • updating belief states • we assume that the initial belief state is 1-CNF form, that is, a conjunction of literals • b' is derived based on what happens for the literals l in the physical states s that are in b when a is applied • if the truth value of a literal is known in b then in b' it is given by the current value, plus the add list of a & the delete list of a • if a literal's truth value is unknown, 1 of 3 cases applies • 1. a adds l so it must be true in b' • 2. a deletes l so it must be false in b' • 3. a does not affect l so it remains unknown (thus is not in b')

  49. Sensorless Planning Belief States • updating belief states: the example plan • recall the sensorless agent's solution plan for the furniture painting problem [RemoveLid(Can1), Paint(Chair, Can1), Paint(Table, Can1)] • apply RemoveLid(Can1) to b0 = Color(x, C(x))
     (1) b1 = Color(x, C(x)) ∧ Open(Can1)
     • apply Paint(Chair, Can1) to b1 • precondition Color(Can1, c) is satisfied by Color(x, C(x)) with the binding {x/Can1, c/C(Can1)}
     (2) b2 = Color(x, C(x)) ∧ Open(Can1) ∧ Color(Chair, C(Can1))
     • now apply the last action to get the next belief state, b3
     (3) b3 = Color(x, C(x)) ∧ Open(Can1) ∧ Color(Chair, C(Can1)) ∧ Color(Table, C(Can1))
     • note that this satisfies the plan goal Goal(Color(Chair, c) ∧ Color(Table, c)) with c bound to C(Can1)
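The 1-CNF update can be traced in a few lines of Python. This is a sketch under simplifying assumptions: the belief state is a dictionary from ground fluents to truth values (absent means unknown), the schema Color(x, C(x)) is expanded by hand into one ground fluent per object, and the always-true fluents such as Can(Can1) are omitted as on the slides. The helper names `applicable` and `update` are invented for the illustration.

```python
def applicable(belief, precond):
    """An action is available if every (positive) precondition is known true."""
    return all(belief.get(fluent) is True for fluent in precond)


def update(belief, add=(), delete=()):
    """1-CNF update: added literals become true, deleted ones become false,
    everything else keeps its value (or stays unknown)."""
    new_belief = dict(belief)
    for fluent in add:
        new_belief[fluent] = True
    for fluent in delete:
        new_belief[fluent] = False
    return new_belief


# b0 = Color(x, C(x)), grounded by hand for the four objects.
b0 = {f"Color({x},C({x}))": True for x in ("Chair", "Table", "Can1", "Can2")}

# (action name, preconditions, add list); these actions have no delete list.
PLAN = [
    ("RemoveLid(Can1)", [], ["Open(Can1)"]),
    ("Paint(Chair,Can1)", ["Open(Can1)", "Color(Can1,C(Can1))"],
     ["Color(Chair,C(Can1))"]),
    ("Paint(Table,Can1)", ["Open(Can1)", "Color(Can1,C(Can1))"],
     ["Color(Table,C(Can1))"]),
]

belief = b0
for name, precond, add in PLAN:
    assert applicable(belief, precond), f"{name} is not applicable"
    belief = update(belief, add)
    print(name, "->", sorted(f for f, value in belief.items() if value))
b3 = belief

# Goal Color(Chair, c) & Color(Table, c), checked with c bound to C(Can1).
goal_reached = bool(b3.get("Color(Chair,C(Can1))") and b3.get("Color(Table,C(Can1))"))
print(goal_reached)
```

The belief state stays a plain conjunction of literals after every step, which is the closure property discussed on the next slide.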

  50. Sensorless Planning Belief States • the painting problem solution • this illustrates that the family of belief states given as conjunctions of literals is closed under updates defined by PDDL action schemas • so given n total fluents, any belief state is represented as a conjunction of size $O(n)$ (despite the $O(2^n)$ states in the world) • however, this is only the case when action schemas have the same effects for all states in which their preconds are satisfied • if an action's effects depend on the state, dependencies among fluents are introduced & the 1-CNF property does not apply • illustrated by an example from the simple vacuum world on the next slides