## CISC453 Winter 2010


**CISC453 Winter 2010**
Planning & Acting in the Real World
AIMA3e Ch 11: Time & Resources, Hierarchical Techniques, Relaxing Environmental Assumptions

**Overview**
• extending planning language & algorithms
• 1. allow actions that have durations & resource constraints
• yields a new "scheduling problem" paradigm
• incorporating action durations & timing, required resources
• 2. hierarchical planning techniques
• control the complexity of large-scale plans by hierarchical structuring of actions
• 3. uncertain environments
• non-deterministic domains
• 4. multiagent environments

**Scheduling versus Planning**
• recall from classical planning (Ch 10)
• PDDL representations only allowed us to decide the relative ordering among planning actions
• up till now we've concentrated on what actions to do, given their PRECONDs & EFFECTs
• in the real world, other properties must be considered
• actions occur at particular moments in time, have a beginning and an end, occupy or require a certain amount of time
• for a new category of Scheduling Problems we need to consider the absolute times when an event or action will occur & the durations of the events or actions
• typically these are solved in 2 phases: planning then scheduling
• a planning phase selects actions, respecting ordering constraints
• this might be done by a human expert, and automated planners are suitable if they yield minimal ordering constraints
• then a scheduling phase incorporates temporal information so that the result meets resource & deadline constraints

**Time, Schedules & Resources**
• the Job-Shop Scheduling (JSS) paradigm includes
• the requirement to complete a set of jobs
• each job consists of a sequence of actions with ordering constraints
• each action
• has a given duration and may also require some resources
• resource constraints indicate the type of resource, the number of units required, and whether the resource is consumed in the action or is reusable
• the goal is to determine a schedule
• one that minimizes the total time required to complete all jobs (the makespan)
• while respecting resource requirements & constraints

**Job-Shop Scheduling Problem (JSSP)**
• JSSP involves a list of jobs to do
• where a job is a fixed sequence of actions
• actions have quantitative time durations & ordering constraints
• actions use resources (which may be shared among jobs)
• to solve the JSSP: find a schedule that
• determines a start time for each action
• 1. that obeys all hard constraints
• e.g. no temporal overlap between mutex actions (those using the same one-action-at-a-time resource)
• 2. for our purposes, we'll operationalize cost as the total time to perform all actions and jobs
• note that the cost function could be more complex (it could include the resources used, time delays incurred, ...)
• our example: automobile assembly scheduling
• the jobs: assemble two cars
• each job has 3 actions: add the engine, add the wheels, inspect the whole car
• a resource constraint is that we do the engine & wheel actions at a special one-car-only work station

**Ex: Car Construction Scheduling**
• the job shop scheduling problem of assembling 2 cars
• includes required times & resource constraints
• notation: A < B indicates action A must precede action B

    Jobs({AddEngine1 < AddWheels1 < Inspect1},
         {AddEngine2 < AddWheels2 < Inspect2})
    Resources(EngineHoists(1), WheelStations(1), Inspectors(2), LugNuts(500))
    Action(AddEngine1, DURATION: 30, USE: EngineHoists(1))
    Action(AddEngine2, DURATION: 60, USE: EngineHoists(1))
    Action(AddWheels1, DURATION: 30, CONSUME: LugNuts(20), USE: WheelStations(1))
    Action(AddWheels2, DURATION: 15, CONSUME: LugNuts(20), USE: WheelStations(1))
    Action(Inspect_i, DURATION: 10, USE: Inspectors(1))

**Car Construction Scheduling**
• note that the action schemas
• list resources as numerical quantities, not named entities
• so Inspectors(2), rather than Inspector(I1) & Inspector(I2)
• this process of aggregation is a general one
• it groups objects that are indistinguishable with respect to the current purpose
• this can help reduce complexity of the solution
• for example, a candidate schedule that requires (concurrently) more than the number of aggregated resources can be rejected without having to exhaustively try assignments of individuals to actions

**Planning + Scheduling for JSSP**
• Planning + Scheduling for Job-Shop Problems
• scheduling differs from the standard planning problem
• it considers when an action starts and when it ends
• so in addition to order (planning), duration is also considered
• we begin by ignoring the resource constraints, solving the temporal domain issues to minimize the makespan
• this requires finding the earliest start times for all actions consistent with the problem's ordering constraints
• we create a partially-ordered plan, representing ordering constraints in a directed graph of actions
• then we apply the critical path method to determine the start and end times for each action

**Graph of POP + Critical Path**
• the critical path is the path with longest total duration
• it is "critical" in that it sets the duration for the whole plan and delaying the start of any action on it extends the whole plan
• it is the sequence of actions, each of which has no slack
• each must begin at a particular time, otherwise the whole plan is delayed
• actions off the critical path have a window of time given by the earliest possible start time ES & the latest possible start time LS
• the illustrated solution assumes no resource constraints
• note that the 2 engines are being added simultaneously
• the figure shows [ES, LS] for each action, & slack is LS - ES
• the time required is indicated below the action name & bold links mark the critical path

**JSSP: (1) Temporal Constraints**
• schedule for the problem
• is given by ES & LS times for all actions
• note the 15 minutes slack for each action in the top job, versus 0 (by definition) in the critical path job
• formulas for ES & LS also outline a dynamic-programming algorithm for computing them
• A, B are actions, A < B indicates A must come before B

$ES(Start) = 0$
$ES(B) = \max_{A < B} \left[ ES(A) + Duration(A) \right]$
$LS(Finish) = ES(Finish)$
$LS(A) = \min_{B > A} LS(B) - Duration(A)$

• complexity is $O(Nb)$ where N is the number of actions and b is the maximum branching factor into or out of an action
• so without resource constraints, given a partial ordering of actions, finding the minimum duration schedule is (a pleasant surprise!) computationally easy

**JSSP: (1) Temporal Constraints**
• timeline for the solution
• grey rectangles give intervals for actions
• empty portions show slack

**Solution from POP + Critical Path**
• 1. the partially-ordered plan (above)
• 2. the schedule from the critical-path method (below)
• notice that this solution still omits resource constraints
• for example, the 2 engines are being added simultaneously

**Scheduling with Resources**
• including resource constraints
• critical path calculations involve conjunctions of linear inequalities over action start & end times
• they become more complicated when resource constraints are included (for example, each AddEngine action requires the 1 EngineHoist, so they cannot overlap)
• they introduce disjunctions of linear inequalities for possible orderings & as a result, complexity becomes NP-hard!!
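The ES/LS recurrences from the temporal-constraints slide (the resource-free case) can be sketched in a few lines of Python. This is a minimal illustration, not the course's code: the function name `critical_path`, the dictionary layout, and the explicit `Start`/`Finish` pseudo-actions with zero duration are assumptions made for the example. Durations and orderings are the ones given in the car construction problem.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path(duration, before):
    """Return (ES, LS) dicts for a partially ordered plan.

    duration: maps each action name to its duration.
    before:   iterable of (a, b) pairs meaning "a must precede b".
    Assumes (for this sketch) a single source and a single sink action.
    """
    preds = {a: set() for a in duration}
    succs = {a: set() for a in duration}
    for a, b in before:
        preds[b].add(a)
        succs[a].add(b)

    # Any topological order lets each value be computed after its inputs.
    order = list(TopologicalSorter(preds).static_order())

    # Forward pass: ES(B) = max over A < B of ES(A) + Duration(A); ES(Start) = 0.
    es = {}
    for b in order:
        es[b] = max((es[a] + duration[a] for a in preds[b]), default=0)

    # Backward pass: LS(A) = min over B > A of LS(B) - Duration(A); LS(Finish) = ES(Finish).
    ls = {}
    for a in reversed(order):
        if succs[a]:
            ls[a] = min(ls[b] for b in succs[a]) - duration[a]
        else:
            ls[a] = es[a]
    return es, ls


# The two-car example, ignoring resource constraints.
duration = {
    "Start": 0, "Finish": 0,
    "AddEngine1": 30, "AddWheels1": 30, "Inspect1": 10,
    "AddEngine2": 60, "AddWheels2": 15, "Inspect2": 10,
}
job1 = ["Start", "AddEngine1", "AddWheels1", "Inspect1", "Finish"]
job2 = ["Start", "AddEngine2", "AddWheels2", "Inspect2", "Finish"]
before = list(zip(job1, job1[1:])) + list(zip(job2, job2[1:]))

es, ls = critical_path(duration, before)
slack = {a: ls[a] - es[a] for a in duration}
for a in job1[1:-1] + job2[1:-1]:
    print(f"{a}: [ES={es[a]}, LS={ls[a]}], slack={slack[a]}")
print("makespan:", es["Finish"])
```

With these inputs the second job is the critical path (zero slack) and every action in the first job has 15 minutes of slack, which agrees with the slide. Each action is visited once and each ordering edge twice, in line with the $O(Nb)$ bound.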
• here's a solution accounting for resource constraints
• reusable resources are in the left column, actions align with resources
• this shortest solution schedule requires 115 minutes

**Scheduling with Resources**
• including resource constraints
• notice
• that the shortest solution is 30 minutes longer than the critical path without resource constraints
• that multiple inspector resource units are not needed for this job, indicating the possibility for reallocation of this resource
• that the "critical path" now is: AddEngine1, AddEngine2, AddWheels2, Inspect2
• the remaining actions have considerable slack time; they can begin much later without affecting the total plan time

**Scheduling with Resources**
• for including resource constraints
• a variety of solution techniques have been tested
• one simple approach uses the minimum slack heuristic
• at each step, schedule next the unscheduled action that has its predecessors scheduled & has the least slack
• update ES & LS for impacted actions & repeat
• note the similarity to the minimum-remaining-values (MRV) heuristic of CSPs
• applied to this example, it yields a 130 minute solution
• 15 minutes longer than the optimal solution
• difficult scheduling problems may require a different approach
• they may involve reconsidering actions & constraints, integrating the planning & scheduling phases by including durations & overlaps in constructing the POP
• this approach is a focus of current research interest

**Time & Resource Constraints**
• summary
• alternative approaches to planning with time & resource constraints
• 1. serial: plan, then schedule
• use a partial or full-order planner
• then schedule to determine actual start times
• 2. interleaved: mix planning and scheduling
• for example, include resource constraints during partial planning
• these can determine conflicts between actions
• notes:
• remember that so far we are still working in classical planning environments
• so, fully observable, deterministic, static and discrete

**Hierarchical Planning**
• next
• we add techniques to handle the plan complexity issue
• HTN: hierarchical task network planning
• this works in a top-down fashion
• similar to the stepwise refinement approach to programming
• plans that are built from a fixed set of small atomic actions will become unwieldy as the planning problem grows large
• we need to plan at a higher level of abstraction
• reduce complexity by hierarchical decomposition of plan steps
• at each level of the hierarchy a planning task is reduced to a small number of activities at the next lower level
• the low number of activities
• means the computational cost of arranging these activities can be lowered

**Hierarchical Planning**
• an example: the Hawaiian vacation plan
• recall: the AIMA authors live/work in the San Francisco Bay area
• go to SFO airport
• take flight to Honolulu
• do vacation stuff for 2 weeks
• take flight back to SFO
• go Home
• each action in this plan actually embodies another planning task
• for example: the go to SFO airport action might be expanded
• drive to long term parking at SFO
• park
• take shuttle to passenger terminal
• & each action can be decomposed until the level consists of actions that can be executed without deliberation
• note: some component actions might not be refined until plan execution time (interleaving: a somewhat different topic)

**Hierarchical Planning**
• basic approach
• at each level, each component is reduced to a small number of activities at the next lower level
• this keeps the computational cost of arranging them low
• otherwise, there are too many individual atomic actions for non-trivial problems (yielding high branching factor & depth)
• the formalism is HTN planning
• Hierarchical Task Network planning
• notes
• we retain the same basic environmental assumptions as for classical planning
• what we previously simply called actions are now "primitive actions"
• we add HLAs: High Level Actions (like go to SFO airport)
• each has 1 or more possible refinements
• refinements are sequences of actions, either HLAs or primitive actions

**Hierarchical Task Network**
• alternative refinements: notation
• for the HLA: Go(Home, SFO)

    Refinement(Go(Home, SFO),
        STEPS: [Drive(Home, SFOLongTermParking), Shuttle(SFOLongTermParking, SFO)])
    Refinement(Go(Home, SFO),
        STEPS: [Taxi(Home, SFO)])

• the HLAs and their refinements
• capture knowledge about how to do things
• terminology: if the HLA refines to only primitive actions
• it is called an implementation
• the implementation of a high-level plan (sequence of HLAs)
• concatenates the implementations for each HLA
• the preconditions/effects representation of primitive action schemas allows a decision about whether an implementation of a high-level plan achieves the goal

**Hierarchical Task Network**
• HLAs & refinements & plan goals
• in the HTN approach, the goal is achieved if any implementation achieves it
• this is the case since an agent may choose the implementation to execute (unlike non-deterministic environments where "nature" chooses)
• in the simplest case there's a single implementation of an HLA
• we get preconds/effects from the implementation, and then treat the HLA as a primitive action
• where there are multiple implementations, either
• 1. search over implementations for 1 that solves the problem
• OR
• 2. reason over HLAs directly
• derive provably correct abstract plans independent of the specific implementations

**Search Over Implementations**
• 1. the search approach
• this involves generation of refinements by replacing an HLA in the current plan with a candidate refinement until the plan achieves the goal
• the algorithm on the next slide shows a version using breadth-first tree search, considering plans in the order of the depth of nesting of refinements
• note that other search versions (graph-search) and strategies (depth-first, iterative deepening) may be formulated by re-designing the algorithm
• explores the space of sequences derived from knowledge in the HLA library re: how things should be done
• the action sequences of refinements & their preconditions encode knowledge about the planning domain
• HTN planners can generate very large plans with little search

**Search Over Implementations**
• the search algorithm for refinements of HLAs

    function HIERARCHICAL-SEARCH(problem, hierarchy) returns a solution or failure
      frontier ← a FIFO queue with [Act] as the only element
      loop do
        if EMPTY?(frontier) then return failure
        plan ← POP(frontier)   /* chooses the shallowest plan in frontier */
        hla ← the first HLA in plan, or null if none
        prefix, suffix ← the action subsequences before and after hla in plan
        outcome ← RESULT(problem.INITIAL-STATE, prefix)
        if hla is null then   /* so plan is primitive & outcome is its result */
          if outcome satisfies problem.GOAL then return plan
        /* insert all refinements of the current hla into the queue */
        else for each sequence in REFINEMENTS(hla, outcome, hierarchy) do
          frontier ← INSERT(APPEND(prefix, sequence, suffix), frontier)

**HTN Examples**
• O-PLAN: an example of a real-world system
• the O-PLAN system does both planning & scheduling, commercially for the Hitachi company
• one specific sample problem concerns a product line of 350 items involving 35 machines and 2000+ different operations
• for this problem, the planner produces a 30-day schedule of 3x8-hour shifts, with 10s of millions of steps
• a major benefit of the hierarchical structure with the HTN approach is that the results are often easily understood by humans
• abstracting away from excessive detail
• (1) makes large scale planning/scheduling feasible
• (2) enhances comprehensibility

**HTN Efficiency**
• computational comparisons for a hypothetical domain
• assumption 1: a non-hierarchical progression planner, a solution of d primitive actions, b possibilities at each state: $O(b^d)$
• assumption 2: an HTN planner with r refinements of each non-primitive, each with k actions at each level
• how many different refinement trees does this yield?
• depth: number of levels below the root = $\log_k d$
• then the number of internal refinement nodes = $1 + k + k^2 + \dots + k^{\log_k d - 1} = (d - 1)/(k - 1)$
• each internal node has r possible refinements, so $r^{(d - 1)/(k - 1)}$ possible regular decomposition trees
• the message: keeping r small & k large yields big savings (roughly the kth root of the non-hierarchical cost if b & r are comparable)
• nice as a goal, but long action sequences that are useful over a range of problems are rare

**HTN Efficiency**
• HTN computational efficiency
• building the plan library is critically important to achieving efficiency gains in HTN planning
• so, might the refinements be learned?
• as one example, an agent could build plans conventionally then save them as a refinement of an HLA defined as the current task/problem
• one goal is "generalizing" the methods that are built, eliminating problem-instance specific detail, keeping only key plan components

**Hierarchical Planning**
• we've just looked at the approach of searching over fully refined plans
• that is, full implementations
• the algorithm refines plans to primitive actions in order to check whether they achieve the problem goal
• now we move on to searching for abstract solutions
• the checking occurs at the level of HLAs
• possibly with preconditions/effects descriptions for HLAs
• the result is that search is in the much smaller HLA space, after which we refine the resulting plan

**Hierarchical Planning**
• searching for abstract solutions
• this approach will require that HLA descriptions have the downward refinement property
• every high level plan that apparently solves the problem (from the description of its steps) has at least 1 implementation that achieves the goal
• since search is not at the level of sequences of primitive actions, a core issue is describing the effects of actions (HLAs) with multiple implementations
• assuming a problem description with only +ve preconds & goals, we might describe an HLA's +ve effects in terms of those achieved by every implementation, and its -ve effects in terms of those resulting from any implementation
• this would satisfy the downward refinement property
• however, requiring an effect to be true for every implementation is too restrictive; it assumes that an adversary chooses the implementation (assumes an underlying non-deterministic model)

**Plan Search in HLA Space**
• plan search in HLA space
• there are alternative models for which implementation is chosen, either
• (1) demonic non-determinism, where some adversary makes the choice
• (2) angelic non-determinism, where the agent chooses
• if we adopt angelic semantics for HLA descriptions
• the resulting notation uses simple set operations/notation
• the key concept is that of the reachable set for some HLA h & state s, notation: REACH(s, h)
• this is the set of states reachable by any implementation of h (since under angelic semantics, the agent gets to choose)
• for a sequence of HLAs [h1, h2] the reachable set is the union of all reachable sets from applying h2 in each state in the reachable set of h1 (for notation details see p 411)
• a sequence of HLAs forming a high level plan is a solution if its reachable set intersects the set of goal states

**Plan Search in HLA Space**
• illustration of reachable sets, sequences of HLAs
• dots are states, shaded areas = goal states
• darker arrows: possible implementations of h1
• lighter arrows: possible implementations of h2
• (a) reachable set for HLA h1
• (b) reachable set for the sequence [h1, h2]
• circled dots show the sequence achieving the goal

**Planning in HLA Space**
• using this model
• planning consists of searching in HLA space for a sequence with a reachable set that intersects the goal, then refining that abstract plan
• note: we haven't considered yet the issue of representing reachable sets as the effects of HLAs
• our basic planning model has states as conjunctions of fluents
• if we treat the fluents of a planning problem as state variables, then under angelic semantics an HLA controls the values of these variables, depending on which implementation is actually selected
• an HLA may have 9 different effects on a given variable
• if it starts true, it can always keep it true, always make it false, or have a choice & similarly for a variable that is initially false
• any combination of the 3 choices for each case is possible, yielding 3^2 = 9 effects

**Planning in HLA Space**
• using this model
• so there are 9 possible combinations of choices for the effects on variables
• we introduce some additional notation to capture this idea
• note some slight formatting differences between the details of the notation used here versus in the textbook
• ~ indicates possibility, the dependence on the agent's choice of implementation
• ~+A indicates the possibility of adding A
• ~-A represents the possible deleting of A
• ~±A stands for possibly adding or deleting A

**Planning in HLA Space**
• possible effects of HLAs
• a simple example uses the HLA for going to the airport, Go(Home, SFO)

    Refinement(Go(Home, SFO),
        STEPS: [Drive(Home, SFOLongTermParking), Shuttle(SFOLongTermParking, SFO)])
    Refinement(Go(Home, SFO),
        STEPS: [Taxi(Home, SFO)])

• this HLA has ~-Cash as a possible effect, since the agent may choose the refinement of going by taxi & have to pay
• we can use this notation & angelic reachable state semantics to illustrate how an HLA sequence [h1, h2] reaches a goal
• it's often the case that an HLA's effects can only be approximated (since it may have infinitely many implementations & produce arbitrarily "wiggly" reachable sets)
• we use approximate descriptions of result states of HLAs that are
• optimistic: REACH+(s, h) or pessimistic: REACH-(s, h)
• one may overestimate, the other underestimate
• here's the definition of the relationship
• REACH-(s, h) ⊆ REACH(s, h) ⊆ REACH+(s, h)

**Planning in HLA Space**
• possible effects of HLAs using approximate descriptions of result states
• with approximate descriptions, we need to reconsider how to apply/interpret the goal test
• (1) if the optimistic reachable set for a plan does not intersect the goal, then the plan is not a solution
• (2) if the pessimistic reachable set for a plan intersects the goal, then the plan is a solution
• (3) if the optimistic set intersects but the pessimistic set does not, the goal test is not decided & we need to refine the plan to resolve residual ambiguity

**Planning in HLA Space**
• illustration
• shading shows the set of goal states
• reachable sets: R+ (optimistic) shown by dashed boundary, R- (pessimistic) by solid boundary
• in (a) the plan shown by a dark arrow achieves the goal & the plan shown by the lighter arrow does not
• in (b), the plan needs further refinement since the R+ (optimistic) set intersects the goal but the R- (pessimistic) set does not

**Planning in HLA Space**
• the algorithm
• hierarchical planning with approximate angelic descriptions

    function ANGELIC-SEARCH(problem, hierarchy, initialPlan) returns solution or fail
      frontier ← a FIFO queue with initialPlan as the only element
      loop do
        if EMPTY?(frontier) then return fail
        plan ← POP(frontier)   /* chooses shallowest node in frontier */
        if REACH+(problem.INITIAL-STATE, plan) intersects problem.GOAL then   /* optimistic */
          if plan is primitive then return plan   /* REACH+ is exact for primitive plans */
          guaranteed ← REACH-(problem.INITIAL-STATE, plan) ∩ problem.GOAL   /* pessimistic */
          /* pessimistic set includes a goal state & we're not in infinite regress of refinements */
          if guaranteed ≠ {} and MAKING-PROGRESS(plan, initialPlan) then
            finalState ← any element of guaranteed
            return DECOMPOSE(hierarchy, problem.INITIAL-STATE, plan, finalState)
          hla ← some HLA in plan
          prefix, suffix ← the action subsequences before & after hla in plan
          for each sequence in REFINEMENTS(hla, outcome, hierarchy) do
            frontier ← INSERT(APPEND(prefix, sequence, suffix), frontier)

**Planning in HLA Space**
• the decompose function
• mutually recursive with ANGELIC-SEARCH
• regress from goal to generate successful plan at next level of refinement

    function DECOMPOSE(hierarchy, s0, plan, sf) returns a solution
      solution ← an empty plan
      while plan is not empty do
        action ← REMOVE-LAST(plan)
        si ← a state in REACH-(s0, plan) such that sf ∈ REACH-(si, action)
        problem ← a problem with INITIAL-STATE = si and GOAL = sf
        solution ← APPEND(ANGELIC-SEARCH(problem, hierarchy, action), solution)
        sf ← si
      return solution

**Planning in HLA Space**
• notes
• ANGELIC-SEARCH has
the same basic structure as the previous algorithm (BFS in the space of refinements)
• the algorithm detects plans that are or aren't solutions by checking intersections of optimistic & pessimistic reachable sets with the goal
• when it finds a workable abstract plan, it decomposes the original problem into subproblems, one for each step of the plan
• the initial state & goal for each subproblem are derived by regressing the guaranteed reachable goal state through the action schemas for each step of the plan
• ANGELIC-SEARCH has a computational advantage over the previous hierarchical search algorithm, which in turn may have a large advantage over plain old exhaustive search

**Least Cost & Angelic Search**
• the same approach can be adapted to find a least cost solution
• this generalizes the reachable set concept so that a state, instead of being reachable or not, has costs for the most efficient way of getting to it (∞ for unreachable states)
• then optimistic & pessimistic descriptions bound the costs
• the holy grail of hierarchical planning
• this revision may allow finding a provably optimal abstract plan without checking all implementations
• extensions: the approach can also be applied to online search in the form of hierarchical lookahead algorithms (recall LRTA*)
• the resulting algorithm resembles the human approach to problems like the vacation plan
• initially consider alternatives at the abstract level, over long time scales
• leave parts of the plan abstract until execution time, though other parts are expanded into detail (flights, lodging) to guarantee feasibility of the plan

**Nondeterministic Domains**
• finally, we'll relax some of the environment assumptions of the classical planning model
• in part, these parallel the extensions of our earlier (CISC352) discussions of search
• we'll consider the issues in 3 sub-categories
• (1) sensorless planning (conformant planning)
• completely drop the observability property for the environment
• (2) contingency planning
• for partially observable & nondeterministic environments
• (3) online planning & replanning
• for unknown environments
• however, we begin with some background

**BKGD: Nondeterministic Domains**
• note some distinct differences from the search paradigms
• the factored representation of states allows an alternative belief state representation
• plus, we have the availability of the domain-independent heuristics developed for classical planning
• as usual, we explore issues using a prototype problem
• this time it's the task of painting a chair & table so that their colors match
• in the initial state, the agent has 2 cans of paint, colors unknown; likewise the chair & table colors are unknown, & only the table is visible
• plus there are actions to remove the lid of a can, & to paint from an open can (see the next slide)

**The Furniture Painting Problem**
• the furniture painting problem

    Init(Object(Table) ∧ Object(Chair) ∧ Can(C1) ∧ Can(C2) ∧ InView(Table))
    Goal(Color(Chair, c) ∧ Color(Table, c))
    Action(RemoveLid(can),
        PRECOND: Can(can)
        EFFECT: Open(can))
    Action(Paint(x, can),
        PRECOND: Object(x) ∧ Can(can) ∧ Color(can, c) ∧ Open(can)
        EFFECT: Color(x, c))

**BKGD: Nondeterministic Domains**
• the environment
• since it may not be fully observable, we'll allow action schemas to have variables in preconditions & effects that aren't in the action's variable list
• Paint(x, can) omits the variable c representing the color of the paint in can
• the agent may not know what color is in a can
• in some variants, the agent will have to use percepts it gets while executing the plan, so planning needs to model sensors
• the mechanism: Percept Schemas

    Percept(Color(x, c),
        PRECOND: Object(x) ∧ InView(x))
    Percept(Color(can, c),
        PRECOND: Can(can) ∧ InView(can) ∧ Open(can))

• when an object is in view, the agent will perceive its color
• if an open can is in view, the agent will perceive the paint color

**BKGD: Nondeterministic Domains**
• we still need an Action Schema for inspecting objects

    Action(LookAt(x),
        PRECOND: InView(y) ∧ (x ≠ y)
        EFFECT: InView(x) ∧ ¬InView(y))

• in a fully observable environment, we include a percept axiom with no preconds for each fluent
• of course, a sensorless agent has no percept axioms
• note: it can still coerce the table & chair to the same color to solve the problem (though it won't know what color that is)
• a contingent planning agent with sensors can do better
• inspect the objects, & if they're the same color, done
• otherwise check the paint cans & if one is the same color as an object, paint the other object with it
• otherwise paint both objects any color
• an online agent produces contingent plans with few branches
• handling problems as they occur by replanning

**BKGD: Nondeterministic Domains**
• a contingent planner assumes that the effects of an action are successful
• a replanning agent checks results, generating new plans to fix any detected flaws
• in the real world we find combinations of approaches

**Sensorless Planning Belief States**
• unobservable environment = Sensorless Planning
• these problems are belief state planning problems with physical transitions represented by action schemas
• we assume a deterministic environment
• we represent belief states as logical formulas rather than the explicit sets of atomic states we saw for sensorless search
• for the prototype planning problem: furniture painting
• 1. we omit the InView fluents
• 2. some fluents hold in all belief states, so we can omit them for brevity: (Object(Table), Object(Chair), Can(C1), Can(C2))
• 3. the agent knows things have a color (∀x ∃c Color(x, c)), but doesn't know the color of anything or the open vs closed state of cans
• 4. yields an initial belief state b0 = Color(x, C(x)), where C(x) is a Skolem function to replace the existentially quantified variable
• 5. we drop the closed-world assumption of classical planning, so states may contain +ve & -ve fluents & if a fluent does not appear, its value is unknown

**Sensorless Planning Belief States**
• belief states
• specify how the world could be
• they are represented as logical formulas
• each is a set of possible worlds that satisfy the formula
• in a belief state b, actions available to the agent are those with their preconds satisfied in b
• given the initial belief state b0 = Color(x, C(x)), a simple solution plan for the painting problem is: [RemoveLid(Can1), Paint(Chair, Can1), Paint(Table, Can1)]
• we'll update belief states as actions are taken, using the rule
• b' = RESULT(b, a) = {s' : s' = RESULT_P(s, a) and s ∈ b}
• where RESULT_P defines the physical transition model

**Sensorless Planning Belief States**
• updating belief states
• we assume that the initial belief state is in 1-CNF form, that is, a conjunction of literals
• b' is derived based on what happens for the literals l in the physical states s that are in b when a is applied
• if the truth value of a literal is known in b then in b' it is given by the current value, plus the add list of a & the delete list of a
• if a literal's truth value is unknown, 1 of 3 cases applies
• 1. a adds l so it must be true in b'
• 2. a deletes l so it must be false in b'
• 3. a does not affect l so it remains unknown (thus is not in b')

**Sensorless Planning Belief States**
• updating belief states: the example plan
• recall the sensorless agent's solution plan for the furniture painting problem: [RemoveLid(Can1), Paint(Chair, Can1), Paint(Table, Can1)]
• apply RemoveLid(Can1) to b0 = Color(x, C(x))
• (1) b1 = Color(x, C(x)) ∧ Open(Can1)
• apply Paint(Chair, Can1) to b1
• precondition Color(Can1, c) is satisfied by Color(x, C(x)) with the binding {x/Can1, c/C(Can1)}
• (2) b2 = Color(x, C(x)) ∧ Open(Can1) ∧ Color(Chair, C(Can1))
• now apply the last action to get the next belief state, b3
• (3) b3 = Color(x, C(x)) ∧ Open(Can1) ∧ Color(Chair, C(Can1)) ∧ Color(Table, C(Can1))
• note that this satisfies the plan goal, Goal(Color(Chair, c) ∧ Color(Table, c)), with c bound to C(Can1)

**Sensorless Planning Belief States**
• the painting problem solution
• this illustrates that the family of belief states given as conjunctions of literals is closed under updates defined by PDDL action schemas
• so given n total fluents, any belief state is represented as a conjunction of size O(n) (despite the O(2^n) states in the world)
• however, this is only the case when action schemas have the same effects for all states in which their preconds are satisfied
• if an action's effects depend on the state, dependencies among fluents are introduced & the 1-CNF property does not apply
• illustrated by an example from the simple vacuum world on the next slides
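The 1-CNF belief-state update described in the sensorless-planning slides can be sketched as follows. This is a simplified, propositional illustration and not the textbook's implementation: fluents are plain strings, the Skolem term C(Can1) is treated as an opaque name, the instance Color(Can1, C(Can1)) of b0 is written out by hand instead of being found by unification, and the names `Action`, `applicable`, `update` and `paint` are hypothetical. A belief state is a dictionary from fluent to known truth value, and a fluent that is absent is unknown.

```python
from typing import NamedTuple


class Action(NamedTuple):
    name: str
    precond: frozenset  # fluents that must be known true in the belief state
    add: frozenset      # fluents the action makes true
    delete: frozenset   # fluents the action makes false


def applicable(belief, action):
    """An action is available only if every precondition is known true."""
    return all(belief.get(f) is True for f in action.precond)


def update(belief, action):
    """1-CNF update: added literals become true, deleted ones false,
    known literals that are untouched keep their value, and unknown
    literals that are untouched stay absent (unknown)."""
    if not applicable(belief, action):
        raise ValueError(f"{action.name} is not applicable")
    new_belief = dict(belief)
    for f in action.delete:
        new_belief[f] = False
    for f in action.add:
        new_belief[f] = True
    return new_belief


def paint(obj):
    """Ground instance of Paint(obj, Can1) with c bound to C(Can1)."""
    return Action(
        name=f"Paint({obj}, Can1)",
        precond=frozenset({"Color(Can1, C(Can1))", "Open(Can1)"}),
        add=frozenset({f"Color({obj}, C(Can1))"}),
        delete=frozenset(),
    )


# b0 = Color(x, C(x)), instantiated only for the can used by the plan.
b0 = {"Color(Can1, C(Can1))": True}
remove_lid = Action("RemoveLid(Can1)", frozenset(), frozenset({"Open(Can1)"}), frozenset())
plan = [remove_lid, paint("Chair"), paint("Table")]

belief = b0
for step in plan:
    belief = update(belief, step)
b3 = belief

goal_reached = b3.get("Color(Chair, C(Can1))") and b3.get("Color(Table, C(Can1))")
print(sorted(b3), "goal reached:", bool(goal_reached))
```

Because each update only adds or overwrites individual literals, the belief state stays a conjunction of literals of size O(n), which is the closure property noted in the final slide. The sketch would not cover actions whose effects depend on the state, where that property fails.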
