Section 10

rae

Presentation Transcript


  1. Section 10 Mid-term Review II

  2. Topics

  3. Brew the coffee!
     • Three operators:
       1. load(x)
          precond: coffee(x), loaded(none)
          effects: loaded(x), ¬loaded(none)
       2. brew(x)
          precond: loaded(x), ¬loaded(none), ¬loaded(waste)
          effects: ¬loaded(x), loaded(waste), pot(x)
       3. unload(x)
          precond: loaded(x), ¬loaded(none)
          effects: ¬loaded(x), loaded(none)
     • Two types of coffee: caf & decaf; waste; none
     • Initial state: coffee(caf), coffee(decaf), loaded(none)
     • Goal state: pot(caf), pot(decaf)
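The three operators can be simulated directly. The following is a minimal sketch, assuming a closed-world state (a fluent is false when it is absent from the state set). The `Action` tuple, the helper names, and the five-step `PLAN` are illustrative choices worked out by hand; they are not given on the slide.

```python
from typing import FrozenSet, Iterable, NamedTuple


class Action(NamedTuple):
    name: str
    pos_pre: FrozenSet[str]  # fluents that must be true
    neg_pre: FrozenSet[str]  # fluents that must be false
    add: FrozenSet[str]      # positive effects
    delete: FrozenSet[str]   # negative effects


def load(x: str) -> Action:
    return Action(f"load({x})",
                  frozenset({f"coffee({x})", "loaded(none)"}), frozenset(),
                  frozenset({f"loaded({x})"}), frozenset({"loaded(none)"}))


def brew(x: str) -> Action:
    return Action(f"brew({x})",
                  frozenset({f"loaded({x})"}),
                  frozenset({"loaded(none)", "loaded(waste)"}),
                  frozenset({"loaded(waste)", f"pot({x})"}),
                  frozenset({f"loaded({x})"}))


def unload(x: str) -> Action:
    return Action(f"unload({x})",
                  frozenset({f"loaded({x})"}), frozenset({"loaded(none)"}),
                  frozenset({"loaded(none)"}), frozenset({f"loaded({x})"}))


def apply(state: FrozenSet[str], action: Action) -> FrozenSet[str]:
    """Return the successor state, or raise if a precondition fails."""
    if not action.pos_pre <= state or action.neg_pre & state:
        raise ValueError(f"{action.name} is not applicable")
    return (state - action.delete) | action.add


def run(state: FrozenSet[str], plan: Iterable[Action]) -> FrozenSet[str]:
    for action in plan:
        state = apply(state, action)
    return state


INITIAL = frozenset({"coffee(caf)", "coffee(decaf)", "loaded(none)"})
GOAL = frozenset({"pot(caf)", "pot(decaf)"})
# Hand-written candidate plan (an assumption, not taken from the slides).
PLAN = [load("caf"), brew("caf"), unload("waste"), load("decaf"), brew("decaf")]

final_state = run(INITIAL, PLAN)
print(GOAL <= final_state)  # True
```

Note that the waste has to be unloaded between the two brews, because load(x) requires loaded(none).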

  4. Graphplan! (Problem 1)
     • Graphplan works only for propositional planning problems!
     • Core elements:
       Expand-Graph
       Keep track of mutex actions and propositions
       Extract-Solution

  5. Propositionalize the PDDL
     • Eliminate variables by replacing them with constant symbols
     • Example of propositionalized fluents:
       loadedCaf: loaded(caf)
     • Example of propositionalized actions:
       brewCaf: brew(caf)
       precond: loaded(caf), ¬loaded(none), ¬loaded(waste)
       effects: ¬loaded(caf), loaded(waste), pot(caf)
     • Propositionalized initial state: coffeeCaf, coffeeDecaf, loadedNone
     • Propositionalized goal state: potCaf, potDecaf
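Grounding a schema is a mechanical substitution. The sketch below assumes each schema has a single variable named `x` and encodes literals as `(is_positive, predicate, argument)` tuples; that encoding and the helper names are illustrative, not part of the slides.

```python
from typing import Dict, List, Tuple

Literal = Tuple[bool, str, str]  # (is_positive, predicate, argument)


def prop_name(predicate: str, arg: str) -> str:
    """loaded + caf -> loadedCaf, following the slide's naming style."""
    return predicate + arg.capitalize()


def ground_literal(literal: Literal, binding: Dict[str, str]) -> str:
    positive, predicate, arg = literal
    # Variables are replaced via the binding; constants pass through unchanged.
    name = prop_name(predicate, binding.get(arg, arg))
    return name if positive else "¬" + name


def ground_action(schema: dict, value: str) -> dict:
    binding = {"x": value}
    return {
        "name": prop_name(schema["name"], value),
        "precond": [ground_literal(l, binding) for l in schema["precond"]],
        "effects": [ground_literal(l, binding) for l in schema["effects"]],
    }


BREW = {
    "name": "brew",
    "precond": [(True, "loaded", "x"), (False, "loaded", "none"),
                (False, "loaded", "waste")],
    "effects": [(False, "loaded", "x"), (True, "loaded", "waste"),
                (True, "pot", "x")],
}

brew_caf = ground_action(BREW, "caf")
print(brew_caf["name"], brew_caf["precond"], brew_caf["effects"])
```

Calling `ground_action(BREW, "decaf")` yields brewDecaf in the same way.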

  6. Expand the Graph
     [Planning graph figure, levels P0, A1, P1]
     P0: coffeeCaf, coffeeDecaf, loadedNone
     A1: loadCaf, loadDecaf, and the persistence (no-op) actions coffeeCaf, coffeeDecaf, loadedNone
     P1: coffeeCaf, loadedCaf, coffeeDecaf, loadedDecaf, loadedNone, ¬loadedNone

  7. Keep track of the Mutex
     • Mutex actions
       Not independent:
         Action A deletes Action B’s precondition
         Action A deletes Action B’s positive effect
       Any of the precondition pairs are mutex
     • Mutex propositions
       All producer pairs are mutex

  8. Mutex Actions and Propositions
     • Mutex actions in A1 (loadedNone here denotes the persistence action for loadedNone):
       (loadCaf, loadDecaf)
       (loadCaf, loadedNone)
       (loadDecaf, loadedNone)
     • Mutex propositions in P1:
       (loadedCaf, loadedDecaf)
       (loadedCaf, loadedNone)
       (loadedDecaf, loadedNone)
       (loadedNone, ¬loadedNone)
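The mutex rules can be checked mechanically for the first level. This is a minimal sketch, assuming actions are plain dicts of precondition, add, and delete sets, and that ¬loadedNone is treated as its own proposition as in the graph. The `act` and `noop` helpers and the `noop(...)` naming are illustrative.

```python
from itertools import combinations


def act(name, pre, add, delete=()):
    return {"name": name, "pre": set(pre), "add": set(add), "del": set(delete)}


def noop(p):
    """Persistence action: carries proposition p to the next level."""
    return act(f"noop({p})", [p], [p])


def actions_mutex(a, b, prop_mutex):
    # Not independent: one action deletes the other's precondition or effect.
    if a["del"] & (b["pre"] | b["add"]) or b["del"] & (a["pre"] | a["add"]):
        return True
    # Competing needs: some pair of preconditions is mutex at the level below.
    return any(frozenset((p, q)) in prop_mutex
               for p in a["pre"] for q in b["pre"])


def props_mutex(p, q, actions, action_mutex):
    # Mutex iff every pair of producers (one for p, one for q) is mutex.
    producers_p = [a["name"] for a in actions if p in a["add"]]
    producers_q = [a["name"] for a in actions if q in a["add"]]
    return all(frozenset((x, y)) in action_mutex
               for x in producers_p for y in producers_q)


A1 = [
    act("loadCaf", ["coffeeCaf", "loadedNone"],
        ["loadedCaf", "¬loadedNone"], ["loadedNone"]),
    act("loadDecaf", ["coffeeDecaf", "loadedNone"],
        ["loadedDecaf", "¬loadedNone"], ["loadedNone"]),
    noop("coffeeCaf"), noop("coffeeDecaf"), noop("loadedNone"),
]

# P0 has no mutex propositions, so the competing-needs check gets an empty set.
action_mutex = {frozenset((a["name"], b["name"]))
                for a, b in combinations(A1, 2) if actions_mutex(a, b, set())}

P1 = sorted(set().union(*(a["add"] for a in A1)))
prop_mutex = {frozenset((p, q)) for p, q in combinations(P1, 2)
              if props_mutex(p, q, A1, action_mutex)}

for pair in sorted(sorted(m) for m in prop_mutex):
    print(pair)
```

Under these assumptions the computed sets match the three action pairs and four proposition pairs listed on the slide.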

  9. Continue Expanding the Graph
     [Planning graph figure, extended by one more action level and one more proposition level]
     P0: coffeeCaf, coffeeDecaf, loadedNone
     A1: loadCaf, loadDecaf, persistence actions
     P1: coffeeCaf, loadedCaf, coffeeDecaf, loadedDecaf, loadedNone, ¬loadedNone
     A2: loadCaf, loadDecaf, unloadCaf, unloadDecaf, brewCaf, brewDecaf, persistence actions
     P2: potCaf, potDecaf, coffeeCaf, loadedCaf, coffeeDecaf, loadedDecaf, loadedWaste, loadedNone, ¬loadedNone

  10. Extract Solution
      • Graphplan starts to extract a solution iff
        All goal state fluents appear in the current proposition level
        None of the goal state fluent pairs is mutex
      • Extract the solution
        Graphplan gives you a valid plan, but not necessarily an optimal one (with the minimum number of actions)
        Multiple actions can take place in one action level!
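The two conditions for starting extraction can be written as a small predicate. This is a sketch with hypothetical names. The single mutex pair entered for the second proposition level is derived by hand: brewCaf and brewDecaf have the mutex preconditions loadedCaf and loadedDecaf, and they are the only producers of potCaf and potDecaf at that level.

```python
from itertools import combinations


def ready_to_extract(goals, level_props, level_mutex):
    """True iff all goals are present and no pair of goals is mutex."""
    if not set(goals) <= set(level_props):
        return False
    return not any(frozenset(pair) in level_mutex
                   for pair in combinations(goals, 2))


GOALS = ["potCaf", "potDecaf"]

P1 = {"coffeeCaf", "coffeeDecaf", "loadedCaf", "loadedDecaf",
      "loadedNone", "¬loadedNone"}
P2 = P1 | {"potCaf", "potDecaf", "loadedWaste"}
# Partial mutex set for the second level: only the goal pair is listed here.
P2_MUTEX = {frozenset(("potCaf", "potDecaf"))}

print(ready_to_extract(GOALS, P1, set()))     # False: goals not present yet
print(ready_to_extract(GOALS, P2, P2_MUTEX))  # False: goals present but mutex
```

So in this problem the graph has to be expanded past the second proposition level before extraction can begin.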

  11. Partial-Order Planning (Problem 2)
      • Causal links
        Action A:
          precond: …
          effects: p(x), …
        Action B:
          precond: p(y), …
          effects: …
        A—p—B!
      • Threats
        Action C:
          precond: …
          effects: ¬p(z), …
        C is a threat to the A—B causal link!

  12. Causal Links and Threats • Causal Links Example load(x)—loaded(x)—brew(x) • Threats Example unload(x) could be a threat to the causal link above!

  13. Demotion and Promotion
      • A—p(x)—B, C is a threat to this causal link
        Demotion: C—A—B
        Promotion: A—B—C
      • load(x)—loaded(x)—brew(x) is a causal link, unload(x) is a threat to this causal link
        Demotion: unload(x1)—load(x2)—brew(x3)
          possible variable bindings: x1=waste, x2=x3=decaf
        Promotion: load(x1)—brew(x2)—unload(x3)
          possible variable bindings: x1=x2=decaf, x3=waste
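Threat detection and the two repairs can be sketched for ground (fully bound) steps. The `Step` and `CausalLink` types and the function names are illustrative assumptions. An ordering pair `(a, b)` is read as "a before b".

```python
from typing import FrozenSet, NamedTuple


class Step(NamedTuple):
    name: str
    adds: FrozenSet[str]
    deletes: FrozenSet[str]


class CausalLink(NamedTuple):
    producer: str   # step that achieves the condition
    condition: str  # protected proposition
    consumer: str   # step that needs the condition


def is_threat(step: Step, link: CausalLink) -> bool:
    """A third step threatens the link if it deletes the protected condition."""
    return (step.name not in (link.producer, link.consumer)
            and link.condition in step.deletes)


def resolutions(step: Step, link: CausalLink) -> dict:
    """Ordering constraints that remove the threat."""
    return {
        "demotion": (step.name, link.producer),   # threat before the producer
        "promotion": (link.consumer, step.name),  # threat after the consumer
    }


link = CausalLink("load(decaf)", "loaded(decaf)", "brew(decaf)")
unload_decaf = Step("unload(decaf)", frozenset({"loaded(none)"}),
                    frozenset({"loaded(decaf)"}))
unload_waste = Step("unload(waste)", frozenset({"loaded(none)"}),
                    frozenset({"loaded(waste)"}))

print(is_threat(unload_decaf, link))  # True
print(is_threat(unload_waste, link))  # False
print(resolutions(unload_decaf, link))
```

With unbound variables, unload(x) is only a potential threat; binding x to waste, as in the slide's examples, leaves the protected condition loaded(decaf) untouched.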

  14. HTN (Problem 3)
      • Serve_two_things(t)
        task: serve_coffee_and_cake(t)
        precond: table(t)
        subtasks: serve(coffee, t), serve(cake, t)
      • Serve_coffee(x, t)
        task: serve(x, t)
        precond: coffee(x), table(t)
        subtasks: make-coffee(x), move(x, t)
      • Serve_cake(x, t)
        task: serve(x, t)
        precond: cake(x), table(t)
        subtasks: make-cake(x), move(x, t)

  15. HTN (cont’d)
      • Make-Caf-Coffee(x, b, m)
        task: make-coffee(x)
        precond: bean(b), caf-bean(b), coffee-maker(m), coffee(x)
        subtasks: load(b, m), brew(b, m, x)
      • Make-Decaf-Coffee(x, b, m)
        task: make-coffee(x)
        precond: bean(b), decaf-bean(b), coffee-maker(m), coffee(x)
        subtasks: load(b, m), brew(b, m, x)
      • Load(b, m) [Primitive task!]
        precond: bean(b), coffee-maker(m), unloaded(m)
        effects: loaded(b, m)
      • Brew(b, m, x) [Primitive task!]
        precond: loaded(b, m), bean(b), coffee-maker(m)
        effects: coffee(x), in(x, m)

  16. serve_coffee_and_cake(t0)
      [Decomposition tree figure]
      serve_coffee_and_cake(t0)
        Serve_two_things(t0), precond: table(t0)
          serve(coffee)
            Serve_coffee(coffee, t0), precond: coffee(coffee), table(t0)
              make-coffee(coffee)
                Make-Caf-Coffee(coffee, caf-bean, machine)
                Make-Decaf-Coffee(coffee, decaf-bean, machine)
              move(coffee, t0)
          serve(cake)
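The decomposition can be sketched as a depth-first expansion that takes the first applicable method. Several simplifications are assumed: the `FACTS` set is hypothetical apart from the names that appear in the tree, method preconditions are checked against static facts only, effects of primitive tasks are not tracked, and any task with no method (including make-cake, which the slides do not define) is treated as primitive.

```python
FACTS = {"table(t0)", "coffee(coffee)", "cake(cake)",
         "bean(caf-bean)", "caf-bean(caf-bean)", "coffee-maker(machine)"}


def serve_two_things(task, facts):
    if task[0] == "serve_coffee_and_cake" and f"table({task[1]})" in facts:
        t = task[1]
        return [("serve", "coffee", t), ("serve", "cake", t)]
    return None


def serve_coffee(task, facts):
    if task[0] == "serve" and {f"coffee({task[1]})", f"table({task[2]})"} <= facts:
        return [("make-coffee", task[1]), ("move", task[1], task[2])]
    return None


def serve_cake(task, facts):
    if task[0] == "serve" and {f"cake({task[1]})", f"table({task[2]})"} <= facts:
        return [("make-cake", task[1]), ("move", task[1], task[2])]
    return None


def make_caf_coffee(task, facts, b="caf-bean", m="machine"):
    # b and m are bound to the constants shown in the decomposition tree.
    needed = {f"bean({b})", f"caf-bean({b})", f"coffee-maker({m})"}
    if task[0] == "make-coffee" and needed | {f"coffee({task[1]})"} <= facts:
        return [("load", b, m), ("brew", b, m, task[1])]
    return None


METHODS = [serve_two_things, serve_coffee, serve_cake, make_caf_coffee]


def decompose(task, facts, methods):
    """Expand a task into primitive tasks using the first applicable method."""
    for method in methods:
        subtasks = method(task, facts)
        if subtasks is not None:
            plan = []
            for sub in subtasks:
                plan.extend(decompose(sub, facts, methods))
            return plan
    return [task]  # no method applies: treated as primitive (assumption)


plan = decompose(("serve_coffee_and_cake", "t0"), FACTS, METHODS)
for step in plan:
    print(step)
```

A fuller HTN planner would backtrack over alternative methods (for example Make-Decaf-Coffee) and update the state after each primitive task.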

  17. MDP (Problem 4)
      • You are making a three-year investment plan. After some research, you find two companies you are interested in investing in: Boston Medicine and San Francisco Chips.
      • Currently the stock price is $10 per share for Boston Medicine and $12 per share for San Francisco Chips.
      • At the beginning of each year, you decide which company to invest in, and once you have decided, you buy 1000 shares of that company.
      • At the end of each year, you earn or lose money depending on whether the stock price of the company you invested in goes up or down.

  18. MDP (Problem 4)
      • In particular, the stock prices change according to two transition matrices, one for Boston Medicine and one for San Francisco Chips.

  19. MDP (Problem 4)
      • States?
        <prevPriceBM, prevPriceSFC, currPriceBM, currPriceSFC, prevAct>
      • Actions?
        BM, SFC
      • Rewards?
        <prevPriceBM, prevPriceSFC, currPriceBM, currPriceSFC, BM> → (currPriceBM - prevPriceBM) * 1000
        <prevPriceBM, prevPriceSFC, currPriceBM, currPriceSFC, SFC> → (currPriceSFC - prevPriceSFC) * 1000
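The reward definition translates directly into a function of the state tuple. This is a sketch: the field names mirror the slide, and the end-of-year prices in the example ($11 for both stocks) are made-up values used only to exercise the function.

```python
from typing import NamedTuple

SHARES = 1000  # shares bought each year, per the problem statement


class State(NamedTuple):
    prev_price_bm: float
    prev_price_sfc: float
    curr_price_bm: float
    curr_price_sfc: float
    prev_act: str  # "BM" or "SFC": the company invested in last year


def reward(state: State) -> float:
    """Profit or loss on the 1000 shares bought at the start of the year."""
    if state.prev_act == "BM":
        return (state.curr_price_bm - state.prev_price_bm) * SHARES
    if state.prev_act == "SFC":
        return (state.curr_price_sfc - state.prev_price_sfc) * SHARES
    raise ValueError(f"unknown action: {state.prev_act}")


# Hypothetical year: BM goes from $10 to $11, SFC goes from $12 to $11.
print(reward(State(10, 12, 11, 11, "BM")))   # 1000
print(reward(State(10, 12, 11, 11, "SFC")))  # -1000
```

Storing the previous prices and the previous action in the state is what lets the reward depend on the state alone.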

  20. MDP (Problem 4) • Transitions?

  21. Logic-Based vs. Decision-Theory
      • Decision theory:
        Utilities (rewards)
        Uncertainties (transition probabilities)
        View the world as states
        Policy defines: given a state, which action to take
      • Logic based (propositional, PDDL):
        Goal state we want to reach
        Actions with preconditions and deterministic effects
        Factored state representation
        In HTNs, hierarchical representation of tasks

  22. Which approach would you use?
      • What approach would you use to model each of the following planning problems? If both options seem reasonable, explain the advantages and limitations of each:
        Planning how your team should work on a class project
        Programming a robot that participates in RoboCup
        Deciding where to eat on campus every day

  23. Other Questions
      • Assume that we wanted to model what to eat in the dining room every day using an MDP. We defined the states as the available options, and we defined rewards based on our food preferences, taking into account other considerations such as not wanting to eat the same food two days in a row.
      • How would you go about defining the transition function?
      • If we use an optimal algorithm like value iteration to solve our MDP, are we guaranteed to have the optimal policy?
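To make the questions concrete, here is a toy value-iteration sketch for one possible way of modelling the dining problem. Everything in it is an assumption made for illustration: the state is taken to be yesterday's meal, there are only two meals, the transition is deterministic (you eat what you choose), and the preference values, repeat penalty, and discount factor are invented.

```python
MEALS = ["pizza", "salad"]
PREFERENCE = {"pizza": 3.0, "salad": 2.0}  # hypothetical preferences
REPEAT_PENALTY = 2.0                        # hypothetical cost of repeating a meal
GAMMA = 0.9                                 # hypothetical discount factor


def transition(state, action):
    """Next-state distribution; assumed deterministic in this toy model."""
    return {action: 1.0}


def reward(state, action):
    return PREFERENCE[action] - (REPEAT_PENALTY if action == state else 0.0)


def q_value(state, action, values):
    expected = sum(p * values[s2] for s2, p in transition(state, action).items())
    return reward(state, action) + GAMMA * expected


def value_iteration(tol=1e-9):
    values = {s: 0.0 for s in MEALS}
    while True:
        new_values = {s: max(q_value(s, a, values) for a in MEALS)
                      for s in MEALS}
        if max(abs(new_values[s] - values[s]) for s in MEALS) < tol:
            return new_values
        values = new_values


values = value_iteration()
policy = {s: max(MEALS, key=lambda a: q_value(s, a, values)) for s in MEALS}
print(policy)  # {'pizza': 'salad', 'salad': 'pizza'}
```

The resulting policy is optimal for this particular model; how well it reflects real dining choices depends on how the states, transitions, and rewards were defined.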
