Planning and execution with phase transitions
Download
1 / 20

Planning and Execution with Phase Transitions - PowerPoint PPT Presentation


  • 112 Views
  • Uploaded on

Planning and Execution with Phase Transitions. H åkan L. S. Younes Carnegie Mellon University. Follow-up paper to Younes & Simmons’ “Solving Generalized Semi-Markov Processes using Continuous Phase-Type Distributions” ( AAAI’04 ). Planning with Time and Probability. Temporal uncertainty

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' Planning and Execution with Phase Transitions' - ella


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Planning and execution with phase transitions

Planning and Executionwith Phase Transitions

Håkan L. S. Younes

Carnegie Mellon University

Follow-up paper to Younes & Simmons’“Solving Generalized Semi-Markov Processes usingContinuous Phase-Type Distributions” (AAAI’04)


Planning with time and probability
Planning withTime and Probability

  • Temporal uncertainty

  • When to schedule service?

Service

F1

Fail

F3

Serviced

Working

Failed

Return

F2

Memoryless distributions  MDP

Planning and Execution with Phase Transitions


The role of phases
The Role of Phases

  • Memoryless property of exponential distribution makes MDPs tractable

  • Many phenomena are not memoryless

    • Lifetime of product or computer process

  • Phases introduce memory into state space

    • Modeling tool—not part of real world (phases are hidden variables)

Planning and Execution with Phase Transitions


Phase type distributions

Exponential

Two-phase Coxian

p1

2

1

1

2

(1 – p)1

n-phase Erlang

1

2

n

Phase-Type Distributions

  • Time to absorption in Markov chain with n transient states and one absorbing state

Planning and Execution with Phase Transitions


Phase type approximation
Phase-Type Approximation

F(x)

Weibull(8,4.5)

1

Erlang(7.3/8,8)

Erlang(7.3/4,4)

Exponential(1/7.3)

0

x

0

2

4

6

8

10

12

Planning and Execution with Phase Transitions


Solving mdps with phase transitions
Solving MDPs withPhase Transitions

  • Solve MDP as if phases were observable

  • Maintain belief distribution over phase configurations during execution

    • AAAI’04 paper: Simulate phase transitions

  • Use QMDP value method to select actions

    • OK, because there are typically no actions that are useful for observing phases

Planning and Execution with Phase Transitions


Factored event model
Factored Event Model

  • Asynchronous events and actions

  • Each event/action has:

    • Enabling condition e

    • (Phase-type) distribution Ge governing the time e must remain enabled before it triggers

    • A distribution pe(s′; s) determining the probability that the next state is s′ if e triggers in state s

Planning and Execution with Phase Transitions


Event example
Event Example

  • Computer cluster with crash event and reboot action for each computer

  • Crash event for computer i:

    • i = upi; p(s′; s) = 1 iff s upi and s′upi

  • Reboot action for computer i:

    • i = upi; p(s′; s) = 1 iff supi and s′ upi

Planning and Execution with Phase Transitions


Event example1
Event Example

  • Cluster with two computers

up1up2

up1up2

up1up2

crash2

crash1

t = 0

t = 0.6

t = 0.9

Planning and Execution with Phase Transitions


Exploiting structure
Exploiting Structure

  • Value iteration with ADDs:

  • Factored state representation means that some variable assignments may be invalid

    • Phase is one for disabled events

    • Binary encoding of phases with n≠ 2k

  • Value iteration with state filter f:

Planning and Execution with Phase Transitions


Phase tracking
Phase Tracking

  • Infer phase configurations from observable features of the environment

    • Physical state of process

    • Current time

  • Transient analysis for Markov chains

    • The probability of being in state s at time t

Planning and Execution with Phase Transitions


Phase tracking1
Phase Tracking

  • Belief distribution changes continuously

    • Use fixed update interval 

  • Independent phase tracking for each event

    • Q matrix is n  n for n-phase distribution

    • Phase tracking is O(n) for Erlang distribution

Planning and Execution with Phase Transitions


The foreman s dilemma
The Foreman’s Dilemma

  • When to enable “Service” action in “Working” state?

Service

Exp(10)

Fail

G

Servicedc = –0.1

Workingc = 1

Failedc = 0

Return

Exp(1)

Planning and Execution with Phase Transitions


Foreman s dilemma policy performance
Foreman’s Dilemma:Policy Performance

Failure-time distribution: W(1.6x,4.5)

100

90

80

Percent of optimal

n = 8,  = 0.5

n = 4,  = 0.5

70

n = 8, simulate

60

n = 4, simulate

SMDP

50

x

0

5

10

15

20

25

30

35

40

Planning and Execution with Phase Transitions


Foreman s dilemma enabling time for action
Foreman’s Dilemma:Enabling Time for Action

30

20

Service enabling time

optimal

10

n = 8,  = 0.5

n = 8, simulate

SMDP

0

x

0

5

10

15

20

25

30

35

40

Planning and Execution with Phase Transitions


System administration
System Administration

  • Network of n machines

  • Reward rate c(s) = k in states where k machines are up

  • One crash event and one reboot action per machine

    • At most one action enabled at any time (single agent)

Planning and Execution with Phase Transitions


System administration policy performance
System Administration:Policy Performance

Reboot-time distribution: U(0,1)

45

40

35

Expected discounted reward

n = 4,  = 

n = 2,  = 

30

n = 4, simulate

n = 2, simulate

25

n = 1

20

2

3

4

5

6

7

8

9

10

Number of machines

Planning and Execution with Phase Transitions


System administration planning time
System Administration: Planning Time

104

103

no filter, n = 5

102

Planning time

no filter, n = 4

pre-filter, n = 5

101

pre-filter, n = 4

filter, n = 5

100

filter, n = 4

10–1

2

3

4

5

6

7

8

9

10

Number of machines

Planning and Execution with Phase Transitions


Discussion
Discussion

  • Phase-type approximations add state variables, but structure can be exploited

    • Use approximate solution methods (e.g., Guestrin et al., JAIR19)

  • Improved phase tracking using transient analysis instead of phase simulation

  • Recent CTBN work with phase-type dist.:

    • modeling—not planning

Planning and Execution with Phase Transitions


Tempastic dtp
Tempastic-DTP

  • A tool for GSMDP planning:

    http://sweden.autonomy.ri.cmu.edu/tempastic/

Planning and Execution with Phase Transitions


ad