Planning and Execution with Phase Transitions

1 / 20

# Planning and Execution with Phase Transitions - PowerPoint PPT Presentation

Planning and Execution with Phase Transitions. H åkan L. S. Younes Carnegie Mellon University. Follow-up paper to Younes &amp; Simmons’ “Solving Generalized Semi-Markov Processes using Continuous Phase-Type Distributions” ( AAAI’04 ). Planning with Time and Probability. Temporal uncertainty

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

## PowerPoint Slideshow about 'Planning and Execution with Phase Transitions' - ella

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

### Planning and Executionwith Phase Transitions

Håkan L. S. Younes

Carnegie Mellon University

Follow-up paper to Younes & Simmons’“Solving Generalized Semi-Markov Processes usingContinuous Phase-Type Distributions” (AAAI’04)

Planning withTime and Probability
• Temporal uncertainty
• When to schedule service?

Service

F1

Fail

F3

Serviced

Working

Failed

Return

F2

Memoryless distributions  MDP

Planning and Execution with Phase Transitions

The Role of Phases
• Memoryless property of exponential distribution makes MDPs tractable
• Many phenomena are not memoryless
• Lifetime of product or computer process
• Phases introduce memory into state space
• Modeling tool—not part of real world (phases are hidden variables)

Planning and Execution with Phase Transitions

Exponential

Two-phase Coxian

p1

2

1

1

2

(1 – p)1

n-phase Erlang

1

2

n

Phase-Type Distributions
• Time to absorption in Markov chain with n transient states and one absorbing state

Planning and Execution with Phase Transitions

Phase-Type Approximation

F(x)

Weibull(8,4.5)

1

Erlang(7.3/8,8)

Erlang(7.3/4,4)

Exponential(1/7.3)

0

x

0

2

4

6

8

10

12

Planning and Execution with Phase Transitions

Solving MDPs withPhase Transitions
• Solve MDP as if phases were observable
• Maintain belief distribution over phase configurations during execution
• AAAI’04 paper: Simulate phase transitions
• Use QMDP value method to select actions
• OK, because there are typically no actions that are useful for observing phases

Planning and Execution with Phase Transitions

Factored Event Model
• Asynchronous events and actions
• Each event/action has:
• Enabling condition e
• (Phase-type) distribution Ge governing the time e must remain enabled before it triggers
• A distribution pe(s′; s) determining the probability that the next state is s′ if e triggers in state s

Planning and Execution with Phase Transitions

Event Example
• Computer cluster with crash event and reboot action for each computer
• Crash event for computer i:
• i = upi; p(s′; s) = 1 iff s upi and s′upi
• Reboot action for computer i:
• i = upi; p(s′; s) = 1 iff supi and s′ upi

Planning and Execution with Phase Transitions

Event Example
• Cluster with two computers

up1up2

up1up2

up1up2

crash2

crash1

t = 0

t = 0.6

t = 0.9

Planning and Execution with Phase Transitions

Exploiting Structure
• Factored state representation means that some variable assignments may be invalid
• Phase is one for disabled events
• Binary encoding of phases with n≠ 2k
• Value iteration with state filter f:

Planning and Execution with Phase Transitions

Phase Tracking
• Infer phase configurations from observable features of the environment
• Physical state of process
• Current time
• Transient analysis for Markov chains
• The probability of being in state s at time t

Planning and Execution with Phase Transitions

Phase Tracking
• Belief distribution changes continuously
• Use fixed update interval 
• Independent phase tracking for each event
• Q matrix is n  n for n-phase distribution
• Phase tracking is O(n) for Erlang distribution

Planning and Execution with Phase Transitions

The Foreman’s Dilemma
• When to enable “Service” action in “Working” state?

Service

Exp(10)

Fail

G

Servicedc = –0.1

Workingc = 1

Failedc = 0

Return

Exp(1)

Planning and Execution with Phase Transitions

Foreman’s Dilemma:Policy Performance

Failure-time distribution: W(1.6x,4.5)

100

90

80

Percent of optimal

n = 8,  = 0.5

n = 4,  = 0.5

70

n = 8, simulate

60

n = 4, simulate

SMDP

50

x

0

5

10

15

20

25

30

35

40

Planning and Execution with Phase Transitions

Foreman’s Dilemma:Enabling Time for Action

30

20

Service enabling time

optimal

10

n = 8,  = 0.5

n = 8, simulate

SMDP

0

x

0

5

10

15

20

25

30

35

40

Planning and Execution with Phase Transitions

• Network of n machines
• Reward rate c(s) = k in states where k machines are up
• One crash event and one reboot action per machine
• At most one action enabled at any time (single agent)

Planning and Execution with Phase Transitions

Reboot-time distribution: U(0,1)

45

40

35

Expected discounted reward

n = 4,  = 

n = 2,  = 

30

n = 4, simulate

n = 2, simulate

25

n = 1

20

2

3

4

5

6

7

8

9

10

Number of machines

Planning and Execution with Phase Transitions

104

103

no filter, n = 5

102

Planning time

no filter, n = 4

pre-filter, n = 5

101

pre-filter, n = 4

filter, n = 5

100

filter, n = 4

10–1

2

3

4

5

6

7

8

9

10

Number of machines

Planning and Execution with Phase Transitions

Discussion
• Phase-type approximations add state variables, but structure can be exploited
• Use approximate solution methods (e.g., Guestrin et al., JAIR19)
• Improved phase tracking using transient analysis instead of phase simulation
• Recent CTBN work with phase-type dist.:
• modeling—not planning

Planning and Execution with Phase Transitions

Tempastic-DTP
• A tool for GSMDP planning:

http://sweden.autonomy.ri.cmu.edu/tempastic/

Planning and Execution with Phase Transitions