
Discounting the Future in Systems Theory


Presentation Transcript


  1. Discounting the Future in Systems Theory Luca de Alfaro, UC Santa Cruz Tom Henzinger, UC Berkeley Rupak Majumdar, UC Los Angeles Chess Review May 11, 2005 Berkeley, CA

  2. A Graph Model of a System: a graph with three states a, b, c.

  3. Property ◊c ("eventually c"), on the same graph with states a, b, c.

  4. Property ◊c ("eventually c"): ∃◊c … some trace has the property ◊c.

  5. Property ◊c ("eventually c"): ∃◊c … some trace has the property ◊c; ∀◊c … all traces have the property ◊c.

  6. Richer Models. FAIRNESS: ω-automaton. ADVERSARIAL CONCURRENCY: game graph. PROBABILITIES: Markov decision process. Combining them yields parity games and stochastic games.

  7. Concurrent Game: states a, b, c, with edges labeled by move pairs (1,1), (1,2), (2,1), (2,2) of player "left" and player "right". Used for modeling open systems [Abramsky, Alur, Kupferman, Vardi, …] and for strategy synthesis ("control") [Ramadge, Wonham, Pnueli, Rosner].

  8. Property ◊c, on the same concurrent game: ⟨⟨left⟩⟩◊c … player "left" has a strategy to enforce ◊c.

  9. Property ◊c: ⟨⟨left⟩⟩◊c … player "left" has a strategy to enforce ◊c; here "left" needs a randomized strategy (e.g. playing moves 1 and 2 with Pr(1) = 0.5 and Pr(2) = 0.5) to enforce ◊c.

  10. Qualitative Models
      Trace: sequence of observations.
      Property p: assigns a reward to each trace (boolean rewards, 𝔹).
      Model m: generates a set of traces ((game) graph).
      Value(p,m): defined from the rewards of the generated traces (∃ or ∀).

  11. Stochastic Game: states a, b, c; each player chooses a move 1 or 2, and the chosen move pair determines a probability distribution over successor states. From a, the four move pairs lead to (a: 0.6, b: 0.4), (a: 0.5, b: 0.5), (a: 0.1, b: 0.9), (a: 0.2, b: 0.8); from b they lead to (c: 1.0), (c: 1.0), (a: 0.7, b: 0.3), (b: 1.0).

  12. Property ◊c: with what probability can player "left" enforce ◊c in this stochastic game?

  13. Semi-Quantitative Models
      Trace: sequence of observations.
      Property p: assigns a reward to each trace (boolean rewards).
      Model m: generates a set of traces ((game) graph).
      Value(p,m): defined from the rewards of the generated traces (sup or inf, sup inf), with values in [0,1] ⊆ ℝ.

  14. A Systems Theory: a class of properties p over traces; an algorithm for computing Value(p,m) over models m; a distance between models w.r.t. property values.

  15. A Systems Theory, for GRAPHS: ω-regular properties (class of properties p over traces); the μ-calculus (algorithm for computing Value(p,m) over models m); bisimilarity (distance between models w.r.t. property values).

  16. Transition Graph: Q states; δ: Q → 2^Q transition relation.

  17. Graph Regions
      Q states
      δ: Q → 2^Q transition relation
      Γ = [Q → 𝔹] regions
      ∃pre, ∀pre: Γ → Γ
      For R ⊆ Q: q ∈ ∃pre(R) iff some successor of q is in R; q ∈ ∀pre(R) iff all successors of q are in R.

  18. Graph Property Values: Reachability ◊R. Given R ⊆ Q, find the states from which some trace leads to R.

  19. Graph Property Values: Reachability ◊R = (μX)(R ∨ ∃pre(X)). Given R ⊆ Q, find the states from which some trace leads to R. Iteration: R, then R ∪ ∃pre(R), then R ∪ ∃pre(R) ∪ ∃pre²(R), …
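The fixpoint iteration on slide 19 is plain backward reachability. A minimal Python sketch, assuming a successor-map representation of the graph; the three-state graph below is only a guess at the slides' example, and all names are illustrative:

```python
def exists_pre(succ, region):
    """∃pre(R): the states with at least one successor in R."""
    return {q for q, nxt in succ.items() if nxt & region}

def reach(succ, target):
    """Least fixpoint of X ↦ R ∪ ∃pre(X): states with some trace into R."""
    x = set(target)
    while True:
        nxt = x | exists_pre(succ, x)
        if nxt == x:
            return x
        x = nxt

succ = {"a": {"a", "b"}, "b": {"b", "c"}, "c": {"c"}}
print(reach(succ, {"c"}))  # {'a', 'b', 'c'}: every state can reach c
```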

  20. Concurrent Game: Q states; Λl, Λr moves of the two players; δ: Q × Λl × Λr → Q transition function.

  21. Game Regions
      Q states
      Λl, Λr moves of the two players
      δ: Q × Λl × Λr → Q transition function
      Γ = [Q → 𝔹] regions
      lpre, rpre: Γ → Γ
      For R ⊆ Q: q ∈ lpre(R) iff (∃l ∈ Λl) (∀r ∈ Λr) δ(q,l,r) ∈ R.
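Slide 21's controllable-predecessor operator is easy to state in code. A sketch, assuming the transition function is given as a dict keyed by (state, left move, right move) triples (a hypothetical representation, not from the slides):

```python
def lpre(delta, moves_l, moves_r, region):
    """lpre(R): states where some left move guarantees, against every
    right move, a successor inside the region R."""
    states = {q for (q, _, _) in delta}
    return {q for q in states
            if any(all(delta[(q, l, r)] in region for r in moves_r)
                   for l in moves_l)}
```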

  22. Game Property Values: Reachability ⟨⟨left⟩⟩◊R. Given R ⊆ Q, find the states from which player "left" has a strategy to force the game to R.

  23. Game Property Values: Reachability ⟨⟨left⟩⟩◊R = (μX)(R ∨ lpre(X)). Given R ⊆ Q, find the states from which player "left" has a strategy to force the game to R. Iteration: R, then R ∪ lpre(R), then R ∪ lpre(R) ∪ lpre²(R), …
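The game-reachability fixpoint then mirrors the graph case, with lpre in place of ∃pre; a sketch reusing the lpre function above:

```python
def left_reach(delta, moves_l, moves_r, target):
    """Least fixpoint of X ↦ R ∪ lpre(X): the states from which
    player "left" can force the game into R."""
    x = set(target)
    while True:
        nxt = x | lpre(delta, moves_l, moves_r, x)
        if nxt == x:
            return x
        x = nxt
```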

  24. An Open Systems Theory, for GAME GRAPHS: ω-regular properties (class of winning conditions p over traces); the (lpre, rpre) fixpoint calculus (algorithm for computing Value(p,m) over models m); alternating bisimilarity [Alur, Henzinger, Kupferman, Vardi] (distance between models w.r.t. property values).

  25. An Open Systems Theory, for GAME GRAPHS: ω-regular properties such as ⟨⟨left⟩⟩◊R; the (lpre, rpre) fixpoint calculus, e.g. (μX)(R ∨ lpre(X)). Every deterministic fixpoint formula f computes Value(p,m), where p is the linear interpretation [Vardi] of f.

  26. An Open Systems Theory, for GAME GRAPHS: two states agree on the values of all fixpoint formulas iff they are alternating bisimilar [Alur, Henzinger, Kupferman, Vardi].

  27. Stochastic Game: Q states; Λl, Λr moves of both players; δ: Q × Λl × Λr → Dist(Q) probabilistic transition function.

  28. Quantitative Game Regions
      Q states
      Λl, Λr moves of both players
      δ: Q × Λl × Λr → Dist(Q) probabilistic transition function
      Γ = [Q → [0,1]] quantitative regions
      lpre, rpre: Γ → Γ
      lpre(R)(q) = (sup l ∈ Λl) (inf r ∈ Λr) R(δ(q,l,r)),
      where R(δ(q,l,r)) denotes the expected value of R under the distribution δ(q,l,r).

  29. Quantitative Game Regions, generalizing the boolean case: [0,1] replaces 𝔹, and sup/inf replace ∃/∀.
      Γ = [Q → [0,1]] quantitative regions
      lpre, rpre: Γ → Γ
      lpre(R)(q) = (sup l ∈ Λl) (inf r ∈ Λr) R(δ(q,l,r))
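The quantitative lpre replaces the ∃/∀ move quantifiers by sup/inf and the membership test by an expectation. A sketch, assuming transitions map (state, left move, right move) triples to successor distributions given as dicts (names illustrative):

```python
def qlpre(delta, moves_l, moves_r, region):
    """Quantitative lpre(R)(q): sup over left moves of the inf over
    right moves of the expected value of R in the successor distribution."""
    def expect(dist):
        return sum(p * region[s] for s, p in dist.items())
    states = {q for (q, _, _) in delta}
    return {q: max(min(expect(delta[(q, l, r)]) for r in moves_r)
                   for l in moves_l)
            for q in states}
```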

  30. Probability with which player "left" can enforce ◊c: (μX)(c ∨ lpre(X)), where ∨ is pointwise max. First iterate (values at a, b, c): 0, 0, 1.

  31. Probability with which player "left" can enforce ◊c: (μX)(c ∨ lpre(X)); second iterate: 0, 1, 1.

  32. Probability with which player "left" can enforce ◊c: (μX)(c ∨ lpre(X)); third iterate: 0.8, 1, 1.

  33. Probability with which player "left" can enforce ◊c: (μX)(c ∨ lpre(X)); fourth iterate: 0.96, 1, 1.

  34. Probability with which player "left" can enforce ◊c: (μX)(c ∨ lpre(X)); in the limit: 1, 1, 1. In the limit, the deterministic fixpoint formulas work for all ω-regular properties [de Alfaro, Majumdar].
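Slides 30-34 are one run of value iteration for (μX)(c ∨ lpre(X)). The sketch below reuses qlpre from the slide-29 sketch; the assignment of move pairs to the distributions of slide 11 is a plausible reconstruction, chosen so that the iterates at state a come out as 0, 0.8, 0.96, … → 1 as on the slides:

```python
delta = {
    ("a", 1, 1): {"a": 0.6, "b": 0.4}, ("a", 1, 2): {"a": 0.5, "b": 0.5},
    ("a", 2, 1): {"a": 0.1, "b": 0.9}, ("a", 2, 2): {"a": 0.2, "b": 0.8},
    ("b", 1, 1): {"c": 1.0},           ("b", 1, 2): {"c": 1.0},
    ("b", 2, 1): {"a": 0.7, "b": 0.3}, ("b", 2, 2): {"b": 1.0},
    ("c", 1, 1): {"c": 1.0}, ("c", 1, 2): {"c": 1.0},
    ("c", 2, 1): {"c": 1.0}, ("c", 2, 2): {"c": 1.0},
}
c_region = {"a": 0.0, "b": 0.0, "c": 1.0}
x = dict(c_region)                                  # first iterate: the region c itself
for _ in range(50):
    pre = qlpre(delta, [1, 2], [1, 2], x)
    x = {q: max(c_region[q], pre[q]) for q in x}    # ∨ = pointwise max
print({q: round(v, 3) for q, v in x.items()})       # {'a': 1.0, 'b': 1.0, 'c': 1.0}
```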

  35. A Probabilistic Systems Theory, for MARKOV DECISION PROCESSES: ω-regular properties (class of properties p over traces); the quantitative fixpoint calculus (algorithm for computing Value(p,m) over models m); quantitative bisimilarity [Desharnais, Gupta, Jagadeesan, Panangaden] (distance between models w.r.t. property values).

  36. A Probabilistic Systems Theory, for MARKOV DECISION PROCESSES: quantitative ω-regular properties, e.g. the max expected value of satisfying ◊R; the quantitative fixpoint calculus, e.g. (μX)(R ∨ ∃pre(X)). Every deterministic fixpoint formula f computes the expected Value(p,m), where p is the linear interpretation of f.

  37. Qualitative Bisimilarity
      e: Q² → {0,1} … equivalence relation
      F … function on equivalences
      F(e)(q,q′) = 0 if q and q′ disagree on observations,
                 = min { e(r,r′) | r ∈ δ(q) ∧ r′ ∈ δ(q′) } otherwise
      Qualitative bisimilarity … greatest fixpoint of F
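The min on slide 37 is schematic; the sketch below computes the greatest fixpoint with the standard bisimulation matching condition (every successor of q is matched by some still-equivalent successor of q′, and vice versa), which is what the slide abbreviates:

```python
def bisim(succ, obs):
    """Greatest fixpoint over 0/1 indicators: e[(q, r)] == 1 means
    q and r have not (yet) been separated."""
    states = list(succ)
    e = {(q, r): 1 if obs[q] == obs[r] else 0 for q in states for r in states}
    changed = True
    while changed:
        changed = False
        for q in states:
            for r in states:
                if e[(q, r)] and not (
                        all(any(e[(s, t)] for t in succ[r]) for s in succ[q])
                        and all(any(e[(s, t)] for s in succ[q]) for t in succ[r])):
                    e[(q, r)] = 0
                    changed = True
    return e
```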

  38. Quantitative Bisimilarity
      d: Q² → [0,1] … pseudo-metric ("distance")
      F … function on pseudo-metrics
      F(d)(q,q′) = 1 if q and q′ disagree on observations,
                 = max of sup_l inf_r d(δ(q,l,r), δ(q′,l,r)) and sup_r inf_l d(δ(q,l,r), δ(q′,l,r)) otherwise
      Quantitative bisimilarity … greatest fixpoint of F
      A natural generalization of bisimilarity from binary relations to pseudo-metrics.
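One step of F can be sketched for the special case of deterministic transitions; the general stochastic case additionally needs d lifted to distributions (e.g. a Kantorovich-style lifting), which is omitted here. Iterating from the all-zeros distance converges to the fixpoint:

```python
def metric_step(delta, moves_l, moves_r, obs, d):
    """One application of F: distance 1 on an observation mismatch,
    otherwise the max of the two sup-inf move quantifications.
    Assumes delta[(q, l, r)] is a single successor state."""
    states = sorted({q for (q, _, _) in delta})
    nd = {}
    for q in states:
        for p in states:
            if obs[q] != obs[p]:
                nd[(q, p)] = 1.0
            else:
                nd[(q, p)] = max(
                    max(min(d[(delta[(q, l, r)], delta[(p, l, r)])]
                            for r in moves_r) for l in moves_l),
                    max(min(d[(delta[(q, l, r)], delta[(p, l, r)])]
                            for l in moves_l) for r in moves_r))
    return nd
```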

  39. A Probabilistic Systems Theory, for MARKOV DECISION PROCESSES: two states agree on the values of all quantitative fixpoint formulas iff their quantitative bisimilarity distance is 0.

  40. Great, BUT …
      1. The theory is too precise: even the smallest change in the probability of a transition can cause an arbitrarily large change in the value of a property.
      2. The theory is not computational: we cannot bound the rate of convergence for quantitative fixpoint formulas.

  41. Solution: Discounting. Economics: a dollar today is better than a dollar tomorrow. Value of $1 today: 1; tomorrow: α, for a discount factor 0 < α < 1; the day after tomorrow: α²; etc.

  42. Solution: Discounting. Economics: a dollar today is better than a dollar tomorrow (value 1 today, α tomorrow, α² the day after, for a discount factor 0 < α < 1). Engineering: a bug today is worse than a bug tomorrow.

  43. Discounted Reachability
      Reward(◊α c) = α^k if c is first true after k transitions,
                   = 0 if c is never true.
      The reward is proportional to how quickly c is satisfied.
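A one-liner makes the reward concrete: with α = 0.9 and c first true after two transitions, the reward is 0.9² = 0.81 (names illustrative):

```python
def discounted_reward(trace, alpha, pred):
    """Reward α^k if pred first holds at step k of the trace, else 0."""
    for k, state in enumerate(trace):
        if pred(state):
            return alpha ** k
    return 0.0

print(round(discounted_reward(["a", "b", "c"], 0.9, lambda s: s == "c"), 2))  # 0.81
```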

  44. Discounted Property ◊α c: on the graph with states a, b, c, the values are a: α², b: α, c: 1.

  45. Discounted Property ◊α c (values a: α², b: α, c: 1). Discounted fixpoint calculus: every pre(f) becomes α · pre(f).
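Replacing pre by α·pre turns the reachability fixpoint into a discounted one. A sketch on a guess at the slides' three-state graph, reproducing the values a: α², b: α, c: 1:

```python
def discounted_reach(succ, target, alpha, eps=1e-9):
    """Least fixpoint of X ↦ R ∨ α·∃pre(X), with ∨ = pointwise max."""
    r = {q: 1.0 if q in target else 0.0 for q in succ}
    x = dict(r)
    while True:
        nxt = {q: max(r[q], alpha * max((x[s] for s in succ[q]), default=0.0))
               for q in succ}
        if all(abs(nxt[q] - x[q]) < eps for q in succ):
            return nxt
        x = nxt

succ = {"a": {"b"}, "b": {"c"}, "c": {"c"}}
print(discounted_reach(succ, {"c"}, 0.9))  # a: 0.81 (α²), b: 0.9 (α), c: 1.0
```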

  46. Fully Quantitative Models
      Trace: sequence of observations.
      Property p: assigns a reward to each trace (real rewards, in [0,1] ⊆ ℝ).
      Model m: generates a set of traces ((game) graph).
      Value(p,m): defined from the rewards of the generated traces (sup or inf, sup inf).

  47. Discounted Bisimilarity
      d: Q² → [0,1] … pseudo-metric ("distance")
      F … function on pseudo-metrics
      F(d)(q,q′) = 1 if q and q′ disagree on observations,
                 = α · max of sup_l inf_r d(δ(q,l,r), δ(q′,l,r)) and sup_r inf_l d(δ(q,l,r), δ(q′,l,r)) otherwise
      Discounted bisimilarity … greatest fixpoint of F

  48. A Discounted Systems Theory, for STOCHASTIC GAMES: discounted ω-regular properties (class of winning rewards p over traces); the discounted fixpoint calculus (algorithm for computing Value(p,m) over models m); discounted bisimilarity (distance between models w.r.t. property values).

  49. A Discounted Systems Theory, for STOCHASTIC GAMES: discounted ω-regular properties, e.g. the max expected reward ◊α R achievable by the left player; the discounted fixpoint calculus, e.g. (μX)(R ∨ α · lpre(X)). Every discounted deterministic fixpoint formula f computes Value(p,m), where p is the linear interpretation of f.
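Discounting is what makes the theory computational: the map X ↦ R ∨ α·lpre(X) is a contraction with factor α, so after n iterations the error is at most α^n and the iteration can be stopped with an explicit bound. A sketch reusing qlpre from the slide-29 sketch:

```python
def discounted_game_value(delta, moves_l, moves_r, target, alpha, eps=1e-6):
    """Iterate X ↦ R ∨ α·lpre(X). Values live in [0,1], so after n
    steps the distance to the fixpoint is at most α^n; stop when α^n < eps."""
    states = {q for (q, _, _) in delta}
    r = {q: 1.0 if q in target else 0.0 for q in states}
    x = dict(r)
    n = 0
    while alpha ** n >= eps:
        pre = qlpre(delta, moves_l, moves_r, x)
        x = {q: max(r[q], alpha * pre[q]) for q in states}
        n += 1
    return x
```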

  50. A Discounted Systems Theory, for STOCHASTIC GAMES: the difference between two states in the values of discounted fixpoint formulas is bounded by their discounted bisimilarity distance.
