Issues on the border of economics and computation

Speaker: Dr. Michael Schapira

Topic: Dynamics in Games (Part III)

(Some slides from Prof. Yishay Mansour’s course at TAU)


Two Things

  • Ex1 to be published by Thu

    • submission deadline: 6.12.12, midnight

    • can submit in pairs

    • submit through Dr. Blumrosen’s mailbox

  • Debt from last class.


Reminder: Zero-Sum Games

            Left       Right
Left      (-1,1)     (1,-1)
Right      (1,-1)    (-1,1)

  • A zero-sum game is a 2-player strategic game such that for each s ∈ S we have u1(s) + u2(s) = 0.

    • What is good for me is bad for my opponent, and vice versa


Reminder: Minimax-Optimal Strategies

  • A (mixed) strategy s1* is minimax optimal for player 1 if

    min_{s2 ∈ S2} u1(s1*, s2) ≥ min_{s2 ∈ S2} u1(s1, s2) for all s1 ∈ S1

  • Similarly for player 2

  • Can be found via linear programming.
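To make the LP connection concrete, here is a minimal sketch (ours, not the course's code) that computes the value and a minimax-optimal row strategy with SciPy; the function name minimax_optimal is our own.

    # Minimal sketch: value + minimax-optimal row strategy via linear programming.
    import numpy as np
    from scipy.optimize import linprog

    def minimax_optimal(M):
        """Row player's value V and optimal mixed strategy p for payoff matrix M."""
        m, n = M.shape
        # Variables x = (p_1..p_m, v): maximize v s.t. (M^T p)_j >= v, sum(p) = 1.
        c = np.zeros(m + 1)
        c[-1] = -1.0                                  # linprog minimizes, so minimize -v
        A_ub = np.hstack([-M.T, np.ones((n, 1))])     # v - (M^T p)_j <= 0 for each column j
        b_ub = np.zeros(n)
        A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                      bounds=[(0, None)] * m + [(None, None)])
        return res.x[-1], res.x[:m]

    # The game above: value 0, optimal strategy (1/2, 1/2).
    print(minimax_optimal(np.array([[-1.0, 1.0], [1.0, -1.0]])))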


Reminder: Minimax Theorem

  • Every 2-player zero-sum game has a unique value V.

  • A minimax-optimal strategy for the row player R guarantees R an expected gain of at least V.

  • A minimax-optimal strategy for the column player C guarantees that R’s expected gain is at most V.


Algorithmic Implications

  • The minimax theorem is a useful tool in the analysis of randomized algorithms

  • Let’s see why.


Find Bill

  • There are n boxes; exactly one contains a dollar bill, and the rest are empty.

  • A probe is defined as opening a box to see if it contains the dollar bill.

  • The objective is to locate the box containing the dollar bill while minimizing the number of probes performed.

  • How well can a deterministic algorithm do?

  • Can we do better via a randomized algorithm?

    • i.e., an algorithm that is a probability distribution over deterministic algorithms


Randomized Find Alg

  • Randomized Find: select x ∈ {H,T} uniformly at random

    • if x = H, probe boxes in order from 1 through n and stop when the bill is found

    • otherwise, probe boxes in order from n through 1 and stop when the bill is found

  • The expected number of probes made by the algorithm is (n+1)/2.

    • if the dollar bill is in the ith box, then i probes are made with probability ½ and (n - i + 1) probes are made with probability ½.
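A quick way to sanity-check the (n+1)/2 claim is simulation; the following sketch is our illustration, not part of the course material.

    # Monte Carlo estimate of Randomized Find's expected number of probes.
    import random

    def randomized_find(n, bill):
        """Probe boxes 1..n in a uniformly random direction; return #probes."""
        order = range(1, n + 1) if random.random() < 0.5 else range(n, 0, -1)
        for probes, box in enumerate(order, start=1):
            if box == bill:
                return probes

    n, bill, trials = 100, 17, 100_000
    avg = sum(randomized_find(n, bill) for _ in range(trials)) / trials
    print(avg, (n + 1) / 2)   # both ~50.5, regardless of where the bill is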


Randomized Find is Optimal

  • Lemma: A lower bound on the expected number of probes required by any randomized algorithm to solve the Find-Bill problem is (n + 1)/2.

  • Proof via the minimax theorem!


The Algorithm Game

[Figure: payoff matrix with rows Input1 … Inputm, columns ALG1 … ALGn, entries T(ALG, I)]

  • Row player aims to choose malicious inputs;

  • Column player aims to choose efficient algorithms

  • Payoff for (I,ALG) is the running time of ALG on I



The Algorithm Game

  • Pure strategies:

    • specific input for row player

    • deterministic algorithm for column player

  • Mixed strategies:

    • distribution over inputs for row player

    • randomized algorithm for column player



The Algorithm Game

  • If I’m the column player what strategy (i.e., randomized algorithm) do I want to choose?



The Algorithm Game

  • What does the minimax theorem mean here?



Yao’s Principle

  • Let T(I, Alg) denote the time required for deterministic algorithm Alg to run on input I. Then

    max_{p on inputs} min_{Alg} E[T(I_p, Alg)] = min_{q on algs} max_{I} E[T(I, Alg_q)]

  • So, for any two probability distributions p and q:

    min_{det. Alg} E[T(I_p, Alg)] ≤ max_{I} E[T(I, Alg_q)]


Using Yao’s Principle

  • Useful technique for proving lower bounds on running times of randomized algorithms

  • Step I: design a probability distribution p over inputs for which every deterministic algorithm’s expected running time (on an input I_p drawn from p) is at least some bound a

  • Step II: deduce that every randomized algorithm’s expected running time on its worst-case input is at least a


Back to Find-Bill

  • Lemma: A lower bound on the expected number of probes required by any randomized algorithm to solve the Find-Bill problem is (n + 1)/2.

  • Proof:

    • Consider the distribution in which the bill is placed in one of the n boxes uniformly at random.

    • Consider only deterministic algorithms that do not probe the same box twice.

    • By symmetry we can assume that the probe order for a deterministic algorithm ALG is 1 through n.

    • The expected number of probes for ALG is Σ_{i=1..n} i/n = (n+1)/2

    • Yao’s principle implies the lower bound.
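For small n the argument can be verified exhaustively; this sketch (ours) enumerates all no-repeat deterministic algorithms, i.e. probe orders, and checks that each costs exactly (n+1)/2 expected probes against the uniform distribution.

    # Exhaustive check for small n: a deterministic no-repeat algorithm is
    # just a probe order (a permutation of the boxes).
    from itertools import permutations

    n = 6
    for order in permutations(range(1, n + 1)):
        expected = sum(order.index(bill) + 1 for bill in range(1, n + 1)) / n
        assert expected == (n + 1) / 2
    print("every deterministic algorithm: expected probes =", (n + 1) / 2)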


No Regret Algs: So far…

  • In some games (e.g., potential games), best-/better-response dynamics are guaranteed to converge to a PNE.

  • In 2-player zero-sum games no-regret dynamics converge to a NE.

  • What about general games?


Chicken Game

            Stop       Go
Stop      (0,0)      (-3,1)
Go        (1,-3)     (-4,-4)

[Illustration: each player mixing ½ Stop / ½ Go puts probability ¼ on each of the four strategy profiles]

What are the pure NEs?

What are the (mixed) NEs?


Correlated Equilibrium: Illustration

            Stop       Go
Stop      (0,0)      (-3,1)
Go        (1,-3)     (-4,-4)

[Distribution P: probability ½ on (Stop, Go), ½ on (Go, Stop), 0 elsewhere]

  • Suppose that there is a trusted random device that samples a pure strategy profile from a distribution P

  • … and tells each player his component of the strategy profile.

  • If all players other than i are following the strategy suggested by the random device, then i does not have any incentive to deviate.


Correlated Equilibrium: Illustration

            Stop       Go
Stop      (0,0)      (-3,1)
Go        (1,-3)     (-4,-4)

[Distribution P: probability ⅓ on each of (Stop, Stop), (Stop, Go), (Go, Stop), and 0 on (Go, Go)]



Correlated Equilibrium

  • Consider a game:

    • Si is the set of (pure) strategies for player i

      • S = S1 × S2 × … × Sn

    • s = (s1, s2, …, sn) ∈ S is a vector of strategies

    • ui: S → R is the payoff function for player i.

  • Notation: given a strategy vector s, let s−i = (s1, …, si−1, si+1, …, sn)

    • the vector s with the i-th element omitted


Correlated Equilibrium

A correlated equilibrium is a probability distribution p over (pure) strategy profiles in S such that for any i, si, si′:

Σ_{s−i} p(si, s−i) ui(si, s−i) ≥ Σ_{s−i} p(si, s−i) ui(si′, s−i)
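The definition translates directly into code; this sketch (our illustration, with our own names Stop = 0, Go = 1) checks the CE inequalities for the Chicken distributions shown above.

    # Checking the CE inequalities for Chicken.
    STOP, GO = 0, 1
    u = {  # u[(s1, s2)] = (payoff to player 1, payoff to player 2)
        (STOP, STOP): (0, 0), (STOP, GO): (-3, 1),
        (GO, STOP): (1, -3), (GO, GO): (-4, -4),
    }

    def is_ce(p):
        """p maps strategy profiles to probabilities; verify every CE constraint."""
        for i in (0, 1):                      # player
            for si in (STOP, GO):             # suggested strategy
                for si2 in (STOP, GO):        # candidate deviation
                    lhs = rhs = 0.0
                    for so in (STOP, GO):     # the other player's strategy
                        s = (si, so) if i == 0 else (so, si)
                        dev = (si2, so) if i == 0 else (so, si2)
                        lhs += p[s] * u[s][i]
                        rhs += p[s] * u[dev][i]
                    if lhs < rhs - 1e-9:
                        return False
        return True

    print(is_ce({(STOP, STOP): 0, (STOP, GO): 0.5, (GO, STOP): 0.5, (GO, GO): 0}))    # True
    print(is_ce({(STOP, STOP): 1/3, (STOP, GO): 1/3, (GO, STOP): 1/3, (GO, GO): 0}))  # True
    print(is_ce({(STOP, STOP): 0, (STOP, GO): 0, (GO, STOP): 0, (GO, GO): 1}))        # False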


Facts About Correlated Equilibrium

  • CE always exists

    • why?

  • The set of CE is convex

    • what about NE?

    • the CEs are exactly the solutions to a system of linear inequalities

  • CE can be computed in an efficient manner (e.g., via linear programming)
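Since the CE constraints are linear in p, finding a CE, even one optimizing a linear objective, is a single LP. Here is a sketch assuming SciPy; the profile ordering and names are ours. Maximizing the total payoff in Chicken recovers the (⅓, ⅓, ⅓, 0) distribution from the illustration above.

    # Finding the CE of Chicken that maximizes total payoff via one LP.
    import numpy as np
    from scipy.optimize import linprog

    # Profile order: (Stop,Stop), (Stop,Go), (Go,Stop), (Go,Go)
    u1 = np.array([0, -3, 1, -4])
    u2 = np.array([0, 1, -3, -4])

    A_ub = [
        # player 1: told Stop, deviating to Go; told Go, deviating to Stop
        [-(u1[0] - u1[2]), -(u1[1] - u1[3]), 0, 0],
        [0, 0, -(u1[2] - u1[0]), -(u1[3] - u1[1])],
        # player 2: told Stop, deviating to Go; told Go, deviating to Stop
        [-(u2[0] - u2[1]), 0, -(u2[2] - u2[3]), 0],
        [0, -(u2[1] - u2[0]), 0, -(u2[3] - u2[2])],
    ]
    res = linprog(c=-(u1 + u2), A_ub=A_ub, b_ub=[0, 0, 0, 0],
                  A_eq=[[1, 1, 1, 1]], b_eq=[1], bounds=[(0, 1)] * 4)
    print(res.x)   # -> [1/3, 1/3, 1/3, 0], the distribution illustrated above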


Moreover…

  • When every player uses a no-regret algorithm to select strategies, the dynamics converge to a CE

    • in any game!

  • But this requires a stronger definition of no-regret…


Types of No-Regret Algs

  • No external regret: Do (nearly) as well as best strategy in hindsight

    • what we’ve been talking about so far

    • I should have always taken the same route to work…

  • No internal regret: the algorithm could not have gained (in hindsight) by consistently substituting a single strategy with another

    • each time strategy si was chosen, substitute si′

    • each time I bought Microsoft stock I should have bought Google stock instead

  • No internal regret implies no external regret

    • why?


Reminder: Minimizing Regret

  • There are n actions (experts) 1, 2, …, n

  • At each round t = 1, 2, …, T:

    • the algorithm selects an action in {1, …, n}

    • and then observes the gain gi,t ∈ [0,1] of each action i ∈ {1, …, n}

  • Let gi = Σt gi,t and gmax = maxi gi

  • No external regret: Do (at least) “nearly as well” as gmax in hindsight.
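The slides do not fix a particular no-external-regret algorithm; as one concrete instance, here is a minimal sketch (ours) of Hedge (multiplicative weights). With eta = √(ln n / T), its expected gain is at least gmax − O(√(T log n)).

    # Hedge / multiplicative weights: one standard no-external-regret algorithm.
    import math, random

    class Hedge:
        def __init__(self, n, eta):
            self.weights = [1.0] * n
            self.eta = eta

        def distribution(self):
            total = sum(self.weights)
            return [w / total for w in self.weights]

        def select(self):
            return random.choices(range(len(self.weights)), self.distribution())[0]

        def update(self, gains):   # gains[i] in [0, 1] for each action i
            self.weights = [w * math.exp(self.eta * g)
                            for w, g in zip(self.weights, gains)]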


Internal Regret

  • Assume the algorithm outputs the action sequence A = a1 … aT

  • The action sequence A(b → d):

    • change every at = b to at = d in A

    • g(b→d) is the gain of A(b → d) (with the same gains gi,t)

  • Internal regret:

    max{b,d} (g(b→d) − galg) = max{b,d} Σt pb,t (gd,t − gb,t)

  • An algorithm has no internal regret if its (time-averaged) internal regret goes to 0 as T → ∞
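For a deterministic play sequence the definition can be evaluated directly (pb,t is then 0/1); a small sketch, with our own function name:

    # Internal regret of a deterministic action sequence, from the definition.
    def internal_regret(actions, gains):
        """actions[t] in {0..n-1}; gains[t][i] = gain of action i at round t."""
        n = len(gains[0])
        g_alg = sum(gains[t][a] for t, a in enumerate(actions))
        best = 0.0
        for b in range(n):
            for d in range(n):
                g_swapped = sum(gains[t][d if a == b else a]   # gain of A(b -> d)
                                for t, a in enumerate(actions))
                best = max(best, g_swapped - g_alg)
        return best

    # Always playing action 0 while action 1 was consistently better:
    print(internal_regret([0, 0, 0], [[0.2, 1.0], [0.1, 0.9], [0.3, 1.0]]))  # 2.3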


Internal Regret and Dominated Strategies

  • Suppose that a player uses a no-internal-regret algorithm to select strategies

    • in a repeated game against others

  • What guarantees does the player have?

    • beyond the no-regret guarantee


Dominated Strategies

  • Strategy si is dominated by a (mixed) strategy si′ if for every s−i we have ui(si, s−i) < ui(si′, s−i)

  • Clearly, we would like to avoid choosing dominated strategies



Internal Regret and Dominated Strategies

  • si is dominated by si′

    • every time we played si, we would have done better with si′

  • Recall the definition of internal regret

    • swapping one pair of strategies

  • No internal regret ⇒ no dominated strategies (in the long run)


Does a No-Internal-Regret Alg Exist?

  • Yes!

  • In fact, there exist algorithms with a stronger guarantee: no swap regret.

    • no swap regret: alg cannot benefit in hindsight by changing action i to F(i), for any F: {1,…,n} → {1,…,n}

  • We show a generic reduction that turns no-external-regret algorithms into a no-swap-regret algorithm


External to Swap Regret

  • Our algorithm utilizes no-external-regret algorithms to achieve no swap regret:

    • n no-external-regret algorithms Alg1, …, Algn

      • intuitively, each algorithm represents a strategy in {1,…,n}

    • for algorithm Algi and any sequence of gain vectors: gAlgi ≥ gmax − Ri



External to Swap Regret

[Diagram: each Algi outputs a distribution qi; together these form a matrix Q, from which the master algorithm derives its distribution p]

  • At time t:

    • each Algi outputs a distribution qi

      • these induce a matrix Q (row i is qi)

    • our algorithm uses Q to decide on a distribution p over the strategies {1,…,n}

    • the adversary decides on a gain vector g = (g1, …, gn)

    • our algorithm returns to each Algi some gain vector




Combining the No-External-Regret Algs

  • Approach I:

    • select an expert Algi with probability ri

    • let the “selected” expert decide the outcome

    • resulting strategy distribution: p = Qr

  • Approach II:

    • directly decide on p

  • Our approach: make p = r

    • find a p such that p = Qp

    • such a p always exists: Q is a stochastic matrix, so p is a stationary distribution of the Markov chain it defines
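One way to find such a p is power iteration, since p = Qp in the slides' notation (pj = Σi pi qi,j) says p is a stationary distribution of the Markov chain with transition matrix Q. A sketch (ours) assuming NumPy; for a periodic Q one would instead solve the eigenproblem or average the iterates.

    # Finding p with p = Qp by power iteration.
    import numpy as np

    def stationary(Q, iters=1000):
        p = np.full(Q.shape[0], 1.0 / Q.shape[0])
        for _ in range(iters):
            p = p @ Q          # p_j <- sum_i p_i * Q[i, j]
        return p

    Q = np.array([[0.9, 0.1], [0.5, 0.5]])   # toy stochastic matrix
    p = stationary(Q)
    print(p, p @ Q)                           # both ~[0.833, 0.167]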



Distributing Gain

  • Adversary selects gains g = (g1, …, gn)

  • Return to Algi the gain vector pi·g

    • Note: Σi pi·g = g (since Σi pi = 1)



External to Swap Regret


  • At time t:

    • each Algi outputs a distribution qi

      • these induce a matrix Q

    • output a distribution p such that p = Qp

      • pj = Σi pi qi,j

    • observe gains g = (g1, …, gn)

    • return to Algi the gain vector pi·g

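Putting the pieces together, here is a sketch of the whole reduction; it reuses the Hedge and stationary() helpers sketched earlier (both our illustrative choices, not code from the course).

    # The full external-to-swap-regret reduction: one Hedge copy per action.
    import numpy as np

    class SwapRegretAlg:
        def __init__(self, n, eta):
            self.algs = [Hedge(n, eta) for _ in range(n)]
            self.p = np.full(n, 1.0 / n)

        def distribution(self):
            # row i of Q is Alg_i's distribution; play a p solving p = Qp
            Q = np.array([alg.distribution() for alg in self.algs])
            self.p = stationary(Q)
            return self.p

        def update(self, gains):   # gains[j] in [0, 1]
            # hand Alg_i the scaled gain vector p_i * g, as on this slide
            for i, alg in enumerate(self.algs):
                alg.update([self.p[i] * g for g in gains])

If each copy guarantees external regret at most Ri, the analysis on the next slide shows the combined algorithm's swap regret is at most Σi Ri.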


External to Swap Regret

  • Gain of Algi (from its own viewpoint) at round t:

    ⟨qi,t, pi,t·gt⟩ = pi,t ⟨qi,t, gt⟩

  • No-external-regret guarantee: for every fixed action j,

    gAlgi = Σt pi,t ⟨qi,t, gt⟩ ≥ Σt pi,t gj,t − Ri

  • For any swap function F:

    gAlg = Σt ⟨pt, gt⟩ = Σt ⟨pt Qt, gt⟩ = Σt Σi pi,t ⟨qi,t, gt⟩ = Σi gAlgi ≥ Σi (Σt pi,t gF(i),t − Ri) = gAlg,F − Σi Ri


Swap Regret

Corollary: for every swap function F, gAlg ≥ gAlg,F − Σi Ri, i.e. the swap regret is at most Σi Ri (= O(n·√(T log n)) with standard no-external-regret algorithms).

Can be improved to: O(√(nT log n)).


Summary

  • The Minimax Theorem is a useful tool for analyzing randomized algorithms

    • Yao’s Principle

  • There exist no-swap-regret algorithms

  • Next time: when all players use no-swap-regret algorithms to select strategies, the dynamics converge to a CE

