Issues on the border of economics and computation
Issues on the border of economics and computation

Speaker: Dr. Michael Schapira

Topic: Dynamics in Games (Part III)

(Some slides from Prof. Yishay Mansour's course at TAU)


Two Things

  • Ex1 to be published by Thu

    • submission deadline: 6.12.12, midnight

    • can submit in pairs

    • submit through Dr. Blumrosen’s mailbox

  • Debt from last class.


Reminder: Zero-Sum Games

              Left        Right
  Left      (-1, 1)     (1, -1)
  Right     (1, -1)     (-1, 1)

  • A zero-sum game is a 2-player strategic game such that for each s ∈ S, we have u1(s) + u2(s) = 0.

    • What is good for me is bad for my opponent, and vice versa


Reminder: Minimax-Optimal Strategies

  • A (mixed) strategy s1* is minimax optimal for player 1 if

    min_{s2 ∈ S2} u1(s1*, s2) ≥ min_{s2 ∈ S2} u1(s1, s2) for all s1 ∈ S1

  • Similarly for player 2

  • Can be found via linear programming (see the sketch below).
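As an illustration, here is a minimal sketch (not from the slides) of that LP: it computes a minimax-optimal strategy for the row player of the matching-pennies game above with scipy.optimize.linprog. The matrix U and all variable names are assumptions made for the example.

```python
# A minimal sketch (not from the slides): minimax-optimal strategy for the
# row player of a zero-sum game via linear programming, using scipy.
import numpy as np
from scipy.optimize import linprog

U = np.array([[-1.0, 1.0],
              [1.0, -1.0]])   # U[i, j] = row player's payoff u1(i, j)
m, n = U.shape

# Variables: (x_1, ..., x_m, v). Maximize v subject to
#   sum_i x_i * U[i, j] >= v  for every column j   (guaranteed expected gain)
#   sum_i x_i = 1, x_i >= 0.
# linprog minimizes, so we minimize -v.
c = np.concatenate([np.zeros(m), [-1.0]])
A_ub = np.hstack([-U.T, np.ones((n, 1))])        # v - sum_i x_i U[i,j] <= 0
b_ub = np.zeros(n)
A_eq = np.concatenate([np.ones(m), [0.0]]).reshape(1, -1)
b_eq = [1.0]
bounds = [(0, 1)] * m + [(None, None)]           # v may be negative

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
x, v = res.x[:m], res.x[m]
print(x, v)   # for matching pennies: x ≈ (1/2, 1/2), value v ≈ 0
```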


Reminder: Minimax Theorem

  • Every 2-player zero-sum game has a unique value V.

  • A minimax-optimal strategy for R (the row player) guarantees R an expected gain of at least V.

  • A minimax-optimal strategy for C (the column player) guarantees R an expected gain of at most V.


Algorithmic Implications

  • The minimax theorem is a useful tool in the analysis of randomized algorithms

  • Let’s see why.


Find Bill

  • There are n boxes; exactly one contains a dollar bill, and the rest are empty.

  • A probe is defined as opening a box to see if it contains the dollar bill.

  • The objective is to locate the box containing the dollar bill while minimizing the number of probes performed.

  • How well can a deterministic algorithm do?

  • Can we do better via a randomized algorithm?

    • i.e., an algorithm that is a probability distribution over deterministic algorithms


Randomized Find Alg

  • Randomized Find: select x in {H, T} uniformly at random

    • if x = H, probe boxes in order from 1 through n and stop when the bill is found

    • otherwise, probe boxes in order from n through 1 and stop when the bill is found

  • The expected number of probes made by the algorithm is (n+1)/2 (see the simulation sketch below).

    • if the dollar bill is in the i-th box, then i probes are made with probability ½ and (n − i + 1) probes are made with probability ½.
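To make the (n+1)/2 count concrete, here is a minimal simulation sketch (not from the slides); all names are illustrative.

```python
# A minimal sketch (not from the slides): simulating Randomized Find to
# check the (n+1)/2 expected-probe count.
import random

def randomized_find(n: int, bill: int) -> int:
    """Return the number of probes until the bill (in box `bill`) is found."""
    order = range(1, n + 1) if random.random() < 0.5 else range(n, 0, -1)
    for probes, box in enumerate(order, start=1):
        if box == bill:
            return probes
    raise AssertionError("bill must be in one of the boxes")

n, trials = 10, 100_000
avg = sum(randomized_find(n, random.randint(1, n)) for _ in range(trials)) / trials
print(avg, (n + 1) / 2)   # the two numbers should be close
```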


Randomized Find is Optimal

  • Lemma: A lower bound on the expected number of probes required by any randomized algorithm to solve the Find-Bill problem is (n + 1)/2.

  • Proof via the minimax theorem!


The Algorithm Game

  • Row player aims to choose malicious inputs;

  • Column player aims to choose efficient algorithms

  • Payoff for (I, ALG) is the running time of ALG on I

  [Payoff matrix: rows Input1, …, Inputm; columns ALG1, …, ALGn; entry T(Alg, I)]


The Algorithm Game

  • Pure strategies:

    • a specific input for the row player

    • a deterministic algorithm for the column player

  • Mixed strategies:

    • a distribution over inputs for the row player

    • a randomized algorithm for the column player

  [Same payoff matrix as above]


The Algorithm Game

  • If I'm the column player, what strategy (i.e., randomized algorithm) do I want to choose?

  [Same payoff matrix as above]


The Algorithm Game

  • What does the minimax theorem mean here?

  [Same payoff matrix as above]


Yao's Principle

  • Let T(I, Alg) denote the time required for deterministic algorithm Alg to run on input I. Then

    max_p min_Alg E[T(I_p, Alg)] = min_q max_I E[T(I, Alg_q)]

    where p ranges over distributions on inputs (I_p is an input drawn from p) and q ranges over distributions on deterministic algorithms (Alg_q is a randomized algorithm).

  • So, for any two probability distributions p and q:

    min_{det. Alg} E[T(I_p, Alg)] ≤ max_I E[T(I, Alg_q)]


Using Yao's Principle

  • A useful technique for proving lower bounds on the running times of randomized algorithms

  • Step I: design a probability distribution p over inputs for which every deterministic algorithm's expected running time is at least a

  • Step II: deduce that every randomized algorithm's (expected) running time is at least a


Back to Find-Bill

  • Lemma: A lower bound on the expected number of probes required by any randomized algorithm to solve the Find-Bill problem is (n + 1)/2.

  • Proof:

    • Consider the distribution under which the bill is located in each of the n boxes uniformly at random.

    • Consider only deterministic algorithms that do not probe the same box twice.

    • By symmetry, we can assume that the probe order for a deterministic algorithm ALG is 1 through n.

    • The expected number of probes for ALG is Σ_i i/n = (n+1)/2.

    • Yao's principle implies the lower bound.


No Regret Algs: So Far…

  • In some games (e.g., potential games), best-/better-response dynamics are guaranteed to converge to a PNE.

  • In 2-player zero-sum games no-regret dynamics converge to a NE.

  • What about general games?


Chicken Game

              Stop        Go
  Stop      (0, 0)     (-3, 1)
  Go        (1, -3)    (-4, -4)

  [Illustrated on the slide: each player mixes ½ Stop / ½ Go, so each of the four profiles has probability ¼]

  • What are the pure NEs?

  • What are the (mixed) NEs?


Correlated Equilibrium: Illustration

              Stop        Go
  Stop      (0, 0)     (-3, 1)
  Go        (1, -3)    (-4, -4)

  [Distribution P: ½ on (Stop, Go), ½ on (Go, Stop), 0 elsewhere]

  • Suppose that there is a trusted random device that samples a pure strategy profile from a distribution P

  • … and tells each player his component of the strategy profile.

  • If all players other than i are following the strategy suggested by the random device, then i does not have any incentive to deviate.


Correlated Equilibrium: Illustration

              Stop        Go
  Stop      (0, 0)     (-3, 1)
  Go        (1, -3)    (-4, -4)

  [Distribution P: ⅓ on (Stop, Stop), ⅓ on (Stop, Go), ⅓ on (Go, Stop), 0 on (Go, Go)]

  • Suppose that there is a trusted random device that samples a pure strategy profile from a distribution P

  • … and tells each player his component of the strategy profile.

  • If all players other than i are following the strategy suggested by the random device, then i does not have any incentive to deviate.


Correlated Equilibrium

  • Consider a game:

    • Si is the set of (pure) strategies for player i

      • S = S1 × S2 × … × Sn

    • s = (s1, s2, …, sn) ∈ S is a vector of strategies

    • ui: S → R is the payoff function for player i.

  • Notation: given a strategy vector s, let s-i = (s1, …, si-1, si+1, …, sn)

    • the vector s with the i-th element omitted


Correlated Equilibrium

A correlated equilibrium is a probability distribution p over (pure) strategy profiles in S such that for any i, si, si':

  Σ_{s-i} p(si, s-i) · ui(si, s-i) ≥ Σ_{s-i} p(si, s-i) · ui(si', s-i)


Facts About Correlated Equilibrium

  • A CE always exists

    • why?

  • The set of CEs is convex

    • what about NEs?

    • CEs are exactly the solutions to a set of linear inequalities

  • A CE can be computed efficiently, e.g., via linear programming (see the sketch below)
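To make the LP concrete, here is a minimal sketch (not from the slides) that computes a welfare-maximizing CE of the Chicken game above with scipy.optimize.linprog; the profile ordering and all names are assumptions made for the example.

```python
# A minimal sketch (not from the slides): a welfare-maximizing correlated
# equilibrium of the Chicken game above, via linear programming with scipy.
# Profile order (an assumption): (Stop,Stop), (Stop,Go), (Go,Stop), (Go,Go).
import numpy as np
from scipy.optimize import linprog

u_row = np.array([0.0, -3.0, 1.0, -4.0])   # row player's payoffs
u_col = np.array([0.0, 1.0, -3.0, -4.0])   # column player's payoffs

# CE constraints: for each player, each recommended strategy, and each
# deviation, the expected gain from deviating must be <= 0.
A_ub = np.array([
    [1, -1, 0, 0],    # row player, recommended Stop, deviation Go
    [0, 0, -1, 1],    # row player, recommended Go, deviation Stop
    [1, 0, -1, 0],    # column player, recommended Stop, deviation Go
    [0, -1, 0, 1],    # column player, recommended Go, deviation Stop
])
b_ub = np.zeros(4)
A_eq = np.ones((1, 4))                      # p is a probability distribution
b_eq = [1.0]

c = -(u_row + u_col)                        # maximize social welfare
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=[(0, 1)] * 4)
print(res.x)   # should recover the (1/3, 1/3, 1/3, 0) CE from the illustration
```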


Moreover…

  • When every player uses a no-regret algorithm to select strategies, the dynamics converge to a CE

    • in any game!

  • But this requires a stronger definition of no-regret…


Types of No-Regret Algs

  • No external regret: Do (nearly) as well as best strategy in hindsight

    • what we’ve been talking about so far

    • I should have always taken the same route to work…

  • No internal regret: the Alg could not gain (in hindsight) by substituting a single strategy with another (consistently)

    • each time strategy si was chosen, substitute it with si'

    • each time I bought Microsoft stock, I should have bought Google stock instead

  • No internal regret implies no external regret

    • why?


Reminder: Minimizing Regret

  • There are n actions (experts) 1, 2, …, n

  • At each round t = 1, 2, …, T:

    • the algorithm selects an action in {1, …, n}

    • … and then observes the gain gi,t ∈ [0,1] of each action i ∈ {1, …, n}

  • Let gi = Σt gi,t, and let gmax = maxi gi

  • No external regret: do (at least) "nearly as well" as gmax in hindsight (see the sketch below).
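For concreteness, here is a minimal sketch (not from the slides) of one standard no-external-regret algorithm, multiplicative weights (Hedge); the learning rate eta and all names are assumptions, and gains are assumed to lie in [0, 1].

```python
# A minimal sketch (not from the slides) of the multiplicative-weights
# (Hedge) algorithm, which has no external regret for gains in [0, 1].
import math
import random

def hedge(gains, eta=0.1):
    """gains: list of T rounds, each a list of n gains in [0, 1].
    Returns the algorithm's total expected gain."""
    n = len(gains[0])
    weights = [1.0] * n
    total = 0.0
    for g in gains:
        z = sum(weights)
        probs = [w / z for w in weights]                 # play action i w.p. probs[i]
        total += sum(p * gi for p, gi in zip(probs, g))  # expected gain this round
        # Reward actions that did well: w_i <- w_i * e^(eta * g_i)
        weights = [w * math.exp(eta * gi) for w, gi in zip(weights, g)]
    return total

# Usage: random gains for n = 3 actions over T = 1000 rounds.
T, n = 1000, 3
gains = [[random.random() for _ in range(n)] for _ in range(T)]
g_max = max(sum(g[i] for g in gains) for i in range(n))
print(hedge(gains), g_max)   # hedge's gain should be close to g_max
```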


Internal Regret

  • Assume that the alg outputs the action sequence A = a1 … aT

  • The action sequence A(b → d):

    • change every at = b to at = d in A

    • g(b→d) is the gain of A(b → d) (for the same gains gi,t)

  • Internal regret:

    max{b,d} g(b→d) − galg = max{b,d} Σt (gd,t − gb,t) pb,t

    where pb,t is the probability the algorithm assigns to action b at time t

  • An algorithm has no internal regret if its internal regret goes to 0 as T goes to infinity


Internal Regret and Dominated Strategies

  • Suppose that a player uses a no-internal-regret algorithm to select strategies

    • in a repeated game against others

  • What guarantees does the player have?

    • beyond the no-regret guarantee


Dominated Strategies

  • Strategy si is dominated by a (mixed) strategy si' if for every s-i we have that ui(si, s-i) < ui(si', s-i)

  • Clearly, we would like to avoid choosing dominated strategies


Internal Regret and Dominated Strategies

  • si is dominated by si'

    • every time we played si, we would have done better with si'

  • Recall the definition of internal regret

    • swapping the pair of strategies

  • No internal regret ⟹ no dominated strategies (asymptotically, the fraction of rounds in which a dominated strategy is played vanishes)


Does a No-Internal-Regret Alg Exist?

  • Yes!

  • In fact, there exist algorithms with a stronger guarantee: no swap regret.

    • no swap regret: the alg cannot benefit in hindsight by changing action i to F(i), for any F: {1, …, n} → {1, …, n}

  • We show a generic reduction that turns no-external-regret algorithms into a no-swap-regret algorithm


External to Swap Regret

  • Our algorithm utilizes no-external-regret algorithms to achieve no internal regret:

    • n no-external-regret algorithms Alg1, …, Algn

      • intuitively, each algorithm represents a strategy in {1, …, n}

    • for algorithm Algi, and for any sequence of gain vectors: gAlgi > gmax − Ri

External to Swap Regret

  • At time t:

    • each Algi outputs a distribution qi

      • together these induce a matrix Q (row i is qi)

    • our algorithm uses Q to decide on a distribution p over the strategies {1, …, n}

    • the adversary decides on a gains vector g = (g1, …, gn)

    • our algorithm returns to each Algi some gains vector

  [Figure: the master algorithm mediates between Alg1, …, Algn (receiving q1, …, qn) and the adversary (outputting p)]

Combining the No-External-Regret Algs

  • Approach I:

    • select an expert Algi with probability ri

    • let the "selected" expert decide the outcome p

    • strategy distribution p = Qr

  • Approach II:

    • directly decide on p.

  • Our approach: make p = r

    • find a p such that p = Qp (such a p always exists since Q is a stochastic matrix; see the sketch below)
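Here is a minimal sketch (not from the slides) of computing such a p by power iteration; the convention below takes Q's rows to be the qi's, so p is a stationary distribution of the Markov chain with transition matrix Q. The example matrix is an arbitrary assumption.

```python
# A minimal sketch (not from the slides): finding the fixed point p = Qp by
# power iteration. Rows of Q are the experts' distributions q_i, so in this
# convention p is a stationary distribution: p_j = sum_i p_i * Q[i, j].
import numpy as np

def stationary(Q: np.ndarray, iters: int = 1000) -> np.ndarray:
    n = Q.shape[0]
    p = np.full(n, 1.0 / n)          # start from the uniform distribution
    for _ in range(iters):
        p = p @ Q                    # one Markov-chain step
    return p

# Usage with an arbitrary 3x3 stochastic matrix (rows sum to 1):
Q = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.8, 0.1],
              [0.4, 0.4, 0.2]])
p = stationary(Q)
print(p, p @ Q)   # the two should (approximately) coincide
```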


Distributing Gain

  • The adversary selects gains g = (g1, …, gn)

  • Return to Algi the gain vector pi·g

    • Note: Σi pi·g = g


External to Swap Regret

  • At time t:

    • each Algi outputs a distribution qi

      • inducing a matrix Q

    • output a distribution p such that p = Qp

      • pj = Σi pi qi,j

    • observe gains g = (g1, …, gn)

    • return to Algi the gain vector pi·g
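Putting the pieces together, here is a minimal end-to-end sketch of one round of this reduction (not from the slides; all names are illustrative), reusing multiplicative-weights experts and the power-iteration fixed point from the earlier sketches.

```python
# A minimal sketch (not from the slides): the external-to-swap-regret
# reduction. It keeps n multiplicative-weights experts (one per strategy),
# plays the fixed point p = Qp, and hands expert i the scaled gains p_i * g.
import numpy as np

class SwapRegretAlg:
    def __init__(self, n: int, eta: float = 0.1):
        self.n = n
        self.eta = eta
        self.weights = np.ones((n, n))   # weights[i] drives expert Alg_i

    def play(self) -> np.ndarray:
        # Each expert i outputs q_i; stack them into the stochastic matrix Q.
        Q = self.weights / self.weights.sum(axis=1, keepdims=True)
        p = np.full(self.n, 1.0 / self.n)
        for _ in range(500):             # power iteration for p = Qp
            p = p @ Q
        self._p = p
        return p                         # distribution over strategies

    def update(self, g: np.ndarray) -> None:
        # Give expert i the scaled gain vector p_i * g, then do an MW update.
        for i in range(self.n):
            self.weights[i] *= np.exp(self.eta * self._p[i] * g)

# Usage: gains drawn at random for n = 4 strategies over T = 200 rounds.
rng = np.random.default_rng(0)
alg = SwapRegretAlg(n=4)
for _ in range(200):
    p = alg.play()
    alg.update(rng.random(4))            # adversary's gain vector in [0,1]^n
print(p)
```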


External to Swap Regret

  • Gain of Algi (from its view) at round t:

    ⟨qi,t, pi,t·gt⟩ = pi,t ⟨qi,t, gt⟩

  • No-external-regret guarantee (for every fixed strategy j):

    gAlgi = Σt pi,t ⟨qi,t, gt⟩ > Σt pi,t gj,t − Ri

  • For any swap function F:

    gAlg = Σt ⟨pt, gt⟩ = Σt ⟨pt Qt, gt⟩ = Σt Σi pi,t ⟨qi,t, gt⟩ = Σi gAlgi > Σi Σt pi,t gF(i),t − Ri = gAlg,F − Σi Ri


Swap Regret

  • Corollary: the combined algorithm's swap regret is at most Σi Ri; with multiplicative-weights experts, this is n·O(√(T log n)).

  • Can be improved to O(√(nT log n)).


Summary

  • The Minimax Theorem is a useful tool for analyzing randomized algorithms

    • Yao’s Principle

  • There exist no-swap-regret algorithms

  • Next time: When all players use no-swap-regret algorithms to select strategies the dynamics converge to a CE