
### Issues on the border of economics and computation (נושאים בגבול כלכלה וחישוב)

Speaker: Dr. Michael Schapira

Topic: Dynamics in Games (Part III)

(Some slides from Prof. Yishay Mansour's course at TAU)

Two Things

- Ex1 to be published by Thu
- submission deadline: 6.12.12, midnight
- can submit in pairs
- submit through Dr. Blumrosen’s mailbox

- Debt from last class.

|       | Left   | Right  |
|-------|--------|--------|
| Left  | (1,-1) | (-1,1) |
| Right | (-1,1) | (1,-1) |

Reminder: Zero-Sum Games

- A zero-sum game is a 2-player strategic game such that for each s ∈ S, we have u1(s) + u2(s) = 0.
- What is good for me is bad for my opponent, and vice versa.

Reminder: Minimax-Optimal Strategies

- A (mixed) strategy s1* is minimax optimal for player 1 if
  min_{s2 ∈ S2} u1(s1*, s2) ≥ min_{s2 ∈ S2} u1(s1, s2) for all s1 ∈ S1

- Similar for player 2
- Can be found via linear programming.

Reminder: Minimax Theorem

- Every 2-player zero-sum game has a unique value V.
- A minimax optimal strategy for R guarantees R's expected gain is at least V.
- A minimax optimal strategy for C guarantees R's expected gain is at most V.
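As a sanity check, here is a minimal sketch (not from the slides) that approximates the minimax value of a 2x2 zero-sum game by grid search over the row player's mixed strategy; real solvers use linear programming, and the matching pennies payoffs are an assumed example.

```python
# Sketch (not from the slides): approximate the minimax value of a
# 2x2 zero-sum game by grid search over the row player's mixed
# strategy. Real solvers use linear programming instead.

def minimax_value_2x2(u1, steps=10000):
    """u1[i][j] = row player's payoff; returns (value, Pr[row 0])."""
    best_value, best_p = float("-inf"), 0.0
    for k in range(steps + 1):
        p = k / steps  # probability of playing row 0
        # The column player best-responds with a pure strategy.
        guaranteed = min(p * u1[0][j] + (1 - p) * u1[1][j] for j in range(2))
        if guaranteed > best_value:
            best_value, best_p = guaranteed, p
    return best_value, best_p

# Matching pennies (assumed example): value 0, optimal mix (1/2, 1/2).
value, p_star = minimax_value_2x2([[1, -1], [-1, 1]])
```

For matching pennies the guaranteed gain -|2p - 1| is maximized at p = 1/2, giving value 0.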

Algorithmic Implications

- The minimax theorem is a useful tool in the analysis of randomized algorithms
- Let’s see why.

Find Bill

- There are n boxes and exactly one box contains a dollar bill, and the rest of the boxes are empty.
- A probe is defined as opening a box to see if it contains the dollar bill.
- The objective is to locate the box containing the dollar bill while minimizing the number of probes performed.
- How well can a deterministic algorithm do?
- Can we do better via a randomized algorithm?
- i.e., an algorithm that is a probability distribution over deterministic algorithms

Randomized Find Alg

- Randomized Find: select x in {H, T} uniformly at random
  - if x = H, probe boxes in order from 1 through n and stop if the bill is found
  - otherwise, probe boxes in order from n through 1 and stop if the bill is found

- The expected number of probes made by the algorithm is (n+1)/2.
- if the dollar bill is in the ith box, then i probes are made with probability ½ and (n - i + 1) probes are made with probability ½.
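The algorithm above can be sketched as follows; the function names are illustrative, not from the slides.

```python
import random

# Sketch of the Randomized Find algorithm: flip a fair coin, then scan
# the boxes left-to-right or right-to-left until the bill is found.

def randomized_find(boxes):
    """boxes: 0/1 list with exactly one 1. Returns the number of probes."""
    n = len(boxes)
    # Coin flip: scan left-to-right on heads, right-to-left on tails.
    order = range(n) if random.random() < 0.5 else reversed(range(n))
    for probes, i in enumerate(order, start=1):
        if boxes[i] == 1:
            return probes

def expected_probes(n, i):
    # Bill in box i (1-indexed): i probes with probability 1/2,
    # n - i + 1 probes with probability 1/2.
    return 0.5 * i + 0.5 * (n - i + 1)
```

For every bill position i, expected_probes(n, i) equals (n + 1) / 2, matching the claim above.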

Randomized Find is Optimal

- Lemma: A lower bound on the expected number of probes required by any randomized algorithm to solve the Find-Bill problem is (n + 1)/2.
- Proof via the minimax theorem!

The Algorithm Game

[Payoff matrix: rows Input1, Input2, …, Inputm; columns ALG1, ALG2, …, ALGn; entry T(ALG, I)]

- Row player aims to choose malicious inputs
- Column player aims to choose efficient algorithms
- Payoff for (I, ALG) is the running time of ALG on I

The Algorithm Game

- Pure strategies:
  - a specific input for the row player
  - a deterministic algorithm for the column player
- Mixed strategies:
  - a distribution over inputs for the row player
  - a randomized algorithm for the column player

The Algorithm Game

- If I'm the column player, what strategy (i.e., randomized algorithm) do I want to choose?

The Algorithm Game

- What does the minimax theorem mean here?

Yao’s Principle

- Let T(I, Alg) denote the time required for deterministic algorithm Alg to run on input I. Then
  max_p min_Alg E[T(I_p, Alg)] = min_q max_I E[T(I, Alg_q)]
  where p ranges over distributions on inputs and q over distributions on deterministic algorithms.
- So, for any two probability distributions p and q:
  min_Alg E[T(I_p, Alg)] ≤ max_I E[T(I, Alg_q)]

Using Yao’s Principle

- Useful technique for proving lower bounds on running times of randomized algorithms
- Step I: Design a probability distribution p over inputs for which every deterministic algorithm's (expected) running time is at least a
- Step II: Deduce that every randomized algorithm's (expected) running time is at least a

Back to Find-Bill

- Lemma: A lower bound on the expected number of probes required by any randomized algorithm to solve the Find-Bill problem is (n + 1)/2.
- Proof:
- Consider the scenario that the bill is located in any one of the n boxes uniformly at random.
- Consider only deterministic algorithms that do not probe the same box twice.
- By symmetry we can assume that the probe order for a deterministic algorithm ALG is 1 through n.
- The expected #probes for ALG is Σ_{i=1}^{n} i/n = (n+1)/2
- Yao’s principle implies the lower bound.
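The symmetry step can be checked directly for small n: under the uniform distribution over bill locations, every deterministic probe order has the same expected probe count. A quick sketch (function name is an assumption):

```python
from itertools import permutations

# Checking the symmetry step for small n: under the uniform
# distribution over bill locations, every deterministic probe order
# has expected probe count sum(i for i in 1..n) / n = (n + 1) / 2.

def expected_probes_under_uniform(order):
    n = len(order)
    # Bill in box b => probes = position of b in the probe order (1-indexed).
    return sum(order.index(b) + 1 for b in range(n)) / n

n = 5
values = {expected_probes_under_uniform(p) for p in permutations(range(n))}
# Every probe order yields the same expectation, (n + 1) / 2 = 3.
```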

No Regret Algs: So far…

- In some games (e.g., potential games), best-/better-response dynamics are guaranteed to converge to a PNE.
- In 2-player zero-sum games no-regret dynamics converge to a NE.
- What about general games?

Chicken Game

[Payoff matrix: rows Stop/Go, columns Stop/Go; visible entries (1,-3) and (-4,-4); overlaid mixed-strategy probabilities ½, ½, ¼, ¼]

What are the pure NEs?

What are the (mixed) NEs?

Correlated Equilibrium: Illustration

[Distribution over the Chicken game's Stop/Go strategy profiles: probabilities 0, ½, ½, 0]

- Suppose that there is a trusted random device that samples a pure strategy profile from a distribution P
- … and tells each player his component of the strategy profile.
- If all players other than i are following the strategy suggested by the random device, then i does not have any incentive to deviate.

Correlated Equilibrium: Illustration

[Distribution over the Chicken game's Stop/Go strategy profiles: probabilities 1/3, 1/3, 1/3, 0]


Correlated Equilibrium

- Consider a game:
  - Si is the set of (pure) strategies for player i
  - S = S1 × S2 × … × Sn
  - s = (s1, s2, …, sn) ∈ S is a vector of strategies
  - ui: S → R is the payoff function for player i
- Notation: given a strategy vector s, let s-i = (s1, …, si-1, si+1, …, sn)
  - the vector s with the i'th element omitted

Correlated Equilibrium

A correlated equilibrium is a probability distribution p over (pure) strategy profiles in S such that for any i, si, si':

Σ_{s-i} p(si, s-i)·ui(si, s-i) ≥ Σ_{s-i} p(si, s-i)·ui(si', s-i)
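The defining inequalities can be checked mechanically for a finite 2-player game. Below is a minimal sketch; the chicken-style payoff matrices are an assumption (the transcript's matrix is only partially legible), and the distribution places mass 1/3 on three of the four profiles as in the illustration above.

```python
from itertools import product

# Sketch: check the correlated-equilibrium inequalities for a 2-player
# game given by payoff matrices u1, u2 (u1[s1][s2] = player 1's payoff).
# The chicken-style payoffs below are an assumption, not transcribed
# from the slides.

def is_correlated_eq(p, u1, u2, tol=1e-9):
    n1, n2 = len(u1), len(u1[0])
    # Player 1: no profitable deviation s1 -> d from any recommendation.
    for s1, d in product(range(n1), repeat=2):
        if sum(p[s1][s2] * (u1[d][s2] - u1[s1][s2]) for s2 in range(n2)) > tol:
            return False
    # Player 2: symmetric condition over columns.
    for s2, d in product(range(n2), repeat=2):
        if sum(p[s1][s2] * (u2[s1][d] - u2[s1][s2]) for s1 in range(n1)) > tol:
            return False
    return True

# Strategies (Stop, Go); distribution with mass 1/3 on three profiles.
u1 = [[0, -3], [1, -4]]
u2 = [[0, 1], [-3, -4]]
p = [[1/3, 1/3], [1/3, 0]]
```

Here is_correlated_eq(p, u1, u2) holds, while putting all mass on the single profile (Stop, Stop) fails the check, since the recommended player would deviate to Go.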

Facts About Correlated Equilibrium

- CE always exists
- why?

- The set of CE is convex
- what about NE?
- CEs are the solutions to a set of linear inequalities

- CE can be computed in an efficient manner (e.g., via linear programming)

Moreover…

- When every player uses a no-regret algorithm to select strategies, the dynamics converge to a CE
- in any game!

- But this requires a stronger definition of no-regret…

Types of No-Regret Algs

- No external regret: Do (nearly) as well as best strategy in hindsight
- what we’ve been talking about so far
- I should have always taken the same route to work…

- No internal regret: the Alg could not gain (in hindsight) by substituting a single strategy with another (consistently)
- each time strategy si was chosen substitute with si’
- each time I bought a Microsoft stock I should have bought the Google stock

- No internal regret implies no external regret
- why?
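Both notions can be computed in hindsight for a concrete play sequence; a sketch with illustrative names:

```python
# Sketch: compute external and internal regret of a played action
# sequence in hindsight. Names are illustrative, not from the slides.

def external_regret(actions, gains):
    """gains[t][i]: gain of action i at round t; actions[t]: action played."""
    g_alg = sum(gains[t][a] for t, a in enumerate(actions))
    g_max = max(sum(g[i] for g in gains) for i in range(len(gains[0])))
    return g_max - g_alg

def internal_regret(actions, gains):
    """Best hindsight gain from replacing every play of b with some d."""
    n = len(gains[0])
    best = 0
    for b in range(n):
        for d in range(n):
            if b != d:
                best = max(best, sum(gains[t][d] - gains[t][b]
                                     for t, a in enumerate(actions) if a == b))
    return best
```

For example, if the played sequence always picks the action that gains 0 in a two-action game, both regrets are positive in hindsight.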

Reminder: Minimizing Regret

- There are n actions (experts) 1, 2, …, n
- At each round t = 1, 2, …, T:
  - the algorithm selects an action in {1, …, n}
  - and then observes the gain gi,t ∈ [0, 1] of each action i ∈ {1, …, n}
- Let gi = Σt gi,t and gmax = maxi gi
- No external regret: do (at least) "nearly as well" as gmax in hindsight
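One classic way to achieve the no-external-regret guarantee is multiplicative weights (Hedge). A minimal sketch, not taken from the slides:

```python
import math

# Minimal multiplicative-weights (Hedge) sketch for the setting above.
# An assumed illustration: the slides state the no-external-regret
# guarantee but do not present this particular algorithm.

def hedge(gains, eta=0.1):
    """gains[t][i] in [0, 1]; returns the algorithm's expected total gain."""
    n = len(gains[0])
    weights = [1.0] * n
    total = 0.0
    for g_t in gains:
        z = sum(weights)
        # Play action i with probability proportional to its weight.
        total += sum((w / z) * g for w, g in zip(weights, g_t))
        # Boost actions that gained more.
        weights = [w * math.exp(eta * g) for w, g in zip(weights, g_t)]
    return total

# If action 0 always gains 1, Hedge's gain approaches gmax = 200.
g_alg = hedge([[1.0, 0.0]] * 200)
```

The weight update concentrates probability on actions that did well in the past, so the per-round loss relative to gmax shrinks over time.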

Internal Regret

- Assume that the algorithm outputs action sequence A = a1 … aT
- The action sequence A(b → d):
  - change every at = b to at = d in A
  - g(b→d) is the gain of A(b → d) (for the same gains gi,t)
- Internal regret:
  max_{b,d} g(b→d) − galg = max_{b,d} Σt (gd,t − gb,t)·pb,t
- An algorithm has no internal regret if its internal regret goes to 0 as T goes to infinity

Internal Regret and Dominated Strategies

- Suppose that a player uses a no-internal-regret algorithm to select strategies
- in a repeated game against others

- What guarantees does the player have?
- beyond the no-regret guarantee

Dominated Strategies

- Strategy si is dominated by a (mixed) strategy si' if for every s-i we have that ui(si, s-i) < ui(si', s-i)
- Clearly, we would like to avoid choosing dominated strategies

Internal Regret and Dominated Strategies

- si is dominated by si'
  - every time we played si, we would have done better with si'
- Define internal regret by swapping this pair of strategies
- No internal regret ⇒ no dominated strategies

Does a No-Internal-Regret Alg Exist?

- Yes!
- In fact, there exist algorithms with a stronger guarantee: no swap regret.
- no swap regret: alg cannot benefit in hindsight by changing action i to F(i) for any F:{1,…,n} -> {1,…,n}

- We show a generic reduction fromno-external-regret to no-internal-regret

External to Swap Regret

- Our algorithm utilizes n no-external-regret algorithms Alg1, …, Algn to achieve no swap regret
  - intuitively, each algorithm Algi represents a strategy in {1, …, n}
- For algorithm Algi, and for any sequence of gain vectors: gAlgi > gmax − Ri

External to Swap Regret

- At time t:
  - each Algi outputs a distribution qi; together these induce a matrix Q (row i is qi)
  - our algorithm uses Q to decide on a distribution p over the strategies {1, …, n}
  - the adversary decides on a gains vector g = (g1, …, gn)
  - our algorithm returns to each Algi some gains vector

Combining the No-External-Regret Algs

- Approach I:
  - select an expert Algi with probability ri
  - let the "selected" expert decide the outcome p
  - resulting strategy distribution: p = Qr
- Approach II:
  - directly decide on p
- Our approach: make p = r
  - find a p such that p = Qp
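Finding p with p = Qp (componentwise, pj = Σi pi·qi,j) amounts to computing a stationary distribution of the stochastic matrix Q. A sketch via power iteration, with an assumed example Q:

```python
# Sketch: the fixed point p = Qp (componentwise p_j = sum_i p_i * q_ij)
# is a stationary distribution of the stochastic matrix Q; power
# iteration finds one. The 2x2 matrix Q below is an assumed example.

def stationary(Q, iters=1000):
    n = len(Q)
    p = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        # One step of p <- pQ: p_j = sum_i p_i * Q[i][j].
        p = [sum(p[i] * Q[i][j] for i in range(n)) for j in range(n)]
    return p

Q = [[0.9, 0.1], [0.2, 0.8]]
p = stationary(Q)
# p satisfies p_j = sum_i p_i * Q[i][j]; here p = (2/3, 1/3).
```

Since each row of Q is a probability distribution, p stays a distribution throughout the iteration.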

Distributing Gain

- The adversary selects gains g = (g1, …, gn)
- Return to Algi the gain vector pi·g
- Note: Σi pi·g = g (since Σi pi = 1)

- At time t:
  - each Algi outputs a distribution qi; these induce a matrix Q
  - output a distribution p such that p = Qp, i.e., pj = Σi pi·qi,j
  - observe gains g = (g1, …, gn)
  - return to Algi the gain vector pi·g

- Gain of Algi (from its own view) at round t: ⟨qi,t, pi,t·gt⟩ = pi,t·⟨qi,t, gt⟩
- No-external-regret guarantee, for any fixed action j:
  gAlgi = Σt pi,t·⟨qi,t, gt⟩ > Σt pi,t·gj,t − Ri
- For any swap function F:
  gAlg = Σt ⟨pt, gt⟩ = Σt ⟨ptQt, gt⟩ = Σt Σi pi,t·⟨qi,t, gt⟩ = Σi gAlgi > Σi Σt pi,t·gF(i),t − Σi Ri = gAlg,F − Σi Ri

Summary

- The Minimax Theorem is a useful tool for analyzing randomized algorithms
- Yao’s Principle

- There exist no-swap-regret algorithms
- Next time: When all players use no-swap-regret algorithms to select strategies the dynamics converge to a CE
