
Connections between Learning Theory, Game Theory, and Optimization

Lecture 1, August 24th, 2010

Maria Florina (Nina) Balcan


Big Picture

Over the past decades, many important and deep connections have emerged between:

  • machine learning theory

  • algorithmic game theory

  • combinatorial optimization

We will explore such connections, discussing:

  • fundamental topics in each area.

  • how ideas from each area can shed light on the others.


Outline

Online learning. Combining expert advice.

Regret minimization (no external regret and no internal regret). Bandit algorithms.

Zero-sum games. Nash equilibria.

Experts learning & Minimax theorem.

Nash equilibria and approximate Nash equilibria in general-sum bimatrix games.


Outline

Learning in a distributional setting.

Sample complexity results.

Weak-learning vs. Strong-learning.

Boosting with connections to game theory.

Quality of equilibria (Price of anarchy/stability).

Games with many players. Potential games.

Dynamics in games and the price of learning.


Outline

Mechanism design (MD).

Combinatorial auctions. [Social welfare; revenue maximization]

Auctions for digital goods.

  • Reductions from MD to algorithm design using machine learning.

Algorithmic pricing problems.

  • Online learning for designing online pricing schemes.


Outline

Submodularity with connections to game theory and machine learning.

  • Combinatorial auctions with submodular valuations

  • Learning submodular functions

  • Other optimization problems involving submodularity (ranking, clustering, etc.)


Admin

  • Course web page: http://www.cc.gatech.edu/~ninamf/LGO10/

  • 3 homework assignments. Exercises/problems (pencil-and-paper problem-solving variety). [50%]

  • Project: explore a theoretical question, try some experiments, or read a couple of papers and explain the idea. Writeup and class presentation. Groups ok. [50%]

  • “Algorithmic Game Theory”, Nisan, Roughgarden, Tardos, Vazirani

  • Other papers, surveys, and tutorials



Online learning, minimizing regret, and combining expert advice.

  • “The weighted majority algorithm”

N. Littlestone & M. Warmuth

  • “Online Algorithms in Machine Learning” (survey)

A. Blum

  • Algorithmic Game Theory, Nisan, Roughgarden, Tardos, Vazirani (eds) [Chapter 4]

  • Prediction, Learning, and Games, Cesa-Bianchi, Lugosi



Using “expert” advice

Assume we want to predict the stock market.

  • Will the market go up or down?

  • We solicit n “experts” for their advice.

  • We then want to use their advice somehow to make our prediction.

Can we do nearly as well as best in hindsight?

Note: “expert” ≡ someone with an opinion.

[Not necessarily someone who knows anything.]


Formal model

  • There are n experts.

  • For each round t = 1, 2, …, T:

  • Each expert makes a prediction in {0,1}.

  • The learner (using the experts’ predictions) makes a prediction in {0,1}.

  • The learner observes the actual outcome. A mistake occurs if the predicted outcome differs from the actual outcome.

Can we do nearly as well as best in hindsight?


Weighted Majority Algorithm

Deterministic Weighted Majority Algorithm

  • Start with all experts having weight 1.

  • Predict based on a weighted majority vote:

  • If the total weight of experts predicting 1 is at least the total weight of experts predicting 0, then predict 1; else predict 0.

  • Penalize mistakes: cut the weight of every expert that made a mistake in half.

Randomized versions of this algorithm (Randomized Weighted Majority, RWM) can provide surprisingly strong guarantees; a small sketch follows.
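Below is a minimal Python sketch of the randomized weighted-majority idea. The function name, the penalty parameter eps, and the data format are illustrative assumptions, not code from the lecture: experts that err have their weight multiplied by (1 − eps), and the learner predicts 1 with probability equal to the fraction of weight on 1.

```python
import random

def randomized_weighted_majority(expert_preds, outcomes, eps=0.5):
    """Sketch of Randomized Weighted Majority.
    expert_preds[t][i] is expert i's {0,1} prediction at round t;
    outcomes[t] is the true {0,1} outcome.  Returns the learner's mistake count."""
    n = len(expert_preds[0])
    weights = [1.0] * n                       # start with all experts at weight 1
    mistakes = 0
    for preds, outcome in zip(expert_preds, outcomes):
        total = sum(weights)
        # probability of predicting 1 = fraction of weight on experts saying 1
        p1 = sum(w for w, p in zip(weights, preds) if p == 1) / total
        guess = 1 if random.random() < p1 else 0
        mistakes += int(guess != outcome)
        # multiplicatively penalize every expert that was wrong this round
        weights = [w * (1 - eps) if p != outcome else w
                   for w, p in zip(weights, preds)]
    return mistakes
```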


Weighted Majority Algorithm

  • E[# mistakes] ≤ (1+ε)·OPT + ε⁻¹·log(n).

  • Setting ε = (log(n)/OPT)^{1/2} to balance the two terms (or using guess-and-double) gives the bound

  • E[# mistakes] ≤ OPT + 2(OPT·log n)^{1/2}

Note: Of course we might not know OPT, but if running for T time steps, since OPT ≤ T, we can set ε to get additive regret 2(T·log n)^{1/2}:

  • E[# mistakes] ≤ OPT + 2(T·log n)^{1/2}

  • So, regret/T → 0.

[no-regret algorithm]


Many other useful extensions

E.g., what if we have n options, not n predictors?

  • We’re not combining n experts, we’re choosing one.

  • Nice feature of RWM: it can be applied when the experts are n different options.

  • E.g., n different ways to drive to work each day, n different ways to invest our money.

Other generalizations as well.

Other notions of no regret (e.g., no internal regret).


Online Learning, Game Theory, and Minimax Optimality

“Game Theory, On-line Prediction, and Boosting”, Freund & Schapire, GEB


Zero Sum Games

Game defined by a matrix M.

Assume wlog entries in [0,1].

Example (Rock-Paper-Scissors); entries are Mindy’s loss M(i,j):

             Rock   Paper   Scissors
  Rock        1/2     1        0
  Paper        0     1/2       1
  Scissors     1      0       1/2

Row player (Mindy) chooses row i.

Column player (Max) chooses column j (simultaneously).

Mindy’s goal: minimize her loss M(i,j).

Max’s goal: maximize this loss (zero sum).


Randomized Play

Mindy chooses a distribution P over rows.

Max chooses a distribution Q over columns [simultaneously]

Mindy’s expected loss:  M(P,Q) = Σ_{i,j} P(i)·M(i,j)·Q(j)

Here i, j denote pure strategies and P, Q denote mixed strategies. We also write:

M(P,j) - Mindy’s expected loss when she plays P and Max plays j

M(i,Q) - Mindy’s expected loss when she plays i and Max plays Q


Sequential Play

Say Mindy plays before Max. If Mindy chooses P, then Max will pick Q to maximize M(P,Q), so the loss will be L(P) = max_Q M(P,Q).

So, Mindy should pick P to minimize L(P). The loss will be: min_P max_Q M(P,Q).

Similarly, if Max plays first, the loss will be: max_Q min_P M(P,Q).


Minimax Theorem

Playing second cannot be worse than playing first:

max_Q min_P M(P,Q)  ≤  min_P max_Q M(P,Q)

(Mindy plays second)     (Mindy plays first)

Von Neumann’s minimax theorem:

min_P max_Q M(P,Q)  =  max_Q min_P M(P,Q)

No advantage to playing second!


Optimal Play

Von Neumann’s minimax theorem:

v = min_P max_Q M(P,Q) = max_Q min_P M(P,Q)   [value of the game]

Optimal strategies:

Min-max strategy: P* ∈ argmin_P max_Q M(P,Q)

Max-min strategy: Q* ∈ argmax_Q min_P M(P,Q)

We will show how to use WM to prove this, and also to find approximate min-max strategies quickly. (A small sketch of this use of multiplicative weights follows.)
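As a preview (not from the slides), here is a small Python sketch of that use of multiplicative weights: the row player runs an exponential-weights (Hedge) update against a best-responding column player, and the time-averaged strategies are approximately min-max. The function name, the horizon T, and the learning rate are illustrative.

```python
import math

def approx_minmax(M, T=2000):
    """Approximate min-max strategies for a zero-sum game with loss matrix M
    (row player minimizes).  Returns averaged row/column strategies."""
    n, m = len(M), len(M[0])
    eta = math.sqrt(math.log(n) / T)
    w = [1.0] * n
    avg_P, avg_Q = [0.0] * n, [0.0] * m
    for _ in range(T):
        total = sum(w)
        P = [wi / total for wi in w]
        # column player best-responds: the column maximizing expected loss under P
        j = max(range(m), key=lambda c: sum(P[i] * M[i][c] for i in range(n)))
        # row player: exponential-weights update on the losses M[i][j] in [0,1]
        w = [wi * math.exp(-eta * M[i][j]) for i, wi in enumerate(w)]
        for i in range(n):
            avg_P[i] += P[i] / T
        avg_Q[j] += 1.0 / T
    return avg_P, avg_Q

# Rock-Paper-Scissors loss matrix from the earlier slide; both averages
# approach the uniform strategy (1/3, 1/3, 1/3).
M = [[0.5, 1.0, 0.0], [0.0, 0.5, 1.0], [1.0, 0.0, 0.5]]
P, Q = approx_minmax(M)
print([round(p, 2) for p in P], [round(q, 2) for q in Q])
```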


Optimal Play

Von Neumann’s minimax theorem:

v = min_P max_Q M(P,Q) = max_Q min_P M(P,Q)   [value of the game]

Optimal strategies:

Min-max strategy: P* ∈ argmin_P max_Q M(P,Q)

Max-min strategy: Q* ∈ argmax_Q min_P M(P,Q)

(P*, Q*) is a Nash equilibrium (no player has an incentive to unilaterally deviate).

This is a central solution concept we will study.


Games with many players with interesting structure

"Potential Games", D. Monderer and L, S. Shapley , Games and Economic Behavior


Fair cost-sharing

Fair cost-sharing: n players in a weighted directed graph G. Player i wants to get from s_i to t_i, and players share the cost of the edges they use with others.


Fair cost-sharing

[Figure: two parallel edges from s to t, one of cost 1 and one of cost n.]

  • n players in a directed graph G; each edge e costs c_e.

  • Player i wants to get from s_i to t_i.

  • All players share cost of edges they use with others.

  • Each player wants to minimize his own cost.

Good equilibrium: all use edge of cost 1.

(paying 1/n each)

Bad equilibrium: all use edge of cost n.

(paying 1 each)


Inefficiency of equilibria, PoA and PoS

Price of Anarchy (PoA): ratio of worst Nash equilibrium to OPT.

[Koutsoupias-Papadimitriou’99]

Price of Stability (PoS): ratio of best Nash equilibrium to OPT.

[Anshelevich et al., 2004]

E.g., for fair cost-sharing, PoS is log(n), whereas PoA is n.

Significant effort spent on understanding these in CS.

“Algorithmic Game Theory”, Nisan, Roughgarden, Tardos, Vazirani


Congestion games

  • Nice general class of games with many players.

  • Always have a pure-strategy equilibrium.

  • Have a potential function s.t. whenever a player switches, the potential drops by exactly that player’s improvement.

  • We will analyze dynamics in these games: what happens if players follow natural learning dynamics? (A small best-response sketch for the cost-sharing example follows.)
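To make the potential argument concrete, here is a small Python sketch (all names are illustrative, not the lecture's code) for the two-edge fair cost-sharing example above: every improving switch strictly decreases the Rosenthal potential Φ = Σ_e c_e·(1 + 1/2 + … + 1/n_e), so best-response dynamics stop at a pure Nash equilibrium.

```python
def potential(choice, costs):
    """Rosenthal potential: sum over edges of c_e * H(n_e), where n_e is the
    number of players currently on edge e and H is the harmonic number."""
    phi = 0.0
    for e, c in enumerate(costs):
        load = choice.count(e)
        phi += c * sum(1.0 / k for k in range(1, load + 1))
    return phi

def best_response_dynamics(choice, costs):
    """Players repeatedly switch to an edge with strictly lower shared cost.
    Each improving move decreases the potential, so the loop terminates."""
    n = len(choice)
    improved = True
    while improved:
        improved = False
        for i in range(n):
            loads = [choice.count(e) for e in range(len(costs))]
            cur = choice[i]
            cur_cost = costs[cur] / loads[cur]
            for e in range(len(costs)):
                if e != cur and costs[e] / (loads[e] + 1) < cur_cost - 1e-12:
                    before = potential(choice, costs)
                    choice[i] = e
                    assert potential(choice, costs) < before   # potential drops
                    improved = True
                    break
            if improved:
                break
    return choice

# Two parallel s-t edges of cost 1 and n (here n = 4 players), as in the figure.
# From a split start, dynamics converge to everyone on the cost-1 edge.
print(best_response_dynamics([0, 1, 1, 0], [1.0, 4.0]))
```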


Learning in a distributional setting.

[With feature information]


Used all over CS and Science

Image Classification

Document Categorization

Speech Recognition

Protein Classification

Spam Detection

Branch Prediction

Fraud Detection


Example: Supervised Classification

Decide which emails are spam and which are important.

[Figure: inbox with messages labeled spam / not spam.]

Goal: use emails seen so far to produce good prediction rule for future data.


Example: Supervised Classification

[Figure: positive (+) and negative (−) examples separated by a linear rule.]

Represent each message by features. (e.g., keywords, spelling, etc.)


Reasonable RULES:

Predict SPAM if unknown AND (money OR pills)

Predict SPAM if 2·money + 3·pills − 5·known > 0

Linearly separable. (A tiny sketch of such a linear rule follows.)
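A tiny Python sketch of that linear rule (the feature names and weights come from the slide's example; everything else is illustrative):

```python
SPAM_WEIGHTS = {"money": 2.0, "pills": 3.0, "known": -5.0}   # weights from the slide's rule

def predict(features, weights=None, threshold=0.0):
    """Linear rule: predict SPAM iff the weighted sum of (boolean) features
    exceeds the threshold.  In practice the weights would be learned from data."""
    weights = weights or SPAM_WEIGHTS
    score = sum(w * features.get(name, 0) for name, w in weights.items())
    return "SPAM" if score > threshold else "NOT SPAM"

print(predict({"money": 1, "pills": 1, "known": 0}))   # 2 + 3 > 0  -> SPAM
print(predict({"money": 1, "pills": 0, "known": 1}))   # 2 - 5 < 0  -> NOT SPAM
```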


Two Main Aspects of Supervised Learning

Algorithm Design. How to optimize?

Automatically generate rules that do well on observed data.

Optimization has played a significant role here in recent years.

Confidence Bounds, Generalization Guarantees, Sample Complexity

Confidence for rule effectiveness on future data.

Well understood for passive supervised learning.


Standard Passive Supervised Learning

  • S={(x, l)} - set of labeled examples

  • X – feature space

  • drawn i.i.d. from distribution D over X and labeled by target concept c*

  • Do optimization over S, find hypothesis h ∈ C.

  • Goal: h has small error over D.

  • err(h) = Pr_{x ~ D}[h(x) ≠ c*(x)]

[Figure: target c* and hypothesis h as regions; the error is the probability mass where they disagree.]

  • c* ∈ C: realizable case; otherwise: agnostic.


Standard Passive Supervised Learning

Classic models: PAC (Valiant), SLT (Vapnik)

  • Sample Complexity, Finite Hypothesis Spaces, Realizable Case

  • In the non-realizable case, replace ε with ε². (The standard bound is recalled below.)
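The bound itself appears only as an image in the original slides; presumably it is the standard finite-class bound, recalled here for reference:

```latex
\[
m \;\ge\; \frac{1}{\epsilon}\Bigl(\ln|H| + \ln\tfrac{1}{\delta}\Bigr)
\;\Longrightarrow\;
\Pr\Bigl[\exists\, h\in H:\ \widehat{\mathrm{err}}_S(h)=0 \ \text{and}\ \mathrm{err}_D(h)>\epsilon\Bigr]\;\le\;\delta .
\]
\[
\text{Non-realizable case:}\quad
m \;\ge\; \frac{1}{2\epsilon^{2}}\Bigl(\ln|H| + \ln\tfrac{2}{\delta}\Bigr)
\ \text{ suffices for }\ \bigl|\widehat{\mathrm{err}}_S(h)-\mathrm{err}_D(h)\bigr|\le\epsilon\ \text{ for all } h\in H .
\]
```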


Standard Passive Supervised Learning

Classic models: PAC (Valiant), SLT (Vapnik)

  • Sample Complexity, Finite Hypothesis Spaces, Realizable Case

  • Such ideas/techniques are useful in auction design, learning submodular functions, etc.


Boosting & game theory

  • Suppose I have an algorithm A that, for any distribution (weighting function) over a dataset S, can produce a rule h ∈ H that gets < 40% error.

  • Adaboost gives a way to use such an A to get error → 0 at a good rate, using weighted votes of the rules produced. (A small sketch follows.)

  • We can show that this is in principle possible by using the minimax theorem!
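As a preview of the boosting material, here is a minimal AdaBoost-style sketch in Python. The weak_learner interface, the number of rounds, and all names are illustrative assumptions, not the course's code; labels are in {-1, +1}.

```python
import math

def adaboost(examples, labels, weak_learner, rounds=50):
    """weak_learner(examples, labels, D) must return a hypothesis h: x -> {-1,+1}
    with weighted error below 1/2 under the distribution D over the sample."""
    m = len(examples)
    D = [1.0 / m] * m
    hyps, alphas = [], []
    for _ in range(rounds):
        h = weak_learner(examples, labels, D)
        err = sum(d for d, x, y in zip(D, examples, labels) if h(x) != y)
        if err >= 0.5:
            break                                    # weak-learning assumption violated
        alpha = 0.5 * math.log((1 - err) / max(err, 1e-12))
        # re-weight: examples the current hypothesis gets wrong gain weight
        D = [d * math.exp(-alpha * y * h(x)) for d, x, y in zip(D, examples, labels)]
        Z = sum(D)
        D = [d / Z for d in D]
        hyps.append(h)
        alphas.append(alpha)
    # final rule: weighted majority vote of the weak hypotheses
    return lambda x: 1 if sum(a * h(x) for a, h in zip(alphas, hyps)) >= 0 else -1
```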


Supermarket Pricing Problem

  • A supermarket is trying to decide how to price its goods.

Seller’s Goal: set prices to maximize revenue.

  • Simple case: customers make separate decisions on each item.

  • Harder case: customers buy everything or nothing based on the sum of the prices of the items on their list.

  • Or could be even more complex.


Supermarket Pricing Problem

Algorithmic

  • Seller knows the market well.

Incentive Compatible Auction

  • Must be in customers’ interest (dominant strategy) to report truthfully.

Online Pricing

  • Customers arrive one at a time, buy what they want at current prices. Seller modifies prices over time.

  • Techniques from learning will be useful here. (A small sketch of learning-based online pricing follows.)
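As one illustration of that point (an assumption on my part, not a slide from the lecture), here is a small Python sketch of online posted pricing for a single good via multiplicative weights over a grid of candidate prices, with one "expert" per price:

```python
import math
import random

def online_posted_prices(buyer_values, price_grid, eta=0.5):
    """A buyer with value v purchases at posted price p iff p <= v, yielding revenue p.
    Weights over candidate prices are updated with the (full-information) revenue
    each price would have earned.  Grid, eta, and names are illustrative."""
    max_p = max(price_grid)
    weights = [1.0] * len(price_grid)
    revenue = 0.0
    for v in buyer_values:
        total = sum(weights)
        probs = [w / total for w in weights]
        p = random.choices(price_grid, weights=probs)[0]    # post a random price
        revenue += p if p <= v else 0.0
        # exponential-weights update with gains normalized to [0, 1]
        weights = [w * math.exp(eta * ((q if q <= v else 0.0) / max_p))
                   for w, q in zip(weights, price_grid)]
    return revenue

# Illustrative run: five buyers, candidate prices 1..10.
print(online_posted_prices([4.2, 5.1, 6.0, 4.8, 5.5], list(range(1, 11))))
```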


Submodular functions

V = {1, 2, …, n},  f : 2^V → R

Submodularity:

f(S) + f(T) ≥ f(S ∩ T) + f(S ∪ T)   for all S, T ⊆ V

Equivalent: decreasing marginal values:

f(S ∪ {x}) − f(S) ≥ f(T ∪ {x}) − f(T)   for all S ⊆ T ⊆ V, x ∉ T

Examples:

  • Concave functions: let h : R → R be concave. For each S ⊆ V, let f(S) = h(|S|).

  • Vector spaces: let V = {v1, …, vn}, each vi ∈ R^n. For each S ⊆ V, let f(S) = rank(V[S]).
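A brute-force Python sketch of the decreasing-marginal-values test on a tiny ground set (the coverage example and all names are illustrative, not from the slides):

```python
from itertools import combinations

def is_submodular(f, ground):
    """Check f(S | {x}) - f(S) >= f(T | {x}) - f(T) for all S subset of T, x not in T.
    Exponential in |V|, so only for tiny ground sets; illustrative only."""
    V = list(ground)
    subsets = [frozenset(c) for r in range(len(V) + 1) for c in combinations(V, r)]
    for S in subsets:
        for T in subsets:
            if not S <= T:
                continue
            for x in V:
                if x in T:
                    continue
                if f(S | {x}) - f(S) < f(T | {x}) - f(T) - 1e-12:
                    return False
    return True

# Coverage function f(S) = |union of the chosen sets| -- a classic submodular example.
sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d"}}
coverage = lambda S: len(set().union(*(sets[i] for i in S))) if S else 0
print(is_submodular(coverage, sets.keys()))   # True
```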


Submodular functions

  • Strong connection between optimization and submodularity

    • e.g.: minimization [C’85, GLS’87, IFF’01, S’00, …], maximization [NWF’78, V’07, …]

  • Algorithmic game theory

  • Submodular utility functions

  • Much interest in Machine Learning community recently

  • Tutorials at major conferences: ICML, NIPS, etc.

  • www.submodularity.org is a Machine Learning site

  • Interesting to understand their learnability

