Connections between Learning Theory, Game Theory, and Optimization


Lecture 1, August 24th, 2010

Maria Florina (Nina) Balcan

Big Picture

Over the past decades, many important and deep connections have emerged between:

• machine learning theory
• algorithmic game theory
• combinatorial optimization

We will explore such connections, discussing:

• fundamental topics in each area.
• how ideas from each area can shed light on the others.
Outline

Online learning. Combining expert advice.

Regret minimization (no external regret and no internal regret). Bandit algorithms.

Zero sum games. Nash equilibria.

Experts learning & Minimax theorem.

Nash equilibria and approximate Nash equilibria in general sum bimatrix games.


Outline

Learning in a distributional setting.

Sample complexity results.

Weak-learning vs. Strong-learning.

Boosting with connections to game theory.

Quality of equilibria (Price of anarchy/stability).

Games with many players. Potential games.

Dynamics in games and the price of learning.

Outline

Mechanism design (MD).

Combinatorial auctions. [Social welfare; revenue maximization]

Auctions for digital goods.

• Reductions from MD to algorithm design using machine learning.

Algorithmic pricing problems.

• Online learning for designing online pricing schemes.
Outline

Submodularity with connections to game theory and machine learning.

• Combinatorial auctions with submodular valuations
• Learning submodular functions
• Other optimization problems involving submodularity (ranking, clustering, etc.)

• Course web page: http://www.cc.gatech.edu/~ninamf/LGO10/
• 3 homework assignments. Exercises/problems (pencil-and-paper problem-solving variety). [50%]
• Project: explore a theoretical question, try some experiments, or read a couple of papers and explain the idea. Writeup and class presentation. Groups ok. [50%]

• “Algorithmic Game Theory”, Nisan, Roughgarden, Tardos, Vazirani
• Other papers, surveys, and tutorials
• “The weighted majority algorithm”, N. Littlestone & M. Warmuth
• “Online Algorithms in Machine Learning” (survey), A. Blum
• “Algorithmic Game Theory”, Nisan, Roughgarden, Tardos, Vazirani (eds) [Chapter 4]
• “Prediction, Learning, and Games”, Cesa-Bianchi, Lugosi

Assume we want to predict the stock market.

• Will the market go up or down?
• We solicit n “experts” for their advice.
• We then want to use their advice somehow to make our prediction.

Can we do nearly as well as the best expert in hindsight?

Note: “expert” ≡ someone with an opinion.

[Not necessarily someone who knows anything.]

Formal model
• For each round t=1,2, …, T
• There are n experts.
• Each expert makes a prediction in {0,1}
• The learner (using experts’ predictions) makes a prediction in {0,1}
• The learner observes the actual outcome. There is a mistake if the predicted outcome is different from the actual outcome.

Can we do nearly as well as best in hindsight?

Weighted Majority Algorithm

Deterministic Majority Algorithm

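• Start with all experts having weight 1.
• Predict based on weighted majority vote.
• If $\sum_{i: x_i = 1} w_i \ge \sum_{i: x_i = 0} w_i$ (where $x_i$ is expert i's prediction and $w_i$ is its current weight)
• then predict 1
• else predict 0
• Penalize mistakes by cutting weight in half: every expert that predicted incorrectly has its weight halved.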
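
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the function name `weighted_majority`, the input layout (one list of 0/1 expert predictions per round), and the choice to break ties in favor of predicting 1 are all assumptions made for the example.

```python
def weighted_majority(expert_predictions, outcomes):
    """Deterministic weighted majority (illustrative sketch).

    expert_predictions[t][i] is expert i's 0/1 prediction in round t;
    outcomes[t] is the true 0/1 outcome. Returns (learner mistakes, final weights).
    """
    n = len(expert_predictions[0])
    weights = [1.0] * n          # start with all experts having weight 1
    mistakes = 0
    for preds, outcome in zip(expert_predictions, outcomes):
        weight_for_1 = sum(w for w, p in zip(weights, preds) if p == 1)
        weight_for_0 = sum(w for w, p in zip(weights, preds) if p == 0)
        # Weighted majority vote; ties go to 1 (an assumption of this sketch).
        guess = 1 if weight_for_1 >= weight_for_0 else 0
        if guess != outcome:
            mistakes += 1
        # Halve the weight of every expert that was wrong this round.
        weights = [w / 2 if p != outcome else w for w, p in zip(weights, preds)]
    return mistakes, weights


if __name__ == "__main__":
    outcomes = [1, 0, 1, 1, 0, 0, 1, 0]
    # Expert 0 is always right, expert 1 always wrong, expert 2 always says 1.
    preds = [[y, 1 - y, 1] for y in outcomes]
    print(weighted_majority(preds, outcomes))
```

The weights of unreliable experts shrink geometrically, so after a few rounds the vote is dominated by the experts with the fewest mistakes.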

Randomized versions of this algorithm can provide surprisingly strong guarantees

Weighted Majority Algorithm
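• $E[\#\text{mistakes}] \le (1+\epsilon)\,\mathrm{OPT} + \epsilon^{-1}\log(n)$, where OPT is the number of mistakes of the best expert in hindsight.
• If we set $\epsilon = (\log(n)/\mathrm{OPT})^{1/2}$ to balance the two terms out (or use guess-and-double), we get a bound of
• $E[\text{mistakes}] \le \mathrm{OPT} + 2(\mathrm{OPT}\cdot\log n)^{1/2}$

Note: Of course we might not know OPT, so if running T time steps, since $\mathrm{OPT} \le T$, set $\epsilon = (\log(n)/T)^{1/2}$ to get additive loss $2(T\log n)^{1/2}$:

• $E[\text{mistakes}] \le \mathrm{OPT} + 2(T\cdot\log n)^{1/2}$, where the additive term $2(T\cdot\log n)^{1/2}$ is the regret.
• So, regret/T → 0 as T grows.

[no regret algorithm]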
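
As a rough check of the first bound, the sketch below computes the expected number of mistakes of a randomized weighted majority learner exactly (no sampling is needed, since the expectation in each round is just the fraction of weight on wrong experts). The update rule used here, multiplying a wrong expert's weight by (1 − ε), is the usual textbook choice and is an assumption of this example, as are the function names; `log` is taken to be the natural logarithm.

```python
import math


def rwm_expected_mistakes(expert_predictions, outcomes, eps):
    """Expected mistakes of randomized weighted majority (illustrative sketch).

    Each round the learner follows expert i with probability w_i / sum(w),
    then multiplies the weight of every wrong expert by (1 - eps).
    """
    n = len(expert_predictions[0])
    weights = [1.0] * n
    expected = 0.0
    for preds, outcome in zip(expert_predictions, outcomes):
        total = sum(weights)
        wrong = sum(w for w, p in zip(weights, preds) if p != outcome)
        expected += wrong / total  # probability of a mistake this round
        weights = [w * (1 - eps) if p != outcome else w
                   for w, p in zip(weights, preds)]
    return expected


def best_expert_mistakes(expert_predictions, outcomes):
    """OPT: number of mistakes of the best single expert in hindsight."""
    n = len(expert_predictions[0])
    return min(
        sum(1 for preds, y in zip(expert_predictions, outcomes) if preds[i] != y)
        for i in range(n)
    )


def bound(opt, n, eps):
    """Right-hand side (1 + eps) * OPT + log(n) / eps of the guarantee."""
    return (1 + eps) * opt + math.log(n) / eps
```

For any input sequence and any ε ≤ 1/2, `rwm_expected_mistakes(...)` should not exceed `bound(best_expert_mistakes(...), n, eps)`.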

Many other useful extensions

E.g., what if we have n options, not n predictors?

• We’re not combining n experts, we’re choosing one.
• Nice feature of randomized weighted majority (RWM): can be applied when experts are n different options
• E.g., n different ways to drive to work each day, n different ways to invest our money.

Other generalizations as well.

Other notions of no regret (e.g., no internal regret).

Online Learning, Game Theory, and Minimax Optimality

“Game Theory, On-line Prediction, and Boosting”, Freund & Schapire, GEB

Zero Sum Games

Game defined by a matrix M.

Assume wlog entries in [0,1].

Example: Rock-Paper-Scissors. Entries are the row player's loss (rows: row player's choice, columns: column player's choice).

            Rock    Paper   Scissors
Rock        1/2     1       0
Paper       0       1/2     1
Scissors    1       0       1/2

Row player (Mindy) chooses row i.

Column player (Max) chooses column j (simultaneously).

Mindy’s goal: minimize her loss M(i,j).

Max’s goal: maximize this loss (zero sum).

Randomized Play

Mindy chooses a distribution P over rows.

Max chooses a distribution Q over columns [simultaneously]

Mindy’s expected loss: $M(P,Q) = \sum_{i,j} P(i)\,M(i,j)\,Q(j) = P^{\top} M Q$

If i,j = pure strategies, and P,Q = mixed strategies

M(P,j) - Mindy’s expected loss when she plays P and Max plays j

M(i,Q) - Mindy’s expected loss when she plays i and Max plays Q
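
A small sketch of the expected-loss computation for the Rock-Paper-Scissors loss matrix (rows and columns ordered Rock, Paper, Scissors). The names `RPS_LOSS`, `expected_loss`, and `point_mass` are hypothetical helpers introduced for the example; exact fractions are used only to keep the arithmetic readable.

```python
from fractions import Fraction

# Row player's loss; rows and columns are ordered Rock, Paper, Scissors.
RPS_LOSS = [
    [Fraction(1, 2), Fraction(1), Fraction(0)],
    [Fraction(0), Fraction(1, 2), Fraction(1)],
    [Fraction(1), Fraction(0), Fraction(1, 2)],
]


def expected_loss(M, P, Q):
    """M(P, Q) = sum over i, j of P(i) * M(i, j) * Q(j)."""
    return sum(P[i] * M[i][j] * Q[j]
               for i in range(len(P)) for j in range(len(Q)))


def point_mass(k, size):
    """Pure strategy k written as a distribution, so M(P, j) = M(P, point_mass(j))."""
    return [Fraction(1) if idx == k else Fraction(0) for idx in range(size)]


if __name__ == "__main__":
    uniform = [Fraction(1, 3)] * 3
    print(expected_loss(RPS_LOSS, uniform, uniform))           # mixed vs mixed
    print(expected_loss(RPS_LOSS, uniform, point_mass(1, 3)))  # M(P, j) with j = Paper
```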

Sequential Play

Say Mindy plays before Max. If Mindy chooses P, then Max will pick Q to maximize M(P,Q), so the loss will be $L(P) = \max_Q M(P,Q)$.

So, Mindy should pick P to minimize L(P). Loss will be: $\min_P \max_Q M(P,Q)$

Similarly, if Max plays first, loss will be: $\max_Q \min_P M(P,Q)$

Minimax Theorem

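Playing second cannot be worse than playing first:

$\min_P \max_Q M(P,Q) \ge \max_Q \min_P M(P,Q)$

(left side: Mindy plays first; right side: Mindy plays second)

Von Neumann’s minimax theorem:

$\min_P \max_Q M(P,Q) = \max_Q \min_P M(P,Q)$

No advantage to playing second!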

Optimal Play

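Von Neumann’s minimax theorem:

$v = \min_P \max_Q M(P,Q) = \max_Q \min_P M(P,Q)$ (value of the game)

Optimal strategies:

Min-max strategy: $P^* = \arg\min_P \max_Q M(P,Q)$

Max-min strategy: $Q^* = \arg\max_Q \min_P M(P,Q)$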

We will show how to use WM to prove this!

And to also find approximate min-max strategies quickly.
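
One way to see how a weighted-majority style update can find approximate min-max strategies is sketched below, in the spirit of the Freund & Schapire reference. Mindy keeps one weight per row, Max best-responds each round, and the average of Mindy's distributions is returned. The specific update `w * (1 - eps) ** loss`, the step size, the number of rounds, and all names are assumptions of this sketch rather than the lecture's exact procedure.

```python
import math

# Rock-Paper-Scissors loss matrix for the row player (order: Rock, Paper, Scissors).
RPS = [[0.5, 1.0, 0.0],
       [0.0, 0.5, 1.0],
       [1.0, 0.0, 0.5]]


def approx_minmax(M, rounds=2000):
    """Return (average row strategy, its worst-case loss against any column)."""
    n_rows, n_cols = len(M), len(M[0])
    eps = math.sqrt(math.log(n_rows) / rounds)
    weights = [1.0] * n_rows
    avg_P = [0.0] * n_rows
    for _ in range(rounds):
        total = sum(weights)
        P = [w / total for w in weights]
        # Max sees P and picks the column that maximizes Mindy's expected loss.
        col_loss = [sum(P[i] * M[i][j] for i in range(n_rows)) for j in range(n_cols)]
        j = max(range(n_cols), key=lambda c: col_loss[c])
        # Multiplicative update: rows that lose more against column j shrink faster.
        weights = [w * (1 - eps) ** M[i][j] for i, w in enumerate(weights)]
        avg_P = [a + p / rounds for a, p in zip(avg_P, P)]
    worst_case = max(sum(avg_P[i] * M[i][j] for i in range(n_rows))
                     for j in range(n_cols))
    return avg_P, worst_case


if __name__ == "__main__":
    strategy, worst = approx_minmax(RPS)
    print(strategy, worst)
```

Because the learner's regret per round shrinks as the number of rounds grows, the worst-case loss of the averaged strategy approaches the value of the game (1/2 for Rock-Paper-Scissors) from above.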

(P*, Q*) is a Nash equilibrium (no player has an incentive to unilaterally deviate). This is the central solution concept we will study.

Games with many players with interesting structure

“Potential Games”, D. Monderer and L. S. Shapley, Games and Economic Behavior

Fair cost-sharing

Fair cost-sharing: n players in weighted directed graph G. Player i wants to get from $s_i$ to $t_i$, and they share cost of edges they use with others.

• n players in directed graph G, each edge e costs $c_e$.
• Player i wants to get from $s_i$ to $t_i$.
• All players share cost of edges they use with others.
• Each player wants to minimize his own cost.

Example (figure): two parallel edges from s to t, one of cost n and one of cost 1.

Good equilibrium: all use edge of cost 1 (paying 1/n each).

Bad equilibrium: all use edge of cost n (paying 1 each).
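
The two equilibria in this example can be checked mechanically. The sketch below models only the two-parallel-edge instance (each player picks a single edge); the edge labels `"cheap"` and `"expensive"` and the function names are made up for the illustration.

```python
def player_cost(edge_costs, choices, i):
    """Fair share paid by player i: cost of the chosen edge split among its users."""
    edge = choices[i]
    users = sum(1 for c in choices if c == edge)
    return edge_costs[edge] / users


def is_nash(edge_costs, choices):
    """True if no single player can strictly lower their cost by switching edges."""
    for i in range(len(choices)):
        current = player_cost(edge_costs, choices, i)
        for edge in edge_costs:
            if edge == choices[i]:
                continue
            deviated = list(choices)
            deviated[i] = edge
            if player_cost(edge_costs, deviated, i) < current - 1e-12:
                return False
    return True


def social_cost(edge_costs, choices):
    """Total cost paid by all players, i.e. the cost of all edges in use."""
    return sum(edge_costs[e] for e in set(choices))


if __name__ == "__main__":
    n = 4
    costs = {"cheap": 1.0, "expensive": float(n)}
    print(is_nash(costs, ["cheap"] * n), social_cost(costs, ["cheap"] * n))
    print(is_nash(costs, ["expensive"] * n), social_cost(costs, ["expensive"] * n))
```

In the bad equilibrium a deviating player would pay the full cost 1 of the cheap edge alone, which is no better than the share 1 already being paid, so nobody moves even though the total cost is n times the optimum.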

Inefficiency of equilibria, PoA and PoS

Price of Anarchy (PoA): ratio of the cost of the worst Nash equilibrium to OPT.

Price of Stability (PoS): ratio of the cost of the best Nash equilibrium to OPT.

[Anshelevich et al., 2004]

E.g., for fair cost-sharing, PoS is log(n), whereas PoA is n.

Significant effort spent on understanding these in CS.

“Algorithmic Game Theory”, Nisan, Roughgarden, Tardos, Vazirani

Congestion games
• Nice general class of games with many players.
• Always have a pure-strategy equilibrium.
• Have a potential function s.t. whenever a player switches, the potential drops by exactly that player’s improvement.
• We will analyze dynamics in these games!
• What happens if players follow natural learning dynamics?
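
For fair cost-sharing, the standard potential is Rosenthal's: the sum over edges of c_e · (1 + 1/2 + … + 1/n_e), where n_e is the number of players using edge e. The sketch below checks the "drops by exactly that player's improvement" property on a small instance in which each player picks one of two parallel edges. The instance, edge labels, and function names are assumptions made for the illustration.

```python
def harmonic(k):
    """H(k) = 1 + 1/2 + ... + 1/k, with H(0) = 0."""
    return sum(1.0 / j for j in range(1, k + 1))


def potential(edge_costs, choices):
    """Rosenthal potential for fair cost-sharing: sum of c_e * H(n_e)."""
    return sum(cost * harmonic(sum(1 for c in choices if c == edge))
               for edge, cost in edge_costs.items())


def player_cost(edge_costs, choices, i):
    """Fair share paid by player i on the edge they chose."""
    edge = choices[i]
    return edge_costs[edge] / sum(1 for c in choices if c == edge)


def switch(choices, i, new_edge):
    """Return a copy of the profile in which player i uses new_edge."""
    updated = list(choices)
    updated[i] = new_edge
    return updated


if __name__ == "__main__":
    costs = {"cheap": 1.0, "expensive": 4.0}
    before = ["cheap", "expensive", "expensive", "expensive"]
    after = switch(before, 1, "cheap")
    improvement = player_cost(costs, before, 1) - player_cost(costs, after, 1)
    drop = potential(costs, before) - potential(costs, after)
    print(improvement, drop)  # the two numbers should agree
```

Since every improving move strictly lowers the potential and there are finitely many profiles, improving moves must eventually stop, which is why a pure-strategy equilibrium always exists.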

Learning in a distributional setting.

[With feature information]

Used all over CS and Science

Image Classification

Document Categorization

Speech Recognition

Protein Classification

Spam Detection

Branch Prediction

Fraud Detection

Example: Supervised Classification

Decide which emails are spam and which are important.

Goal: use emails seen so far to produce good prediction rule for future data.

[Figure: example emails labeled spam / not spam, shown as + and - points.]


Example: Supervised Classification

Represent each message by features. (e.g., keywords, spelling, etc.)

[Table: example messages as feature vectors, each with its label.]

Reasonable RULES:

Predict SPAM if unknown AND (money OR pills)

Predict SPAM if 2·money + 3·pills − 5·known > 0

Linearly separable
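
The two rules can be written directly as code. In this sketch each message is a dictionary of 0/1 features; the reading that `known` means "sender is known" and `unknown` is its complement is an assumption, as are the function names.

```python
def rule_boolean(msg):
    """Predict SPAM if unknown AND (money OR pills)."""
    unknown = 1 - msg["known"]  # assumption: unknown is the complement of known
    return bool(unknown and (msg["money"] or msg["pills"]))


def rule_linear(msg):
    """Predict SPAM if 2*money + 3*pills - 5*known > 0 (a linear threshold rule)."""
    return 2 * msg["money"] + 3 * msg["pills"] - 5 * msg["known"] > 0


if __name__ == "__main__":
    messages = [
        {"known": 0, "money": 1, "pills": 0},  # unknown sender mentioning money
        {"known": 1, "money": 1, "pills": 1},  # known sender mentioning both
        {"known": 0, "money": 0, "pills": 0},  # unknown sender, no keywords
    ]
    for m in messages:
        print(m, rule_boolean(m), rule_linear(m))
```

On 0/1 features the two rules agree on every input: a known sender contributes −5, which the keyword terms (at most 2 + 3) can never outweigh.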

Two Main Aspects of Supervised Learning

Algorithm Design. How to optimize?

Automatically generate rules that do well on observed data.

Optimization has played a significant role in recent years.

Confidence Bounds, Generalization Guarantees, Sample Complexity

Confidence for rule effectiveness on future data.

Well understood for passive supervised learning.

Standard Passive Supervised Learning
• X – feature space
• S={(x, l)} - set of labeled examples, drawn i.i.d. from distr. D over X and labeled by target concept c*
• Do optimization over S, find hypothesis $h \in C$.
• Goal: h has small error over D.
• $\mathrm{err}(h) = \Pr_{x \sim D}(h(x) \ne c^*(x))$
• c* in C, realizable case; else agnostic
Standard Passive Supervised Learning

Classic models: PAC (Valiant), SLT (Vapnik)

• Sample Complexity, Finite Hypothesis Spaces, Realizable Case
• In the non-realizable case, replace $\epsilon$ with $\epsilon^2$.
• Such ideas/techniques are useful in auction design, learning submodular functions, etc.
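
The formula for the finite-hypothesis-space result did not survive in this transcript. As a point of reference, the standard textbook bound for the realizable case says that m ≥ (1/ε)(ln|C| + ln(1/δ)) examples suffice so that, with probability at least 1 − δ, every hypothesis in C consistent with the sample has error at most ε. The sketch below evaluates that bound and the usual Hoeffding-based agnostic counterpart, which has ε² in place of ε. Both formulas are the standard ones and may differ in constants from the slide; the function names are hypothetical.

```python
import math


def pac_sample_size(num_hypotheses, eps, delta):
    """Realizable case: m >= (1/eps) * (ln|C| + ln(1/delta))."""
    return math.ceil((math.log(num_hypotheses) + math.log(1 / delta)) / eps)


def agnostic_sample_size(num_hypotheses, eps, delta):
    """Agnostic case (Hoeffding + union bound):
    m >= (1 / (2 * eps**2)) * (ln|C| + ln(2/delta))."""
    return math.ceil((math.log(num_hypotheses) + math.log(2 / delta)) / (2 * eps ** 2))


if __name__ == "__main__":
    for size in (10, 1000, 10 ** 6):
        print(size,
              pac_sample_size(size, eps=0.1, delta=0.05),
              agnostic_sample_size(size, eps=0.1, delta=0.05))
```

The dependence on the size of the hypothesis class is only logarithmic, while the dependence on the accuracy parameter is 1/ε in the realizable case and 1/ε² in the agnostic case.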
Boosting & game theory
• Suppose I have an algorithm A that for any distribution (weighting fn) over a dataset S can produce a rule $h \in H$ that gets < 40% error.
• Adaboost gives a way to use such an A to get error → 0 at a good rate, using weighted votes of rules produced.
• We can show that this is in principle possible by using the minimax theorem!
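
A compact AdaBoost sketch on a toy one-dimensional dataset illustrates the idea. The weak learner here is an exhaustive search over threshold rules ("stumps"), which plays the role of algorithm A; the dataset, the stump class, and all names are assumptions of this example. On this data no single stump is perfect, but the best stump under any weighting has error at most 1/3, so the weak-learning condition holds.

```python
import math


def stump_predict(stump, x):
    """A stump (threshold, sign) predicts sign if x > threshold, else -sign."""
    threshold, sign = stump
    return sign if x > threshold else -sign


def best_stump(xs, ys, dist):
    """Weak learner A: the stump with the lowest weighted error under dist."""
    pts = sorted(set(xs))
    # A threshold below every point yields the two constant rules.
    thresholds = [pts[0] - 1.0] + [(a + b) / 2 for a, b in zip(pts, pts[1:])]
    best, best_err = None, float("inf")
    for threshold in thresholds:
        for sign in (1, -1):
            stump = (threshold, sign)
            err = sum(d for x, y, d in zip(xs, ys, dist)
                      if stump_predict(stump, x) != y)
            if err < best_err:
                best, best_err = stump, err
    return best, best_err


def adaboost(xs, ys, rounds):
    """Return a list of (alpha, stump) pairs forming a weighted vote."""
    m = len(xs)
    dist = [1.0 / m] * m
    ensemble = []
    for _ in range(rounds):
        stump, err = best_stump(xs, ys, dist)
        err = min(max(err, 1e-12), 1 - 1e-12)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Up-weight examples the new rule gets wrong, down-weight the rest.
        dist = [d * math.exp(-alpha * y * stump_predict(stump, x))
                for x, y, d in zip(xs, ys, dist)]
        total = sum(dist)
        dist = [d / total for d in dist]
    return ensemble


def vote(ensemble, x):
    """Final prediction: sign of the weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score > 0 else -1


if __name__ == "__main__":
    xs = [0, 1, 2, 3, 4, 5]
    ys = [-1, -1, 1, 1, -1, -1]  # an interval: not separable by one threshold
    model = adaboost(xs, ys, rounds=50)
    print([vote(model, x) for x in xs])
```

The reweighting step is where the game-theoretic view enters: the booster plays distributions over examples, the weak learner responds with a rule, and the weighted vote corresponds to a mixed strategy over rules.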
Supermarket Pricing Problem
• A supermarket is trying to decide how to price its goods.

Seller’s Goal: set prices to maximize revenue.

• Simple case: customers make separate decisions on each item.
• Harder case: customers buy everything or nothing based on the sum of prices in their list.
• Or could be even more complex.
Supermarket Pricing Problem

Algorithmic

• Seller knows the market well.

Incentive Compatible Auction

• Must be in customers’ interest (dominant strategy) to report truthfully.

Online Pricing

• Customers arrive one at a time, buy what they want at current prices. Seller modifies prices over time.
• Techniques from learning will be useful here.
Submodular functions

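$V=\{1,2,\dots,n\}$, $f : 2^V \to \mathbb{R}$

Submodularity:

$f(S)+f(T) \ge f(S \cap T) + f(S \cup T) \quad \forall S,T \subseteq V$

Equivalent

Decreasing marginal values:

$f(S \cup \{x\})-f(S) \ge f(T \cup \{x\})-f(T) \quad \forall S \subseteq T \subseteq V,\ x \notin T$

Examples:

• Concave Functions: Let $h : \mathbb{R} \to \mathbb{R}$ be concave. For each $S \subseteq V$, let $f(S) = h(|S|)$.
• Vector Spaces: Let $V=\{v_1,\dots,v_n\}$, each $v_i \in \mathbb{R}^n$. For each $S \subseteq V$, let $f(S) = \mathrm{rank}(V[S])$.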
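
For small ground sets, the submodularity inequality can be checked by brute force over all pairs of subsets. The sketch below does this for the concave-function example with h = sqrt, and contrasts it with the convex choice |S|², which violates the inequality. The helper names and the tolerance are assumptions of the example.

```python
import math
from itertools import combinations


def all_subsets(ground):
    """All subsets of the ground set, as frozensets."""
    items = list(ground)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_submodular(f, ground, tol=1e-9):
    """Check f(S) + f(T) >= f(S & T) + f(S | T) for every pair S, T."""
    subsets = all_subsets(ground)
    return all(f(S) + f(T) >= f(S & T) + f(S | T) - tol
               for S in subsets for T in subsets)


def concave_of_size(S):
    """f(S) = h(|S|) with the concave function h = sqrt."""
    return math.sqrt(len(S))


def square_of_size(S):
    """f(S) = |S|^2, a convex function of |S| (not submodular)."""
    return float(len(S) ** 2)


if __name__ == "__main__":
    V = {1, 2, 3, 4}
    print(is_submodular(concave_of_size, V))
    print(is_submodular(square_of_size, V))
```

The check enumerates 2^n subsets, so it is only practical for very small n; it is meant as a sanity test of the definition, not as an algorithm.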
Submodular functions
• Strong connection between optimization and submodularity
• e.g.: minimization [C’85, GLS’87, IFF’01, S’00, …], maximization [NWF’78, V’07, …]
• Algorithmic game theory
• Submodular utility functions
• Much interest in Machine Learning community recently
• Tutorials at major conferences: ICML, NIPS, etc.
• www.submodularity.org is a Machine Learning site
• Interesting to understand their learnability