
Game Theory: Optimal stopping problem I: Introduction

Game Theory: Optimal stopping problem I: Introduction. Sarbar Tursunova. Table of contents: 1. The definition of the problem 2. The house-selling problem 3. Maximizing the average 4. The one-armed bandit 5. Detecting a change-point.


Presentation Transcript


  1. Game Theory: Optimal stopping problem I: Introduction Sarbar Tursunova

  2. Table of contents: 1. The definition of the problem 2. The house-selling problem 3. Maximizing the average 4. The one-armed bandit 5. Detecting a change-point

  3. The definition of the problem
     Stopping rule problems are defined by two objects:
     • a sequence of random variables, $X_1, X_2, \ldots$, whose joint distribution is assumed known
     • a sequence of real-valued reward functions, $y_0, y_1(x_1), y_2(x_1, x_2), \ldots$
     After observing $X_1 = x_1, \ldots, X_n = x_n$ you may either stop and receive $y_n(x_1, \ldots, x_n)$ or continue and observe $X_{n+1}$. Your problem is to choose a time to stop so as to maximize the expected reward.

  4. The house-selling problem
     Assume:
     • $X_n$: the amount of the offer received on day $n$
     • $c > 0$: the cost of observing each offer (the cost of living for one more day)
     Each day an offer $X_n$ arrives and you must decide: accept or reject?
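As a hedged illustration of the house-selling setup: the slide does not spell out the reward functions, so this sketch assumes the standard choice, $y_n = x_n - nc$ when past offers cannot be recalled and $y_n = \max(x_1, \ldots, x_n) - nc$ when they can. The function names and the threshold rule are hypothetical and serve only as an example of a stopping rule, not as the optimal one.

```python
def reward_sequence(offers, cost, recall=False):
    """Return [y_1, ..., y_n] for the observed offers.

    Assumed rewards (not stated on the slide):
      without recall: y_n = x_n - n * cost
      with recall:    y_n = max(x_1, ..., x_n) - n * cost
    """
    rewards = []
    best = float("-inf")
    for n, x in enumerate(offers, start=1):
        best = max(best, x)
        value = best if recall else x
        rewards.append(value - n * cost)
    return rewards


def stop_at_threshold(offers, cost, threshold):
    """Accept the first offer >= threshold (hypothetical rule, no recall).

    Returns (day, reward), or None if no offer reaches the threshold.
    """
    for n, x in enumerate(offers, start=1):
        if x >= threshold:
            return n, x - n * cost
    return None


if __name__ == "__main__":
    offers = [3, 5, 9, 4]
    print(reward_sequence(offers, cost=1))               # rewards without recall
    print(reward_sequence(offers, cost=1, recall=True))  # rewards with recall
    print(stop_at_threshold(offers, cost=1, threshold=8))
```

With recall, waiting past a good offer costs only the extra observation cost; without recall, the good offer itself is lost.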

  5. The problems with recall were introduced by MacQueen and Miller (1960), Derman and Sacks (1960) and Chow and Robbins (1961), and, with discount rather than cost, by Karlin (1962). In the economics literature, this problem is called the job search problem and is attributed to George Stigler (1961, 1962). An unemployed worker is searching for a job. Each search costs a certain amount in time and lost wages. How many searches should the worker undertake before accepting the best offer found so far? For a review from this viewpoint, see Lippman and McCall (1976).

  6. Maximizing the average You observe a coin being tossed repeatedly and may stop after any toss; your payoff is the average number of heads observed up to that point. What stopping rule should you employ to maximize your expected payoff? And how great an expected payoff can you obtain? This problem was first studied by Y.S. Chow and H. Robbins (1965).

  7. Put the problem of maximizing the average in the form of a stopping rule problem: let $X_1, X_2, \ldots$ be independent, identically distributed random variables with a known distribution having a finite mean $m$, and let
        $y_0 = m$
        $y_n(x_1, \ldots, x_n) = (x_1 + \cdots + x_n)/n$ for $n = 1, 2, \ldots$
        $y_\infty(x_1, x_2, \ldots) = m$
     This assumes that if you don't take any observations you receive $m$.
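The reward $y_n$ is just the running average, which can be sketched in a few lines. The function names are hypothetical, and the rule "stop the first time heads outnumber tails" is only an illustrative stopping rule, not a claim about the optimal one. Tosses are coded as 1 for heads and 0 for tails, and `Fraction` keeps the averages exact.

```python
from fractions import Fraction


def running_averages(xs):
    """Return [y_1, ..., y_n] with y_k = (x_1 + ... + x_k) / k."""
    total = 0
    averages = []
    for n, x in enumerate(xs, start=1):
        total += x
        averages.append(Fraction(total, n))
    return averages


def stop_when_heads_lead(tosses):
    """Illustrative rule: stop the first time heads outnumber tails.

    If that never happens, stop at the last available toss.
    Returns (stopping time n, payoff y_n).
    """
    if not tosses:
        raise ValueError("need at least one toss")
    heads = 0
    for n, x in enumerate(tosses, start=1):
        heads += x
        if 2 * heads > n:  # heads > tails after n tosses
            return n, Fraction(heads, n)
    return len(tosses), Fraction(heads, len(tosses))


if __name__ == "__main__":
    print(running_averages([1, 0, 1, 1]))
    print(stop_when_heads_lead([0, 1, 1, 0]))
```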

  8. The one-armed bandit (Bradt, Johnson and Karlin (1956))
     Given:
     • the standard treatment T2, with known probability of cure $p_0$
     • a treatment T1, with unknown probability of cure $p$
     • a group of $n$ patients
     Decide which treatment to give to each patient. The objective is to cure as many of the patients as possible; your payoff is the number of patients cured. Bradt, Johnson and Karlin show that if it is ever optimal to use T2 on a patient, then it is optimal to continue to use T2 on all subsequent patients. The problem is therefore when to start treatment T2, that is, when to stop using T1.
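Because the decision reduces to a single switching time, a stopping rule for this problem can be written as "keep using T1 until some condition holds, then use T2 for good". The sketch below uses a myopic condition that is NOT the optimal rule from the cited paper: it assumes a uniform Beta(1, 1) prior on $p$ (the slide specifies no prior) and switches as soon as the posterior mean of $p$ drops below $p_0$. The function name is hypothetical.

```python
from fractions import Fraction


def t1_patients_before_switch(t1_outcomes, p0):
    """Count how many patients receive T1 before a permanent switch to T2.

    t1_outcomes: results T1 would give, in order (1 = cured, 0 = not cured).
    p0: known cure probability of the standard treatment T2.

    Assumption (not from the slides): p has a uniform Beta(1, 1) prior, so
    after `treated` patients with `cures` cures the posterior mean of p is
    (cures + 1) / (treated + 2). Myopic rule: switch once this is below p0.
    """
    cures = 0
    for treated, outcome in enumerate(t1_outcomes):
        posterior_mean = Fraction(cures + 1, treated + 2)
        if posterior_mean < p0:
            return treated  # switch to T2 from this patient onward
        cures += outcome
    return len(t1_outcomes)  # never switched within the listed patients


if __name__ == "__main__":
    print(t1_patients_before_switch([1, 0, 0, 0], Fraction(1, 2)))
```

A myopic rule like this ignores the value of learning more about $p$, which is exactly what the full stopping-rule analysis accounts for.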

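  9. Detecting a change-point (Shiryaev (1963))
     • random variables $X_1, X_2, \ldots$, initially with distribution $F_0$
     • at an unknown time $T$ the distribution changes to another distribution $F_1$
     • $c > 0$: the cost of stopping before the change has occurred
     Total cost: $Y_n = c\,I\{n < T\} + (n - T)\,I\{n \ge T\}$ for $n = 0, 1, \ldots$, and $Y_\infty = \infty$.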

  10. In this display, $I(A)$ denotes the indicator function of a set $A$; for example, $I\{n < T\}$ is equal to 1 if $n < T$ and to zero otherwise. Since $T$ is a random, unobservable quantity, we may replace $Y_n$ by its conditional expected value given $X_1, \ldots, X_n$:
        $y_n = c\,P(T > n \mid \mathcal{F}_n) + E((n - T)^+ \mid \mathcal{F}_n)$ for $n = 0, 1, \ldots$, and $y_\infty = \infty$,
     where $\mathcal{F}_n$ stands for the information in $X_1, \ldots, X_n$ and $(n - T)^+ = \max(n - T, 0)$. Applications include monitoring heart patients for a change in pulse rate and monitoring a production line for a change in quality.
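To make the conditional expected cost concrete, the following sketch computes $y_n$ by brute force from the posterior distribution of $T$. Everything beyond the formula itself is an assumption for illustration: $T$ has a geometric prior $P(T = k) = \rho(1 - \rho)^{k-1}$ for $k = 1, 2, \ldots$, the observations are independent, $X_T$ is the first observation drawn from $F_1$, and $F_0$, $F_1$ are Bernoulli distributions. The function names are hypothetical.

```python
from math import prod


def bernoulli_pmf(q):
    """Return x -> P(X = x) for a Bernoulli(q) variable, x in {0, 1}."""
    return lambda x: q if x == 1 else 1.0 - q


def expected_cost(xs, c, rho, f0, f1):
    """y_n = c * P(T > n | x_1..x_n) + E[(n - T)^+ | x_1..x_n].

    Assumptions (not from the slides): geometric(rho) prior on T starting
    at 1, independent observations, x_1..x_{T-1} ~ f0 and x_T, ... ~ f1.
    """
    n = len(xs)
    # Unnormalized posterior weight of the event {T = k}, for k = 1..n.
    weights = []
    for k in range(1, n + 1):
        prior = rho * (1.0 - rho) ** (k - 1)
        likelihood = prod(f0(x) for x in xs[:k - 1]) * prod(f1(x) for x in xs[k - 1:])
        weights.append(prior * likelihood)
    # Unnormalized posterior weight of {T > n}: no change yet, all data from f0.
    weight_no_change = (1.0 - rho) ** n * prod(f0(x) for x in xs)
    total = weight_no_change + sum(weights)
    p_no_change = weight_no_change / total
    expected_delay = sum((n - k) * w for k, w in enumerate(weights, start=1)) / total
    return c * p_no_change + expected_delay


if __name__ == "__main__":
    f0, f1 = bernoulli_pmf(0.1), bernoulli_pmf(0.8)
    data = [0, 0, 1, 1, 1]
    for n in range(len(data) + 1):
        print(n, round(expected_cost(data[:n], c=5.0, rho=0.1, f0=f0, f1=f1), 4))
```

The loop recomputes every product from scratch, which is quadratic in $n$; that keeps the correspondence with the formula obvious at the expense of efficiency.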

  11. Thank you for your attention!!!
