Efficient algorithms for online decision problems. Adam Kalai, Santosh Vempala. Seminar on Experts and Bandits, Fall 17/18. Ran Hochshtet. Contents. Online decision problem N Experts Online shortest paths Tree update problem. Introduction. Online decision problem
Efficient algorithms for online decision problems Adam Kalai, Santosh Vempala Seminar on Experts and Bandits, Fall 17/18 Ran Hochshtet
Contents • Online decision problem • N Experts • Online shortest paths • Tree update problem
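Introduction • Online decision problem • No knowledge of the future • Each period $t$ we pick a choice $d_t$ • Pay the cost of that choice • Goal: minimize the regret, i.e. our total cost minus the cost of the best single choice in hindsight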
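Linear generalization • Series of decisions $d_1, d_2, \dots$ from a possibly infinite set $D \subset \mathbb{R}^n$ • Each period $t$ leads to a state $s_t \in S \subset \mathbb{R}^n$ • Making decision $d$ in state $s$ costs $d \cdot s$ • Total cost = $\sum_t d_t \cdot s_t$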
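Linear generalization • $s_t \in S \subset \mathbb{R}^n$ - state • $d_t \in D \subset \mathbb{R}^n$ - decision • $M(s) = \arg\min_{d \in D} d \cdot s$ computes the best single decision in hindsight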
Predicting from expert advice • $n$ experts • Each period $t$ we pick an expert • Pay the cost of the chosen expert • Goal: minimize the regret
Predicting from expert advice • $n$ experts problem • $s_t \in \mathbb{R}^n$ - the costs vector
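Motivation • Consider an example with two experts • The costs are: $(0, \tfrac{1}{2}), (1, 0), (0, 1), (1, 0), \dots$ • Follow the leader always incurs a cost of 1. Its total cost is about $t$, while each expert alone pays about $t/2$ • Using perturbations we can achieve an expected total cost close to $t/2$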
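The two-expert example can be checked numerically. The sketch below assumes the cost sequence $(0, \tfrac12), (1, 0), (0, 1), \dots$ from the Kalai-Vempala paper and a uniform perturbation in $[0, 1/\epsilon]$ per expert; the function names and the value of `eps` are illustrative choices, not taken from the slides.

```python
import random

def example_costs(T):
    """Cost vectors (0, 1/2), (1, 0), (0, 1), (1, 0), ... for T periods."""
    seq = [(0.0, 0.5)]
    while len(seq) < T:
        seq.append((1.0, 0.0) if len(seq) % 2 == 1 else (0.0, 1.0))
    return seq

def run_leader(seq, eps=None, rng=None):
    """Total cost of follow the leader (eps=None) or of the perturbed leader."""
    totals, paid = [0.0, 0.0], 0.0
    for cost in seq:
        if eps is None:
            scores = totals  # plain follow the leader
        else:
            # add a fresh uniform perturbation from [0, 1/eps] to each expert
            scores = [t + rng.uniform(0.0, 1.0 / eps) for t in totals]
        pick = 0 if scores[0] <= scores[1] else 1
        paid += cost[pick]
        totals = [totals[0] + cost[0], totals[1] + cost[1]]
    return paid
```

With `T = 1001`, follow the leader pays 1000 while the best expert pays 500; the perturbed version with a small `eps` picks each expert roughly half the time and lands near the best expert's total.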
FPL($\epsilon$): On each period $t$: • Choose $p_t$ uniformly at random from the cube $[0, 1/\epsilon]^n$ • Use $M(s_{1:t-1} + p_t)$
FPL*($\epsilon$): On each period $t$: • Choose $p_t$ at random according to the exponential density $d\mu(x) \propto e^{-\epsilon |x|_1}$ • Use $M(s_{1:t-1} + p_t)$
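A minimal sketch of both perturbation schemes over a finite decision set. It assumes the oracle $M(s)$ returns the decision minimising $d \cdot s$, the uniform perturbation is drawn from $[0, 1/\epsilon]^n$, and the exponential density is proportional to $e^{-\epsilon |x|_1}$ (sampled here as independent Laplace coordinates). All function names are hypothetical.

```python
import random

def dot(d, s):
    return sum(di * si for di, si in zip(d, s))

def make_oracle(decisions):
    """M(s): the decision in a finite set that minimises d . s."""
    return lambda s: min(decisions, key=lambda d: dot(d, s))

def perturb_uniform(n, eps, rng):
    """Uniform point of the cube [0, 1/eps]^n."""
    return [rng.uniform(0.0, 1.0 / eps) for _ in range(n)]

def perturb_exponential(n, eps, rng):
    """Density proportional to exp(-eps * |x|_1): a random sign times an
    exponential magnitude in each coordinate."""
    return [rng.choice((-1.0, 1.0)) * rng.expovariate(eps) for _ in range(n)]

def follow_perturbed_leader(M, states, perturb, eps, rng):
    """Play M(s_{1:t-1} + p_t) each period and return the total cost."""
    n = len(states[0])
    cumulative, total = [0.0] * n, 0.0
    for s in states:
        p = perturb(n, eps, rng)  # fresh perturbation every period
        d = M([c + pi for c, pi in zip(cumulative, p)])
        total += dot(d, s)
        cumulative = [c + si for c, si in zip(cumulative, s)]
    return total
```

For the experts problem the decision set is the set of unit vectors, so `make_oracle` simply returns the expert with the smallest perturbed cumulative cost.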
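Notations • $|x|_1 = \sum_i |x_i|$ for $x \in \mathbb{R}^n$ • $D \ge |d - d'|_1$ for all $d, d' \in D$ • $R \ge |d \cdot s|$ for all $d \in D$, $s \in S$ • $A \ge |s|_1$ for all $s \in S$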
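Theorem 1.1 • Let $s_1, s_2, \dots \in S$ be a state sequence • (a) Running FPL($\epsilon$) with $\epsilon \le 1$ gives: $E[\text{cost of FPL}(\epsilon)] \le \text{min-cost}_T + \epsilon R A T + \frac{D}{\epsilon}$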
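Theorem 1.1 • Let $s_1, s_2, \dots \in S$ be a state sequence • (b) For nonnegative $D, S \subseteq \mathbb{R}^n_+$, FPL*($\epsilon / 2A$) gives: $E[\text{cost of FPL*}(\epsilon / 2A)] \le (1 + \epsilon)\,\text{min-cost}_T + \frac{4AD(1 + \ln n)}{\epsilon}$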
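Theorem 1.1 • If $T$ or $\text{min-cost}_T$ are known, $\epsilon$ can be tuned in advance • For FPL: $\epsilon = \sqrt{D / (RAT)}$ gives $E[\text{cost}] \le \text{min-cost}_T + 2\sqrt{DRAT}$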
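The tuned value of $\epsilon$ for FPL comes from balancing the two $\epsilon$-dependent terms of the bound. A short derivation, assuming the bound of part (a) has the form $\text{min-cost}_T + \epsilon RAT + D/\epsilon$ as in the Kalai-Vempala paper, and that the resulting $\epsilon^\ast$ is at most 1:

```latex
\begin{align*}
f(\epsilon) &= \epsilon RAT + \frac{D}{\epsilon}, \qquad
f'(\epsilon) = RAT - \frac{D}{\epsilon^{2}} = 0
\;\Longrightarrow\; \epsilon^{\ast} = \sqrt{\frac{D}{RAT}}, \\
f(\epsilon^{\ast}) &= \sqrt{DRAT} + \sqrt{DRAT} = 2\sqrt{DRAT}.
\end{align*}
```

The regret therefore grows like $\sqrt{T}$, so the average regret per period, $2\sqrt{DRA/T}$, vanishes as $T$ grows.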
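Experts problem • It seems that $D \le 2$, $A \le n$, because every expert may incur cost in the same period • In our algorithm the worst case is when each period only one expert incurs cost, so we may take $A \le 1$ • (Figure: min-cost is b. After we choose b, there is a chance we choose c.)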
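Experts problem • $D \le 2$, $A \le 1$ • By Theorem 1.1(b): $E[\text{cost}] \le (1 + \epsilon)\,\text{min-cost}_T + \frac{O(\log n)}{\epsilon}$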
Online shortest paths • Input • Directed graph $G = (V, E)$ • Pair of nodes $(s, t)$ • Each period pick a path from $s$ to $t$ • Then the times on all edges are revealed • The cost is the sum of times on the chosen path
Online shortest paths • $m = |E|$ is the number of edges • $s_t \in \mathbb{R}^m$ - the times vector
Online shortest paths • Use FPL* • On each period $t$ • For each edge $e$ pick $p_t[e]$ from the exponential distribution (same as FPL*) • $s_{1:t-1}[e]$ = the total time on edge $e$ so far • Use the shortest path in the graph with weights $s_{1:t-1}[e] + p_t[e]$
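The shortest-path variant can be sketched with Dijkstra's algorithm as the oracle. This is an illustration under assumptions: each edge perturbation is a nonnegative exponential draw with rate `eps` (so all weights stay nonnegative), the target is reachable from the source, and the adjacency format and function names are hypothetical.

```python
import heapq
import random

def shortest_path(adj, weights, source, target):
    """Dijkstra. adj maps node -> list of (edge_id, next_node).
    Returns the edge ids of a shortest source -> target path."""
    dist, prev, done = {source: 0.0}, {}, set()
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for e, v in adj.get(u, []):
            nd = d + weights[e]
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = (u, e)
                heapq.heappush(heap, (nd, v))
    path, node = [], target
    while node != source:  # walk predecessors back to the source
        node, e = prev[node]
        path.append(e)
    return path[::-1]

def online_shortest_paths(adj, m, source, target, times_per_period, eps, rng):
    """Each period: perturb the cumulative edge times, route, then pay."""
    total_so_far, paid = [0.0] * m, 0.0
    for times in times_per_period:
        weights = [total_so_far[e] + rng.expovariate(eps) for e in range(m)]
        path = shortest_path(adj, weights, source, target)
        paid += sum(times[e] for e in path)  # times are revealed after choosing
        total_so_far = [a + b for a, b in zip(total_so_far, times)]
    return paid
```

The point of the design is that the decision set (all paths) is never enumerated: one shortest-path computation per period is enough.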
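Online shortest paths • With $n$ nodes, $m$ edges and edge times in $[0, 1]$: $D \le 2n$ (a path has fewer than $n$ edges), $A \le m$ • By Theorem 1.1(b): $E[\text{cost}] \le (1 + \epsilon)\,\text{min-cost}_T + \frac{O(mn \log n)}{\epsilon}$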
Proof of Theorem 1.1(a) • “Be the leader” – use $M(s_{1:t})$ instead of $M(s_{1:t-1})$ • “Be the leader” has no regret • Prove by induction
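Proof of Theorem 1.1(a) • We show that: $\sum_{t=1}^{T} M(s_{1:t}) \cdot s_t \le M(s_{1:T}) \cdot s_{1:T}$ • For $T = 1$ – trivial • Induction step from $T$ to $T + 1$: $\sum_{t=1}^{T+1} M(s_{1:t}) \cdot s_t \le M(s_{1:T}) \cdot s_{1:T} + M(s_{1:T+1}) \cdot s_{T+1} \le M(s_{1:T+1}) \cdot s_{1:T} + M(s_{1:T+1}) \cdot s_{T+1} = M(s_{1:T+1}) \cdot s_{1:T+1}$, where the second inequality holds because $M(s_{1:T})$ minimizes $d \cdot s_{1:T}$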
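Lemma 3.1 • We want to show that perturbations do not hurt too much • Still the “be the leader” algorithm • For any state sequence $s_1, s_2, \dots$, any $T > 0$ and any vectors $p_0 = 0, p_1, \dots, p_T \in \mathbb{R}^n$: $\sum_{t=1}^{T} M(s_{1:t} + p_t) \cdot s_t \le M(s_{1:T}) \cdot s_{1:T} + D \sum_{t=1}^{T} |p_t - p_{t-1}|_\infty$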
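Proof of Lemma 3.1 • Pretend the state on period $t$ is $s_t + p_t - p_{t-1}$ (with $p_0 = 0$), so the cumulative state is $s_{1:t} + p_t$ • “Be the leader” on this sequence gives: $\sum_{t=1}^{T} M(s_{1:t} + p_t) \cdot (s_t + p_t - p_{t-1}) \le M(s_{1:T} + p_T) \cdot (s_{1:T} + p_T) \le M(s_{1:T}) \cdot (s_{1:T} + p_T)$ • Since $p_T = \sum_{t=1}^{T} (p_t - p_{t-1})$, rearranging gives: $\sum_{t=1}^{T} M(s_{1:t} + p_t) \cdot s_t \le M(s_{1:T}) \cdot s_{1:T} + \sum_{t=1}^{T} \big(M(s_{1:T}) - M(s_{1:t} + p_t)\big) \cdot (p_t - p_{t-1})$ • Each term of the last sum is at most $D\,|p_t - p_{t-1}|_\infty$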
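Proof of Theorem 1.1(a) • Use $p_t = p_1$, for all $t$ • No need to choose a new $p_t$ each period: by linearity of expectation the expected cost is the same • Applying Lemma 3.1: $E\big[\sum_{t=1}^{T} M(s_{1:t} + p_1) \cdot s_t\big] \le \text{min-cost}_T + D\,|p_1|_\infty \le \text{min-cost}_T + \frac{D}{\epsilon}$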
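Proof of Theorem 1.1(a) • Now we return to use $M(s_{1:t-1} + p_t)$ instead of $M(s_{1:t} + p_t)$ • We need to show that: $E[M(s_{1:t-1} + p_t) \cdot s_t] \le E[M(s_{1:t} + p_t) \cdot s_t] + \epsilon R A$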
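Proof of Theorem 1.1(a) • Key idea: the distributions over $s_{1:t-1} + p_t$ and $s_{1:t} + p_t$ are similar • If the cubes are identical, i.e. $s_{1:t-1} = s_{1:t}$, then $E[M(s_{1:t-1} + p_t) \cdot s_t] = E[M(s_{1:t} + p_t) \cdot s_t]$ • If they overlap on a fraction $f$ of their volume: $E[M(s_{1:t-1} + p_t) \cdot s_t] \le E[M(s_{1:t} + p_t) \cdot s_t] + (1 - f) R$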
Lemma 3.2 • For any $v \in \mathbb{R}^n$, the cubes $[0, 1/\epsilon]^n$ and $v + [0, 1/\epsilon]^n$ overlap in at least a $(1 - \epsilon |v|_1)$ fraction
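Proof of Lemma 3.2 • Take a random point $x \in [0, 1/\epsilon]^n$. If $x \notin v + [0, 1/\epsilon]^n$, then for some $i$, $x_i \notin v_i + [0, 1/\epsilon]$, which happens with probability at most $\epsilon |v_i|$ • With the union bound we get: $\Pr\big[x \notin v + [0, 1/\epsilon]^n\big] \le \sum_{i=1}^{n} \epsilon |v_i| = \epsilon |v|_1$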
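The overlap bound can be sanity-checked numerically. The sketch below assumes the two cubes are $[0, 1/\epsilon]^n$ and its translate by a vector $v$; it compares the exact overlap fraction (a product over coordinates) with the union-bound estimate $1 - \epsilon |v|_1$ and with a Monte Carlo estimate. Function names are illustrative.

```python
import random

def exact_overlap(v, eps):
    """Exact fraction of [0, 1/eps]^n that also lies in v + [0, 1/eps]^n."""
    frac = 1.0
    for vi in v:
        frac *= max(0.0, 1.0 - eps * abs(vi))
    return frac

def union_bound(v, eps):
    """Lower bound 1 - eps * |v|_1 obtained from the union bound."""
    return 1.0 - eps * sum(abs(vi) for vi in v)

def sampled_overlap(v, eps, samples, rng):
    """Monte Carlo estimate: draw x uniformly from the first cube and
    count how often it also falls inside the shifted cube."""
    side, inside = 1.0 / eps, 0
    for _ in range(samples):
        x = [rng.uniform(0.0, side) for _ in v]
        if all(vi <= xi <= vi + side for xi, vi in zip(x, v)):
            inside += 1
    return inside / samples
```

The exact fraction is never below the union bound, because $\prod_i (1 - a_i) \ge 1 - \sum_i a_i$ for $a_i \in [0, 1]$.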
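Proof of Theorem 1.1(a) • By Lemma 3.2: • Each period the difference between using $M(s_{1:t-1} + p_t)$ and $M(s_{1:t} + p_t)$ is at most $\epsilon |s_t|_1 R \le \epsilon A R$ • We get: $E[\text{cost of FPL}(\epsilon)] \le \text{min-cost}_T + \epsilon R A T + \frac{D}{\epsilon}$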
The tree update problem • Maintain a binary search tree over $n$ items • There is an unknown sequence of accesses • The cost of an access is the number of comparisons • This equals the depth of the accessed item in the tree
The tree update problem • We can solve the problem with FPL • Each period we find the best tree so far, and use it • The problem: • For each access we do an expensive computation
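The tree update problem • Follow the lazy leading tree ($N$): • For $i = 1, \dots, n$, let $v_i := 0$ and choose $r_i$ randomly from $\{1, 2, \dots, N\}$ • Start with the best tree as if there were $r_i$ accesses to each node $i$ • After each access to item $i$: (a) $v_i := v_i + 1$ (b) if $v_i \ge r_i$ then i. $r_i := r_i + N$ ii. Compute the best tree as if there were $r_j$ accesses to each node $j$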
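A sketch of the lazy leading tree with counters `v` and thresholds `r`, assuming the thresholds start uniformly in $\{1, \dots, N\}$ and grow by $N$ on each update, as in the Kalai-Vempala paper. The oracle here is a plain cubic-time dynamic program for the optimal static binary search tree, which is a stand-in chosen for brevity rather than the method used in the talk; all names are hypothetical.

```python
import random
from functools import lru_cache

def best_tree(counts):
    """Optimal static BST for the given access counts.
    Returns nested tuples (root, left, right), or None for an empty range."""
    prefix = [0]
    for c in counts:
        prefix.append(prefix[-1] + c)

    @lru_cache(maxsize=None)
    def solve(lo, hi):  # items lo .. hi-1
        if lo >= hi:
            return 0, None
        weight = prefix[hi] - prefix[lo]  # every item here sits one level deeper
        best = None
        for root in range(lo, hi):
            left_cost, left = solve(lo, root)
            right_cost, right = solve(root + 1, hi)
            cost = left_cost + right_cost + weight
            if best is None or cost < best[0]:
                best = (cost, (root, left, right))
        return best

    return solve(0, len(counts))[1]

def comparisons(tree, item):
    """Number of comparisons needed to reach item in the tree."""
    steps = 1
    while tree[0] != item:
        tree = tree[1] if item < tree[0] else tree[2]
        steps += 1
    return steps

def lazy_leading_tree(n, accesses, N, rng):
    """Returns (total comparisons, number of oracle calls)."""
    v = [0] * n
    r = [rng.randint(1, N) for _ in range(n)]
    tree, cost, oracle_calls = best_tree(r), 0, 1
    for i in accesses:
        cost += comparisons(tree, i)
        v[i] += 1
        if v[i] >= r[i]:  # threshold crossed: raise it and recompute
            r[i] += N
            tree = best_tree(r)
            oracle_calls += 1
    return cost, oracle_calls
```

Each item can trigger an update only about once per $N$ of its own accesses, so the number of oracle calls is at most $1 + n + T/N$ for $T$ accesses, instead of one call per access.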
Calling the oracle $M$ can be computationally expensive • We want to minimize the number of times we use $M$ • Trick: use the same perturbation as often as possible
FLL is equivalent to FPL in terms of expected cost • FLL rarely calls the oracle • FLL rarely changes decision from one period to the next
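FLL($\epsilon$): Once, choose $p \in [0, 1/\epsilon)^n$ uniformly • Determine a grid: $G = \{p + z/\epsilon : z \in \mathbb{Z}^n\}$ • On period $t$, use $M(g_{t-1})$, where $g_{t-1}$ is the unique point in $G \cap (s_{1:t-1} + [0, 1/\epsilon)^n)$ • If $g_t = g_{t-1}$ - no need to re-evaluate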
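A sketch of the lazy leader with a random grid. It assumes the grid is $\{p + z/\epsilon : z \in \mathbb{Z}^n\}$ with $p$ uniform in the cube of side $1/\epsilon$, and that the grid point used on period $t$ is the unique one in $s_{1:t-1} + [0, 1/\epsilon)^n$. The integer vector $z$ is tracked instead of the point itself to avoid comparing floats; the names are illustrative.

```python
import math
import random

def follow_the_lazy_leader(M, states, eps, rng):
    """Returns (total cost, number of oracle calls)."""
    n = len(states[0])
    side = 1.0 / eps
    p = [rng.uniform(0.0, side) for _ in range(n)]  # chosen once
    cumulative = [0.0] * n
    last_index, decision = None, None
    cost, oracle_calls = 0.0, 0
    for s in states:
        # z such that p + z * side lies in cumulative + [0, side)^n
        index = tuple(math.ceil((c - pi) / side)
                      for c, pi in zip(cumulative, p))
        if index != last_index:  # grid point moved: re-evaluate the oracle
            g = [pi + z * side for pi, z in zip(p, index)]
            decision = M(g)
            oracle_calls += 1
            last_index = index
        cost += sum(d * si for d, si in zip(decision, s))
        cumulative = [c + si for c, si in zip(cumulative, s)]
    return cost, oracle_calls

def expert_oracle(g):
    """M for the experts problem: unit vector of the smallest coordinate."""
    best = min(range(len(g)), key=lambda i: g[i])
    return tuple(1.0 if i == best else 0.0 for i in range(len(g)))
```

Because each coordinate of the grid index changes only once per $1/\epsilon$ units of cumulative cost, the oracle is called on roughly an $\epsilon |s_t|_1$ fraction of the periods.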
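Lemma • For any fixed sequence of states $s_1, s_2, \dots$, FPL($\epsilon$) and FLL($\epsilon$) (also FPL*($\epsilon$) and FLL*($\epsilon$)) have identical expected costs on each period $t$. • The probability of FLL($\epsilon$) (or FLL*($\epsilon$)) performing an update on period $t$ is at most $\epsilon |s_t|_1$.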
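Proof of Lemma • FLL($\epsilon$) chooses a uniformly random grid of spacing $1/\epsilon$ • There is exactly one grid point inside $s_{1:t-1} + [0, 1/\epsilon)^n$ • By symmetry $g_{t-1}$ is uniformly distributed over $s_{1:t-1} + [0, 1/\epsilon)^n$ • Same as FPL($\epsilon$) - $s_{1:t-1} + p_t$ is uniform over $s_{1:t-1} + [0, 1/\epsilon]^n$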
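Lemma 3.2: For any $v \in \mathbb{R}^n$, the cubes $[0, 1/\epsilon]^n$ and $v + [0, 1/\epsilon]^n$ overlap in at least a $(1 - \epsilon |v|_1)$ fraction. Proof of Lemma • In each update: • The grid point of $s_{1:t-1} + [0, 1/\epsilon)^n$ is not in the cube $s_{1:t} + [0, 1/\epsilon)^n$ • By Lemma 3.2: this happens with probability at most $\epsilon |s_t|_1$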
Summary • Online decision problem • N Experts • Online shortest paths • Tree update problem
THANKS! ANY QUESTIONS?