
Online Algorithms – II


Presentation Transcript


  1. Online Algorithms – II Amrinder Arora Permalink: http://standardwisdom.com/softwarejournal/presentations/

  2. Summary • Importance and Research Topics • Online algorithms show up in many practical problems. • Even if you are considering an offline problem, consider what the online version of that problem would be. • Research areas include improving algorithms, improving the analysis of existing algorithms, proving tightness of analysis, considering problem variations, etc.

  3. Part II only makes sense if it is better than Part I.

  4. Options • Randomized version of online job scheduling • Online algorithms in machine learning • Online graph coloring

  5. Job Scheduling – Randomized • No randomized algorithm for 2-machine scheduling can be better than 4/3-competitive • Consider any randomized algorithm A • How can we prove a lower bound on the competitive ratio of algorithm A?

  6. Job Scheduling • Consider the old job sequence 1, 1, 2 • After the first two jobs, OPT = 1, so if A is at most 4/3-competitive then E[L1] ≤ 4/3, where L1 is the larger load. • Since L1 + L2 = 2, E[L2] ≥ 2/3. • When the job of size 2 arrives, the best A can do is place it on the less loaded machine, so E[makespan] ≥ E[L2] + 2 ≥ 8/3. • OPT offline makespan = 2. • Therefore, the competitive ratio of randomized algorithm A is ≥ (8/3)/2 = 4/3.
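
The 4/3 bound can also be checked concretely. Since the two unit jobs are symmetric, the only meaningful randomized choice is the probability q of placing them on different machines, and the algorithm's best response to the size-2 job is the lighter machine. The sketch below (an illustration added here, not from the slides) evaluates both adversary stopping points and finds that no q beats 4/3, attained at q = 2/3.

```python
# Lower-bound check for randomized 2-machine scheduling on the sequence 1, 1, 2.
# q = probability that the two unit jobs are placed on different machines.

def worst_case_ratio(q: float) -> float:
    # Prefix 1, 1: makespan is 1 if the unit jobs are split, 2 if together; OPT = 1.
    ratio_after_two = (q * 1 + (1 - q) * 2) / 1.0
    # Full sequence 1, 1, 2: the size-2 job goes on the lighter machine,
    # giving makespan 3 if the unit jobs were split, 2 otherwise; OPT = 2.
    ratio_after_three = (q * 3 + (1 - q) * 2) / 2.0
    # The adversary stops wherever the algorithm looks worse.
    return max(ratio_after_two, ratio_after_three)

best = min((worst_case_ratio(q / 1000), q / 1000) for q in range(1001))
print(best)  # ≈ (1.33, 0.67): no q beats 4/3, attained at q = 2/3
```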

  7. Algorithm “Random Scheduler” • A 4/3-competitive randomized algorithm • Tries to keep the machine loads in an expected ratio of 2:1.

  8. Online Algorithms in Machine Learning • But first, let us understand classification techniques

  9. Classification • Given a collection of records (training set) • Each record contains a set of attributes and a class. • Find a model for the class attribute as a function of the values of the other attributes. • Goal: previously unseen records should be assigned a class as accurately as possible. • A test set is used to determine the accuracy of the model. Usually, the given data set is divided into training and test sets, with the training set used to build the model and the test set used to validate it.

  10. Illustrating Classification Task

  11. Examples of Classification Task • Predicting tax returns as “clean” or “auditable” • Predicting tumor cells as benign or malignant • Classifying credit card transactions as legitimate or fraudulent • Classifying secondary structures of protein as alpha-helix, beta-sheet, or random coil • Categorizing news stories as finance, weather, entertainment, sports, etc.

  12. Classification Techniques • Decision Tree based Methods • Rule-based Methods • Memory-based Reasoning • Neural Networks • Naïve Bayes and Bayesian Belief Networks • Support Vector Machines

  13. Example of a Decision Tree • income < $40K: job > 5 yrs → good risk; job < 5 yrs → bad risk • income > $40K: high debt → bad risk; low debt → good risk
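
Written as code, the tree on this slide is just nested conditionals. A minimal sketch, with hypothetical attribute names (income, years_in_job, high_debt); the thresholds come from the slide.

```python
# The decision tree from the slide as nested conditionals.
# Attribute names are illustrative; thresholds are taken from the slide.

def credit_risk(income: float, years_in_job: float, high_debt: bool) -> str:
    if income < 40_000:
        return "good risk" if years_in_job > 5 else "bad risk"
    else:
        return "bad risk" if high_debt else "good risk"

print(credit_risk(income=35_000, years_in_job=7, high_debt=False))  # good risk
print(credit_risk(income=80_000, years_in_job=2, high_debt=True))   # bad risk
```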

  14. Decision Tree Induction • Many algorithms: • Hunt’s Algorithm (one of the earliest) • CART • ID3, C4.5 • SLIQ, SPRINT

  15. General Structure of Hunt’s Algorithm • Let Dt be the set of training records that reach a node t • General procedure: • If Dt contains only records that belong to the same class yt, then t is a leaf node labeled yt • If Dt is an empty set, then t is a leaf node labeled with the default class yd • If Dt contains records that belong to more than one class, use an attribute test to split the data into smaller subsets, and recursively apply the procedure to each subset.
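
The recursion on this slide can be sketched directly. The attribute test is left abstract on the slide, so choose_best_split below is a deliberately naive placeholder (a real implementation would pick the split using an impurity measure, as on the next slide).

```python
# A sketch of Hunt's recursive procedure for growing a decision tree.
# Records are (attributes, label) pairs.

from collections import Counter

def choose_best_split(records, attr_index=0):
    # Toy split: partition records by the value of one fixed attribute.
    # (Placeholder only; the slide leaves the attribute test abstract.)
    partitions = {}
    for attrs, label in records:
        partitions.setdefault(attrs[attr_index], []).append((attrs, label))
    return f"attr[{attr_index}]", partitions

def hunt(records, default_class=None):
    labels = [label for _, label in records]
    if not records:
        return {"leaf": default_class}            # empty Dt -> default class
    if len(set(labels)) == 1:
        return {"leaf": labels[0]}                # pure Dt -> leaf with that class
    majority = Counter(labels).most_common(1)[0][0]
    test, parts = choose_best_split(records)      # attribute test splits Dt
    if len(parts) == 1:                           # cannot split further -> majority leaf
        return {"leaf": majority}
    return {"test": test,
            "children": {value: hunt(subset, majority)
                         for value, subset in parts.items()}}

records = [((40, "low"), "good"), ((25, "high"), "bad"), ((60, "low"), "good")]
print(hunt(records))
```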

  16. Measures of Node Impurity • Gini index • Entropy • Misclassification error
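
For a node whose class proportions are p1, ..., pc, the standard formulas are Gini = 1 − Σ pi², entropy = −Σ pi log2 pi, and misclassification error = 1 − max pi. A small self-contained sketch:

```python
import math
from collections import Counter

def class_proportions(labels):
    counts = Counter(labels)
    return [c / len(labels) for c in counts.values()]

def gini(labels):
    return 1.0 - sum(p * p for p in class_proportions(labels))

def entropy(labels):
    return -sum(p * math.log2(p) for p in class_proportions(labels) if p > 0)

def misclassification_error(labels):
    return 1.0 - max(class_proportions(labels))

labels = ["good", "good", "bad", "good"]   # example node with a 3:1 class split
print(gini(labels), entropy(labels), misclassification_error(labels))
# ≈ 0.375  0.811  0.25
```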

  17. Different kinds of classifiers • Different decision trees based on Hunt’s algorithm • C4.5 • Naïve Bayes • Support Vector Machine

  18. Online Algorithms in Machine Learning • Given n experts, each of whom outputs a prediction in {0, 1} • We want to predict the outcome ourselves • After each try, we are told the true result • Goal: over time, do “not much worse” than the best expert.

  19. “Weighted Majority” – Algorithm 1 • Initialize the weights of all experts w1..wn to 1 • At each step, take the majority decision: output 1 if the total weight of experts predicting 1 is at least half of the total weight • After each step, halve the weight of each expert who was wrong (leave the weights of correct experts unchanged)
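
A direct sketch of Algorithm 1, keeping one weight per expert (the round data in the example is made up):

```python
# Deterministic Weighted Majority (Algorithm 1 from the slides).
# predictions[i] is expert i's 0/1 prediction for the current round.

def weighted_majority(expert_rounds, outcomes):
    n = len(expert_rounds[0])
    weights = [1.0] * n                        # all weights start at 1
    mistakes = 0
    for predictions, truth in zip(expert_rounds, outcomes):
        weight_for_1 = sum(w for w, p in zip(weights, predictions) if p == 1)
        guess = 1 if weight_for_1 >= sum(weights) / 2 else 0   # weighted majority vote
        mistakes += (guess != truth)
        weights = [w / 2 if p != truth else w                  # halve wrong experts
                   for w, p in zip(weights, predictions)]
    return mistakes

# Toy run: 3 experts over 4 rounds; expert 0 is always right.
rounds = [(1, 0, 0), (0, 0, 1), (1, 1, 0), (1, 0, 1)]
truth  = [1, 0, 1, 1]
print(weighted_majority(rounds, truth))   # 1 mistake; the best expert makes 0
```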

  20. Performance of WM-A1 The number of mistakes made by Weighted Majority Algorithm 1 is never more than 2.41 (m + lg n), where m is the number of mistakes made by the best expert. Proof • Suppose WM-A1 makes M mistakes. • On each mistake, the total weight drops by at least a factor of 1/4 (at least half the weight was on wrong experts, and that weight is halved), so after M mistakes the total weight is no more than n(3/4)^M. • [All initial weights are 1, so the initial total weight = n.] • After each of its mistakes, the best expert’s weight is halved, so its final weight is (1/2)^m. • Since the best expert’s weight is no more than the total weight: (1/2)^m ≤ n(3/4)^M.

  21. Performance of WM-A1 Proof (cont.) (1/2)^m ≤ n(3/4)^M ⇒ (4/3)^M ≤ n·2^m ⇒ M lg(4/3) ≤ lg n + m ⇒ M ≤ [1 / lg(4/3)] (m + lg n) ⇒ M ≤ 2.41 (m + lg n) The number of mistakes made by Weighted Majority Algorithm 1 is never more than 2.41 (m + lg n), where m is the number of mistakes made by the best expert and n is the number of experts.

  22. Are the experts independent?

  23. “Weighted Majority” – Algorithm 2 • Initialize the weights of all experts w1..wn to 1 • At each step, make a randomized decision: output 1 with probability equal to the total weight of experts predicting 1, divided by the total weight. • After each step, multiply the weight of each expert who was wrong by β (leave the weights of correct experts unchanged)
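
A sketch of Algorithm 2; it differs from Algorithm 1 only in the randomized prediction and the penalty factor β (again with made-up example data):

```python
import random

# Randomized Weighted Majority (Algorithm 2 from the slides).
def randomized_weighted_majority(expert_rounds, outcomes, beta=0.5, seed=0):
    rng = random.Random(seed)
    n = len(expert_rounds[0])
    weights = [1.0] * n
    mistakes = 0
    for predictions, truth in zip(expert_rounds, outcomes):
        weight_for_1 = sum(w for w, p in zip(weights, predictions) if p == 1)
        guess = 1 if rng.random() < weight_for_1 / sum(weights) else 0  # randomized vote
        mistakes += (guess != truth)
        weights = [w * beta if p != truth else w                        # penalize by beta
                   for w, p in zip(weights, predictions)]
    return mistakes

rounds = [(1, 0, 0), (0, 0, 1), (1, 1, 0), (1, 0, 1)]
truth  = [1, 0, 1, 1]
print(randomized_weighted_majority(rounds, truth, beta=0.5))
```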

  24. Performance of WM-A2 The number of mistakes made by Weighted Majority Algorithm 2 is never more than (m ln(1/β) + ln n)/(1 − β), where m is the number of mistakes made by the best expert. For β = 1/2, this is: 1.39m + 2 ln n For β = 3/4, this is: 1.15m + 4 ln n
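
The two numeric instances follow directly from the bound; a two-line arithmetic check of the coefficients:

```python
import math

def wm2_coefficients(beta):
    # Bound: M <= (m*ln(1/beta) + ln n) / (1 - beta)  =  a*m + b*ln n
    return math.log(1 / beta) / (1 - beta), 1 / (1 - beta)

print(wm2_coefficients(0.5))   # ≈ (1.386, 2.0)  ->  1.39 m + 2 ln n
print(wm2_coefficients(0.75))  # ≈ (1.151, 4.0)  ->  1.15 m + 4 ln n
```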

  25. Performance of WM-A2 Proof • Suppose we have seen t tries so far. • Let F_i be the fraction of the total weight on the wrong answer at the i-th try. • Let M be the (expected) number of mistakes made by WM-A2. • Then M = Σ_{i=1..t} F_i • [Why? Because in each try, the probability of a mistake is F_i.] • Suppose the best expert makes m mistakes. • After each of its mistakes, the best expert’s weight is multiplied by β, so its final weight is β^m. • During round i, the total weight changes as: W ← W (1 − (1 − β) F_i)

  26. Performance of WM-A2 Proof (cont.) • Therefore, at the end of t tries, the total weight is: W = n · Π_{i=1..t} (1 − (1 − β) F_i) • Since the total weight ≥ the best expert’s weight: n · Π_{i=1..t} (1 − (1 − β) F_i) ≥ β^m • Taking natural logs: ln n + Σ_{i=1..t} ln(1 − (1 − β) F_i) ≥ m ln β • Multiplying by −1 reverses the inequality: −ln n − Σ_{i=1..t} ln(1 − (1 − β) F_i) ≤ m ln(1/β) • A bit of math: −ln(1 − x) ≥ x, so −Σ_{i=1..t} ln(1 − (1 − β) F_i) ≥ (1 − β) Σ_{i=1..t} F_i = (1 − β) M • Therefore: −ln n + (1 − β) M ≤ m ln(1/β) • M ≤ (m ln(1/β) + ln n) / (1 − β) The number of mistakes made by Weighted Majority Algorithm 2 is never more than (m ln(1/β) + ln n)/(1 − β), where m is the number of mistakes made by the best expert.

  27. Why does this all matter? • http://www.fda.gov/predict

  28. Online Graph Coloring • Vertices arrive one by one • We need to assign a color immediately, and it cannot be changed later. • What we are shown is an induced subgraph – not merely a subgraph. • In other words, edges cannot arrive by themselves.
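
The slides do not commit to a particular algorithm here; the natural baseline is First-Fit, which gives each arriving vertex the smallest color not used by its already-colored neighbors. A minimal sketch under the induced-subgraph model (each new vertex arrives together with its edges to all previously revealed vertices):

```python
# First-Fit online graph coloring (a standard baseline; the slides do not
# name a specific algorithm). Each vertex must be colored on arrival.

def first_fit_online_coloring(arrivals):
    """arrivals: list of (vertex, neighbors among earlier vertices)."""
    color = {}
    for v, earlier_neighbors in arrivals:
        used = {color[u] for u in earlier_neighbors}
        c = 0
        while c in used:          # smallest color unused by revealed neighbors
            c += 1
        color[v] = c              # irrevocable decision
    return color

# A path a-b-c-d revealed in the order a, c, b, d.
arrivals = [("a", []), ("c", []), ("b", ["a", "c"]), ("d", ["c"])]
print(first_fit_online_coloring(arrivals))   # {'a': 0, 'c': 0, 'b': 1, 'd': 1}
```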

  29. Applications of Graph Coloring • Tasks: {T1, ..., Tn} • Concurrency constraint: unsharable resources • Conflict matrix C: • C(i, j) = 0: Ti and Tj need no common resources • C(i, j) = 1: otherwise • Conflict graph G: the graph with adjacency matrix C • G is k-colorable iff the tasks can be scheduled in k time intervals
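
A small self-contained sketch of this correspondence: color the conflict graph given by C greedily and read each color class as one time interval (the processing order and the example matrix are illustrative only; greedy coloring need not use the minimum number of intervals).

```python
# Tasks sharing a resource (C[i][j] == 1) may not run in the same interval.
# Greedily color the conflict graph; tasks with equal colors share a slot.

def schedule_from_conflicts(C):
    n = len(C)
    slot = [None] * n
    for i in range(n):                          # illustrative order: task index
        used = {slot[j] for j in range(i) if C[i][j] == 1}
        s = 0
        while s in used:
            s += 1
        slot[i] = s
    return slot

C = [[0, 1, 1, 0],
     [1, 0, 0, 1],
     [1, 0, 0, 0],
     [0, 1, 0, 0]]
print(schedule_from_conflicts(C))   # [0, 1, 1, 0]: two time intervals suffice
```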

  30. Graph Coloring – Offline Version • The problem is NP-complete • Reduction from 3-SAT • Approximation algorithms?

  31. Online Graph Coloring • What should we aim for? • 1-competitive seems unlikely; n-competitive is trivial and useless.

  32. Online Graph Coloring (cont.) • For every positive integer k, there exists a tree Tk on 2^(k−1) vertices such that every online coloring algorithm A requires at least k colors.

  33. Party Time!
