
Learning with AdaBoost


Presentation Transcript


  1. Learning with AdaBoost Fall 2007

  2. Outline • Introduction and background of Boosting and Adaboost • Adaboost Algorithm example • Adaboost Algorithm in current project • Experiment results • Discussion and conclusion Learning with Adaboost

  3. Outline • Introduction and background of Boosting and Adaboost • Adaboost Algorithm example • Adaboost Algorithm in current project • Experiment results • Discussion and conclusion Learning with Adaboost

  4. Boosting Algorithm • Definition of Boosting[1]: Boosting refers to a general method of producing a very accurate prediction rule by combining rough and moderately inaccurate rules-of-thumb. • Boosting procedure[2] • Given a set of labeled training examples (x1, y1), …, (xN, yN), where yi is the label associated with instance xi • On each round t = 1, …, T: • The booster devises a distribution (importance) Dt over the example set • The booster requests a weak hypothesis (rule-of-thumb) ht with low error with respect to Dt • After T rounds, the booster combines the weak hypotheses into a single prediction rule. Learning with Adaboost
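A minimal sketch of this generic loop (the names boost, weak_learn, and reweight are illustrative placeholders, not from the presentation; the final combination step is left abstract):

```python
import numpy as np

def boost(X, y, weak_learn, reweight, T):
    """Generic boosting loop: maintain a distribution over the training
    examples, request a low-error weak hypothesis under it each round,
    and return all rounds for a later combination step."""
    n = len(y)
    D = np.full(n, 1.0 / n)                 # start from the uniform distribution
    rounds = []
    for t in range(T):
        h = weak_learn(X, y, D)             # rule-of-thumb fitted under D
        miss = (h(X) != y)                  # which examples it gets wrong
        rounds.append((h, float(np.dot(D, miss))))
        D = reweight(D, miss)               # shift mass toward the hard examples
        D = D / D.sum()
    return rounds
```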

  5. Boosting Algorithm(cont’d) • The intuitive idea Altering the distribution over the domain in a way that increases the probability of the “harder” parts of the space, thus forcing the weak learner to generate new hypotheses that make fewer mistakes on these parts. • Disadvantages • Requires prior knowledge of the accuracies of the weak hypotheses • The performance bound depends only on the accuracy of the least accurate weak hypothesis Learning with Adaboost

  6. Background of Adaboost[2] Learning with Adaboost

  7. Adaboost Algorithm[2] Learning with Adaboost

  8. Advantages of Adaboost • Adaboost adapts to the errors of the weak hypotheses returned by WeakLearn. • Unlike the conventional boosting algorithm, the prior error need not be known ahead of time. • The update rule reduces the probability assigned to those examples on which the hypothesis makes a good prediction and increases the probability of the examples on which the prediction is poor. Learning with Adaboost
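A minimal sketch of this update rule for the binary case, assuming labels in {-1, +1} and a weak_learn callback that returns a predictor (names are illustrative, not the project's code):

```python
import numpy as np

def adaboost(X, y, weak_learn, T):
    """Binary AdaBoost: correctly classified examples have their weight
    multiplied by beta = err / (1 - err) < 1, so after renormalization the
    misclassified examples carry more probability mass in the next round."""
    n = len(y)
    D = np.full(n, 1.0 / n)
    hs, betas = [], []
    for t in range(T):
        h = weak_learn(X, y, D)
        miss = (h(X) != y)
        err = float(np.dot(D, miss))
        if err == 0.0 or err >= 0.5:        # WeakLearn must beat random guessing
            break
        beta = err / (1.0 - err)
        D = np.where(miss, D, D * beta)     # shrink weights of correct predictions
        D = D / D.sum()
        hs.append(h)
        betas.append(beta)

    def final(Xq):
        """Weighted majority vote, with weight log(1/beta_t) per round."""
        votes = sum(np.log(1.0 / b) * h(Xq) for h, b in zip(hs, betas))
        return np.sign(votes)

    return final
```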

  9. The error bound[3] • Suppose the weak learning algorithm WeakLearn, when called by Adaboost, generates hypotheses with errors ε1, …, εT. Then the error of the final hypothesis output by Adaboost is bounded above by 2^T ∏_{t=1..T} √(εt(1 − εt)). Note that the errors generated by WeakLearn need not be uniform, and the final error depends on the errors of all of the weak hypotheses. Recall that the errors of the previous boosting algorithms depend only on the maximal error of the weakest hypothesis and ignore the advantages that can be gained from the hypotheses whose errors are smaller. Learning with Adaboost
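A quick numeric check of this bound (the per-round error values below are made up for illustration; every round with error below ½ contributes a factor 2√(εt(1 − εt)) < 1, so the bound shrinks as T grows):

```python
import math

def adaboost_error_bound(errors):
    """Upper bound on the training error of the final hypothesis:
    2^T * prod_t sqrt(err_t * (1 - err_t))."""
    return 2 ** len(errors) * math.prod(math.sqrt(e * (1.0 - e)) for e in errors)

print(adaboost_error_bound([0.30, 0.21, 0.14]))   # ~0.52 after three rounds
```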

  10. Outline • Introduction and background of Boosting and Adaboost • Adaboost Algorithm example • Adaboost Algorithm in current project • Experiment results • Discussion and conclusion Learning with Adaboost

  11. A toy example[2] Training set: 10 points (represented by plus or minus). Original status: equal weights for all training samples. Learning with Adaboost

  12. A toy example(cont’d) Round 1: Three “plus” points are not correctly classified; they are given higher weights. Learning with Adaboost

  13. A toy example(cont’d) Round 2: Three “minus” points are not correctly classified; they are given higher weights. Learning with Adaboost

  14. A toy example(cont’d) Round 3: One “minus” and two “plus” points are not correctly classified; they are given higher weights. Learning with Adaboost

  15. A toy example(cont’d) Final classifier: the three “weak” classifiers are combined to obtain the final strong classifier. Learning with Adaboost
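The stumps and weights of this toy example are only shown graphically on the slides; the sketch below uses hypothetical thresholds and alpha values purely to illustrate the combination step (sign of the alpha-weighted vote):

```python
import numpy as np

# Hypothetical stand-ins for the three weak classifiers from rounds 1-3
# (axis-aligned decision stumps on 2-D points) and their alpha weights.
h1 = lambda P: np.where(P[:, 0] < 0.3, +1, -1)
h2 = lambda P: np.where(P[:, 0] > 0.7, -1, +1)
h3 = lambda P: np.where(P[:, 1] > 0.6, +1, -1)
alphas = [0.42, 0.65, 0.92]

def strong_classifier(P):
    """Final classifier: sign of the alpha-weighted vote of the weak ones."""
    vote = sum(a * h(P) for a, h in zip(alphas, (h1, h2, h3)))
    return np.sign(vote)

print(strong_classifier(np.array([[0.2, 0.8], [0.9, 0.1]])))   # -> [ 1. -1.]
```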

  16. Outline • Introduction and background of Boosting and Adaboost • Adaboost Algorithm example • Adaboost Algorithm in current project • Experiment results • Discussion and conclusion Learning with Adaboost

  17. Look at Adaboost[3] Again Learning with Adaboost

  18. Adaboost(Con’d): Multi-class Extensions • The previous discussion is restricted to binary classification problems. The set Y may contain any number of labels, which gives a multi-class problem. • The multi-class case (AdaBoost.M1) requires the accuracy of each weak hypothesis to be greater than ½. This condition is stronger in the multi-class case than in the binary classification case. Learning with Adaboost

  19. AdaBoost.M1 Learning with Adaboost
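The AdaBoost.M1 pseudocode on this slide appears only as a figure. The reweighting step has the same form as in the binary sketch above; what changes is the final combination, sketched below with illustrative names (each weak hypothesis votes log(1/βt) for the class it predicts):

```python
import numpy as np

def m1_final_hypothesis(models, betas, labels, x):
    """AdaBoost.M1 combination rule: sum the log(1/beta_t) votes per class
    and return the arg-max over the label set."""
    votes = {c: 0.0 for c in labels}
    for h, beta in zip(models, betas):
        votes[h(x)] += np.log(1.0 / beta)   # h(x) is the class predicted for x
    return max(votes, key=votes.get)
```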

  20. Error Upper Bound of Adaboost.M1[3] • As in the binary classification case, the error of the final hypothesis is also bounded. Learning with Adaboost

  21. How does Adaboost.M1 work[4]? Learning with Adaboost

  22. Adaboost in our project Learning with Adaboost

  23. Adaboost in our project • 1) The initialization sets the total weight of the target class equal to that of all the other samples: bird[1,…,10] = ½ * 1/10; otherstaff[1,…,690] = ½ * 1/690. • 2) A history record is preserved to strengthen the updating process of the weights. • 3) The unified model obtained from CPM alignment is used for the training process. Learning with Adaboost
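A minimal sketch of the initialization described in 1), using the sample counts quoted above (variable names are illustrative):

```python
import numpy as np

# 10 target (bird) samples and 690 other samples: each group gets half of
# the total weight, split uniformly within the group.
n_bird, n_other = 10, 690
w_bird  = np.full(n_bird,  0.5 / n_bird)     # 0.05 per bird sample
w_other = np.full(n_other, 0.5 / n_other)    # ~0.000725 per other sample
weights = np.concatenate([w_bird, w_other])
assert abs(weights.sum() - 1.0) < 1e-12      # weights form a distribution
```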

  24. Adaboost in our project • 2) The history record: weight histogram (with history record) vs. weight histogram (without history record). Learning with Adaboost

  25. Adaboost in our project 3) The unified model obtained from CPM alignment is used for the training process. This has reduced the overfitting problem. 3.1) Overfitting Problem. 3.2) CPM model. Learning with Adaboost

  26. Adaboost in our project 3.1) Overfitting Problem. Why does the trained Adaboost not work for birds 11–20? I have compared: I) the rank of the alpha value for each of the 60 classifiers; II) how each classifier actually detected birds in the training process; III) how each classifier actually detected birds in the test process. The covariance is also computed for comparison:
K>> cov(c(:,1),c(:,2))
ans =
  305.0000    6.4746
    6.4746  305.0000
K>> cov(c(:,1),c(:,3))
ans =
  305.0000   92.8644
   92.8644  305.0000
K>> cov(c(:,2),c(:,3))
ans =
  305.0000  -46.1186
  -46.1186  305.0000
Overfitted! The training data are different from the test data; this is very common. Learning with Adaboost
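A numpy equivalent of the covariance checks above, with hypothetical stand-in data (the 305 on the diagonals is exactly the sample variance of the ranks 1..60, i.e. 60·61/12, which is consistent with the three columns being rankings of the 60 classifiers):

```python
import numpy as np

# Stand-in for the comparison matrix c: one row per weak classifier,
# columns = (rank by alpha, rank by train detections, rank by test detections).
rng = np.random.default_rng(0)
ranks = np.arange(1, 61)
c = np.column_stack([ranks, rng.permutation(ranks), rng.permutation(ranks)])
print(np.cov(c[:, 0], c[:, 1]))   # 2x2 matrix; off-diagonal = covariance of the two rankings
print(np.cov(c[:, 0], c[:, 2]))
print(np.cov(c[:, 1], c[:, 2]))
```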

  27. Adaboost in our project Train result (covariance: 6.4746) Learning with Adaboost

  28. Adaboost in our project Comparison: train & test result (covariance: 92.8644) Learning with Adaboost

  29. Adaboost in our project 3.2) CPM: the continuous profile model, put forward by Jennifer Listgarten. It is very useful for data alignment. Learning with Adaboost

  30. Adaboost in our project • The alignment results from CPM model: Learning with Adaboost

  31. Adaboost in our project • The unified model from CPM alignment: without resampling vs. after upsampling and downsampling Learning with Adaboost

  32. Adaboost in our project • The influence of CPM on the history record Learning with Adaboost

  33. Outline • Introduction and background of Boosting and Adaboost • Adaboost Algorithm example • Adaboost Algorithm in current project • Experiment results • Discussion and conclusion Learning with Adaboost

  34. Browse all birds Learning with Adaboost

  35. Curvature Descriptor Learning with Adaboost

  36. Distance Descriptor Learning with Adaboost

  37. Adaboost without CPM Learning with Adaboost

  38. Adaboost without CPM(con’d) Learning with Adaboost

  39. Good_Part_Selected(Adaboost without CPM con’d) Learning with Adaboost

  40. Adaboost without CPM(con’d) • The Alpha Values • Other Statistical Data: zero rate: 0.5333; covariance: 0.0074; median: 0.0874 Learning with Adaboost

  41. Adaboost with CPM Learning with Adaboost

  42. Adaboost with CPM(con’d) Learning with Adaboost

  43. Adaboost with CPM(con’d) Learning with Adaboost

  44. Good_Part_Selected(Adaboost with CPM con’d) Learning with Adaboost

  45. Adaboost with CPM(con’d) • The Alpha Values • Other Statistical Data: zero rate: 0.6167; covariance: 0.9488; median: 1.6468 Learning with Adaboost

  46. Outline • Introduction and background of Boosting and Adaboost • Adaboost Algorithm example • Adaboost Algorithm in current project • Experiment results • Discussion and conclusion Learning with Adaboost

  47. Conclusion and discussion 1) Adaboost works with the CPM unified model; this model has smoothed the training data set and decreased the influence of overfitting. 2) The influence of the history record is very interesting: it suppresses noise and strengthens the WeakLearn boosting direction. 3) The step length of KNN selected by Adaboost is not discussed here; it is also useful for suppressing noise. Learning with Adaboost

  48. Conclusion and discussion(con’d) 4) Adaboost does not depend on the training order: the obtained alpha values have a very similar distribution for all the classifiers. There are two examples. Example 1: four different training orders produced the following alpha values:
1) 6 birds: Alpha_All1 = 0.4480  0.1387  0.2074  0.5949  0.5868  0.3947  0.3874  0.5634  0.6694  0.7447
2) 6 birds: Alpha_All2 = 0.3998  0.0635  0.2479  0.6873  0.5868  0.2998  0.4320  0.5581  0.6946  0.7652
3) 6 birds: Alpha_All3 = 0.4191  0.1301  0.2513  0.5988  0.5868  0.2920  0.4286  0.5503  0.6968  0.7134
4) 6 birds: Alpha_All4 = 0.4506  0.0618  0.2750  0.5777  0.5701  0.3289  0.5948  0.5857  0.7016  0.6212
Learning with Adaboost
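The similarity claim can be checked directly from the four alpha vectors quoted above; a small sketch:

```python
import numpy as np

# The four alpha vectors quoted above, one row per training order
alpha = np.array([
    [0.4480, 0.1387, 0.2074, 0.5949, 0.5868, 0.3947, 0.3874, 0.5634, 0.6694, 0.7447],
    [0.3998, 0.0635, 0.2479, 0.6873, 0.5868, 0.2998, 0.4320, 0.5581, 0.6946, 0.7652],
    [0.4191, 0.1301, 0.2513, 0.5988, 0.5868, 0.2920, 0.4286, 0.5503, 0.6968, 0.7134],
    [0.4506, 0.0618, 0.2750, 0.5777, 0.5701, 0.3289, 0.5948, 0.5857, 0.7016, 0.6212],
])
print(np.corrcoef(alpha))        # pairwise correlations between training orders
print(alpha.std(axis=0))         # per-classifier spread across the four orders
```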

  49. Conclusion and discussion(con’d) Learning with Adaboost

  50. Conclusion and discussion(con’d) Example 2: 60 parts from the Curvature Descriptor and 60 from the Distance Descriptor; 1) they are trained independently at first; 2) then they are combined and trained together. The results are as follows: Learning with Adaboost
