
Support Vector Machines: Get more Higgs out of your data. Daniel Whiteson, UC Berkeley


Presentation Transcript


  1. Support Vector Machines: Get more Higgs out of your data. Daniel Whiteson, UC Berkeley

  2. Multivariate algorithms
     Square cuts may work well for simpler tasks, but as the data are multivariate, the algorithms must be as well.

  3. Multivariate Algorithms
     • HEP overlaps with Computer Science, Mathematics and Statistics in this area:
       • How can we construct an algorithm that can be taught by example and generalize effectively?
     • We can use solutions from those fields:
       • Neural Networks
       • Probability Density Estimators
       • Support Vector Machines

  4. Neural Networks
     • Constructed from a very simple object, they can learn complex patterns.
     • The decision function is learned using the freedom in the hidden layers.
     • Used very effectively as signal discriminators, particle identifiers and parameter estimators.
     • Fast evaluation makes them suited to triggers.

  5. Probability Density Estimation
     If we knew the distributions of the signal, fs(x), and the background, fb(x), then we could calculate [equation] and use it to discriminate.
     [Figure: example discriminant surface]
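
The discriminant formula is not present in the transcript. A common choice, assumed here and not confirmed by the slide, is D(x) = fs(x) / (fs(x) + fb(x)). The sketch below evaluates it for two hypothetical one-dimensional Gaussian densities; the function names and toy parameters are illustrative only.

```python
import math

def gaussian_pdf(x, mean, sigma):
    """One-dimensional normal density."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def discriminant(x, f_s, f_b):
    """Assumed form D(x) = f_s(x) / (f_s(x) + f_b(x)); the value lies in [0, 1]."""
    s, b = f_s(x), f_b(x)
    # Fall back to 0.5 (no preference) where both densities vanish.
    return s / (s + b) if (s + b) > 0.0 else 0.5

# Hypothetical toy densities: signal peaked at +1, background peaked at -1.
f_s = lambda x: gaussian_pdf(x, 1.0, 1.0)
f_b = lambda x: gaussian_pdf(x, -1.0, 1.0)

d_mid = discriminant(0.0, f_s, f_b)   # both densities are equal at the midpoint
d_sig = discriminant(2.0, f_s, f_b)   # a signal-like point
d_bkg = discriminant(-2.0, f_s, f_b)  # a background-like point
```

Values near 1 are signal-like and values near 0 are background-like, so a single threshold on D(x) acts as the cut.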

  6. Probability Density Estimation
     Of course, we do not know the analytical distributions.
     • Given a set of points drawn from a distribution, put down a kernel centered at each point.
     • With high statistics, this approximates a smooth probability density.
     [Figure: surface with many kernels]
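
The kernel construction described on this slide can be sketched in one dimension as follows. The Gaussian kernel shape, the fixed width `h` and the toy sample are assumptions; the slide does not specify them.

```python
import math
import random

def kde(x, points, h):
    """Average of Gaussian kernels of width h, one centered on each training point."""
    norm = 1.0 / (len(points) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - p) / h) ** 2) for p in points)

# Toy sample drawn from a unit Gaussian, standing in for simulated events.
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(2000)]

estimate = kde(0.0, sample, h=0.3)       # estimated density at x = 0
truth = 1.0 / math.sqrt(2.0 * math.pi)   # true unit-Gaussian density at x = 0
```

Note that every evaluation loops over all training points, which is the cost that motivates the support vector approach later in the talk.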

  7. Probability Density Estimation
     • Simple techniques have advanced to more sophisticated approaches:
       • Adaptive PDE
         • Varies the width of the kernel for smoothness
       • Generalized for regression analysis
         • Measures the value of a continuous parameter
       • GEM
         • Measures the local covariance and adjusts the individual kernels to give a more accurate estimate.
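
One way to vary the kernel width, shown here only as an illustration, is to set each point's width to the distance to its k-th nearest neighbour, so that kernels are narrow in dense regions and wide in sparse ones. This particular rule is an assumption; the slide does not say how the adaptive PDE chooses its widths.

```python
import math

def knn_widths(points, k):
    """Per-point kernel width: distance to the k-th nearest other point.

    Assumes the points are distinct, so no width is zero.
    """
    widths = []
    for i, p in enumerate(points):
        dists = sorted(abs(p - q) for j, q in enumerate(points) if j != i)
        widths.append(dists[k - 1])
    return widths

def adaptive_kde(x, points, widths):
    """Average of Gaussian kernels, each with its own width."""
    total = 0.0
    for p, h in zip(points, widths):
        total += math.exp(-0.5 * ((x - p) / h) ** 2) / (h * math.sqrt(2.0 * math.pi))
    return total / len(points)

# Hypothetical sample: a dense cluster near 0 and one isolated point at 5.
points = [0.0, 0.1, 0.2, 0.3, 5.0]
widths = knn_widths(points, k=2)
density_at_cluster = adaptive_kde(0.15, points, widths)
```

The isolated point receives a much wider kernel than the clustered points, which smooths the estimate where statistics are low.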

  8. Support Vector Machines
     • PDEs must evaluate a kernel at every training point for every classification of a data point.
     • Can we build a decision surface that uses only the relevant bits of information, the points in the training set that are near the signal-background boundary?
     For a linear, separable case, this is not too difficult. We simply need to find the hyperplane that maximizes the separation.
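
To make "maximizes the separation" concrete, the sketch below computes the margin of a hyperplane w·x + b = 0, taken here as the smallest signed distance from any training point to the plane, and compares two candidate planes. The toy data and the candidates are hypothetical, and a real SVM solves for the best plane rather than choosing from a list.

```python
import math

def margin(w, b, data):
    """Smallest signed distance from any labeled point to the plane w.x + b = 0.

    Labels y are +1 (signal) or -1 (background); a positive result means
    every point lies on its correct side.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
               for x, y in data)

# Hypothetical, linearly separable toy data.
data = [((2.0, 2.0), +1), ((3.0, 3.0), +1), ((0.0, 0.0), -1), ((-1.0, 0.0), -1)]

# Two candidate separating planes, given as (w, b).
candidates = {"diagonal": ((1.0, 1.0), -2.0), "vertical": ((1.0, 0.0), -1.0)}
margins = {name: margin(w, b, data) for name, (w, b) in candidates.items()}
best = max(margins, key=margins.get)
```

Both planes separate the data, but the diagonal one leaves more room on either side, and only the points closest to it determine that margin.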

  9. Support Vector Machines
     • To find the hyperplane that gives the highest separation (lowest “energy”), we maximize the Lagrangian w.r.t. αi: [equation]
       where (xi, yi) are the training data and αi are positive Lagrange multipliers.
     The solution is: [equation]
     where αi = 0 for non-support vectors.
     (Images from the applet at http://svm.research.bell-labs.com/)
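
The two formulas on this slide are not present in the transcript. For reference, the standard textbook dual problem for the linear, separable case is given below, with \alpha_i denoting the Lagrange multipliers. The slide's own notation may differ, so treat this as an assumed reconstruction rather than the author's exact expressions.

```latex
% Dual Lagrangian, maximized with respect to the multipliers \alpha_i
L_D = \sum_i \alpha_i
      - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j \, y_i y_j \, (\mathbf{x}_i \cdot \mathbf{x}_j),
\qquad \alpha_i \ge 0, \qquad \sum_i \alpha_i y_i = 0

% Resulting hyperplane normal; only support vectors (\alpha_i > 0) contribute
\mathbf{w} = \sum_i \alpha_i \, y_i \, \mathbf{x}_i
```

Because the training data enter only through the dot products x_i · x_j, the problem depends on the data solely via pairwise inner products.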

  10. Support Vector Machines
     But not many problems of interest are linear.
     Map the data to a higher-dimensional space where the separation can be made by hyperplanes: [equation]
     We want to work in our original space, so we replace the dot product with a kernel function: [equation]
     For these data, we need [equation]
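
The kernel substitution can be checked numerically. The sketch below uses a degree-2 polynomial kernel K(x, z) = (x·z)^2 on two-dimensional inputs, chosen only as an illustration since the slide's kernel is not in the transcript, and verifies that it equals an ordinary dot product after an explicit map into three dimensions.

```python
import math

def dot(a, b):
    """Ordinary dot product."""
    return sum(ai * bi for ai, bi in zip(a, b))

def phi(x):
    """Explicit feature map into 3D whose dot product reproduces (x.z)^2."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def poly2_kernel(x, z):
    """Degree-2 polynomial kernel evaluated entirely in the original 2D space."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, -1.0)
in_original_space = poly2_kernel(x, z)     # no mapping needed
in_feature_space = dot(phi(x), phi(z))     # same number, computed after mapping
```

The two values agree, so a hyperplane in the mapped space can be found and evaluated without ever constructing the mapped vectors.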

  11. Support Vector Machines
     Problems that are not entirely separable are not very difficult either.
     • Allow an imperfect decision boundary, but add a penalty.
     • Training errors, points on the wrong side of the boundary, are indicated by crosses.
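
A common way to express the penalty, assumed here because the slide gives no formula, is the soft-margin objective: half the squared norm of w plus a constant C times the total slack, where each point's slack is max(0, 1 - y(w·x + b)). The parameter name `C` and the toy data are illustrative.

```python
def dot(a, b):
    """Ordinary dot product."""
    return sum(ai * bi for ai, bi in zip(a, b))

def soft_margin_objective(w, b, data, C):
    """Return (objective, slacks) for the assumed soft-margin formulation.

    Slack is 0 for points outside the margin on the correct side and
    exceeds 1 for points on the wrong side of the boundary.
    """
    slacks = [max(0.0, 1.0 - y * (dot(w, x) + b)) for x, y in data]
    return 0.5 * dot(w, w) + C * sum(slacks), slacks

# Hypothetical data: the third point is a signal event deep in background territory.
data = [((2.0, 2.0), +1), ((0.0, 0.0), -1), ((0.5, 0.5), +1)]
objective, slacks = soft_margin_objective((1.0, 1.0), -2.0, data, C=10.0)
training_errors = [x for (x, _), s in zip(data, slacks) if s > 1.0]
```

A larger `C` punishes training errors more heavily, while a smaller `C` tolerates them in exchange for a wider margin.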

  12. Support Vector Machines
     We are not limited to linear or polynomial kernels. A Gaussian kernel [equation] gives a highly flexible SVM.
     • Gaussian kernel SVMs outperformed PDEs in recognizing handwritten numbers from the USPS database.
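
As a sketch of how a Gaussian kernel enters the classifier, the code below assumes the common form K(x, z) = exp(-|x - z|^2 / (2σ^2)) and the usual decision function f(x) = Σ αi yi K(xi, x) + b. The support vectors, multipliers and σ are hand-picked for illustration and are not the result of any training.

```python
import math

def gaussian_kernel(x, z, sigma):
    """Assumed Gaussian kernel exp(-|x - z|^2 / (2 sigma^2))."""
    sq = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-sq / (2.0 * sigma ** 2))

def decision(x, support, b, sigma):
    """f(x) = sum_i alpha_i * y_i * K(x_i, x) + b; the sign gives the class."""
    return sum(alpha * y * gaussian_kernel(sv, x, sigma)
               for sv, y, alpha in support) + b

# Hypothetical support vectors (point, label, alpha): signal in the middle,
# background on both sides, which no single straight line can separate.
support = [((0.0, 0.0), +1, 1.0), ((2.0, 0.0), -1, 1.0), ((-2.0, 0.0), -1, 1.0)]

f_center = decision((0.0, 0.0), support, b=0.0, sigma=1.0)
f_right = decision((2.0, 0.0), support, b=0.0, sigma=1.0)
f_left = decision((-2.0, 0.0), support, b=0.0, sigma=1.0)
```

Only the support vectors appear in the sum, in contrast to a PDE, which evaluates a kernel at every training point.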

  13. Comparative study for HEP
     Signal: Wh to bb
     Backgrounds: Wbb, tt, WZ
     2-dimensional discriminant with variables Mjj and Ht
     [Figure: discriminator value distributions for the neural net, PDE and SVM]

  14. Comparative study for HEP
     [Figure: signal-to-noise enhancement versus discriminator threshold; panel labels: Efficiency 43%, Efficiency 50%, Efficiency 49%]
     All of these methods provide powerful signal enhancement.
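
The quantities on this slide can be illustrated with a short calculation. Here efficiency is the fraction of events passing a discriminator threshold, and signal-to-noise enhancement is taken to be the ratio of signal efficiency to background efficiency. That definition is an assumption, and the scores below are made-up numbers, not the study's data.

```python
def efficiency(scores, threshold):
    """Fraction of events whose discriminator value exceeds the threshold."""
    return sum(1 for s in scores if s > threshold) / len(scores)

def enhancement(sig_scores, bkg_scores, threshold):
    """Assumed definition: signal efficiency divided by background efficiency."""
    eff_s = efficiency(sig_scores, threshold)
    eff_b = efficiency(bkg_scores, threshold)
    return eff_s / eff_b if eff_b > 0.0 else float("inf")

# Made-up discriminator outputs for illustration only.
sig_scores = [0.9, 0.8, 0.7, 0.4]
bkg_scores = [0.1, 0.2, 0.3, 0.6, 0.5, 0.2, 0.1, 0.4]

eff_s = efficiency(sig_scores, 0.5)
gain = enhancement(sig_scores, bkg_scores, 0.5)
```

Scanning the threshold trades signal efficiency against enhancement, which is the kind of curve the slide's figure describes.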

  15. Algorithm Comparisons

  16. Conclusions
     • Difficult problems in HEP overlap with those in other fields. We can take advantage of our colleagues’ years of thought and effort.
     • There are many areas of HEP analysis where intelligent multivariate algorithms like NNs, PDEs and SVMs can help us conduct more powerful searches and make more precise measurements.
