
Polynomial Curve Fitting BITS C464/BITS F464


Presentation Transcript


  1. Polynomial Curve Fitting (BITS C464/BITS F464) Navneet Goyal, Department of Computer Science, BITS Pilani, Pilani Campus, India

  2. Polynomial Curve Fitting • Seems a very trivial concept!! • Why are we discussing it in Machine Learning course? • A simple regression problem!! • It motivates a number of key concepts of ML!! • Let’s discover…

  3. Polynomial Curve Fitting
  • Observe a real-valued input variable x
  • Use x to predict the value of a target variable t
  • Synthetic data generated from sin(2πx)
  • Random noise added to the target values
  [Figure: training data, target variable t plotted against input variable x]
  Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, Springer, 2006

  4. Polynomial Curve Fitting
  • N observations of x: x = (x1, ..., xN)^T, with corresponding targets t = (t1, ..., tN)^T
  • Goal is to exploit the training set to predict the value of t for a new value of x
  • Inherently a difficult problem
  • Data generation: N = 10 points, spaced uniformly in the range [0, 1], with targets generated from sin(2πx) by adding small Gaussian noise
  • Such noise is typical of real data, e.g. arising from unobserved variables (a data-generation sketch follows)
  Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, Springer, 2006
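A minimal sketch of this data-generation recipe (Python/NumPy assumed; the noise standard deviation of 0.3 is an illustrative choice, not stated on the slide):

```python
import numpy as np

rng = np.random.default_rng(0)         # fixed seed so the example is reproducible

N = 10
x = np.linspace(0.0, 1.0, N)           # N points spaced uniformly in [0, 1]
noise = rng.normal(scale=0.3, size=N)  # small Gaussian noise (std 0.3 assumed)
t = np.sin(2 * np.pi * x) + noise      # targets generated from sin(2*pi*x)
```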

  5. Polynomial Curve Fitting
  • Fit the data with a polynomial: y(x, w) = w0 + w1 x + w2 x^2 + ... + wM x^M = Σ (j = 0..M) wj x^j
  • where M is the order of the polynomial
  • Is a higher value of M better? We'll see shortly!
  • The coefficients w0, ..., wM are collectively denoted by the vector w
  • y(x, w) is a nonlinear function of x, but a linear function of the coefficients w
  • Such models are called linear models (a design-matrix sketch follows)
  Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, Springer, 2006
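A minimal sketch of why this counts as a linear model: stacking the feature vectors (1, x, x^2, ..., x^M) for all inputs into a design matrix Phi makes every prediction a linear map of w (function names here are illustrative):

```python
import numpy as np

def design_matrix(x, M):
    """N x (M+1) matrix whose columns are x**0, x**1, ..., x**M."""
    return np.vander(x, M + 1, increasing=True)

def predict(x, w):
    """y(x, w) = sum_j w[j] * x**j: linear in w, nonlinear in x."""
    return design_matrix(np.atleast_1d(x), len(w) - 1) @ w
```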

  6. Sum-of-Squares Error Function
  • Measures the misfit between y(x, w) and the training data: E(w) = (1/2) Σ (n = 1..N) { y(xn, w) − tn }^2
  • Nonnegative, and zero only if the polynomial passes exactly through every training point (a least-squares sketch follows)
  Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, Springer, 2006
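A sketch of E(w) and of its minimizer: because E(w) is quadratic in w, a single ordinary least-squares solve finds w* (NumPy again; fit and sum_of_squares_error are illustrative names):

```python
import numpy as np

def sum_of_squares_error(w, x, t):
    """E(w) = 1/2 * sum_n (y(x_n, w) - t_n)**2."""
    Phi = np.vander(x, len(w), increasing=True)
    return 0.5 * np.sum((Phi @ w - t) ** 2)

def fit(x, t, M):
    """Coefficients w* minimizing E(w) for an order-M polynomial."""
    Phi = np.vander(x, M + 1, increasing=True)
    w_star, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    return w_star
```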

  7. Polynomial Curve Fitting • Choose w to minimize E(w); since E(w) is quadratic in w, the minimizer w* has a unique closed-form solution Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, Springer, 2006

  8. Polynomial Curve Fitting • How do we choose M?? • This is the problem of model selection (model comparison) Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, Springer, 2006

  9. 0th Order Polynomial • Poor representation of sin(2πx) Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, Springer, 2006

  10. 1st Order Polynomial • Poor representation of sin(2πx) Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, Springer, 2006

  11. 3rd Order Polynomial • Best fit to sin(2πx) Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, Springer, 2006

  12. 9th Order Polynomial • Over-fit: poor representation of sin(2πx) Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, Springer, 2006

  13. Polynomial Curve Fitting
  • Good generalization is the objective
  • How does generalization performance depend on M?
  • Consider a separate test set of 100 points, generated in the same way as the training set
  • Calculate E(w*) for both the training data and the test data
  • Choose the M which minimizes the error on the test data
  • Root-Mean-Square (RMS) error: E_RMS = sqrt( 2 E(w*) / N )
  • Sometimes convenient to use, as the division by N allows us to compare data sets of different sizes on an equal footing
  • The square root ensures E_RMS is measured on the same scale (and in the same units) as the target variable t (a model-selection sketch follows)
  Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, Springer, 2006
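A sketch of this model-selection experiment, reusing x, t, fit, and sum_of_squares_error from the sketches above; the 100-point test set follows the slide, and the noise level again assumes std 0.3:

```python
import numpy as np

rng = np.random.default_rng(1)
x_test = rng.uniform(0.0, 1.0, 100)    # separate 100-point test set
t_test = np.sin(2 * np.pi * x_test) + rng.normal(scale=0.3, size=100)

def e_rms(w, x, t):
    """E_RMS = sqrt(2 * E(w*) / N)."""
    return np.sqrt(2.0 * sum_of_squares_error(w, x, t) / len(x))

for M in range(10):
    w_star = fit(x, t, M)              # x, t: the N = 10 training set from earlier
    print(M, e_rms(w_star, x, t), e_rms(w_star, x_test, t_test))
```

Training E_RMS falls monotonically with M, while test E_RMS typically bottoms out around M = 3 and blows up by M = 9.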

  14. Flexibility & Model Complexity • M=0, very rigid!! Only 1 parameter to play with!

  15. Flexibility & Model Complexity • M=1, not so rigid!! 2 parameters to play with!

  16. Flexibility & Model Complexity • So what value of M is most suitable? • Any Answers???

  17. Over-fitting
  • For small M (0, 1, 2): too inflexible to capture the oscillations of sin(2πx)
  • For M = 3 to 8: flexible enough to capture the oscillations of sin(2πx)
  • For M = 9: too flexible!! Training error = 0, but generalization (test) error is high
  • Why is this happening?
  Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, Springer, 2006

  18. Polynomial Coefficients • [Table: fitted coefficients w* for M = 0, 1, 3, 9; the coefficient magnitudes grow dramatically as M increases] Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, Springer, 2006

  19. Data Set Size
  • M = 9
  • The larger the data set, the more complex the model we can afford to fit to the data
  • Heuristic: the number of data points should be no less than 5-10 times the number of adaptive parameters in the model (a sketch follows)
  Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, Springer, 2006
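A quick sketch of this effect, reusing fit, e_rms, rng, and the test set from the sketches above: refit the M = 9 polynomial on progressively larger training sets and watch the test error fall:

```python
for N_big in (15, 100):
    x_big = np.linspace(0.0, 1.0, N_big)
    t_big = np.sin(2 * np.pi * x_big) + rng.normal(scale=0.3, size=N_big)
    w9 = fit(x_big, t_big, 9)                # same M = 9 model, more data
    print(N_big, e_rms(w9, x_test, t_test))  # test E_RMS shrinks as N grows
```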

  20. Over-fitting Problem
  • Should we limit the number of parameters according to the size of the available training set?
  • The complexity of the model should depend only on the complexity of the problem!
  • Least-squares estimation is a specific case of maximum likelihood
  • Over-fitting is a general property of maximum likelihood
  • The over-fitting problem can be avoided by using the Bayesian approach!

  21. Over-fitting Problem
  • In the Bayesian approach, the effective number of parameters adapts automatically to the size of the data set
  • In the Bayesian approach, models can even have more parameters than the number of data points
  Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, Springer, 2006

  22. Regularization
  • Penalize large coefficient values by adding a penalty term to the error function: E~(w) = (1/2) Σ (n = 1..N) { y(xn, w) − tn }^2 + (λ/2) ||w||^2, where ||w||^2 = w^T w = w0^2 + w1^2 + ... + wM^2
  • The coefficient λ governs the relative importance of the penalty term (a ridge-regression sketch follows)
  Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, Springer, 2006
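A sketch of minimizing this regularized error: setting its gradient to zero gives the closed-form ridge-regression solution w* = (λI + Φ^T Φ)^{-1} Φ^T t (function name illustrative; note this version penalizes w0 as well, which is often excluded in practice):

```python
import numpy as np

def fit_regularized(x, t, M, lam):
    """Minimize E~(w) = E(w) + lam/2 * ||w||^2 in closed form (ridge regression)."""
    Phi = np.vander(x, M + 1, increasing=True)
    A = lam * np.eye(M + 1) + Phi.T @ Phi
    return np.linalg.solve(A, Phi.T @ t)

# e.g. the M = 9 fit with ln(lambda) = -18 from the next slide:
w_reg = fit_regularized(x, t, 9, np.exp(-18.0))
```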

  23. Regularization: M = 9 fit with ln λ = −18 (over-fitting is suppressed) Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, Springer, 2006

  24. Regularization: M = 9 fit with ln λ = 0 (too much regularization gives a poor, over-smoothed fit) Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, Springer, 2006

  25. Regularization: E_RMS vs. ln λ (training and test RMS error as λ varies) Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, Springer, 2006

  26. Polynomial Coefficients • [Table: fitted coefficients w* for M = 9 as λ increases; regularization drives the coefficient magnitudes down] Reference: Christopher M Bishop: Pattern Recognition & Machine Learning, Springer, 2006

  27. Take Away from Polynomial Curve Fitting • Concept of over-fitting • Model complexity & flexibility • We will keep revisiting these ideas from time to time…
