Model Fitting

Model Fitting. Jean-Yves Le Boudec. Contents. What is model fitting ? Linear Regression Linear regression with norm minimization Choosing a distribution Heavy Tail. Virus Infection Data. We would like to capture the growth of infected hosts (explanatory model)

Presentation Transcript


  1. Model Fitting • Jean-Yves Le Boudec

  2. Contents • What is model fitting ? • Linear Regression • Linear regression with L1 norm minimization • Choosing a distribution • Heavy Tail

  3. Virus Infection Data • We would like to capture the growth of infected hosts (explanatory model) • An exponential model seems appropriate • How can we fit the model, in particular, what is the value of the growth rate α ?

  4. Least Square Fit of Virus Infection Data • α = 0.5173 • Mean doubling time 1.34 hours • Prediction at +6 hours: 100 000 hosts • Plot: least square fit

  5. Least Square Fit of Virus Infection Data In Log Scale • α = 0.39 • Mean doubling time 1.77 hours • Prediction at +6 hours: 39 000 hosts • Plot: least square fit
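The log-scale fit on this slide can be sketched as follows. This is a minimal illustration, assuming the exponential model Y(t) = a·exp(αt) with the symbol α for the growth rate; the data are synthetic and the function names are hypothetical, not taken from the slides.

```python
import math

def fit_exponential_log_scale(times, counts):
    """Least squares fit of log(count) = log(a) + alpha * t; returns (a, alpha)."""
    logs = [math.log(c) for c in counts]
    n = len(times)
    t_mean = sum(times) / n
    l_mean = sum(logs) / n
    s_tt = sum((t - t_mean) ** 2 for t in times)
    s_tl = sum((t - t_mean) * (l - l_mean) for t, l in zip(times, logs))
    alpha = s_tl / s_tt                # slope of the fitted line in log scale
    log_a = l_mean - alpha * t_mean    # intercept of the fitted line
    return math.exp(log_a), alpha

def doubling_time(alpha):
    """Time for a * exp(alpha * t) to double: ln(2) / alpha."""
    return math.log(2) / alpha

# Synthetic, noise-free data (an assumption for illustration): a = 10, alpha = 0.4.
times = [0, 1, 2, 3, 4, 5]
counts = [10 * math.exp(0.4 * t) for t in times]
a_hat, alpha_hat = fit_exponential_log_scale(times, counts)
```

With the natural-scale estimate quoted on the previous slide, `doubling_time(0.5173)` gives about 1.34 hours, which agrees with the figure stated there.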

  6. Compare the Two • LS fit in natural scale • LS fit in log scale

  7. Which Fitting Method should I use ? • Which optimization criterion should I use ? • The answer is in a statistical model. • Model not only the interesting part, but also the noise • For example: α = 0.5173

  8. How can I tell which is correct ? • α = 0.39

  9. Look at Residuals (= validate model)

  11. Least Square Fit = Gaussian iid Noise • Assume model (homoscedasticity) • The theorem says: minimize least squares = compute MLE for this model • This is how we computed the estimates for the virus example
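The equivalence stated on this slide can be checked with a short derivation. The notation below (a generic model y_i = f_i(θ) + ε_i with I data points) is an assumption for illustration; the slide's own model formula did not survive in the transcript.

```latex
% Assumed model (illustrative notation): y_i = f_i(\theta) + \epsilon_i,
% with \epsilon_i iid N(0, \sigma^2) for i = 1, \dots, I (homoscedastic noise).
\begin{align*}
\ell(\theta, \sigma)
  &= \sum_{i=1}^{I} \ln\!\left( \frac{1}{\sqrt{2\pi\sigma^2}}
     \exp\!\left( -\frac{(y_i - f_i(\theta))^2}{2\sigma^2} \right) \right) \\
  &= -\frac{I}{2} \ln(2\pi\sigma^2)
     - \frac{1}{2\sigma^2} \sum_{i=1}^{I} (y_i - f_i(\theta))^2
\end{align*}
% For any fixed \sigma, maximizing \ell over \theta is the same as
% minimizing the sum of squares:
\begin{align*}
\hat{\theta} &= \arg\min_{\theta} \sum_{i=1}^{I} (y_i - f_i(\theta))^2, &
\hat{\sigma}^2 &= \frac{1}{I} \sum_{i=1}^{I} (y_i - f_i(\hat{\theta}))^2
\end{align*}
```

Under this reading, a least squares fit in natural scale corresponds to additive Gaussian noise on the counts, while a least squares fit in log scale corresponds to additive Gaussian noise on the logarithm of the counts.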

  12. Least Square and Projection • Write down what the data point, predicted response and estimated parameter are for the virus example • Figure labels: data point, predicted response, manifold (where the data point would lie if there were no noise), estimated parameter

  13. Confidence Intervals

  15. Robustness to « Outliers »

  16. A Simple Example • Least Square • L1 Norm Minimization • Model: y_i = m + noise • What is m ? • Confidence interval ?

  17. Mean Versus Median
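The contrast between mean and median can be illustrated numerically: for a constant model, the least squares estimate is the sample mean and the L1 norm estimate is the sample median. The sketch below uses made-up data with one outlier; the data values and function names are assumptions for illustration.

```python
import statistics

def l2_cost(m, data):
    """Sum of squared deviations from m (least squares criterion)."""
    return sum((y - m) ** 2 for y in data)

def l1_cost(m, data):
    """Sum of absolute deviations from m (L1 norm criterion)."""
    return sum(abs(y - m) for y in data)

# Made-up measurements, then the same measurements with one outlier added.
data = [9.8, 10.1, 10.0, 9.9, 10.2]
with_outlier = data + [100.0]

mean_clean = statistics.mean(data)
median_clean = statistics.median(data)
mean_out = statistics.mean(with_outlier)
median_out = statistics.median(with_outlier)
```

In this example the outlier moves the mean from about 10 to about 25, while the median moves only from 10.0 to 10.05, which is the sense in which L1 norm minimization is more robust.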

  18. 2. Linear Regression • Also called « ANOVA » (« Analysis of Variance ») • = least square + linear dependence on parameter • A special case where computations are easy

  19. Example 4.3 • What is the parameter ? • Is it a linear model ? • How many degrees of freedom ? • What do we assume on ε_i ? • What is the matrix X ?

  21. Does this model have full rank ?

  22. Some Terminology • x_i are called explanatory variables • Assumed fixed and known • y_i are called response variables • They are « the data » • Assumed to be one sample output of the model

  23. Least Square and Projection • Figure labels: data point, predicted response, manifold (where the data point would lie if there were no noise), estimated parameter

  24. Solution of the Linear Regression Model

  25. Least Square and Projection • The theorem gives H and K • Figure labels: data, residuals, predicted response, manifold (where the data point would lie if there were no noise), estimated parameter
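The theorem itself is not reproduced in the transcript. As a reminder, the standard least squares solution for a full-rank linear model can be written as below; identifying these matrices with the slide's H and K is an assumption, as is the notation (I data points, p parameters).

```latex
% Assumed model: Y = X\theta + \epsilon, where X is an I \times p matrix of full rank
% and \epsilon has iid N(0, \sigma^2) components.
\begin{align*}
K &= (X^{T} X)^{-1} X^{T}  && \text{(maps data to estimated parameter)} \\
H &= X K = X (X^{T} X)^{-1} X^{T}  && \text{(orthogonal projection onto the manifold)} \\
\hat{\theta} &= K y  && \text{(estimated parameter)} \\
\hat{y} &= X \hat{\theta} = H y  && \text{(predicted response)} \\
e &= y - \hat{y} = (\mathrm{Id} - H)\, y  && \text{(residuals)}
\end{align*}
```

Geometrically, the predicted response is the point of the manifold closest to the data point, and the residual vector is orthogonal to the manifold.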

  26. The Theorem Gives the Estimated Parameter with Confidence Interval

  27. SSR • Confidence Intervals use the quantity s • s² is called « Sum of Squared Residuals » • Figure labels: data, residuals, predicted response

  28. Validate the Assumptions with Residuals

  29. Residuals • Residuals are given by the theorem • Figure labels: data, residuals, predicted response

  30. Standardized Residuals • The residuals e_i are an estimate of the noise terms ε_i • They are not (exactly) normal iid • The variance of e_i is ???? • A: 1 − H_{i,i} (times the noise variance σ²) • Standardized residuals are not exactly normal iid either but their variance is 1
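A minimal sketch of how standardized residuals could be computed, assuming a simple linear regression y = b0 + b1·x. In that special case the diagonal of H has the closed form H_{i,i} = 1/n + (x_i − x̄)² / Σ_j (x_j − x̄)². The data and the function name are made up for illustration.

```python
import math

def standardized_residuals(xs, ys):
    """Fit y = b0 + b1*x by least squares; return (leverages, standardized residuals)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    b1 = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / s_xx
    b0 = y_mean - b1 * x_mean
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    # Diagonal terms H_{i,i} of the projection matrix for simple linear regression.
    leverages = [1 / n + (x - x_mean) ** 2 / s_xx for x in xs]
    # Estimate of the noise variance, with p = 2 fitted parameters.
    s2 = sum(e ** 2 for e in residuals) / (n - 2)
    # Divide each residual by its estimated standard deviation.
    standardized = [e / math.sqrt(s2 * (1 - h)) for e, h in zip(residuals, leverages)]
    return leverages, standardized

# Made-up data for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
leverages, standardized = standardized_residuals(xs, ys)
```

The leverages sum to the number of parameters (here 2), so points far from the mean of x have a larger H_{i,i} and therefore a smaller residual variance.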

  31. Which of these two models could be a linear regression model ? • A: both • Linear regression does not mean that y_i is a linear function of x_i • Caution: there is a hidden assumption • Noise is iid Gaussian -> homoscedasticity

  33. 3. Linear Regression with L1 norm minimization • = L1 norm minimization + linear dependency on parameter • More robust • Less traditional

  34. This is convex programming
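One common way to see the convexity is to rewrite the L1 norm problem as a linear program. The formulation below is a standard one, not necessarily the one shown on the slide; the auxiliary variables u_i and the matrix notation are assumptions for illustration.

```latex
% L1 norm regression: minimize \sum_i | y_i - (X\theta)_i | over \theta.
% Equivalent linear program, with one auxiliary variable u_i per data point:
\begin{align*}
\min_{\theta,\, u} \quad & \sum_{i=1}^{I} u_i \\
\text{subject to} \quad & -u_i \le y_i - \sum_{j=1}^{p} X_{i,j}\, \theta_j \le u_i,
\qquad i = 1, \dots, I
\end{align*}
```

At the optimum each u_i equals the absolute residual, so the two problems have the same minimizers, and a linear program is a special case of convex programming.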

  36. Confidence Intervals • No closed form • Compare to median ! • Bootstrap: • How ?
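One possible answer to "How ?" is a percentile bootstrap. The sketch below applies it to the simplest L1 norm estimate, the median, rather than to a full L1 regression; the data, the number of resamples and the function name are assumptions for illustration.

```python
import random
import statistics

def bootstrap_median_ci(data, n_resamples=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the median."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # Resample the data with replacement and recompute the median each time.
    medians = sorted(
        statistics.median(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    tail = (1 - level) / 2
    low = medians[round(tail * n_resamples)]
    high = medians[round((1 - tail) * n_resamples) - 1]
    return low, high

# Made-up data with one large outlier.
data = [2.1, 3.4, 1.9, 5.6, 2.8, 3.1, 4.0, 2.5, 3.7, 2.9, 40.0]
low, high = bootstrap_median_ci(data)
```

The same resampling idea extends to regression by resampling the (x_i, y_i) pairs and refitting each time, at the cost of solving one L1 minimization per resample.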

  38. 4. Choosing a Distribution • Know a catalog of distributions, guess a fit • Shape • Kurtosis, Skewness • Power laws • Hazard Rate • Fit • Verify the fit visually or with a test (see later)

  39. Distribution Shape • Distributions have a shape • By definition: the shape is what remains the same when we • Shift • Rescale • Example: normal distribution: what is the shape parameter ? • Example: exponential distribution: what is the shape parameter ?

  40. Standard Distributions • In a given catalog of distributions, we give only the distributions with different shapes. For each shape, we pick one particular distribution, which we call standard. • Standard normal: N(0,1) • Standard exponential: Exp(1) • Standard Uniform: U(0,1)

  41. Log-Normal Distribution

  43. Skewness and Kurtosis
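Skewness and kurtosis describe the shape of a distribution, so they should not change when the data are shifted or rescaled by a positive factor. The sketch below checks this on made-up data, assuming the moment-based definitions (third and fourth standardized central moments, with 3 subtracted from the kurtosis so that a normal distribution gives 0); the slide's own definitions are not in the transcript.

```python
def skewness_and_kurtosis(data):
    """Return (sample skewness, sample excess kurtosis) from central moments."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # variance (1/n convention)
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Made-up right-skewed data, and the same data after a shift and a rescale.
data = [1.0, 2.0, 2.5, 3.0, 4.5, 9.0]
shifted_rescaled = [10.0 + 3.0 * x for x in data]

skew_a, kurt_a = skewness_and_kurtosis(data)
skew_b, kurt_b = skewness_and_kurtosis(shifted_rescaled)
```

Both pairs agree up to rounding error, and the skewness is positive because the long tail is on the right.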

  44. Power Laws and Pareto Distribution

  45. Complementary Distribution Functions, Log-log Scales • Lognormal • Normal • Pareto
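The reason for plotting complementary distribution functions in log-log scales can be sketched numerically: for a Pareto distribution the curve is a straight line whose slope is minus the exponent, while for a normal distribution it bends downward ever more steeply. The parameter values and function names below are assumptions for illustration.

```python
import math

def pareto_ccdf(x, x_min=1.0, p=1.5):
    """P(X > x) for a Pareto distribution with scale x_min and exponent p."""
    return 1.0 if x < x_min else (x / x_min) ** (-p)

def normal_ccdf(x):
    """P(X > x) for the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def loglog_slope(ccdf, x1, x2):
    """Slope of log(ccdf(x)) against log(x) between x1 and x2."""
    return (math.log(ccdf(x2)) - math.log(ccdf(x1))) / (math.log(x2) - math.log(x1))

# Slopes measured near the body and far in the tail.
pareto_near = loglog_slope(pareto_ccdf, 2.0, 4.0)
pareto_far = loglog_slope(pareto_ccdf, 100.0, 1000.0)
normal_near = loglog_slope(normal_ccdf, 1.0, 2.0)
normal_far = loglog_slope(normal_ccdf, 4.0, 8.0)
```

The Pareto slope is the same (about -1.5) in both ranges, whereas the normal slope becomes much more negative in the tail.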

  46. Zipf’s Law

  48. Hazard Rate • Interpretation: probability that a flow dies in next dt seconds given still alive • Used to classify distributions • Aging • Memoryless • Fat tail • Ex: normal ? Exponential ? Pareto ? Log Normal ?

  49. The Weibull Distribution • Standard Weibull CDF: F(x) = 1 − exp(−x^c) for x ≥ 0 • Aging for c > 1 • Memoryless for c = 1 • Fat tailed for c < 1
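The three regimes can be read off the hazard rate. Assuming the standard Weibull CDF F(x) = 1 − exp(−x^c), the density is c·x^(c−1)·exp(−x^c), so the hazard rate (density divided by 1 − F) is c·x^(c−1). The sketch below uses hypothetical function names.

```python
import math

def weibull_cdf(x, c):
    """Standard Weibull CDF, assumed to be F(x) = 1 - exp(-x**c) for x >= 0."""
    return 1.0 - math.exp(-(x ** c))

def weibull_hazard(x, c):
    """Hazard rate f(x) / (1 - F(x)) of the standard Weibull: c * x**(c - 1)."""
    return c * x ** (c - 1)

# Hazard rate at two ages for each regime.
aging = (weibull_hazard(1.0, 2.0), weibull_hazard(2.0, 2.0))        # increasing
memoryless = (weibull_hazard(1.0, 1.0), weibull_hazard(5.0, 1.0))   # constant
fat_tailed = (weibull_hazard(1.0, 0.5), weibull_hazard(2.0, 0.5))   # decreasing
```

An increasing hazard rate means an item is more likely to die as it gets older (aging), a constant one is the exponential case, and a decreasing one means that survivors tend to survive even longer.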

  50. Fitting A Distribution • Assume iid • Use maximum likelihood • Ex: assume Gaussian; what are parameters ? • Frequent issues • Censoring • Combinations
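For the Gaussian example, the maximum likelihood parameters are the sample mean and the sample variance with a 1/n factor. A minimal sketch with made-up data and hypothetical function names:

```python
import math

def gaussian_mle(data):
    """Maximum likelihood estimates (mu, sigma^2) for iid Gaussian data."""
    n = len(data)
    mu = sum(data) / n
    sigma2 = sum((x - mu) ** 2 for x in data) / n  # 1/n, not 1/(n - 1)
    return mu, sigma2

def gaussian_log_likelihood(data, mu, sigma2):
    """Log-likelihood of iid N(mu, sigma2) data."""
    n = len(data)
    squares = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - squares / (2 * sigma2)

# Made-up data for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu_hat, sigma2_hat = gaussian_mle(data)
best = gaussian_log_likelihood(data, mu_hat, sigma2_hat)
```

Evaluating `gaussian_log_likelihood` at nearby parameter values gives a lower value than `best`, which is a quick sanity check that the estimates maximize the likelihood.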
