Stat 155, Section 2, Last Time

Presentation Transcript


  1. Stat 155, Section 2, Last Time • Reviewed Excel Computation of: • Time Plots (i.e. Time Series) • Histograms • Modelling Distributions: Densities (Areas) • Normal Density Curve (very useful model) • Fitting Normal Densities (using mean and s.d.)

  2. Reading In Textbook Approximate Reading for Today’s Material: Pages 71-83, 102-112 Approximate Reading for Next Class: Pages 123-127, 132-145

  3. 2 Views of Normal Fitting • “Fit Model to Data” Choose μ & σ. • “Fit Data to Model” First Standardize Data Then use Normal N(0,1). Note: same thing, just different rescalings (choose scale depending on need)

  4. Normal Distribution Notation The “normal distribution, with mean μ & standard deviation σ” is abbreviated as: N(μ, σ)

  5. Interpretation of Z-scores Recall Z-score Idea: • Transform data X1, ..., Xn • By subtracting mean & dividing by s.d. • To get Zi = (Xi - X̄) / s (mean 0, s.d. 1) • Interpret as Xi = X̄ + Zi · s • I.e. “Xi is Zi sd’s above the mean”
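
The standardization described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `z_scores` and the list `scores` are hypothetical and do not come from the course materials, and the sample standard deviation (n - 1 divisor, as in Excel's STDEV) is assumed.

```python
from statistics import mean, stdev

def z_scores(data):
    """Standardize data: subtract the sample mean, divide by the sample s.d."""
    m = mean(data)
    s = stdev(data)  # sample s.d. with n - 1 divisor (assumed, as in Excel's STDEV)
    return [(x - m) / s for x in data]

scores = [62, 70, 75, 81, 92]  # hypothetical data, not from the class
z = z_scores(scores)

# Each z-score says how many s.d.'s the value lies above (+) or below (-) the mean.
for x, zi in zip(scores, z):
    print(f"{x} is {zi:+.2f} sd's from the mean")
```

After standardizing, the z-scores themselves have mean 0 and s.d. 1, which is what makes them comparable across data sets.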

  6. Interpretation of Z-scores Same idea for Normal Curves: Z-scores are on N(0,1) scale, so use areas to interpret them Important Areas: • Within 1 sd of mean (68%) “the majority”

  7. Interpretation of Z-scores • Within 2 sd of mean (95%) “really most” • Within 3 sd of mean (99.7%) “almost all”

  8. Interpretation of Z-scores Interactive Version (used for above pics) From Publisher’s Website: http://bcs.whfreeman.com/ips5e/ • Statistical Applets • Normal Curve

  9. Interpretation of Z-scores Summary: These relations are called the “68 - 95 - 99.7 % Rule” HW: 1.86 (a: 234-298, b: 234, 298), 1.87

  10. Computation of Normal Areas Classical Approach: Tables • See inside covers of text • Summarizes area computations • Because can’t use calculus • Constructed by “computers” (a job description in the early 1900’s!)

  11. Computation of Normal Areas EXCEL Computation: works in terms of “lower areas” E.g. for Area < 1.3 is 0.7257

  12. Computation of Normal Areas Interactive Version (used for above pic) From Same Publisher’s Website: http://bcs.whfreeman.com/ips5e/ • Statistical Applets • Normal Curve

  13. Computation of Normal Areas EXCEL Computation: (of above e.g.) • Use NORMDIST • Enter parameters • x is “cutoff point” • Return is Area below x
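
For readers without Excel, the same "area below x" computation can be sketched with Python's standard library. In Excel the call takes the form NORMDIST(x, mean, standard_dev, TRUE), where TRUE requests the cumulative (lower) area. The helper name `area_below` is hypothetical. The parameters mean = 1 and sd = 0.5 are an assumption: 0.7257 is the standard normal area below z = 0.6, so these values were chosen only because they reproduce the 0.7257 quoted earlier for the cutoff 1.3; the original example's parameters are not shown in the transcript.

```python
from statistics import NormalDist

def area_below(x, mean, sd):
    """Area under the N(mean, sd) density curve to the left of the cutoff x."""
    return NormalDist(mu=mean, sigma=sd).cdf(x)

# ASSUMED parameters (mean = 1, sd = 0.5), chosen so that z = (1.3 - 1) / 0.5 = 0.6.
print(round(area_below(1.3, 1, 0.5), 4))  # 0.7257
```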

  14. Computation of Normal Areas Computation of areas over intervals: (use subtraction) Area(a < x < b) = Area(x < b) - Area(x < a)

  15. Computation of Normal Areas Computation of areas over intervals: (use subtraction for EXCEL too) E.g. Use Excel to check 68 - 95 - 99.7% Rule http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg9.xls
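
The subtraction idea, and the check of the 68 - 95 - 99.7% Rule, can also be sketched outside Excel. This is an illustrative stand-in, not the contents of the linked spreadsheet; the helper name `area_between` is hypothetical.

```python
from statistics import NormalDist

def area_between(a, b, mean=0.0, sd=1.0):
    """Area over the interval (a, b): lower area at b minus lower area at a."""
    dist = NormalDist(mean, sd)
    return dist.cdf(b) - dist.cdf(a)

# Check the rule on the standard normal scale: within 1, 2 and 3 s.d.'s of the mean.
for k in (1, 2, 3):
    print(f"within {k} sd of mean: {100 * area_between(-k, k):.1f}%")
```

The same function handles any N(mean, sd) curve, since only the two lower areas change.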

  16. Normal Area HW HW (use Excel): 1.94 1.97 (Hint: the % above 130 = 100% - % below 130) 1.99 (see discussion above) 1.113 Caution: Don’t just “twiddle EXCEL until answer appears”. Understand it!!!

  17. And Now for Something Completely Different A mind blowing video clip: 8 year old Skateboarding Twins: http://www.youtube.com/watch?v=8X2_zsnPkq8&mode=related&search= • Do they ever miss? • You can explore farther… Thanks to Devin Coley for the link

  18. Inverse of Area Function Inverse of Frequencies: “Quantiles” Idea: Given area, find “cutoff” x I.e. for Area = 80% This x is the “quantile”

  19. Inverse of Area Function EXCEL Computation of Quantiles: Use NORMINV Continue Class Example: http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg9.xls • “Probability” is “Area” • Enter mean and SD parameters

  20. Inverse Area Example When a machine works normally, it fills bottles with mean = 25 oz, and SD = 0.2 oz. The machine is “out of control” when it overfills. Choose an “alarm level”, which will give only 1 % false alarms. Want: cutoff, x, so that Area above = 1% Note: Area below = 100% - Area above = 99% http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg9.xls
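
The bottle-filling example can be worked through as follows. This is a sketch of the same calculation in Python rather than the spreadsheet's actual contents; the function name `alarm_level` is hypothetical. In Excel the corresponding call would take the form NORMINV(0.99, 25, 0.2).

```python
from statistics import NormalDist

def alarm_level(mean, sd, false_alarm_rate):
    """Cutoff x with area above x equal to false_alarm_rate under N(mean, sd)."""
    area_below = 1.0 - false_alarm_rate  # quantile functions work with lower areas
    return NormalDist(mean, sd).inv_cdf(area_below)

x = alarm_level(25, 0.2, 0.01)
print(f"alarm level: {x:.3f} oz")
```

The key step is converting "1% above" into "99% below" before inverting, exactly as the slide's note says.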

  21. Inverse Area HW 1.95, 1.101, 1.107, 1.109 1.116 a (-0.674, 0.674) 1.117 1.118 (4.3%)

  22. Normal Diagnostic When is the Normal Model “good”? Useful Graphical Device: Q-Q plot = Normal Quantile Plot Idea: look at plot which is approximately linear for data from Normal Model

  23. Normal Quantile Plot Approach, for data X1, ..., Xn: • Sort data • Compute “Theoretical Proportions”: • Compute “Theoretical Z-scores” • Plot Sorted Data (Y-axis) vs. Theoretical Z-scores (X-axis)
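
The steps above can be sketched as code that produces the plot coordinates. The formula for the theoretical proportions did not survive in this transcript, so the choice (i - 0.5) / n for the i-th smallest value is an assumption (one common convention), not necessarily the one used in class. The function name and the data are hypothetical.

```python
from statistics import NormalDist

def normal_quantile_points(data):
    """Return (theoretical z-score, sorted data value) pairs for a Q-Q plot."""
    n = len(data)
    ys = sorted(data)                                  # step 1: sort data
    # ASSUMPTION: proportion (i - 0.5) / n for the i-th smallest value.
    props = [(i - 0.5) / n for i in range(1, n + 1)]   # step 2: proportions
    zs = [NormalDist().inv_cdf(p) for p in props]      # step 3: z-scores
    return list(zip(zs, ys))                           # step 4: X = z, Y = data

temps = [14.2, 11.8, 13.1, 15.6, 12.5]  # hypothetical values, not the class data
for z, y in normal_quantile_points(temps):
    print(f"z = {z:+.3f}, data = {y}")
```

Plotting these pairs with z on the X-axis and the sorted data on the Y-axis gives the normal quantile plot; roughly linear points suggest the normal model is reasonable.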

  24. Normal Quantile Plot Several Examples: http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg12.xls • Show how to compute in Excel • Steps as above

  25. Normal Quantile Plot Main Lessons: • Melbourne Winter Temperature Data • Gaussian is good, so looks ~ linear • So OK, to use normal model for these data • Adding trendline helps in assessing linearity

  26. Normal Quantile Plot Main Lessons: • Intro Stat Course Exam Scores Data • Skewed distributions ⇒ nonlinearity • Outliers show up clearly • Normal model unreliable here • Combined plot highlights • Mean = Y-intercept • Standard Deviation = Slope
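
Why the intercept and slope pick out the mean and standard deviation can be seen from a small sketch. If data sit exactly at normal quantiles, then each sorted value equals mean + sd × z, which is a line in z. The least-squares helper `fit_line`, the idealized data, and the proportion convention (i - 0.5) / n are all assumptions made for illustration; Excel's trendline is the tool used in class.

```python
from statistics import NormalDist, mean

def fit_line(xs, ys):
    """Least-squares line through (x, y) points; returns (intercept, slope)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

n = 9
# ASSUMED proportion convention (i - 0.5) / n for the theoretical z-scores.
zs = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
# Idealized (hypothetical) data lying exactly on the N(70, 10) quantiles.
data = [70 + 10 * z for z in zs]

intercept, slope = fit_line(zs, data)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```

With real data the points scatter around the line, so the fitted intercept and slope only approximate the mean and s.d.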

  27. Normal Quantile Plot Main Lessons: • Simulated Bimodal Data • Curve is flat near modes • Roughly linear near peaks • Corresponds to two normal subpopulations • Goes up fast at valley

  28. Normal Quantile Plot Homework: 1.122 1.123 1.125

  29. And now for something completely different Recall Distribution of majors of students in this course:

  30. And now for something completely different How about a biology joke? A seventh-grade biology teacher arranged a demonstration for his class. He took two earthworms and in front of the class he did the following: He dropped the first worm into a beaker of water where it dropped to the bottom and wriggled about. He dropped the second worm into a beaker of ethyl alcohol and it immediately shriveled up and died. He asked the class if anyone knew what this demonstration was intended to show them.

  31. And now for something completely different He asked the class if anyone knew what this demonstration was intended to show them. A boy in the second row immediately shot his arm up and, when called on said: "You're showing us that if you drink alcohol, you won't have worms."

  32. Variable Relationships Chapter 2 in Text Idea: Look beyond single quantities, to how quantities relate to each other. E.g. How do HW scores “relate” to Exam scores? Section 2.1: Useful graphical device: Scatterplot

  33. Plotting Bivariate Data Toy Example: (1,2) (3,1) (-1,0) (2,-1)

  34. Plotting Bivariate Data Sometimes: Can see more insightful patterns by connecting points

  35. Plotting Bivariate Data Sometimes: Useful to switch off points, and only look at lines/curves

  36. Plotting Bivariate Data Common Name: “Scatterplot” A look under the hood: EXCEL: Chart Wizard (colored bar icon) • Chart Type: XY (scatter) • Subtype controls points only, or lines • Later steps similar to above (can massage the pic!)

  37. Scatterplot E.g. Data from related Intro. Stat. Class (actual scores) • How does HW score predict Final Exam? X = HW, Y = Final Exam http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg10.xls • In top half of HW scores: Better HW ⇒ Better Final • For lower HW: Final is much more “random”

  38. Scatterplots Common Terminology: When thinking about “X causes Y”, Call X the “Explanatory Var.” or “Indep. Var.” Call Y the “Response Var.” or “Dep. Var.” (think of “Y as function of X”) (although not always sensible)

  39. Scatterplots Note: Sometimes think about causation, Other times: “Explore Relationship” HW: 2.1

  40. Class Scores Scatterplots http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg10.xls • How does HW predict Midterm 1? X = HW, Y = MT1 • Still better HW ⇒ better Exam • But for each HW, wider range of MT1 scores • I.e. HW doesn’t predict MT1 as well as Final • “Outliers” in scatterplot may not be outliers in either individual variable e.g. HW = 72, MT1 = 94 (bad HW, but good MT1?, fluke???)

  41. Class Scores Scatterplots http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg10.xls • How does MT1 predict MT2? X = MT1, Y = MT2 • Idea: less “causation”, more “exploration” • Still higher MT1 associated with higher MT2 • For each MT1, wider range of MT2 i.e. “not good predictor” • Interesting Outliers: MT1 = 100, MT2 = 56 (oops!) MT1 = 23, MT2 = 74 (woke up!)

  42. Important Aspects of Relations • Form of Relationship • Direction of Relationship • Strength of Relationship

  43. I. Form of Relationship • Linear: Data approximately follow a line Previous Class Scores Example http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg10.xls Final vs. High values of HW is “best” • Nonlinear: Data follows different pattern Nice Example: Bralower’s Fossil Data http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg11.xls

  44. Bralower’s Fossil Data http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg11.xls From T. Bralower, formerly of Geological Sci. Studies Global Climate, millions of years ago: • Ratios of Isotopes of Strontium • Reflects Ice Ages, via Sea Level (50 meter difference!) • As function of time • Clearly nonlinear relationship

  45. II. Direction of Relationship • Positive Association X bigger ⇒ Y bigger • Negative Association X bigger ⇒ Y smaller E.g. X = alcohol consumption, Y = Driving Ability Clear negative association

  46. III. Strength of Relationship Idea: How close are points to lying on a line? Revisit Class Scores Example: http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg10.xls • Final Exam is “closely related to HW” • Midterm 1 less closely related to HW • Midterm 2 even less related to Midterm 1
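
Direction and strength can also be summarized with a single number. The slides do not introduce one at this point, so the correlation coefficient r used below is an assumption brought in purely for illustration: its sign indicates direction, and values near +1 or -1 indicate points lying close to a line. The sketch applies it to the four toy points from the earlier bivariate plotting slide; the function name `correlation` is hypothetical.

```python
from math import sqrt
from statistics import mean

def correlation(xs, ys):
    """Correlation coefficient r: sign gives direction, |r| gives linear strength."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Toy example points from the plotting slide: (1,2) (3,1) (-1,0) (2,-1)
xs = [1, 3, -1, 2]
ys = [2, 1, 0, -1]
r = correlation(xs, ys)
print(f"r = {r:.3f}")  # r = 0.076
```

For the toy points r is close to 0, which matches the visual impression that they do not lie near any line.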
