
Statistics An Introduction


mostyn

Presentation Transcript


  1. Statistics An Introduction

  2. Learning Objectives • 1. Define Statistics • 2. Describe the Uses of Statistics • 3. Distinguish Descriptive & Inferential Statistics • 4. Define Population, Sample, Parameter, & Statistic • 5. Identify Data Types

  3. What is Statistics? • The practice (science?) of data analysis • Summarizing data and drawing inferences about the larger population from which it was drawn

  4. Statistical Methods • Descriptive Statistics • Inferential Statistics

  5. Descriptive Statistics • 1. Involves • Collecting Data • Presenting Data • Characterizing Data • 2. Purpose • Describe Data • [Figure: bar chart of dollar amounts (0 to 50) for quarters Q1 to Q4, annotated with $\bar{X} = 30.5$ and $S^2 = 113$]

  6. Inferential Statistics • 1. Involves • Estimation • Hypothesis Testing • 2. Purpose • Make Decisions About Population Based on Sample Characteristics • [Figure: diagram labeled "Population?"]

  7. Key Terms • 1. Population (Universe) • All Items of Interest • 2. Sample • Portion of Population • 3. Parameter • Summary Measure about Population • 4. Statistic • Summary Measure about Sample • P in Population & Parameter • S in Sample & Statistic

  8. Data Types • Quantitative • Discrete • Continuous • Qualitative • Nominal (categorical) • Ordinal (rank ordered categories)

  9. Sampling • Representative sample • Same characteristics as the population • Random sample • Every subset of the population of a given size has an equal chance of being selected
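
A minimal sketch of how a random sample, a parameter, and a statistic relate. The population here is made-up data (hypothetical ages between 18 and 65), and the sample size of 30 is an arbitrary choice; neither comes from the slides.

```python
import random
from statistics import mean

# Hypothetical population: ages of all 1,000 members of some group (made-up data).
rng = random.Random(42)  # fixed seed so the run is repeatable
population = [rng.randint(18, 65) for _ in range(1000)]

# Simple random sample: every subset of size n is equally likely to be drawn.
n = 30
sample = rng.sample(population, n)

parameter = mean(population)  # summary measure about the population
statistic = mean(sample)      # summary measure about the sample

print(f"Population mean (parameter): {parameter:.2f}")
print(f"Sample mean (statistic):     {statistic:.2f}")
```

The statistic will usually land near the parameter but not on it; that gap is what inferential statistics tries to quantify.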

  10. Review • Descriptive vs. Inferential Statistics • Vocabulary • Population • (Random, representative) sample • Parameter • Statistic • Data types

  11. Methods for Describing Data

  12. Learning Objectives • 1. Describe Qualitative Data Graphically • 2. Describe Numerical Data Graphically • 3. Create & Interpret Graphical Displays • 4. Explain Numerical Data Properties • 5. Describe Summary Measures • 6. Analyze Numerical Data Using Summary Measures

  13. Data Presentation

  14. Presenting Qualitative Data

  15. Data Presentation

  16. Student Specializations • Specialization | Freq. Percent Cum. • ---------------+---------------------------------- • HCI | 9 39.13 39.13 • IEMP | 9 39.13 78.26 • LIS | 3 13.04 91.30 • Undecided | 2 8.70 100.00 • ---------------+---------------------------------- • Total | 23 100.00
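
As a sketch, the Freq., Percent, and Cum. columns of a table like this one can be rebuilt from the raw counts. The counts are copied from the slide; the variable names and print layout are illustrative only.

```python
# Counts taken from the Student Specializations slide.
counts = {"HCI": 9, "IEMP": 9, "LIS": 3, "Undecided": 2}

total = sum(counts.values())
rows = []      # (category, freq, percent, cumulative percent)
running = 0    # running count, so rounding errors do not accumulate
for category, freq in counts.items():
    running += freq
    rows.append((category, freq, 100 * freq / total, 100 * running / total))

print(f"{'Specialization':<15}|{'Freq.':>6}{'Percent':>9}{'Cum.':>8}")
for category, freq, pct, cum in rows:
    print(f"{category:<15}|{freq:>6}{pct:>9.2f}{cum:>8.2f}")
print(f"{'Total':<15}|{total:>6}{100:>9.2f}")
```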

  17. Student Specializations

  18. Undergrad Majors • UG major | Freq. Percent Cum. • --------------------------+----------------------------------- • American Studies | 1 4.76 4.76 • Cog Sci | 1 4.76 9.52 • Comp Sci | 3 14.29 23.81 • Economics | 3 14.29 38.10 • English | 5 23.81 61.90 • Environmental Engineering | 1 4.76 66.67 • Graphic Design | 1 4.76 71.43 • Math | 2 9.52 80.95 • Mechanical Engineering | 1 4.76 85.71 • Nutrition | 1 4.76 90.48 • Sci and Tech Policy | 1 4.76 95.24 • Telecommunications | 1 4.76 100.00 • --------------------------+----------------------------------- • Total | 21 100.00

  19. Favorite Colors • color | Freq. Percent Cum. • ------------+----------------------------------- • black | 2 8.70 8.70 • blue | 12 52.17 60.87 • green | 1 4.35 65.22 • orange | 1 4.35 69.57 • purple | 1 4.35 73.91 • red | 5 21.74 95.65 • white | 1 4.35 100.00 • ------------+----------------------------------- • Total | 23 100.00

  20. Calculus Knowledge • integrals | Freq. Percent Cum. • ------------+----------------------------------- • 1 | 3 13.04 13.04 • 2 | 1 4.35 17.39 • 3 | 11 47.83 65.22 • 4 | 6 26.09 91.30 • 5 | 2 8.70 100.00 • ------------+----------------------------------- • Total | 23 100.00

  21. Presenting Numerical Data

  22. Data Presentation

  23. Student Age (Reported) Data • Stem-and-leaf plot for age • 2* | 22233444555777899 • 3* | 01257 • 4* | • 5* | • 6* | • 7* | 6
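
A stem-and-leaf display like this one can be produced with a few lines of code. The sketch below reads the 23 ages back off the slide and uses the tens digit as the stem; the function name `stem_and_leaf` is made up for illustration.

```python
from collections import defaultdict

# Ages read back off the stem-and-leaf display on the slide.
ages = [22, 22, 22, 23, 23, 24, 24, 24, 25, 25, 25, 27, 27, 27, 28, 29, 29,
        30, 31, 32, 35, 37, 76]

def stem_and_leaf(values):
    """Return display lines with the tens digit as stem and units digit as leaf."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    lines = []
    # Include empty stems so gaps (and outliers) stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        lines.append(f"{stem}* | " + "".join(str(d) for d in leaves.get(stem, [])))
    return lines

for line in stem_and_leaf(ages):
    print(line)
```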

  24. Histogram

  25. Starting Salaries (in $K) • 3* | 8 • 4* | 000025 • 5* | 0000 • 6* | 0000005 • 7* | 5 • 8* | 0

  26. Numerical Data Properties

  27. Thinking Challenge • ... employees cite low pay -- most workers earn only $20,000. • ... President claims average pay is $70,000! • [Figure: salaries of $400,000, $70,000, $50,000, $30,000, and $20,000]

  28. Standard Notation • Measure | Sample | Population • Mean | $\bar{x}$ | $\mu$ • Stand. Dev. | $s$ | $\sigma$ • Variance | $s^2$ | $\sigma^2$ • Size | $n$ | $N$

  29. Numerical Data Properties • Central Tendency (Location) • Variation (Dispersion) • Shape

  30. Numerical Data Properties & Measures • Central Tendency: Mean, Median, Mode • Variation: Range, Interquartile Range, Variance, Standard Deviation • Shape: Skew

  31. Central Tendency

  32. Numerical Data Properties & Measures • Central Tendency: Mean, Median, Mode • Variation: Range, Interquartile Range, Variance, Standard Deviation • Shape: Skew

  33. What's wrong with this? • Measurements: 1 4 2 9 8 • Middle measurement is 2, so that's the median • Mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$
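
A short check of the slide's question, using the five measurements as given: the median is the middle value only after the data are ordered, so the answer is 4, not 2. The variable names are illustrative.

```python
from statistics import mean, median

measurements = [1, 4, 2, 9, 8]

# Wrong: taking the middle element of the list as written.
middle_as_listed = measurements[len(measurements) // 2]

# Right: order the values first, then take the middle one.
ordered = sorted(measurements)
true_median = ordered[len(ordered) // 2]

print("Middle as listed:", middle_as_listed)
print("Median after sorting:", true_median)
print("statistics.median agrees:", median(measurements))
print("Mean:", mean(measurements))
```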

  34. Ages • Mean = 29 • Median = 27 • 2* | 22233444555777899 • 3* | 01257 • 4* | • 5* | • 6* | • 7* | 6
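
The two figures on this slide can be reproduced from the ages in the stem-and-leaf display. As an illustration that goes slightly beyond the slide, the sketch also drops the single reported age of 76 to show how much more the mean moves than the median.

```python
from statistics import mean, median

# Ages read back off the stem-and-leaf display on the slide.
ages = [22, 22, 22, 23, 23, 24, 24, 24, 25, 25, 25, 27, 27, 27, 28, 29, 29,
        30, 31, 32, 35, 37, 76]

print("Mean:", mean(ages), "Median:", median(ages))

# Illustration only: remove the one extreme value and recompute.
without_outlier = [a for a in ages if a != 76]
print("Mean without 76:", round(mean(without_outlier), 2))
print("Median without 76:", median(without_outlier))
```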

  35. Summary of Central Tendency Measures • Measure | Equation | Description • Mean | $\sum X_i / n$ | Balance Point • Median | $(n+1)/2$ Position | Middle Value When Ordered • Mode | none | Most Frequent

  36. Shape

  37. Numerical Data Properties & Measures • Central Tendency: Mean, Median, Mode • Variation: Range, Interquartile Range, Variance, Standard Deviation • Shape: Skew

  38. Shape • 1. Describes How Data Are Distributed • 2. Measures of Shape • Skew = Symmetry • Left-Skewed: Mean < Median < Mode • Symmetric: Mean = Median = Mode • Right-Skewed: Mode < Median < Mean

  39. Variation

  40. Numerical Data Properties & Measures • Central Tendency: Mean, Median, Mode • Variation: Range, Interquartile Range, Variance, Standard Deviation • Shape: Skew

  41. Quartiles • 1. Measure of Noncentral Tendency • 2. Split Ordered Data into 4 Quarters (25% each, divided by Q1, Q2, Q3) • 3. Position of i-th Quartile: Positioning Point of $Q_i = \frac{i \cdot (n+1)}{4}$
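
A sketch of the positioning-point rule applied to the 23 reported ages from the earlier stem-and-leaf slide. With n = 23 the positions come out as whole numbers (6, 12, 18). The slide does not say what to do with a fractional position, so the linear interpolation below is an assumption, and other conventions exist.

```python
# Ages read back off the stem-and-leaf display earlier in the deck.
ages = [22, 22, 22, 23, 23, 24, 24, 24, 25, 25, 25, 27, 27, 27, 28, 29, 29,
        30, 31, 32, 35, 37, 76]

def quartile(values, i):
    """i-th quartile using the positioning point i * (n + 1) / 4.

    Assumption: a fractional position is resolved by linear interpolation
    between the two neighboring ordered values. Requires n >= 3.
    """
    ordered = sorted(values)
    pos = i * (len(ordered) + 1) / 4   # 1-based position in the ordered data
    lower = int(pos)                   # whole part of the position
    frac = pos - lower
    if frac == 0:
        return ordered[lower - 1]
    return ordered[lower - 1] + frac * (ordered[lower] - ordered[lower - 1])

for i in (1, 2, 3):
    print(f"Q{i} =", quartile(ages, i))
```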

  42. Ages • Range • Quartiles • 2* | 22233444555777899 • 3* | 01257 • 4* | • 5* | • 6* | • 7* | 6

  43. Box Plots - Age and Salary • Age: Quartiles: 24, 27, 30; Inner fences: (15, 39); Outer fences: (6, 48) • Salary: Quartiles: 41K, 50K, 60K; Inner fences: ??; Outer fences: ??
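
The age fences on this slide are consistent with the conventional box-plot rule: inner fences 1.5 interquartile ranges beyond Q1 and Q3, outer fences 3 interquartile ranges beyond. The slide does not state the rule, so the sketch below assumes it; the salary fences are left as the exercise the slide poses.

```python
# Age quartiles as given on the slide.
q1, q2, q3 = 24, 27, 30
iqr = q3 - q1

# Assumed convention: 1.5 * IQR for inner fences, 3 * IQR for outer fences.
inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
outer = (q1 - 3 * iqr, q3 + 3 * iqr)
print("Inner fences:", inner)
print("Outer fences:", outer)

# Ages read back off the stem-and-leaf display earlier in the deck.
ages = [22, 22, 22, 23, 23, 24, 24, 24, 25, 25, 25, 27, 27, 27, 28, 29, 29,
        30, 31, 32, 35, 37, 76]

# Values between the inner and outer fences, and values beyond the outer fences.
mild = [a for a in ages
        if not (inner[0] <= a <= inner[1]) and (outer[0] <= a <= outer[1])]
extreme = [a for a in ages if a < outer[0] or a > outer[1]]
print("Between inner and outer fences:", mild)
print("Beyond outer fences:", extreme)
```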

  44. Variance & Standard Deviation • 1. Measures of Dispersion • 2. Most Common Measures • 3. Consider How Data Are Distributed • 4. Show Variation About Mean ($\bar{X}$ or $\mu$) • [Figure: data plotted on an axis from 4 to 12, with $\bar{X} = 8.3$]

  45. Sample Variance Formula • $S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} = \frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \cdots + (X_n - \bar{X})^2}{n - 1}$ • $n - 1$ in denominator! (Use $N$ if Population Variance)
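
A direct translation of the formula, as a sketch, applied to the reported ages from the earlier slides (the choice of data set is for illustration). The only difference between the two functions is the denominator: n - 1 for a sample, N for a population.

```python
from statistics import pvariance, variance

# Ages read back off the stem-and-leaf display earlier in the deck.
ages = [22, 22, 22, 23, 23, 24, 24, 24, 25, 25, 25, 27, 27, 27, 28, 29, 29,
        30, 31, 32, 35, 37, 76]

def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

def population_variance(xs):
    """Same sum of squared deviations, divided by N."""
    big_n = len(xs)
    mu = sum(xs) / big_n
    return sum((x - mu) ** 2 for x in xs) / big_n

print("Sample variance:    ", round(sample_variance(ages), 2))
print("Population variance:", round(population_variance(ages), 2))
print("Sample std. dev.:   ", round(sample_variance(ages) ** 0.5, 2))

# Cross-check against the standard library.
print("statistics.variance: ", round(variance(ages), 2))
print("statistics.pvariance:", round(pvariance(ages), 2))
```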

  46. Equivalent Formula

  47. Another Equivalent Formula
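
The formulas on the two "equivalent formula" slides are not in the transcript. As an assumption about what they likely showed, the sketch below checks two standard algebraic shortcuts for the sample variance against the definitional form, using the reported ages as data. Function names are illustrative.

```python
# Ages read back off the stem-and-leaf display earlier in the deck.
ages = [22, 22, 22, 23, 23, 24, 24, 24, 25, 25, 25, 27, 27, 27, 28, 29, 29,
        30, 31, 32, 35, 37, 76]

def variance_definition(xs):
    """Definitional form: sum of (X_i - Xbar)^2 over n - 1."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

def variance_shortcut_sums(xs):
    """Shortcut (assumed): (sum X_i^2 - (sum X_i)^2 / n) / (n - 1)."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)

def variance_shortcut_mean(xs):
    """Shortcut (assumed): (sum X_i^2 - n * Xbar^2) / (n - 1)."""
    n = len(xs)
    xbar = sum(xs) / n
    return (sum(x * x for x in xs) - n * xbar ** 2) / (n - 1)

for f in (variance_definition, variance_shortcut_sums, variance_shortcut_mean):
    print(f.__name__, round(f(ages), 4))
```

The shortcuts save a pass over the data but can lose precision in floating point when the mean is large relative to the spread, which is one reason the definitional form is often preferred in code.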

  48. Empirical Rule • If x has a "symmetric, mound-shaped" distribution • Justification: Known properties of the "normal" distribution, to be studied later in the course
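
The percentages of the rule are not in the transcript. It is commonly stated as roughly 68%, 95%, and 99.7% of values falling within 1, 2, and 3 standard deviations of the mean. The sketch below recovers those figures from the normal distribution using the error function; `normal_within` is an illustrative name.

```python
import math

def normal_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"Within {k} standard deviation(s): {100 * normal_within(k):.2f}%")
```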

  49. Preview of Statistical Inference • You observe one data point • Make hypothesis about mean and standard deviation from which it was drawn • Empirical Rule tells you how (un)likely the data point is • If very unlikely, you are suspicious of the hypothesis about mean and standard deviation, and reject it
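
The reasoning on this slide can be sketched as a tiny decision rule. Everything numeric here is hypothetical: the hypothesized mean of 100 and standard deviation of 15, the observed values, and the cutoff of 2 standard deviations (beyond which, by the Empirical Rule, only about 5% of values are expected to fall).

```python
def z_score(x, mu, sigma):
    """How many standard deviations x lies from the hypothesized mean."""
    return (x - mu) / sigma

def is_suspicious(x, mu, sigma, threshold=2.0):
    """True if x is farther from mu than `threshold` standard deviations."""
    return abs(z_score(x, mu, sigma)) > threshold

# Hypothetical hypothesis about the distribution the data point came from.
mu, sigma = 100, 15

for observed in (110, 150):
    z = z_score(observed, mu, sigma)
    verdict = "reject hypothesis" if is_suspicious(observed, mu, sigma) else "no reason to reject"
    print(f"x = {observed}: z = {z:.2f} -> {verdict}")
```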

  50. Summary of Variation Measures • Measure | Equation | Description • Range | $X_{largest} - X_{smallest}$ | Total Spread • Interquartile Range | $Q_3 - Q_1$ | Spread of Middle 50% • Standard Deviation (Sample) | $\sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$ | Dispersion about Sample Mean • Standard Deviation (Population) | $\sqrt{\frac{\sum (X_i - \mu_X)^2}{N}}$ | Dispersion about Population Mean • Variance (Sample) | $\frac{\sum (X_i - \bar{X})^2}{n - 1}$ | Squared Dispersion about Sample Mean
