APSTAT PART ONE Exploring and Understanding Data

There are three kinds of lies - lies, damned lies and statistics. ~Benjamin Disraeli, commonly misattributed to Mark Twain

What is Statistics?

Chapters 1-3

What is Stat?
• Book Says:
• A way of reasoning
• Collection of tools and methods
• Helps us understand the world
• Statistics is about variation
Stat Basics
• Individuals
• Object described by a set of data
• People (#1), cars, animals, groups…
• Variables
• Categorical (Qualitative)– Usually involves words
• Examples: sex, advisor, social security #...
• Quantitative – Involve #’s
• Examples: age, height, income, test score…
Displaying Categorical Data
• Relative Frequency tables:
• Just roll up the %’s
Displaying Categorical Data
• Contingency Table
• Two Way table

Age at first “Real Kiss” (ahhhhhhhhhhhh…)

Marginal Distribution

Age at first “Real Kiss” (ahhhhhhhhhhhh…)

• Conditional Distribution:
• % of males whose first kiss came when they were 10-14
• % of 20-24 year old first kissers who were male
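The difference between the two conditional percentages above is only which total goes in the denominator. A minimal Python sketch of that idea follows. The table from the slide did not survive in this transcript, so every count below is hypothetical, and the "15-19" age group and the variable names are assumptions made purely for illustration.

```python
# Hypothetical two-way table (counts invented for illustration only;
# the slide's real table is not in the transcript).
counts = {
    "male":   {"10-14": 6, "15-19": 10, "20-24": 4},
    "female": {"10-14": 8, "15-19": 9,  "20-24": 3},
}

grand_total = sum(sum(row.values()) for row in counts.values())

# Marginal distribution of sex: each row total over the grand total.
marginal_sex = {sex: sum(row.values()) / grand_total for sex, row in counts.items()}

# Conditional on sex: % of males whose first kiss came at 10-14.
# Denominator = all males.
pct_males_10_14 = counts["male"]["10-14"] / sum(counts["male"].values())

# Conditional on age group: % of 20-24 first kissers who were male.
# Denominator = everyone in the 20-24 column.
total_20_24 = sum(row["20-24"] for row in counts.values())
pct_20_24_male = counts["male"]["20-24"] / total_20_24

print(marginal_sex)
print(f"{pct_males_10_14:.1%} of males were 10-14")
print(f"{pct_20_24_male:.1%} of the 20-24 group were male")
```

Same cell in the numerator would still give a different answer if the denominator switched from the row total to the column total, which is the usual trap on these questions.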
The Rest of Chapters 1-3
• Displaying the data
• Pie Charts
• Bar Charts
• Blah Blah Blah….
• Simpson’s Paradox – AP MC
• Being Skeptical – Important for real life
• 5 W’s + 1H
• Ex: 4 out of 5 dentists….
• Displaying data
• Lies, Damned Lies, and Statistics

Showing Off Your Data

Chapters 4-5

Histograms
• Remember bar graphs? Same, but different.
• Think of sorting boxes…
• Same size boxes
• On TI-83
• Enter Data into L1 (STAT>EDIT)
• Go to STAT PLOT (2ND Y=)
• Change Options
• Go to ZOOM and Choose ZoomStat, OR Go to WINDOW, Change Options, then Go to GRAPH

Histograms
• Make a histogram of the following data:
• Age of Teachers At WPS

25, 34, 37, 42, 51, 43, 49, 35, 37, 65
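To see the "same size boxes" idea away from the calculator, the teacher ages can be sorted into equal-width bins. This is a rough Python sketch, not part of the original slides; the bin width of 10 years and the helper name `bin_counts` are assumptions chosen for illustration.

```python
from collections import Counter

teacher_ages = [25, 34, 37, 42, 51, 43, 49, 35, 37, 65]
BIN_WIDTH = 10  # assumed width; every "box" covers the same range of ages


def bin_counts(values, width):
    """Count values per equal-width bin, keyed by the bin's lower edge."""
    return Counter((v // width) * width for v in values)


counts = bin_counts(teacher_ages, BIN_WIDTH)

# Text histogram: one row per bin, one '#' per teacher in that bin.
for lower in range(min(counts), max(counts) + BIN_WIDTH, BIN_WIDTH):
    print(f"{lower}-{lower + BIN_WIDTH - 1}: {'#' * counts[lower]}")
```

A different bin width would change the picture, which is why the calculator's WINDOW settings matter.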

Outliers
• An observation that is outside the pattern
• For example, ages in this classroom

16, 17, 16, 17, 18, 17, 17, 16, 18, 36

• Formula to determine (l8r, sk8r)
• For now “potential” or “possible” outlier
Describing a distribution
• Center: Mean (Average), Median (Middle)
• Shape: Symmetric, Skewed, Uniform, Bell Shaped, Bi- or Multi-modal
• Spread: Standard Deviation, Range, IQR
• Weird-ness: Outliers, Gaps
Stemplots
• Basic
• Split Stems
• Back-To-Back
Basic Stemplot
• Boys Weight in class (pounds)

10 |
11 |
12 |
13 |
14 | 3 4 6 9 9
15 | 0 2 5 7 8 8
16 | 0 0 1 3 4 4 5 8 9
17 | 1
18 | 9

KEY: 10 | 8 = 108 pounds
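A basic stemplot is just the sorted data grouped by everything except the last digit. The Python sketch below shows that grouping; it uses the teacher ages from the histogram slide rather than the class weights, and the function name `stemplot` is an assumption made for illustration.

```python
from collections import defaultdict


def stemplot(values):
    """Return the rows of a basic stemplot (stem = tens and up, leaf = last digit)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    rows = []
    # Walk every stem from lowest to highest so empty stems still show as gaps.
    for stem in range(min(leaves), max(leaves) + 1):
        row = " ".join(str(leaf) for leaf in leaves.get(stem, []))
        rows.append(f"{stem} | {row}".rstrip())
    return rows


teacher_ages = [25, 34, 37, 42, 51, 43, 49, 35, 37, 65]
for row in stemplot(teacher_ages):
    print(row)
print("KEY: 2 | 5 = 25 years")
```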

Split Stem Stemplot
• Boys Weight in class (pounds)

14 | 3 4
14 | 6 9 9
15 | 0 2
15 | 5 7 8 8
16 | 0 0 1 3 4 4
16 | 5 8 9
17 | 1
17 |
18 | 9

KEY: 10 | 8 = 108 pounds

Back to Back Stemplot
• Girls vs. Boys Weight in class (pounds)

      8 | 10 |
    9 3 | 11 |
8 7 7 3 | 12 |
  9 4 0 | 13 |
      2 | 14 | 3 4 6 9 9
      1 | 15 | 0 2 5 7 8 8
        | 16 | 0 0 1 3 4 4 5 8 9
        | 17 | 1
        | 18 | 9

KEY: 10 | 8 or 8 | 10 = 108 pounds

Mean
• Average! Add ‘em up and divide by n
• Sample Mean denoted as x̄ (x-bar)
• Not Resistant to extreme values
• e.g. Ages in Mrs. Smith’s Kindergarten Class
• 4,5,4,4,4,5,5,4,4,4,5,5,4,4,5,39
Median
• Middle! Line ‘em up (in order) and find the middle. If two share it, find their mean.
• Resistant to extreme values
• e.g. Ages in Mrs. Smith’s Kindergarten Class
• 4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,39
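The kindergarten example can be checked directly. The short Python sketch below computes both measures with and without the one extreme age; treating the 39 as the value to drop is an assumption made for illustration, and the variable names are not from the slides.

```python
from statistics import mean, median

ages = [4, 5, 4, 4, 4, 5, 5, 4, 4, 4, 5, 5, 4, 4, 5, 39]

# With the extreme value: the mean is pulled up, the median is not.
print("mean:", mean(ages))
print("median:", median(ages))

# Without the extreme value (assumed here to be the single 39).
without_39 = [a for a in ages if a != 39]
print("mean without 39:", mean(without_39))
print("median without 39:", median(without_39))
```

The median barely notices the 39, while the mean lands on an age nobody in the room actually is.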
Quartiles
• Median cuts data in half, Quartiles cut the Halves in Half!

Recall Teacher Ages:

25, 34, 35, 37, 37, 42, 43, 49, 51, 65

• 1st Quartile (Q1) = 35
• Median = 39.5
• 3rd Quartile (Q3) = 49

5-Number Summary
• Low-Q1-Median-Q3-High
• Shows Spread of Data

Recall Teacher Ages:

25, 34, 35, 37, 37, 42, 43, 49, 51, 65

• 5-Number Summary:

25 35 39.5 49 65
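The summary above can be reproduced with the "cut the halves in half" rule. This is a sketch in Python under one stated assumption: quartiles are taken as the medians of the lower and upper halves, with the middle value left out of both halves when n is odd. Other software may use a different quartile rule and give slightly different values. The function name is illustrative.

```python
from statistics import median


def five_number_summary(data):
    """Low, Q1, Median, Q3, High using the median-of-halves rule."""
    xs = sorted(data)
    half = len(xs) // 2
    lower_half = xs[:half]               # values below the median position
    upper_half = xs[len(xs) - half:]     # values above it (middle skipped if n is odd)
    return (xs[0], median(lower_half), median(xs), median(upper_half), xs[-1])


teacher_ages = [25, 34, 35, 37, 37, 42, 43, 49, 51, 65]
print(five_number_summary(teacher_ages))
```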

Boxplot
• Graphical Representation of 5-Number Summary
• Shows Shape, Spread, and Center
• Always draw to scale:

25 35 39.5 49 65

Outliers
• First off, IQR – InterQuartile Range
• Distance between Quartiles…

Recall Teacher Ages:

25, 34, 35, 37, 37, 42, 43, 49, 51, 65

• IQR is 49-35=14
• Outlier is anything more than 1.5 times IQR below Q1 or above Q3
• Sooo…. 1.5 × 14 = 21, so an outlier would have to be more than 21 below 35 or more than 21 above 49… Below 14 or above 70. Nothing in our data is an outlier!
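The 1.5 × IQR rule is easy to automate. The Python sketch below applies it to the teacher ages and to the classroom ages from the earlier Outliers slide. It assumes quartiles are the medians of the lower and upper halves; the function names are made up for illustration.

```python
from statistics import median


def quartiles(data):
    """Q1 and Q3 as medians of the lower and upper halves of the sorted data."""
    xs = sorted(data)
    half = len(xs) // 2
    return median(xs[:half]), median(xs[len(xs) - half:])


def find_outliers(data):
    """Return (low fence, high fence, values outside the fences)."""
    q1, q3 = quartiles(data)
    step = 1.5 * (q3 - q1)               # 1.5 times the IQR
    low_fence, high_fence = q1 - step, q3 + step
    flagged = [x for x in data if x < low_fence or x > high_fence]
    return low_fence, high_fence, flagged


teacher_ages = [25, 34, 35, 37, 37, 42, 43, 49, 51, 65]
classroom_ages = [16, 17, 16, 17, 18, 17, 17, 16, 18, 36]

print(find_outliers(teacher_ages))
print(find_outliers(classroom_ages))
```

With this rule the 36 in the classroom ages is flagged, while the 65 among the teachers stays inside the fences.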
Boxplot Using TI-83

Enter Teacher Ages into L1 (clear old stuff first):

25, 34, 35, 37, 37, 42, 43, 49, 51, 65

• On TI-83
• Go to STAT PLOT (2ND Y=)
• Change Options
• Go to ZOOM and Choose ZoomStat, OR Go to WINDOW, Change Options, then Go to GRAPH

Variance & Standard Deviation
• Variance - s²
• Average of Squared distances from mean (for a sample, divide by n − 1)
• In example 26/5 = 5.2
• Standard Deviation – s
• Square Root of Variance
• In example, about 2.28
• Standard Deviation
• Measure of Spread
• Use with Mean
• Non-Resistant
• On TI-83 Now…..

STAT>CALC>1-Var Stats

Mean = 6
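The data behind the slide's example is not in this transcript, so the Python sketch below uses a hypothetical data set, invented only because it happens to give the same numbers (mean 6, squared distances adding to 26, divided by n − 1 = 5). It assumes the slide's 26/5 is a sample variance; that reading is a guess, not something the slides confirm.

```python
from math import sqrt
from statistics import stdev, variance

# Hypothetical data chosen to match the slide's numbers; NOT the original example.
data = [3, 4, 6, 6, 8, 9]

n = len(data)
x_bar = sum(data) / n                              # mean
sum_sq = sum((x - x_bar) ** 2 for x in data)       # squared distances from the mean
s2 = sum_sq / (n - 1)                              # sample variance
s = sqrt(s2)                                       # sample standard deviation

print(x_bar, sum_sq, s2, round(s, 2))

# The standard library agrees with the hand calculation.
print(variance(data), stdev(data))
```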

It’s Normal to Deviate

Chapter 6 – The Normal Model

Density Curve
• Area under a density curve is always 1
• Symmetric density curve: Mean, Median and Mode
• Skewed to the Left (tail trails to the left)
• Skewed to the Right (tail trails to the right)
[Figure: symmetric, left-skewed and right-skewed density curves with Mean, Median and Mode marked]

Density Curve Continued
• Density curves are often skewed
• Recall Median is “resistant” while Mean is not

[Figure: density curve split at the Median, 50% of Population on each side]

Histograms
• Median is “equal areas” point
• Mean is “balance point” – “think Physics”

Normal Distributions (bell shaped)
• Center is mean μ (population mean)
• Spread is Standard Deviation σ (population standard deviation)
• To find σ, look for inflection points
[Figure: normal curve, Concave Down between μ − σ and μ + σ, Concave Up outside]

Raw-Score (X): μ − 3σ, μ − 2σ, μ − σ, μ, μ + σ, μ + 2σ, μ + 3σ
z-Score (z): −3, −2, −1, 0, 1, 2, 3

68 – 95 – 99.7 Rule
• Also called EMPIRICAL RULE

Probability = 99.7% within 3

Probability = 95% within 2

Probability = 68% within 1

Percentiles (and quartiles)
• Think standardized tests or class rankings
• Percent of observations to the LEFT of an observation
• Quartiles:
• First is at 25th percentile
• Median is at 50th percentile
• Third is at 75th percentile

Raw-Score (X): μ − 3σ, μ − 2σ, μ − σ, μ, μ + σ, μ + 2σ, μ + 3σ
z-Score (z): −3, −2, −1, 0, 1, 2, 3

Z-SCORE
• Number of Standard Deviations (σ) away from the Mean (μ): z = (X − μ) / σ
Z-SCORE Continued
• Example: You have an IQ of 148. The IQ test you took has a distribution N(105, 20). What is your Z-Score? What does this mean?
• μ = population mean, σ = population standard deviation, X = Raw-Score, z = z-Score
• Normal Distribution Notation N(μ, σ)
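For the IQ example, the z-score is the raw score's distance from the mean measured in standard deviations. A minimal Python sketch, with a helper name (`z_score`) that is not from the slides:

```python
def z_score(x, mu, sigma):
    """How many standard deviations x lies from the mean."""
    return (x - mu) / sigma


# IQ of 148 on a test distributed N(105, 20).
z = z_score(148, 105, 20)
print(z)
```

A positive result means the score sits above the mean; here it is a bit more than two standard deviations above.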
Using Tables
• Ex. – Your IQ Z-SCORE was 2.15. What does it mean now?
Using Tables
• Ex. – If someone’s IQ was at the 10th percentile, what would their Z-SCORE be?
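Both table lookups can be cross-checked in software. The Python sketch below uses the standard library's `NormalDist` in place of a printed table: `cdf` goes from z-score to the area on the left, and `inv_cdf` goes from a percentile back to a z-score. Variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal, same as the z-table

# z = 2.15: what fraction of scores lie to the LEFT?
p_below = std_normal.cdf(2.15)
print(f"z = 2.15 is at about the {p_below:.2%} mark")

# 10th percentile: which z-score has 10% of the area to its left?
z_10th = std_normal.inv_cdf(0.10)
print(f"10th percentile is at z = {z_10th:.2f}")
```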
Using TI-83
• normalcdf(Xlower, Xupper, μ, σ): use to convert Raw-Score directly to probability
• normalcdf(Zlower, Zupper): use to convert z-Score to probability

***For Graphics use ShadeNorm (GTANG notes)

Using TI-83
• Test Empirical Rule (68-95-99.7)
• Find Normalcdf(-1,1), Normalcdf(-2,2), Normalcdf(-3,3)
• Ex. What percent of IQ Scores would fall between 100 and 110 Using N(105, 20)? What percent would be above 150?
• Normalcdf(100,110,105,20)
• Normalcdf(150,1000000000,105,20)
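For anyone without the calculator handy, the two IQ questions can be answered with a small stand-in for the TI-83 command. The Python function below is only an imitation written for this sketch (it is not the calculator's code); it mirrors the argument order lower, upper, mean, standard deviation, and reuses the slide's trick of a very large upper bound.

```python
from statistics import NormalDist


def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under N(mu, sigma) between lower and upper (TI-83 style arguments)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)


# Percent of IQ scores between 100 and 110 using N(105, 20).
p_between = normalcdf(100, 110, 105, 20)

# Percent above 150: use a huge upper bound, as on the calculator.
p_above = normalcdf(150, 1_000_000_000, 105, 20)

print(f"between 100 and 110: {p_between:.2%}")
print(f"above 150: {p_above:.2%}")
```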
Normality
• Just check Box and Whisker plot or Histogram on TI-83
• ALWAYS do this if raw data is given
• Sketch result and comment on it!