Statistics: Data Analysis and Presentation


Fr Clinic II

Overview
  • Tables and Graphs
  • Populations and Samples
  • Mean, Median, and Standard Deviation
  • Standard Error & 95% Confidence Interval (CI)
  • Error Bars
  • Comparing Means of Two Data Sets
  • Linear Regression (LR)
Warning
  • Statistics is a huge field; I’ve simplified considerably here. For example:
    • Mean, Median, and Standard Deviation
      • There are alternative formulas
    • Standard Error and the 95% Confidence Interval
      • There are other ways to calculate CIs (e.g., z statistic instead of t; difference between two means, rather than a single mean…)
    • Error Bars
      • Don’t go beyond the interpretations I give here!
    • Comparing Means of Two Data Sets
      • We cover only the t test for two means when the variances are unknown but equal; there are other tests
    • Linear Regression
      • We only look at simple LR and only calculate the intercept, slope and R². There is much more to LR!

Tables and Graphs
  • Tables: consistent format, title, units, big fonts; differentiate headings; number columns
    • Table 1: Average Turbidity and Color of Water Treated by Portable Water Filters [table not recovered]
  • Graphs: consistent format, title, units; good axis titles, big fonts
    • Figure 1: Turbidity of Pond Water, Treated and Untreated [figure not recovered]

Populations and Samples
  • Population
    • All of the possible outcomes of experiment or observation
      • US population
      • Particular type of steel beam
  • Sample
    • A finite number of outcomes measured or observations made
      • 1000 US citizens
      • 5 beams
  • We use samples to estimate population properties
    • Mean, Variability (e.g. standard deviation), Distribution
      • Height of 1000 US citizens used to estimate mean of US population
Mean and Median
  • Turbidity of Treated Water (NTU)

Mean = Sum of values divided by number of data points

= (1+3+3+6+8+10)/6

= 5.2 NTU







Median = The middle number

Rank - 1 2 3 4 5 6

Number - 1 3 3 6 8 10

For even number of sample points, average middle two

= (3+6)/2 = 4.5

Excel: Mean – AVERAGE; Median - MEDIAN
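
The mean and median above can be checked with a short Python sketch. It assumes the six turbidity readings shown on the slide and uses only the standard library's `statistics` module; the variable names are illustrative, not from the original deck.

```python
import statistics

# Turbidity of treated water (NTU), the six values from the slide
turbidity = [1, 3, 3, 6, 8, 10]

# Mean: sum of the values divided by the number of values
mean_ntu = statistics.mean(turbidity)

# Median: middle value; for an even count, the average of the middle two
median_ntu = statistics.median(turbidity)

print(f"Mean   = {mean_ntu:.1f} NTU")    # Mean   = 5.2 NTU
print(f"Median = {median_ntu:.1f} NTU")  # Median = 4.5 NTU
```

These correspond to Excel's AVERAGE and MEDIAN functions.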

  • Variance, s²: a measure of variability
    • Sum of the squared deviations about the mean, divided by the degrees of freedom (n − 1)

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

n = number of data points

Excel: variance – VAR





Standard Deviation, s
  • Square-root of the variance
  • For phenomena following a Normal Distribution (bell curve), 95% of population values lie within 1.96 standard deviations of the mean
  • Area under curve is probability of getting value within specified range

Excel: standard deviation – STDEV

[Figure: normal distribution curve; x-axis: Standard Deviations from Mean]
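
As a rough illustration, the sample variance and standard deviation can be computed by hand and cross-checked against Python's standard library. The sketch reuses the six turbidity readings from the Mean and Median slide as an assumed data set; it is not part of the original deck.

```python
import math
import statistics

# Turbidity of treated water (NTU), reused from the Mean and Median slide
turbidity = [1, 3, 3, 6, 8, 10]
n = len(turbidity)
mean_ntu = sum(turbidity) / n

# Sample variance: squared deviations about the mean, divided by n - 1
variance = sum((x - mean_ntu) ** 2 for x in turbidity) / (n - 1)

# Standard deviation: square root of the variance
std_dev = math.sqrt(variance)

# Cross-check against the standard library (both use n - 1)
assert math.isclose(variance, statistics.variance(turbidity))
assert math.isclose(std_dev, statistics.stdev(turbidity))

print(f"Variance           = {variance:.2f} NTU^2")
print(f"Standard deviation = {std_dev:.2f} NTU")
```

`statistics.variance` and `statistics.stdev` divide by n − 1, like Excel's VAR and STDEV.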

Standard Error of Mean
  • Standard error of the mean: $s_{\bar{x}} = s / \sqrt{n}$
    • For a sample of size n
    • taken from a population with standard deviation s
    • Estimate of the mean depends on the sample selected
    • As n increases, the variance of the mean estimate goes down, i.e., the estimate of the population mean improves
    • As n → ∞, the distribution of the mean estimate approaches normal, regardless of the population distribution
95% Confidence Interval (CI) for Mean
  • Interval within which we are 95% confident the true mean lies: $\bar{x} \pm t_{95\%,\,n-1} \, s / \sqrt{n}$
  • t95%,n−1 is the t-statistic for a 95% CI when the sample size is n
    • If n ≥ 30, let t95%,n−1 = 1.96 (Normal Distribution)
    • Otherwise, use Excel formula: TINV(0.05,n-1)
      • n = number of data points
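
A minimal sketch of the standard error and 95% CI calculation, assuming the six turbidity readings from the Mean and Median slide. The standard library has no t distribution, so the critical value 2.571 is an assumed lookup from a standard t table for n − 1 = 5 (the value TINV(0.05,5) is expected to return); the function name is illustrative.

```python
import math
import statistics

def mean_confidence_interval(data, t_crit):
    """Return (low, high, standard_error) for the mean of `data`.

    t_crit is the two-tailed t value for the desired confidence level
    and n - 1 degrees of freedom (use 1.96 for 95% when n >= 30).
    """
    n = len(data)
    mean = statistics.mean(data)
    std_err = statistics.stdev(data) / math.sqrt(n)  # s / sqrt(n)
    half_width = t_crit * std_err
    return mean - half_width, mean + half_width, std_err

# Turbidity of treated water (NTU), reused from the Mean and Median slide
turbidity = [1, 3, 3, 6, 8, 10]

# Assumed t-table value for a 95% CI with n - 1 = 5 degrees of freedom
T_CRIT_DF5 = 2.571

low, high, std_err = mean_confidence_interval(turbidity, T_CRIT_DF5)
print(f"Standard error = {std_err:.2f} NTU")
print(f"95% CI         = {low:.2f} to {high:.2f} NTU")
```

With only six points the interval is wide, roughly 1.6 to 8.8 NTU, which shows how strongly a small n inflates the CI.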
Error Bars
  • Show data variability on plot of mean values
  • Types of error bars include:
      • ± Standard Deviation, ± Standard Error, ± 95% CI
      • Maximum and minimum value
Using Error Bars to compare data
  • Standard Deviation
    • Demonstrates data variability, but no comparison possible
  • Standard Error
    • If bars overlap, any difference in means is not statistically significant
    • If bars do not overlap, indicates nothing!
  • 95% Confidence Interval
    • If bars overlap, indicates nothing!
    • If bars do not overlap, difference is statistically significant
  • We’ll use 95 % CI
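
The 95% CI rule above can be expressed as a small check. This is only a sketch of the slide's interpretation; the function names and the two intervals are hypothetical and do not come from the deck.

```python
def intervals_overlap(a, b):
    """True if the closed intervals a = (low, high) and b = (low, high) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]

def compare_95ci(ci_a, ci_b):
    """Apply the slide's rule for 95% CI error bars."""
    if intervals_overlap(ci_a, ci_b):
        return "bars overlap: indicates nothing"
    return "bars do not overlap: difference is statistically significant"

# Hypothetical 95% CIs (low, high) for mean turbidity of two filters, in NTU
filter_a_ci = (1.8, 2.2)
filter_b_ci = (2.7, 3.3)

print(compare_95ci(filter_a_ci, filter_b_ci))
```

Note that the function deliberately returns no verdict when the bars overlap, matching the warning not to go beyond the interpretations given here.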
Example 1

Create Bar Chart of Name vs Mean. Right click on data. Select “Format Data Series”.

What can we do?
  • Plot mean water quality data for various filters with error bars
  • Plot mean water quality over time with error bars
Comparing Filter Performance
  • Use the t test to determine if the means of two populations are different.
    • Based on two data sets
      • E.g., turbidity produced by two different filters
Comparing Two Data Sets using the t test
  • Example - You pump 20 gallons of water through filter 1 and 2. After every gallon, you measure the turbidity.
    • Filter 1: Mean = 2 NTU, s = 0.5 NTU, n = 20
    • Filter 2: Mean = 3 NTU, s = 0.6 NTU, n = 20
  • You ask the question - Do the Filters make water with a different mean turbidity?
Do the Filters make different water?
  • Use TTEST (Excel)
  • It returns the fractional probability of being wrong if you answer yes
    • We want this probability to be small: 0.01 to 0.10 (1 to 10%). Use 0.01
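
Excel's TTEST works on the raw data ranges. When only the summary statistics from the filter example are available, the equal-variance t statistic can be computed by hand, as in the sketch below. The function name is illustrative, and the critical value 2.712 is an assumed lookup from a standard t table (two-tailed, 0.01, 38 degrees of freedom), not something stated in the deck.

```python
import math

def pooled_t_statistic(mean1, s1, n1, mean2, s2, n2):
    """Two-sample t statistic assuming unknown but equal variances."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df

# Summary statistics from the filter example (NTU)
t_stat, df = pooled_t_statistic(2.0, 0.5, 20, 3.0, 0.6, 20)

# Assumed t-table value: two-tailed, probability 0.01, df = 38
T_CRIT = 2.712
means_differ = abs(t_stat) > T_CRIT

print(f"t = {t_stat:.2f}, df = {df}")
print("Means differ at the 0.01 level" if means_differ else "No significant difference")
```

Here |t| is about 5.7, well beyond the assumed critical value, so under these assumptions the two filters produce water with different mean turbidity.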
“t test” Questions
  • Do two filters make different water?
    • Take multiple measurements of a particular water quality parameter for 2 filters
  • Do two filters treat different amounts of water between cleanings?
    • Measure amount of water filtered between cleanings for two filters
  • Does the amount of water a filter treats between cleanings differ after a certain amount of water is treated?
    • For a single filter, measure the amount of water treated between cleanings before and after a certain total amount of water is treated
Linear Regression
  • Fit the best straight line to a data set

Right-click on data point and use “trendline” option. Use “options” tab to get equation and R².

R² - Coefficient of Multiple Determination

$$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$$

ŷi = Predicted y values, from regression equation

yi = Observed y values

ȳ = Mean of observed y values

R² = fraction of variance explained by regression (variance = standard deviation squared)

= 1 if data lie along a straight line
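
A minimal least-squares sketch of simple linear regression, computing the slope, intercept and R² from observed and predicted y values. The function name and the data points are hypothetical, chosen only to illustrate the calculation; Excel's trendline option reports the same three quantities.

```python
def simple_linear_regression(xs, ys):
    """Fit y = intercept + slope * x by least squares; return (slope, intercept, r_squared)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n

    # Slope and intercept of the best-fit straight line
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    # R^2: one minus residual sum of squares over total sum of squares
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared

# Hypothetical data: gallons filtered vs. turbidity (NTU)
gallons = [1, 2, 3, 4, 5]
turbidity = [2.1, 2.9, 4.2, 4.8, 6.0]

slope, intercept, r_squared = simple_linear_regression(gallons, turbidity)
print(f"y = {intercept:.2f} + {slope:.2f} x, R^2 = {r_squared:.3f}")
```

For points that fall exactly on a straight line, the residual sum of squares is zero and R² equals 1.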