BIOL2608 Biometrics 2011-2012 Computer lab session II

lorna


  1. BIOL2608 Biometrics 2011-2012 Computer lab session II: Basic concepts in statistics

  2. Measures of central tendency • Also known as measures of location • Indicate the location of the population/sample along the measurement scale • Useful for describing and comparing populations [Figure: measurement scale from 10.0 to 16.0 cm]

  3. Mean (= Arithmetic mean) • Commonly called the average • Sum of all measurements in the population/sample divided by the population/sample size • Mean = (10.5 + 11.5 × 2 + 12 + 12.5 + 13 × 3 + 13.5 × 2 + 14 + 14.5 + 15) / 13 = 12.88 cm [Figure: the 13 measurements on a scale from 10.0 to 16.0 cm]
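
The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the 13 values are the ones listed in the slide's worked example.

```python
# The 13 measurements (cm) from the slide's worked example.
lengths_cm = [10.5, 11.5, 11.5, 12.0, 12.5, 13.0, 13.0, 13.0,
              13.5, 13.5, 14.0, 14.5, 15.0]

# Mean = sum of all measurements divided by the number of measurements.
total = sum(lengths_cm)
sample_mean = total / len(lengths_cm)

print(f"sum = {total}, n = {len(lengths_cm)}, mean = {sample_mean:.2f} cm")
```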

  4. Median • Middle measurement in an ordered dataset • Ordered data: 10.5, 11.5, 11.5, 12.0, 12.5, 13.0, 13.0, 13.0, 13.5, 13.5, 14.0, 14.5, 15.0 • Median = the middle (7th) of the 13 measurements = 13.0 cm [Figure: the 13 measurements on a scale from 10.0 to 16.0 cm]
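
A minimal sketch of the same lookup in Python, assuming the 13 values from the slide. The data are sorted first, and the middle position is picked by index. `statistics.median` from the standard library does the same and also handles an even number of values by averaging the two middle ones.

```python
from statistics import median

# The 13 measurements (cm) from the slide, deliberately not in order.
lengths_cm = [13.0, 10.5, 14.5, 11.5, 13.5, 12.0, 13.0, 15.0,
              11.5, 12.5, 13.5, 14.0, 13.0]

ordered = sorted(lengths_cm)
middle_index = len(ordered) // 2      # 0-based index 6, i.e. the 7th value
middle_value = ordered[middle_index]

print("7th ordered value:", middle_value)
print("statistics.median:", median(lengths_cm))
```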

  5. Quartile • Divides an ordered dataset into four equal fractions • 1/4 of the data is smaller than the 1st quartile (Q1) • 1/4 lies between Q1 and Q2 • 1/4 lies between Q2 and Q3 • 1/4 is bigger than Q3 • Ordered data: 10.5, 11.5, 11.5, 12.0, 12.5, 13.0, 13.0, 13.0, 13.5, 13.5, 14.0, 14.5, 15.0 • Q1 = 11.63, Q2 = Median = 13.0, Q3 = 13.88
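
Quartiles that fall between two data points have to be interpolated, and software packages do not all use the same rule. The sketch below uses the two rules offered by Python's standard `statistics.quantiles`. Neither reproduces the slide's Q1 = 11.63 and Q3 = 13.88 exactly, which suggests the slide was produced with a different interpolation convention. The median (Q2) is the same under every rule.

```python
from statistics import quantiles

# The 13 measurements (cm) from the slide.
lengths_cm = [10.5, 11.5, 11.5, 12.0, 12.5, 13.0, 13.0, 13.0,
              13.5, 13.5, 14.0, 14.5, 15.0]

# "exclusive" places quartile i at position i * (n + 1) / 4 in the ordered data.
q_exclusive = quantiles(lengths_cm, n=4, method="exclusive")

# "inclusive" places quartile i at position i * (n - 1) / 4 + 1.
q_inclusive = quantiles(lengths_cm, n=4, method="inclusive")

print("exclusive rule: Q1, Q2, Q3 =", q_exclusive)
print("inclusive rule: Q1, Q2, Q3 =", q_inclusive)
```

When reporting quartiles it helps to state which rule or software was used, since small samples like this one are where the conventions disagree most.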

  6. Percentile • Divides an ordered dataset into 100 equal fractions • 25th percentile = 1st quartile • 50th percentile = 2nd quartile = median • 75th percentile = 3rd quartile
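
The correspondence between percentiles and quartiles can be checked numerically. This sketch assumes the same 13 measurements used in the earlier slides and the "exclusive" interpolation rule of `statistics.quantiles`. Other rules give slightly different values, but the identities hold as long as one rule is used throughout.

```python
from statistics import quantiles

lengths_cm = [10.5, 11.5, 11.5, 12.0, 12.5, 13.0, 13.0, 13.0,
              13.5, 13.5, 14.0, 14.5, 15.0]

# 99 cut points: percentiles[k - 1] is the k-th percentile.
percentiles = quantiles(lengths_cm, n=100, method="exclusive")
# 3 cut points: Q1, Q2, Q3.
q1, q2, q3 = quantiles(lengths_cm, n=4, method="exclusive")

print("25th percentile:", percentiles[24], " Q1:", q1)
print("50th percentile:", percentiles[49], " Q2:", q2)
print("75th percentile:", percentiles[74], " Q3:", q3)
```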

  7. Measures of dispersion and variability • Indicate how the measurements are spread around the center of the distribution [Figure: Sample A and Sample B on a scale from 10.0 to 16.0 cm]

  8. Variance and standard deviation [Figure: Sample A and Sample B on a scale from 10.0 to 16.0 cm]

  9. Population or sample? • Population • Entire collection of measurements in which one is interested

  10. Population or sample?

  11. Population or sample? • Population • Entire collection of measurements in which one is interested • Often large and hard to obtain all measurements • Sample • Subset of all measurements in the population

  12. Population or sample?

  13. Population or sample?

  14. Population or sample? [Figure: Population (very large size) → Sampling → Sample → Inference → Population]

  15. Commonly used symbols

  16. Estimation of mean • Confidence interval • Allows us to express the precision of the estimate of the population mean (μ) obtained from the sample mean (x̄) • When we say that μ = x̄ ± y at the 95% confidence level, it means that we are 95% confident that μ lies between x̄ − y and x̄ + y
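
The slide does not say how the half-width y is obtained. A common choice for a small sample is the t-based interval, y = t × s / √n, where s is the sample standard deviation and t is the two-tailed critical value for n − 1 degrees of freedom. The sketch below assumes that approach and treats the 13 measurements from the earlier slides as a sample. The critical value 2.179 (95%, 12 degrees of freedom) is read from a standard t-table rather than computed.

```python
import math
from statistics import mean, stdev

# Assumption: the 13 measurements (cm) from the earlier slides are a sample.
lengths_cm = [10.5, 11.5, 11.5, 12.0, 12.5, 13.0, 13.0, 13.0,
              13.5, 13.5, 14.0, 14.5, 15.0]

n = len(lengths_cm)
sample_mean = mean(lengths_cm)
sample_sd = stdev(lengths_cm)          # uses the n - 1 denominator

# Two-tailed 95% critical value of Student's t for n - 1 = 12 degrees of
# freedom, taken from a t-table (assumed here, not computed).
T_CRIT_95_DF12 = 2.179

standard_error = sample_sd / math.sqrt(n)
half_width = T_CRIT_95_DF12 * standard_error   # the "y" on the slide

lower = sample_mean - half_width
upper = sample_mean + half_width
print(f"mean = {sample_mean:.2f} cm, 95% CI = {lower:.2f} to {upper:.2f} cm")
```

A different sample size needs a different critical value, so the constant must be looked up again whenever n changes.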

  17. Estimation of variance and standard deviation • NOTE: • The variance and standard deviation of a population and of a sample are calculated using slightly different formulae
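
The difference between the two sets of formulae is the denominator: the population variance divides the sum of squared deviations by N, while the sample variance divides it by n − 1. A short sketch with Python's `statistics` module, applied to the 13 measurements from the earlier slides purely for illustration, shows both side by side.

```python
from statistics import pvariance, pstdev, variance, stdev

lengths_cm = [10.5, 11.5, 11.5, 12.0, 12.5, 13.0, 13.0, 13.0,
              13.5, 13.5, 14.0, 14.5, 15.0]
n = len(lengths_cm)

# Population formulae: sum of squared deviations divided by N.
pop_var = pvariance(lengths_cm)
pop_sd = pstdev(lengths_cm)

# Sample formulae: sum of squared deviations divided by n - 1.
samp_var = variance(lengths_cm)
samp_sd = stdev(lengths_cm)

print(f"population: variance = {pop_var:.3f}, SD = {pop_sd:.3f}")
print(f"sample:     variance = {samp_var:.3f}, SD = {samp_sd:.3f}")
```

The sample values are always slightly larger, and the gap shrinks as n grows.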

  18. Normal distribution • A very common bell-shaped statistical distribution of data, which allows us to carry out different statistical analyses

  19. Normality check • 6 criteria: histogram, skewness, kurtosis, box plot, P-P / Q-Q plot, K-S one-sample test

  20. Normality check • 6 criteria:

  21. Histogram • Bin: the ideal bin size is obtained by dividing the range by the ideal number of bins, k = 5 log n, where n is the sample size
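
The bin rule can be sketched as follows. Two points are assumptions rather than statements from the slide: the logarithm is taken to base 10, and the number of bins is rounded to the nearest whole number. The helper name `bin_plan` is hypothetical.

```python
import math

def bin_plan(values):
    """Return (number_of_bins, bin_width) using the 5 * log10(n) rule.

    Assumptions: base-10 logarithm, bin count rounded to the nearest integer.
    """
    n = len(values)
    number_of_bins = max(1, round(5 * math.log10(n)))
    data_range = max(values) - min(values)
    return number_of_bins, data_range / number_of_bins

# The 13 measurements (cm) from the earlier slides.
lengths_cm = [10.5, 11.5, 11.5, 12.0, 12.5, 13.0, 13.0, 13.0,
              13.5, 13.5, 14.0, 14.5, 15.0]

bins, width = bin_plan(lengths_cm)
print(f"{bins} bins, each {width} cm wide")
```

In practice the width is usually rounded to a convenient value so that bin boundaries are easy to read.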

  22. Normality check • 6 criteria:

  23. Skewness • Negative skew • longer left tail • data concentrated on the right • Positive skew • longer right tail • data concentrated on the left

  24. Kurtosis • Measure of “peakedness” and “tailedness” • Positive kurtosis (leptokurtic) • More acute peak around mean • Longer, fatter tails • Negative kurtosis (platykurtic) • Lower, wider peak around mean • Shorter, thinner tails
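
Skewness and kurtosis can both be computed from the central moments of the data. The sketch below uses the simple moment-based definitions (third and fourth standardized moments, with 3 subtracted from the latter so that a normal distribution scores 0). Statistical packages often apply small-sample corrections, so their output may differ slightly. The function names are illustrative.

```python
def central_moment(values, order):
    """Average of (value - mean) ** order over all values."""
    n = len(values)
    centre = sum(values) / n
    return sum((v - centre) ** order for v in values) / n

def skewness(values):
    """Moment-based skewness: negative = longer left tail, positive = longer right tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Moment-based kurtosis minus 3: positive = leptokurtic, negative = platykurtic."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0

# The 13 measurements (cm) from the earlier slides.
lengths_cm = [10.5, 11.5, 11.5, 12.0, 12.5, 13.0, 13.0, 13.0,
              13.5, 13.5, 14.0, 14.5, 15.0]

print(f"skewness = {skewness(lengths_cm):.3f}")
print(f"excess kurtosis = {excess_kurtosis(lengths_cm):.3f}")
```

For these measurements the skewness comes out slightly negative, which matches the single low value (10.5 cm) stretching the left tail.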

  25. Normality check • 6 criteria:

  26. Box plot

  27. Normality check • 6 criteria:

  28. P-P Plot / Q-Q Plot

  29. Normality check • 6 criteria:

  30. K-S one-sample test

  31. Related Readings • Zar, J. H. (1999). Biostatistical Analysis, 4th edition. New Jersey: Prentice-Hall. • Chapters 2, 3, 4, 6
