Inferential Statistics

# Inferential Statistics

##### Presentation Transcript

1. Inferential Statistics. Research is about trying to make valid inferences. • Inferential statistics: The part of statistics that allows researchers to generalize their findings beyond the data collected. • Statistical inference: A procedure for making inferences or generalizations about a larger population from a sample of that population.

2. How Statistical Inference Works

3. Basic Terminology • Population (statistical population): Any collection of entities that have at least one characteristic in common; a collection (an aggregate) of measurements about which an inference is desired; everything you wish to study. • Parameter: A number that describes a characteristic of the scores in the population (mean, variance, standard deviation, correlation coefficient, etc.)

4. Body Weight Data (kg): N = 28, μ = 44, σ² = 1.214. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46
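The parameters quoted on this slide can be checked directly from the 28 values. The following is a minimal sketch using Python's standard library; the variable names are illustrative and not part of the original slides.

```python
from statistics import fmean, pvariance

# Body weights (kg) of the whole population, copied from the slide.
population = [44, 45, 43, 44, 44, 43, 42, 46, 44, 44, 44, 46, 43, 44,
              44, 43, 42, 44, 43, 44, 43, 46, 44, 43, 44, 45, 45, 46]

N = len(population)                    # population size
mu = fmean(population)                 # population mean
sigma_sq = pvariance(population, mu)   # population variance: divides by N, not N - 1

print(N, mu, round(sigma_sq, 3))       # 28 44.0 1.214
```

Because every member of the population is measured, the variance divides by N; the n - 1 divisor applies only to samples.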

5. Basic Terminology • Sample: A part of the population; a finite number of measurements chosen from a population. • Statistics: The numbers that describe characteristics of the scores in the sample (mean, variance, standard deviation, correlation coefficient, reliability coefficient, etc.)

6. Body Weight Data (kg): n = 1 value. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46. X: student body weight. x1: 43

7. Body Weight Data (kg): n = 2 values. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46. X: student body weight. x1: 43, x2: 44

8. Body Weight Data (kg): n = 3 values. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46. X: student body weight. x1: 43, x2: 44, x3: 45

9. Body Weight Data (kg): n = 4 values. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46. X: student body weight. x1: 43, x2: 44, x3: 45, x4: 44

10. Body Weight Data (kg): n = 5 values. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46. A simple random sample: a sample that has been selected in such a way that all members of the population have an equal chance of being picked. x1: 43, x2: 44, x3: 45, x4: 44, x5: 44
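Drawing a simple random sample can be sketched with the standard library as follows. The fixed seed is an assumption made only so the sketch is repeatable; the values it draws will generally differ from the ones shown on the slide.

```python
import random

# Body weights (kg) of the whole population, copied from the slides.
population = [44, 45, 43, 44, 44, 43, 42, 46, 44, 44, 44, 46, 43, 44,
              44, 43, 42, 44, 43, 44, 43, 46, 44, 43, 44, 45, 45, 46]

rng = random.Random(0)   # ASSUMPTION: seed chosen only for reproducibility

# random.sample draws without replacement, and every member of the
# population has the same chance of being picked.
sample = rng.sample(population, k=5)

print(sample)
```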

11. Basic concepts of statistics • Measures of central tendency • Measures of dispersion & variability

12. Measures of central tendency: arithmetic mean (= simple average) • The best estimate of the population mean is the sample mean, $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$, where $X_i$ is a measurement drawn from the population, $\sum$ denotes summation, $n$ is the sample size, and $i$ is the index of the measurement.

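13. Measures of variability: all describe how “spread out” the data are • Sum of squares: the sum of squared deviations from the mean • For a sample, $SS = \sum_{i=1}^{n} (X_i - \bar{X})^2$

14. Why? • Average or mean sum of squares = variance, $s^2$ • For a sample, $s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$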
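As a worked instance of the sample mean, sum of squares, and sample variance, the sketch below uses the five values drawn on slide 10. The variable names are illustrative.

```python
# The simple random sample of n = 5 body weights (kg) from slide 10.
sample = [43, 44, 45, 44, 44]

n = len(sample)
x_bar = sum(sample) / n                      # sample mean
ss = sum((x - x_bar) ** 2 for x in sample)   # sum of squared deviations
s_sq = ss / (n - 1)                          # sample variance (divisor n - 1)

print(x_bar, ss, s_sq)                       # 44.0 2.0 0.5
```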

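15. $n - 1$ represents the degrees of freedom, $\nu$ (Greek letter “nu”), or the number of independent quantities in the estimate $s^2$. • The deviations from the sample mean always sum to zero, $\sum_{i=1}^{n} (X_i - \bar{X}) = 0$; therefore, once $n - 1$ of the deviations are specified, the last deviation is already determined.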
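The degrees-of-freedom argument can be illustrated numerically. This is a small sketch using the sample from slide 10; it recovers the last deviation from the other n - 1.

```python
# The simple random sample of n = 5 body weights (kg) from slide 10.
sample = [43, 44, 45, 44, 44]

x_bar = sum(sample) / len(sample)
deviations = [x - x_bar for x in sample]

# Deviations from the sample mean always sum to zero ...
total = sum(deviations)

# ... so the last deviation is fixed by the other n - 1 deviations.
last_from_others = -sum(deviations[:-1])

print(deviations, total, last_from_others == deviations[-1])
```

Only n - 1 of the deviations are free to vary, which is why the sample variance divides by n - 1.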

16. Standard deviation, $s$ • Variance has squared measurement units; to regain the original units, take the square root. • For a sample, $s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$

17. Standard error of the mean • The standard error of the mean is a measure of variability among the means of repeated samples from a population. • For a sample, $s_{\bar{X}} = \frac{s}{\sqrt{n}}$
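The standard deviation and the standard error of the mean follow from the sample variance. Here is a minimal sketch, again using the sample from slide 10.

```python
import math

# The simple random sample of n = 5 body weights (kg) from slide 10.
sample = [43, 44, 45, 44, 44]

n = len(sample)
x_bar = sum(sample) / n
s_sq = sum((x - x_bar) ** 2 for x in sample) / (n - 1)   # sample variance

s = math.sqrt(s_sq)        # standard deviation, back in kg
se = s / math.sqrt(n)      # standard error of the mean

print(round(s, 4), round(se, 4))   # 0.7071 0.3162
```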

18. Basic Statistical Symbols

19. Body Weight Data (kg): N = 28, μ = 44, σ² = 1.214. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46

20. Body Weight Data (kg): repeated random sampling, each sample of size n = 5. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46. Sample 1: 43

21. Body Weight Data (kg): repeated random sampling, each sample of size n = 5. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46. Sample 1: 43 44

22. Body Weight Data (kg): repeated random sampling, each sample of size n = 5. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46. Sample 1: 43 44 45

23. Body Weight Data (kg): repeated random sampling, each sample of size n = 5. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46. Sample 1: 43 44 45 44

24. Body Weight Data (kg): repeated random sampling, each sample of size n = 5. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46. Sample 1: 43 44 45 44 44

25. Body Weight Data (kg): repeated random sampling, each sample of size n = 5. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46

26. Body Weight Data (kg): repeated random samples, each of size n = 5. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46. Sample 2: 46

27. Body Weight Data (kg): repeated random samples, each of size n = 5. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46. Sample 2: 46 44

28. Body Weight Data (kg): repeated random samples, each of size n = 5. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46. Sample 2: 46 44 46

29. Body Weight Data (kg): repeated random samples, each of size n = 5. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46. Sample 2: 46 44 46 45

30. Body Weight Data (kg): repeated random samples, each of size n = 5. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46. Sample 2: 46 44 46 45 44

31. Body Weight Data (kg): repeated random samples, each of size n = 5. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46

32. Body Weight Data (kg): repeated random samples, each of size n = 5. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46. Sample 3: 42

33. Body Weight Data (kg): repeated random samples, each of size n = 5. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46. Sample 3: 42 42

34. Body Weight Data (kg): repeated random samples, each of size n = 5. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46. Sample 3: 42 42 43

35. Body Weight Data (kg): repeated random samples, each of size n = 5. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46. Sample 3: 42 42 43 45

36. Body Weight Data (kg): repeated random samples, each of size n = 5. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46. Sample 3: 42 42 43 45 43

37. Body Weight Data (kg): repeated random samples, each of size n = 5. A Population of Values: 44 45 43 44 44 43 42 46 44 44 44 46 43 44 44 43 42 44 43 44 43 46 44 43 44 45 45 46
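The three samples drawn across slides 20 to 36 each have their own mean, and these means vary around the population mean of 44. The sketch below computes them; the dictionary keys are illustrative labels, not names from the slides.

```python
# The three samples of n = 5 body weights (kg) shown on slides 24, 30 and 36.
samples = {
    "sample 1": [43, 44, 45, 44, 44],
    "sample 2": [46, 44, 46, 45, 44],
    "sample 3": [42, 42, 43, 45, 43],
}

# One mean per sample; together they form a (very small) set of sample means.
sample_means = {name: sum(values) / len(values) for name, values in samples.items()}

print(sample_means)
# {'sample 1': 44.0, 'sample 2': 45.0, 'sample 3': 43.0}
```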

38. Summary

39. For a large enough number of large samples, the frequency distribution of the sample means (the sampling distribution) approaches a normal distribution.
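A sampling distribution can be approximated by simulation. The sketch below draws many simple random samples of n = 5 from the body-weight population and summarizes their means. The number of repetitions (10,000) and the seed are assumptions chosen for illustration.

```python
import random
from statistics import fmean, pstdev

# Body weights (kg) of the whole population, copied from the slides.
population = [44, 45, 43, 44, 44, 43, 42, 46, 44, 44, 44, 46, 43, 44,
              44, 43, 42, 44, 43, 44, 43, 46, 44, 43, 44, 45, 45, 46]

rng = random.Random(42)   # ASSUMPTION: seed chosen only for reproducibility
n = 5                     # sample size used on the slides
repeats = 10_000          # ASSUMPTION: number of repeated samples

# Mean of each repeated simple random sample (drawn without replacement).
sample_means = [fmean(rng.sample(population, n)) for _ in range(repeats)]

mean_of_means = fmean(sample_means)   # should sit close to mu = 44
sd_of_means = pstdev(sample_means)    # spread of the sampling distribution

print(round(mean_of_means, 2), round(sd_of_means, 2))
```

The sample means cluster around the population mean and are less spread out than the individual body weights.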

40. Normal distribution: bell-shaped curve

41. Testing statistical hypotheses between 2 means • State the research question in terms of statistical hypotheses. Always start with a statement that hypothesizes “no difference”, called the null hypothesis, H0. • H0: The mean height of female students is equal to the mean height of male students.

42. Then formulate a statement that must be true if the null hypothesis is false, called the alternate hypothesis, HA. • HA: The mean height of female students is not equal to the mean height of male students. If we reject H0 as a result of sample evidence, then we conclude that HA is true.

43. William Sealy Gosset (“Student”) • Choose an appropriate statistical test that would allow you to reject H0 if H0 were false, e.g., Student’s t test for hypotheses about means.

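44. t statistic: $t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{X}_1 - \bar{X}_2}}$, where $\bar{X}_1$ is the mean of sample 1, $\bar{X}_2$ is the mean of sample 2, and $s_{\bar{X}_1 - \bar{X}_2}$ is the standard error of the difference between the sample means. To estimate $s_{\bar{X}_1 - \bar{X}_2}$, we must first know the relation between both populations.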
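As one possible illustration of the t statistic, the sketch below assumes independent populations with identical variance, so the standard error of the difference comes from a pooled variance. That assumption is ours; the slide does not fix it. The data are samples 2 and 3 from the repeated-sampling slides, reused here only as convenient numbers.

```python
import math

# Samples 2 and 3 (kg) from the repeated-sampling slides, reused as example data.
sample_1 = [46, 44, 46, 45, 44]
sample_2 = [42, 42, 43, 45, 43]

def mean(values):
    return sum(values) / len(values)

def sum_of_squares(values):
    m = mean(values)
    return sum((x - m) ** 2 for x in values)

n1, n2 = len(sample_1), len(sample_2)
df = n1 + n2 - 2

# ASSUMPTION: independent populations with identical variance (pooled variance).
pooled_var = (sum_of_squares(sample_1) + sum_of_squares(sample_2)) / df
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))   # standard error of the difference

t = (mean(sample_1) - mean(sample_2)) / se_diff

print(round(t, 3), df)
```

With dependent populations or unequal variances the standard error is estimated differently, so the denominator changes while the numerator stays the same.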

45. How to evaluate the success of this experimental design class • Compare the scores in statistics and in experimental design of several students • Compare the scores in experimental design of several students from two serial classes • Compare the scores in experimental design of several students from two different classes

46. 1. Comparing the scores in statistics and in experimental design of several students • Same students: dependent populations • Different students: independent populations, with identical variance or not identical variance

47. 2. Comparing the scores in experimental design of several students from two serial classes • Different students: independent populations, with identical variance or not identical variance

48. 3. Comparing the scores in experimental design of several students from two classes • Different students: independent populations, with identical variance or not identical variance

49. Relation between populations • Dependent populations • Independent populations • Identical (homogeneous) variance • Not identical (heterogeneous) variance

50. Dependent Populations • Sample • Null hypothesis: The mean difference is equal to 0 • Null distribution: t with n - 1 df (n is the number of pairs) • Test statistic • Compare: How unusual is this test statistic? • P < 0.05: Reject H0 • P > 0.05: Fail to reject H0
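The paired procedure on this slide can be sketched as follows. The scores are hypothetical values invented for illustration, not data from the slides. Instead of computing P, the sketch compares the test statistic with 2.776, the standard two-sided critical value of t for 4 df at the 0.05 level, which is an equivalent decision rule.

```python
import math

# HYPOTHETICAL scores for the same five students (not from the slides).
statistics_scores = [70, 65, 80, 75, 60]
design_scores = [74, 68, 79, 80, 66]

# Dependent populations: work with the per-student differences.
differences = [b - a for a, b in zip(statistics_scores, design_scores)]

n = len(differences)      # number of pairs
df = n - 1
d_bar = sum(differences) / n
s_d = math.sqrt(sum((d - d_bar) ** 2 for d in differences) / df)

# Test statistic under H0: the mean difference is equal to 0.
t = d_bar / (s_d / math.sqrt(n))

T_CRIT = 2.776            # two-sided critical t for df = 4, alpha = 0.05 (t table)
reject_h0 = abs(t) > T_CRIT   # equivalent to P < 0.05

print(round(t, 3), df, reject_h0)
```

A statistics package would normally report the P value directly; the critical-value comparison is used here only to keep the sketch within the standard library.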