Statistics 04

Statistics 04. ANOVA. Analysis of Variance (ANOVA). Z test or t test is used to test whether two sample means are sufficiently different to indicate the samples are from populations with different population means.

alcina

Presentation Transcript


  1. Statistics 04 ANOVA

  2. Analysis of Variance (ANOVA) • A Z test or t test is used to test whether two sample means are sufficiently different to indicate that the samples come from populations with different means. • When more than two groups are involved, the inference relies on ANOVA.

  3. Cases of more than two groups • Vocabulary test of candidates from four different regions (Europe, South America, North Africa, Far East) • Different parts of a test (listening, reading, vocabulary, Cloze, translation) • Different teaching methods (three textbooks)

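  4. Problems with Z test • 1. Tedious computation: number of pairwise comparisons = N(N-1)/2, where N is the number of groups • Vocabulary test on regions: 4(4-1)/2 = 6 • Europe : South America • Europe : North Africa • Europe : Far East • South America : North Africa • South America : Far East • North Africa : Far East • 2. Greater Type I error: with c comparisons, each tested at level α, the chance of at least one false rejection rises to about $1 - (1 - \alpha)^c$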
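The two problems on slide 4 can be checked numerically. The sketch below is illustrative only: the function names are made up here, and the error formula assumes the pairwise tests are independent, which is only an approximation when they share the same samples.

```python
from math import comb


def pairwise_comparisons(groups: int) -> int:
    """Number of distinct pairs among `groups` groups, i.e. N(N-1)/2."""
    return comb(groups, 2)


def familywise_error(alpha: float, comparisons: int) -> float:
    """Chance of at least one false rejection across `comparisons` tests.

    Assumes the tests are independent (an approximation for pairwise tests).
    """
    return 1 - (1 - alpha) ** comparisons


c = pairwise_comparisons(4)                   # four regions
print(c)                                      # 6
print(round(familywise_error(0.05, c), 3))    # 0.265
```

With four regions and α = 0.05, the chance of at least one spurious "significant" difference is roughly 26%, not 5%.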

  5. Principles of ANOVA • Two kinds of differences in a test: systematic differences and random errors • Systematic differences are caused by different experimental conditions. • Random errors are caused by any factors other than the experimental conditions. • The variance between groups reflects systematic differences (on top of random error). • The variance within groups reflects random error only.

  6. Principles of ANOVA • Under the null hypothesis, the ratio of these two variances follows the F distribution. • $F = S_b^2 / S_w^2$ • Null hypothesis: $S_b^2$ is not larger than $S_w^2$ (the group means do not differ) • Large values of the F statistic cast doubt on the null hypothesis.

  7. Types of ANOVA • One-way ANOVA: the comparison of the means of groups classified according to a single criterion variable. • Two-way ANOVA: the comparison of means when the scores are affected by two factors.

  8. Calculation of ANOVA • $F = S_b^2 / S_w^2$ • $S_b^2$ ($MS_b$): mean between-groups sum of squares • $S_w^2$ ($MS_w$): mean within-groups sum of squares • BSS ($SS_b$): between-groups sum of squares • RSS ($SS_w$): within-groups sum of squares, or residual sum of squares • TSS ($SS_t$): total sum of squares • TSS = BSS + RSS • $S_b^2 = SS_b / df_b$ • $S_w^2 = SS_w / df_w$

  9. Example

  10. Total sum of Squares (TSS) • Need to compute: $\sum X$, CF, $\sum X^2$, $X_j$ (group totals)

  11. Computation of $\sum X$ • $\sum X$: the sum of all observations • $\sum X = X_{1,1} + X_{2,1} + \dots + X_{1,2} + X_{2,2} + \dots + X_{i,j}$ • Example • $\sum X = 10 + 12 + \dots + 10 + 14 + \dots + 8 + 15 = 180$

  12. Computation of Correction Factor (CF) • $CF = (\sum X)^2 / mn$ • m: the number of samples • n: the size of each sample • Example: • $(\sum X)^2 = 180^2 = 32400$ • $CF = 32400 / (3 \times 5) = 2160$

  13. Computation of $\sum X^2$ • $\sum X^2$: the sum of the squared observations • $\sum X^2 = X_{1,1}^2 + X_{2,1}^2 + \dots + X_{1,2}^2 + X_{2,2}^2 + \dots + X_{i,j}^2$ • Example: • $\sum X^2 = 10^2 + 12^2 + \dots + 10^2 + 14^2 + \dots + 8^2 + 15^2 = 2352$

  14. Computation of TSS • $TSS = \sum X^2 - CF$ (Woods) • $SS_t = \sum X^2 - (\sum X)^2 / N$ (where N = mn) (Han Baocheng) • The sum of all squared observations minus the correction factor • Example: • TSS = 2352 - 2160 = 192

  15. Computation of BSS • $BSS = \sum X_j^2 / n - CF$ (Woods) • $SS_b = \sum T^2 / n - (\sum X)^2 / N$ (where T is the total of a group and N = mn) (Han Baocheng) • The sum of the squared group totals divided by the sample size (all samples are of the same size), minus the correction factor • Example: • $BSS = (50^2 + 55^2 + 75^2) / 5 - 2160 = 11150 / 5 - 2160 = 2230 - 2160 = 70$

  16. Computation of RSS • RSS=TSS-BSS • Example • RSS=192-70=122
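The sums of squares on slides 12 to 16 can be reproduced from the summary quantities the slides give (group totals 50, 55, 75; ΣX² = 2352; n = 5). The sketch below is illustrative; the function name and the returned dictionary are choices made here, not part of the slides.

```python
def sums_of_squares(group_totals, sum_sq, n):
    """One-way ANOVA sums of squares for equal-sized groups.

    group_totals: total score of each group
    sum_sq:       sum of all squared observations
    n:            size of each group
    """
    m = len(group_totals)                  # number of groups
    grand_total = sum(group_totals)        # sum of all observations
    cf = grand_total ** 2 / (m * n)        # correction factor
    tss = sum_sq - cf                      # total sum of squares
    bss = sum(t ** 2 for t in group_totals) / n - cf   # between groups
    rss = tss - bss                        # within groups (residual)
    return {"CF": cf, "TSS": tss, "BSS": bss, "RSS": rss}


result = sums_of_squares([50, 55, 75], 2352, 5)
print(result)  # {'CF': 2160.0, 'TSS': 192.0, 'BSS': 70.0, 'RSS': 122.0}
```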

  17. Computation of Degree of Freedom • dft: degree of freedom of the total • dft = mn - 1 • the product of the sample size and the number of samples, minus 1 • Example: dft = mn - 1 = 3*5 - 1 = 14

  18. Computation of Degree of Freedom • dfb: degree of freedom of the between-groups • dfb=m-1 • the number of samples minus 1 • Example: dfb=m-1=3-1=2

  19. Computation of Degree of Freedom • dfw: degree of freedom of the within-group • dfw = m(n-1) • the number of samples times (the sample size minus 1) • Example: dfw = m(n-1) = 3*(5-1) = 12

  20. Computation of $S_b^2$ and $S_w^2$ (mean sums of squares) • $S_b^2 = BSS / df_b$ • Example: $S_b^2 = 70 / 2 = 35$

  21. Computation of $S_b^2$ and $S_w^2$ (mean sums of squares) • $S_w^2 = RSS / df_w$ • Example: $S_w^2 = 122 / 12 = 10.17$

  22. Computation of F-ratio • $F = S_b^2 / S_w^2$ • Example: F = 35 / 10.17 = 3.44
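Slides 17 to 22 chain together degrees of freedom, mean squares and the F-ratio. A minimal sketch of that chain, using the BSS and RSS values from the worked example, might look as follows; the function name and return order are illustrative.

```python
def f_ratio(bss, rss, m, n):
    """F-ratio for a one-way ANOVA with m groups of n observations each."""
    df_b = m - 1               # between-groups degrees of freedom
    df_w = m * (n - 1)         # within-groups degrees of freedom
    ms_b = bss / df_b          # between-groups mean square
    ms_w = rss / df_w          # within-groups mean square
    return df_b, df_w, ms_b, ms_w, ms_b / ms_w


df_b, df_w, ms_b, ms_w, f = f_ratio(70, 122, 3, 5)
print(df_b, df_w, ms_b, round(ms_w, 2), round(f, 2))  # 2 12 35.0 10.17 3.44
```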

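  23. Inference for the significant difference • Look up $F_\alpha(m-1, m(n-1))$ in the table of the F distribution • e.g., with m = 3 and n = 10: $F_{0.05}(2, 3 \times (10-1)) = F_{0.05}(2, 27) = 3.35$ • (Han Baocheng: p. 192, numerator df 2, denominator df 27) • (Woods: p. 304, n1 = 2, n2 = 30) • For the worked example (m = 3, n = 5) the degrees of freedom are (2, 12), and $F_{0.05}(2, 12) = 3.89$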

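  24. Compare F with $F_\alpha(m-1, m(n-1))$ • e.g. F = 3.44 with m = 3, n = 5 • $F_{0.05}(2, 12) = 3.89$ • $F < F_{0.05}(2, 12)$ • Conclusion: p > 0.05 ($H_0$ not rejected) • Against $F_{0.05}(2, 27) = 3.35$, which applies only when n = 10, the same F would be significant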
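The table lookup can be cross-checked without a statistics library in the special case used here. When the numerator has 2 degrees of freedom, the upper-tail probability of the F distribution has the closed form $P(F > x) = (1 + 2x/df_2)^{-df_2/2}$. The sketch below relies on that special case only and does not apply to other numerator degrees of freedom; the function names are made up for illustration.

```python
def f_upper_tail_df1_2(x, df2):
    """P(F > x) for an F distribution with (2, df2) degrees of freedom."""
    return (1 + 2 * x / df2) ** (-df2 / 2)


def f_critical_df1_2(alpha, df2):
    """Value exceeded with probability alpha under F(2, df2)."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)


print(round(f_critical_df1_2(0.05, 27), 2))   # 3.35, the tabled value for (2, 27)
print(round(f_critical_df1_2(0.05, 12), 2))   # 3.89, for dfb = 2 and dfw = 12
print(round(f_upper_tail_df1_2(3.44, 12), 3)) # 0.066, p-value of F = 3.44 on (2, 12)
```

With the degrees of freedom computed on slides 18 and 19 (2 and 12), F = 3.44 falls just short of the 5% critical value.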

  25. ANOVA Table (English)

     | Source | df | SS | MSS | F-ratio | Significance level |
     |---|---|---|---|---|---|
     | Between groups | m-1 | BSS | $S_b^2$ | $S_b^2 / S_w^2$ | p<α |
     | Within groups (residual) | m(n-1) | RSS | $S_w^2$ | | |
     | Total | mn-1 | TSS | | | |

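  26. ANOVA Table (Chinese terms with English glosses)

     | 变异来源 (Source of variation) | 平方和 (Sum of squares) | 自由度 (df) | 均方 (Mean square) | F | 显著性水平 (Significance level) |
     |---|---|---|---|---|---|
     | 组间 (Between groups) | BSS | m-1 | $S_b^2$ | $S_b^2 / S_w^2$ | p<α |
     | 组内 (Within groups) | RSS | m(n-1) | $S_w^2$ | | |
     | 总变异 (Total variation) | TSS | mn-1 | | | |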

  27. ANOVA Table (Example)

     | Source | df | SS | MSS | F-ratio | Significance level |
     |---|---|---|---|---|---|
     | Between groups | 2 | 70 | 35 | 3.44 | p>0.05 |
     | Within groups (residual) | 12 | 122 | 10.17 | | |
     | Total | 14 | 192 | | | |

  28. Steps of the Computation • 1. Computation of sums of squares: TSS, BSS, RSS • 2. Determination of degrees of freedom: dft, dfb, dfw • 3. Computation of mean sums of squares: $S_b^2$, $S_w^2$ • 4. F testing: F, $F_\alpha(m-1, m(n-1))$ • 5. Output an ANOVA table

  29. Homogeneity of variances • $F_{max} = S_{max}^2 / S_{min}^2$ • Check the table of critical values of $F_{max}$ (Han Baocheng: p. 198) • If $F_{max} > F_{max,\alpha}$, the variances are not homogeneous. • If $F_{max} < F_{max,\alpha}$, there is no significant difference among the variances.
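The Fmax statistic on slide 29 is the largest group variance divided by the smallest. The sketch below uses made-up scores, since the slides do not list the raw data, and assumes the sample variance with the n-1 divisor; the critical value still has to come from an Fmax table.

```python
from statistics import variance


def f_max(groups):
    """Ratio of the largest to the smallest sample variance (n-1 divisor)."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)


# Hypothetical scores for three groups of five, for illustration only.
groups = [
    [3, 5, 4, 6, 7],
    [2, 8, 5, 9, 6],
    [4, 4, 5, 6, 6],
]
print(f_max(groups))  # 7.5
```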

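  30. ANOVA for a completely randomized design • ANOVA for a randomized block design • Comparisons among multiple means

  31. ANOVA for a completely randomized design • Equal sample sizes • Unequal sample sizes

  32. Equal sample sizes • 5 steps • 1. Compute $\sum X$, $(\sum X)^2$, $\sum X^2$, k, n, N (N = kn, where k is the number of groups, called m on the earlier slides) • 2. Compute the sums of squares, i.e. sums of squared deviations (total, between-groups, within-groups) • $SS_t = \sum X^2 - (\sum X)^2 / N$ • $SS_b = \sum T^2 / n - (\sum X)^2 / N$ (where T is the total of a group) • $SS_w = SS_t - SS_b$

  33. Equal sample sizes • 3. Determine the degrees of freedom • $df_t = N - 1$ • $df_b = k - 1$ • $df_w = df_t - df_b$ • 4. Compute the mean sums of squares • $S_b^2 = SS_b / df_b$ • $S_w^2 = SS_w / df_w$ • 5. Compute the F ratio • $F = S_b^2 / S_w^2$

  34. Unequal sample sizes • 5 steps • 1. Compute $\sum X$, $(\sum X)^2$, $\sum X^2$, k, the size n of each group, and N (here N is the sum of the group sizes, not kn) • 2. Compute the sums of squares, i.e. sums of squared deviations (total, between-groups, within-groups) • $SS_t = \sum X^2 - (\sum X)^2 / N$ • $SS_b = \sum (T^2 / n) - (\sum X)^2 / N$ (where T is the total of a group and n is the size of that group) • $SS_w = SS_t - SS_b$

  35. Unequal sample sizes • 3. Determine the degrees of freedom • $df_t = N - 1$ • $df_b = k - 1$ • $df_w = df_t - df_b$ • 4. Compute the mean sums of squares • $S_b^2 = SS_b / df_b$ • $S_w^2 = SS_w / df_w$ • 5. Compute the F ratio • $F = S_b^2 / S_w^2$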
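The five steps for unequal sample sizes can be sketched directly from raw scores. The only change from the equal-size case is that each squared group total is divided by that group's own size. The data below are hypothetical and the function name is illustrative.

```python
def one_way_anova(groups):
    """One-way ANOVA from raw scores; group sizes may differ."""
    k = len(groups)                                   # number of groups
    big_n = sum(len(g) for g in groups)               # total observations
    sum_x = sum(sum(g) for g in groups)               # sum of all scores
    sum_x2 = sum(x * x for g in groups for x in g)    # sum of squared scores
    cf = sum_x ** 2 / big_n                           # correction factor
    ss_t = sum_x2 - cf
    # Each group total is squared and divided by that group's own size.
    ss_b = sum(sum(g) ** 2 / len(g) for g in groups) - cf
    ss_w = ss_t - ss_b
    df_t, df_b = big_n - 1, k - 1
    df_w = df_t - df_b
    ms_b, ms_w = ss_b / df_b, ss_w / df_w
    return {"SSt": ss_t, "SSb": ss_b, "SSw": ss_w,
            "dfb": df_b, "dfw": df_w, "F": ms_b / ms_w}


# Hypothetical groups of sizes 3, 2 and 4, for illustration only.
result = one_way_anova([[2, 4, 6], [5, 7], [8, 10, 12, 14]])
print(result["dfb"], result["dfw"], round(result["F"], 2))  # 2 6 9.02
```

The same function also covers the equal-size case, because dividing each squared total by its own group size reduces to the common divisor n.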

  36. Two-way ANOVA • Variations in the case of error gravity scores: • 1. Variation between m groups of judges (horizontal) • 2. Variation between n different errors (vertical) • 3. Residuals

  37. Procedure of Calculation • Calculation of TSS, ESS, GSS and the residual sum of squares (RSS) • Calculation of degrees of freedom: between errors, between groups of judges, residual • Calculation of mean sums of squares: $S_e^2$, $S_g^2$, $S_r^2$ • Calculation of F-ratios: $S_e^2 / S_r^2$, $S_g^2 / S_r^2$ • Comparison of F and $F_\alpha$

  38. Calculation of CF • $CF = (\sum X)^2 / mn = 2462^2 / (3 \times 32) = 63140.04$

  39. Calculation of TSS, ESS, GSS • $TSS = \sum Y_{ij}^2 - CF$ • ESS: between-errors sum of squares • $ESS = \sum Y_i^2 / m - CF$ • GSS: between-groups sum of squares • $GSS = \sum Y_j^2 / n - CF$ • The divisor is the number of observations that have gone into each of the values being squared.

  40. Calculation of degree of freedom • $df_{\text{between errors}}$: n-1 • $df_{\text{between groups}}$: m-1 • $df_{\text{residual}}$: (m-1)(n-1), or (mn-1)-(n-1)-(m-1) • $df_{\text{total}}$: mn-1

  41. Calculations of MSS • $MSS_{\text{between errors}} = ESS / df_{\text{between errors}}$ • $MSS_{\text{between groups}} = GSS / df_{\text{between groups}}$ • $MSS_{\text{residual}} = RSS / df_{\text{residual}}$

  42. Calculation of F-ratio • $F_{\text{between errors}} = MSS_{\text{between errors}} / MSS_{\text{residual}}$ • Degrees of freedom: $df_{\text{between errors}}$, $df_{\text{residual}}$ • $F_{\text{between groups}} = MSS_{\text{between groups}} / MSS_{\text{residual}}$ • Degrees of freedom: $df_{\text{between groups}}$, $df_{\text{residual}}$

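  43. ANOVA table

     | Source | df | SS | MSS | F-ratio | Significance level |
     |---|---|---|---|---|---|
     | Between errors | n-1 | ESS | ESS/(n-1) | EMSS/RMSS | p<α |
     | Between groups | m-1 | GSS | GSS/(m-1) | GMSS/RMSS | p<α |
     | Residual | (m-1)(n-1) | RSS | RSS/((m-1)(n-1)) | | |
     | Total | mn-1 | TSS | | | |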
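The two-way procedure on slides 37 to 43 can be sketched for a table with one score per cell, with rows as errors and columns as groups of judges. The scores below are hypothetical, because the slides give only the grand total for the error gravity data; the function name and returned keys are illustrative.

```python
def two_way_anova(table):
    """Two-way ANOVA without replication.

    table[i][j] is the score for error i (row) from judge group j (column).
    """
    n = len(table)                     # number of errors (rows)
    m = len(table[0])                  # number of judge groups (columns)
    total = sum(sum(row) for row in table)
    cf = total ** 2 / (m * n)          # correction factor
    tss = sum(y * y for row in table for y in row) - cf
    # Row totals are sums of m scores, column totals are sums of n scores.
    ess = sum(sum(row) ** 2 for row in table) / m - cf
    gss = sum(sum(col) ** 2 for col in zip(*table)) / n - cf
    rss = tss - ess - gss              # residual sum of squares
    df_e, df_g = n - 1, m - 1
    df_r = df_e * df_g
    ms_r = rss / df_r
    return {"ESS": ess, "GSS": gss, "RSS": rss, "df_residual": df_r,
            "F_errors": (ess / df_e) / ms_r,
            "F_groups": (gss / df_g) / ms_r}


# Hypothetical 3 errors x 3 judge groups, for illustration only.
result = two_way_anova([[2, 3, 4], [4, 6, 5], [7, 8, 9]])
print(round(result["F_errors"], 2), round(result["F_groups"], 2))  # 57.0 7.0
```

Each F-ratio is then compared with the critical value for its own degrees of freedom, (n-1, (m-1)(n-1)) for errors and (m-1, (m-1)(n-1)) for groups.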

  44. Factorial analysis • Factors: variables that affect the scores • Levels of a factor: the different values the factor can take • Two null hypotheses in two-way ANOVA, e.g. 1. Mean scores are the same across geographical origins 2. Mean scores are the same for both sexes
