
Non-parametric


baina

Presentation Transcript


  1. Non-parametric Statistics: An Overview of Median Tests

  2. Definition of Non-Parametric Statistics Non-parametric statistics is a branch of statistics applied when the population cannot be assumed to be normal or when the data are severely skewed.

  3. Titles of Non-parametric Tests • One Sample Median Test • Two Sample Location Test • Two Sample Dispersion Test • One-Way Layout • Independence Test

  4. Focus: Median tests This presentation will cover: • What median tests are • Why they are used • When they are used • How they are used

  5. What are median tests? • They are the median counterparts of the mean tests covered in an introductory college statistics course. • They include confidence intervals and significance tests.

  6. When to use a median test (as opposed to a mean test): • When the data or population do not meet the conditions for mean tests. • The ONLY condition is a simple random sample!

  7. Remember these conditions? • 15 < n < 30 with slight skewness • n > 30 • Or the population is normal. They are NOT necessary for median tests!

  8. Why do we use median tests? Because they are more robust!

  9. Medians are more robust than means • The mean of these salaries is $109,000 • The median of these salaries is clearly between #7 and #8, or $32,500 • Just from looking at the list of salaries, the median seems to describe the middle of the distribution much more accurately, since salary #14 pulls the mean so far up
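To make the robustness point concrete, here is a minimal sketch. The salary figures in it are hypothetical and are NOT the data from the slide; they only show how a single extreme value moves the mean while leaving the median where it was.

```python
from statistics import mean, median

# Hypothetical salaries for illustration only (not the slide's data).
salaries = [20_000, 25_000, 30_000, 35_000, 40_000]

# Replace the largest salary with one extreme value.
with_outlier = salaries[:-1] + [1_000_000]

print("no outlier:   mean =", mean(salaries), " median =", median(salaries))
print("with outlier: mean =", mean(with_outlier), " median =", median(with_outlier))
```

The mean jumps from 30,000 to 222,000, while the median stays at 30,000 in both lists.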

  10. More robustness The rest of the median test procedure is also more robust than procedures based on the t-distribution. This combination of a robust statistic and a robust procedure allows for statistical inference on very skewed data.

  11. Confidence Intervals for Medians The two main types: • Exact: needs tables and/or computer software • Approximate: simpler tables, appropriate for larger samples. We will concentrate on the approximations.

  12. Introduction to the Confidence Intervals It is necessary to understand “rank” The rank of a value in a distribution is simply its numbered place in the list of ordered values Example: in the distribution of letters {a, b, c, d, e, f} “b” has a rank of 2 from the left, and a rank of 5 from the right.

  13. Steps for Approximate Confidence Intervals • Order the distribution from smallest to largest values. • Find the median of the distribution. • Find the rank* of each limit, depending on the sample size, from a table like the one shown on the next slide. • Take the rank number and count in that many data points from each side of the ordered data. * Note that these ranks are computed by complicated formulas, then put neatly into a table for users, and treated like the definition of rank seen before.
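The counting step above can be sketched as follows. The function name `median_ci_from_rank` and the sample values are hypothetical; the rank is assumed to have been looked up in a table beforehand.

```python
def median_ci_from_rank(values, rank):
    """Count `rank` data points in from each end of the ordered data.

    Returns (lower limit, upper limit). `rank` is 1-based, matching the
    definition of rank used in the slides.
    """
    ordered = sorted(values)
    n = len(ordered)
    if not 1 <= rank <= (n + 1) // 2:
        raise ValueError("rank must be between 1 and the middle of the data")
    lower = ordered[rank - 1]  # rank-th value from the left
    upper = ordered[-rank]     # rank-th value from the right
    return lower, upper


# Hypothetical sample of 14 values (not the slide's salary data).
sample = [9, 3, 14, 1, 7, 12, 5, 10, 2, 8, 13, 4, 11, 6]
print(median_ci_from_rank(sample, rank=3))
```

For these 14 values and rank 3, the limits are the third-smallest and third-largest observations, 3 and 12.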

  14. * Values taken from Siegel’s Statistics and Data Analysis
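The table itself is not reproduced in this transcript. One standard way to obtain such ranks, offered here as an assumption rather than as the method behind the cited table, uses the binomial distribution: each observation falls below the true median with probability 1/2, so the interval between the k-th values from each end misses the median with probability 2·P(X ≤ k−1), where X ~ Binomial(n, 1/2). The sketch below picks the largest k that keeps this at or below alpha. It returns rank 3 for n = 14 and rank 9 for n = 28, the ranks used in the worked examples in these slides.

```python
from math import comb


def median_ci_rank(n, alpha=0.05):
    """Largest rank k with P(X <= k - 1) <= alpha / 2, X ~ Binomial(n, 1/2).

    Returns 0 when the sample is too small for any interval at this level.
    This is an assumed construction, not necessarily how the cited table
    was computed.
    """
    total = 2 ** n
    cumulative = 0  # number of outcomes with at most k observations below the median
    rank = 0
    for k in range(n + 1):
        cumulative += comb(n, k)
        if cumulative / total > alpha / 2:
            break
        rank = k + 1
    return rank


for n in (14, 28):
    print(n, median_ci_rank(n))
```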

  15. Example: Using the same salary data from before, with sample size 14 and rank 3, count in three values (1, 2, 3) from each end of the ordered data. The third value from the bottom is the lower confidence limit of the interval, and the third value from the top is the upper confidence limit. So, the 95% confidence interval is ($23,000, $60,000).

  16. Significance test for medians Remember duality? “What is not contained in the confidence interval is significant at the same alpha-level.” This property of confidence intervals can be used to test for significance.

  17. Steps for Significance Test at alpha=.05 • Create a confidence interval at this alpha-level. • Check to see if the accepted population value is included in the interval. • Draw a conclusion: • If the value IS included, the result is NOT significant • If the value is NOT included, the result IS significant

  18. Sample Significance Test Assume that the commonly accepted median of salaries at company A is $53,000, and that the sample shown before was drawn.

  19. Test hypotheses • H0: M = $53,000, i.e., the true median of salaries in company A is $53,000. • Ha: M ≠ $53,000, i.e., the true median of salaries in company A is NOT $53,000.

  20. Our previous 95% confidence interval was ($23,000, $60,000), so: • The accepted median, $53,000, is within the interval, • The outcome is not significant, • We do not reject the accepted median.
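The duality check in this example can be written out in a few lines. This is a minimal sketch: the function name is hypothetical, and treating the interval endpoints as part of the interval is an assumption the slides do not address.

```python
def is_significant(hypothesized_median, interval):
    """Two-sided test by duality: significant only if the hypothesized
    value falls outside the confidence interval.

    Endpoints are treated as inside the interval (an assumption).
    """
    lower, upper = interval
    return not (lower <= hypothesized_median <= upper)


# Values from the worked example: H0 median $53,000, 95% CI ($23,000, $60,000).
significant = is_significant(53_000, (23_000, 60_000))
print("reject H0" if significant else "do not reject H0")
```

Because $53,000 lies inside the interval, the sketch reports that H0 is not rejected, matching the conclusion above.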

  21. Mean Tests VS. Median Tests • Consider a population of children, with a distribution of the number of toys each one has. • True mean Mu of 7.3 toys per child • True median M of 7 toys per child

  22. 2 SRS’s from the Population of Children Both look very similar. The only difference is the movement of one bar, to be a far-out outlier. [Figure: two histograms, one per sample, showing # of children against # of toys]

  23. Sample 1: 95% Mean Confidence Interval Sample 1, with no outlier • Sample mean x-bar=7.1 toys • Sample standard deviation=1.9877 • Sample size n=28 • Standard error of x-bar=1.9877/√28=.3756 • Critical value z*=1.95996 • CI: 7.1+/-(1.95996*.3756): (6.358, 7.842) (use calculator 1-var stats)

  24. Sample 1: 95% Median Confidence Interval Sample 1, with no outlier • Sample median=7 toys • Sample size n=28 • Rank (see table) =9 • Lower confidence limit=6 • Upper confidence limit=7 • CI: (6, 7)

  25. Sample 2: 95% Mean Confidence Interval Sample 2, with outlier • Sample mean x-bar=8.4 toys • Sample standard deviation=4.8722 • Sample size n=28 • Standard error of x-bar=4.8722/√28=.9208 • Critical value z*=1.95996 • CI: 8.4+/-(1.95996*.9208): (6.595, 10.205) (use calculator 1-var stats)
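The mean interval arithmetic can be reproduced from the summary statistics alone. A minimal sketch, assuming the z-based interval x-bar ± z*·s/√n used on the slide; the function name `mean_ci` is hypothetical.

```python
from math import sqrt


def mean_ci(xbar, s, n, z_star=1.95996):
    """z-based confidence interval for a mean from summary statistics."""
    standard_error = s / sqrt(n)
    margin = z_star * standard_error
    return xbar - margin, xbar + margin


# Sample 2 (with outlier), using the values reported on the slide.
low, high = mean_ci(xbar=8.4, s=4.8722, n=28)
print(round(low, 3), round(high, 3))
```

For Sample 2 this gives (6.595, 10.205), as on the slide. With Sample 1's rounded summary values (7.1, 1.9877, 28) the same function gives about (6.364, 7.836); the small gap from the reported (6.358, 7.842) presumably comes from rounding in the reported inputs.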

  26. Sample 2: 95% Median Confidence Interval Sample 2, with outlier • Sample median=7 toys • Sample size n=28 • Rank (see table) =9 • Lower confidence limit=6 • Upper confidence limit=7 • CI: (6, 7) These statistics match up EXACTLY with the median CI for the first sample. The outlier did not affect the outcome, demonstrating the test’s robustness.

  27. Comparison of different intervals • Sample 1: Median CI (6, 7), Mean CI (6.358, 7.842) • Sample 2: Median CI (6, 7), Mean CI (6.595, 10.205)

  28. Discussion of differences • The outlier made the mean confidence interval much wider, making it less useful • The median interval stayed the same, and captures the true median very closely (the true median, 7, lies within the interval from 6 to 7)

  29. Conclusion When data is skewed, a median test can be much more useful than a mean test in estimating the true parameter.
