Nonparametric inference
PowerPoint Slideshow about ' Nonparametric Inference' - sylvie


Why Nonparametric Tests?

  • We have been primarily discussing parametric tests, i.e. tests that rely on certain assumptions in order to be valid. For example, t-tests and ANOVA both assume a particular shape for the distribution (normality) and similar spread across groups (homogeneity of variance).

  • When these assumptions hold we can use standard sampling distributions (e.g. t-distribution, F-distribution) to find p-values.


Why Nonparametric Tests?

  • When these assumptions are violated it is necessary to turn to tests that do not have such stringent assumptions: nonparametric or "distribution-free" tests.

  • Specifically, there are three cases which necessitate the use of nonparametric tests:

1) The data for the response are not at least interval scale, i.e. measurements. For example, the response might be ordinal.

2) The distribution of the data for the response is not normal. Recall that a relatively normal distribution is assumed for parametric tests.

3) There are severely unequal variances between groups, i.e. there is an obvious violation of the homogeneity of variance assumption required for parametric tests.

In the last two cases, we have interval-level data, but they violate our parametric assumptions. Therefore, we no longer treat the data as interval, but as ordinal. In a sense, we demote the data because they fail to meet specific assumptions.



Independent Samples

  • For two populations we use…

    Mann-Whitney/Wilcoxon Rank Sum Test

  • For three or more populations we use…

    Kruskal-Wallis Test (at the end)


Mann-Whitney/Wilcoxon Rank Sum Test

  • Alternative to two-sample t-Test

  • Use when…

    - populations being sampled are not normally distributed.

    - sample sizes are small so assessing normality is not possible (ni < 20).

    - response is ordinal


Mann-Whitney/Wilcoxon Rank Sum Test

General Hypotheses

Ho: distribution of pop. A and pop. B are the same, i.e. A = B

HA: distribution of pop. A and pop. B are NOT the same, i.e. A ≠ B

HA: distribution of pop. A is shifted to the right of pop. B, i.e. A > B.

HA: distribution of pop. A is shifted to the left of pop. B, i.e. A < B


Mann-Whitney/Wilcoxon Rank Sum Test

Ho: A = B vs. HA: A > B

Q: Is there evidence that the values in population A are generally larger than those in population B?


Mann-Whitney/Wilcoxon Rank Sum Test (Test Procedure)

  • Rank all N = nA + nB observations in the combined sample from both populations in ascending order. Assign the average rank to tied observations.

  • Sum the ranks of the observations from populations A and B separately and denote the sums wA and wB.

  • For HA: A < B reject Ho if wA is “small” or wB is “big”. For HA: A > B reject Ho if wA is “big” or wB is “small”.

  • Use tables to determine how “big” or “small” the rank sums must be in order to reject Ho or use software to conduct the test.
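
The ranking and summing steps above can be sketched in a few lines of Python. This is only an illustration: the function names and the two small samples are hypothetical and do not come from the slides, and the resulting rank sums would still need to be compared against a table or passed to software.

```python
def average_ranks(values):
    """Return 1-based ascending ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend `end` over the run of values tied with the one at `pos`.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + 1 + end + 1) / 2  # mean of the ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def rank_sums(sample_a, sample_b):
    """Rank the combined sample and return the rank sums (w_A, w_B)."""
    ranks = average_ranks(list(sample_a) + list(sample_b))
    n_a = len(sample_a)
    return sum(ranks[:n_a]), sum(ranks[n_a:])


# Hypothetical data, for illustration only (not from the slides).
group_a = [12, 15, 15, 20]
group_b = [9, 11, 15, 18, 25]
w_a, w_b = rank_sums(group_a, group_b)
n_total = len(group_a) + len(group_b)
print(w_a, w_b)
```

A useful check on any hand calculation is that wA + wB must equal N(N + 1)/2, the sum of the ranks 1 through N.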


Mann-Whitney/Wilcoxon Rank Sum Test (Critical Value Table)

This table contains the value that the smaller rank sum must be less than in order to reject Ho in a one-tailed test at two significance levels (α = .05 and .01).

Tables exist for the two-tailed tests as well.

n is the sample size of the group with the smaller rank sum.


Example: Huntington’s Disease and Fasting Glucose Levels

Davidson et al. studied the responses to oral glucose in patients with Huntington’s disease and in a group of control subjects. The five-hour responses are shown below. Is there evidence to suggest the five-hour glucose (mg present) is greater for patients with Huntington’s disease?

Ho: Control = Huntington’s i.e. C = H

HA: Control < Huntington’s i.e. C < H


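Example: Observations & Ranks

[Table of observations and ranks; the observed glucose values and group labels were lost in extraction.]

Ranks in the combined sample, in transcript order: 10.5, 9, 15, 3, 13, 1.5, 17, 1.5, 16, 5.5, 5.5, 19, 7, 21, 8, 20, 18, 10.5, 4, 13, 13

wA = 78 (control)

wB = 153 (Huntington’s)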


Example: Critical Value Table

Here, nC = 10 (control) and nH = 11 (Huntington’s). We will reject Ho: C = H in favor of HA: C < H if the rank sum for the control group is less than 86 at the α = .05 level and less than 77 at the α = .01 level.


Example: Decision/Conclusion

Using the Wilcoxon Rank Sum Test we have evidence to suggest that the five hour glucose level for individuals with Huntington’s disease is greater than that for healthy controls (p < .05).

Note: p < .05 because the observed rank sum for the control group is less than 86, which is the critical value for α = .05.
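
The decision rule for this example can be written out as a short check. This is a sketch under stated assumptions: the variable names are illustrative, and 78 is taken to be the control group's rank sum, as the conclusion above implies. The critical values 86 and 77 are the ones quoted from the table.

```python
n_control, n_huntington = 10, 11
w_control, w_huntington = 78, 153  # rank sums from the example
crit_05, crit_01 = 86, 77          # one-tailed critical values quoted from the table

n_total = n_control + n_huntington
# Sanity check: the two rank sums must add up to 1 + 2 + ... + N.
assert w_control + w_huntington == n_total * (n_total + 1) // 2

# Reject Ho in favor of HA: C < H when the control rank sum is below the critical value.
reject_at_05 = w_control < crit_05
reject_at_01 = w_control < crit_01
print(reject_at_05, reject_at_01)  # True False
```

The rank sum falls between the two critical values, which is why the result is reported as p < .05 rather than p < .01.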


Rank Sum Test in JMP

The p-values reported are based upon large-sample approximations, which generally should not be used when sample sizes are small. Here the conclusion reached is the same, but in general we should use tables if they are available.


Dependent Samples

  • Sign Test

  • Wilcoxon Signed-Rank Test


Sign Test

  • The sign test can be used in place of the paired t-test when we have evidence that the paired differences are NOT normally distributed.

  • It can be used when the response is ordinal.

  • Best used when the response is difficult to quantify and only improvement can be measured, i.e. subject got better, got worse, or no change.

  • Magnitude of the paired difference is lost when using this test.


Sign Test

  • The sign test looks at the number of (+) and (-) differences amongst the nonzero paired differences.

  • A preponderance of +’s or –’s can indicate that some type of change has occurred.

  • If the null hypothesis of no change is true we expect +’s and –’s to be equally likely to occur, i.e. P(+) = P(-) = .50 and the number of each observed follows a binomial distribution.


Example: Sign Test

  • A study evaluated hepatic arterial infusion of floxuridine and cisplatin for the treatment of liver metastases of colorectal cancer.

  • Performance scores for 29 patients were recorded before and after infusion. Is there evidence that patients had a better performance score after infusion?



Example: Sign Test

  • Ho: No change in performance score following infusion, or more specifically median change in performance score is 0.

  • HA: Performance scores improve following infusion, or more specifically median change in performance score > 0.

    Intuitively we will reject Ho if there is a “large” number of +’s.


Example: Sign Test

Signs of the 17 nonzero paired differences, in transcript order: - + + - + - + + + + + - + - - + +

17 nonzero differences: 11 +’s, 6 –’s


Example: Sign Test

  • If Ho is true, X = the number of +’s has a binomial distribution with n = 17 and p = P(+) = .50.

  • Therefore the p-value is simply P(X ≥ 11 | n = 17, p = .50) = .166 > α.

  • We fail to reject Ho; there is insufficient evidence to conclude that the performance score improves following infusion (p = .166).
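
The binomial tail probability in this example can be checked directly. The sketch below assumes the one-sided p-value is the probability of observing 11 or more +’s out of 17 nonzero differences; the function name is illustrative.

```python
from math import comb


def sign_test_upper_p(n_plus, n_nonzero):
    """One-sided sign test p-value: P(X >= n_plus) for X ~ Binomial(n_nonzero, 0.5)."""
    favorable = sum(comb(n_nonzero, k) for k in range(n_plus, n_nonzero + 1))
    return favorable / 2 ** n_nonzero


p_value = sign_test_upper_p(11, 17)
print(round(p_value, 3))  # 0.166
```

Because p = .50 under Ho, every sequence of signs is equally likely, so the p-value is just a count of favorable outcomes divided by 2^17.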


Wilcoxon Signed-Rank Test

  • The problem with the sign test is that the magnitude or size of the paired differences is lost.

  • The Wilcoxon Signed-Rank Test uses ranks of the paired differences to retain some sense of their size.

  • Use when the distribution of the paired differences is NOT normal or when the sample size is small.

  • Can be used with an ordinal response.


Wilcoxon Signed Rank Test (Test Procedure)

  • Exclude any differences which are zero.

  • Put the rest of differences in ascending order ignoring their signs.

  • Assign them ranks.

  • If any differences are equal, average their ranks.


Example: Wilcoxon Signed Rank Test

Resting Energy Expenditure (REE) for Patients with Cystic Fibrosis

  • A researcher believes that patients with cystic fibrosis (CF) expend greater energy during resting than those without CF. To obtain a fair comparison she matches 13 patients with CF to 13 patients without CF on the basis of age, sex, height, and weight.


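Example: Wilcoxon Signed Rank Test

[Table of paired REE measurements; the observed values were lost in extraction.]

Signed ranks of the 13 paired differences, in transcript order: 6, 3, -2, 1, 13, -5, 9, 11, 4, 12, 7, 8, 10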


Example: Wilcoxon Signed Rank Test

We then calculate the sum of the positive ranks (T+) and the sum of the negative ranks (T-).

Here we have

T+ = 6 + 3 + 1 + 13 + 9 + 11 + 4 + 12 + 7 + 8 + 10 = 84 and

T- = 2 + 5 = 7
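
The two rank sums can be recomputed from the signed ranks listed in the example. A minimal sketch; the variable names are illustrative.

```python
# Signed ranks of the 13 paired differences, as listed in the example.
signed_ranks = [6, 3, -2, 1, 13, -5, 9, 11, 4, 12, 7, 8, 10]

t_plus = sum(r for r in signed_ranks if r > 0)
t_minus = sum(-r for r in signed_ranks if r < 0)

n = len(signed_ranks)
# Sanity check: T+ and T- together must account for the ranks 1 through n.
assert t_plus + t_minus == n * (n + 1) // 2

t_stat = min(t_plus, t_minus)
print(t_plus, t_minus, t_stat)  # 84 7 7
```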


Wilcoxon Signed Rank Test (Test Statistic)

  • Intuitively we will reject Ho, which states that there is no difference between the populations, if either one of these rank sums is “large” and the other is “small”.

  • The Wilcoxon Signed Rank Test uses the smaller rank sum, T = min( T+ ,T- ) , as the test statistic.


Example: Wilcoxon Signed Rank Test

For the cystic fibrosis example we have the following hypotheses:

Ho: there is no difference in the resting energy expenditure of individuals with CF and healthy controls who are the same gender, age, height, and weight (MEDIAN PAIRED DIFFERENCE = 0).

HA: the resting energy expenditure of individuals with CF is greater than that of healthy individuals who are the same gender, age, height, and weight (MEDIAN PAIRED DIFFERENCE > 0).


Example: Wilcoxon Signed Rank Test

HA: the resting energy expenditure of individuals with CF is greater than that of healthy individuals who are the same gender, age, height, and weight.

  • The alternative is clearly supported if T+ is “large” or T- is “small”.

  • The test statistic T = min( T+ , T- ) = 7

  • Is T = 7 considered small, i.e. what is the corresponding p-value?

  • To answer this question we need a Wilcoxon Signed Rank Test table or statistical software.
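
When a table is not at hand, an exact one-sided p-value can be obtained by counting. The sketch below assumes no ties and no zero differences, so that under Ho each of the ranks 1 to n is equally likely to carry a + or a - sign; the function name is illustrative, and this is not necessarily how any particular software package computes its p-value.

```python
def signed_rank_lower_p(t_obs, n):
    """P(T <= t_obs) under Ho for a signed-rank sum T built from ranks 1..n (no ties)."""
    max_sum = n * (n + 1) // 2
    # counts[s] = number of subsets of {1, ..., n} whose ranks sum to s.
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        # Go downward so each rank is used at most once per subset.
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    return sum(counts[: t_obs + 1]) / 2 ** n


p_value = signed_rank_lower_p(7, 13)
print(round(p_value, 4))  # 0.0023
```

Only 19 of the 2^13 = 8192 equally likely sign patterns give a negative rank sum of 7 or less, so T = 7 is indeed small.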


Example: Wilcoxon Signed Rank Test

This table gives the value that our observed T = min( T+ , T- ) must be less than in order to reject Ho for both two- and one-tailed tests.

Here we have n = 13 and T = 7. Our test statistic is less than both 21 (α = .05) and 12 (α = .01), so we reject Ho and estimate that our p-value < .01.


Example: Wilcoxon Signed Rank Test

  • We conclude that individuals with cystic fibrosis (CF) have a larger resting energy expenditure than healthy individuals who are the same gender, age, height, and weight (p < .01).


Analysis in JMP

Select Test Mean from the Difference pull-down menu, enter 0 for the null value, and check the Wilcoxon option.

The test statistic is reported as (T+ - T-)/2 = (84 - 7)/2 = 38.50, but we only need the p-value = .0023.


Analysis in SPSS

Click on CF first and then Healthy to specify that the paired difference will be defined as CF – Healthy & specify which tests to conduct. Note: the Difference column is not actually used in the SPSS analysis.


Independent Samples

  • If we have three or more populations to compare we use…

    Kruskal – Wallis Test


Kruskal-Wallis Test

  • One-way ANOVA for a completely randomized design is based on the assumption of normality and equality of variance.

  • The nonparametric alternative not relying on these assumptions is called the Kruskal-Wallis Test.

  • Like the Mann-Whitney/Wilcoxon Rank Sum Test we use the sum of the ranks assigned to each group when considering the combined sample as the basis for our test statistic.


Kruskal-Wallis Test

Basic Idea:

1) Looking at all observations together, rank them.

2) Let R1, R2, …, Rk be the sums of the ranks in each of the k groups.

3) If some Ri’s are much larger than others, it indicates the response values in different groups come from different populations.


Kruskal-Wallis Test

  • The test statistic is

    $H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$

    where N = total sample size = n1 + n2 + ... + nk.

  • Under the null hypothesis, H has an approximate chi-square distribution with df = k - 1, i.e. $H \sim \chi^2_{k-1}$.

  • The approximation is OK when each group contains at least 5 observations.



Example: Kruskal-Wallis Test

A clinical trial evaluating the fever-reducing effects of aspirin, ibuprofen, and acetaminophen was conducted. Study subjects were adults seen in an ER with diagnoses of flu with body temperatures between 100°F and 100.9°F. Subjects were randomly assigned to treatment. Changes in body temperature were recorded 2 hrs. after administration of treatments.


Example: Kruskal-Wallis Test

Resulting Data: Temperature Decrease (deg. F)

[Data table; the observed temperature decreases and treatment labels were lost in extraction.]

Ranks in the combined sample, in transcript order: 5, 4, 8, 6, 9, 14, 11, 12, 3, 15, 10, 2, 13, 7, 1

N = 15

R1 = 44, R2 = 50, R3 = 26

n1 = 4, n2 = 5, n3 = 6



Kruskal-Wallis in JMP (Demo)

Analyze > Fit Y by X

RESULTS

R1 = 44 n1 = 4

R2 = 50 n2 = 5

R3 = 26 n3 = 6

H = 6.833 df = 2

p = .033
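
These results can be reproduced by hand from the rank sums. The sketch below assumes the standard no-ties form of the statistic, H = 12/(N(N+1)) · Σ Ri²/ni - 3(N+1), and uses the fact that for df = 2 the chi-square upper-tail probability has the closed form exp(-H/2). The function name is illustrative.

```python
from math import exp


def kruskal_wallis_h(rank_sums, sizes):
    """Kruskal-Wallis H from group rank sums and group sizes (no tie correction)."""
    n_total = sum(sizes)
    weighted = sum(r ** 2 / n for r, n in zip(rank_sums, sizes))
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


h_stat = kruskal_wallis_h([44, 50, 26], [4, 5, 6])
# With k = 3 groups, df = 2, so P(chi-square > H) = exp(-H / 2).
p_value = exp(-h_stat / 2)
print(round(h_stat, 3), round(p_value, 3))  # 6.833 0.033
```

The closed form for the p-value holds only for df = 2; for other numbers of groups a chi-square table or software is needed.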


Decision/Conclusion

  • Using the Kruskal-Wallis test we have evidence to suggest that the temperature changes after taking the different drugs are not the same (p = .033).

  • Now we might like to know which drugs significantly differ from one another.


Multiple Comparisons for Kruskal-Wallis Test

  • If we decide at least two populations differ in terms of what is typical of their values, we can use multiple comparisons to determine which populations differ.

  • To do this we calculate an approximate p-value for each pair-wise comparison and then compare that p-value to a Bonferroni-corrected significance level (α).


Multiple Comparisons for Kruskal-Wallis Test

To determine if group i significantly differs from group j we compute

$z_{ij} = \frac{R_i/n_i - R_j/n_j}{\sqrt{\frac{N(N+1)}{12}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}}$

and then compute p-value = $P(Z > |z_{ij}|)$ and compare to α/(2m), where m is the number of possible pair-wise comparisons, m = $\frac{k(k-1)}{2}$.


Multiple Comparisons for Kruskal-Wallis Test

  • Comparing Aspirin to Acetaminophen

N = 15

Aspirin: R1 = 44, n1 = 4

Acetaminophen: R3 = 26, n3 = 6

Computing the Bonferroni corrected significance level we have α/(2m) = .05/(2 · 3) = .00833
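
The comparison can be sketched numerically. The slide's own formula did not survive extraction, so the code below assumes the usual large-sample form for comparing mean ranks, z = (Ri/ni - Rj/nj) / sqrt(N(N+1)/12 · (1/ni + 1/nj)), with a one-sided normal p-value compared to α/(2m). The function names are illustrative.

```python
from math import erfc, sqrt


def pairwise_z(r_i, n_i, r_j, n_j, n_total):
    """Difference in mean ranks divided by its large-sample standard error (assumed form)."""
    diff = r_i / n_i - r_j / n_j
    se = sqrt(n_total * (n_total + 1) / 12 * (1 / n_i + 1 / n_j))
    return diff / se


def upper_tail(z):
    """P(Z > z) for a standard normal Z."""
    return 0.5 * erfc(z / sqrt(2))


# Aspirin (R1 = 44, n1 = 4) versus acetaminophen (R3 = 26, n3 = 6), N = 15.
z = pairwise_z(44, 4, 26, 6, 15)
p_value = upper_tail(abs(z))

k = 3                    # number of groups
m = k * (k - 1) // 2     # number of pair-wise comparisons
alpha_corrected = 0.05 / (2 * m)
print(round(z, 2), p_value < alpha_corrected)  # 2.31 False
```

Under these assumptions the p-value is roughly .010, which is above .00833, so this pair is not declared different.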


Multiple Comparisons for Kruskal-Wallis Test

As this comparison is not significant, none of the others will be either. How can this be when the Kruskal-Wallis test itself was significant?

The problem is that the Bonferroni correction is too conservative, and the approximate normality used for the multiple comparisons is valid only when sample sizes are “large”; the sample sizes here are quite small.

Thus the comparison shown is fine for a demonstration of the procedure but the results cannot be trusted.



