The Mann-Whitney U test
Peter Shaw
Introduction
We meet our first inferential test. Do not be put off by the messy-looking formulae, since the test is usually run on a PC anyway. The important bit is to understand the philosophy of the test.
Imagine:
You conduct a questionnaire survey of homes in the Heathrow flight path, and also of a control population of homes in south-west London. Responses to the question “How intrusive is plane noise in your daily life?” are tabulated:
Decide significance level p=0.05
Set up H0, H1.
U = 15.5
Probability of getting a result at least this extreme if H0 is true
p = 0.03
Is p above critical level?
2. Harmonize ranks where the same value occurs more than once (tied values each take the average of the ranks they occupy)
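The tie-handling step can be sketched in a few lines of Python. This is only an illustration: the function name `average_ranks` and the scores are made up for the example and are not the survey data.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    # Indices of the values, ordered from smallest to largest value
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend 'end' across the run of values equal to the one at 'pos'
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Mean of the 1-based ranks pos+1 .. end+1
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


scores = [3, 1, 4, 1, 5]  # hypothetical scores, not the survey data
ranks = average_ranks(scores)
print(ranks)  # [3.0, 1.5, 4.0, 1.5, 5.0]
```

The two scores of 1 would have taken ranks 1 and 2, so each receives 1.5. The ranks still add up to the same total as they would without ties.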
Ux = 6*7 + 7*8/2 - 67 = 3
Uy = 6*7 + 6*7/2 - 24 = 39
Lowest U value is 3.
Critical value of U (7,6) = 4 at p = 0.01.
Calculated U is < tabulated U so reject H0.
At p = 0.01 these two sets of data differ.
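The arithmetic of this worked example can be checked with a short Python sketch. The sample sizes (7 and 6) and rank sums (67 and 24) are read off the calculation above; the function name `u_from_rank_sum` is an assumption made for illustration, and the critical value is simply the tabulated figure quoted above.

```python
def u_from_rank_sum(n_this, n_other, rank_sum_this):
    """U for one group: n1*n2 + n1*(n1+1)/2 minus that group's rank sum."""
    return n_this * n_other + n_this * (n_this + 1) / 2 - rank_sum_this


n_x, n_y = 7, 6    # sample sizes implied by the worked example
r_x, r_y = 67, 24  # rank sums used in the worked example

u_x = u_from_rank_sum(n_x, n_y, r_x)  # 3.0
u_y = u_from_rank_sum(n_y, n_x, r_y)  # 39.0

# Sanity check: the two U values always add up to n_x * n_y
assert u_x + u_y == n_x * n_y

u = min(u_x, u_y)         # the test uses the smaller U
critical_u = 4            # tabulated value quoted above for (7, 6)
reject_h0 = u < critical_u
print(u_x, u_y, u, reject_h0)  # 3.0 39.0 3.0 True
```

Note the direction of the comparison: unlike most tests, H0 is rejected when the calculated U is smaller than the tabulated value.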
2 tailed test: These populations DIFFER.
1 tailed test: Population X is Greater than Y (or Less than Y).
Upper tail of distribution
Lower tail of distribution
When we have 2 groups to compare (M/F, site 1/site 2, etc.) the U test is correct, applicable and safe.
How to handle cases with 3 or more groups?
The simple answer is to run the Kruskal-Wallis test. This is run on a PC, but behaves very much like the M-W U. It gives a single significance value, which simply tells you whether at least one group differs from at least one other; it does not tell you which groups differ.
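To show what the PC is doing, here is a minimal sketch of the Kruskal-Wallis H statistic in Python. It assumes no correction for ties, and the p-value line uses the chi-square approximation in a form that holds only for exactly three groups (2 degrees of freedom). That approximation is rough for samples as small as these, so a statistics package may report a different p. The function name and the site data are hypothetical.

```python
import math


def kruskal_wallis_h(groups):
    """H statistic for a list of samples (no tie correction applied)."""
    pooled = sorted(v for g in groups for v in g)
    # Map each distinct value to the mean of the 1-based ranks it occupies
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j + 1) / 2
        i = j + 1
    n = len(pooled)
    # Sum over groups of (rank sum squared) / (group size)
    total = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical data for three sites (not a real dataset)
site_a = [1, 2, 3]
site_b = [4, 5, 6]
site_c = [7, 8, 9]

h = kruskal_wallis_h([site_a, site_b, site_c])
# Chi-square upper tail for 2 degrees of freedom, i.e. three groups only
p = math.exp(-h / 2)
print(round(h, 2), round(p, 3))  # 7.2 0.027
```

As with the U test, everything is done on ranks rather than raw values, which is why the two tests behave so similarly.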
Do males differ from females?
Do results differ between these sites?
I will give each of you a sheet with data collected from 3 sites. (Don’t try copying – each one is different and I know who gets which dataset!).
I want you to show me your data processing skills as follows:
1: Produce a boxplot of these data, showing how values differ between the categories.
2: Run 3 separate Mann-Whitney U tests on them, comparing 1-2, 1-3 and 2-3. Only call a result significant if the p value is < 0.01.
3: Run a Kruskal-Wallis ANOVA on the three groups combined, and comment on your results.
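The structure of the three pairwise comparisons in step 2 can be sketched as follows. This is an illustration only: the site values are invented, the helper `smaller_u` is a hypothetical name, and it computes U by counting pairs directly rather than from rank sums (the two routes give the same answer). It does not produce p values, which would still come from the PC package or from tables.

```python
from itertools import combinations


def smaller_u(x, y):
    """Smaller of the two U values, found by counting pairs directly."""
    # Count (x, y) pairs where the x value is below the y value; ties count one half
    u_x = sum(1.0 if a < b else 0.5 if a == b else 0.0 for a in x for b in y)
    u_y = len(x) * len(y) - u_x
    return min(u_x, u_y)


# Hypothetical measurements from three sites (not a real dataset)
sites = {
    1: [12, 15, 11, 18],
    2: [22, 25, 19, 24],
    3: [14, 16, 13, 20],
}

# One comparison per pair of sites: 1-2, 1-3 and 2-3
pairwise_u = {
    (a, b): smaller_u(sites[a], sites[b])
    for a, b in combinations(sorted(sites), 2)
}
print(pairwise_u)  # {(1, 2): 0.0, (1, 3): 5.0, (2, 3): 1.0}
```

A U of 0 for sites 1 and 2 means the two samples do not overlap at all: every value at one site is below every value at the other.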