Review of Statistics Hypothesis Testing

Review of Statistics: Hypothesis Testing. Jane Meza, Ph.D. November 1, 2004

1a) Would you say that the linear correlation coefficient r is somewhat positive, somewhat negative, or close to zero?
• Somewhat positive.

1b) What does this seem to suggest about the relationship between watching TV and reading level?
• This appears to suggest that reading level is higher for children who spend more time watching TV.

2a) Considering only the 1st graders, is the correlation between X and Y somewhat positive, somewhat negative, or essentially zero? What about 2nd graders? 3rd graders?
• Somewhat negative within each grade.

2b) In light of this additional information, what can you say about the relationship between watching TV and reading level?
• For each grade, as time spent watching TV per week increases, reading level tends to decrease.
• The positive correlation seen when the grades are pooled apparently arises because children in higher grades tend both to watch more TV and to read at a higher level.
• The variable Grade in School is known as a confounding or lurking variable.
• This example illustrates the importance of considering other variables that might be related to the variables under consideration.
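The reversal described in 1a through 2b can be reproduced with a small sketch. The numbers below are hypothetical (they are not the data behind the scatter plots), and `pearson_r` is a helper written for this illustration. The sketch only shows that a pooled correlation can be positive while the correlation within every grade is negative.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical (TV hours per week, reading level) pairs, keyed by grade.
# Higher grades watch more TV AND read at a higher level.
data_by_grade = {
    1: [(2, 1.6), (4, 1.3), (6, 1.4), (8, 1.0)],
    2: [(6, 2.6), (8, 2.3), (10, 2.4), (12, 2.0)],
    3: [(10, 3.6), (12, 3.3), (14, 3.4), (16, 3.0)],
}

# Correlation within each grade (the lurking variable held fixed).
within_r = {
    grade: pearson_r([x for x, _ in pairs], [y for _, y in pairs])
    for grade, pairs in data_by_grade.items()
}

# Correlation when all grades are pooled (the lurking variable ignored).
pooled = [pair for pairs in data_by_grade.values() for pair in pairs]
pooled_r = pearson_r([x for x, _ in pooled], [y for _, y in pooled])

print("pooled r:", round(pooled_r, 2))
for grade, r in within_r.items():
    print("grade", grade, "r:", round(r, 2))
```

With these made-up values the pooled coefficient is positive and each within-grade coefficient is negative, which is the pattern the slides attribute to the confounding variable Grade in School.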

3a) Based on the scatter plot, do you think that a linear regression model is appropriate to describe the relationship between height and weight? Why or why not?
• The height values in the tails do not appear to be linearly related to weight. Specifically, the relationship levels off after about 180 cm (about 5 ft 11 in).

3b) The following linear regression model was fit to the data. What do you conclude about the relationship between weight and height? Be specific.
HEIGHT = 110.33 + 1.11 * WEIGHT
• For individuals whose weight differs by 1 kg, we expect the average height to be 1.11 cm higher for the heavier group.
• The intercept (the predicted height at a weight of 0 kg) does not correspond to a biologic quantity of interest.
• Body Mass Index (BMI) is commonly used as a measure of obesity.
• BMI = weight / height^2, where weight is measured in kilograms and height is measured in meters.
• Note that the formula for BMI relates weight to a nonlinear function of height.
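As a small illustration of the two formulas in 3b, the sketch below evaluates the fitted line and the BMI definition. The function names and the example inputs (70 kg, 175 cm) are assumptions chosen for illustration; only the coefficients 110.33 and 1.11 and the BMI formula come from the slide.

```python
def fitted_height_cm(weight_kg):
    """Fitted line from the slide: HEIGHT = 110.33 + 1.11 * WEIGHT (cm, kg)."""
    return 110.33 + 1.11 * weight_kg

def bmi(weight_kg, height_cm):
    """BMI = weight / height^2, with height converted from cm to meters."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2

# Slope interpretation: a 1 kg difference in weight corresponds to
# a 1.11 cm difference in expected height.
slope_check = fitted_height_cm(61) - fitted_height_cm(60)

# Hypothetical individual: 70 kg, 175 cm.
example_bmi = bmi(70, 175)

print(round(slope_check, 2))
print(round(example_bmi, 1))
```

Because height enters BMI as a square, weight is not a straight-line function of height under that formula, which fits the slide's remark that the linear model breaks down in the tails.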

Dawson-Saunders, Paiva and Doolen (1986) considered predicting medical school applicants’ MCAT exam scores (science problems) from their ACT composite scores.

4a) In a random sample of the data (n=42), the absolute value of the correlation coefficient was .61. Do MCAT and ACT scores appear to be linearly correlated?
• Yes, a correlation of .61 in magnitude indicates a moderate to good linear relationship.

4b) What is the independent variable? What is the dependent variable?
• Independent: ACT score
• Dependent: MCAT score
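One way to quantify the answer to 4a is to compute the coefficient of determination and the usual t statistic for testing whether a correlation is zero. The slides do not carry out this test, so treat the following as a sketch of a standard textbook calculation applied to the reported values |r| = .61 and n = 42, not as the analysis the authors ran.

```python
from math import sqrt

r_abs = 0.61   # absolute value of the correlation reported on the slide
n = 42         # sample size reported on the slide

# Share of the variability in one score explained by the other.
r_squared = r_abs ** 2

# t statistic for H0: correlation = 0, with n - 2 degrees of freedom.
df = n - 2
t_stat = r_abs * sqrt(df / (1 - r_squared))

print(round(r_squared, 3))
print(df, round(t_stat, 2))
```

With 40 degrees of freedom the two-sided .05 critical value of t is about 2.02, so a statistic of this size would indicate a correlation significantly different from zero.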

4c) The equation of the regression line is Y = -1.61 + .406 X. Is the correlation coefficient positive or negative?
• Since the slope of the line is positive, r is positive.

4d) What is the predicted MCAT score for a student with an ACT composite score of 28?
• -1.61 + .406 * 28 = 9.76

4e) When comparing two groups of medical school applicants whose ACT scores differ by 1 point, we expect the average MCAT score to be .406 points higher for the group with the higher ACT score.
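The arithmetic for the predicted score can be checked with a short sketch. The function name `predict_mcat` is an assumption made for illustration; the coefficients are the ones given in the regression equation above.

```python
INTERCEPT = -1.61   # from the regression line Y = -1.61 + .406 X
SLOPE = 0.406

def predict_mcat(act_score):
    """Predicted MCAT (science problems) score for a given ACT composite score."""
    return INTERCEPT + SLOPE * act_score

predicted = predict_mcat(28)

# Difference in predicted score for a 1-point difference in ACT equals the slope.
one_point_difference = predict_mcat(29) - predict_mcat(28)

print(round(predicted, 2))
print(round(one_point_difference, 3))
```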

5. Irwin et al. (1987, Am J Psychiatry) studied the relationship between major life events and immune function in women. The women were divided into 3 groups by stress level (low, medium, high) based on the Social Readjustment Rating Scale. The relationship between scores on the Social Readjustment Rating Scale and immune function (a continuous variable measured by NK cell activity) was analyzed. The investigators wanted to study whether immune function differed among the 3 groups of women.

5a) State the null and alternative hypotheses.
• H0: The mean immune level is the same for the 3 groups.
• HA: At least one of the means is different.

5b) Since we are comparing a numerical variable for 3 independent groups, the appropriate method of analysis is:
• ANOVA

5c) Since the p-value (.001) is less than .05, the conclusion is:
• Reject H0. There is evidence that the mean immune level is not the same in the 3 groups.
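To make the ANOVA step concrete, the following sketch computes the one-way ANOVA F statistic by hand. The three lists are hypothetical NK activity values invented for illustration (they are not the Irwin et al. data), and `one_way_anova_f` is a helper defined here, not a library function. The sketch stops at the F statistic and its degrees of freedom; turning F into a p-value requires the F distribution, which statistical software provides.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of independent samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical immune measurements for the three stress groups.
low_stress = [38, 42, 40, 44, 36]
medium_stress = [28, 30, 26, 32, 29]
high_stress = [27, 31, 25, 29, 28]

f_stat, df_between, df_within = one_way_anova_f(
    [low_stress, medium_stress, high_stress]
)
print(round(f_stat, 2), df_between, df_within)
```

A large F means the group means differ by more than the within-group scatter would explain, which is the situation in which H0 is rejected.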

5d)
Comparison        Bonferroni p-value
Low – Medium      .001
Low – High        .001
Medium – High     .999

• The mean immune measure for the low stress group is significantly different from the medium (p<.001) and high stress groups (p<.001).
• The mean immune measure for the medium stress group is not significantly different from the high stress group (p=.999).
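The Bonferroni adjustment behind a table like the one in 5d multiplies each unadjusted p-value by the number of comparisons and caps the result at 1. The sketch below shows that rule; the raw p-values are hypothetical, since the slides report only the adjusted values, and `bonferroni_adjust` is a helper written for this illustration.

```python
def bonferroni_adjust(raw_p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(raw_p_values)
    return [min(1.0, p * m) for p in raw_p_values]

comparisons = ["Low - Medium", "Low - High", "Medium - High"]
raw_p = [0.0003, 0.0002, 0.40]   # hypothetical unadjusted p-values

adjusted_p = bonferroni_adjust(raw_p)

# Flag comparisons that remain significant at the .05 level after adjustment.
significant = [p < 0.05 for p in adjusted_p]

for name, p, sig in zip(comparisons, adjusted_p, significant):
    print(name, round(p, 4), "significant" if sig else "not significant")
```

The adjustment keeps the overall chance of a false positive across all three pairwise comparisons at or below .05.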

6. Researchers are interested in detection of hemophilia A carriers. The AHF activity was measured for 40 women sampled from a population of women who do not carry the hemophilia gene and for 40 women selected from known hemophilia A carriers. The researchers want to test the claim that the AHF activity for women who are hemophilia A carriers is different from that of women in the general population.

6a) State which test you will use and why you think it is the appropriate test.
• The two groups, hemophilia A carriers and non-carriers, are independent.
• The mean AHF activity (a continuous variable) will be compared in the two groups.
• Since the sample sizes are both larger than 30 and we are comparing means for two independent groups, an independent samples t-test should be used.

6b) State the null and alternative hypotheses.
• H0: The mean AHF level is the same for carriers and non-carriers.
• HA: The mean AHF level differs for carriers and non-carriers.
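The test statistic for the independent samples t-test in 6a and 6b can be sketched as follows. The AHF values are hypothetical and use 5 women per group to keep the arithmetic readable (the study had 40 per group). The pooled-variance form is one common version of the test and is an assumption here, since the slides do not say whether equal variances were assumed.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_statistic(sample_a, sample_b):
    """Return (t, df) for a two-sample t-test assuming equal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled estimate of the common variance.
    pooled_var = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / df
    std_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (mean(sample_a) - mean(sample_b)) / std_error
    return t_stat, df

# Hypothetical AHF activity values.
carriers = [0.45, 0.50, 0.55, 0.60, 0.40]
non_carriers = [0.95, 1.00, 1.05, 1.10, 0.90]

t_stat, df = pooled_t_statistic(carriers, non_carriers)
print(round(t_stat, 2), df)
```

H0 is rejected in favor of the two-sided HA when the absolute value of t exceeds the critical value for the given degrees of freedom.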
