
Independent Samples T-Test of Population Means


Key Points about Statistical Test

Sample Homework Problem

Solving the Problem with SPSS

Logic for Independent Samples T-Test of Population Means

Power Analysis

- Purpose: test whether the populations represented by the two samples have different means
- Examples:
  - Social work students have higher GPAs than nursing students
  - Social work students volunteer for more hours per week than education majors
  - UT social work students score higher on licensing exams than graduates of Texas State University

- Hypotheses:
  - Null: mean of population 1 = mean of population 2
  versus
  - Research: mean of population 1 < mean of population 2
  - Research: mean of population 1 ≠ mean of population 2
  - Research: mean of population 1 > mean of population 2
- Decision:
  - Reject null hypothesis if the two-tailed p reported by SPSS ≤ alpha (≠ relationship)
  - Reject null hypothesis if the two-tailed p reported by SPSS ÷ 2 ≤ alpha (< or > relationship)
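The decision rule can be sketched in a few lines of Python. This is only an illustration: the function name `reject_null` and its parameters are made up for this example, and halving the p-value assumes the sample difference is in the direction the research hypothesis predicts.

```python
def reject_null(p_two_tailed: float, alpha: float = 0.05, one_tailed: bool = False) -> bool:
    """Apply the decision rule to the two-tailed significance reported by SPSS."""
    # For a directional (< or >) research hypothesis, halve the two-tailed p.
    # Assumption: the observed difference is in the predicted direction.
    p = p_two_tailed / 2 if one_tailed else p_two_tailed
    return p <= alpha


# Illustrative values: a two-tailed significance of .080 tested at alpha = .05
print(reject_null(0.080, alpha=0.05, one_tailed=False))
print(reject_null(0.080, alpha=0.05, one_tailed=True))
```

The same two-tailed significance can lead to different decisions depending on whether the research hypothesis is directional, which is why the relationship stated in the problem matters.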

- Variable is interval level (ordinal with caution)
- Variable is normally distributed:
  - Acceptable degree of skewness and kurtosis, or
  - Using the Central Limit Theorem (30+ cases in each group)
- The variances of the two groups are not different (if different, use the alternative formula)

- Cohen’s d measures the difference in means in standard deviation units.
- Cohen’s d = (difference in population means) ÷ (population standard deviation)

- Interpretation:
- small: d = .20 to .50
- medium: d = .50 to .80
- large: d = .80 and higher
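The interpretation bands can be written as a small lookup. This is a sketch, not a standard routine: the name `interpret_cohens_d` is hypothetical, the label for values under .20 is my own, and because the ranges above share their endpoints (.50 and .80), the code assumes a boundary value belongs to the higher band.

```python
def interpret_cohens_d(d: float) -> str:
    """Label an effect size using the small/medium/large ranges listed above."""
    size = abs(d)  # the sign only shows which group mean is larger
    if size >= 0.80:
        return "large"
    if size >= 0.50:  # assumption: .50 itself counts as medium
        return "medium"
    if size >= 0.20:
        return "small"
    return "below small"  # label chosen for this sketch


print(interpret_cohens_d(0.48))
print(interpret_cohens_d(0.65))
```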

- An independent samples t-test is reported in the same format as the one-sample t-test:
t(75) = 2.11, p = .02 (one-tailed), d = .48

- Example: Survey respondents who were employed by the federal, state, or local government had significantly higher socioeconomic indices (M = 55.42, SD = 19.25) than survey respondents who were employed by a private employer (M = 47.54, SD = 18.94), t(255) = 2.363, p = .01 (one-tailed).

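In the reported result t(75) = 2.11, p = .02 (one-tailed), d = .48:

- Degrees of freedom: 75
- Value of statistic: 2.11
- Significance of statistic: p = .02
- Include if test is one-tailed: (one-tailed)
- Effect size if available: d = .48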
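A result string in this format can be assembled from its parts. The sketch below is an illustration only; the helper names `format_stat` and `report_t_test` are invented here, and the two-decimal rounding is an assumption based on the example above.

```python
from typing import Optional


def format_stat(value: float) -> str:
    """Two decimals without a leading zero, e.g. 0.02 -> '.02'."""
    text = f"{value:.2f}"
    return text[1:] if text.startswith("0.") else text


def report_t_test(df: int, t: float, p: float,
                  one_tailed: bool = False, d: Optional[float] = None) -> str:
    """Build 't(df) = value, p = significance (one-tailed), d = effect size'."""
    significance = f"p = {format_stat(p)}"
    if one_tailed:
        significance += " (one-tailed)"  # included only for one-tailed tests
    parts = [f"t({df}) = {t:.2f}", significance]
    if d is not None:  # effect size is reported only if available
        parts.append(f"d = {format_stat(d)}")
    return ", ".join(parts)


print(report_t_test(75, 2.11, 0.02, one_tailed=True, d=0.48))
```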

This problem uses the data set GSS2000R.Sav to compare the average score on the variable "highest year of school completed" [educ] for groups of survey respondents defined by the variable "governmental employment" [wrkgovt]. Using an independent samples t-test with an alpha of .05, is the following statement true, true with caution, false, or an incorrect application of a statistic?

Survey respondents who were employed by the federal, state, or local government completed significantly more years of school (M = 13.97, SD = 3.27) than survey respondents who were employed by a private employer (M = 13.07, SD = 2.84).

- True
- True with caution
- False
- Incorrect application of a statistic

This is the general framework for the problems in the homework assignment on “Independent Samples T-Test of Population Means.” The description is similar to findings one might state in a research article.

- The first paragraph identifies:
  - The data set to use, e.g. GSS2000R.Sav
  - The groups that will be compared in the analysis
  - The variable compared in the t-test
  - The alpha level to use for the hypothesis test

- The second paragraph specifies:
  - The sample means and standard deviations for the groups being compared
  - The relationship for deriving the research hypothesis

The answer to a problem will be True if the t-test supports the finding in the problem statement.

The answer to a problem will be True with caution if the t-test supports the finding in the problem statement, but the dependent variable is ordinal level.

- The answer to a problem will be Incorrect application of a statistic if:
  - the t-test violates the level of measurement requirement, i.e. the dependent variable is nominal
  - the assumption of normality of the dependent variable is violated and the central limit theorem doesn’t apply
  - the independent variable is not dichotomous

The answer to a problem will be False if the t-test does not support the finding in the problem statement.

Our first task in SPSS is to identify the numeric codes for the groups that SPSS will require us to specify.

The problem statement tells us “This problem uses the data set GSS2000R.Sav to compare the average score on the variable "highest year of school completed" [educ] for groups of survey respondents defined by the variable "governmental employment" [wrkgovt].”

Select the Variables… command from the Utilities menu.

NOTE: in our problems we required that the grouping, or independent variable, be dichotomous, because there are other statistical tests to use when there are more than two groups. SPSS does not require the independent variable to be dichotomous, but it does require that you enter the numeric codes for the two groups (possibly out of a larger number of groups) that you wish to compare.

Scroll through the list of variables until you see wrkgovt. Click on wrkgovt and the information for the variable appears in the panel to the right.

Click on Close to dismiss the dialog box.

The Variable Information panel shows us the text labels that the creator of the data set assigned to each of the possible numeric responses for this variable.

The numeric codes for the groups we want to compare are: 1 (GOVERNMENT) and 2 (PRIVATE).

The remaining numeric codes represent missing data: 0 (NAP), 8 (DK), and 9 (NA).
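Outside SPSS, the same coding scheme can be applied when splitting cases into the two groups. The sketch below assumes each case is a `(wrkgovt, educ)` pair; the sample rows and the name `split_by_group` are invented for illustration and are not taken from GSS2000R.Sav.

```python
MISSING_CODES = {0, 8, 9}                       # NAP, DK, NA
GROUP_LABELS = {1: "GOVERNMENT", 2: "PRIVATE"}  # codes for the two groups compared


def split_by_group(records):
    """Collect the dependent-variable values for each group, skipping missing data."""
    groups = {code: [] for code in GROUP_LABELS}
    for wrkgovt, educ in records:
        if wrkgovt in MISSING_CODES or educ is None:
            continue  # missing on the grouping or the test variable
        if wrkgovt in groups:
            groups[wrkgovt].append(educ)
    return groups


# Made-up cases, for illustration only
sample = [(1, 16), (2, 12), (0, 14), (2, 13), (9, 12), (1, None)]
groups = split_by_group(sample)
for code, values in groups.items():
    print(GROUP_LABELS[code], values)
```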

Statistical tests of means require that the dependent variable be interval level. "Highest year of school completed" [educ] is interval level, satisfying the requirement.

In our analyses, we will allow the dependent variable to be ordinal, which violates this requirement in the strictest interpretation of level of measurement. However, since the research literature often computes means for ordinal level data, especially scaled measures, we will follow the convention of applying interval level statistics to ordinal data. Since not all analysts agree with this convention, a caution is added to any true findings.

The independent samples t-test uses the t-distribution for the probability of the test statistic. To obtain accurate probabilities, the variable must follow a normal distribution.

We will generate descriptive statistics to evaluate normality.

Select the Descriptive Statistics > Descriptives… command from the Analyze menu.

First, move the variable we will use in the t-test, educ, to the Variable(s) list box.

Second, click on the Options… button to select the statistics we want.

First, in addition to the statistics SPSS has checked by default, mark the Kurtosis and Skewness check boxes on the Distribution panel.

Second, click on the Continue button to close the dialog box.

Click on the OK button to obtain the output.

"Highest year of school completed" [educ] did not satisfy the criteria for a normal distribution. The skewness of the distribution (-.137) was between -1.0 and +1.0, but the kurtosis of the distribution (1.246) fell outside the range from -1.0 to +1.0.

Having failed the normality requirement using these criteria, we will see if we can apply the central limit theorem.

The number of cases in each group is part of the output for the independent samples t-test, so we will go ahead and compute that test to continue addressing the issue of normality.

Select Compare Means > Independent-Samples T Test… from the Analyze menu.

First, move the dependent variable educ to the Test Variable(s) list box.

Second, move the independent variable wrkgovt to the Grouping Variable text box.

Note that SPSS lists two question marks after the variable name and activates the Define Groups… button as its clue for what it wants us to do next. Click on the Define Groups button.

First, type in the numeric codes for the groups in the wrkgovt variable that we looked up at the beginning of the problem.

Second, click on the Continue button to close the dialog box.

Note that SPSS has replaced the question marks after the variable name with the numeric codes we typed in.

Click on the OK button to obtain the output.

Survey respondents who were employed by the federal, state, or local government had 38 cases, and survey respondents who were employed by a private employer had 217 cases. Since both groups had 30 or more cases, the assumption of normality was satisfied by the Central Limit Theorem.

If we were unable to establish normality either from the distribution or from the central limit theorem, the t-test would not be an appropriate statistic.
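The two routes to satisfying the normality assumption can be sketched as one check. The function name `normality_check` and its return strings are hypothetical, and treating the -1.0 and +1.0 limits as inclusive is an assumption.

```python
def normality_check(skewness: float, kurtosis: float, n1: int, n2: int,
                    limit: float = 1.0, min_cases: int = 30) -> str:
    """Evaluate normality by shape first, then by the central limit theorem."""
    # Route 1: skewness and kurtosis both within -1.0 to +1.0 (assumed inclusive)
    if abs(skewness) <= limit and abs(kurtosis) <= limit:
        return "normal by skewness and kurtosis"
    # Route 2: central limit theorem, 30 or more cases in each group
    if n1 >= min_cases and n2 >= min_cases:
        return "normal by central limit theorem"
    return "normality not established"


# Values from the educ / wrkgovt problem above
print(normality_check(skewness=-0.137, kurtosis=1.246, n1=38, n2=217))
```

If the function returns "normality not established", the t-test would be an incorrect application of a statistic under the rules used in these problems.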

The independent-samples t-test assumes that the variances of the dependent variable for both groups are equal in the population. This assumption is evaluated with Levene's Test for Equality of Variances. The null hypothesis for this test states that the variances for both groups are equal. The desired outcome for this test is to fail to reject the null hypothesis, which is consistent with equal variances.

The probability associated with Levene's Test for Equality of Variances (.161) is greater than alpha (.05), so we fail to reject the null hypothesis for Levene's test. The 'Equal variances assumed' formula for the independent samples t-test should be used for the analysis.

Had the probability associated with Levene’s test been less than or equal to the alpha level, we would have used the statistics for the ‘Equal variances not assumed’ row in the table.
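Choosing the row of the SPSS table comes down to one comparison. In this sketch the function name `variance_row` is made up, and the returned strings simply mirror the row labels in the SPSS output.

```python
def variance_row(levene_p: float, alpha: float = 0.05) -> str:
    """Pick the row of the t-test table based on Levene's test."""
    # Rejecting Levene's null hypothesis means the variances differ.
    if levene_p <= alpha:
        return "Equal variances not assumed"
    return "Equal variances assumed"


print(variance_row(0.161))  # Levene probability from the problem above
```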

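The finding we are trying to verify is the statement from the problem: survey respondents who were employed by the federal, state, or local government completed significantly more years of school (M = 13.97, SD = 3.27) than survey respondents who were employed by a private employer (M = 13.07, SD = 2.84).

Our first task is to make certain we have solved the right problem.

First, we check to make certain we have the correct groups in the output.

Second, we verify that the means and standard deviations for the groups match the problem statement.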

Since the problem states that the mean for one group is significantly higher than the mean of the other group, the research hypothesis is directional, so we use a one-tailed test.

We divide the SPSS 2-tailed significance (.080) in half and make our decision about the null hypothesis by comparing p = .04 to alpha = .05. Since .04 is less than .05, we reject the null hypothesis.

The answer to the question is True.

We can include the t-test results in our statement of the finding:

Survey respondents who were employed by the federal, state, or local government completed significantly more years of school (M = 13.97, SD = 3.27) than survey respondents who were employed by a private employer (M = 13.07, SD = 2.84), t(253) = 1.761, p = .04 (one-tailed).
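As a cross-check on the SPSS output, the 'Equal variances assumed' t statistic can be recomputed from the summary statistics alone. This sketch uses the standard pooled-variance formula, with degrees of freedom n1 + n2 - 2; the function name `pooled_t` is hypothetical, and because the means and standard deviations are rounded, the result only approximates the SPSS value.

```python
from math import sqrt


def pooled_t(mean1: float, sd1: float, n1: int,
             mean2: float, sd2: float, n2: int):
    """t statistic and degrees of freedom for the equal-variances t-test."""
    df = n1 + n2 - 2
    # Pooled variance: the group variances weighted by their degrees of freedom
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two means
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df


# Government (38 cases) versus private (217 cases) on educ
t, df = pooled_t(13.97, 3.27, 38, 13.07, 2.84, 217)
print(f"t({df}) = {t:.3f}")
```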

The logic for the independent samples t-test can be summarized as a sequence of decisions:

- Measurement level of independent variable?
  - Interval/ordinal/nominal: Inappropriate application of a statistic
  - Dichotomous: continue
- Measurement level of dependent variable?
  - Nominal/dichotomous: Inappropriate application of a statistic
  - Interval/ordinal: continue. Strictly speaking, the test requires an interval level variable. We will allow ordinal level variables with a caution.
- Skewness and kurtosis between -1.0 and +1.0?
  - Yes: continue
  - No: Number of cases in both groups is at least 30?
    - Yes: continue
    - No: Inappropriate application of a statistic
- Probability for Levene test of equality of population variances less than or equal to alpha?
  - Yes: Use ‘Equal variances not assumed’
  - No: Use ‘Equal variances assumed’
- Mean and standard deviation of both groups are correct?
  - No: False
  - Yes: continue
- One-tailed or two-tailed test?
  - Two-tailed: use the two-tailed significance as reported
  - One-tailed: Divide two-tailed significance by 2
- Probability for t-test less than or equal to alpha?
  - Yes: True (add caution for ordinal dependent variable)
  - No: False
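The decision sequence can also be expressed as a single function. This is a sketch under stated assumptions: the name `evaluate_problem` and all of its parameters are hypothetical, the caller is expected to pass the two-tailed significance from whichever row Levene's test selects, and the answer strings follow the wording of the homework answer options.

```python
def evaluate_problem(iv_dichotomous: bool, dv_level: str,
                     skewness: float, kurtosis: float, n1: int, n2: int,
                     stats_match: bool, p_two_tailed: float,
                     one_tailed: bool, alpha: float = 0.05) -> str:
    """Walk the decision sequence and return the homework answer."""
    incorrect = "Incorrect application of a statistic"
    if not iv_dichotomous:
        return incorrect
    if dv_level not in ("interval", "ordinal"):
        return incorrect
    # Normality: acceptable shape, or 30+ cases in both groups
    shape_ok = abs(skewness) <= 1.0 and abs(kurtosis) <= 1.0
    if not shape_ok and not (n1 >= 30 and n2 >= 30):
        return incorrect
    # The means and standard deviations in the statement must match the output
    if not stats_match:
        return "False"
    p = p_two_tailed / 2 if one_tailed else p_two_tailed
    if p > alpha:
        return "False"
    return "True with caution" if dv_level == "ordinal" else "True"


# The educ / wrkgovt problem worked above
print(evaluate_problem(True, "interval", -0.137, 1.246, 38, 217,
                       stats_match=True, p_two_tailed=0.080, one_tailed=True))
```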

This problem uses the data set GSS2000R.Sav to compare the average score on the variable "number of hours worked in the past week" [hrs1] for groups of survey respondents defined by the variable "self-employment" [wrkslf]. Using an independent samples t-test with an alpha of .05, is the following statement true, true with caution, false, or an incorrect application of a statistic?

Survey respondents who were self-employed worked significantly longer hours in the past week (M = 42.04, SD = 13.86) than survey respondents who were working for someone else (M = 40.55, SD = 12.46).

- True
- True with caution
- False
- Incorrect application of a statistic

The answer to this problem was False because the one-tailed significance was p = .29 (.583 ÷ 2), greater than the alpha of .05.

We can conduct a post-hoc power analysis to determine what number of cases would have been sufficient to have a better opportunity to find a statistically significant difference.

To calculate the effect size, and corresponding power, for this problem, we need a pooled estimate of the standard deviation for the two groups. SamplePower will calculate that for us once we enter the sample sizes, means, and standard deviations for the two groups.

The UT license for SPSS does not include SamplePower, the SPSS program for power analysis.

However, the program is available on the UT timesharing server.

Information about accessing this program is available at this site.

In the SamplePower program on the ITS Timesharing Systems, select the New… command from the File menu.

First, select the Means tab to access the tests for means.

Second, since we want to enter the means for our two groups, select the option button for t-test for 2 (independent) groups with common variance (Enter means).

Third, click on the Ok button to enter the specific values for our problem.

I want my entries to display two decimal places, instead of the default of 1, so I click on the Decimals displayed tool button.

First, click the up arrow button on the spinner for Decimals for data entry until 2 appears.

Second, click on the OK button to close the dialog box.

SamplePower sets the default test to a two-tailed test with an alpha of .05.

Since our test was a one-tailed test with an alpha of .05, we click on the text describing the default test in order to change it.

First, click on the 1 Tailed option on the Tails panel.

Second, click on the Ok button to change the test specifications.

- We enter the values from the SPSS output from the independent-samples t-test for the Population 1 group:
  - 42.04 for Population Mean
  - 13.86 for Standard Deviation
  - 26 for the N Per Group
- Note that SamplePower fills in the standard deviation and N Per Group numbers for Population 2 with the same values.

First, enter the population mean for the second group, 40.55.

When we click on the box to change the Standard Deviation, a message box appears. Since the standard deviations for our two groups are not the same, we click on the Yes button.

We are now able to enter the standard deviation for the second group, 12.46.

When we click on the box to change the N Per Group for the second group, a message box appears.

Since the number of cases for our two groups is not the same, we click on the Yes button.

We are now able to enter the N Per Group for the second group, 145.

Having entered the values for the two groups, we now click on the Compute button.

SamplePower tells us that our power to obtain statistical significance was 14%, which translates to a successful outcome in about 1 of every 7 tries.

With the mean difference of 1.49 and a pooled standard deviation of 12.68, we can use a calculator to compute the effect size of .12 (Cohen’s d), a little more than half of the .20 that would typically be characterized as a small effect.
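The pooled standard deviation, the effect size, and a rough power figure can be reproduced without SamplePower. The sketch below is not SamplePower's method: it uses a normal approximation to the one-tailed test, so its power estimate should land near, but not exactly on, the program's value. The function names are invented for this example.

```python
from math import sqrt
from statistics import NormalDist


def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled standard deviation of two groups."""
    return sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def approx_power(d: float, n1: int, n2: int, alpha: float = 0.05) -> float:
    """Approximate power of a one-tailed two-group test (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)                      # one-tailed critical value
    shift = abs(d) * sqrt(n1 * n2 / (n1 + n2))         # expected size of the statistic
    return 1 - z.cdf(z_crit - shift)


sd = pooled_sd(13.86, 26, 12.46, 145)
d = (42.04 - 40.55) / sd
power = approx_power(d, 26, 145)
print(f"pooled SD = {sd:.2f}, d = {d:.2f}, approximate power = {power:.0%}")
```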

Suppose, however, that even a very small effect of this size had important consequences. We can ask ourselves how large the sample would have needed to be in order to find a statistically significant effect.

To find the group sizes needed, select Find N for power of 80% from the Tools menu.

A dialog box appears. SamplePower needs additional information to know how it should increase the size of each group.

Click on the Yes button to link the group sample sizes.

First, assuming the proportion of cases in each of our groups was representative of the population, we mark the option button to Link Sample Size in two groups.

Second, using a calculator, I compute that group 2 was about 6 times larger than group 1 (145 ÷ 26 ≈ 5.6), so I increase the second spinner to 6.

Third, click OK to close the dialog box.

To find the group sizes needed, again select Find N for power of 80% from the Tools menu.

SamplePower indicates that we would have needed a total sample of 3,654 to detect this very small effect size in the population.
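A rough version of the Find N search can be sketched by increasing the group sizes, with the second group held at 6 times the first, until the approximate power reaches 80%. This is an illustration only: it relies on a normal approximation rather than SamplePower's own calculation, so the total it finds should be close to, but not necessarily identical to, the figure above. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(d: float, n1: int, n2: int, alpha: float = 0.05) -> float:
    """Approximate power of a one-tailed two-group test (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)
    shift = abs(d) * sqrt(n1 * n2 / (n1 + n2))
    return 1 - z.cdf(z_crit - shift)


def find_group_sizes(d: float, ratio: int,
                     target_power: float = 0.80, alpha: float = 0.05):
    """Smallest linked group sizes (n2 = ratio * n1) reaching the target power."""
    n1 = 2
    while approx_power(d, n1, ratio * n1, alpha) < target_power:
        n1 += 1
    return n1, ratio * n1


d = 1.49 / 12.68  # mean difference divided by pooled standard deviation
n1, n2 = find_group_sizes(d, ratio=6)
print(f"group 1: {n1}, group 2: {n2}, total: {n1 + n2}")
```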

This very small effect size would have to have very important consequences in order to justify the expense of collecting samples this large.