
# Independent Samples T-Test of Population Means - PowerPoint PPT Presentation

## PowerPoint Slideshow about ' Independent Samples T-Test of Population Means' - philip-herring

### Independent Samples T-Test of Population Means

Key Points about Statistical Test

Sample Homework Problem

Solving the Problem with SPSS

Logic for Independent Samples T-Test of Population Means

Power Analysis

• Purpose: test whether or not the populations represented by the two samples have different means

• Examples:

• Social work students have higher GPAs than nursing students

• Social work students volunteer for more hours per week than education majors

• UT social work students score higher on licensing exams than graduates of Texas State University

• Hypotheses:

• Null: mean of population 1 = mean of population 2

Versus

• Research: mean of population 1 < mean of population 2

• Research: mean of population 1 ≠ mean of population 2

• Research: mean of population 1 > mean of population 2

• Decision:

• Reject null hypothesis if p(SPSS) ≤ alpha (≠ relationship)

• Reject null hypothesis if p(SPSS) ÷ 2 ≤ alpha (< or > relationship)
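
The decision rule can be sketched in Python. This is a minimal illustration rather than anything SPSS produces: the function name `reject_null` and its parameters are hypothetical. The `direction_matches` flag is an added assumption, included because halving the two-tailed probability is only appropriate when the sample difference lies in the direction stated by the research hypothesis.

```python
def reject_null(p_two_tailed, alpha=0.05, one_tailed=False, direction_matches=True):
    """Return True when the null hypothesis of equal means is rejected.

    p_two_tailed is the two-tailed significance as reported by SPSS.
    """
    if not one_tailed:
        p = p_two_tailed
    elif direction_matches:
        # Sample difference is in the hypothesized direction: halve the probability.
        p = p_two_tailed / 2
    else:
        # Sample difference is in the opposite direction (added assumption).
        p = 1 - p_two_tailed / 2
    return p <= alpha


# Two-tailed significance of .080 evaluated both ways at alpha = .05
print(reject_null(0.080))                   # two-tailed research hypothesis
print(reject_null(0.080, one_tailed=True))  # one-tailed research hypothesis
```

The same probability can lead to different decisions depending on whether the research hypothesis is directional.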

• Variable is interval level (ordinal with caution)

• Variable is normally distributed

• Acceptable degree of skewness and kurtosis

or

• Using the Central Limit Theorem (30+ in each group)

• The variance of the two groups is not different (if different, use alternative formula)

• Cohen’s d measures difference in means in standard deviation units.

• Cohen’s d = (difference in population means) ÷ (population standard deviation)

• Interpretation:

• small: d = .20 to .50

• medium: d = .50 to .80

• large: d = .80 and higher
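
As a small sketch of the effect size and its interpretation, the Python below computes d and labels it. The function names are hypothetical, the input values in the usage lines are made up for illustration, and assigning the shared boundary values (.50 and .80) to the larger category is an assumption, since the ranges above overlap at those points.

```python
def cohens_d(mean_1, mean_2, sd):
    """Difference in means expressed in standard deviation units."""
    return (mean_1 - mean_2) / sd


def interpret_d(d):
    """Label an effect size using the ranges listed above."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:   # boundary assigned to the larger label (assumption)
        return "medium"
    if size >= 0.20:
        return "small"
    return "smaller than small"


# Illustrative values only, not taken from the data set
d = cohens_d(10.0, 8.0, 4.0)
print(d, interpret_d(d))
print(interpret_d(0.48))
```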

• An independent samples T-test is presented the same as the one-sample t-test:

t(75) = 2.11, p = .02 (one-tailed), d = .48

• Example: Survey respondents who were employed by the federal, state, or local government had significantly higher socioeconomic indices (M = 55.42, SD = 19.25) than survey respondents who were employed by a private employer (M = 47.54, SD = 18.94), t(255) = 2.363, p = .01 (one-tailed).

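In the statement t(75) = 2.11, p = .02 (one-tailed), d = .48, the elements are:

• t(75): degrees of freedom

• 2.11: value of statistic

• p = .02: significance of statistic

• (one-tailed): include if test is one-tailed

• d = .48: effect size if available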

Homework problems: Independent Samples T-Test of Population Means

This problem uses the data set GSS2000R.Sav to compare the average score on the variable "highest year of school completed" [educ] for groups of survey respondents defined by the variable "governmental employment" [wrkgovt]. Using an independent samples t-test with an alpha of .05, is the following statement true, true with caution, false, or an incorrect application of a statistic?

Survey respondents who were employed by the federal, state, or local government completed significantly more years of school (M = 13.97, SD = 3.27) than survey respondents who were employed by a private employer (M = 13.07, SD = 2.84).

• True

• True with caution

• False

• Incorrect application of a statistic

This is the general framework for the problems in the homework assignment on “Independent Samples T-Test of Population Means.” The description is similar to findings one might state in a research article.

Homework problems: Independent Samples T-Test - Data set, variables, and sample

• The first paragraph identifies:

• The data set to use, e.g. GSS2000R.Sav

• The groups that will be compared in the analysis

• The variable compared in the t-test

• The alpha level to use for the hypothesis test

Homework problems: Independent Samples T-Test - Specifications

• The second paragraph specifies:

• The sample means and standard deviation for the groups being compared

• The relationship for deriving the research hypothesis

Homework problems: Independent Samples T-Test - Choosing an answer

The answer to a problem will be True if the t-test supports the finding in the problem statement.

The answer to a problem will be True with caution if the t-test supports the finding in the problem statement, but the dependent variable is ordinal level.

• The answer to a problem will be Incorrect application of a statistic if

• the t-test violates the level of measurement requirement, i.e. the dependent variable is nominal

• the assumption of normality of the dependent variable is violated and the central limit theorem doesn’t apply

• the independent variable is not dichotomous

The answer to a problem will be False if the t-test does not support the finding in the problem statement.

Solving the problem with SPSS: Identifying numeric codes for groups - 1

Our first task in SPSS is to identify the numeric codes for the groups that SPSS will require us to specify.

The problem statement tells us “This problem uses the data set GSS2000R.Sav to compare the average score on the variable "highest year of school completed" [educ] for groups of survey respondents defined by the variable "governmental employment" [wrkgovt].”

Select the Variables… command from the Utilities menu.

NOTE: in our problems we required that the grouping, or independent variable, be dichotomous, because there are other statistical tests to use when there are more than two groups. SPSS does not require the independent variable to be dichotomous, but it does require that you enter the numeric codes for the two groups (possibly out of a larger number of groups) that you wish to compare.

Solving the problem with SPSS: Identifying numeric codes for groups - 2

Scroll through the list of variables until you see wrkgovt. Click on wrkgovt and the information for the variable appears in the panel to the right.

Click on Close to dismiss the dialog box.

The Variable Information panel shows us the text labels that the creator of the data set assigned to each of the possible numeric responses for this variable.

The numeric codes for the groups we want to compare are: 1 (GOVERNMENT) and 2 (PRIVATE).

The remaining numeric codes represent missing data:

0 (NAP),

8 (DK), and

9 (NA).

Solving the problem with SPSS: Level of measurement

Statistical tests of means require that the dependent variable be interval level. "Highest year of school completed" [educ] is interval level, satisfying the requirement.

In our analyses, we will allow the dependent variable to be ordinal, which violates this requirement in the strictest interpretation of level of measurement. However, since the research literature often computes means for ordinal level data, especially scaled measures, we will follow the convention of applying interval level statistics to ordinal data. Since not all analysts agree with this convention, a caution is added to any true findings.

Solving the problem with SPSS: Evaluating normality - 1

The independent samples t-test uses the t-distribution for the probability of the test statistic. To obtain accurate probabilities, the variable must follow a normal distribution.

We will generate descriptive statistics to evaluate normality.

Select the Descriptive Statistics > Descriptives… command from the Analyze menu.

Solving the problem with SPSS: Evaluating normality - 2

First, move the variable we will use in the t-test, educ, to the Variable(s) list box.

Second, click on the Options… button to select the statistics we want.

Solving the problem with SPSS: Evaluating normality - 3

First, in addition to the statistics SPSS has checked by default, mark the Kurtosis and Skewness check boxes on the Distribution panel.

Second, click on the Continue button to close the dialog box.

Solving the problem with SPSS: Evaluating normality - 4

Click on the OK button to obtain the output.

Solving the problem with SPSS: Evaluating normality - 5

"Highest year of school completed" [educ] did not satisfy the criteria for a normal distribution. The skewness of the distribution (-.137) was between -1.0 and +1.0, but the kurtosis of the distribution (1.246) fell outside the range from -1.0 to +1.0.

Having failed the normality requirement using these criteria, we will see if we can apply the central limit theorem.
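
The skewness and kurtosis check can be reproduced outside SPSS. The sketch below is an illustration under assumptions: it uses the bias-adjusted sample formulas common to statistical packages (assumed here to correspond to the SPSS output), and the function names are hypothetical. The final line applies the -1.0 to +1.0 rule to the values reported above for educ.

```python
import math


def skewness_kurtosis(values):
    """Bias-adjusted sample skewness and excess kurtosis (needs at least 4 cases)."""
    n = len(values)
    if n < 4:
        raise ValueError("at least 4 cases are required")
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    z3 = sum(((x - mean) / sd) ** 3 for x in values)
    z4 = sum(((x - mean) / sd) ** 4 for x in values)
    skew = n / ((n - 1) * (n - 2)) * z3
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    return skew, kurt


def looks_normal(skew, kurt, low=-1.0, high=1.0):
    """Apply the rule of thumb: both statistics must fall between low and high."""
    return low <= skew <= high and low <= kurt <= high


# Values reported for educ: skewness -.137, kurtosis 1.246
print(looks_normal(-0.137, 1.246))
```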

Solving the problem with SPSS: The independent-samples t-test - 1

The number of cases in each group is part of the output for the independent samples t-test, so we will go ahead and compute that test to continue addressing the issue of normality.

Select Compare Means > Independent-Samples T Test… from the Analyze menu.

Solving the problem with SPSS: The independent-samples t-test - 2

First, move the dependent variable educ to the Test Variable(s) list box.

Second, move the independent variable wrkgovt to the Grouping Variable text box.

Note that SPSS lists two question marks after the variable name and activates the Define Groups… button as its clue for what it wants us to do next. Click on the Define Groups button.

Solving the problem with SPSS: The independent-samples t-test - 3

First, type in the numeric codes for the groups in the wrkgovt variable that we looked up at the beginning of the problem.

Second, click on the Continue button to close the dialog box.

Solving the problem with SPSS: The independent-samples t-test - 4

Click on the OK button to close the dialog box.

Note that SPSS has replaced the question marks after the variable name with the numeric codes we typed in.

Solving the problem with SPSS: Evaluating normality with the central limit theorem - 6

Since 38 survey respondents were employed by the federal, state, or local government and 217 were employed by a private employer, the assumption of normality was satisfied by the Central Limit Theorem, which requires both groups to have 30 or more cases.

If we were unable to establish normality either by the distribution or by the central limit theorem, the t-test would not be an appropriate statistic.

Solving the problem with SPSS: Evaluating equality of group variances - 1

The independent-samples t-test assumes that the variances of the dependent variable for both groups are equal in the population. This assumption is evaluated with Levene's Test for Equality of Variances. The null hypothesis for this test states that the variances of both groups are equal. The desired outcome for this test is to fail to reject the null hypothesis, which is consistent with equal variances.

The probability associated with Levene's Test for Equality of Variances (.161) is greater than alpha (.05), indicating that the 'Equal variances assumed' formula for the independent samples t-test should be used for the analysis.

Solving the problem with SPSS: Evaluating equality of group variances - 2

Since we failed to reject the null hypothesis for Levene’s test, the 'Equal variances assumed' formula for the independent samples t-test should be used for the analysis.

Had the probability associated with Levene’s test been less than or equal to the alpha level, we would have used the statistics for the ‘Equal variances not assumed’ row in the table.
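
To make the two rows of the output concrete, the sketch below computes both versions of the t statistic from summary statistics alone. It is an illustration, not SPSS's implementation: the function names are hypothetical, the pooled formula corresponds to 'Equal variances assumed', and the Welch formula with Satterthwaite degrees of freedom is assumed to correspond to 'Equal variances not assumed'. The inputs are the group means, standard deviations, and case counts from this problem.

```python
import math


def pooled_t(m1, s1, n1, m2, s2, n2):
    """t statistic and degrees of freedom assuming equal group variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


def welch_t(m1, s1, n1, m2, s2, n2):
    """t statistic and approximate degrees of freedom without assuming equal variances."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (m1 - m2) / se, df


# Government (n = 38) versus private (n = 217) years of school completed
print(pooled_t(13.97, 3.27, 38, 13.07, 2.84, 217))
print(welch_t(13.97, 3.27, 38, 13.07, 2.84, 217))
```

Because the summary statistics are rounded to two decimals, the results may differ slightly from the SPSS table in the last digit.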

Solving the problem with SPSS: Answering the question - 1

The finding we are trying to verify is:

Survey respondents who were employed by the federal, state, or local government completed significantly more years of school (M = 13.97, SD = 3.27) than survey respondents who were employed by a private employer (M = 13.07, SD = 2.84).

Our first task is to make certain we have solved the right problem.

First, we check to make certain we have the correct groups in the output.

Second, we verify that the means and standard deviations for the groups match the problem statement.

Solving the problem with SPSS: Answering the question - 2

The finding we are trying to verify is:

Survey respondents who were employed by the federal, state, or local government completed significantly more years of school (M = 13.97, SD = 3.27) than survey respondents who were employed by a private employer (M = 13.07, SD = 2.84).

Since the problem states that the mean for one group is significantly higher than the mean of the other group, the research hypothesis is a one-tailed test.

We divide the SPSS 2-tailed significance (.080) in half and make our decision about the null hypothesis by comparing p = .04 to alpha = .05.
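
The one-tailed probability can also be approximated directly from the t statistic. The sketch below is an illustration using only the Python standard library: it integrates the t-distribution density numerically with Simpson's rule, the function names are hypothetical, and the degrees of freedom are taken as n1 + n2 - 2 from the group sizes of 38 and 217 (an assumption based on the equal-variances formula).

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def one_tailed_p(t, df, steps=2000):
    """Upper-tail probability P(T > |t|), via Simpson's rule on [0, |t|]."""
    upper = abs(t)
    h = upper / steps                     # steps must be even
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


p_one = one_tailed_p(1.761, 38 + 217 - 2)
print(round(p_one, 3), round(2 * p_one, 3))
```

The result should agree with half of the SPSS two-tailed significance up to rounding.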

Solving the problem with SPSS: Answering the question - 3

The answer to the question is True.

We can include the t-test results in our statement of the finding:

Survey respondents who were employed by the federal, state, or local government completed significantly more years of school (M = 13.97, SD = 3.27) than survey respondents who were employed by a private employer (M = 13.07, SD = 2.84), t(253) = 1.761, p = .04 (one-tailed).

Logic for independent-samples t-test: Level of measurement

Measurement level of independent variable?

• Dichotomous: go on to check the dependent variable

• Interval/ordinal/nominal: Inappropriate application of a statistic

Measurement level of dependent variable?

• Nominal/dichotomous: Inappropriate application of a statistic

• Interval/ordinal: go on to the assumption of normality

Strictly speaking, the test requires an interval level variable. We will allow ordinal level variables with a caution.

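Logic for independent-samples t-test: Assumption of normality

Skewness and kurtosis between -1.0 and +1.0?

• Yes: go on to the assumption of equality of variances

• No: check the number of cases in each group

Number of cases in both groups is at least 30?

• Yes: go on to the assumption of equality of variances

• No: Inappropriate application of a statistic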

Logic for independent-samples t-test: Assumption of equality of variances

Probability for Levene test of equality of population variances less than or equal to alpha?

• Yes: use ‘Equal variances not assumed’

• No: use ‘Equal variances assumed’

Logic for independent-samples t-test: Means and standard deviations correct

Mean and standard deviation of both groups are correct?

• No: False

• Yes: go on to the decision about the null hypothesis

Logic for independent-samples t-test: Decision about null hypothesis

One-tailed or two-tailed test?

• Two-tailed: use the two-tailed significance as reported

• One-tailed: divide two-tailed significance by 2

Probability for t-test less than or equal to alpha?

• Yes: True (add caution for ordinal dependent variable)

• No: False
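
The full decision logic can be collected into one function. This is a sketch of the homework rules as described in these slides, not a general-purpose test procedure; the function name, parameter names, and level labels ("interval", "ordinal", "nominal") are hypothetical. The usage lines plug in the values from the worked problem on years of school completed.

```python
def evaluate_statement(iv_is_dichotomous, dv_level, skewness, kurtosis,
                       n1, n2, levene_p, p_equal_var, stats_match,
                       one_tailed, alpha=0.05, p_unequal_var=None):
    """Return the homework answer for an independent-samples t-test problem."""
    wrong = "Incorrect application of a statistic"
    # Level of measurement
    if not iv_is_dichotomous or dv_level not in ("interval", "ordinal"):
        return wrong
    # Normality: skewness/kurtosis rule first, then the central limit theorem
    shape_ok = -1.0 <= skewness <= 1.0 and -1.0 <= kurtosis <= 1.0
    if not shape_ok and min(n1, n2) < 30:
        return wrong
    # Means and standard deviations must match the problem statement
    if not stats_match:
        return "False"
    # Levene's test selects which row of the output supplies the probability
    if levene_p <= alpha:
        if p_unequal_var is None:
            raise ValueError("the 'Equal variances not assumed' probability is required")
        p = p_unequal_var
    else:
        p = p_equal_var
    if one_tailed:
        p = p / 2
    if p > alpha:
        return "False"
    return "True with caution" if dv_level == "ordinal" else "True"


# Worked problem: educ by wrkgovt
print(evaluate_statement(True, "interval", -0.137, 1.246, 38, 217,
                         levene_p=0.161, p_equal_var=0.080,
                         stats_match=True, one_tailed=True))
```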

This problem uses the data set GSS2000R.Sav to compare the average score on the variable "number of hours worked in the past week" [hrs1] for groups of survey respondents defined by the variable "self-employment" [wrkslf]. Using an independent samples t-test with an alpha of .05, is the following statement true, true with caution, false, or an incorrect application of a statistic?

Survey respondents who were self-employed worked significantly longer hours in the past week (M = 42.04, SD = 13.86) than survey respondents who were working for someone else (M = 40.55, SD = 12.46).

1 True

2 True with caution

3 False

4 Incorrect application of a statistic

The answer to this problem was false because the probability for the t-test was .29 (one-tailed), greater than the alpha of 0.05.

We can conduct a post-hoc power analysis to determine what number of cases would have been sufficient to have a better opportunity to find a statistically significant difference.

Power Analysis: Statistical Results for False Independent-samples T-test - 1

The answer to the problem was false because the one-tailed significance was p = .29 (.583 ÷ 2), greater than the alpha of .05.

Power Analysis: Statistical Results for False Independent-samples T-test - 2

To calculate the effect size, and corresponding power, for this problem, we need a pooled estimate of the standard deviation for the two groups. SamplePower will calculate that for us once we enter the sample sizes, means, and standard deviations for the two groups.

The UT license for SPSS does not include SamplePower, the SPSS program for power analysis.

However, the program is available on the UT timesharing server.

Information about accessing this program is available at this site.

Power Analysis for False Independent-samples T-test - 1

In the SamplePower program on the ITS Timesharing Systems, select the New… command from the File menu.

Power Analysis for False Independent-samples T-test - 2

First, select the Means tab to access the tests for means.

Second, since we want to enter the means for our two groups, select the option button for t-test for 2 (independent) groups with common variance (Enter means).

Third, click on the Ok button to enter the specific values for our problem.

Power Analysis for False Independent-samples T-test - 3

I want my entries to display two decimal places, instead of the default of 1, so I click on the Decimals displayed tool button.

Power Analysis for False Independent-samples T-test - 4

First, click the up arrow button on the spinner for Decimals for data entry until 2 appears.

Second, click on the OK button to close the dialog box.

Power Analysis for False Independent-samples T-test - 5

SPSS sets the default test to a two-tailed test with an alpha of .05.

Since our test was a one-tailed test with an alpha of .05, we click on the text specified as the SPSS default.

Power Analysis for False Independent-samples T-test - 6

First, click on the 1 Tailed option on the Tails panel.

Second, click on the Ok button to change the test specifications.

Power Analysis for False Independent-samples T-test - 7

• We enter the values from the SPSS output from the independent-samples t-test for the Population 1 group:

• 42.04 for Population Mean

• 13.86 for Standard Deviation

• 26 for the N Per Group

• Note that SPSS fills in the standard deviation and N Per Group numbers for Population 2 with the same values.

Power Analysis for False Independent-samples T-test - 8

First, enter the population mean for the second group, 40.55.

When we click on the box to change the Standard Deviation, this message appears. Since the standard deviation for our two groups is not the same, we click on the Yes button.

Power Analysis for False Independent-samples T-test - 9

We are now able to enter the standard deviation for the second group, 12.46.

Power Analysis for False Independent-samples T-test - 10

When we click on the box to change the N Per Group for the second group, a message box appears.

Since the number of cases for our two groups is not the same, we click on the Yes button.

Power Analysis for False Independent-samples T-test - 11

We are now able to enter the N Per Group for the second group, 145.

Having entered the values for the two groups, we now click on the Compute button.

Power Analysis for False Independent-samples T-test - 12

SamplePower tells us that our power to obtain statistical significance was 14%, translating to a possible successful outcome 1 in 7 tries.
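
The power figure can be approximated without SamplePower. The sketch below is an illustration, not SamplePower's method: it replaces the noncentral t distribution with a normal approximation, so it will only land close to the program's value. The function name is hypothetical; the inputs are the means, standard deviations, and group sizes entered above.

```python
import math
from statistics import NormalDist


def approx_power_one_tailed(m1, s1, n1, m2, s2, n2, alpha=0.05):
    """Approximate power of a one-tailed independent-samples t-test."""
    # Pooled standard deviation and Cohen's d
    pooled_sd = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    d = abs(m1 - m2) / pooled_sd
    # Expected size of the test statistic if the effect is real
    delta = d * math.sqrt(n1 * n2 / (n1 + n2))
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # Normal approximation to the probability of exceeding the critical value
    return NormalDist().cdf(delta - z_crit)


power = approx_power_one_tailed(42.04, 13.86, 26, 40.55, 12.46, 145)
print(round(power, 2))
```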

Power Analysis for False Independent-samples T-test - 13

With the mean difference of 1.49 and a pooled standard deviation of 12.68, we can use a calculator to compute the effect size of .12 (Cohen’s d), about half of what would be typically characterized as a small effect.

Suppose, however, that even a very small effect of this size had important consequences. We can ask ourselves how large the sample would need to have been in order to find a statistically significant effect.
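
The calculator arithmetic behind the effect size can be written out as follows. This is a minimal sketch with hypothetical function names; it pools the two group standard deviations weighted by their degrees of freedom and divides the mean difference by the result.

```python
import math


def pooled_sd(s1, n1, s2, n2):
    """Pooled standard deviation of two groups."""
    return math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))


# Self-employed (n = 26) versus working for someone else (n = 145)
mean_difference = 42.04 - 40.55
sd = pooled_sd(13.86, 26, 12.46, 145)
d = mean_difference / sd

print(round(mean_difference, 2), round(sd, 2), round(d, 2))
```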

Power Analysis for False Independent-samples T-test - 14

To find the group sizes needed, select Find N for power of 80% from the Tools menu.

Power Analysis for False Independent-samples T-test - 15

This dialog box appears. SamplePower will need additional information to know how it should increase the size of each group.

Click on the Yes button to link the group sample sizes.

Power Analysis for False Independent-samples T-test - 16

First, assuming the proportion of cases in each of our groups was representative of the population, we mark the option button to Link Sample Size in two groups.

Second, using a calculator, I compute that group 2 was about 6 times larger than group 1, so I increase the second spinner to 6.

Third, click OK to close the dialog box.

Power Analysis for False Independent-samples T-test - 17

To find the group sizes needed, again select Find N for power of 80% from the Tools menu.

Power Analysis for False Independent-samples T-test - 18

SamplePower indicates that we would have needed a total sample of 3,654 to detect this very small effect size in the population.
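
A rough check on the required sample size is possible with a normal approximation. The sketch below is an illustration and not SamplePower's algorithm, so its answer should be close to, but not necessarily identical with, the program's total. The function name is hypothetical; d is taken as 1.49 ÷ 12.68, the test is one-tailed at alpha = .05, and group 2 is fixed at 6 times the size of group 1 as in the linked sample sizes above.

```python
import math
from statistics import NormalDist


def n_for_power(d, ratio=1.0, power=0.80, alpha=0.05):
    """Approximate group sizes (n1, n2 = ratio * n1) for a one-tailed test."""
    z = NormalDist()
    # Size the test statistic must reach, on average, to achieve the target power
    delta = z.inv_cdf(1 - alpha) + z.inv_cdf(power)
    n1 = math.ceil((delta / d) ** 2 * (1 + ratio) / ratio)
    n2 = math.ceil(ratio * n1)
    return n1, n2


n1, n2 = n_for_power(1.49 / 12.68, ratio=6)
print(n1, n2, n1 + n2)
```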

This very small effect size would have to have very important consequences in order to justify the expense of collecting samples this large.