
# Analyzing Data using SPSS



### t-test

The t-test is used in a variety of situations involving interval and ratio variables. It comes in two forms:

• Independent-samples

• Dependent-samples (paired)

Independent-Samples T-Test
• What it does: The Independent Samples T Test compares the mean scores of two groups on a given variable.
Where to find it: Under the Analyze menu, choose Compare Means, then Independent Samples T Test. Move your dependent variable into the box marked "Test Variable." Move your independent variable into the box marked "Grouping Variable." Click on the box marked "Define Groups" and specify the value labels of the two groups you wish to compare.
Assumptions:
- The dependent variable is normally distributed. You can check for normal distribution with a Q-Q plot.
- The two groups have approximately equal variance on the dependent variable. You can check this by looking at Levene's Test. See below.
- The two groups are independent of one another.
Hypotheses:
- Null: The means of the two groups are not significantly different.
- Alternate: The means of the two groups are significantly different.
SPSS Output
• Following is a sample output of an independent samples T test. We compared the mean blood pressure of patients who received a new drug treatment vs. those who received a placebo (a sugar pill).
First, we see the descriptive statistics for the two groups. We see that the mean for the "New Drug" group is higher than that of the "Placebo" group. That is, people who received the new drug have, on average, higher blood pressure than those who took the placebo.


• Finally, we see the results of the Independent Samples T Test. Read the TOP line if the variances are approximately equal. Read the BOTTOM line if the variances are not equal. Based on the results of our Levene's test, we know that we have approximately equal variance, so we will read the top line.
• Our T value is 3.796.
• We have 10 degrees of freedom.
• There is a significant difference between the two groups (the significance is less than .05).
• Therefore, we can say that there is a significant difference between the New Drug and Placebo groups. People who took the new drug had significantly higher blood pressure than those who took the placebo.
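The "top line" and "bottom line" of the output correspond to two ways of computing the t statistic. The following is a minimal Python sketch of both, using only the standard library. The two samples are hypothetical values chosen for easy arithmetic (they are not the blood pressure data behind the output above), and the function names `pooled_t` and `welch_t` are illustrative, not SPSS terms.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Equal-variances t statistic and df (the 'top line')."""
    na, nb = len(a), len(b)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2


def welch_t(a, b):
    """Unequal-variances t statistic and df (the 'bottom line')."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical scores for two independent groups (assumed for illustration).
placebo = [1, 2, 3, 4, 5]
new_drug = [3, 4, 5, 6, 7]

t_top, df_top = pooled_t(new_drug, placebo)
t_bottom, df_bottom = welch_t(new_drug, placebo)
print(f"equal variances assumed:     t = {t_top:.3f}, df = {df_top}")
print(f"equal variances not assumed: t = {t_bottom:.3f}, df = {df_bottom:.1f}")
```

With equal group sizes and equal sample variances, as in this toy data, the two lines agree; they diverge as the variances or group sizes become more unequal.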
Example: Independent-samples t-test
• A study to determine the effectiveness of an integrated statistics/experimental methods course as opposed to the traditional method of taking the two courses separately was conducted.
• It was hypothesized that the students taking the integrated course would conduct better quality research projects than students in the traditional courses as a result of their integrated training.
• Ho: There is no difference in student performance as a result of the integrated versus traditional courses.
• H1: Students taking the integrated course would conduct better quality research projects than students in the traditional courses.
Output SPSS
• Students taking the integrated course would conduct better quality research projects than students in the traditional courses.
Exercise 1
• The following data were obtained in an experiment designed to check whether there is a systematic difference in the weights (in grams) obtained with two different scales.
Use the 0.01 level of significance to test whether the difference between the means of the weights obtained with the two scales is significant.
• Ho: There is no significant difference between the means of the weights obtained with the two scales.
• H1: There is a significant difference between the means of the weights obtained with the two scales.
Exercise 2
• The following are the scores for random samples of size ten taken from a large group of trainees instructed by the two methods.
• Method 1: teaching machine as well as some personal attention by an instructor
• Method 2: straight teaching-machine instruction

What can we conclude about the claim concerning the average amount by which the personal attention of an instructor will improve a trainee’s score? Use α = 5%.

Paired Samples T Test
• What it does: The Paired Samples T Test compares the means of two variables. It computes the difference between the two variables for each case, and tests to see if the average difference is significantly different from zero.
• Where to find it: Under the Analyze menu, choose Compare Means, then choose Paired Samples T Test. Click on both variables you wish to compare, then move the pair of selected variables into the Paired Variables box.
• Assumption: Both variables should be normally distributed (strictly, it is the paired differences that need to be approximately normal). You can check for normal distribution with a Q-Q plot.
• Hypotheses:
- Null: There is no significant difference between the means of the two variables.
- Alternate: There is a significant difference between the means of the two variables.
SPSS Output
• Following is sample output of a paired samples T test. We compared the mean test scores before (pre-test) and after (post-test) the subjects completed a test preparation course. We want to see if our test preparation course improved people's score on the test.
First, we see the descriptive statistics for both variables.
• The post-test mean scores are higher than pre-test scores.
Next, we see the correlation between the two variables.
• There is a strong positive correlation. People who did well on the pre-test also did well on the post-test.
Finally, we see the results of the Paired Samples T Test. Remember, this test is based on the difference between the two variables. Under "Paired Differences" we see the descriptive statistics for the difference between the two variables.
To the right of the Paired Differences, we see the t, degrees of freedom, and significance.

The t value = -2.171

We have 11 degrees of freedom

Our significance is .053

If the significance value is less than .05, there is a significant difference. If the significance value is greater than .05, there is no significant difference.

Here, the significance value (.053) approaches but does not reach the .05 level, so the difference is not statistically significant. We cannot conclude that pre- and post-test scores differ, so we have no evidence that our test preparation course helped.

Example
• Twenty first-grade children and their parents were selected for a study to determine whether a seminar instructing on inductive parenting techniques improves social competency in children. The parents attended the seminar for one month. The children were tested for social competency before the course began and were retested six months after the completion of the course.
Hypothesis
• Ho: There is no significant difference between the means of pre- and post-seminar social competency scores.
• In other words, the parenting seminar has no effect on child social competency scores.

There is a strong positive correlation. Children who did well on the pre-test also did well on the post-test.

There is a significant difference between pre- and post-test scores. The parenting seminar has an effect on child social competency scores!

Exercise 3
• The table below shows the reading speeds (words per minute) of 20 students before and after following a particular method intended to improve reading.
Using a 0.05 level of significance, test the claim that the method is effective in improving reading.
Exercise 4
• The table below shows the weight of seven subjects before and after following a particular diet for two months.

| Subject | A | B | C | D | E | F | G |
|---|---|---|---|---|---|---|---|
| After | 156 | 165 | 196 | 198 | 167 | 199 | 164 |
| Before | 149 | 156 | 194 | 203 | 153 | 201 | 152 |

• Using a 0.01 level of significance, test the claim that the diet is effective in reducing weight.
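As a cross-check on what SPSS reports under "Paired Differences", the paired t statistic can be computed by hand from the per-subject differences. The sketch below uses the Exercise 4 weights as given and only the Python standard library. It stops at the t statistic and degrees of freedom, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Exercise 4 data, in subject order A..G.
after = [156, 165, 196, 198, 167, 199, 164]
before = [149, 156, 194, 203, 153, 201, 152]

# The paired test works on the per-subject differences (after minus before).
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)

mean_diff = mean(diffs)
se_diff = stdev(diffs) / sqrt(n)   # standard error of the mean difference
t = mean_diff / se_diff
df = n - 1

print(f"differences = {diffs}")
print(f"mean difference = {mean_diff:.3f}, t = {t:.3f}, df = {df}")
```

With the rows as labelled, the mean of after minus before is positive, so the direction of the difference matters when judging a one-sided claim such as "the diet reduces weight".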
One-Way ANOVA
• Similar to a t-test, in that it is concerned with differences in means, but the test can be applied to two or more means.
• The test is usually applied to interval and ratio data types. For example, differences between two factors (1 and 2).
• The test can be undertaken using the Analyze - Compare Means - One-Way ANOVA menu items; then select the appropriate variables.
• You will observe the One-Way ANOVA for factor 1 and factor 2.
Procedure
• 1. You will need one column of group codes labelling which group your data belongs to. The codes need to be numerical, but can be labelled with text.
• 2. You will also need a column containing the data points or scores you wish to analyze.
• 3. Select One-way ANOVA from the Analyze and Compare Means menus.
• 4. Click on your dependent variables (data column) and click on the top arrow so that the selected column appears in the dependent list box.
• 5. Click on your code column (your condition labels) and click on the bottom arrow so that the selected column appears in the factor box.
• 6. Click on Post Hoc if you wish to perform post-hoc tests (optional).
• 7. Choose the type of post-hoc test(s) you wish to perform by clicking in the small box next to your choice until a tick appears. Tukey's and Scheffe's tests are commonly used.
• 8. Click on Dunnett to perform a Dunnett's test, which allows you to compare experimental groups with a control group. Choose whether your control category is the first or last code entered in your code column.
The main output table is labelled ANOVA. The F-ratio of the ANOVA, the degrees of freedom and the significance are all displayed. The top value of the df column is the df of the factor, the bottom value is the df of the error term.
• Tukey's test will also try to find combinations of similar groups or conditions.
• In the Score table there will be one column for each pair of conditions that are shown to be 'similar'. The mean of each condition within the pair are given in the appropriate column. The p-value for the difference between the means of each pair of groups is given at the bottom of the appropriate column.
Example – one-way ANOVA
• We would like to determine whether the scores on a test of aggression are different across 4 groups of children (each with 5 subjects)
• Each group of children has been exposed to a different amount of time watching cartoons depicting ‘toon violence’.

Test whether the groups have the same mean, given the following sample results.

Exercise 5
• At the same time each day, a researcher records the temperature in each of three greenhouses. The table shows the temperatures in degree Fahrenheit recorded for one week.

| Greenhouse #1 | Greenhouse #2 | Greenhouse #3 |
|---|---|---|
| 73 | 71 | 61 |
| 72 | 69 | 63 |
| 73 | 72 | 62 |
| 66 | 72 | 61 |
| 68 | 65 | 60 |
| 71 | 73 | 62 |
| 72 | 71 | 59 |

Use a 0.05 significance level to test the claim that the average temperature is the same in each greenhouse.
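The F-ratio and the two df values in the ANOVA table can be reproduced from the between-groups and within-groups sums of squares. The following is a small sketch in standard-library Python using the Exercise 5 temperatures. The function name `one_way_anova` is illustrative, and the sketch does not compute the significance value.

```python
from statistics import mean

# Exercise 5 data: one list of daily temperatures per greenhouse.
greenhouses = [
    [73, 72, 73, 66, 68, 71, 72],
    [71, 69, 72, 72, 65, 73, 71],
    [61, 63, 62, 61, 60, 62, 59],
]


def one_way_anova(samples):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for s in samples for x in s]
    grand_mean = mean(all_values)
    # Between-groups: how far each group mean lies from the grand mean.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Within-groups (error): spread of the values around their own group mean.
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


f_ratio, df_between, df_within = one_way_anova(greenhouses)
print(f"F({df_between}, {df_within}) = {f_ratio:.3f}")
```

The top df value (factor) is the number of groups minus one, and the bottom df value (error) is the number of observations minus the number of groups, matching the layout of the ANOVA table described above.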

### Nonparametric Test

Sign Test
• A sign test compares the number of positive and negative differences between related conditions.
Procedure
• 1. You should have data in two or more columns, one for each condition tested.
• 2. Select 2 Related Samples from the Analyze - Nonparametric Tests menu.
• 3. Click on the first variable in the pair and the second variable in the pair.
• 4. The names of the variables appear in the current selections section of the dialogue box.
• 5. Click on the central selection arrow when you are happy with the variable pair selection.
• 6. The chosen pair appears in the Test Pair(s) List.
• 7. Make sure the Sign box is ticked and remove the tick from the Wilcoxon box.
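The counting behind the sign test is simple enough to sketch directly. The Python below (standard library only) counts positive and negative differences, drops ties, and computes an exact two-sided binomial p-value. The `before` and `after` lists are hypothetical values invented for illustration, and `sign_test` is an illustrative name; treat the p-value calculation as one common convention, not necessarily the exact figure SPSS prints.

```python
from math import comb


def sign_test(first, second):
    """Return (positives, negatives, two-sided exact p-value)."""
    diffs = [b - a for a, b in zip(first, second)]
    positives = sum(d > 0 for d in diffs)
    negatives = sum(d < 0 for d in diffs)
    n = positives + negatives          # ties (zero differences) are dropped
    k = min(positives, negatives)
    # Under the null hypothesis each sign is equally likely (p = 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return positives, negatives, min(1.0, 2 * tail)


# Hypothetical paired scores (assumed for illustration only).
before = [10, 12, 9, 14, 11, 13, 8, 15]
after = [12, 15, 9, 17, 10, 16, 11, 18]

positives, negatives, p_value = sign_test(before, after)
print(f"positive = {positives}, negative = {negatives}, p = {p_value:.3f}")
```

Only the direction of each difference is used, not its size, which is why the sign test is less powerful than the Wilcoxon test on the same pairs.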
Example
• The data in the table on the next slide are matched pairs of heights obtained from a random sample of 12 male statistics students. Each student reported his height, then his height was measured. Use a 0.05 significance level to test the claim that there is no difference between reported height and measured height.
Reported and measured heights of male statistics students

Ho: There is no significant difference between reported heights and measured heights.

H1: There is a difference.

Output

Reject Ho. There is sufficient evidence to reject the claim that there is no significant difference between the reported and measured heights.

Exercise 6
• Listed here are the right- and left-hand reaction times collected from 14 right-handed subjects. Use a 0.05 significance level to test the claim of no difference between the right- and left-hand reaction times.
Wilcoxon
• The Wilcoxon test is used with two columns of non-parametric related (linked) data.
• Either one person has taken part in two conditions or paired participants (e.g. brother and sister) have taken part in the same condition.
• This is the non-parametric equivalent of the paired-samples t-test.
Procedure
• 1. Put your data in two or more columns, one for each condition tested.
• 2. Select 2 Related Samples from the Analyze - Nonparametric Tests menu.
• 3. Click on the first variable in the pair.
• 4. Click on the second variable in the pair.
• 5. Make sure the Wilcoxon box is ticked.
• The Ranks table produced in the output window summarises the ranking process.
• In the Test Statistics table the Z statistic is the result of the Wilcoxon test.
• The p-value for this statistic is shown below it. This is the two tailed significance.
Example
• Use the previous data to test the claim that there is no difference between reported heights and measured heights using Wilcoxon test at 0.05 significance level.
Output

Reject Ho. There is sufficient evidence to reject the claim that there is no difference between reported and measured heights.

Mann-Whitney
• The Mann-Whitney test is used with two columns of independent (unrelated) non-parametric data. This is the non-parametric equivalent of the independent samples t-test.
Procedure
• 1. Put all of your measured data into one column.
• 2. Make a second column that contains codes to indicate the group from which each value was obtained.
• 3. Select 2 Independent Samples from the Analyze - Nonparametric Tests menu.
• 4. Select the column containing the data you want to analyse and click the top arrow.
• 5. Select the Grouping Variable - the column which contains your group codes - and click the bottom arrow.
• 6. Make sure the Mann-Whitney U option is selected.
The output is produced in the output window.
• The top table summarises the ranking process.
• The result of the Mann-Whitney test is given at the top of the Test Statistics table.
• The two-tailed significance of the result is given in the same table.
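To make the "ranking process" in the top table concrete, here is a minimal standard-library Python sketch of how a Mann-Whitney U value is obtained from pooled ranks. The two groups are hypothetical values chosen to include one tie, and the helper names `average_ranks` and `mann_whitney_u` are illustrative. The sketch stops at U and does not compute the significance.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    first_position = {}
    count = {}
    for position, v in enumerate(sorted(values), start=1):
        first_position.setdefault(v, position)
        count[v] = count.get(v, 0) + 1
    return [first_position[v] + (count[v] - 1) / 2 for v in values]


def mann_whitney_u(group_a, group_b):
    """Return (U, U_a, U_b), where U is the smaller of the two U values."""
    ranks = average_ranks(group_a + group_b)      # rank the pooled data
    rank_sum_a = sum(ranks[:len(group_a)])
    n_a, n_b = len(group_a), len(group_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b), u_a, u_b


# Hypothetical measurements for two independent groups (illustration only).
group_a = [3, 5, 8]
group_b = [5, 9, 10, 12]

u, u_a, u_b = mann_whitney_u(group_a, group_b)
print(f"U = {u}, U_a = {u_a}, U_b = {u_b}")
```

A U value near zero means the two groups barely overlap in rank; a value near half of the product of the group sizes means they are thoroughly interleaved.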
Example
• One study used x-ray computed tomography (CT) to collect data on brain volumes for a group of patients with obsessive-compulsive disorders and a control group of healthy persons. The following data shows sample results (in mm) for volumes of the right cordate.
Kruskal-Wallis
• Examines differences between 3 or more independent groups or conditions.
Procedure
• 1. Put all your measured data into one column.
• 2. Make a second column that contains codes to indicate the group from which each value was obtained.
• 3. Select K Independent Samples from the Analyze - Nonparametric Tests menu.
• 4. Select the column containing the data you want to analyse and click the top arrow.
• 5. Select the grouping variable, the column that contains your group codes, then click on the bottom arrow.
• 6. Make sure the Kruskal-Wallis box is checked.
• In the output window the chi-square statistic is shown in the test statistic section, as is the p-value.
Example
• We would like to determine whether the scores on a test of Spanish are different across three different methods of learning
• Method 1: classroom instruction and language laboratory
• Method 2: only classroom instruction
• Method 3: only self-study in language laboratory
The following are the final examination scores of samples of students from the three groups:

Method 1 : 94 88 91 74 86 97

Method 2 : 85 82 79 84 61 72 80

Method 3 : 89 67 72 76 69

At the 0.05 level of significance, test the null hypothesis that the populations sampled are identical.
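The chi-square statistic that SPSS reports for this test is the Kruskal-Wallis H value. The sketch below computes it for the Spanish scores above in standard-library Python, including the usual correction for tied scores. The helper names are illustrative. The p-value line relies on the fact that, for 2 degrees of freedom (three groups), the chi-square upper tail is exactly exp(-H/2); this is the large-sample approximation and the shortcut does not carry over to other group counts.

```python
from math import exp


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    first_position = {}
    count = {}
    for position, v in enumerate(sorted(values), start=1):
        first_position.setdefault(v, position)
        count[v] = count.get(v, 0) + 1
    return [first_position[v] + (count[v] - 1) / 2 for v in values]


def kruskal_wallis(samples):
    """Return (tie-corrected H, degrees of freedom)."""
    pooled = [x for s in samples for x in s]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for s in samples:
        rank_sum = sum(ranks[start:start + len(s)])
        total += rank_sum ** 2 / len(s)
        start += len(s)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n) over tied groups.
    tie_counts = {}
    for v in pooled:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in tie_counts.values()) / (n ** 3 - n)
    return h / correction, len(samples) - 1


method_1 = [94, 88, 91, 74, 86, 97]
method_2 = [85, 82, 79, 84, 61, 72, 80]
method_3 = [89, 67, 72, 76, 69]

h, df = kruskal_wallis([method_1, method_2, method_3])
p_value = exp(-h / 2)   # chi-square upper tail, valid for df = 2 only
print(f"H = {h:.3f}, df = {df}, p = {p_value:.4f}")
```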

Exercise 7
• The following are the miles per gallon which a test driver got in random samples of six tankfuls of each of three kinds of gasoline:
• Gasoline 1 : 30 15 32 27 24 29
• Gasoline 2 : 17 28 20 33 32 22
• Gasoline 3 : 19 23 32 22 18 25
• Test the claim that there is no difference in the true average mileage yield of the three kinds of gasoline. (use 0.05 level of significance)

### Pearson's Correlation

Pearson's correlation is a parametric test for the strength of the relationship between pairs of variables.

What it does: The Pearson R correlation tells you the magnitude and direction of the association between two variables that are on an interval or ratio scale.
Where to find it: Under the Analyze menu, choose Correlate, then Bivariate. Move the variables you wish to correlate into the "Variables" box. Under "Correlation Coefficients," be sure that the "Pearson" box is checked.
Assumption: Both variables are normally distributed. You can check for normal distribution with a Q-Q plot.
Hypotheses:
- Null: There is no association between the two variables.
- Alternate: There is an association between the two variables.
SPSS Output
• Following is a sample output of a Pearson R correlation between the Rosenberg Self-Esteem Scale and the Assessing Anxiety Scale.

SPSS creates a correlation matrix of the two variables. All the information we need is in the cell that represents the intersection of the two variables

SPSS gives us three pieces of information:
- the correlation coefficient
- the significance
- the number of cases (N)

The correlation coefficient is a number between +1 and -1. This number tells us about the magnitude and direction of the association between two variables.
• The MAGNITUDE is the strength of the correlation. The closer the correlation is to either +1 or -1, the stronger the correlation. If the correlation is 0 or very close to zero, there is no association between the two variables. Here, we have a moderate correlation (r = -.378).
The DIRECTION of the correlation tells us how the two variables are related. If the correlation is positive, the two variables have a positive relationship (as one increases, the other also increases). If the correlation is negative, the two variables have a negative relationship (as one increases, the other decreases). Here, we have a negative correlation (r = -.378). As self-esteem increases, anxiety decreases
Example
• The following data were obtained in a study of the relationship between the resistance (ohms) and the failure time (minutes) of certain overloaded resistors.
• Resistance 48 28 33 40 36 39 46 40 30 42 44 48 39 34 47
• Failure time 45 25 39 45 36 35 36 45 34 39 51 41 38 32 45
• Test whether there is a significant correlation between resistance and failure time (the null hypothesis being that there is no correlation).
Output SPSS

There is significant positive correlation between resistance and failure time, indicating that failure time increases as resistance increases.
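The correlation coefficient in the SPSS matrix can be checked by hand. Here is a minimal standard-library Python sketch for the resistor data above. It computes Pearson's r and the t statistic commonly used to test r against zero, with n - 2 degrees of freedom; `pearson_r` is an illustrative name, and the significance value itself is not computed.

```python
from math import sqrt
from statistics import mean

resistance = [48, 28, 33, 40, 36, 39, 46, 40, 30, 42, 44, 48, 39, 34, 47]
failure_time = [45, 25, 39, 45, 36, 35, 36, 45, 34, 39, 51, 41, 38, 32, 45]


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(resistance, failure_time)
n = len(resistance)
# Test statistic for H0: no correlation, referred to a t table with n - 2 df.
t = r * sqrt((n - 2) / (1 - r ** 2))
print(f"r = {r:.3f}, t = {t:.3f}, df = {n - 2}")
```

The sign of r gives the direction and its absolute value the magnitude, in the same way as described for the self-esteem and anxiety output.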

Exercise 8
• An aerobics instructor believes that regular aerobic exercise is related to greater mental acuity, stress reduction, high self-esteem, and greater overall life satisfaction.
• She asked a random sample of 30 adults to fill out a series of questionnaires.
• The results are as follows. Test whether there is a significant correlation between aerobic exercise and high self-esteem.
The Spearman Rho correlation
• What it does: The Spearman Rho correlation tells you the magnitude and direction of the association between two variables that are on an interval or ratio scale. Because it is computed on ranks, it can also be used with ordinal data.
• Where to find it: Under the Analyze menu, choose Correlate, then Bivariate. Move the variables you wish to correlate into the "Variables" box. Under "Correlation Coefficients," be sure that the "Spearman" box is checked.
• Assumption: Both variables are NOT normally distributed. You can check for normal distribution with a Q-Q plot. If the variables are normally distributed, use a Pearson R correlation.
• Hypotheses:
- Null: There is no association between the two variables.
- Alternate: There is an association between the two variables.
SPSS Output
• Following is a sample output of a Spearman Rho correlation between the Rosenberg Self-Esteem Scale and the Assessing Anxiety Scale.
SPSS creates a correlation matrix of the two variables. All the information we need is in the cell that represents the intersection of the two variables.
• SPSS gives us three pieces of information:
- the correlation coefficient
- the significance
- the number of cases (N)
The correlation coefficient is a number between +1 and -1. This number tells us about the magnitude and direction of the association between two variables.
• The MAGNITUDE is the strength of the correlation. The closer the correlation is to either +1 or -1, the stronger the correlation. If the correlation is 0 or very close to 0, there is no association between the two variables. Here, we have a moderate correlation (r = -.392).
The DIRECTION of the correlation tells us how the two variables are related. If the correlation is positive, the two variables have a positive relationship (as one increases, the other also increases). If the correlation is negative, the two variables have a negative relationship (as one increases, the other decreases). Here, we have a negative correlation (r = -.392). As self-esteem increases, anxiety decreases.
Example
• The following are the numbers of hours which ten students studied for an examination and the grades which they received:

Is there any relationship between the number of hours studied and the examination grades?

Exercise 9
• The following table shows the twelve weeks’ sales of a downtown department store, x, and its suburban branch, y
• X 71 64 67 58 80 63 69 59 76 60 66 55
• Y 49 31 45 24 68 30 40 37 62 22 35 19
• Is there any significant relationship between x and y?
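If Exercise 9 is approached with the Spearman Rho correlation from this section (an assumption, since the exercise does not name a test), the coefficient can be sketched in standard-library Python as follows. The shortcut formula used here, 1 - 6Σd²/(n(n² - 1)), is only valid when neither variable contains tied values, which holds for this data; the function names are illustrative.

```python
downtown = [71, 64, 67, 58, 80, 63, 69, 59, 76, 60, 66, 55]   # x
suburban = [49, 31, 45, 24, 68, 30, 40, 37, 62, 22, 35, 19]   # y


def ranks_without_ties(values):
    """Rank from 1 (smallest) to n; assumes all values are distinct."""
    if len(set(values)) != len(values):
        raise ValueError("tied values present: the shortcut formula does not apply")
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman rank correlation via the no-ties shortcut formula."""
    rank_x, rank_y = ranks_without_ties(x), ranks_without_ties(y)
    n = len(x)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


rho = spearman_rho(downtown, suburban)
print(f"Spearman rho = {rho:.4f}")
```

With tied values, a more general route is to compute Pearson's r on average ranks instead of using the shortcut.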
Two way chi-square from frequencies
• A chi-square test is a non-parametric test for nominal (frequency) data.
• The test will calculate expected values for each combination of category codes based on the null hypothesis that there is no association between the two variables.
Procedure
• 1. You will need two columns of codes. Each value in each column provides a code to a group or criteria category within the appropriate variable. You should have one row for each combination of category code.
• 2. You will also need a column giving the frequency that each combination of codes is observed.
• Before carrying out your chi-square test you first need to tell SPSS that the numbers in your frequency column are indeed frequencies. You do this using Weight Cases.
• 3. Select Weight Cases from the Data menu.
• 4. Click the Weight cases by button.
• 5. Select the column containing your frequencies and click on the across arrow.
• 6. Click OK.
• 7. Select Crosstabs from the Analyze - Descriptive Statistics menu.
• 8. Select the first variable and click on the top arrow to move it into the Rows box.
• 9. Select the second variable and click on the middle arrow to move it into the Columns box.
• 10. Click on Statistics to choose to perform a chi-square test on your data.
• 11. Select the chi-square option from the Crosstabs: Statistics dialogue box.
• 12. Click on Continue when ready.
• 13. Click on Cells to choose to output the chi-square expected values.
• 14. Select the top left boxes to display both the Observed and the Expected values.
Two way chi-square from raw data
• 1. You will need two columns of codes. Each value in each column provides a code to a group or criteria category within the appropriate variable.
• 2. Click Crosstabs from the Analyze - Descriptive Statistics menu.
• 3. Select the first variable and click on the top arrow to move it into the Rows box.
• 4. Select the second variable and click on the middle arrow to move it into the Columns box.
• 5. Click on Statistics to choose to perform a chi-square test on your data.
• 6. Select the chi-square option from the Crosstabs: Statistics dialogue box.
Example
• Suppose we want to investigate whether there is a relationship between the intelligence of employees who have gone through a certain job training program and their subsequent performance on the job.
• A random sample of 50 cases from files yielded the following results:

[Table: IQ (below average, average, above average) by performance (poor, fair, good)]

Test at the 0.01 level of significance whether the on-the-job performance of persons who have gone through the training program is independent of their IQ.

Exercise 10
• Suppose that a store carries two different brands, A and B, of a certain type of breakfast cereal. During a one-week period, 44 packages were purchased, with the results shown below.

| | Brand A | Brand B |
|---|---|---|
| Men | 9 | 6 |
| Women | 13 | 16 |

Test the hypothesis that the brand purchased and the sex of the purchaser are independent.
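The expected values and the chi-square statistic for this 2 x 2 table can be sketched in standard-library Python as below. This is the plain Pearson chi-square without a continuity correction, so for a 2 x 2 table SPSS will typically show an additional corrected value that differs from it. The p-value line uses the fact that, for 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(x / 2)); the shortcut does not apply to larger tables. The function name `chi_square` is illustrative.

```python
from math import erfc, sqrt

# Exercise 10: rows are Men, Women; columns are Brand A, Brand B.
observed = [
    [9, 6],
    [13, 16],
]


def chi_square(table):
    """Return (statistic, df, expected counts) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


chi2, df, expected = chi_square(observed)
p_value = erfc(sqrt(chi2 / 2))   # upper tail of chi-square, valid for df = 1 only
print(f"expected = {expected}")
print(f"chi-square = {chi2:.4f}, df = {df}, p = {p_value:.3f}")
```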