1. 6/10/2012 SPSS: STA 3024 - Sourish Saha
Implementation of Statistical Methods using SPSS
Sourish Saha, PhD student, Department of Statistics, University of Florida, sourish@ufl.edu
2. TOPICS
Manipulating Data: Recoding, Subsetting
Descriptive Statistics
Comparing Means: One-Sample T Test, Independent-Samples T Test, Paired-Samples T Test, One-Way ANOVA, Multiple Comparison, Correlations
Simple & Multiple Regression Analysis
Comparison of Several Groups: Two-way ANOVA, Chi-Square as a Test of Homogeneity, Kruskal-Wallis Test
Logistic Regression
3. To Recode the Values of a Variable into a New Variable
Transform -> Recode -> Into Different Variables
Select the variables you want to recode.
Enter an output (new) variable name and click Change.
Click Old and New Values and specify how to recode values.
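For readers who think in code, the recode step can be sketched in Python. This is only an illustration of the idea, not what SPSS runs internally; the variable names (`age`, `age_group`), the cut points, and the data are hypothetical.

```python
def recode_age(age):
    """Map an old value to a new code (hypothetical coding scheme)."""
    if age is None:
        return None      # missing values stay missing
    if age < 30:
        return 1         # old values below 30 -> new value 1
    if age < 60:
        return 2         # old values 30 to 59 -> new value 2
    return 3             # old values 60 and above -> new value 3

# Hypothetical data set: one dict per case.
cases = [{"age": 23}, {"age": 45}, {"age": 71}, {"age": None}]

# Store the result in a NEW variable so the original "age" is untouched,
# which is the point of recoding into a different variable.
for case in cases:
    case["age_group"] = recode_age(case["age"])
```

Writing into a new variable keeps the original values available for checking the recode afterwards.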
4. To Select Subsets of Cases Based on a Conditional Expression
Data -> Select Cases
Select If condition is satisfied.
Click If.
Enter the conditional expression.
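The same idea, keeping only the cases for which a condition holds, can be sketched in Python. The variables (`sex`, `score`) and the condition itself are made up for illustration.

```python
# Hypothetical data set: one dict per case.
cases = [
    {"id": 1, "sex": "F", "score": 82},
    {"id": 2, "sex": "M", "score": 67},
    {"id": 3, "sex": "F", "score": 59},
    {"id": 4, "sex": "M", "score": 91},
]

def condition(case):
    """The conditional expression: female cases scoring at least 60."""
    return case["sex"] == "F" and case["score"] >= 60

# Keep only the cases that satisfy the condition; the full list is unchanged.
selected = [case for case in cases if condition(case)]
```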
5. Exploring the data in SPSS
Analyze -> Descriptive Statistics -> Descriptives
Descriptives provides basic descriptive statistics:
n, mean, standard deviation, min and max.
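As a rough cross-check outside SPSS, the same five summaries can be computed with Python's standard library. The data values are invented, and the standard deviation here is the sample version (divisor n - 1).

```python
import statistics

values = [4.0, 7.0, 9.0, 10.0, 15.0]   # hypothetical scale variable

summary = {
    "n": len(values),
    "mean": statistics.mean(values),
    "std_dev": statistics.stdev(values),   # sample SD, divisor n - 1
    "min": min(values),
    "max": max(values),
}
```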
6. Exploring the data in SPSS
Analyze -> Descriptive Statistics -> Explore
Explore provides more descriptive statistics, including the variance, skewness, kurtosis, the median and percentiles.
Plots
Boxplots, stem-and-leaf plots, histograms, normality plots.
Reasons for using the Explore procedure include data screening, outlier identification, description, assumption checking.
7. Exploring the data in SPSS
Analyze -> Descriptive Statistics -> Frequencies
Frequencies produces a frequency distribution table.
Statistics and plots:
Frequency counts, percentages, cumulative percentages, quartiles, user-specified percentiles, bar charts, pie charts, histograms, and more.
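A frequency table with counts, percentages and cumulative percentages is simple enough to sketch by hand. The responses below are hypothetical; the layout of the tuples is an arbitrary choice for this example.

```python
from collections import Counter

responses = ["B", "A", "C", "A", "B", "A", "C", "A"]   # hypothetical variable
counts = Counter(responses)
n = len(responses)

# One row per distinct value: (value, count, percent, cumulative percent).
table = []
cumulative = 0.0
for value in sorted(counts):
    percent = 100.0 * counts[value] / n
    cumulative += percent
    table.append((value, counts[value], percent, cumulative))
```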
8. Exploring the data in SPSS
Analyze -> Descriptive Statistics -> Crosstabs
Crosstabs with 2 variables creates a two-way table, or crosstabulation. With the Statistics button, one can choose among many statistics, including the chi-square value along with its p-value.
The Crosstabs procedure offers tests of independence and measures of association. One can obtain estimates of the relative risk of an event.
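To make the chi-square value and the relative risk concrete, here is a minimal sketch for a 2x2 table. The counts are invented, the statistic is the Pearson chi-square without continuity correction, and the p-value shortcut via `math.erfc` is valid only for 1 degree of freedom (that is, a 2x2 table).

```python
import math

# Hypothetical 2x2 table: rows = exposed / not exposed, columns = event / no event.
observed = [[30, 70],
            [15, 85]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Pearson chi-square: sum over cells of (observed - expected)^2 / expected.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / total
        chi_square += (obs - expected) ** 2 / expected

# Upper-tail p-value for 1 degree of freedom only.
p_value = math.erfc(math.sqrt(chi_square / 2.0))

# Relative risk: event rate among the exposed divided by the rate among the unexposed.
relative_risk = (observed[0][0] / row_totals[0]) / (observed[1][0] / row_totals[1])
```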
9. Exploring the data in SPSS
Analyze -> Descriptive Statistics -> Ratio Statistics
The Ratio Statistics procedure provides a comprehensive list of summary statistics for describing the ratio between two scale variables.
10. Means
Analyze -> Compare Means -> Means
The Means procedure calculates subgroup means and related univariate statistics for dependent variables within categories of one or more independent variables.
The Means procedure is useful for both description and analysis of scale variables. A variety of statistics is available to characterize the central tendency and dispersion of your test variables.
11. One-Sample T Test
Analyze -> Compare Means -> One-Sample T Test
The One-Sample T Test procedure tests the difference between a sample mean and a known or hypothesized value.
Allows you to specify the level of confidence for the difference
Produces a table of descriptive statistics for each test variable
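The test statistic behind this procedure is t = (sample mean - hypothesized value) / (s / sqrt(n)) with n - 1 degrees of freedom. A minimal sketch with invented data follows; it stops at the statistic and leaves the p-value to a t table or to SPSS.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean = mu0."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    t_stat = (statistics.mean(sample) - mu0) / standard_error
    return t_stat, n - 1

sample = [2.0, 4.0, 6.0, 8.0]            # hypothetical measurements
t_stat, df = one_sample_t(sample, mu0=3.0)
```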
12. Independent-Samples T Test
Analyze -> Compare Means -> Independent-Samples T Test
The Independent-Samples T Test procedure compares means for two groups of cases. Ideally, for this test, the subjects should be randomly assigned to two groups, so that any difference in response is due to the treatment (or lack of treatment) and not to other factors.
Also displayed are:
Descriptive statistics for each test variable
A test of variance equality
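The result of the variance-equality test matters because the t statistic can be computed in two ways: with a pooled variance (equal variances assumed) or with separate variances (Welch's form). The sketch below computes both statistics for invented data; it does not implement the variance test itself or the p-values.

```python
import math
import statistics

def independent_t(a, b):
    """Return (pooled t, pooled df, Welch t) for two independent samples."""
    na, nb = len(a), len(b)
    mean_diff = statistics.mean(a) - statistics.mean(b)
    va, vb = statistics.variance(a), statistics.variance(b)

    # Equal variances assumed: pool the two sample variances.
    pooled_var = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    t_pooled = mean_diff / math.sqrt(pooled_var * (1 / na + 1 / nb))

    # Equal variances not assumed: each group keeps its own variance.
    t_welch = mean_diff / math.sqrt(va / na + vb / nb)
    return t_pooled, na + nb - 2, t_welch

treatment = [12, 14, 16, 18]   # hypothetical responses
control = [9, 10, 11, 14]
t_pooled, df_pooled, t_welch = independent_t(treatment, control)
```

With equal group sizes, as here, the two statistics coincide; they differ when the group sizes and variances are both unequal.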
13. Paired-Samples T Test
Analyze -> Compare Means -> Paired-Samples T Test
The Paired-Samples T Test procedure compares the means of two variables for a single group. It computes the differences between values of the two variables for each case and tests whether the average difference differs from 0.
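In other words, the paired test is a one-sample t test applied to the per-case differences. A small sketch with hypothetical before/after measurements:

```python
import math
import statistics

before = [10, 12, 9, 14]    # hypothetical first measurement per case
after = [12, 15, 10, 15]    # hypothetical second measurement, same cases

# Per-case differences, then a one-sample t statistic against 0.
differences = [a - b for a, b in zip(after, before)]
n = len(differences)
standard_error = statistics.stdev(differences) / math.sqrt(n)
t_stat = statistics.mean(differences) / standard_error
df = n - 1
```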
14. One-Way ANOVA
Let $X_{i1}, X_{i2}, \ldots, X_{in_i}$, $i = 1, \ldots, m$,
be independent random samples from m normal populations with the ith population having parameters $(\mu_i, \sigma_i^2)$.
Assuming equal variances, we want to test the null hypothesis $H_0: \mu_1 = \mu_2 = \cdots = \mu_m$
against the alternative that at least two of the population means are unequal.
ANOVA involves partitioning the total variation in the combined sample into two parts. One part explains the variation between the samples while the second part explains the variation within each sample (SST=SSG + SSE).
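The partition SST = SSG + SSE can be checked numerically. The sketch below uses three invented groups and also forms the F ratio, (SSG / (m - 1)) / (SSE / (N - m)), where m is the number of groups and N the total sample size.

```python
groups = [[4, 5, 6], [7, 8, 9], [1, 2, 3]]   # hypothetical samples

all_values = [x for g in groups for x in g]
n_total = len(all_values)
grand_mean = sum(all_values) / n_total

# Between-group variation: group means around the grand mean.
ssg = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

# Within-group variation: observations around their own group mean.
sse = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

# Total variation: observations around the grand mean.
sst = sum((x - grand_mean) ** 2 for x in all_values)

m = len(groups)
f_stat = (ssg / (m - 1)) / (sse / (n_total - m))
```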
16. One-Way ANOVA
For each group: number of cases, mean, standard deviation, standard error of the mean, minimum, maximum, and 95% confidence interval for the mean.
Levene's test for homogeneity of variance, analysis-of-variance table and robust tests of the equality of means for each dependent variable, user-specified a priori contrasts, and post hoc range tests and multiple comparisons: Bonferroni, Tukey's honestly significant difference, Scheffé, and least-significant difference.
17. Multiple Comparison tests
Tests suitable for the simultaneous testing of several hypotheses concerning the equality of three or more population means.
When samples have been taken from several populations, as a preliminary to the more general question of whether the populations differ, there is the simpler question of whether they have different means.
If the null hypothesis is rejected, then we wish to know where the differences lie, for example by using Tukey's test (HSD).
18. Multiple Comparison tests
With m populations, the null hypothesis is $H_0: \mu_1 = \mu_2 = \cdots = \mu_m$.
If the null hypothesis is rejected, then we wish to know where the differences lie. There are $\binom{m}{2} = m(m-1)/2$
pairs of populations that could be compared.
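The number of pairwise comparisons grows quickly with the number of groups, which is why multiple-comparison procedures adjust the significance level. A short sketch, with hypothetical group labels and a Bonferroni-style split of an assumed overall level of 0.05:

```python
from itertools import combinations

group_names = ["A", "B", "C", "D"]       # hypothetical populations, m = 4
m = len(group_names)

# Every unordered pair of groups that could be compared.
pairs = list(combinations(group_names, 2))
n_pairs = len(pairs)                     # equals m * (m - 1) / 2

# Bonferroni idea: divide the overall level evenly across the comparisons.
overall_alpha = 0.05
per_comparison_alpha = overall_alpha / n_pairs
```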
19. Bivariate Correlations
The Bivariate Correlations procedure computes Pearson's correlation coefficient (r), Spearman's rho, and Kendall's tau-b with their significance levels.
Correlations measure how variables or rank orders are related.
Before calculating a correlation coefficient, one should screen the data for outliers (which can cause misleading results) and evidence of a linear relationship.
Pearson's correlation coefficient is a measure of linear association. Two variables can be perfectly related, but if the relationship is not linear, Pearson's correlation coefficient is not an appropriate statistic for measuring their association.
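The warning about nonlinear relationships is easy to demonstrate. In the sketch below (invented data), y is an exact function of x in both cases, yet the coefficient is 1 for the linear relationship and 0 for the quadratic one.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [-2, -1, 0, 1, 2]
y_linear = [2 * v + 1 for v in x]      # perfectly linear in x
y_quadratic = [v * v for v in x]       # perfectly determined by x, but not linear

r_linear = pearson_r(x, y_linear)
r_quadratic = pearson_r(x, y_quadratic)
```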
20. Rank Correlation Coefficient
Rank correlation is a method of finding the degree of association between two variables.
The calculation for the rank correlation coefficient is the same as that for the Pearson correlation coefficient, but it uses the ranks of the observations and not their numerical values.
This method is useful when the data are not available in numerical form but information is sufficient to rank the data.
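Following that description, a rank correlation can be sketched as "rank both variables, then apply the Pearson formula to the ranks". The helper names and data are hypothetical; tied values receive the average of the ranks they span, which is one common convention.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rank_correlation(x, y):
    """Pearson formula applied to the ranks instead of the raw values."""
    return pearson_r(average_ranks(x), average_ranks(y))

x = [10, 20, 30, 40, 50]
y = [1, 4, 9, 16, 100]          # increasing in x, but far from linear
rho = rank_correlation(x, y)
```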
22. Recode & Compute: Create new variable
Transform -> Compute
Give the name of the Target variable.
In the Numeric Expression box, choose the function of your choice.
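Conceptually, Compute evaluates a numeric expression for every case and stores the result in the target variable. A Python sketch of that idea, with hypothetical variables (`weight_kg`, `height_m`) and target names (`bmi`, `log_weight`):

```python
import math

# Hypothetical data set: one dict per case.
cases = [
    {"weight_kg": 70.0, "height_m": 1.75},
    {"weight_kg": 82.0, "height_m": 1.80},
]

for case in cases:
    # Target variable "bmi" from an arithmetic expression.
    case["bmi"] = case["weight_kg"] / case["height_m"] ** 2
    # Target variable "log_weight" from a built-in function.
    case["log_weight"] = math.log(case["weight_kg"])
```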
27. Simple Linear Regression
Simple linear regression is aimed at finding the "best-fit" values of two parameters in the following regression equation:
$y = \beta_0 + \beta_1 x + \varepsilon$
$\beta_0$: the y-intercept of the regression line
$\beta_1$: the slope of the regression line
A popular method for finding the "best-fit" values is the Least Squares Regression method.
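For a single predictor, the least squares estimates have a closed form: slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x). A minimal sketch with invented data:

```python
def least_squares_line(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

x = [1, 2, 3, 4, 5]                  # hypothetical predictor
y = [2.1, 3.9, 6.2, 7.8, 10.0]       # hypothetical response
intercept, slope = least_squares_line(x, y)
```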
28. Multiple Regression
Multiple (linear) regression is a regression technique aimed at finding a linear relationship between the dependent variable and multiple independent variables.
The multiple regression model is as follows:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$
Multiple regression finds the set of parameters $\beta_0, \beta_1, \ldots, \beta_k$
that provides the best fit between the model and the given data.
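One way to find the best-fit parameters, assuming "best fit" means least squares, is to solve the normal equations (X'X) b = X'y. The sketch below does this with plain Gaussian elimination on a tiny invented data set; the function names are hypothetical, and real software typically uses more numerically stable methods.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x

def fit_multiple_regression(rows, y):
    """Return [intercept, coefficient 1, ..., coefficient k] by least squares."""
    design = [[1.0] + list(r) for r in rows]          # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2 (no noise).
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
response = [1 + 2 * x1 + 3 * x2 for x1, x2 in predictors]
coefficients = fit_multiple_regression(predictors, response)
```

Because the example data contain no noise, the fit recovers the generating coefficients up to rounding error.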
30. Kruskal-Wallis Test
The Kruskal-Wallis test is a nonparametric test of whether three or more independent samples come from populations having the same distribution.
It is a nonparametric version of ANOVA.
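The test statistic replaces the observations by their ranks in the pooled sample: H = 12 / (N(N + 1)) * sum(R_i^2 / n_i) - 3(N + 1), where R_i is the rank sum of group i. The sketch below uses invented data and omits the correction factor for ties, so it is an approximation when ties are present.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction applied)."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    weighted_sum = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])   # ranks belonging to this group
        weighted_sum += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * weighted_sum - 3.0 * (n_total + 1)

groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # hypothetical, fully separated samples
h_stat = kruskal_wallis_h(groups)
```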
32. Logistic Regression
Useful for situations in which we want to predict the presence or absence of a characteristic or outcome based on values of a set of predictor variables.
Similar to a linear regression model, but suited to models where the dependent variable is dichotomous.
Logistic regression coefficients can be used to estimate odds ratios for each of the independent variables in the model.
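The link between coefficients and odds ratios is exp(coefficient): a one-unit increase in a predictor multiplies the odds of the outcome by that factor, holding the other predictors fixed. The sketch below uses made-up coefficients (not fitted from any data) and hypothetical predictor names to illustrate the relationship.

```python
import math

# Hypothetical fitted model: logit(p) = intercept + sum(coefficient * predictor).
intercept = -1.5
coefficients = {"age": 0.04, "smoker": 0.7}

def predicted_probability(case):
    """Probability that the outcome is present for one case."""
    logit = intercept + sum(coefficients[name] * case[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-logit))

def odds(p):
    return p / (1.0 - p)

# Odds ratio for each predictor: exp of its coefficient.
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}

# Check the interpretation: same age, smoker = 1 versus smoker = 0.
p_smoker = predicted_probability({"age": 40, "smoker": 1})
p_nonsmoker = predicted_probability({"age": 40, "smoker": 0})
observed_ratio = odds(p_smoker) / odds(p_nonsmoker)
```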
33. Logistic Regression
To perform logistic regression, go to:
Analyze
Choose Regression
Then click on Binary Logistic