Recap of basic SPSS and statistics

dinah

Presentation Transcript


  1. Recap of basic SPSS and statistics 5th - 9th December 2011, Rome

  2. Manage the database
     • Import / export files
     • Import variables from another database / merge files
     • Restructure cases to variables

  3. Merging datasets
     • For each level of investigation in a survey, there is typically a dataset
     • For example, if a survey asks questions at the household level, measures anthropometry of children under 5 and of women of reproductive age, and also includes a community-level questionnaire, we would expect 4 separate datasets to be created
     • To do analysis that looks at a case in relation to its context (for example, a child within its household), the datasets must be merged

  4. Merging datasets
     • For example, the education level of the household head is recorded in the household dataset. We may want to find out whether the nutritional status of a child is related to the education of the household head, but the child data is in a separate dataset.
     • In order to merge the datasets, a common variable must exist in each dataset. In this case, a household identifier must be in both datasets.
     Household dataset: Household ID | Education level of household head
     Child dataset: Household ID | Weight-for-age z-score (WAZ)

  5. Merging datasets
     • In each dataset, the cases must be sorted on the household identifier
     • In SPSS, select Data > Merge Files > Add Variables; select the datasets and the variable to match the datasets on
     • The new dataset will include the variable of interest. In our example, we will now have a child dataset that also contains the education level of the household head, and we can do our analysis
     Household dataset: Household ID | Education level of household head
     Child dataset: Household ID | Weight-for-age z-score (WAZ)
     Child dataset + (after merging): Household ID | Weight-for-age z-score (WAZ) | Education level of household head
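The matching logic behind this kind of merge can be sketched outside SPSS. The Python example below is only an illustration under stated assumptions: the variable names (`hh_id`, `head_education`, `waz`) and all values are hypothetical, and the function is not part of SPSS. It attaches household-level variables to each child record by household identifier and leaves them empty when no household matches.

```python
# Hypothetical data: variable names and values are illustrative only.
households = [
    {"hh_id": 1, "head_education": "primary"},
    {"hh_id": 2, "head_education": "secondary"},
]
children = [
    {"hh_id": 1, "waz": -1.2},
    {"hh_id": 1, "waz": -0.4},
    {"hh_id": 2, "waz": -2.1},
    {"hh_id": 3, "waz": 0.3},  # no matching household record
]


def add_household_variables(children, households, key="hh_id"):
    """Attach household-level variables to each child record (many-to-one)."""
    lookup = {}
    for hh in households:
        # The household file must have exactly one record per identifier.
        if hh[key] in lookup:
            raise ValueError(f"duplicate household identifier: {hh[key]}")
        lookup[hh[key]] = hh

    # Names of the household variables to copy across (everything but the key).
    hh_vars = sorted({name for hh in households for name in hh} - {key})

    merged = []
    # Sorting here only makes the output easy to read; the dictionary
    # lookup itself does not require sorted input.
    for child in sorted(children, key=lambda c: c[key]):
        row = dict(child)
        match = lookup.get(child[key])
        for name in hh_vars:
            row[name] = match.get(name) if match is not None else None
        merged.append(row)
    return merged


merged = add_household_variables(children, households)
for row in merged:
    print(row)
```

Children whose household is absent from the household file keep their own data but get an empty education level, which is worth checking before analysis.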

  6. Data cleaning
     • Unique ID
     • Missing data
     • Define variable properties
     • Scatterplots / histograms
     • Frequency sorting
     • Outliers

  7. Missing values and data cleaning
     • Cleaning data can be a painful process
     • Being systematic about cleaning data from the beginning of the process can save hours of work later in the analysis
     • There are a few key tools to use in SPSS to clean data:
        • Sorting cases: allows you to quickly see within a variable if there are problematic cases
        • Identify duplicate cases: shows cases which have the same unique identifier
        • Histograms and scatterplots: visually identify problematic variables and cases
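Two of these checks, finding duplicate identifiers and sorting to expose extreme values, can be sketched in a few lines of Python. This is an illustration only: the case list, the variable names (`case_id`, `age`), and the values are hypothetical.

```python
from collections import Counter

# Hypothetical case list; "case_id" is meant to be a unique identifier.
cases = [
    {"case_id": 101, "age": 34},
    {"case_id": 102, "age": 27},
    {"case_id": 101, "age": 43},   # same identifier as the first case
    {"case_id": 103, "age": 250},  # implausible value
]


def duplicate_ids(cases, key="case_id"):
    """Return the identifiers that occur more than once."""
    counts = Counter(case[key] for case in cases)
    return sorted(cid for cid, n in counts.items() if n > 1)


duplicates = duplicate_ids(cases)

# Sorting on a variable pushes extreme values to the ends of the list,
# where they are easy to spot.
by_age = sorted(cases, key=lambda c: c["age"])

print("Duplicate identifiers:", duplicates)
print("Lowest age:", by_age[0]["age"], "Highest age:", by_age[-1]["age"])
```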

  8. Missing values and data cleaning
     • The data cleaning process will also reveal cases where values are missing for certain variables
     • This is often expected (though in some cases it may be the result of an error)
     • Handling missing values in SPSS is a matter of declaring, in the Variable View, which values the software should treat as missing
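Why declaring missing values matters can be shown with a small Python sketch. The codes 98 and 99 and the household sizes below are hypothetical, chosen only to illustrate the effect; real surveys define their own missing-value codes.

```python
# Hypothetical missing-value codes, e.g. 98 = "don't know", 99 = "refused".
MISSING_CODES = {98, 99}

# Hypothetical household sizes as entered, including two missing-value codes.
household_size = [4, 6, 99, 5, 98, 3]


def valid_values(values, missing_codes=MISSING_CODES):
    """Keep only the values that are not declared as missing."""
    return [v for v in values if v is not None and v not in missing_codes]


valid_sizes = valid_values(household_size)

# Mean with the codes wrongly treated as real values.
naive_mean = sum(household_size) / len(household_size)
# Mean after the codes are excluded as missing.
clean_mean = sum(valid_sizes) / len(valid_sizes)

print("Mean ignoring the missing codes:", round(naive_mean, 2))
print("Mean with the codes treated as missing:", clean_mean)
```

If the codes are left as ordinary values, they are averaged in with the real data and distort the result.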

  9. Getting ready for analysis
     • Weight file
     • Split file
     • Select cases

  10. Analysis
      • Create new variables
         • Recode
         • Count
         • Compute
         • Rank cases (quintiles)
      • Aggregate
      • Frequencies
      • Compare means
      • Crosstabs

  11. Create new variables using recode
      • Recoding is most commonly used to take a categorical variable and re-categorize its values.
      • For example, source of drinking water is a standard question in household surveys, with several options that are adapted for the local context.
      • When describing water sources in analysis, we usually compare improved vs. unimproved water sources
      • In the example on the right, the top box represents a module in the household questionnaire and the bottom box represents the categorization of improved vs. unimproved water sources. If we want to recode the question responses into a binary (two-category) variable, how do we do so in SPSS?
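The logic of such a recode can be sketched in Python. The response codes and their grouping below are hypothetical, since the questionnaire module from the slide is not reproduced in this transcript; a real recode should follow the survey's own codebook. The original variable is left unchanged and the result goes into a new variable.

```python
# Hypothetical response codes (the actual questionnaire options are not
# reproduced here):
#   1 = piped water, 2 = borehole, 3 = protected well,
#   4 = unprotected well, 5 = surface water (river, pond)
IMPROVED = {1, 2, 3}
UNIMPROVED = {4, 5}


def recode_water_source(code):
    """Recode a water-source response into 1 (improved) or 0 (unimproved)."""
    if code in IMPROVED:
        return 1
    if code in UNIMPROVED:
        return 0
    return None  # unknown or missing codes stay missing


# Original responses; 9 is not a defined code in this hypothetical module.
water_source = [1, 4, 2, 5, 3, 9]

# The recoded values are stored in a new variable; the original is kept.
improved_water = [recode_water_source(code) for code in water_source]

print(improved_water)
```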

  12. Creating a new variable using compute
      • Computing a new variable is usually done when a mathematical formula is used to derive it
      • A number of circumstances in a household questionnaire require computation
      • For example, an indicator commonly used in assessments when discussing demographics is the percentage of dependents in a household
      • Given the household questionnaire roster on the right, how can we create a variable for the percentage of dependents (where dependents are people under 15 or over 65)?
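The formula is the number of household members under 15 or over 65, divided by the total number of members, times 100. A minimal Python sketch follows, assuming a roster that lists the age of every member per household; the roster layout and the values are hypothetical.

```python
def percent_dependents(ages, young_below=15, old_above=65):
    """Percentage of household members aged under 15 or over 65."""
    if not ages:
        return None  # an empty roster has no defined percentage
    dependents = sum(1 for age in ages if age < young_below or age > old_above)
    return 100.0 * dependents / len(ages)


# Hypothetical roster: household identifier -> ages of all members.
roster = {
    1: [42, 38, 12, 7, 70],  # three dependents out of five members
    2: [29, 31],             # no dependents
    3: [],                   # roster not filled in
}

pct_dependents = {hh: percent_dependents(ages) for hh, ages in roster.items()}

for hh, pct in pct_dependents.items():
    print(hh, pct)
```

Returning an empty result for an empty roster avoids a division by zero and keeps such households out of later averages.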

  13. Type of variables
      We work with two types of variables:
      • Categorical
         • Nominal: the categories are not ranked, ex. 1=female, 2=male
         • Ordinal: the categories are ranked, ex. 1=poor, 2=medium, 3=good
      • Continuous (Scale)
         • Interval: ex. age, 1 to n
         • Ratio: ex. percentage of expenditure, 0% to 100%

  14. Type of variables

  15. Descriptive statistics (for continuous and categorical variables)
      • Range
      • Mean
      • Median
      • Mode
      • Frequencies
      • Crosstabs
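Each of these statistics can be computed by hand on a small example. The Python sketch below uses only the standard library; the variables (`age`, `sex`, `water`) and their values are hypothetical and serve only to show what each statistic reports.

```python
from collections import Counter
from statistics import mean, median, mode

# Hypothetical data for five cases.
age = [34, 27, 43, 27, 51]
sex = ["female", "male", "female", "female", "male"]
water = ["improved", "improved", "unimproved", "improved", "unimproved"]

# Summary statistics for a continuous variable.
summary = {
    "range": (min(age), max(age)),
    "mean": mean(age),
    "median": median(age),
    "mode": mode(age),
}

# Frequencies: number of cases in each category of one variable.
frequencies = Counter(sex)

# Crosstab: number of cases in each combination of two variables.
crosstab = Counter(zip(sex, water))

print(summary)
print(dict(frequencies))
for (row, col), n in sorted(crosstab.items()):
    print(row, col, n)
```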

  16. Best practices
      • Using syntax
      • Export files / outputs
      • Data file comments
