Organizing Your Data for Statistical Analysis in SPSS

Edward A. Greenberg, PhD

ASU HEALTH SOLUTIONS DATA LAB

Revised January 4, 2013


SPSS Data Sets

Rows are cases or observations

Columns are variables (measurements)

Up to 2^31 - 1 columns (2,147,483,647)

No limit on the number of cases


Variable Types

Numeric (40 character maximum length)

Dates and times (various formats)

Other variations of numeric (currency, comma, scientific notation, etc.)

String (32,767 maximum length)


Variable Names

Variable names must be unique.

Variable names may be up to 64 characters in length.

Names can contain letters, numbers, or special characters.

Names must start with a letter or @, #, or $.
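
The naming rules above can be checked mechanically before a data set is built. The following is a minimal Python sketch, not SPSS syntax. The function name check_variable_names is hypothetical. Two details go beyond the slide and are assumptions: the set of characters accepted after the first one (letters, digits, period, underscore, @, #, $) and the case-insensitive uniqueness check. The sketch also handles ASCII letters only.

```python
import re

# First character: a letter or @, #, $ (from the slide).
# Later characters: letters, digits, and . _ @ # $ (an assumption, see lead-in).
NAME_PATTERN = re.compile(r"[A-Za-z@#$][A-Za-z0-9._@#$]*")
MAX_NAME_LENGTH = 64


def check_variable_names(names):
    """Return a dict mapping each problematic name to a list of problems."""
    problems = {}
    seen = set()
    for name in names:
        issues = []
        if len(name) > MAX_NAME_LENGTH:
            issues.append("longer than 64 characters")
        if not NAME_PATTERN.fullmatch(name):
            issues.append("invalid first character or disallowed character")
        # Assumption: names that differ only in case count as duplicates.
        key = name.lower()
        if key in seen:
            issues.append("duplicate name")
        seen.add(key)
        if issues:
            problems[name] = issues
    return problems


result = check_variable_names(["age", "AGE", "1st_choice", "@temp", "q1.a"])
for name, issues in result.items():
    print(name, "->", "; ".join(issues))
```

Here "AGE" is flagged as a duplicate of "age" and "1st_choice" is flagged because it starts with a digit; the other three names pass.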


Unit of Analysis

What constitutes a “case”?

A person

A household

An organization

An experimental trial


Level of Measurement

Nominal

Ordinal

Interval and Ratio (both treated as “Scale” in SPSS)


Labeling Data

Variable names may be short and cryptic.

Variable labels can be up to 255 characters.

SPSS procedures display at least 40 characters of variable labels.

Value labels can be up to 120 characters.


Order of Variables

The order of variables in the SPSS data file normally should be the same as the order of items in the questionnaire.

Use variable names that help you identify the scale or instrument to which they apply.


Case Numbers

Each case in an SPSS file should include a case number.

Often this will be the first variable in the file.

The case number does not identify the subject, but it links the data record to the subject’s questionnaire.

This link is useful for correcting data entry errors.


Create a Codebook

  • When preparing to enter your data into SPSS, prepare a codebook for the data set.

  • The codebook documents all of the items to be entered in the data set:

    • Variable names and labels

    • Variable types and formats

    • Coded values for categorical items

    • Missing values
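
A codebook of this kind can also be kept in machine-readable form. The sketch below is a hypothetical Python illustration, not an SPSS feature. The variable caseid, the label texts, the F-style formats, and the codes 0, 98, and 99 are all assumptions made up for the example. Only the four documented elements (names and labels, types and formats, coded values, missing values) come from the list above.

```python
# Hypothetical codebook entries; every label, format, and code is illustrative.
codebook = {
    "caseid": {
        "label": "Case number linking the record to the questionnaire",
        "type": "numeric",
        "format": "F5.0",
        "values": {},
        "missing": [],
    },
    "agewed": {
        "label": "Age when first married",
        "type": "numeric",
        "format": "F2.0",
        "values": {0: "Not applicable", 98: "Don't know", 99: "Refused to answer"},
        "missing": [0, 98, 99],
    },
}


def describe(name):
    """Format one codebook entry as readable text."""
    entry = codebook[name]
    lines = [f"{name}: {entry['label']} ({entry['type']}, {entry['format']})"]
    for code, text in sorted(entry["values"].items()):
        flag = " [missing]" if code in entry["missing"] else ""
        lines.append(f"  {code} = {text}{flag}")
    return "\n".join(lines)


summary = describe("agewed")
print(summary)
```

Keeping the codes and their missing-value status in one place makes it easier to enter labels and missing-value declarations consistently later on.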


Sample Codebook


Missing Data

Data may be missing for several reasons:

Don’t know

Refused to answer

Not applicable

Skipped a question

Instrument problem

Data entry omission


Missing Values

SPSS provides several ways of designating numeric data as “missing values.”

A blank cell is treated as “system missing,” represented by a dot (“.”) in the SPSS Data Editor.

Specific values can be declared as “user missing” values.


Up to three “user missing” values can be declared for a variable.

Or, a range of values plus one additional value can be declared to be missing.
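
The two declaration forms described above (up to three discrete values, or a range plus one discrete value) can be pictured with a small Python sketch. This is an illustration of the rule rather than SPSS code. The function is_missing, its parameters, and the codes 0, 97 to 99 are hypothetical, and None stands in for a blank (system-missing) cell.

```python
def is_missing(value, discrete=(), value_range=None):
    """Return True if value counts as missing under the given declaration.

    discrete    -- individually declared user-missing values
    value_range -- optional (low, high) pair, inclusive at both ends
    """
    if value_range is None and len(discrete) > 3:
        raise ValueError("at most three discrete missing values")
    if value_range is not None and len(discrete) > 1:
        raise ValueError("a range allows only one additional discrete value")
    if value is None:  # blank cell: system missing
        return True
    if value in discrete:
        return True
    if value_range is not None:
        low, high = value_range
        return low <= value <= high
    return False


# Form 1: three discrete user-missing codes (hypothetical codes).
three_codes = [is_missing(v, discrete=(0, 98, 99)) for v in (0, 25, 98, None)]

# Form 2: a range plus one additional discrete value.
range_plus_one = [
    is_missing(v, discrete=(0,), value_range=(97, 99)) for v in (0, 25, 98)
]

print(three_codes)
print(range_plus_one)
```

In both forms the value 25 is treated as valid, while the declared codes and the blank cell are treated as missing.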


In this example, the variable AGEWED has three labeled values that are to be treated as missing.


The three values are declared to be missing in the Missing Values dialog.


Expressions handle missing values in different ways.

The result of (var1+var2+var3)/3 is missing if any of the three variables is missing.

The result of MEAN(var1, var2, var3) is missing only if all three of the variables are missing; otherwise it is the mean of the non-missing values.
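
The difference between the two expressions can be mimicked in Python, with None standing in for a missing value. This is a sketch of the behavior described above, not SPSS code, and the function names arithmetic_average and mean_function are hypothetical.

```python
def arithmetic_average(*values):
    """Mimic (var1+var2+var3)/3: the result is missing if any input is missing."""
    if any(v is None for v in values):
        return None
    return sum(values) / len(values)


def mean_function(*values):
    """Mimic MEAN(var1, var2, var3): average only the non-missing inputs."""
    valid = [v for v in values if v is not None]
    if not valid:
        return None  # missing only when every input is missing
    return sum(valid) / len(valid)


one_missing = (4, None, 8)
print(arithmetic_average(*one_missing))  # None
print(mean_function(*one_missing))       # 6.0
```

With one missing input, the arithmetic expression yields a missing result, while the function averages the two values that are present.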


Missing Values in Procedures

The FREQUENCIES procedure excludes cases with missing values from computations.
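
One way to picture this exclusion is the difference between a percentage based on all cases and a percentage based on valid cases only. The Python sketch below is an illustration under that reading, not a reimplementation of FREQUENCIES. The function name frequencies and the sample data are hypothetical, and None marks a missing value.

```python
from collections import Counter


def frequencies(values):
    """Tabulate values, reporting percent of all cases and percent of valid cases."""
    counts = Counter(values)
    n_total = len(values)
    n_valid = n_total - counts.get(None, 0)
    table = {}
    for value, count in counts.items():
        row = {"count": count, "percent": 100 * count / n_total}
        # Missing values get no valid percent: they are left out of that base.
        if value is not None and n_valid > 0:
            row["valid_percent"] = 100 * count / n_valid
        table[value] = row
    return table


table = frequencies([1, 1, 2, None, 2, 1, None, 1])
for value, row in table.items():
    print(value, row)
```

In this sample, the value 1 accounts for half of all eight cases but two thirds of the six valid cases.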


Multiple Responses

  • Multiple-response items are questions that can have more than one value for each case.

  • Two ways of coding:

    • For each possible response, create a variable that takes one of two values, e.g., 1 = Yes and 2 = No (“multiple-dichotomy” method)

    • Create a series of variables for 1st choice, 2nd choice, etc. (“multiple categories” method)


MULT RESPONSE Procedure

In the MULT RESPONSE procedure, multiple response variables are combined into groups.

The MULT RESPONSE procedure counts the responses in multiple response groups and reports them in frequency tables or crosstabulations.

Because a case can give more than one response, percentages based on the number of cases generally total more than 100%.
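
The counting can be sketched in Python for a group coded with the multiple-dichotomy method. This is an illustration only: the survey question, the variable names tv, radio, and web, and the function tabulate_group are hypothetical. Treating a case with no counted response as excluded from the case base is an assumption of this sketch.

```python
# Hypothetical item "Which news sources do you use?", coded 1 = Yes, 2 = No.
cases = [
    {"tv": 1, "radio": 2, "web": 1},
    {"tv": 1, "radio": 1, "web": 1},
    {"tv": 2, "radio": 2, "web": 1},
    {"tv": 2, "radio": 2, "web": 2},
]
COUNTED_VALUE = 1  # the value that means "this response was given"


def tabulate_group(cases, variables):
    """Count responses in a multiple-dichotomy group and compute both percentages."""
    counts = {
        v: sum(1 for c in cases if c[v] == COUNTED_VALUE) for v in variables
    }
    n_responses = sum(counts.values())
    # Assumption: only cases with at least one counted response form the base.
    n_cases = sum(
        1 for c in cases if any(c[v] == COUNTED_VALUE for v in variables)
    )
    table = {}
    for v in variables:
        table[v] = {
            "count": counts[v],
            "pct_of_responses": 100 * counts[v] / n_responses if n_responses else 0.0,
            "pct_of_cases": 100 * counts[v] / n_cases if n_cases else 0.0,
        }
    return table


table = tabulate_group(cases, ["tv", "radio", "web"])
total_pct_of_cases = sum(row["pct_of_cases"] for row in table.values())
print(table)
print(total_pct_of_cases)
```

Percentages based on responses add up to 100, while percentages based on cases add up to more than 100 whenever some case gives more than one response.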


Repeated Measures

Data that are recorded on more than one occasion for each subject

Some procedures, such as GLM, require that all measurements for a case be on the same data record.

Other procedures, such as the MIXED procedure, may expect one data record per occasion.


One data record per subject, one variable per occasion on which it is measured


One data record per occasion per subject


The good news is that SPSS allows you to easily restructure a data set in any of three ways:

Restructure selected variables into cases

Restructure selected cases into variables

Transpose all data
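
The first two restructuring options correspond to converting between the two repeated-measures layouts: one record per subject versus one record per occasion. The Python sketch below illustrates the idea with plain lists of dictionaries. It is not SPSS code, and the variable names id, score1 to score3, occasion, and score, along with the data values, are hypothetical.

```python
# Wide layout: one record per subject, one variable per occasion.
wide = [
    {"id": 1, "score1": 10, "score2": 12, "score3": 15},
    {"id": 2, "score1": 8, "score2": 9, "score3": 11},
]
OCCASION_VARS = ["score1", "score2", "score3"]


def variables_to_cases(records):
    """Wide to long: emit one record per occasion per subject."""
    long_records = []
    for rec in records:
        for occasion, name in enumerate(OCCASION_VARS, start=1):
            long_records.append(
                {"id": rec["id"], "occasion": occasion, "score": rec[name]}
            )
    return long_records


def cases_to_variables(records):
    """Long to wide: collect each subject's occasions into one record."""
    by_subject = {}
    for rec in records:
        row = by_subject.setdefault(rec["id"], {"id": rec["id"]})
        row[f"score{rec['occasion']}"] = rec["score"]
    return list(by_subject.values())


long_layout = variables_to_cases(wide)
round_trip = cases_to_variables(long_layout)
print(long_layout[0])
print(round_trip == wide)
```

Two subjects with three occasions each become six records in the long layout, and converting back recovers the original wide records.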

