Introduction to categorical descriptive statistics


Overview

  • Contingency tables

    • Notation

  • Descriptive statistics

    • Difference in proportions

    • Relative risk

    • Odds ratio

  • SPSS


Contingency Tables

  • Two dimensional tables

    • Let X and Y be categorical variables

      • X has I levels and Y has J levels

    • A contingency or cross-classification table is a tabular representation of the frequency counts for each pair of variable levels
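
The definition above can be sketched in a few lines of Python. This is only an illustration: the observations, the level names, and the helper `cross_tabulate` are hypothetical and do not come from the slides.

```python
from collections import Counter

# Hypothetical paired observations (level of X, level of Y); not data from the slides.
observations = [
    ("treated", "improved"), ("treated", "improved"), ("treated", "not improved"),
    ("control", "improved"), ("control", "not improved"), ("control", "not improved"),
]

def cross_tabulate(pairs):
    """Return (row_levels, col_levels, counts), where counts[i][j] is the
    number of observations with X at row level i and Y at column level j."""
    freq = Counter(pairs)
    row_levels = sorted({x for x, _ in pairs})
    col_levels = sorted({y for _, y in pairs})
    counts = [[freq[(x, y)] for y in col_levels] for x in row_levels]
    return row_levels, col_levels, counts

rows, cols, table = cross_tabulate(observations)
for level, row in zip(rows, table):
    print(level, row)
```

Here X has I = 2 levels and Y has J = 2 levels, so the result is a 2 x 2 table of counts.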


Notation

  • Cell Counts


Example


Cell Proportions

  • Often less interested in the absolute counts within cells than in the relationship between the cell proportions

  • To analyze cell proportions properly, need to know the experimental design and the relationship between the variables

    • All variables can be considered response variables

    • One (or more) response variable and one (or more) explanatory variable

      • Prospective study

      • Retrospective study


Two Variables

  • Proportion notation

    • {p_ij} gives the joint distribution

    • {p_i+} and {p_+j} represent the marginals

    • {p_j|i} is the conditional distribution of Y given level i of X
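
As a minimal sketch of these three distributions, the following computes the joint, marginal, and conditional proportions from a table of counts. The counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Hypothetical 2 x 2 table of counts (rows = levels of X, columns = levels of Y).
counts = [[30, 10],
          [20, 40]]

n = sum(sum(row) for row in counts)

# Joint distribution: each cell count divided by the grand total.
joint = [[c / n for c in row] for row in counts]

# Marginal distributions: row sums and column sums of the joint proportions.
row_marginals = [sum(row) for row in joint]
col_marginals = [sum(col) for col in zip(*joint)]

# Conditional distribution of Y given level i of X: each row divided by its row total.
conditional = [[c / sum(row) for c in row] for row in counts]

print(joint, row_marginals, col_marginals, conditional)
```

Each row of `conditional` sums to 1, while the whole of `joint` sums to 1.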


Example


Example

  • 1982 General Social Survey report on attitudes about death penalty and gun registration

    • Calculate joint, marginal and conditional distributions


Prospective Study

  • Subjects either select or are selected for treatment groups and then response is studied

    • Experimental

      • Subjects are randomly allocated to treatment groups

    • Observational

      • Subjects self-select treatment group

    • Principal aim is to compare conditional distribution of response for different levels of explanatory variable(s)


Example

  • Findings from the Aspirin Component of the Ongoing Physicians’ Health Study

    • Calculate conditional distribution


Retrospective Study

  • Given response, look back at levels of possible explanatory variables

    • Observational studies

  • Typically “over-sample” for response level of interest

  • If the overall population proportion in each response level is known, Bayes' theorem can be used to calculate the conditional distribution in the direction of interest
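
A rough sketch of the Bayes' theorem step for a binary response follows. The function name and all three input proportions are assumptions made for illustration; they are not taken from any study cited in these slides.

```python
def response_given_exposure(p_exp_given_case, p_exp_given_control, p_case):
    """P(case | exposed) from the retrospective conditionals and the
    population proportion of cases, via Bayes' theorem."""
    numerator = p_exp_given_case * p_case
    denominator = numerator + p_exp_given_control * (1.0 - p_case)
    return numerator / denominator

# Hypothetical inputs: exposure is twice as common among cases as among
# controls, and 1% of the population are cases.
p = response_given_exposure(p_exp_given_case=0.40,
                            p_exp_given_control=0.20,
                            p_case=0.01)
print(round(p, 4))
```

Without the population proportion `p_case`, the retrospective sample alone does not determine this quantity, because the proportion of cases in the sample is fixed by the design.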


Example

  • England-Wales 1968-1972 study on heart attacks and oral contraceptive use

    • Calculate appropriate conditional distribution


Descriptive Statistics

  • Comparing proportions for binary responses

    • Difference of proportions

    • Relative risk

    • Odds ratios

  • Independence

    • X and Y response: p_ij = p_i+ p_+j, for all i, j

      • That is, p_j|i = p_+j, for all i, j

    • X explanatory, Y response: p_j|i = p_j|h, for each j, for all i, h
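
The independence condition can be checked numerically, as in this sketch. The helper `is_independent`, its tolerance, and both example tables are hypothetical; the check describes the observed proportions only and is not a statistical test.

```python
def is_independent(counts, tol=1e-9):
    """Return True if every joint proportion equals the product of its
    row and column marginal proportions (within a numerical tolerance)."""
    n = sum(sum(row) for row in counts)
    row_m = [sum(row) / n for row in counts]
    col_m = [sum(col) / n for col in zip(*counts)]
    return all(
        abs(counts[i][j] / n - row_m[i] * col_m[j]) <= tol
        for i in range(len(counts))
        for j in range(len(counts[0]))
    )

# Hypothetical tables: the rows of the first are proportional, the rows of the second are not.
print(is_independent([[10, 20], [30, 60]]))
print(is_independent([[30, 10], [20, 40]]))
```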


Descriptive Statistics

  • I x J tables

    • No completely satisfactory way to summarize association

    • Pairs of odds ratios

    • Concentration coefficient

    • Uncertainty coefficient


Difference of Proportions

  • Binary response variable

    • Generally, compare response for different explanatory levels

      • p_1|i - p_1|h

    • Difference lies between -1 and 1

    • Independence when difference equals 0 for all i,h and response levels j

    • Reasonable measure when absolute difference in proportions is relevant

    • Also can compare differences between columns
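
A small sketch of the row comparison described above, using hypothetical counts. The function name and its zero-based indices are illustrative choices, not notation from the slides.

```python
def difference_of_proportions(counts, i, h, j=0):
    """Difference between rows i and h in the conditional proportion of
    response column j (indices are zero-based)."""
    p_i = counts[i][j] / sum(counts[i])
    p_h = counts[h][j] / sum(counts[h])
    return p_i - p_h

# Hypothetical 2 x 2 table (rows = explanatory levels, columns = response levels).
counts = [[30, 10],
          [20, 40]]

diff = difference_of_proportions(counts, 0, 1)
print(round(diff, 4))
```

With a binary response, the difference for the second response column is the negative of the difference for the first, so only one needs to be reported.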


Difference of Proportions

  • Example


Difference of Proportions

  • Example


Difference of Proportions

  • Example


Relative Risk

  • Used when the relative difference between proportions is more relevant than the absolute difference

  • p_1|1 / p_1|2

    • Relative risk of 1 corresponds to independence

    • Comparing on the second response level gives a different value

  • Usually cannot be calculated directly from retrospective studies


Example

Risk for women having first child at 25 or older = .019 or 1.9%

Risk for women having first child before 25 = .0143 or 1.43%

Relative risk = .019/.0143 = 1.33

Increased risk = 33%
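
The arithmetic in this example can be reproduced from counts. The counts below (31 events and 1597 non-events in the 25-or-older group, 65 events and 4475 non-events in the under-25 group) are the ones used in the odds example elsewhere in this transcript; the variable and function names are illustrative.

```python
def risk(events, non_events):
    """Proportion of the group that experienced the event."""
    return events / (events + non_events)

# Counts as they appear in the transcript's odds example.
risk_25_or_older = risk(31, 1597)   # about .019
risk_under_25 = risk(65, 4475)      # about .0143

relative_risk = risk_25_or_older / risk_under_25
increased_risk_pct = (relative_risk - 1.0) * 100.0

print(round(relative_risk, 2), round(increased_risk_pct))
```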


Relative Risk

  • Example


Odds Ratio

  • For 2x2 table,

    • In row 1, odds of being in column 1 instead of column 2: O_1 = p_1|1 / p_2|1

    • In row 2, odds of being in column 1 instead of column 2: O_2 = p_1|2 / p_2|2

    • Odds ratio: O_1 / O_2


Odds Ratio

  • Takes values > 0

    • Sometimes look at log odds ratio

  • Invariant to interchanging rows and columns

    • Unnecessary to specify response variable

      • Unlike difference of proportions, and relative risk


Odds Ratio

  • Multiplicative invariance within given row or column

    • Like difference of proportions and relative risk

  • Equally valid for retrospective, prospective and cross-sectional studies


Example

Odds for women having first child at 25 or older = 31/1597 = .019/.981 = .0194

Odds for women having first child before 25 = 65/4475 = .0143/.9857 = .0145

Odds ratio = .0194/.0145 = 1.34
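
The same calculation as a short sketch, using the counts from this example. It also computes the log odds ratio and checks that transposing the table leaves the odds ratio unchanged. The function name and table layout are illustrative assumptions.

```python
import math

def odds_ratio(table):
    """Odds ratio of a 2 x 2 table [[n11, n12], [n21, n22]]."""
    (n11, n12), (n21, n22) = table
    return (n11 / n12) / (n21 / n22)

# Rows: first child at 25 or older, first child before 25.
# Columns: event, no event (counts from the example above).
table = [[31, 1597],
         [65, 4475]]

theta = odds_ratio(table)
log_theta = math.log(theta)

# Interchanging rows and columns (transposing) gives the same odds ratio.
transposed = [list(col) for col in zip(*table)]
theta_transposed = odds_ratio(transposed)

print(round(theta, 2), round(log_theta, 3), round(theta_transposed, 2))
```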


Relationship Between Odds Ratio and Relative Risk

  • Odds ratio = Relative risk × (1 - p_1|2) / (1 - p_1|1)

  • When the probability of the outcome of interest is small in both rows, the odds ratio can be used as an estimate of the relative risk
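
The identity above can be checked numerically. This sketch reuses the counts from the first-child example in this transcript (31 of 1628 and 65 of 4540); the variable names are illustrative.

```python
# Conditional probabilities of the outcome in each row.
p1 = 31 / (31 + 1597)   # row 1
p2 = 65 / (65 + 4475)   # row 2

relative_risk = p1 / p2
odds_ratio = (p1 / (1 - p1)) / (p2 / (1 - p2))

# Odds ratio = relative risk x (1 - p2) / (1 - p1)
from_identity = relative_risk * (1 - p2) / (1 - p1)

print(round(relative_risk, 3), round(odds_ratio, 3), round(from_identity, 3))
```

Because both probabilities are below 2%, the correction factor (1 - p2) / (1 - p1) is close to 1 and the two measures nearly coincide.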


Example


Interpreting Risks and Odds

  • Assess baseline risk

    • Example: Men who drink 16 ounces of beer a day are three times more likely to develop rectal cancer

  • Know time period of risk

    • Risks accumulate with time

      • Example: 1 in 9 women will develop breast cancer over their lifetime. But annual risk of women in their 30’s is 1 in 3700 and women in their 70’s is 1 in 235

  • Investigate confounding factors

    • Example: Older cars are almost 6 times as likely to be stolen as newer cars


Simpson’s Paradox

  • Survival rates for a standard and a new treatment at two hospitals


Relative Risk

  • Hospital A:

    • Risk of dying with standard treatment = 95/100 = .95

    • Risk of dying with new treatment = 900/1000 = .90

    • Relative risk = .95/.90 = 1.06


Relative Risk

  • Hospital B:

    • Risk of dying with standard treatment = 500/1000 = .5

    • Risk of dying with new treatment = 5/100 = .05

    • Relative risk = .5/.05 = 10.0


Combined Data

  • Group data from both hospitals

    • Risk of dying with standard treatment = 595/1100 = .54

    • Risk of dying with new treatment = 905/1100 = .82

    • Relative risk = .54/.82 = .66


What is Going On?

  • When the data are combined, we lose the information that the patients in Hospital A had BOTH a higher overall death rate AND a higher likelihood of receiving the new treatment

  • Misleading to summarize information over groups, especially if subjects were not randomly assigned to groups
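
The reversal can be reproduced directly from the hospital figures quoted in the preceding slides. The data structure and function names below are illustrative choices.

```python
# (deaths, patients) for each treatment at each hospital, as quoted in the slides.
hospitals = {
    "A": {"standard": (95, 100), "new": (900, 1000)},
    "B": {"standard": (500, 1000), "new": (5, 100)},
}

def relative_risk(group):
    """Risk of dying with the standard treatment divided by risk with the new one."""
    std_deaths, std_n = group["standard"]
    new_deaths, new_n = group["new"]
    return (std_deaths / std_n) / (new_deaths / new_n)

rr_by_hospital = {name: relative_risk(group) for name, group in hospitals.items()}

# Pool the two hospitals by adding deaths and patients within each treatment.
combined = {
    treatment: (
        sum(h[treatment][0] for h in hospitals.values()),
        sum(h[treatment][1] for h in hospitals.values()),
    )
    for treatment in ("standard", "new")
}
rr_combined = relative_risk(combined)

print({k: round(v, 2) for k, v in rr_by_hospital.items()}, round(rr_combined, 2))
```

Within each hospital the relative risk is above 1, so the new treatment looks better, yet the pooled relative risk is below 1. The unrounded pooled value is about .657, slightly different from the .66 obtained from the rounded risks on the slide.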


Confounding Variables

  • Television ownership versus movie attendance


Confounding Variables

  • Control for income


More Examples

  • Discrimination in college admission

  • Racial bias in death penalty sentences


College Admission Bias

  • Over a given number of years the University of California, Berkeley admitted 44% of all men who applied to any one of six graduate programs and only 30% of women who applied

    • Is there evidence of discrimination in graduate admissions at Berkeley?


College Admission Bias


College Admission Bias?


Death Penalty Sentences

  • Results of a 1981 Florida study of whether the race of a homicide defendant affects the likelihood that the defendant would receive the death penalty


Question

  • Based on the data, is race a factor in whether the death penalty is received and, if so, how?


Death penalty sentences

