Introduction to categorical descriptive statistics


Overview

  • Contingency tables

    • Notation

  • Descriptive statistics

    • Difference in proportions

    • Relative risk

    • Odds ratio

  • SPSS


Contingency Tables

  • Two dimensional tables

    • Let X and Y be categorical variables

      • X has I levels and Y has J levels

    • A contingency or cross-classification table is a tabular representation of the frequency counts for each pair of variable levels
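A minimal sketch of how such a table could be tallied from paired observations. The function name `contingency_table` and the sample data are hypothetical and not taken from the presentation.

```python
from collections import Counter

def contingency_table(xs, ys):
    """Cross-classify paired categorical observations.

    Returns (row_levels, col_levels, counts), where counts[i][j] is the
    number of observations with X at level i and Y at level j.
    """
    row_levels = sorted(set(xs))   # the I levels of X
    col_levels = sorted(set(ys))   # the J levels of Y
    pair_counts = Counter(zip(xs, ys))
    counts = [[pair_counts[(r, c)] for c in col_levels] for r in row_levels]
    return row_levels, col_levels, counts

# Hypothetical paired observations, for illustration only.
x = ["a", "a", "b", "b", "b", "a"]
y = ["yes", "no", "yes", "yes", "no", "yes"]
rows, cols, table = contingency_table(x, y)
```

Here X has I = 2 levels and Y has J = 2 levels, so `table` is a 2 x 2 list of frequency counts.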


Notation

  • Cell Counts


Example


Cell Proportions

  • Often we are less interested in the absolute counts within cells than in the relationship between the cell proportions

  • To analyze cell proportions properly, we need to know the experimental design and the relationship between the variables

    • All variables can be considered response variables

    • One (or more) response variable and one (or more) explanatory variable

      • Prospective study

      • Retrospective study


Two Variables

  • Proportion notation

    • {p_ij} gives the joint distribution

    • {p_i+} and {p_+j} represent the marginals

    • {p_j|i} is the conditional distribution of Y given level i of X
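The three kinds of proportions can be computed from a table of cell counts roughly as follows. This is a sketch only: the function name `distributions` and the counts are hypothetical, and it assumes every row has at least one observation.

```python
def distributions(counts):
    """Return joint, row-marginal, column-marginal and conditional proportions.

    counts[i][j] is the cell count for level i of X and level j of Y.
    """
    n = sum(sum(row) for row in counts)
    joint = [[c / n for c in row] for row in counts]          # p_ij
    row_marginal = [sum(row) for row in joint]                # p_i+
    col_marginal = [sum(col) for col in zip(*joint)]          # p_+j
    # p_j|i: each row of counts divided by its own row total
    conditional = [[c / sum(row) for c in row] for row in counts]
    return joint, row_marginal, col_marginal, conditional

# Hypothetical 2 x 2 counts, for illustration only.
counts = [[10, 30], [20, 40]]
joint, row_marginal, col_marginal, conditional = distributions(counts)
```

Each row of `conditional` sums to 1, while the whole of `joint` sums to 1.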


Example


Example

  • 1982 General Social Survey report on attitudes about death penalty and gun registration

    • Calculate joint, marginal and conditional distributions


Prospective Study

  • Subjects either select or are selected for treatment groups and then response is studied

    • Experimental

      • Subjects are randomly allocated to treatment groups

    • Observational

      • Subjects self-select treatment group

    • Principal aim is to compare conditional distribution of response for different levels of explanatory variable(s)


Example

  • Findings from the Aspirin Component of the Ongoing Physicians’ Health Study

    • Calculate conditional distribution


Retrospective Study

  • Given response, look back at levels of possible explanatory variables

    • Observational studies

  • Typically “over-sample” for response level of interest

  • If the overall population proportion in each response level is known, Bayes' theorem can be used to calculate the conditional distribution in the direction of interest
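A sketch of the Bayes' theorem step, assuming the retrospective sample gives P(X = i | Y = j) and the population proportions P(Y = j) come from an outside source. The function name `reverse_conditional` and all numbers are hypothetical.

```python
def reverse_conditional(x_given_y, y_prior):
    """Turn P(X=i | Y=j) into P(Y=j | X=i) with Bayes' theorem.

    x_given_y[j][i] = P(X=i | Y=j), estimated from the retrospective sample.
    y_prior[j]      = P(Y=j), the population proportion in response level j.
    Returns y_given_x with y_given_x[i][j] = P(Y=j | X=i).
    """
    n_x_levels = len(x_given_y[0])
    y_given_x = []
    for i in range(n_x_levels):
        # Numerators P(X=i | Y=j) * P(Y=j) for every response level j
        weights = [x_given_y[j][i] * y_prior[j] for j in range(len(y_prior))]
        total = sum(weights)  # P(X=i)
        y_given_x.append([w / total for w in weights])
    return y_given_x

# Hypothetical inputs: 40% of cases and 20% of controls were exposed,
# and 1% of the population are cases.
x_given_y = [[0.4, 0.6], [0.2, 0.8]]
y_prior = [0.01, 0.99]
y_given_x = reverse_conditional(x_given_y, y_prior)
```

With these made-up inputs, `y_given_x[0][0]` is the probability of being a case given exposure, which the retrospective sample alone cannot provide.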


Example

  • England-Wales 1968-1972 study on heart attacks and oral contraceptive use

    • Calculate appropriate conditional distribution


Descriptive Statistics

  • Comparing proportions for binary responses

    • Difference of proportions

    • Relative risk

    • Odds ratios

  • Independence

    • X and Y both response: p_ij = p_i+ × p_+j, for all i, j

      • That is, p_j|i = p_+j, for all i, j

    • X explanatory, Y response: p_j|i = p_j|h, for each j, for all i, h
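The first independence condition can be checked numerically along these lines. This is a sketch with hypothetical joint distributions, and the tolerance `tol` is an assumption needed only because of floating-point rounding.

```python
def is_independent(joint, tol=1e-9):
    """Check whether every joint proportion equals the product of its marginals."""
    row_marginal = [sum(row) for row in joint]
    col_marginal = [sum(col) for col in zip(*joint)]
    return all(
        abs(joint[i][j] - row_marginal[i] * col_marginal[j]) <= tol
        for i in range(len(joint))
        for j in range(len(joint[0]))
    )

# Hypothetical joint distributions, for illustration only.
# Built as the outer product of marginals (0.4, 0.6) and (0.3, 0.7).
independent_joint = [[0.12, 0.28], [0.18, 0.42]]
# Same row marginals, but the cells are not products of the marginals.
dependent_joint = [[0.30, 0.10], [0.10, 0.50]]

independent_result = is_independent(independent_joint)
dependent_result = is_independent(dependent_joint)
```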


Descriptive Statistics

  • I x J tables

    • No completely satisfactory way to summarize association

    • Pairs of odds ratios

    • Concentration coefficient

    • Uncertainty coefficient


Difference of Proportions

  • Binary response variable

    • Generally, compare response for different explanatory levels

      • p_1|i - p_1|h

    • Difference lies between -1 and 1

    • Independence when difference equals 0 for all i,h and response levels j

    • Reasonable measure when absolute difference in proportions is relevant

    • Also can compare differences between columns
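As a small illustration, the difference of proportions can be computed from the counts that appear in the first-child odds example later in this deck (31 versus 1597, and 65 versus 4475). Reading those pairs as outcome and no-outcome counts within each group is an assumption, and the function name is hypothetical.

```python
def difference_of_proportions(events_1, n_1, events_2, n_2):
    """Return p_1|1 - p_1|2 for two groups with a binary response."""
    return events_1 / n_1 - events_2 / n_2

# Counts as read from the deck's odds example (group totals are the sums).
older_events, older_total = 31, 31 + 1597
younger_events, younger_total = 65, 65 + 4475

diff = difference_of_proportions(
    older_events, older_total, younger_events, younger_total
)
```

The result is about 0.0047, well inside the range of -1 to 1, and it would be 0 under independence.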


Difference of Proportions

  • Example


Difference of Proportions

  • Example


Difference of Proportions

  • Example


Relative Risk

  • Used when relative difference between proportions more relevant than absolute difference

  • p_1|1 / p_1|2

    • Relative risk of 1 corresponds to independence

    • Comparing on the second response level gives a different relative risk, p_2|1 / p_2|2

  • Usually cannot be calculated directly from retrospective studies


Example

Risk for women having first child at 25 or older = .019 or 1.9%

Risk for women having first child before 25 = .0143 or 1.43%

Relative risk = .019/.0143 = 1.33

Increased risk = 33%
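The arithmetic above can be reproduced from counts. This sketch assumes the counts behind the two risks are 31 of 1628 and 65 of 4540, which is how the odds example later in the deck (31/1597 and 65/4475) appears to break down. The function name is hypothetical.

```python
def relative_risk(events_1, n_1, events_2, n_2):
    """Return the ratio of the risk in group 1 to the risk in group 2."""
    return (events_1 / n_1) / (events_2 / n_2)

risk_older = 31 / (31 + 1597)      # first child at 25 or older
risk_younger = 65 / (65 + 4475)    # first child before 25
rr = relative_risk(31, 31 + 1597, 65, 65 + 4475)

# Percentage increase in risk relative to the younger group.
increased_risk_percent = (rr - 1) * 100
```

Rounded to two decimals, `rr` agrees with the 1.33 shown on the slide.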


Relative Risk

  • Example


Odds Ratio

  • For 2x2 table,

    • In row 1, odds of being in column 1 instead of column 2: O_1 = p_1|1 / p_2|1

    • In row 2, odds of being in column 1 instead of column 2: O_2 = p_1|2 / p_2|2

    • Odds ratio: O_1 / O_2


Odds Ratio

  • Takes values > 0

    • Sometimes look at log odds ratio

  • Invariant to interchanging rows and columns

    • Unnecessary to specify response variable

      • Unlike difference of proportions, and relative risk


Odds Ratio

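  • Invariant to multiplying the cell counts within a given row or column by a constant

    • Difference of proportions and relative risk share this invariance only for multiplication within rows (levels of the explanatory variable), not within columns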

  • Equally valid for retrospective, prospective and cross-sectional studies
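The invariance properties can be checked numerically. The sketch below uses the counts from the deck's first-child example as a 2 x 2 table, with rows treated as the explanatory variable. The helper names and the scaling factor of 10 are arbitrary choices for illustration.

```python
import math

def odds_ratio(t):
    """Cross-product ratio n11 * n22 / (n12 * n21) of a 2 x 2 table."""
    (a, b), (c, d) = t
    return (a * d) / (b * c)

def relative_risk(t):
    """Risk of column 1 in row 1 divided by risk of column 1 in row 2."""
    (a, b), (c, d) = t
    return (a / (a + b)) / (c / (c + d))

table = [[31, 1597], [65, 4475]]
transposed = [list(col) for col in zip(*table)]          # swap rows and columns
row_scaled = [[10 * v for v in table[0]], table[1]]      # multiply row 1 by 10
col_scaled = [[10 * row[0], row[1]] for row in table]    # multiply column 1 by 10

or_same_transposed = math.isclose(odds_ratio(table), odds_ratio(transposed))
or_same_row_scaled = math.isclose(odds_ratio(table), odds_ratio(row_scaled))
or_same_col_scaled = math.isclose(odds_ratio(table), odds_ratio(col_scaled))
rr_same_row_scaled = math.isclose(relative_risk(table), relative_risk(row_scaled))
rr_same_col_scaled = math.isclose(relative_risk(table), relative_risk(col_scaled))
```

In this sketch the odds ratio is unchanged in all three cases. The relative risk survives the row scaling but not the column scaling, which is one way to see why over-sampling a response level in a retrospective study distorts the relative risk but not the odds ratio.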


Example

Odds for women having first child at 25 or older = 31/1597 = .019/.981 = .0194

Odds for women having first child before 25 = 65/4475 = .0143/.9857 = .0145

Odds ratio = .0194/.0145 = 1.34
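The same numbers can be recomputed directly from the four counts. This is a sketch, and the function names are hypothetical.

```python
def odds(events, non_events):
    """Odds of the outcome: count with the outcome over count without it."""
    return events / non_events

def odds_ratio(events_1, non_events_1, events_2, non_events_2):
    """Ratio of the odds in group 1 to the odds in group 2."""
    return odds(events_1, non_events_1) / odds(events_2, non_events_2)

odds_older = odds(31, 1597)      # first child at 25 or older
odds_younger = odds(65, 4475)    # first child before 25
or_value = odds_ratio(31, 1597, 65, 4475)

# Equivalent cross-product form n11 * n22 / (n12 * n21).
cross_product = (31 * 4475) / (1597 * 65)
```

Rounded to two decimals, `or_value` matches the 1.34 shown above.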


Relationship Between Odds Ratio and Relative Risk

  • Odds ratio = Relative risk × (1 - p_1|2) / (1 - p_1|1)

  • When the probability of the outcome of interest is small in both rows, the odds ratio can be used as an estimate of the relative risk
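A quick numerical check of this relationship, using the risks from the deck's first-child example. The variable names are hypothetical.

```python
import math

p1 = 31 / (31 + 1597)    # p_1|1: risk in row 1
p2 = 65 / (65 + 4475)    # p_1|2: risk in row 2

relative_risk = p1 / p2
odds_ratio = (p1 / (1 - p1)) / (p2 / (1 - p2))

# Correction factor linking the two measures.
factor = (1 - p2) / (1 - p1)
identity_holds = math.isclose(odds_ratio, relative_risk * factor)
```

Because both risks are below 2%, `factor` is within about half a percent of 1, so the odds ratio and the relative risk nearly coincide (roughly 1.34 against 1.33).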


Example


Interpreting Risks and Odds

  • Assess baseline risk

    • Example: Men who drink 16 ounces of beer a day are three times more likely to develop rectal cancer

  • Know time period of risk

    • Risks accumulate with time

      • Example: 1 in 9 women will develop breast cancer over their lifetime. But the annual risk is 1 in 3700 for women in their 30s and 1 in 235 for women in their 70s

  • Investigate confounding factors

    • Example: Older cars are almost 6 times as likely to be stolen as newer cars


Simpson’s Paradox

  • Survival rates for a standard and a new treatment at two hospitals


Relative Risk

  • Hospital A:

    • Risk of dying with standard treatment = 95/100 = .95

    • Risk of dying with new treatment = 900/1000 = .90

    • Relative risk = .95/.90 = 1.06


Relative Risk

  • Hospital B:

    • Risk of dying with standard treatment = 500/1000 = .5

    • Risk of dying with new treatment = 5/100 = .05

    • Relative risk = .5/.05 = 10.0


Combined Data

  • Group data from both hospitals

    • Risk of dying with standard treatment = 595/1100 = .54

    • Risk of dying with new treatment = 905/1100 = .82

    • Relative risk = .54/.82 = .66


What is Going On?

  • When the data are combined, we lose the information that the patients in Hospital A had BOTH a higher overall death rate AND a higher likelihood of receiving the new treatment

  • It is misleading to summarize information over groups, especially if subjects were not randomly assigned to groups
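The reversal can be reproduced from the hospital counts given on the preceding slides. This is a sketch, and the dictionary layout and function name are illustrative choices.

```python
# (deaths, patients) per treatment arm, as given on the hospital slides.
data = {
    "A": {"standard": (95, 100), "new": (900, 1000)},
    "B": {"standard": (500, 1000), "new": (5, 100)},
}

def relative_risk(standard, new):
    """Risk of dying on standard treatment divided by risk on new treatment."""
    return (standard[0] / standard[1]) / (new[0] / new[1])

rr_by_hospital = {
    name: relative_risk(arms["standard"], arms["new"])
    for name, arms in data.items()
}

# Pool deaths and patients over both hospitals for each arm.
combined = {
    arm: tuple(sum(data[name][arm][k] for name in data) for k in (0, 1))
    for arm in ("standard", "new")
}
rr_combined = relative_risk(combined["standard"], combined["new"])
```

Within each hospital the relative risk is above 1, which favors the new treatment, yet the pooled relative risk falls below 1. Hospital A contributes 1000 of the 1100 new-treatment patients and has the higher death rate, which is what drives the reversal.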


Confounding Variables

  • Television ownership versus movie attendance


Confounding Variables

  • Control for income


More Examples

  • Discrimination in college admission

  • Racial bias in death penalty sentences


College Admission Bias

  • Over a given number of years the University of California, Berkeley admitted 44% of all men who applied to any one of six graduate programs and only 30% of women who applied

    • Is there evidence of discrimination in graduate admissions at Berkeley?


College Admission Bias


College Admission Bias?


Death Penalty Sentences

  • Results of a 1981 Florida study of whether the race of a homicide defendant affects the likelihood that the defendant would receive the death penalty


Question

  • Based on the data, is race a factor in whether the death penalty is received and, if so, how?


Death Penalty Sentences

