Lecture 2: Statistical Overview




Child Psychiatry

Research Methods Lecture Series

Lecture 2: Statistical Overview

Elizabeth Garrett

esg@jhu.edu

Two Types of Statistics
• Descriptive Statistics
• Uses sample statistics (e.g. mean, median, standard deviation) to describe the sample and the population from which it was drawn.
• Not “decision” oriented
• Pilot studies are descriptive
• Statistical Inference
• Inference: The act of passing from statistical sample data to generalizations ... usually with calculated degrees of certainty.
• Key elements:
• sample
• generalizations
• certainty
• Often used for making decisions:
• drug works or it doesn’t
• ADHD is genetically inherited or it isn’t
Example 1: “Viral Exposure and Autism” (Deykin and MacMahon, 1979)
• Hypothesis:
• Direct exposure to or clinical illness with measles, mumps, or chicken pox may play a causal role in autism.
Example 2: “Neurobiology of Attention in Fetal Alcohol Syndrome” (Lockhart, 2001?)

Hypotheses:

(1) The neurobiological basis of problems in response inhibition and motor impersistence in children with FAS is related to abnormalities in the “anterior” frontostriatal network.

(2) The neurobiological basis of problems in orienting/shifting attention in children with FAS is related to abnormalities in the “posterior” parietal network.

4 Statistical Plan

4.1 Primary outcome(s)

4.2 Statistical analysis

4.3 Sample size justification

4 Statistical Plan

4.1 Primary outcome(s)

Common Problem: Primary outcome variable not defined!

Defining Primary Outcome Variables

Continuous

• MRI volumes
• fMRI activation levels
• blood pressure
• response time
• number of voxels activated
• cost of hospital visit
• neurobehavioral test score
Categorical

Nominal
• Binary (two categories): gene carrier status (as diagnosed by….), measles (as diagnosed by….)
• Polychotomous (more than two unordered categories): region of activation

Ordinal
• severity score (see BPI)
• symptom rating
• “on a scale of 1 to 5….”

Example 1: Primary outcomes

Disease history of

• measles
• mumps
• chicken pox

Example 2: Primary outcomes

MRI volumes of

• corpus callosum
• caudate
• cerebellar vermis
• parietal lobes
• frontal lobe

4 Statistical Plan

4.1 Primary outcomes

- Be clear about each variable and how it is measured.

- NOT okay to say “our primary outcome variable is cognition.”

- It IS okay to say “our primary outcome variable is cognition as measured by the WISC-III.”

- Multiple outcomes are okay: e.g. MRI volumes and cognitive tests can both be primary outcomes.

4 Statistical Plan

4.1 Primary outcome(s)

4.2 Statistical analysis

- How are you going to answer the specific aims using the primary outcome variable?

Commonly seen statistical methods in analysis plans:
• t-test
• confidence interval
• Chi-square test
• Fisher’s exact test
• linear regression
• logistic regression
• Wilcoxon rank sum test
• ANOVA
• GEE
Key Idea: Data Reduction
• Statistics is the art/science of summarizing a large amount of information by just a few numbers and/or statements.
• Examples:

pvalue = 0.01

OR = 5.0

prevalence = 0.20 ± 0.05

Example 1:
• Recall aim: To compare measles history in autistic versus non-autistic kids.
• Methods:
• Odds ratio: Quantifies risk of disease in two exposure groups
• Confidence interval: Answers “What is a reasonable range for the true odds ratio?”
• Fisher’s exact test: Answers “Is the risk the same in the two exposure groups?”
Statistical Analysis

“We will measure the risk of autism associated with measles using an odds ratio. Significance will be assessed by Fisher’s exact test and a 95% confidence interval will be calculated.”
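As a rough illustration of the three methods named above, the sketch below computes an odds ratio, a 95% confidence interval, and a two-sided Fisher's exact p-value for a 2x2 table. It is a minimal sketch, not the analysis used in the study: the counts are hypothetical, the function names are made up for this example, and the interval uses the common log-scale (Woolf) approximation, which is only one of several ways to build a confidence interval for an odds ratio.

```python
from math import comb, exp, log, sqrt
from statistics import NormalDist


def odds_ratio_ci(a, b, c, d, level=0.95):
    """Odds ratio and log-scale (Woolf) confidence interval for a 2x2 table.

    a = exposed cases,    b = unexposed cases,
    c = exposed controls, d = unexposed controls.
    """
    or_hat = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return or_hat, exp(log(or_hat) - z * se_log), exp(log(or_hat) + z * se_log)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value.

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))


# Hypothetical counts, for illustration only (not data from the study):
# 6 of 100 autistic children exposed to measles, 2 of 100 controls exposed.
a, b, c, d = 6, 94, 2, 98
or_hat, lower, upper = odds_ratio_ci(a, b, c, d)
p_value = fisher_exact_two_sided(a, b, c, d)
print(f"OR = {or_hat:.2f}, 95% CI = ({lower:.2f}, {upper:.2f}), p = {p_value:.3f}")
```

With few exposed subjects the interval is wide, which is why the confidence interval and the test are reported alongside the odds ratio rather than the odds ratio alone.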

Example 2:
• Recall aim: To compare MRI volumes in FAS kids and controls.
• Methods:
• Two-sample t-test: Answers “are the mean volumes in the two groups different?”
• 95% confidence interval: Answers “what is the estimated difference in volumes in the two groups, approximately?”
Statistical Analysis

“To answer the specific aims, we will compare the caudate volumes in the FAS group to those in the control group using a two sample t-test. We will also estimate a 95% confidence interval to provide a reasonable range of the difference in mean volumes in the two groups.”
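The comparison described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the volumes are hypothetical numbers, the function name is made up, equal variances are assumed (pooled t-test), and the interval uses a normal critical value in place of the t critical value, which makes it slightly too narrow in small samples.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def two_sample_t(x, y, level=0.95):
    """Pooled-variance two-sample t statistic, its degrees of freedom,
    and an approximate confidence interval for mean(x) - mean(y)."""
    nx, ny = len(x), len(y)
    diff = mean(x) - mean(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    se = sqrt(pooled_var * (1 / nx + 1 / ny))
    t_stat = diff / se
    # Assumption: normal critical value used instead of the t critical value.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return t_stat, nx + ny - 2, (diff - z * se, diff + z * se)


# Hypothetical caudate volumes, for illustration only.
control = [455, 430, 520, 390, 470, 445]
fas = [410, 380, 350, 440, 395, 425]

t_stat, df, (lower, upper) = two_sample_t(control, fas)
print(f"t = {t_stat:.2f} on {df} df, approx. 95% CI = ({lower:.1f}, {upper:.1f})")
```

The t statistic answers "are the means different?", while the interval gives the plausible size of the difference.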

4 Statistical Plan

4.1 Primary outcome(s)

4.2 Statistical analysis

- Data reduction is key: How are you going to combine information from all patients to answer scientific question?

- Specific methods need to be designated.

- Study design often changes after statistical issues are considered!

4 Statistical Plan

4.1 Primary outcome(s)

4.2 Statistical analysis

4.3 Sample size justification

- Do you have enough subjects to answer the question, but not so many that the study is inefficient (in terms of money and risks)?

Power and Sample Size Considerations
• All about precision! (Recall Craig last time)
• Intuition:
• the more individuals, the better your estimate
• the more individuals, the less variability in your estimate
• the more individuals, the more precise your estimate
• but, how precise need your estimate be?
• Example 1:
• Odds ratio of measles for autism: 3.7
• Interpretation: Babies exposed to measles prenatally or in early infancy have 3.7 times the odds of autism compared to children who are unexposed.
• Strong result?
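The intuition above (more individuals, less variability, more precision) can be made concrete with the standard error of a sample mean, sd / sqrt(n). The snippet is a small sketch; the standard deviation of 70 is an assumed value borrowed from the MRI volume example in this lecture, and the sample sizes are arbitrary.

```python
from math import sqrt


def standard_error(sd, n):
    """Standard error of a sample mean: sd / sqrt(n)."""
    return sd / sqrt(n)


sd = 70  # assumed standard deviation, used here only for illustration
for n in (10, 40, 160):
    print(f"n = {n:3d}  standard error = {standard_error(sd, n):.1f}")
```

Quadrupling the sample size halves the standard error, so each extra gain in precision costs more subjects than the last.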
Three Theoretical Outcomes

95% confidence intervals

Actual Result from Study

95% Confidence interval: (0.97, 14.2)

Fisher’s exact pvalue = 0.12

Magnitude versus Significance
• Magnitude of finding: How big is the odds ratio?
• Statistical significance of the finding: Is the odds ratio different from 1?
• Clinical significance of the finding: Is the size of the estimated odds ratio worth worrying about?
• Autism and Measles:
• exposure to measles is rare
• need a lot of subjects to show significant difference!
Justifying sample size in a study design

Hypothesis testing:

Ho: OR=1

Ha: OR=3

Which is a more reasonable conclusion?

Issues:
• type I error (α)
• type II error (β)

[Figure: Ho versus Ha]

Type I and II Errors
• Type I error (α):
• The probability that we reject Ho given that it is true
• The probability that we find an association between measles and autism when, in truth, one does not exist.
• Type II error (β):
• The probability that we reject Ha given that it is true
• The probability that we find no association between measles and autism when, in truth, one does exist.
• Note: Power = 1 - β
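The two error rates defined above can be estimated by simulating many studies. The sketch below is an illustration only: it uses a difference in means rather than an odds ratio, a two-sided z-test with known standard deviation, and arbitrary assumed values (an effect of 0.5 standard deviations, 30 subjects per group); the function name is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def rejection_rate(true_diff, n_per_group, sd=1.0, alpha=0.05, n_sims=2000, seed=1):
    """Fraction of simulated studies in which a two-sided z-test rejects Ho: diff = 0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sd * sqrt(2 / n_per_group)
    rejections = 0
    for _ in range(n_sims):
        group_a = [rng.gauss(true_diff, sd) for _ in range(n_per_group)]
        group_b = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        observed = sum(group_a) / n_per_group - sum(group_b) / n_per_group
        if abs(observed) / se > z_crit:
            rejections += 1
    return rejections / n_sims


# Ho true (no difference): the rejection rate estimates the type I error.
alpha_hat = rejection_rate(true_diff=0.0, n_per_group=30)
# Ha true (difference of 0.5 sd): the rejection rate estimates power = 1 - beta.
power_hat = rejection_rate(true_diff=0.5, n_per_group=30)
print(f"estimated alpha = {alpha_hat:.3f}")
print(f"estimated power = {power_hat:.3f}, estimated beta = {1 - power_hat:.3f}")
```

The estimated alpha should land near the chosen 0.05, while beta depends on the effect size and on the sample size.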
Sample size dictates overlap

Scenario 1: Small samples

Scenario 2: Large samples

Decision Rule
• Before study is completed, you know what you need to observe to find evidence for OR=1 or OR=3
• Scenario 1: If observed OR > 3.6, then conclude that there IS an association
• Scenario 2: If observed OR > 1.6, then conclude that there IS an association.
Type I Error: alpha

Alpha is usually predetermined: alpha = 0.05.

Type II Error: beta

Beta is figured out conditional on alpha.

If sample size is small, beta will be big (e.g. β = 0.60).

If sample size is big, beta will be small (e.g. β = 0.02).

Power: 1 - beta

Power is 1 - beta.

If sample size is small, power will be small (e.g. power = 0.40).

If sample size is large, power will be large (e.g. power = 0.98).

Power/Sample Size Estimate
• Kids with Autism: N = 608
• Kids without Autism: N = 1216

“Using Fisher’s exact test, we have 80% power with alpha = 0.05 to detect an odds ratio of 3 if we enroll 608 children with autism and 1216 normal controls. This assumes that 3% of autistic children have been exposed to measles and 1% of the controls have been exposed.”
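A quick plausibility check of this statement can be done with a normal approximation for comparing two proportions. This is a sketch under stated assumptions, not the calculation behind the statement: the statement is based on Fisher's exact test, which is more conservative, so the approximation is expected to give somewhat higher power. The function names are made up for this example.

```python
from math import sqrt
from statistics import NormalDist


def odds_ratio(p1, p2):
    """Odds ratio implied by two exposure proportions."""
    return (p1 / (1 - p1)) / (p2 / (1 - p2))


def power_two_proportions(p1, n1, p2, n2, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions."""
    norm = NormalDist()
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se_null = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))  # SE if Ho is true
    se_alt = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)     # SE if Ha is true
    z_crit = norm.inv_cdf(1 - alpha / 2)
    return norm.cdf((abs(p1 - p2) - z_crit * se_null) / se_alt)


# Values taken from the statement above: 3% of 608 cases and 1% of 1216 controls exposed.
implied_or = odds_ratio(0.03, 0.01)
approx_power = power_two_proportions(0.03, 608, 0.01, 1216)
print(f"implied odds ratio = {implied_or:.2f}")
print(f"approximate power  = {approx_power:.2f}")
```

The exposure proportions of 3% and 1% imply an odds ratio close to 3, and the approximate power comes out a little above 80%, in line with the statement.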

Example 2: FAS and controls
• How many FAS children and controls do we need to detect a significant difference in MRI volumes?
• From previous research we can estimate (i.e. guess):
• Volumes of cerebellar vermis in FAS kids are approximately 400.
• It would be interesting if FAS kids had volumes at least 10% smaller than those of normal controls (i.e. 400 versus 450).

[Figure: control versus FAS]

Two sample t-test
• Same general approach as the odds ratio
• Define Δ = difference in mean volumes = control mean - FAS mean
• H0: Δ = 0
• Ha: Δ = 50
• Same thing: which hypothesis is more reasonable based on our data?
• Note: Based on previous research, we can estimate that the standard deviation of volumes is 70.
What if N = 100 (50 per group)?

Alpha = 0.05

Beta = 0.06
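The beta quoted above can be roughly reproduced with a normal approximation to the two-sample t-test. This is a sketch, assuming a difference of 50, a standard deviation of 70 in each group, and 50 subjects per group as in the example; the function name is hypothetical, and an exact calculation based on the t distribution gives a slightly larger beta.

```python
from math import sqrt
from statistics import NormalDist


def power_two_means(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided test for a difference in two means.

    Uses the normal approximation and ignores the negligible chance of
    rejecting in the wrong direction.
    """
    norm = NormalDist()
    se = sd * sqrt(2 / n_per_group)       # standard error of the difference
    z_crit = norm.inv_cdf(1 - alpha / 2)  # 1.96 for alpha = 0.05
    return norm.cdf(delta / se - z_crit)


power = power_two_means(delta=50, sd=70, n_per_group=50)
beta = 1 - power
print(f"power = {power:.2f}, beta = {beta:.2f}")
```

With 50 per group the study has more power than the conventional 80% or 90% targets require, which suggests a smaller sample would be enough.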

Power/Sample Size Options
• For power = 80%, alpha = 0.05

32 FAS and 32 controls

• For power = 90%, alpha = 0.05

43 FAS and 43 controls

“To achieve 80% power with a type I error of 5%, we require 32 FAS kids and 32 controls. This will allow us to detect a 10% difference in mean MRI volumes of cerebellar vermis (400 versus 450, respectively) assuming standard deviations of 70 in each group.”
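The sample sizes in this justification can be approximated with the usual normal-theory formula, n per group = 2 (z_alpha/2 + z_beta)^2 sd^2 / delta^2. The sketch below uses the values from the example (difference 50, standard deviation 70); the function name is hypothetical. Because it uses normal rather than t critical values, it returns one subject per group fewer than the 32 and 43 quoted above, which is the typical size of that discrepancy.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, power=0.80, alpha=0.05):
    """Approximate subjects per group for a two-sided, two-sample comparison of means."""
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2)
    z_beta = norm.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


for target in (0.80, 0.90):
    n = n_per_group(delta=50, sd=70, power=target)
    print(f"power = {target:.0%}: about {n} FAS and {n} controls")
```

Note how sensitive the answer is to the guessed inputs: halving the detectable difference to 25 quadruples the required sample size.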

4 Statistical Plan

4.1 Primary outcome(s)

4.2 Statistical analysis

4.3 Sample size justification

- Explain justification in terms of statistics. Saying “we are confident that 10 subjects will provide….” is not sufficient.
