Statistics for GP and the AKT


## PowerPoint Slideshow about 'Statistics for GP and the AKT' - sara-glenn

Presentation Transcript

### Statistics for GP and the AKT

Sept ‘11

Aims
• Be able to understand statistical terminology, interpret stats in papers and explain them to patients.
• Pass the AKT
Why should you care?
• 10% of questions
• Much less than 10% of the work
• Easy marks
Plan – don’t despair!
• Representing data:
• Parametric v non parametric data
• Normal distribution and standard deviation
• Types of data
• Mean, median, mode
• Prevalence and incidence
• Types of research:
• Types of studies
• Types of bias
• Tests of statistical significance
• Significance of results:
• P value
• Confidence intervals
• Type 1 and type 2 error
• Magnitude of results:
• NNT, NNH
• Absolute risk reduction, Relative risk reduction
• Hazard ratio
• Odds ratio
• Clinical tests
• Sensitivity, specificity
• Positive predictive value, negative predictive value
• Likelihood ratios for positive and negative test
• Pretty pictures:
• Forest plot
• Funnel plot
• Kaplan-Meier survival curve
The Normal Distribution
• Frequency on y axis and continuous variable on x
• Symmetrical, just as many have more than average as less than average
• Generally true for medical tests and measurements
Standard deviation
SD and the normal distribution
• 68.3% of data within 1SD
• 95.4% of data within 2SD
• 99.7% of data within 3SD
• 95% of data within 1.96 SD
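The coverage figures for a normal distribution can be checked numerically. The sketch below uses only the Python standard library; the function name `fraction_within` is made up for this illustration.

```python
import math

def fraction_within(k_sd: float) -> float:
    """Fraction of a normal distribution lying within k_sd SDs of the mean."""
    # For a normal distribution, P(|Z| <= k) = erf(k / sqrt(2)).
    return math.erf(k_sd / math.sqrt(2))

for k in (1, 1.96, 2, 3):
    print(f"within {k} SD: {fraction_within(k):.1%}")
```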
Defining ‘normal’
• Can be used to define normal for medical tests e.g. Na
• But by definition 5% of ‘normal’ people will fall outside the range: 2.5% ‘too high’ and 2.5% ‘too low’.
Parametric and non-parametric
• If it’s normally distributed, it’s parametric
• If it’s skewed, it’s non-parametric
Mean, median and mode
• Use mean for parametric data
• Median for non parametric data
• In a normal distribution:

Mean = median = mode

• For a negatively skewed distribution:

Mean < median < mode

• For a positively skewed distribution:

Mean > median > mode

• Remember alphabetical order, < for negative, > for positive
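As a quick check of the ordering for a positively skewed distribution, here is a small sketch using Python's `statistics` module. The sample values are invented for illustration (think of something like number of GP visits per year, where a few people attend very often).

```python
import statistics

# Hypothetical, positively skewed sample: most values are low, a few are high.
visits = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

mean = statistics.mean(visits)      # pulled upwards by the long right tail
median = statistics.median(visits)  # middle value, resistant to the tail
mode = statistics.mode(visits)      # most common value

print(f"mean={mean}, median={median}, mode={mode}")
print("mean > median > mode:", mean > median > mode)
```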
Types of data
• Continuous – can take any value e.g. height
• Discrete – can only take integers e.g. number of asthma attacks
• Nominal – into categories in no particular order e.g. colour of smarties
• Ordinal – into categories with an inherent rank e.g. Bristol stool chart
Prevalence and incidence
• Prevalence – proportion of people that have a disease at a given time
• Incidence – number of new cases per population per time
• Prevalence = incidence x length of disease
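A minimal sketch of the prevalence relationship, assuming a stable population and a roughly constant incidence and disease duration (the relationship is only approximate otherwise). The numbers and the function name are hypothetical.

```python
def approximate_prevalence(incidence_per_person_year: float,
                           mean_duration_years: float) -> float:
    """Approximate prevalence as incidence x average length of disease."""
    return incidence_per_person_year * mean_duration_years

# Hypothetical disease: 2 new cases per 1,000 people per year, lasting 5 years on average.
prevalence = approximate_prevalence(2 / 1000, 5)
print(f"about {prevalence * 1000:.0f} per 1,000 people have the disease at any one time")
```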
Group work

For each of RCT, cohort, case control and cross sectional studies:
• Definition
• Strengths
• Weaknesses
• Example where it would be the most appropriate study to use

Types of research
RCT
• Interventional study
• Used to compare treatment(s) with a control group.
• Control group have placebo or current best treatment.
• Best evidence but….
• Expensive and ethical problems
• Two types
• Group comparative
• Cross-over

Cohort

[Diagram: Population → selection → Exposed (→ Disease or Well) and Not exposed (→ Disease or Well)]
• Longitudinal/follow-up studies.
• Usually prospective
• Assessed using relative risk

Case control

[Diagram: Population → selection → Disease and Well groups, each split over Time into Exposed and Not exposed]
• Usually retrospective
• Reverse cohort study
• Assessed using odds ratio
Cross-sectional
• Prevalence study
• Evaluate a defined population at a specific time.
• Used to assess disease status and compare populations
Levels of Evidence
• Ia – Meta analysis of RCTs
• Ib – RCT(s)
• IIa – well designed non-randomised trial(s)
• IIb – well designed quasi-experimental trial(s)
• III – case, correlation and comparative studies
• IV – panel of experts
Grades of recommendation
• A – levels Ia and Ib
• B – levels IIa, IIb and III
• C – level IV

Bias
• Confounding
• Observer
• Publication
• Sampling
• Selection

CARD SORT

For bonus points, spot the odd one out!

Bias
• Confounding
• Exposed and non-exposed groups differ with respect to characteristics independent of risk factor.
• Observer
• The patient/clinician know which treatment is being received.
• Outcome measure has a subjective element.
• Publication
• Statistically significant (positive) results are more likely to be published
• Negative results are less likely to be published
• Sampling
• Non-random selection from target population.
• Selection
• Intervention allocation to the next person is known before recruitment.
Avoiding Bias
• Confounding
• Study design
• Observer
• Blinding
• Publication
• Journals accept more studies with non-significant results
• Sampling
• Compare groups statistically
• Selection
• Randomisation
Types of significance tests: Qualitative
• Single sample (my sample vs manufacturer’s claim)
• Binomial test
• >1 independent sample (drug A vs drug B)
• Small sample – Fisher exact test
• Larger sample – Chi-squared
• Dependent sample
• Percentage agreement (+/- Kappa statistic)
Types of significance tests: Quantitative – Parametric
• Single sample
• Student one-sample t-test
• Two independent samples
• Student independent samples t-test
• Two dependent samples
• Student dependent samples t-test
• >2 independent samples
• One-way ANOVA
• >2 dependent samples
• Repeated measures ANOVA
• Correlation
• Pearson correlation coefficient
Types of significance tests: Quantitative – Non-parametric
• Single sample
• Kolmogorov-Smirnov test
• Two independent samples
• Mann-Whitney
• Two dependent samples
• Wilcoxon matched pairs signed rank test
• >2 independent samples
• Kruskal-Wallis test
• >2 dependent samples
• Friedman test
• Correlation
• Spearman
Types of significance tests: summary table

*Chi-squared can also be used to compare quantitative data if you look at proportions/percentages

P value

“The p value is equal to the probability of achieving a result at least as extreme as the experimental outcome by chance”

• Usually significance level is 0.05

i.e. if there were really no difference, a result at least this extreme would be seen by chance less than 5% of the time

Hypothesis
• Null hypothesis – states that there is no difference between the 2 treatments
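To make the p value definition concrete, here is a hedged sketch of a one-sided exact binomial test using only the standard library. The scenario is invented: the null hypothesis says a treatment works in 50% of patients, and we see 9 responders out of 10. The function name `binomial_upper_tail` is an assumption for this example.

```python
from math import comb

def binomial_upper_tail(successes: int, trials: int, p_null: float) -> float:
    """P(X >= successes) if X ~ Binomial(trials, p_null), i.e. if the null is true."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Hypothetical data: 9 responders out of 10, null hypothesis response rate 50%.
p_value = binomial_upper_tail(successes=9, trials=10, p_null=0.5)
print(f"one-sided p = {p_value:.4f}")
print("significant at 0.05:", p_value < 0.05)
```

The p value here is the probability of a result at least as extreme as 9 out of 10, calculated on the assumption that the null hypothesis is true.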
Errors
• Type I error:
• False positive
• The null hypothesis is rejected when it is true
• Probability is equal to the significance level set (usually 0.05)
• Depends on significance level set not on sample size
• Risk increased if multiple end points
• Type II error:
• False negative
• The null hypothesis is accepted when it is false i.e. fail to find a statistically significant difference that really exists
• More likely if small sample size
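The extra risk of a type I error with multiple end points can be sketched as follows. This assumes the end points are tested independently, each at the same significance level, which is a simplification; the function name is hypothetical.

```python
def chance_of_any_false_positive(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across n independent tests."""
    # Each test avoids a false positive with probability (1 - alpha).
    return 1 - (1 - alpha) ** n_tests

for n in (1, 5, 10, 20):
    risk = chance_of_any_false_positive(0.05, n)
    print(f"{n:>2} end points: {risk:.0%} chance of at least one false positive")
```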
Confidence intervals
• 95% confidence interval means you are 95% sure that the result for the true population lies within this range
• The bigger the sample, i.e. the more representative of the true population, the smaller the confidence interval.
Confidence intervals (the maths)
• For 95% confidence interval:

Mean ± 1.96 x SEM

• Standard error of the mean

= SD / √n

i.e. standard deviation divided by square root of number of samples

As number of samples increases, SEM decreases.

Confidence intervals
• We measure the concentration span of a sample of 36 VTS trainees. The mean concentration span is 2.4 seconds and the standard deviation is 1.2 seconds.
• What is the approximate 95% confidence interval?
• 1.2 – 3.6 seconds
• Too short to measure and getting shorter
• 2.2 – 2.6 seconds
• 2.3 – 2.5 seconds
• 2.0 – 2.8 seconds
• I don’t care
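The question above can be worked through with the formula mean ± 1.96 x SEM, where SEM = SD / √n. A minimal sketch (the function name is made up for this example):

```python
import math

def confidence_interval_95(mean: float, sd: float, n: int) -> tuple[float, float]:
    """Approximate 95% confidence interval for a mean: mean +/- 1.96 x SEM."""
    sem = sd / math.sqrt(n)   # standard error of the mean
    margin = 1.96 * sem
    return mean - margin, mean + margin

# Figures from the question: 36 trainees, mean 2.4 seconds, SD 1.2 seconds.
low, high = confidence_interval_95(mean=2.4, sd=1.2, n=36)
print(f"{low:.1f} to {high:.1f} seconds")
```

Here SEM = 1.2 / 6 = 0.2, so the interval is roughly 2.4 ± 0.4 seconds.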
Confidence intervals and trials
• If the confidence interval of a difference doesn’t include 0, then the result is statistically significant.

After 30 minutes of stats, the mean reduction in attention span was 2.3 minutes (0.8 – 3.8).

• If the confidence interval of a relative risk doesn’t include 1, then the result is statistically significant.

Relative risk of death after learning about stats was 0.7 (0.3 – 1.1)
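The rule for reading significance off a confidence interval can be written as a small check. This is only a sketch; `is_significant` is a hypothetical helper, and the "no effect" value is 0 for a difference and 1 for a ratio.

```python
def is_significant(ci_low: float, ci_high: float, no_effect_value: float) -> bool:
    """True if the confidence interval excludes the 'no effect' value."""
    return not (ci_low <= no_effect_value <= ci_high)

# Mean reduction in attention span: a difference, so 'no effect' is 0.
print(is_significant(0.8, 3.8, no_effect_value=0))

# Relative risk of death: a ratio, so 'no effect' is 1.
print(is_significant(0.3, 1.1, no_effect_value=1))
```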

Magnitude of results
• NNT, NNH
• Absolute risk reduction, Relative risk reduction
• Hazard ratio
• Odds ratio
Relative risk
• How many times more likely if….?
• EER = Exposed (or experimental) event rate
• CER = Control event rate
• RR = EER / CER
Relative risk reduction (or increase)

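RRR (RRI) = (EER − CER) / CER

Use the size of the difference: it is a reduction (RRR) if EER < CER and an increase (RRI) if EER > CER.

RRR = relative risk reduction

RRI = relative risk increase

EER = exposed event rate

CER = control event rate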

Hazard
• Hazard ratio (HR) – estimate of RR over time
• Death rate in A / death rate in B

(2 = twice as many, 0.5 = half as many)

• Note: hazard ratio does not reflect median survival time; it is the relative probability of dying
Number needed to treat (NNT) / Number needed to harm (NNH)
• How many patients need to be treated to...
• Absolute risk reduction (ARR)=EER-CER

NNT = 1 / ARR = 1 / (EER − CER), ignoring the sign
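The measures of magnitude fit together as in the sketch below. The event rates are invented for illustration (15% in the treated group, 20% in the control group), the function and key names are assumptions, and exact fractions are used to avoid rounding surprises. The size of the difference is taken with `abs`, and the number needed is rounded up to a whole patient, as is conventional.

```python
import math
from fractions import Fraction

def effect_measures(eer: Fraction, cer: Fraction) -> dict:
    """Relative risk, relative and absolute change, and NNT/NNH from two event rates.

    Assumes eer != cer; otherwise there is no effect and no number needed to treat.
    """
    absolute_change = abs(eer - cer)                     # ARR (or absolute risk increase)
    return {
        "relative_risk": eer / cer,                      # RR = EER / CER
        "relative_change": absolute_change / cer,        # RRR if EER < CER, RRI if EER > CER
        "absolute_change": absolute_change,
        "number_needed": math.ceil(1 / absolute_change), # NNT (or NNH), rounded up
    }

# Hypothetical trial: 15/100 events on treatment, 20/100 events on control.
result = effect_measures(eer=Fraction(15, 100), cer=Fraction(20, 100))
for name, value in result.items():
    print(f"{name}: {float(value):.2f}")
```

With these made-up figures the relative risk reduction is 25% but the absolute risk reduction is only 5%, which is why the NNT is often the more useful number to quote to patients.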

Scenario
• Claire Stewart thought women with no hair were more likely to pass CSA because having hair would distract trainees by getting in their eyes.
• She tested this by randomising her female trainees.

What is the relative risk of passing?

• What is the RRR/RRI?
• What is the NNT?
Odds ratio
• Used in case control studies
• Odds ratio: case odds/control odds

It doesn’t need the total.
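A minimal sketch of the odds ratio calculation for a case control study. The counts are hypothetical and the function name is made up; note that only the four cells are needed, not the population total.

```python
def odds_ratio(cases_exposed: int, cases_unexposed: int,
               controls_exposed: int, controls_unexposed: int) -> float:
    """Odds ratio = odds of exposure among cases / odds of exposure among controls."""
    case_odds = cases_exposed / cases_unexposed
    control_odds = controls_exposed / controls_unexposed
    return case_odds / control_odds

# Hypothetical study: 30 of 100 cases were exposed, 10 of 100 controls were exposed.
or_value = odds_ratio(cases_exposed=30, cases_unexposed=70,
                      controls_exposed=10, controls_unexposed=90)
print(f"odds ratio = {or_value:.2f}")
```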

How good is a test at predicting disease?
• If the test is negative, how sure can you be that you don’t have the disease?
• If the test is positive, how sure can you be that you do have the disease?
Sensitivity and specificity
• Sensitivity – proportion of people that have the disease that test positive
• Specificity – proportion of people that don’t have the disease that test negative
Predictive values
• Positive predictive value – proportion of positive tests that actually represent disease
• Negative predictive value – proportion of negative tests that actually represent no disease
Likelihood ratios
• Do not depend on the prevalence of disease (unlike predictive values), so are more useful
• Likelihood ratio for a positive test =

sensitivity / (1 – specificity)

• Likelihood ratio for a negative test =

(1 – sensitivity) / specificity

• A likelihood ratio of greater than 1 indicates the test result is associated with the disease.
• A likelihood ratio less than 1 indicates that the result is associated with absence of the disease.
• A likelihood ratio close to 1 means the test is not very useful
An example….
• In a VTS group of 110 people, 30 people have the dreaded lurgy. A test is developed for this. Of the 30 people with the dreaded lurgy, 18 have a positive test. 16 of the others also have a positive test.
• What is the likelihood ratio for a positive test?
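The example can be worked through from its 2 x 2 table: 30 people have the disease (18 test positive, 12 test negative) and 80 do not (16 test positive, 64 test negative). The sketch below is one way to lay out the arithmetic; the function and key names are made up for this illustration.

```python
def diagnostic_summary(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Standard test accuracy measures from the four cells of a 2 x 2 table."""
    sensitivity = tp / (tp + fn)   # diseased people who test positive
    specificity = tn / (tn + fp)   # disease-free people who test negative
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),     # positive tests that represent disease
        "npv": tn / (tn + fn),     # negative tests that represent no disease
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }

# Counts from the example: 110 people, 30 with the disease.
summary = diagnostic_summary(tp=18, fn=12, fp=16, tn=64)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity is 18/30 = 0.6 and specificity is 64/80 = 0.8, so the likelihood ratio for a positive test is 0.6 / 0.2 = 3.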
Pretty pictures
• Forest plot
• Funnel plot
• Kaplan-Meier survival curve
Forest plots, aka Blobbograms
• Used in meta analysis
• Graphical representation of results of different RCTs

[Figure: forest plot. Each study is a box at its odds ratio (size of box = study size) with a line showing its confidence interval; the summary measure is shown with its own odds ratio and confidence interval. Axis: OR (CI).]

Funnel plot
• Used in meta-analysis
• Demonstrates the presence/absence of publication bias

[Figure: funnel plot. Each point is an individual study; X axis – treatment effect; Y axis – measure of precision.]
• Increased precision of study = reduced variance
• Asymmetrical funnel = publication bias (missing data/studies)

Kaplan-Meier Survival Curve
• What % of people are still alive
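For readers who want to see how the curve is built, here is a hedged sketch of the Kaplan-Meier calculation in plain Python. The follow-up data are invented, and the function name and input layout are assumptions. People who leave the study without dying (censored) reduce the number at risk but do not cause a step down.

```python
def kaplan_meier(times: list, died: list) -> list:
    """Return [(time, fraction_surviving)] for each time at which a death occurs."""
    observations = sorted(zip(times, died))
    at_risk = len(observations)
    surviving = 1.0
    curve = []
    i = 0
    while i < len(observations):
        t = observations[i][0]
        deaths = leaving = 0
        # Group everyone whose follow-up ends at time t.
        while i < len(observations) and observations[i][0] == t:
            deaths += observations[i][1]   # True counts as 1, False (censored) as 0
            leaving += 1
            i += 1
        if deaths:
            surviving *= 1 - deaths / at_risk   # step down at each death time
            curve.append((t, surviving))
        at_risk -= leaving
    return curve

# Hypothetical follow-up: 5 people; False means censored (lost to follow-up) at that time.
curve = kaplan_meier(times=[1, 2, 2, 3, 4], died=[True, True, False, True, False])
for t, s in curve:
    print(f"time {t}: {s:.0%} still alive")
```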
Scenario
• We’ve driven Sarah Egan to insanity by not doing enough learning logs.
• She’s gone on a rampage with a gun because basically life will be better without any of us around (nothing to do with pregnancy hormones…obviously)
• Draw the Kaplan-Meier survival curve for MK GP trainees

[Axes for the curve: Y – number of trainees; X – time (units)]