
Bread and butter statistics: RCGP Curriculum Statement 3.5: Evidence-Based Practice






Presentation Transcript


    1. 1 Bread and butter statistics: RCGP Curriculum Statement 3.5: Evidence-Based Practice

    2. 2

    3. 3 Topics for today - 1: Audit (definition); Research (definition); Bias; Blinding; Confidence intervals

    4. 4 Topics for today - 2: Forest plot; L'Abbé plot; Hypothesis; Null hypothesis; Incidence; Prevalence

    5. 5 Topics for today - 3: Normal distribution; Parameter; Statistic; Variable; P-value; Number needed to treat; Number needed to harm

    6. 6 Topics for today - 4: Odds ratio; Statistical power; Sensitivity; Positive predictive value; Specificity; Reliability; Validity

    7. 7 Useful Websites http://www.jr2.ox.ac.uk/bandolier/ http://www.cebm.net http://www.cas.lancs.ac.uk/glossary_v1.1/Alphabet.html

    8. 8 Audit – definition Clinical audit is a quality improvement process It seeks to improve patient care and outcomes through systematic review of care against explicit criteria and the implementation of change Aspects of the structure, processes, and outcomes of care are selected and systematically evaluated against explicit criteria Where indicated, changes are implemented at an individual, team, or service level and further monitoring is used to confirm improvement in healthcare delivery NICE

    9. 9 The Audit cycle Identify the need for change Problems can be identified in 3 areas: Structure, Process, Outcome Setting Criteria and Standards - what should be happening Collect data on performance Assess performance against criteria and standards Identify changes needed

    10. 10 The Audit cycle

    11. 11 Research - definition Research is an ORGANISED and SYSTEMATIC way of FINDING ANSWERS to QUESTIONS SYSTEMATIC - certain things in the research process are always done in order to get the most accurate results ORGANISED - there is a structure or method in doing research. It is a planned procedure, not a spontaneous one. It is focused and limited to a specific scope FINDING ANSWERS is the aim of all research. Whether it is the answer to a hypothesis or even a simple question, research is successful when answers are found even if the answer is negative QUESTIONS are central to research. If there is no question, then the answer is of no use. Research is focused on relevant, useful, and important questions. Without a question research has no purpose

    12. 12 Bias Dictionary definition - 'a one-sided inclination of the mind'. It defines a systematic tendency of certain trial designs to produce results consistently better or worse than other designs In studies of the effects of health care bias can arise from: systematic differences in the groups that are compared (selection bias) the care that is provided, or exposure to other factors apart from the intervention of interest (performance bias) withdrawals or exclusions of people entered into the study (attrition bias) how outcomes are assessed (detection bias) This use of bias does not necessarily imply any prejudice, such as the investigators' desire for particular results, which differs from the conventional use of the word in which bias refers to a partisan point of view

    13. 13 Blinding Participants, investigators and/or assessors remain ignorant of which treatments participants are receiving. The aim is to minimise observer bias, in which the assessor (the person making a measurement) has a prior interest or belief that one treatment is better than another, and therefore scores one better than the other simply because of that belief. In a single-blind study, either the participants or those making the measurements of interest (the assessors) are blind to the allocations. In a double-blind study, at a minimum both participants and assessors are blind to their allocations. In some circumstances much more complicated designs can be used, where blinding is described at different levels. To achieve a double-blind state, it is usual to use matching treatment and control treatments. For instance, the tablets can be made to look the same; or if one treatment uses a single pill once a day but the other uses three pills at various times, all patients will have to take pills during the day to maintain blinding. If treatments are radically different (tablets compared with injection), a double-dummy technique may be used, where all patients receive both an injection and a tablet, in order to maintain blinding. Lack of blinding is a potent source of bias, and open studies or single-blind studies are potential problems for interpreting the results of trials. Concealment of allocation The process used to prevent foreknowledge of group assignment in a randomised controlled trial, which should be seen as distinct from blinding. The allocation process should be impervious to any influence by the individual making the allocation - administered by someone who is not responsible for recruiting participants; for example, a hospital pharmacy or a central office. Methods of assignment such as date of birth and case record numbers (see quasi-random allocation) are open to manipulation.
Adequate methods of allocation concealment include: centralized randomisation schemes; randomisation schemes controlled by a pharmacy; numbered or coded containers in which capsules from identical-looking, numbered bottles are administered sequentially; on-site computer systems, where allocations are in a locked unreadable file; and sequentially numbered opaque, sealed envelopes.

    14. 14 Confidence intervals Quantifies the uncertainty in a measurement. It is usually reported as a 95% CI, which is the range of values within which we can be 95% sure that the true value for the whole population lies. For example, for an NNT of 10 with a 95% CI of 5 to 15, we would have 95% confidence that the true NNT value lies between 5 and 15.

    15. 15 Confidence intervals A confidence interval calculated for a measure of treatment effect shows a range within which the true treatment effect is likely to lie. Confidence intervals are preferable to p-values, as they tell us the range of possible effect sizes compatible with the data. A confidence interval that embraces the value of no difference indicates that the treatment under investigation is not significantly different from the control. Confidence intervals aid interpretation of clinical trial data by putting upper and lower bounds on the likely size of any true effect. Bias must be assessed before confidence intervals can be interpreted. Even very large samples and very narrow confidence intervals can mislead if they come from biased studies. Non-significance does not mean ‘no effect’. Small studies will often report non-significance even when there are important, real effects. Statistical significance does not necessarily mean that the effect is real: by chance alone about one in 20 significant findings will be spurious. Statistical significance does not necessarily mean clinically important. It is the size of the effect that determines the importance, not the presence of statistical significance.
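As a sketch of how such an interval is computed, the following pure-Python function calculates a 95% CI for the difference between two event rates using the simple normal approximation (the event counts below are hypothetical, not from the slides):

```python
import math

def risk_difference_ci(events1, n1, events2, n2, z=1.96):
    """Approximate confidence interval for the difference between two
    proportions (normal approximation; z = 1.96 gives a 95% CI)."""
    p1, p2 = events1 / n1, events2 / n2
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff - z * se, diff + z * se

# Hypothetical trial: 20/100 events on control, 10/100 on treatment
low, high = risk_difference_ci(20, 100, 10, 100)
print(f"risk difference 95% CI: {low:.3f} to {high:.3f}")  # 0.002 to 0.198
```

Because the interval excludes zero (the value of no difference), this hypothetical result would be statistically significant at the 5% level; note how wide the interval is, echoing the point above that significance says nothing about the size or importance of the effect.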

    16. 16 Forest plot In a typical forest plot, the results of component studies are shown as squares centred on the point estimate of the result of each study. A horizontal line runs through the square to show its confidence interval (usually, but not always, a 95% confidence interval). The overall estimate from the meta-analysis and its confidence interval are put at the bottom, represented as a diamond. The centre of the diamond represents the pooled point estimate, and its horizontal tips represent the confidence interval. Significance is achieved at the set level if the diamond is clear of the line of no effect. The plot allows readers to see the information from the individual studies that went into the meta-analysis at a glance. It provides a simple visual representation of the amount of variation between the results of the studies, as well as an estimate of the overall result of all the studies together.

    17. 17 Meta-analysis of effect of beta blockers on mortality after myocardial infarction

    18. 18 In the modern format

    19. 19 L'Abbé plot A first stage in any review is to look at a simple scatter plot, which can yield a surprisingly comprehensive qualitative view of the data. Even if the review does not show the data in this way, you can do it from information on individual trials presented in the review tables. Trials in which the experimental treatment proves better than the control (EER > CER) will be in the upper left of the plot, between the y axis and the line of equality (Figure 1). If experimental is no better than control then the point will fall on the line of equality (EER = CER), and if control is better than experimental then the point will be in the lower right of the plot, between the x axis and the line of equality (EER < CER).

    20. 20 L'Abbé plot Visual inspection gives a quick and easy indication of the level of agreement among trials. Heterogeneity is often assumed to be due to variation in the experimental and control event rates, but that variation is often due to the small size of trials. L'Abbé plots are becoming widely used, probably because people can understand them. They do have several benefits: the simple visual presentation is easy to assimilate. They make us think about the reasons why there can be such wide variation in (especially) placebo responses, and about other factors in the overall package of care that can contribute to effectiveness. They explain the need for placebo controls if ethical issues about future trials arise. They keep us sceptical about overly good or bad results for an intervention in a single trial, where the major influence may be how good or bad the response with placebo was. Ideally a L'Abbé plot should have symbols sized in proportion to the size of the trials. In Figure 2, there is an inset for the symbol size, and the two colours show trazodone used for erectile dysfunction in two different conditions (and with clear clinical heterogeneity, Bandolier 116). Figure 2: Trazodone for erectile dysfunction in psychogenic erectile dysfunction (dark symbols) and with physiological or mixed aetiology (light symbols)

    21. 21 Hypothesis A tentative supposition with regard to an unknown state of affairs, the truth of which is thereupon subject to investigation by any available method, either by logical deduction of consequences which may be checked against what is known, or by direct experimental investigation or discovery of facts not hitherto known and suggested by the hypothesis. A proposition put forward as a supposition rather than asserted. A hypothesis may be put forward for testing or for discussion, possibly as a prelude to acceptance or rejection. “It is a truth universally acknowledged, that a single man in possession of a good fortune, must be in want of a wife.”

    22. 22 Null hypothesis The statistical hypothesis that one variable (e.g. whether or not a study participant was allocated to receive an intervention) has no association with another variable or set of variables (e.g. whether or not a study participant died), or that two or more population distributions do not differ from one another. In simplest terms, the null hypothesis states that the results observed in a study are no different from what might have occurred as a result of the play of chance.

    23. 23 Incidence The proportion of new cases of the target disorder in the population at risk during a specified time interval. It is usual to define the disorder, and the population, and the time, and report the incidence as a rate. For some examples of incidence studies, two of the best relate to how Parkinson's disease incidence varies with latitude, and how Perthes' disease (a developmental problem of the hip joint affecting younger children) varies with deprivation.

    24. 24 Prevalence This is a measure of the proportion of people in a population who have a disease at a point in time, or over some period of time. There are several examples of prevalence worth looking at: Geographic variation in multiple sclerosis prevalence Prevalence of Atrial Fibrillation COPD prevalence Prevalence of schizophrenic disorders Body piercing - prevalence and risks Prevalence of migraine Prevalence and incidence of gout
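The incidence/prevalence distinction above can be made concrete with a small calculation (all of the numbers below are hypothetical, purely for illustration):

```python
# Hypothetical numbers to illustrate incidence (new cases over a time
# interval) versus point prevalence (existing cases at a point in time).
population_at_risk = 50_000
new_cases_in_year = 125
existing_cases_today = 900

incidence = new_cases_in_year / population_at_risk  # per person per year
point_prevalence = existing_cases_today / population_at_risk

print(f"incidence: {incidence * 100_000:.0f} per 100,000 per year")  # 250
print(f"point prevalence: {point_prevalence:.1%}")                   # 1.8%
```

A long-lasting but rarely fatal disease can have a low incidence yet a high prevalence, because existing cases accumulate; the two measures answer different questions.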

    25. 25 Normal distribution Normal distributions are a family of distributions that have the same general shape. They are symmetric, with scores more concentrated in the middle than in the tails. Normal distributions are sometimes described as bell shaped. The height of a normal distribution can be specified mathematically in terms of two parameters: the mean (µ) and the standard deviation (σ). All normal density curves satisfy the following property, which is often referred to as the Empirical Rule: 68% of the observations fall within 1 standard deviation of the mean, that is, between µ − σ and µ + σ. 95% of the observations fall within 2 standard deviations of the mean, that is, between µ − 2σ and µ + 2σ. 99.7% of the observations fall within 3 standard deviations of the mean, that is, between µ − 3σ and µ + 3σ. Thus, for a normal distribution, almost all values lie within 3 standard deviations of the mean.
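The 68/95/99.7 figures of the Empirical Rule can be verified directly from the normal distribution, using the error function in Python's standard library:

```python
import math

def fraction_within(k_sd):
    """Fraction of a normal distribution lying within k standard
    deviations of the mean: P(|X - mu| < k*sigma) = erf(k / sqrt(2))."""
    return math.erf(k_sd / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {fraction_within(k):.1%}")
# within 1 SD: 68.3%; within 2 SD: 95.4%; within 3 SD: 99.7%
```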

    26. 26 Parameter A parameter is a number computed from a population. Contrast this with the definition of a statistic. A parameter is a constant, unchanging value. There is no random variation in a parameter. If the size of the population is large (as is typically the case), then you may find that a parameter is difficult or even impossible to compute. An example of a parameter would be: the average length of stay in the birth hospital for all infants born in the United States.

    27. 27 Statistic A statistic is a number computed from a sample. Contrast this with the definition of a parameter. If a statistic is computed from a random sample (as is typically the case), then it has random variation or sampling error. An example of a statistic would be: the average length of stay in the birth hospital for a random sample of 387 infants born in Johnson County, Kansas.

    28. 28 Variable A measurement that can vary within a study, e.g. the age of participants. Variability is present when differences can be seen between different people or within the same person over time, with respect to any characteristic or feature that can be assessed or measured.

    29. 29 P-value The probability (ranging from zero to one) that the results observed in a study (or results more extreme) could have occurred by chance. Convention is that we accept a p-value of 0.05 or below as being statistically significant. That means a chance of 1 in 20, which is not very unlikely. This convention has no solid basis, other than being the number chosen many years ago. When many comparisons are being made, statistical significance can occur just by chance. A more stringent rule is to use a p-value of 0.01 (1 in 100) or below as statistically significant, though some folk get hot under the collar when you do it.
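The multiple-comparisons point can be made concrete: for independent comparisons each tested at p = 0.05, the chance of at least one spurious "significant" finding grows quickly with the number of tests (a minimal sketch):

```python
def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance of at least one spurious 'significant' result across
    n independent comparisons, each tested at level alpha, when no
    real effect exists."""
    return 1 - (1 - alpha) ** n_tests

print(f"{prob_at_least_one_false_positive(1):.0%}")   # 5%
print(f"{prob_at_least_one_false_positive(20):.0%}")  # 64%
```

So a study reporting 20 independent comparisons has roughly a two-in-three chance of at least one chance "significant" result, which is why the more stringent 0.01 threshold is sometimes used.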

    30. 30 Number needed to treat The inverse of the absolute risk reduction and the number of patients that need to be treated for one to benefit compared with a control The ideal NNT is 1, where everyone has improved with treatment and no-one has with control. The higher the NNT, the less effective is the treatment The value of an NNT is not just numeric - NNTs of 2-5 are indicative of effective therapies, like analgesics for acute pain NNTs of about 1 might be seen in treating sensitive bacterial infections with antibiotics, while an NNT of 40 or more might be useful e.g. when using aspirin after a heart attack

    31. 31 Calculating NNTs NNT = 1/ARR ARR = (CER – EER) where CER = control group event rate EER = experimental group event rate Sample Calculation The results of the Diabetes Control and Complications Trial into the effect of intensive diabetes therapy on the development and progression of neuropathy indicated that neuropathy occurred in 9.6% of patients randomised to usual care and 2.8% of patients randomised to intensive therapy. NNT with intensive diabetes therapy to prevent one additional occurrence of neuropathy can be determined by calculating the absolute risk reduction as follows: ARR = (CER – EER) = (9.6% - 2.8%) = 6.8% NNT = 1/ARR = 1/0.068 = 14.7 or 15 Therefore need to treat 15 diabetic patients with intensive therapy to prevent one from developing neuropathy
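The DCCT worked example above translates directly into code; this is a minimal sketch of the same arithmetic:

```python
import math

def nnt(control_event_rate, experimental_event_rate):
    """Number needed to treat = 1 / absolute risk reduction (ARR)."""
    arr = control_event_rate - experimental_event_rate
    return 1 / arr

# DCCT neuropathy example from the slide:
# 9.6% with usual care vs 2.8% with intensive therapy
value = nnt(0.096, 0.028)
print(round(value, 1), math.ceil(value))  # 14.7 15
```

Rounding up gives the slide's answer: treat 15 diabetic patients with intensive therapy to prevent one from developing neuropathy.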

    32. 32 Number needed to treat - examples
    Response to antibiotics of women with symptoms of UTI but negative dipstick urine test results: double blind RCT. Richards et al, BMJ 2005;331:143-6. Reduce duration of symptoms by 2 days? NNT 4
    Antibiotic prescribing in GP and hospital admissions for peritonsillar abscess, mastoiditis and rheumatic fever in children: time trend analysis. Sharland et al, BMJ 2005;331:328-9. Prevent one case of mastoiditis? NNT at least 2500
    Trigeminal neuralgia Rx anticonvulsants. To obtain 50% pain relief? NNT 2.5
    Arthritis Rx glucosamine for 3-8/52 cf. placebo. To improve symptoms? NNT 5
    MRC trial of treatment of mild HT: principal results. 17,354 individuals 36-64 years with diastolic 90-109 mmHg Rx bendrofluazide and propranolol for 5.5 years cf. placebo. BMJ 1985;291:97-104. Primary prevention of one stroke at one year? NNT 850

    33. 33 Number needed to harm This is calculated in the same way as for NNT, but used to describe adverse events. For NNH, large numbers are good, because they mean that adverse events are rare. Small values for NNH are bad, because they mean adverse events are common. An example of how NNH values can be calculated along with NNT is that of inhaled corticosteroids used for asthma, where increasing dose made small improvement in efficacy, but large worsening for dysphonia and oral candidiasis.

    34. 34 Odds ratio The ratio of the odds of having the target disorder in the experimental group relative to the odds in favour of having the target disorder in the control group (in cohort studies or systematic reviews) or the odds in favour of being exposed in subjects with the target disorder divided by the odds in favour of being exposed in control subjects (without the target disorder).
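From a 2x2 table the odds ratio is a one-line calculation; the cohort counts below are hypothetical, purely to illustrate the definition above:

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table:
    a = exposed with disorder,   b = exposed without disorder,
    c = unexposed with disorder, d = unexposed without disorder."""
    return (a / b) / (c / d)  # equivalently (a * d) / (b * c)

# Hypothetical cohort: 20 of 100 exposed and 10 of 100 unexposed
# develop the target disorder
print(round(odds_ratio(20, 80, 10, 90), 2))  # 2.25
```

Note the odds ratio (2.25) is larger than the corresponding risk ratio (20%/10% = 2); the two only approximate each other when the disorder is rare.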

    35. 35 Statistical power The ability of a study to demonstrate an association or causal relationship between two variables, given that an association exists. For example, 80% power in a clinical trial means that the study has an 80% chance of ending up with a p-value of less than 5% in a statistical test (i.e. a statistically significant treatment effect) if there really was an important difference (e.g. 10% versus 5% mortality) between treatments. If the statistical power of a study is low, the study results will be questionable (the study might have been too small to detect any differences). By convention, 80% is an acceptable level of power.
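The slide's 10% versus 5% mortality example can be sketched with a simple normal-approximation power calculation (illustrative only; the group size of 435 is an assumption chosen to land near the conventional 80%, and real sample-size calculations use more refined formulae):

```python
import math

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def power_two_proportions(p1, p2, n_per_group, z_alpha=1.96):
    """Approximate power of a two-sided 5% test comparing two
    proportions (simple normal approximation)."""
    se = math.sqrt(p1 * (1 - p1) / n_per_group + p2 * (1 - p2) / n_per_group)
    return phi(abs(p1 - p2) / se - z_alpha)

# 10% vs 5% mortality, roughly 435 patients per group
print(f"{power_two_proportions(0.10, 0.05, 435):.0%}")  # 80%
```

Halving the group size would drop the power well below 80%, which is exactly the "too small to detect any differences" problem the slide warns about.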

    36. 36 Sensitivity Proportion of people with the target disorder who have a positive test. It is used to assist in assessing and selecting a diagnostic test/sign/symptom. A seNsitive test keeps false-Negatives down – 100% sensitive means everyone with the condition has a positive test, so a negative result excludes the condition. SnNout: when a sign/test/symptom has a high Sensitivity, a Negative result rules out the diagnosis. For example, the sensitivity of a history of ankle swelling for diagnosing ascites is 93%; therefore if a person does not have a history of ankle swelling, it is highly unlikely that the person has ascites.

    37. 37 Positive predictive value Proportion of people with a positive test who have the target disorder. Not the same as sensitivity: sensitivity is the proportion of people with the disorder who test positive, whereas positive predictive value is the proportion of positive tests that are correct, and it varies with the prevalence of the disorder in the population tested.

    38. 38 Specificity Proportion of people without the target disorder who have a negative test. It is used to assist in assessing and selecting a diagnostic test/sign/symptom. A sPecific test keeps false-Positives down – 100% specific means everyone without the condition has a negative test, so a positive result confirms the condition. SpPin: when a sign/test/symptom has a high Specificity, a Positive result rules in the diagnosis. For example, the specificity of a fluid wave for diagnosing ascites is 92%; therefore if a person does have a fluid wave, it rules in the diagnosis of ascites.
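Sensitivity, specificity and positive predictive value can all be read off a 2x2 table. The counts below are hypothetical, and they also show that PPV is not the same as sensitivity, because PPV depends on how common the disorder is:

```python
# Hypothetical screening results for 1,000 people, 100 of whom
# actually have the target disorder.
tp, fn = 90, 10    # of the 100 WITH the disorder
false_pos, tn = 40, 860  # of the 900 WITHOUT the disorder

sensitivity = tp / (tp + fn)         # positives among those with it
specificity = tn / (tn + false_pos)  # negatives among those without it
ppv = tp / (tp + false_pos)          # have it, among those testing positive

print(f"sensitivity {sensitivity:.0%}")  # 90%
print(f"specificity {specificity:.0%}")  # 96%
print(f"PPV {ppv:.0%}")                  # 69%
```

With the same test applied to a population where the disorder is rarer, sensitivity and specificity would stay the same but PPV would fall, since false positives would outnumber true positives.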

    39. 39 Reliability Reproducibility Stability over time and place Ease of replication Observer variation Confirmation of results

    40. 40 Validity This term is a difficult concept in clinical trials, but refers to a trial being able to measure what it sets out to measure. A trial that set out to measure the analgesic effect of a procedure might be in trouble if patients had no pain. Or in a condition where treatment is life-long, evaluating an intervention for 10 minutes might be seen as silly. Looking at validity is not always easy, but a good worked example is that on acupuncture and stroke.
