1. Mark K. Stephens, MD; Jill Manna; Charles P. Schade, MD, MPH. West Virginia Medical Institute / Quality Insights of Pennsylvania

2. Disclaimer I

4. Background: The Attribution Imperative
- Flawed evaluation designs beginning in the 6th SOW; missed opportunities
- Several studies purporting to show that QIOs produced quality improvements (and at least one that said they do not)
- IOM's "we can't prove it but we like them anyway"
- CMS emphasis on attribution in the 9th SOW
The following slides address these points in detail.

5. Flawed evaluation designs
- Confusion between program and QIO evaluation
- Pre-post design
- Poorly specified interventions
- Inaccurate documentation
- Lack of adequate control
- Negative incentives on QIOs
QIOs pointed out this confusion of purpose before the 6th SOW started, but it wasn't until the IOM identified the issue that CMS took it seriously. We'll discuss examples of most of these points in a few minutes.

6. Typical QIO Scope of Work (6-8)
- Arbitrary improvement requirements
- Remeasurement before interventions effective
- Significant penalty for failure to perform
- Strong disincentive to use control groups
- Strong incentive to select providers based on likelihood of improvement (cherry picking)
CMS imposed requirements for improvement on state QIOs that created perverse incentives. Goals for states were arbitrary, based on "reduction of the failure rate" rather than on a theoretical or measured maximum level. This gave management no confidence that a known level of effort would produce the required results; it was an incentive to throw all available resources at a problem. There was little incentive to document, and after-the-fact efforts were spotty. When statewide performance was rewarded, using control groups was irrational. When relative performance of IPGs was rewarded, selecting an equivalent nonintervention group was not in the QIO's survival interest.

7. Studies of QIO performance
- Jencks, 2003
- Rollow, 2006
- Weingarten, 2004
- Snyder, 2005
- Institute of Medicine, 2006
This is not a complete list, but was chosen to illustrate some of the problems noted above. Because of the flaws in evaluation design and perverse incentives in the SOW, it is not surprising that these studies were inconclusive, controversial, and/or negative.

8. Jencks, 2003*
Presented changes in quality performance across the 6th SOW. This map represents composite quartile standing of the states at the end of the SOW. Nice map, documented improvements in quality nationwide, but the study did little more than allude to QIO work.

9. Weingarten, et al.*
- 15 states
- Quality measures: QIO-collected data, varying sample sizes and periodicity
- Interventions: Retrospective survey of QIO staff using TQIP-like measures
- Did not control for secular trends
- GLM: feedback and checklists strongest positive impacts
Results of the 6th SOW "what works?" special project. States and participants not randomly selected. Secular trends and recall bias probably negated the positive findings, which were modest at best, and may have been chance occurrences.

10. Snyder and Anderson*
- 5 states
- CMS baseline and remeasurement indicators (sample was state level)
- QIO-supplied data on interventions (believed from TQIP)
- Hospitals grouped empirically into participating and non-participating
- No significant differences between groups
6th SOW impact study. The data source (TQIP) was suspect, and the timeframe would have been unlikely to detect effects. Demonstrated the potential impact of selection bias.

11. Rollow, 2006*
Looked at 7th SOW improvement across a number of settings, comparing IPG and non-IPG performance. Suggested that IPG membership resulted in better performance. Selection of IPGs led to non-comparable groups at baseline and remeasurement. Actual intensity and timing of interventions were not documented. Could not conclude that the "greater improvement" in IPGs was due to QIO activity. For some nursing homes (arrow), it appears improvement was largely complete before QIO interventions took place.

12. Institute of Medicine
The IOM found some evidence of association between QIO activity and improvement, but could not reach a conclusion on attribution. Unfortunately, the fact that providers working closely with their QIOs improved more than those not working with QIOs does not establish that the QIOs caused the improvement.

13. Summary: QIOs and the critics
QIOs:
- We did something
- Care improved
- We caused it
- Gosh, we're good!
The critics:
- What did you do and when did you do it?
- Care was improving anyway
- Care improved where you said you weren't working
- You cherry picked
- Nothing you did worked
These are extreme positions, but we will illustrate with examples.

14. Outline of this talk
- What is "attribution?"
  - Review of philosophy of causation
- How is causation demonstrated in health care systems?
  - Demonstrating impact of interventions in processes of care: brief review of literature
- How can we attribute improvements in care to QIO efforts (without lying)?
We'll present summaries of two of our own studies to illustrate the third point.

15. Disclaimer II

17. Hume's idea of causation*
- A precedes B in time
- A and B are contiguous in space and time
- Events of type A are constantly conjoined with events of type B

18. A Precedes B?
In this contingency table, blonde hair and blue eyes are associated (p=0.0005) but we cannot conclude anything about causation. We know that blonde hair and blue eyes are unlikely to be independent, but we don't know which came first (if either). And we don't know if the two are associated regularly, or just in this one experiment.
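
As a rough illustration of the kind of test behind a p-value like the one on this slide, here is a minimal sketch of a Pearson chi-square test of independence for a 2x2 table. The counts are hypothetical (the slide's own table is not reproduced here), and the function name is ours. A small p-value says only that the two traits are unlikely to be independent; it says nothing about which came first.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Layout:
                      blue eyes   other eyes
        blonde            a           b
        not blonde        c           d
    Returns (chi2, p_value) on 1 degree of freedom, no continuity correction.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Shortcut formula for a 2x2 table: n * (ad - bc)^2 / (product of margins)
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts, not the data shown on the slide
chi2, p = chi_square_2x2(30, 10, 20, 40)
print(f"chi2 = {chi2:.2f}, p = {p:.5f}")
```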

19. "I got the flu shot and I got the flu"
- Temporal precedence is not enough
- Lots of people get flu shots
- Lots of people get the flu
- So it stands to reason that some will get the flu after having the flu shot
The notion that A must cause B if A is often followed by B is an example of a logical fallacy. Often such false conclusions are supported by plausible causal chains. For example, we know that some vaccines in the past have caused the illness they are supposed to prevent, and we know that two of the side effects of the flu vaccine are fever and malaise. Where such reasoning fails is at the point of regular association, where statistical tests fail to show the occurrence of flu following vaccination to be different from what would be expected by chance given the amount of flu in the population and the efficacy of the vaccine.
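
The "expected by chance" reasoning can be made concrete with a little arithmetic. This is only a sketch: the attack rate, efficacy, and number vaccinated are invented for illustration, and the function name is ours.

```python
def expected_flu_after_shot(n_vaccinated, attack_rate, efficacy):
    """Expected flu cases among vaccinated people if the vaccine only protects.

    attack_rate: share of unvaccinated people who get flu in a season
    efficacy:    proportional reduction in risk among the vaccinated (0 to 1)
    """
    risk_if_vaccinated = attack_rate * (1 - efficacy)
    return n_vaccinated * risk_if_vaccinated

# Hypothetical inputs chosen for illustration only
cases = expected_flu_after_shot(n_vaccinated=100_000, attack_rate=0.10, efficacy=0.60)
print(round(cases))
```

Even with a vaccine that works, thousands of people in this made-up population would truthfully say "I got the flu shot and I got the flu." The question is whether the observed number exceeds this expectation.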

20. Contiguous in space and time
This is a Rube Goldberg machine illustrating a causal chain (of sorts). Hume's idea of contiguity means we should be able to envision some mechanism by which an underlying cause produces its effect. For QIO work, we think of "interventions" as a key, but very early, part of the chain.

21. Constantly Conjoined
- Is unlikely to have occurred by chance
- Given sufficient observations
- And the assumption that the two events are actually independent
A statistical test can neither prove causation by itself, nor disprove causation.

22. When is association enough?
- The short answer is "never, by itself"
- But there are different ways of assessing association
- Some ways of assessing association get at temporal sequence
- A reliable product may be sufficient, even if we don't always know how it works
In the rest of this talk we'll explore some alternative ways of assessing association, and discuss design and documentation criteria that can help satisfy Hume's criteria, providing reasonable evidence of causation.

23. Causation in health care
- Fundamental question: did a treatment do any good?
- Important subordinate questions
  - Can we determine that an intervention took place? and when?
  - How do we know who was exposed to a treatment? and how much of it?
  - What do we mean by "do any good?"
  - Is a treatment effect biologically plausible?
The first question is related to temporal precedence and regularity of association. The second gets at defining the event believed to be a "cause." The third defines the effect. The fourth approaches contiguity in space and time. It is said that we used to discharge cannon into the air to dispel "miasms" that were believed to cause malaria. Such a treatment today would not be considered biologically plausible, even if malaria attack rates declined in the face of heavy artillery.

24. Why is it difficult to attribute cause in health care?
- Health systems are complicated, and may not respond identically to the same intervention
- Interventions are complicated, and may not be consistently named or described
- Lots of other things are going on in a health system when we intervene

25. Confounds*
Any variable that is plausibly related to an independent variable of interest and could explain variation in the dependent variable but has not been accommodated.
The next three slides are courtesy of Dana Keller, PhD, one of the QIO movement's best study design experts while he was with the Delmarva Foundation.

26. Result of a Confound
Conclusions about project impact are not justifiable.

27. Types of Confounds
This is a long list, and I don't know the definition of each and every one.

28. Conclusions in the face of confounds
- Are possible with proper study design
- The "gold standard" design in clinical practice is the randomized clinical trial (RCT)
- Randomization done properly causes confounds to be equally likely in intervention and control groups
- Adequate statistical design permits detecting the effect of intervention in the noise of confounds
Proper study design and statistical methods can allow one to reach conclusions in the face of confounds. However, even randomization cannot take care of all possible confounds.
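
To see why randomization makes a confound about equally likely in both arms, a small simulation may help. This is a sketch under assumed numbers: the confound prevalence, the selection probabilities in `cherry_picked`, and all names are hypothetical.

```python
import random

def confound_gap(n_units, confound_rate, assign, seed=0):
    """Difference in confound prevalence between intervention and control arms.

    assign(rng, has_confound) returns True when a unit goes to the intervention arm.
    """
    rng = random.Random(seed)
    counts = {True: [0, 0], False: [0, 0]}  # arm -> [units, units with confound]
    for _ in range(n_units):
        has_confound = rng.random() < confound_rate
        arm = assign(rng, has_confound)
        counts[arm][0] += 1
        counts[arm][1] += has_confound
    prevalence = {arm: c[1] / c[0] for arm, c in counts.items()}
    return prevalence[True] - prevalence[False]

def randomized(rng, has_confound):
    # Coin flip that ignores the confound entirely
    return rng.random() < 0.5

def cherry_picked(rng, has_confound):
    # Units with the confound are more likely to be chosen for intervention
    return rng.random() < (0.8 if has_confound else 0.3)

gap_random = confound_gap(20_000, 0.4, randomized)
gap_picked = confound_gap(20_000, 0.4, cherry_picked)
print(f"randomized gap: {gap_random:+.3f}, cherry-picked gap: {gap_picked:+.3f}")
```

With coin-flip assignment the gap should hover near zero; when selection depends on the confound, the arms differ before any intervention takes place.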

29. Multiple Risk Factor Intervention Trial (MRFIT): Mortality
If the question is, "Why not make all QIO interventions randomized trials?" one good answer is "MRFIT." MRFIT was a seven-year, 12,866-subject randomized trial in the 1970s testing whether multiple lifestyle and treatment interventions (e.g., smoking, diet, blood pressure control...) could affect death from heart disease in high-risk men. It failed.

30. How MRFIT failed
Is it possible that the "comparison facilities" might also adopt the behavior QIOs are advocating for intervention facilities? And would it necessarily be bad if they did? It certainly was not bad for the UC (usual care) patients in MRFIT, because cardiovascular death rates declined in both groups and societywide during that period of time.

31. Other arguments against RCTs for health system interventions*
- Cost
- Potential ethical concerns
- Availability of large enough populations
- Time available for followup
- Limits to generalizability arising from study design
Is CMS really going to multiply QIO budgets by a factor of 10 or more? Ethical concerns are greatest when the intervention is known to be effective at producing the desired systemic change, because some individuals are then denied an effective treatment. Large-scale randomized trials typically require years for followup, and that is generally not compatible with QIO contract cycles.

32. OK, so what kinds of intervention study designs should we use?
- Simple pre-post
- Interrupted time series
- Multiple baseline
Sanson-Fisher (see previous slide) presents these three methods as appropriate for intervention trials. It may seem surprising that pre-post is on the list, given the problems of QIO attribution. But these problems come from inappropriate application of the pre-post method or failure to assure comparability of the intervention and comparison groups, and not from the method itself. Let's see what other experts have to say.

33. This is the web page of the Cochrane Effective Practice and Organization of Care Group. It is a part of the international Cochrane Collaboration of researchers voluntarily assessing the evidence of effectiveness of health care in various areas. There are currently more than 5,000 published Cochrane reviews, of which about 50 are from the EPOC group. EPOC's focus, as shown, is interventions designed to improve health professional practice. Reviews include topics like audit and feedback, telemedicine, and mass media. EPOC has published guidance on appropriate study designs for inclusion in reviews, which are described in the next slide.

34. Study Designs for EPOC Reviews*
- Patient randomised controlled trials (P-RCT)
- Cluster randomised controlled trials (C-RCT)
- Non-randomised cluster controlled trials
- Controlled before and after studies (CBAs)
- Interrupted time series designs (ITS)
P-RCTs are as discussed above. C-RCTs randomize practitioners or institutions, and require very large sample sizes. CBA studies are what QIOs have always done, only without very good control. ITS holds considerable promise for QIO activities, because of how quality data are currently being managed and used in the program. Its drawbacks include inability to assess the impact of concurrent events on outcome, but there are ways of minimizing this problem.

35. Multiple Baseline*: Particularly attractive for QIO work
In a simple interrupted time series, all interventions occur at the same time. If there are external forces also acting at the same time, it can be difficult to impossible to sort out what may have caused changes in the desired outcome. Multiple baseline interventions can limit the impact of such externalities. For example, suppose the outcomes in the graph represent smoking rates in teenagers, and the interventions are community education efforts in a single state. A change in state tobacco tax during the time of the interventions would confound all the interventions equally if they occurred at the same time, but would have a more random influence on interventions starting at different times.
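
The tobacco tax example can be sketched with a noise-free toy model. Everything here is assumed for illustration: the baseline level, the effect and shock sizes, the month of the tax change, the start months, and the function names.

```python
def outcome(t, start, effect, shock_time, shock):
    """Noise-free outcome for one site at month t (e.g., a smoking rate in %)."""
    level = 30.0
    if t >= start:
        level -= effect   # intervention effect, begins at this site's own start
    if t >= shock_time:
        level -= shock    # external event (e.g., a tax change) hitting every site
    return level

def mean_change(starts, effect, shock_time, shock, window=3):
    """Average pre-to-post change, measured in a window around each site's start."""
    changes = []
    for start in starts:
        pre = [outcome(t, start, effect, shock_time, shock)
               for t in range(start - window, start)]
        post = [outcome(t, start, effect, shock_time, shock)
                for t in range(start, start + window)]
        changes.append(sum(post) / window - sum(pre) / window)
    return sum(changes) / len(changes)

# Hypothetical values: true effect of 2 points, external shock of 3 points at month 12
simultaneous = mean_change([12, 12, 12, 12], effect=2.0, shock_time=12, shock=3.0)
staggered = mean_change([4, 8, 16, 20], effect=2.0, shock_time=12, shock=3.0)
print(simultaneous, staggered)
```

When every site starts in the same month as the tax change, the measured change mixes the two effects. With staggered starts, the windows in this toy example miss the tax change and recover the intervention effect alone. A site that happened to start near the tax change would still be contaminated, which is why the influence is "more random" rather than absent.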

36. ITS example from our research*
- 6th SOW, hospital quality improvement
- Worked with all hospitals in state
- Periodic data collection depending on hospital size
- Multiple interventions; audit and feedback had well-defined start in all hospitals
To illustrate some of these issues, I want to shift to work we have done at WVMI. This is a study we published in 2004. We attributed improvements to audit and feedback, although there were numerous other interventions that we tried at various times. We considered the other interventions externalities or potential confounders in this study. The other interventions all took place after we began feeding back data to the hospitals.

37. Formal evaluation
- Defined pre-intervention period as July 1998-June 2000 and post-intervention as July 2000-December 2001
- Research questions:
  - Did hospital quality measures improve between pre- and post-intervention periods?
  - Did rate of change of improvement differ in pre- and post-intervention periods?
This design used the hospitals as their own controls. We measured the rate of change in each period using the chi-square test for linear trend in the quality measure data, and looked for situations where there were no significant trends during the pre-intervention period and significant positive trends post-intervention. We considered the post-intervention period to begin three months after delivery of the first feedback data, which was the same for all hospitals.
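
For readers unfamiliar with the chi-square test for linear trend, here is a minimal sketch of one common form (the Cochran-Armitage test for trend in proportions). We are assuming this variant for illustration; the quarterly counts are invented, and the function name is ours.

```python
import math

def chi_square_trend(successes, totals, scores=None):
    """Cochran-Armitage chi-square test for linear trend in proportions.

    successes[i] of totals[i] cases met the quality measure in period i.
    Returns (chi2, p_value, t_stat); chi2 has 1 degree of freedom and a
    positive t_stat indicates an upward trend.
    """
    if scores is None:
        scores = list(range(len(totals)))  # equally spaced periods
    n = sum(totals)
    p_bar = sum(successes) / n
    sum_nx = sum(t * x for t, x in zip(totals, scores))
    sum_nx2 = sum(t * x * x for t, x in zip(totals, scores))
    t_stat = sum(r * x for r, x in zip(successes, scores)) - p_bar * sum_nx
    variance = p_bar * (1 - p_bar) * (sum_nx2 - sum_nx ** 2 / n)
    chi2 = t_stat ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail for 1 df
    return chi2, p_value, t_stat

# Hypothetical quarterly counts: flat before feedback, rising afterward
pre = chi_square_trend([50, 52, 49, 51], [100, 100, 100, 100])
post = chi_square_trend([55, 62, 70, 78], [100, 100, 100, 100])
print(f"pre-intervention p = {pre[1]:.3f}, post-intervention p = {post[1]:.5f}")
```

The pattern of interest is the one in this made-up example: no significant trend before the intervention and a significant positive trend after it.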

38. Documentation of Assessment for, or Administration of Pneumococcal Vaccine
It was this kind of observation in the data that made an interrupted time series analysis seem appropriate.

39. Beta Blockers Within 24 Hours of Admission, AMI Patients (1)
Here were some pre-intervention hospital performance data...

40. Beta Blockers Within 24 Hours of Admission, AMI Patients (2)
...and here were the post-intervention performance data. You can see that there was a shift to the right, and a narrowing of the variation among hospitals.

41. Pre-Post Change in Slope of Trend Lines in Largest Hospitals
Here are results of our ITS analysis for AMI quality measures...

42. Pre-Post Change in Slope of Trend Lines in Largest Hospitals
And for atrial fibrillation and heart failure...

43. Pre-Post Change in Slope of Trend Lines in Largest Hospitals
And for stroke and pneumonia (we had a ways to go with pneumonia).

44. Summary of Inpatient Quality Improvement 1999-2001
- 17 evidence-based quality indicators
- Excluded ineligible patients, e.g. contraindications for beta blockers
- All positive indicators: should be 100%
- 14/15 indicators improved significantly from pre-intervention period, not controlling for baseline trends
- 8/15 measures developed significant positive trends from pre- to post-intervention period
All but 2 indicators had either >5% increase or >25% reduction in variance across hospitals. Heart failure indicators were an exception (we don't know why). ITS analysis appeared to isolate the response to the appropriate time period, probably eliminating the concern that secular trends might have explained the results, at least in the cases where the trend slopes were positive.

45. But what about those other interventions?
[Chart: most common intervention categories. 2 Commitment to Participate in Project; 3 Data Collection by Participant; 4 Obtained Commitment from Project Champion; 5 Data Dissemination; 6 System Changes; 8 Provider Educational Efforts]
This chart shows the most common interventions we used during the 6th SOW as reported in the Tracking Quality Improvement Projects (TQIP) database. There were a lot of problems with TQIP, but it was the only source of data with interventions categorized, assigned to an individual hospital, and dated.

46. Linkage
- Compute midpoint of data collection period for all QI observations
- For every hospital, quarter, and QI combination identify:
  - The closest QI observation with interval midpoint less than the first day of the quarter
  - The closest QI observation with interval midpoint greater than the last day of the quarter
Like Weingarten et al. we wanted to see if we could tie specific interventions to quality measure improvements, and unlike Weingarten, we had a continuous series of QI measures in every hospital through the SOW. So we built a database linking TQIP events with the quality measurement in the hospital immediately before and immediately after the event. This short timeline, we hoped, would reduce the impact of externalities and other interventions.
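
The linkage rule above can be sketched in a few lines. This is not our production code: the record layout, the function names, and the sample observations are hypothetical, and midpoints are rounded down to whole days.

```python
from datetime import date, timedelta

def midpoint(start, end):
    """Midpoint of a data collection period, rounded down to a whole day."""
    return start + timedelta(days=(end - start).days // 2)

def bracket_quarter(observations, quarter_start, quarter_end):
    """Find the closest QI observations before and after a quarter.

    observations: list of (period_start, period_end, value) tuples for one
    hospital and one quality indicator.
    Returns (before, after) as (midpoint, value) pairs; either may be None.
    """
    with_mid = [(midpoint(s, e), value) for s, e, value in observations]
    earlier = [obs for obs in with_mid if obs[0] < quarter_start]
    later = [obs for obs in with_mid if obs[0] > quarter_end]
    before = max(earlier, key=lambda obs: obs[0]) if earlier else None
    after = min(later, key=lambda obs: obs[0]) if later else None
    return before, after

# Hypothetical observations for one hospital and one quality indicator
obs = [
    (date(1999, 7, 1), date(1999, 12, 31), 0.62),
    (date(2000, 1, 1), date(2000, 6, 30), 0.66),
    (date(2000, 7, 1), date(2000, 12, 31), 0.74),
    (date(2001, 1, 1), date(2001, 6, 30), 0.80),
]
before, after = bracket_quarter(obs, date(2000, 10, 1), date(2000, 12, 31))
print(before, after)
```

Any intervention recorded in that quarter would then be linked to the change between the "before" and "after" values; quarters with no recorded intervention serve as control periods.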

47. Time period definitions: HIQ cell
This is what a data element looked like. In some instances, there was no intervention during the "intervention" time, which gave us control periods. We conducted the study using 6th SOW data 1999-2001, when we had TQIP 1 measurements also available.

48. Before/After Quality Indicator Changes
- For most QIs the average absolute change was 5-10% (range 0.5-21%)
- The relative QI change was larger, averaging 34%
- Wide variation in individual observation pre-post changes
- The average pre-post time interval was a little less than 9 months, without much variation among QIs
We saw substantial increases in quality indicator scores, widely varying across hospitals...

49. Two statistical methods
- Relative risk at the individual QI level
  - Was a quality measure more likely to improve in an interval with an intervention than without it?
- Multiple regression
  - After adjusting for secular trend, did any intervention or combination of interventions predict improvement in QI level?
Here were our two methods of analysis and the questions they sought to answer. Unfortunately, the answer to both was, "Probably not." And a simulation study showed that the method would have detected associations between indicators and interventions given observed indicator improvement and intervention rates, if they had been present.
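
As a sketch of the first method, here is a relative risk calculation with a standard large-sample confidence interval on the log scale. The counts are hypothetical, not results from our study, and the function name is ours.

```python
import math

def relative_risk(improved_with, total_with, improved_without, total_without, z=1.96):
    """Relative risk of improvement in intervals with vs. without an intervention.

    Returns (rr, lower, upper), where lower and upper bound an approximate
    95% confidence interval (for z=1.96) computed on the log scale.
    """
    risk_with = improved_with / total_with
    risk_without = improved_without / total_without
    rr = risk_with / risk_without
    se_log_rr = math.sqrt(
        1 / improved_with - 1 / total_with
        + 1 / improved_without - 1 / total_without
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Hypothetical counts: intervals where a QI improved, with and without an intervention
rr, lower, upper = relative_risk(30, 50, 24, 50)
print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that spans 1, as in this made-up example, means the data do not show that improvement was more likely in intervals with the intervention.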

50. Relative Risk - Number of Significant Associations*
This sporadic occurrence of significant associations is similar to Weingarten's results, and is almost certainly a result of multiple comparisons, though data collection/dissemination is interesting.
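
The multiple comparisons concern is simple arithmetic. In this sketch the number of tests (90) is a made-up figure, and the calculation assumes the tests are independent, which is only an approximation for real quality indicator data.

```python
def false_positive_summary(n_tests, alpha=0.05):
    """What to expect from n_tests independent tests when no real effect exists.

    Returns (expected false positives, probability of at least one,
    Bonferroni-adjusted per-test threshold).
    """
    expected = n_tests * alpha
    p_any = 1 - (1 - alpha) ** n_tests
    bonferroni = alpha / n_tests
    return expected, p_any, bonferroni

# Hypothetical: 15 indicators crossed with 6 intervention types
expected, p_any, bonferroni = false_positive_summary(90)
print(f"expected false positives: {expected:.1f}, P(at least one): {p_any:.3f}")
```

A handful of "significant" associations scattered across that many tests is about what chance alone would produce.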

51. Multiple regression summary
- Most TQIP events not significantly associated with QI change
- 7 significant associations by F test, p<0.05
- 2 (of 7) significant negative associations
- Secular trend significant in over 1/3 of indicators
- Models explain a tiny proportion of variance of indicator change
The main factor explaining improvement was a "secular trend" term. This does not mean that QIO interventions had no effect, but only that we may not have described them or recorded them properly.

52. Three potentially significant data issues Definitions of intervention types were not well understood There was no reliability assessment on assignment of intervention types Start and end time of interventions Potential variation in intensity of intervention Although the TQIP system was established to track interventions, its potential as a tool in improving the quality and reliability of intervention data was never realized. So far as I know, there were never reports comparing QIOs on TQIP submissions, nor did CMS audit TQIP content for accuracy.

53. TQIP Inpatient Events by Month There is suspicious temporal clustering among these events. When asked, the project coordinators who recorded the events indicated that the peaks coincided with statewide events. The recorded dates probably did not correspond to actual events occurring in hospitals.

54. Lessons from the past Carefully collected and audited time series data were useful in documenting impact Haphazardly collected, unaudited intervention data were not Finding evidence of causality is all about observing well defined events at specific times with respect to outcomes

55. Applying the lessons Agree on taxonomy of interventions and record them consistently Get accurate times of interventions from our partners Spend the up front time documenting interventions in detail, e.g., using the SQUIRE guidelines* We cannot depend on CMS to do this for us. A logical group to develop a common approach to this would be the AHQA analytic network.

56. Some promising signs CMS is firmly committed to collecting and reporting quality data periodically There does need to be more attention to quality of the data QIOs are being pushed to demonstrate improvement in narrower, better defined areas This is a fertile environment for using time series methods Some data used for quality are not currently audited, e.g., MDS. Other data could have better auditing, using QIO staff to teach providers consistent data collection approaches, instead of the current "gotcha" approach. Outpatient data, where physicians are just starting to grapple with the difference between free text and structured information as they report quality measures, are a particularly good opportunity. National campaigns are another promising area, where there are numerous providers engaging in similar activities with known starting points. The Home Health QIOSC is currently working on time series analysis of the home health national campaign for this very reason. The new scope of work offers some important opportunities, including the subnational transitions projects, which lend themselves both to controlled comparisons and time series approaches. So, of course, do the CKD projects.

57. Can QIOs ever satisfy Hume's three elements? Temporal precedence? Yes, if we collect time data accurately Contiguity in space and time? Probably not, because the health care system is too complex Regularity of association? Of course! We do have a 50 state laboratory Consistent data collection related to interventions aimed at the same quality problems, documenting both the intervention and the response, can satisfy two of the three elements. As for the third (next slide).

58. The Hospital Quality Improvement Process Any resemblance between this simple schematic and the Rube Goldberg machine shown previously is purely coincidental. Actually, the health care system is much more complicated than this, and the relative size of the QIO versus other influences is undoubtedly exaggerated. QIOs do not control what goes on in the health care system, and it is mostly for health services researchers to tease out the causal links within the system.

59. So, when is association enough? Health care quality in specified areas improves more when QIOs are involved than otherwise This happens with multiple kinds of interventions And in multiple settings And well documented, including savings in cost or improved outcomes If you were a health care administrator, or a government policy maker faced with the unpleasant realities of increasing costs and unreliable quality in health care, would you engage the services of companies that could provide you this?

60. Questions? Comments?
