Epidemiologic Methods Fall 2011 First 5 Lectures

Epidemiologic MethodsFall 2011First 5 Lectures • Overview of Study Design (today) • Disease Occurrence I and II • Disease Association I and II

Study DesignOutline of Topics for Today • Individual vs group as unit of observation • Study base or cohort • Cohort design • Cross-sectional design • Case-control design • 3 ways to sample controls • Primary & secondary study bases • Case-crossover/case series design • Incident versus prevalent disease • Retrospective versus prospective – what matters

Human Subjects Studies Unit of observation Individual Group (eg, geographic area) Experimental Observational Ecological Studies Randomized Non-randomized Cohort Cross-sectional Case-Control Case only

Ecological study: water fluoride & dental caries • A total of 204 (99%) and 191 (93%) municipalities were involved in 1999 and 2004, respectively. • Unit of analysis was the municipalities. • Mean DMFS of 15-year-olds was used as outcome variable. • Concentration of fluoride in the water supply (mean) ppm. Ekstrand et al. Factors associated with inter-municipality differences in dental caries experience among Danish adolescents. Community Dent Oral Epidemiol 2010

Ecological association Fluoride concentration in the water supply versus mean DMFS among 15-year-olds in the different municipalities in 2004. Tendency line is presented.

Ecological fallacy • Mean income and traffic injuries • Results using community as the unit of analysis Communities with higher income have more traffic injuries. What about individuals with higher income?

Results using individual as the unit of analysis: Individuals with higher income have fewer traffic injuries.

Ecological Fallacy

Human Subjects Studies Unit of observation Individual Group (eg, geographic area) Experimental Observational Ecological Studies Randomized Non-randomized Cohort Cross-sectional Case-Control Case only

Cohort is the basis of design with the individual as the unit of measurement With individual as unit of observation, all study design is best thought of as ways of sampling the disease experience of an underlying study base or cohort

Concept of the Study Base • The study base is the population who experience the disease outcomes you will observe in your study • In a cohort study, the study base is an explicitly defined cohort • In a cross-sectional study, the study base is a hypothetical cohort sampled at one point in time • In a case-control study, the study base is the cohort, either explicit or hypothetical, that gave rise to the cases

Three Keys to Study Design Using Observation of Individuals • Identify the population that is the Study Base • Determine how the experience of the Study Base population will be sampled • Consider the timing of measurements of exposure and outcome, relative to each other

Cohort study design D = disease occurrence; arrow = losses to follow-up Minimum loss to follow-up (1%)

Initial Cohort • Research cohort, recruited and followed for particular study. Cohort defined by characteristics at baseline, the beginning of follow-up. • Administrative cohort, e.g. Kaiser members • Geographically defined cohort, e.g. residents of SF

Cohort Study Design • Mimics individual’s progress through life and accompanying disease risk • Gold standard because exposure/risk factor is observed before the outcome occurs • Randomized trial is a cohort design with exposure assigned rather than observed

Framingham Cohort Study The impact of diabetes on survival following myocardial infarction in men vs women. The Framingham Study.Abbott RD, Donahue RP, Kannel WB, Wilson PW. The impact of diabetes on recurrent myocardial infarction (MI) and fatal coronary heart disease was examined in survivors of an initial MI using 34-year follow-up data in the Framingham Study. Among nondiabetic patients, the risk of fatal coronary heart disease was significantly lower in women compared with men (relative risk, 0.6). In the presence of diabetes, however, the risk of recurrent MI in women was twice the risk in men. In addition, the effect of diabetes doubled the risk of recurrent MI in women (relative risk, 2.1) but had an insignificant effect in men. JAMA, 1989

Threat to Validity in a Cohort Study: Losses to follow-up What bias do those lost introduce? Missed disease diagnoses? Those with a risk factor more or less likelyto leave? ? ? ? ? ? ? ?

Subjects lost during follow-up • If losses are random, only power is affected • If disease incidence is research question, losses that are related to outcome will bias results • If association of risk factor to disease is focus, losses bias results only if related to both outcome and the risk factor

Swiss HIV Cohort 3111 patients, ‘96-’99 At least two visits Med. follow-up 28 mos HCV+ more rapid disease progression Adj RH = 1.7 (95% CI = 1.3 - 2.3) No loss to follow-up info (Greub, Lancet, 2000) Johns Hopkins Cohort 1955 patients, ‘95-’01 At least two visits Med. follow-up 25 mos HCV not associated with disease progression Adj RH = 1.0 (95% CI = 0.9 - 1.2) No loss to follow-up info (Sulkowski, JAMA, 2002) Two Cohort Studies of HCV/HIV Coinfection and Risk of AIDS

Cross-Sectional Study Design

Cross-Sectional DesignMeasures Prevalence • Measures prevalence of disease at one point in time. Two types: • Point prevalence: Do you currently have a backache? • Period prevalence: Have you had a backache in the past 6 months? • Don’t confuse with length of time to conduct the study. E.g. Both examples above could occur in a study conducted over 4 months.

Cross-Sectional DesignWeaknesses • Cannot determine whether putative cause preceded the disease outcome • Cannot distinguish factors associated with disease (incidence) from factors associated with survival with disease (prevalence)

Example of Cross-sectional Study • Objective: Determine the prevalence of neck and shoulder pain in a well-functioning cohort, identify factors associated with this pain, …and evaluate the impact of this pain on physical functioning. • Methods: 3,075 men and women in Health ABC study. 50% white/50% black. 70-79 y.o. Well-functioning. • Baseline visit: Participants were asked if they had had neck or shoulder pain lasting at least 1 month during the previous year, and to rate severity of any pain. Vogt et al. Neck and shoulder pain in 70- to 79-year-old men and women: findings from the Health, Aging and Body Composition Study. Spine 2003.

(continued) • Assessment of functioning was carried out using a battery of lower-extremity performance tests • Results: 11.9% reported neck pain and 18.9% reported shoulder pain. .. More severe neck/shoulder pain associated with lower physical performance measures.

Case-Control Design • What is the study base in this design? • Study Base = the population (or underlying cohort) that gave rise to the cases (Szklo and Nieto call it the “reference population”). • Case-control design is best understood by considering how the experience of a cohort is sampled • Study base is the key concept that links case-control design and cohort design

Case-Control Key Concept #1 • Think of the selection of cases and controls as occurring from an underlying cohort (or study base)

Primary & Secondary Study Bases • Case-control studies can be thought of as evolving from one of two study bases (i.e. underlying cohorts): • Primary study base vs Secondary study base • A primary study base is one where the underlying cohort is readily defined, either because it is: • previously assembled for research (e.g., Framingham) • administratively defined (e.g., Kaiser members in 2004) • or geographically defined (e.g., SF residents in 2008) • A secondary study base is more elusive. It starts with the cases and then works backwards.

Cohort study design D = disease occurrence; arrow = losses to follow-up Given that all the cases are diagnosed, how would you sample controls from this cohort for a case-control study?

3 Ways to Sample Controls within a Cohort (Primary Study Base) • At time each case is diagnosed = incidence density sampling • A random sample of the cohort at baseline = case-cohort design • From persons without disease at the end of follow-up = prevalent controls design

Incidence Density Sampling within a Cohort Study Study Base = Cohort In this example, controls are randomly sampled each time a case is diagnosed from those still in follow-up without the diagnosis. It follows that a subject who is a control at one time may become a case later in time.

Example of case-control study with incidence density sampling (of a primary study base) Abstract We investigated associations between metabolic syndrome, its components, and breast cancer risk in a nested case–control study on postmenopausal women of the ORDET cohort (N = 10,000). After a median follow-up of 13.5 years, 163 women developed breast cancer…Four matched controls per case were selected by incidence density sampling, and rate ratios were estimated by conditional logistic regression. Metabolic syndrome (i.e. presence of three or more metabolic syndrome components) was significantly associated with breast cancer risk (rate ratio 1.58 [95% confidence interval 1.07–2.33])… Metabolic syndrome and postmenopausal breast cancer in the ORDET cohort: A nested case–control study Agnoli et al. Nutr Metab Cardiovasc Dis. 2009

Counterintuitive Idea #1 • Individuals can be controls at one point in time and cases at a later time • Incidence density sampling: Sampling “person-time” not individuals. Each time a case occurs, the remaining cohort is sampled for controls from those in follow-up at that time. • Compare with a full cohort: Individuals contribute person-time up until becoming a case or the end of their follow-up. This case-control design samples from the person-time experience of the cohort.

Another example of incidence density sampling • Objective: Associations between serum C-reactive protein (CRP) and colon and rectal cancer • Methods: Nested case-control study within the European Prospective Investigation into Cancer and Nutrition, over 520,000 participants recruited 1992–2000. • Cases: Incident cancer cases identified through record linkage with regional cancer registries. Closure dates ranged from December 1999 to June 2003 • Controls: We used an incidence density sampling protocol for control selection, such that controls could include subjects who later became cases, while each control subject could also be sampled more than once. Matching characteristics were study center at the time of enrollment, sex, age at blood collection (6-month to 2-year intervals),… • Results: Evidence that elevated CRP concentrations are related to a higher risk of colon cancer but not rectal cancer. Aleksandrova et al. Circulating C-Reactive protein concentrations and risks of colon and rectal cancer. AJE 2010

Case-cohort design: sample baseline of cohort

Case-cohort design: Case-control study using sampling of the baseline of the underlying cohort Abstract We conducted a case cohort investigation of colorectal cancer among nondiabetic subjects enrolled in the Women’s Health Initiative Observational Study, a prospective cohort of 93,676 postmenopausal women. Fasting baseline serum specimens from all incident colorectal cancer cases (n = 438) and a random subcohort (n = 816) of Women’s Health Initiative Observational Study subjects were tested for insulin, glucose, total IGF-I, free IGF-I, IGF binding protein-3, and estradiol. Gunter et al. Insulin, Insulin-like Growth Factor-I, Endogenous Estradiol, and Risk of Colorectal Cancer in Postmenopausal Women. Cancer Res 2008;68(1):329–37

Individual can be a control and a case in case cohort design • In the subcohort, some will go on to become cases. How can we include them in the control group? • Goal is to compare the prevalence of the exposure of interest between the cases and the cohort. Not comparing cases and “non-cases.” • By comparing exposure between the cases and the underlying cohort, we can calculate an unbiased estimate of the risk ratio. • More on this in the Disease Association lectures.

Another case-cohort example • Objective: Determine whether serum C-reactive protein levels, a sensitive indicator of inflammation, are associated with the risk of cardiovascular mortality among older women. • Methods: We conducted a case-cohort study within the Study of Osteoporotic Fractures, a population-based study involving 9,704 women aged > or = 65 years from four U.S. centers. • Cases: Random sample of 92 women from the 1,125 women in the cohort who had died during the first 6 years of follow-up. • Controls: We randomly selected 400 women from the entire cohort. • Baseline serum C-reactive protein levels • Results: During 6 years of follow-up, women with C-reactive protein levels in the highest quartile (>3.0 mg/L) had a 8.0-fold (95% confidence interval [CI]: 2.2 to 29) greater risk of cardiovascular mortality than those in the lowest quartile (< or = 1.0 mg/L). Tice et al. The relation of C-reactive protein levels to total and cardiovascular mortality in older U.S. women. Am J Med 2003.

Case-control design using incident cases + prevalent controls

Case-Control Key Concept #2 • Closed (fixed) vs open (dynamic) cohorts • Any well-enumerated administrative or geographic defined population can be thought of as a dynamic cohort that continues to recruit new subjects over time

Incidence Density Sampling in a Dynamic Cohort (e.g., a primary study base: San Francisco County) New residents D Incidence-density sampling in a specified population with new subjects entering

Example of case-control study with incidence density sampling in a primary (dynamic) study base The association between long-term use of NSAIDs and non-Hodgkin lymphoma (NHL) was examined …in an ongoing population-based case-control study in the SF Bay Area. Cases were identified using Surveillance, Epidemiology, and End Results registry data. …Controls were a random sample of persons identified by random digit dialing… resided in the same six counties as the casesat the time of diagnosis… …After adjustment for age and sex, there was no consistent association between long-term use and NHL for all NSAIDs combined, aspirin, nonselective NSAIDs, and COX-2 inhibitors. Flick, et al. Use of NonsteroidalAntiinflammatory Drugs and Non-Hodgkin Lymphoma: A Population-based Case-Control Study.Am J Epidemiol 2006; Sept 1, 164:497-504.

Dynamic study base: Cases • Incident NHL between October 2001 and May 2004 in the 6 SF Bay Area counties. • Eligible cases were those who were SF Bay Area residents at the time of diagnosis, aged 21–85 years at the time of diagnosis, and able to complete an interview in English. NHL cases were identified using rapid case ascertainment and Surveillance, Epidemiology, and End Results registry data. • A total of 1,000 cases

Dynamic study base: controls • Control participants were frequency-matched to cases by age, sex, and county of residence. Control participants were a random sample of persons identified by random digit dialing or from Centers for Medicare and Medicaid Services files. Eligible controls had no history of NHL, resided in the same six counties as the cases at the time of diagnosis, were aged 21–85 years, and were able to complete an interview in English. • Participants completed in-person interviews designed to measure potential NHL risk factors. Questions were asked regarding use during the past 20 years of aspirin, prescription and over-the-counter nonselective NSAIDs, and cyclooxygenase-2 (COX-2) inhibitors. • A total of 1,060 controls

Definition of a Primary Study Base • Primary Study Base = population that gives rise to cases that can be defined before cases appear • Typically, based on participation in a research study; administrative definition; or geographical area • Our previous examples of an “underlying cohort” for a case-control study were primary study bases

Definition of a Secondary Study Base • Secondary Study Base = population that gave rise to cases, defined as those persons who would have been identified as cases if they had developed disease during the period of study • Start with cases and then attempt to identify hypothetical cohort that gave rise to them • Difficult concept but crucial to case-control design when primary study base not possible

Case Control Selection from a Secondary Study Base • Source of cases is often one or more hospitals or other medical facilities • For controls, problem is identifying who would come to the facility if diagnosed with the disease • Careful consideration has to be given to factors causing someone to show up at that institution with that diagnosis

Secondary Study Base • Example: Incident glioma cases seen at UCSF • Difficult to identify study base because referrals come from many areas • One possible control group might be UCSF patients with a different neurologic disease • Patients from a similar tertiary referral clinic are another possible control group • Residents of the neighborhood of the case are another possibility

Example: case control study with incident cases and 2ary study base • Oral contraceptive (OC) use has been linked to increased risk of breast cancer, largely on the basis of studies conducted before 1990. • In the Case-Control Surveillance Study, a US hospital-based case-control study of medication use and cancer, the authors assessed the relation of OC use to breast cancer risk among 907 case women with incident invasive breast cancer (731 white, 176 black) and 1,711 controls (1,152 white, 559 black) interviewed from 1993 to 2007. • After control for breast cancer risk factors, the multivariable odds ratio for 1 year or more of OC use, relative to less than 1 year of use, was 1.5 (95% CI: 1.2, 1.8). Rosenberg et al. A Case-Control Study of Oral Contraceptive Use and Incident Breast Cancer AJE 2008

Selection of cases & controls • The present analyses included patients interviewed from 1993 through 2007 in participating hospitals in Baltimore, New York, and Philadelphia. Eligible cases were 921 white and black women aged 25–69 years with invasive breast cancer diagnosed within the previous year who had had no other cancer besides nonmelanoma skin cancer. We excluded 14 women with missing values for duration or timing of OC use, which left 907 cases. • Controls were selected from 2,330 white and black women aged 25–69 years with no history of cancer, other than nonmelanoma skin cancer, who had been admitted for nonmalignant diagnoses that we judged to be unrelated to OC use (musculoskeletal disorders, such as ruptured discs and fractures (n = 647); acute infections (n = 622); and hernias, kidney stones, gallstones, and skin conditions (n = 442)). We did not include patients admitted for nonmalignant illnesses that are possibly caused or prevented by OC use, such as cardiovascular disease or endometriosis (28). What is the secondary study base? Can those who belong in this study base be identified?

Primary vs. Secondary Study Base • Main problem with a primary base is often ascertainment of all cases • eg, no registry of all cases for many diseases by geographic area • Main problem with a secondary base is the definition of the base • eg, hospital-based case-control studies common but very difficult to determine the study base

Epidemiologic Methods Fall 2011 First 5 Lectures