After the First Steps: The Evolution of a Longitudinal Survey. National Population Health Survey (NPHS) Douglas Yeo Workshop on Longitudinal Research in Social Science—A Canadian Focus Population Studies Centre, University of Western Ontario London, Oct. 25–27, 1999. NPHS Program.

After the First Steps: The Evolution of a Longitudinal Survey

National Population Health Survey (NPHS)

Douglas Yeo

Workshop on Longitudinal Research in Social Science—A Canadian Focus

Population Studies Centre, University of Western Ontario

London, Oct. 25–27, 1999

  • To aid in the development of public policy
    • To understand the determinants of health
      • Economic, social, demographic, occupational and environmental correlates of health
    • To explore relationship between health status and health care utilisation
  • To follow a panel of people to reflect the dynamic process of health
  • To provide means to supplement content or sample
  • To allow linkage with administrative data
  • Sample allocation at the national, provincial and territorial levels
  • Minimum requirement of 1,200 households for each province and territory
  • Household component: 20,000 households
    • Use of the LFS sampling design
  • Health care institutions: 2,500 residents
  • The North: 2,400 persons
  • Longitudinal and cross-sectional
  • First cycle in 1994, repeated every 2 years
  • Personal and telephone interviews
  • Basic information collected from all household members
  • One household member selected as the health respondent (longitudinal respondent)
  • General Questionnaire
    • All household members
    • Proxy reporting permitted (55% of cases)
  • Health Questionnaire
    • One randomly selected respondent in each household
    • Proxy reporting rarely permitted (4% of cases)
Content—Core (General)
  • Two-week Disability
  • Health Care Utilization
  • Restriction of Activities
  • Chronic Conditions
  • Sociodemographic Characteristics
    • Country of birth, immigration, language
    • Labour force
    • Income
    • Education
Content—Core (Health)
  • Self-Perceived Health
  • Blood Pressure
  • Women’s Health
  • Height and Weight
  • Health Status
  • Physical Activity
  • Repetitive Strain (1996 and 1998)
  • Injuries
Content—Core (Health)
  • Use of Medications
  • Smoking
  • Alcohol
  • Mental Health
  • Social Support
  • Sense of Coherence (1994 and 1998)
  • Alcohol Dependence (1996)
Content—Focus 1994
  • Stress
    • Ongoing problems
    • Recent Life Events
    • Childhood and Adult Stressors (“traumas”)
  • Work Stress
  • Self-esteem
  • Mastery
Content—Focus 1996
  • Access to Services
    • Blood pressure
    • Pap smear test
    • Mammography
    • Breast examinations
    • Breastfeeding
    • Physical check-up
    • Flu shots
    • Dental visits
    • Eye examination
    • Emergency services
    • Insurance coverage
HPS 1996
  • Height and Weight
  • Breast Self-Examination
  • Breastfeeding
  • Pregnancy
  • HIV
  • Smoking
  • Alcohol
  • Sexual Health
  • Road Safety
  • Food Insecurity
  • Separate Release
Content—Focus 1998
  • Focus
    • Self Care
    • Family Medical History
    • Diet/Nutrition
    • Tobacco Alternatives
  • Food Insecurity supplement (HRDC)
Content—Focus 2000
  • Additional chronic conditions
    • In-depth diabetes questions
    • Fibromyalgia
  • Tanning and UV exposure
  • Stress questions are back
    • Ongoing Problems
    • Recent Life Events
    • Childhood and Adult Stressors (“traumas”)
    • Work stress
    • Self-esteem
    • Mastery
  • Illicit drug use
File Creation—1994
  • Core sample (20,000)
  • Buy-in sample
    • N.B., Ont., Man., B.C.
  • Files produced: Cross-sectional
    • 1994 General File (all household members)
    • 1994 Health File (one randomly selected respondent)
File Creation—1996
  • 1994 responding panel members
  • Cross-sectional Files
    • 1996 General File (all household members)
    • 1996 Health File (one randomly selected respondent)
    • Includes buy-in sample
      • Ontario, Manitoba, Alberta
  • Longitudinal File (1994–96)
Products: Files
  • Master Files 1994–95 & 1996–97 (released)
  • Share Files 1994–95 & 1996–97 (released)
    • (Health Canada & Provinces)
  • Public Use Microdata Files
    • 1994–95 Household, General & Health (rel.)
    • 1994–95 Institutions, Health (rel.)
    • 1996–97 Household, General & Health (rel.)
    • 1996–97 Institutions, Health & Longitudinal (late 1999)
    • 1996–97 Household, Longitudinal (doubtful)
Products: Access
  • Master Files
    • Selected Regional Offices
      • Deemed employee of Statistics Canada
    • Remote Access
      • Internet job submission
      • Using test master files
      • Free to clients
Products: Publications
  • NPHS Overview Report
    • 1994–95—self-rated health and income, chronic conditions and pain, depression, use of health care services and alternative medicine
    • 1996–97—chronic disease incidence, changes in activity limitation status, depression, repetitive strain injuries, smoking, use of health care services
    • 1998–99—March 2000 issue of Health Reports
Products: Publications
  • Health Reports—detailed articles:
    • Depression, chronic pain, immigrants’ health, sense of coherence, smoking, hormone replacement therapy, bicycle helmet use, sample design…
Products: NHRDP
  • National Health Research and Development Program
    • Jointly funded by Health Canada and Statistics Canada
    • Up to $300,000 annually for NPHS research
  • Cycle 1: 8 grants, papers available
  • Cycle 2: 7 grants, papers available
  • Cycle 3: 7 grants, research starting
  • Cycle 4: Health Canada preparing RFP
1994 Sample Design
  • Household target population
    • Based upon Labour Force Survey (LFS) and Enquête sociale et de santé (in Quebec only)
    • Household residents in all provinces
      • Exclusions: Indian reserves, Canadian forces bases, remote areas in Ontario and Quebec
  • Stratified multistage design
1994 Sample Design
  • 1st stage
    • Strata formed
      • Major urban centres, urban towns, rural areas
      • Further stratified by geography and/or socioeconomic characteristics
    • Clusters (heterogeneous) formed independently within strata
      • Clusters selected based upon PPS sampling
  • 2nd stage
    • Dwelling lists prepared for each selected cluster
    • Subsample of households selected within each cluster
Cluster Sampling
  • Highly cost-effective in terms of listing and data collection
    • Only selected clusters are listed
  • Less efficient than SRS
    • Neighbouring units similar (intracluster correlation)
  • PPS sampling
    • Vary the probability with which a unit is selected according to its size
    • Units do not have same probability of selection (unequal weights)
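As an illustration of the PPS idea above, here is a minimal sketch of systematic PPS selection of clusters. The slides do not say which PPS scheme the LFS design uses, so the systematic method, the function name `pps_systematic`, and the cluster sizes are all assumptions made for illustration only.

```python
import random

def pps_systematic(sizes, n, rng):
    """Select n units with probability proportional to size (systematic PPS).

    Returns (unit_index, inclusion_probability) pairs.
    Assumes no unit is larger than the sampling interval total/n,
    so no unit can be selected more than once.
    """
    total = sum(sizes)
    interval = total / n
    start = rng.uniform(0, interval)
    targets = [start + k * interval for k in range(n)]

    picks = []
    cumulative = 0.0
    t = 0
    for i, size in enumerate(sizes):
        cumulative += size
        # A unit is selected when a target point falls in its size range.
        while t < n and targets[t] < cumulative:
            picks.append((i, n * size / total))
            t += 1
    return picks

# Hypothetical dwelling counts for seven clusters in one stratum.
cluster_sizes = [120, 300, 80, 200, 150, 250, 100]
sample = pps_systematic(cluster_sizes, 3, random.Random(1))

for index, prob in sample:
    # The design weight is the inverse of the inclusion probability,
    # so larger clusters get smaller weights (unequal weights).
    print(index, round(prob, 3), round(1 / prob, 2))
```

Because the inclusion probability grows with cluster size, the resulting design weights differ from unit to unit, which is the "unequal weights" point made on the slide.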
1994 Sample Design: Rejective Method
  • One member/hhld selected at random to be longitudinal respondent
  • Panel would underrepresent persons in large hhlds (parents and children) and overrepresent persons in smaller hhlds (singles and elderly)
  • Portion of sample pre-identified for screening
    • If no member < 25 years old then screened out
  • Increased # hhlds visited by anticipated # screened out
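The under- and overrepresentation described above follows from selecting one person per household: a person in a household of size k is chosen with probability 1/k. The sketch below works this through for a made-up household-size distribution (the shares are hypothetical, not NPHS figures) and does not implement the screening step itself.

```python
# Hypothetical share of households by household size (not NPHS data).
household_share = {1: 0.25, 2: 0.35, 3: 0.15, 4: 0.15, 5: 0.10}

# Share of the population living in households of each size.
persons = {size: size * share for size, share in household_share.items()}
total_persons = sum(persons.values())
population_share = {size: p / total_persons for size, p in persons.items()}

# With one person selected per household, the unweighted panel is
# distributed like the households, not like the population.
panel_share = dict(household_share)

# Ratio > 1 means overrepresented, < 1 means underrepresented.
representation = {
    size: panel_share[size] / population_share[size]
    for size in household_share
}

for size, ratio in representation.items():
    print(size, round(population_share[size], 3), panel_share[size], round(ratio, 2))
```

In this toy distribution, one-person households are overrepresented and five-person households are underrepresented before weighting, which is the imbalance the screening of pre-identified households is meant to reduce.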
1994 Sample Design: Integration With NLSCY
  • NLSCY follows ~25,000 children
  • NPHS longitudinal respondents < 12 years of age collected by NLSCY
    • NPHS children’s sample used in NLSCY estimates and for NPHS
  • Due to scheduling constraints, NPHS kids sample not selected before Q3 and Q4
Sample Design: Subsequent Cycles
  • Longitudinal respondents recontacted, using contact information from previous cycles
    • Moved into an institution
    • Moved to territories
    • Moved to an Indian reserve => tried to get data
    • Moved temporarily away
    • Identified deaths
  • Hhlds in sample include hhlds where the longitudinal respondent currently lives
    • Hhld composition may have changed
Sample Design: Subsequent Cycles
  • Longitudinal respondents’ data used for panel and cross-sectional purposes
  • Hhld members data used for cross-sectional estimates only (General file)
  • NPHS kids sample now collected by NPHS, not NLSCY
  • Cross-sectional supplementary samples from previous cycles not followed up
Sample Design: Subsequent Cycles
  • Top-up of sample every second cycle
    • First time in 1998
    • For cross-sectional purposes only
    • Account for changing population, panel attrition
    • To cover population not present in 1994: new births, immigrants
Data Collection
  • Statistics Canada LFS interviewers
  • Computer-Assisted Personal or Telephone Interviews (CAPI/CATI)
    • Built-in edits, mins, maxes
    • Direct skip patterns
    • On-screen prompts
    • Pre-filling of text or data
  • Average interview time 1 hour
Data Collection
  • Data collected at 4 points in time
    • For operational, seasonality reasons
    • June, August, November, February
  • Nonresponse: no contact, refusal
    • Letter sent, second call, senior interviewer follows up
    • Never replace sample dwellings with others
    • Resends: follow up nonresponse in subsequent quarters, and in special resend period the following June
Data Collection
  • Tracing to find longitudinal respondents
    • Panel member only
  • Feed back information from previous cycles
    • Data quality check
    • Probes for reasons for change
    • Restriction of activities, chronic conditions, smoking
  • Some sociodemographic information not re-asked if no change
  • Editing
    • On-line edits in CAPI
    • Some head office consistency edits
      • Invalid, inconsistent data set to "not stated"
  • Coding of write-in information (e.g., drugs)
  • Creation of derived variables
Response Rates
  • 1994 Household: 88.7%
    • Selected respondent: 96.1%
  • 1996 Longitudinal
    • General: 93.6%
    • Health: 92.8%
    • Only 1.7% not traced
  • 1996 Cross-sectional Household: 82.5%
    • Selected Respondent: 95.0%
Analysing Complex Data
  • Point estimation
    • Survey weights must be used in calculation of estimates to correctly draw conclusions about pop’n of interest
    • Weights take stratification, unequal sampling probabilities into account
  • Variance estimation
    • Using survey weight only not sufficient
    • Complex design (and design effect) must be accounted for to avoid serious underestimation of standard errors
Effect of Weighting
  • Comparison of males and females who reported being in excellent or very good health
    • Weighted difference 65.3 - 61.6 = 3.7%
    • Unweighted difference 62.6 - 60.8 = 1.8%
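To show how a weighted and an unweighted proportion can diverge, here is a small sketch. The six records and their weights are invented for illustration and are unrelated to the NPHS figures above; `weighted_proportion` is a hypothetical helper name.

```python
def weighted_proportion(flags, weights):
    """Share of the weighted population with the characteristic."""
    total = sum(weights)
    with_characteristic = sum(w for f, w in zip(flags, weights) if f)
    return with_characteristic / total

# 1 = reports excellent or very good health (hypothetical records).
flags = [1, 1, 0, 1, 0, 0]
# Each weight is the number of people the record represents (hypothetical).
weights = [500, 500, 2000, 500, 2000, 500]

unweighted = sum(flags) / len(flags)
weighted = weighted_proportion(flags, weights)

print(unweighted)  # 0.5
print(weighted)    # 0.25
```

The records with the characteristic happen to carry small weights here, so the unweighted share overstates the population share.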
1998 Weighting Methodology
  • All panel respondents have a longitudinal weight
    • Includes moved to institution, dead, etc.
  • Start with basic weights from 1994
    • Derived from LFS or L’enquête sociale et de santé weights
    • Probability of selecting a dwelling in a selected cluster
1998 Weighting Methodology
  • Nonresponse adjustment—by weighting classes
    • To account for potential nonresponse bias
    • Study whether nonrespondents differ from respondents
    • Create special weighting classes based on response propensity using CHAID to account for these differences properly
  • Calibrate to 1994 population totals (by province/age/sex)
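The two adjustment steps above can be sketched as follows. This is a simplified illustration, not the NPHS production procedure: weighting classes are taken as given (the CHAID step that forms them is not shown), calibration is done by simple post-stratification, and the record fields `wclass` and `cell` and the control totals are hypothetical.

```python
from collections import defaultdict

def nonresponse_adjust(records):
    """Inflate respondent weights within each weighting class so that
    respondents also represent the nonrespondents in their class."""
    total = defaultdict(float)
    responding = defaultdict(float)
    for r in records:
        total[r["wclass"]] += r["weight"]
        if r["responded"]:
            responding[r["wclass"]] += r["weight"]
    return [
        {**r, "weight": r["weight"] * total[r["wclass"]] / responding[r["wclass"]]}
        for r in records
        if r["responded"]
    ]

def poststratify(records, control_totals):
    """Scale weights so each cell sums to its known population total."""
    sums = defaultdict(float)
    for r in records:
        sums[r["cell"]] += r["weight"]
    return [
        {**r, "weight": r["weight"] * control_totals[r["cell"]] / sums[r["cell"]]}
        for r in records
    ]

# Hypothetical panel records.
panel = [
    {"weight": 100.0, "wclass": "A", "cell": "M", "responded": True},
    {"weight": 100.0, "wclass": "A", "cell": "F", "responded": False},
    {"weight": 200.0, "wclass": "B", "cell": "F", "responded": True},
    {"weight": 200.0, "wclass": "B", "cell": "M", "responded": True},
]

adjusted = nonresponse_adjust(panel)
calibrated = poststratify(adjusted, {"M": 500.0, "F": 300.0})

print([r["weight"] for r in adjusted])    # [200.0, 200.0, 200.0]
print([r["weight"] for r in calibrated])  # [250.0, 300.0, 250.0]
```

After the first step the respondents' weights still sum to the full-sample total; after the second, each province/age/sex cell (here just "M" and "F") matches its control total.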
1998 Weighting Methodology
  • Three longitudinal weights
    • WT68LF: “Full”, for respondents with all components fully completed on all occasions
    • WT68LP: “Partial”, for respondents with fully completed data for 1994 and 1998
    • WT64LS: “Square”, for the entire panel of 17,276, including nonrespondents
Design Effects
  • Measure of complexity of sample design
    • Calculate design variance using bootstrap weights
    • Calculate SRS variance
    • Deff = design variance / SRS variance
  • Generally, deffs > 1 for clustered designs, deffs < 1 for stratified designs
  • Varies (sometimes greatly) by characteristic
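A minimal sketch of the deff calculation for a proportion, assuming the SRS variance is approximated by p(1 - p)/n and that the design variance has already been obtained (for example from bootstrap weights). The numbers are hypothetical.

```python
def design_effect(design_variance, p, n):
    """Deff = design variance / SRS variance, for a proportion p
    estimated from n respondents."""
    srs_variance = p * (1 - p) / n
    return design_variance / srs_variance

# Hypothetical inputs: a proportion of 0.6 from 1,000 respondents,
# with a design-based variance of 0.00048.
deff = design_effect(design_variance=0.00048, p=0.6, n=1000)
effective_n = 1000 / deff

print(round(deff, 2))      # 2.0
print(round(effective_n))  # 500
```

A deff of 2 means the clustered sample carries about as much information for this characteristic as a simple random sample half its size.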
Variance Estimation
  • Measuring sampling error for complex sample designs
    • Simple formulas not available
    • Most software packages do not incorporate design effect appropriately for variance calculations
  • Need to provide some measures of data quality to users
NPHS Variance Estimation
  • Bootstrap resampling method (similar to jackknife) used for all variance estimation
    • Aggregates, proportions, differences, coefficients from linear and logistic regressions
    • Variance estimation program written in SAS/SPSS macros
  • Approximate coefficient of variation (CV) look-up tables also provided with PUMF
    • For categorical variables, totals, proportions
Bootstrap Weight Method
  • Variance estimation divided into two phases:
    • Calculation of bootstrap weights
      • Calculated only once, by Statistics Canada
    • Variance estimation using bootstrap weights
      • Internally and externally
        • Bootstrap weights available for regional office master files, for share files, in remote access program (dummy files)
      • No need for design information
        • Bootstrap weights incorporate design effect implicitly
Bootstrap Weights: Calculation
  • Resampling method, which divides records into subgroups (replicates) and determines the variation in the estimates from replicate to replicate
  • Within each stratum, resample within original sample by taking a SRSWR of n-1 of the n clusters in that stratum
Bootstrap Weights: Calculation
  • Recalculate the weight for each record in that stratum; this is the bootstrap weight
  • We now have a new bootstrap weight for every record on the file. This set of weights is the first bootstrap replicate. A new point estimate ($\hat{\theta}^{*}$) can be calculated using the weights of this replicate
  • Repeat B (e.g., B=500) times
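The resampling steps above can be sketched as follows. The slides do not give the reweighting formula, so this sketch assumes the usual rescaling for drawing n-1 of n clusters with replacement: each record's weight is multiplied by (n / (n-1)) times the number of times its cluster was drawn. Any later adjustments to the replicate weights (such as recalibration) are left out, and the record layout is hypothetical.

```python
import random
from collections import Counter

def bootstrap_weights(records, B, rng):
    """records: list of (stratum, cluster, weight) tuples.

    Returns B lists of replicate weights, one weight per record.
    Assumes every stratum contains at least two clusters.
    """
    clusters = {}
    for stratum, cluster, _ in records:
        ids = clusters.setdefault(stratum, [])
        if cluster not in ids:
            ids.append(cluster)

    replicates = []
    for _ in range(B):
        factor = {}
        for stratum, ids in clusters.items():
            n = len(ids)
            # SRSWR of n-1 of the n clusters in this stratum.
            draws = Counter(rng.choices(ids, k=n - 1))
            for c in ids:
                # Clusters never drawn get a factor (and weight) of zero.
                factor[(stratum, c)] = draws[c] * n / (n - 1)
        replicates.append([w * factor[(s, c)] for s, c, w in records])
    return replicates

# Hypothetical file: two strata, one record per cluster.
records = [
    (1, "a", 10.0), (1, "b", 10.0), (1, "c", 10.0),
    (2, "d", 20.0), (2, "e", 20.0),
]
reps = bootstrap_weights(records, B=5, rng=random.Random(0))
for weights in reps:
    print(weights, sum(weights))
```

In this toy file every cluster within a stratum has the same weight, so each replicate reproduces the original weight total of 70 while redistributing it across clusters.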
Bootstrap Weights: Variance Estimation
  • To estimate the variance of any estimate ($\hat{\theta}$), first calculate the estimate B times, using the weights from the B bootstrap replicates
  • Then calculate the variance among these B estimates:
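One common form of this variance estimator is given below as an assumption, not as the slide's exact formula; some presentations divide by B - 1 or centre on the full-sample estimate instead of the replicate mean. Here $\hat{\theta}^{*}_{b}$ denotes the estimate computed with the weights of replicate $b$.

```latex
v_{B}(\hat{\theta}) = \frac{1}{B} \sum_{b=1}^{B} \left( \hat{\theta}^{*}_{b} - \bar{\theta}^{*} \right)^{2},
\qquad
\bar{\theta}^{*} = \frac{1}{B} \sum_{b=1}^{B} \hat{\theta}^{*}_{b}
```

The standard error is the square root of $v_{B}(\hat{\theta})$.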
Bootstrap Weight Method: Advantages
  • Sets of 500 bootstrap weights can be distributed to analysts
  • Handles large datasets
  • Interprovincial migration accounted for correctly in variance estimates
  • Recommended (over the jackknife) for estimating the variance of nonsmooth functions like quantiles, LICO, Gini index
Variance Estimation Example
  • Comparison of % of males vs. females who are in excellent or very good health
    • Weighted difference 65.3 - 61.6 = 3.7%
  • SAS—scaled weights
    • Standard error: 0.36
    • 95% confidence interval (3.0, 4.4)
  • Bootstrap
    • Standard error: 0.70
    • 95% confidence interval (2.3, 5.1)
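The two intervals above can be reproduced from the reported difference and standard errors, assuming a normal approximation with z = 1.96. The ratio of the two variances is added here as a rough indication of how much the scaled-weight approach understates the variance; it is not a figure from the slides.

```python
def confidence_interval(estimate, standard_error, z=1.96):
    """Normal-approximation confidence interval."""
    return (estimate - z * standard_error, estimate + z * standard_error)

difference = 3.7    # weighted male-female difference, in percentage points
se_scaled = 0.36    # standard error from scaled weights
se_bootstrap = 0.70 # standard error from bootstrap weights

ci_scaled = tuple(round(x, 1) for x in confidence_interval(difference, se_scaled))
ci_bootstrap = tuple(round(x, 1) for x in confidence_interval(difference, se_bootstrap))
variance_ratio = (se_bootstrap / se_scaled) ** 2

print(ci_scaled)                 # (3.0, 4.4)
print(ci_bootstrap)              # (2.3, 5.1)
print(round(variance_ratio, 2))  # 3.78
```

The bootstrap variance is nearly four times the scaled-weight variance in this example, so ignoring the design would make the difference look considerably more precise than it is.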
Limitations and Feedback
  • Some topics could be explored more thoroughly
  • Data raises more questions than it answers
  • Sample sizes can become small in a hurry
  • Often useful to combine with other survey data to explain a phenomenon
  • Nice to be able to calculate bootstrap variance which takes design into account
Analytical Findings
  • Proxy / nonproxy reporting
  • Handling item nonresponse
  • Handling data inconsistencies
  • Study gross flows / changes
NPHS Future Directions
  • New Household Cross-sectional Survey
    • Provide health-region estimates
    • Sample size of 130,000 / 30,000
    • Core, regional and rotating focus content
    • 45-minute interview
    • Every two years starting in 2000–01
NPHS Future Directions
  • Expanded Health Care Institutions Survey
    • Provide national and provincial estimates
    • To start in 2001?
  • Expanded Northern Survey
    • Total sample of 3,000 (1,000 per territory)
  • National Person-oriented Registries
    • NPHS data linked
NPHS Future Directions
  • Current Household Survey
    • Strictly longitudinal focus
    • New cohort to start in 2004?
    • Physical measures content to (sample of) longitudinal cohort twice in 20-year life
    • Continue every two years
NPHS Future Directions
  • Focus entirely longitudinal
    • Content will now specialise
  • How big should the new household cohort be?
  • Institutions, North panels:
    • How long should they be kept?
    • Integration with other surveys
    • When should new cohort be started?


National Population Health Survey

Manager: Lorna Bailie

Output Manager: Bryan Lafrance, 613-951-3285

Senior Methodologists: Harold Mantel, 613-951-4150

Douglas Yeo, 613-951-8614