Analyzing Survey Error with Latent Class Models

Paul Biemer, RTI International and University of North Carolina. March 18, 2005.


Presentation Transcript


  1. Analyzing Survey Error with Latent Class Models Paul Biemer RTI International and University of North Carolina March 18, 2005

  2. What is Latent Class Analysis? • Special case of log-linear analysis with latent variables • Latent variables are constructs which are measured imperfectly by indicator variables • Traditional LCA assumes local independence • i.e., P(A and B|X) = P(A|X)P(B|X) for latent variable X and indicators A and B • LCA models contain • Structural component – describes relationship among latent variables and covariates • Measurement component – describes the relationship among the indicators, latent variables and covariates

  3. Uses of Latent Class Analysis in Survey Research • Substantive researchers focus on the structural component of the LCM • Errors treated as nuisance parameters • Survey methods researchers focus on the measurement component • Estimate components of total survey error • Evaluation of questionnaires and alternative survey designs • Population size estimation • Compensation for missing data • Survey bias adjustment

  4. Objective of LCA for Measurement Error Analysis • Obtain estimates of classification error for a categorical survey variable • e.g., false positive and false negative error rates • Why are these LCA estimates useful? • Quantify the measurement error in the data • Identify the correlates of measurement error • Trace error to its root causes • Eliminate the cause through redesign

  5. Example – Estimating the Error in Survey Measurements of Marijuana Use • Three Indicators of Marijuana Use • Indicator A – How long has it been since you last used marijuana or hashish? A = “Yes” if the response indicates use in the last 12 months; A = “No” otherwise • Indicator B – Now think about the past 12 months from your 12-month reference date through today. On how many days in the past 12 months did you use marijuana or hashish? B = “Yes” if response is 1 or more days; B = “No” otherwise

  6. Indicator C – a composite variable based upon 7 questions such as • used in last 12 months? • spent a great deal of time getting it, using it, or getting over its effects? • used drug much more often or in larger amounts than intended? C = “Yes” if response is positive to any question suggesting use in last 12 months C = “No” otherwise

  7. Statistical Framework NOTATION • X = true drug use status (1 if use, 2 if no use), an unknown latent variable • A, B, and C are three dichotomous indicators of X
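
The formulas on this slide did not survive in the transcript. As a hedged reconstruction, the standard latent class model under the local independence assumption of slide 2, extended to three indicators, can be written as below. The symbols \pi_x, \pi_{a|x}, and so on are notation assumed here for illustration and may differ from the notation on the original slide.

```latex
% Joint probability of an observed response pattern (a, b, c),
% obtained by summing over the two latent classes of X.
P(A=a,\,B=b,\,C=c)
  = \sum_{x=1}^{2} P(X=x)\,P(A=a \mid X=x)\,P(B=b \mid X=x)\,P(C=c \mid X=x)

% The same model in a more compact (assumed) notation:
\pi_{abc} = \sum_{x=1}^{2} \pi_x \,\pi_{a|x}\,\pi_{b|x}\,\pi_{c|x}
```

In this notation, the false negative rate for indicator A is P(A = “No” | X = 1) and the false positive rate is P(A = “Yes” | X = 2).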

  8. Log-linear Formulation of the Latent Class Model • The latent class model is equivalent to the hierarchical log-linear model (LLM) {AX BX CX}
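
The equations for this slide are missing from the transcript. A sketch of what the hierarchical model {AX BX CX} conventionally denotes is given below. Here m_{xabc} is assumed to be the expected count in cell (x, a, b, c) of the unobserved X×A×B×C table, and the u-terms are the usual log-linear effects under the usual identifying constraints. The symbols are illustrative and not necessarily those of the original slide.

```latex
% Hierarchical log-linear model {AX BX CX}: main effects plus the three
% two-way interactions between each indicator and the latent variable X.
\log m_{xabc}
  = u + u^{X}_{x} + u^{A}_{a} + u^{B}_{b} + u^{C}_{c}
      + u^{AX}_{ax} + u^{BX}_{bx} + u^{CX}_{cx}

% Link to the probability form, with N the sample size:
m_{xabc} = N \, P(X=x)\,P(A=a \mid X=x)\,P(B=b \mid X=x)\,P(C=c \mid X=x)
```

The absence of AB, AC, and BC interaction terms is what encodes local independence: the indicators are related to one another only through X.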

  9. Estimation • Use maximum likelihood estimation (MLE) to obtain estimates of the model parameters from the multinomial likelihood of the A×B×C classification table
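
One common way to compute these maximum likelihood estimates is the EM algorithm. The sketch below is a minimal illustration and not the procedure used in the study. It assumes two latent classes, three binary indicators, 0-based coding (class 0 = use, class 1 = no use; category 0 = “Yes”, 1 = “No”), and a fixed asymmetric starting point so that class 0 ends up as the "user" class. The function names `fit_lca` and `expected_table` are hypothetical.

```python
import numpy as np


def _joint(pi, cond):
    """Return joint[x, a, b, c] = P(X=x) P(A=a|x) P(B=b|x) P(C=c|x)."""
    return (pi[:, None, None, None]
            * cond[0][:, :, None, None]
            * cond[1][:, None, :, None]
            * cond[2][:, None, None, :])


def expected_table(pi, cond, n):
    """Expected A x B x C counts implied by the model for sample size n."""
    return n * _joint(pi, cond).sum(axis=0)


def fit_lca(counts, n_iter=2000):
    """EM for a 2-class latent class model with three binary indicators.

    counts[a, b, c] is the observed A x B x C table (shape (2, 2, 2)).
    Returns (pi, cond), where pi[x] = P(X=x) and
    cond[j, x, k] = P(indicator j = k | X = x) for j = 0 (A), 1 (B), 2 (C).
    """
    counts = np.asarray(counts, dtype=float)
    pi = np.array([0.5, 0.5])
    # Assumed starting values: class 0 leans "Yes", class 1 leans "No".
    cond = np.tile(np.array([[0.7, 0.3], [0.3, 0.7]]), (3, 1, 1))
    for _ in range(n_iter):
        # E-step: posterior P(X=x | a, b, c), then expected complete-data counts.
        joint = _joint(pi, cond)
        post = joint / joint.sum(axis=0, keepdims=True)
        w = post * counts[None]
        # M-step: update class sizes and conditional response probabilities.
        pi = w.sum(axis=(1, 2, 3)) / counts.sum()
        for j, other_axes in enumerate([(2, 3), (1, 3), (1, 2)]):
            margin = w.sum(axis=other_axes)          # shape (class, category)
            cond[j] = margin / margin.sum(axis=1, keepdims=True)
    return pi, cond
```

With this coding, the estimated false negative rate for indicator A is `cond[0, 0, 1]` (a “No” from a true user) and the false positive rate is `cond[0, 1, 0]` (a “Yes” from a true non-user). In practice several starting points are usually tried, because EM can stop at a local maximum or a boundary solution.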

  10. Some Results (modeling details in Biemer and Wiesen, 2000) • LCA models were fit to three years of data from the National Survey on Drug Use and Health • Several important anomalies were discovered in the estimates of marijuana use • Low-frequency marijuana users tended to answer negatively to the frequency question • Composite variable was subject to false positives as a result of a questionnaire problem that was subsequently corrected

  11. False Positive Error Rates Under Model 1

  12. Estimates of False Negative Error Rates

  13. Frequency of Use for Persons Responding ‘No’ to A • More than 300 days: 5.84

  14. Other Applications Nonsampling Error Research • Identifying flawed questions and other questionnaire problems • Estimating census undercount in a capture-recapture framework • Characterizing respondents, interviewers, and questionnaire elements that contribute to survey error • Adjusting for nonresponse and missing data in surveys

  15. Other Applications (cont’d) Substantive Research • Causal modeling • Log-linear analysis compensating for measurement error • Cluster analysis • Variable reduction and scale construction

  16. Importance of Model Validity Depends Upon the Application • In the previous example, validity was “proven” by ability to identify real questionnaire problems. • In other applications, this type of validation may be quite difficult • Further, LCA methodology is being pushed to adjust the reported survey estimates for misclassification bias. • Unemployment rate • Expenditures • Total population size in a census
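
To make the idea of adjusting an estimate for misclassification bias concrete, here is a minimal sketch of a standard textbook correction for a binary variable. It is not necessarily the adjustment used in any of the applications listed. If p is the observed “Yes” rate, π the true rate, and the false positive and false negative rates are known or estimated (for example from an LCA), then p = π(1 − FN) + (1 − π)FP, which can be solved for π. The function name `adjust_prevalence` is hypothetical.

```python
def adjust_prevalence(observed_rate, false_pos, false_neg):
    """Solve p = pi * (1 - false_neg) + (1 - pi) * false_pos for pi.

    observed_rate: observed proportion classified "Yes" (p)
    false_pos:     P(classified "Yes" | truly "No")
    false_neg:     P(classified "No"  | truly "Yes")
    """
    denom = 1.0 - false_pos - false_neg
    if denom <= 0:
        raise ValueError("error rates too large: 1 - FP - FN must be positive")
    return (observed_rate - false_pos) / denom
```

The adjusted estimate is only as good as the error rates fed into it, which is why the validity of the LCA estimates matters more in this kind of application than in a diagnostic use.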

  17. Some Issues for Future Research Investigating the Validity of LCA Estimates • Robustness of the estimates of classification error probabilities to violations of the model assumptions • Local dependence • Unobserved heterogeneity • Dependent classification errors • Unequal probability sampling • Sample clustering

  18. Some Issues for Future Research (cont’d) • Robustness of the model fit statistics • L² and X² • Convergence problems • Local maxima • Boundary solutions • Bias in the estimates of standard errors of the estimates • Effects of weighting • Clustered samples
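
For reference, the two fit statistics named on this slide are conventionally the likelihood ratio chi-square, L² = 2 Σ n ln(n/m), and the Pearson chi-square, X² = Σ (n − m)²/m, where n are observed and m are model-expected cell counts. The sketch below computes both. It assumes simple random sampling and strictly positive expected counts, so it does not address the weighting and clustering issues raised above. The function name `fit_statistics` is hypothetical.

```python
import numpy as np


def fit_statistics(observed, expected):
    """Return (L2, X2) for observed and model-expected cell counts.

    Cells with an observed count of zero contribute nothing to L2
    (the usual 0 * log 0 = 0 convention) but do contribute to X2.
    """
    observed = np.asarray(observed, dtype=float).ravel()
    expected = np.asarray(expected, dtype=float).ravel()
    nonzero = observed > 0
    l2 = 2.0 * np.sum(observed[nonzero]
                      * np.log(observed[nonzero] / expected[nonzero]))
    x2 = np.sum((observed - expected) ** 2 / expected)
    return l2, x2
```

Both statistics are compared to a chi-square distribution with degrees of freedom equal to the number of cells minus one minus the number of free parameters. With three dichotomous indicators and two classes, that leaves zero degrees of freedom, so the basic model cannot be tested without additional restrictions or grouping variables.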

  19. Some Recent Literature • Asparouhov, T., Muthen & Muthen (2004). “Weighting for Unequal Probability of Selection in Latent Variable Modeling,” Mplus Web Notes: No. 7, Version 3 • Patterson, B., Dayton, M., and Graubard, B. (2002). “Latent Class Analysis of Complex Sample Survey Data: Application to Dietary Data,” JASA, Vol. 97, No. 459, pp. 721-741 • Vermunt, J. and Magidson, J. (2001). “Latent Class Analysis with Sampling Weights,” presented at the Sixth Annual Meeting of the Methodology Section of the American Sociological Association, University of Minnesota • Biemer, P., Brown, G., and Judson, D. (2004). “Robustness of LCA Estimates of Population Size to Model Failure,” unpublished Census Bureau project reports
