Psychometric analyses of ADNI data

This presentation discusses psychometric analyses of ADNI data, including the ADNI neuropsychological battery, latent variable approaches, SEM and IRT methods, and specific analyses of memory and executive functioning. It also explores ways to handle categorical data in SEM.


Presentation Transcript


  1. Psychometric analyses of ADNI data Paul K. Crane, MD MPH Department of Medicine University of Washington

  2. Disclaimer • Funding for this conference was made possible, in part, by Grant R13 AG030995 from the National Institute on Aging. • The views expressed do not necessarily reflect the official policies of the Department of Health and Human Services; nor does mention of trade names, commercial practices, or organizations imply endorsement by the U.S. Government.

  3. Outline • ADNI neuropsychological battery • Latent variable approaches • SEM and IRT • ADAS-Cog in ADNI • Memory in ADNI • Executive functioning in ADNI

  4. ADNI Neuropsychological Battery

  5. Handout • There is a handout that summarizes these tests and provides the variable names for the variables in the dataset

  6. ADNI Neuropsychological Battery

  7. MCI

  8. AD

  9. Alternate Word Lists • There are also two versions of the Rey AVLT that are alternated

  10. Summary • Repeated administration of a rich neuropsychological battery at 6-month intervals for 2 (AD) or 3 (NC, MCI) years • How do we drink from that fire hose?

  11. Strategies for analyzing these data • Pick a couple of tests and ignore the others • ADAS-Cog and MMSE • CDR and CDR-SB • Modifications of those tests • ADAS-Tree • ADAS-Rasch • Composite scores for specific domains • Z score • Something fancier using latent variable approach
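As a minimal sketch of the z-score composite option above (the scores and the four-test layout are invented for illustration, not actual ADNI variables):

```python
import numpy as np

# Hypothetical raw scores on four memory tests (rows = participants).
scores = np.array([
    [12.0, 45.0,  8.0, 3.0],
    [ 9.0, 38.0,  6.0, 2.0],
    [15.0, 52.0, 10.0, 3.0],
    [ 7.0, 30.0,  5.0, 1.0],
])

# Standardize each test against the sample mean and SD, then
# average the z scores to form a domain composite.
z = (scores - scores.mean(axis=0)) / scores.std(axis=0)
composite = z.mean(axis=1)
```

In practice the standardization would often use baseline means and SDs (for example, from the normal controls), so that follow-up scores stay on the baseline metric.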

  12. Outline • ADNI neuropsychological battery • Latent variable approaches • SEM and IRT • ADAS-Cog in ADNI • Memory in ADNI • Executive functioning in ADNI

  13. Latent variable approach • “Items” not intrinsically interesting, only as indicators of the underlying thing measured by the test • Many nice properties follow

  14. Parallel development 1: SEM • “Measurement part” of the model specifies how latent constructs are modeled • “Structural part” of the model specifies relationships between latent constructs and each other, and between latent constructs and other covariates

  15. http://sites.google.com/site/lvmworkshop/home/downloads-general/2010-downloads

  16. Bunch of indicators: limmtotal, avtot1, avtot2, avtot3, avtot4, avtot5, avtotb, avtot6, ldeltotal, avdel30min, avdeltot, cot1scor, cot2scor, cot3scor, cot4totl, mmballdl, mmflagdl, mmtreedl

  17. “Memory” • Underlying single factor with many indicators: limmtotal, avtot1, avtot2, avtot3, avtot4, avtot5, avtotb, avtot6, ldeltotal, avdel30min, avdeltot, cot1scor, cot2scor, cot3scor, cot4totl, mmballdl, mmflagdl, mmtreedl
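A simulation sketch of that single-factor claim (loadings invented for illustration): each indicator is the common factor scaled by a loading plus unique error, which implies that indicator correlations are products of loadings.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# One latent "Memory" factor; each observed indicator is
# loading * factor + unique error.
loadings = np.array([0.8, 0.7, 0.6, 0.75])   # illustrative values
factor = rng.standard_normal(n)
errors = rng.standard_normal((n, 4)) * np.sqrt(1 - loadings**2)
indicators = factor[:, None] * loadings + errors

# Under the model, corr(indicator j, indicator k) ~ loadings[j] * loadings[k]
r = np.corrcoef(indicators, rowvar=False)
```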

  18. Same indicators grouped by source test, all loading on Memory: limmtotal, ldeltotal (LM story); avtot1, avtot2, avtot3, avtot4, avtot5, avtotb, avtot6, avdel30min, avdeltot (Rey word list); cot1scor, cot2scor, cot3scor, cot4totl (ADAS word list); mmballdl, mmflagdl, mmtreedl (MMSE words)

  19. Same model with HC volume added • This is what we care about: the relationship between Memory and HC volume!

  20. Memory* as a single factor without the secondary structure, related to HC volume • This is what we care about! • … and typically don’t care whether memory is modeled this way or with all that secondary structure

  21. Full model again, with HC volume • This is what we care about! • And sometimes we care about it for 600,000 SNPs, or for voxels – we need to move outside of an SEM package for some of the analyses we want to do

  22. Parallel development 2: IRT • Models are nested within SEM • Single factor confirmatory factor analysis model • Initially worked out with binary indicators • Extended in the 1960s to polytomous items (Samejima) • It’s only the measurement part • Attention to the quality of measurement and the quality of scores
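The nesting can be made concrete: under the normal-ogive parameterization, a binary item's standardized factor loading and threshold in a single-factor CFA convert directly to IRT discrimination and difficulty (the numbers below are illustrative).

```python
import math

# Normal-ogive mapping between single-factor CFA parameters for a
# binary item and IRT parameters:
#   discrimination a = lam / sqrt(1 - lam**2)
#   difficulty     b = tau / lam
lam, tau = 0.8, 0.4   # illustrative loading and threshold
a = lam / math.sqrt(1 - lam**2)
b = tau / lam
print(round(a, 3), b)   # 1.333 0.5
```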

  23. Typical SEM example: a single Construct measured by Indicator 1, Indicator 2, Indicator 3, and Indicator 4

  24. Example: a Depression construct measured by the Beck, Zung, CESD, and PHQ-9 depression scales

  25. A closer look at PHQ-9 • A 9-item depression scale • Standard scoring totals up the item scores • A typical SEM model would take that total score and treat it as a continuous indicator by using a linear link (single loading parameter)

  26. PHQ-9 measurement properties

  27. IRT approach to PHQ-9

  28. SEM and IRT, then and now • SEM was initially about total scores as indicators of constructs measured in common across tests • IRT was initially about item level data that had to satisfy assumptions • More recently: merging the strengths of the two approaches • Computational rather than conceptual advances, or maybe computational advances have fueled conceptual advances

  29. Categorical data in SEM • Runmplus code: • That little code snippet tells Mplus to treat all of the elements in the local `vlist' as categorical data • The Mplus default is WLSMV • There are other appropriate ways of handling categorical data • A major reason Mplus is the dominant SEM software used at FH

  30. What about IRT? • Array of tools to address measurement precision • Explicit focus on measurement properties and measurement precision differentiates it from SEM

  31. Pretend for a moment that a single factor model was appropriate… • Item response theory (IRT) was developed in the middle of the last century • Lord and Novick / Birnbaum (1968) • Polytomous extension: Samejima 1969 • Lord 1980 • Hambleton et al. 1991 • XCALIBRE, Parscale, Multilog • All variations on a single factor CFA model

  32. 4 items each at 0.5 increments

  33. Comments on that test • Essentially linear test characteristic curve • Immaterial whether the standard score or the IRT score is used in analyses • No ceiling or floor effect • People at the extremes of the thing measured by the test will get some right and get some wrong • Pretty nice test!

  34. 2 items each at 0.5 increments

  35. Comments on that test • Essentially linear test characteristic curve • Immaterial whether the standard score or the IRT score is used in analyses • No ceiling or floor effect • People at the extremes of the thing measured by the test will get some right and get some wrong • Pretty nice test! • But that’s what we said about the last one and it had twice as many items!

  36. Why might we want twice as many items? • Measurement precision / reliability • CTT: summarized in a single number: alpha • IRT: conceptualized as a quantity that may vary across the range of the test • Information • Mathematical relationship between information and standard error of measurement • Intuitively makes sense that a test with 2x the items will measure more precisely / more reliably than a test with 1x the items
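That intuition can be checked directly: if each item contributes the same information at a given θ, doubling the item count doubles information, so the standard error shrinks by a factor of √2 (the per-item information value below is arbitrary).

```python
import math

i_item = 0.5                          # illustrative per-item information
se_short = 1 / math.sqrt(4 * i_item)  # 4-item test
se_long = 1 / math.sqrt(8 * i_item)   # 8-item test: twice the items
print(round(se_short / se_long, 3))   # 1.414, i.e. sqrt(2)
```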

  37. Test information curves for those two tests

  38. Standard errors of measurement for the two tests

  39. Comments about these information and SEM curves • Information curves look more different than the SEM curves • Inverse square root relationship • TIC 100 → SEM 0.10 (1/10) • TIC 25 → SEM 0.20 (1/5) • TIC 16 → SEM 0.25 (1/4) • TIC 9 → SEM 0.33 (1/3) • TIC 4 → SEM 0.50 (1/2) • Trade off between test length and measurement precision
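The inverse square root relationship in those bullets is easy to verify:

```python
import math

# Standard error of measurement = 1 / sqrt(test information)
for tic in [100, 25, 16, 9, 4]:
    print(tic, round(1 / math.sqrt(tic), 2))
```

This prints 0.10, 0.20, 0.25, 0.33, and 0.50, matching the slide.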

  40. These were highly selected “tests” • It would be possible to design such a test if we started with a robust item pool • Almost certainly not going to happen by accident / history • What are more realistic tests?

  41. Test characteristic curves for two 26-item dichotomous tests

  42. Comments on these TCCs • Same number of items but very different shapes • Now it may matter whether you use an IRT score or a standard score in analyses • Both ceiling and floor effects
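Those shapes can be sketched with two hypothetical 26-item tests, one with difficulties bunched in the middle and one spread widely (all item parameters invented for illustration):

```python
import numpy as np

def p_2pl(theta, a, b):
    """2PL probability of a correct response."""
    return 1 / (1 + np.exp(-1.7 * a * (theta - b)))

theta = np.linspace(-4, 4, 81)
b_narrow = np.linspace(-1, 1, 26)   # difficulties bunched in the middle
b_wide = np.linspace(-3, 3, 26)     # difficulties spread widely

# The test characteristic curve (TCC) is the expected total score:
# the sum of the 26 item response curves.
tcc_narrow = sum(p_2pl(theta, 1.0, b) for b in b_narrow)
tcc_wide = sum(p_2pl(theta, 1.0, b) for b in b_wide)
```

The bunched test saturates toward its ceiling of 26 much sooner, so its standard score is a markedly nonlinear function of θ at the extremes.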

  43. TICs

  44. SEMs

  45. Comments on the TICs and SEMs • Comparing the red test and the blue test: the red test is better for people of moderate ability (more items close to where they are) • For people right in the middle, measurement precision is just as good as a test twice as long • Items far away from your ability level don’t help your standard error • The blue test is better for people at the extremes (more items close to where they are)

  46. Where do information curves come from? • Item information curves use the same parameters as the item characteristic curves (difficulty level, b, and strength of association with latent trait or ability, a) (see next slides) • Test information is the sum of all of the item information curves • We can do that because of local independence

  47. I(θ) = D²a² · P(θ) · Q(θ)

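A direct implementation of that formula, with invented item parameters; local independence is what licenses summing item information into test information:

```python
import numpy as np

D = 1.7  # scaling constant for the logistic approximation to the normal ogive

def p_2pl(theta, a, b):
    return 1 / (1 + np.exp(-D * a * (theta - b)))

def item_info(theta, a, b):
    """I(theta) = D**2 * a**2 * P(theta) * Q(theta) for a 2PL item."""
    p = p_2pl(theta, a, b)
    return D**2 * a**2 * p * (1 - p)

theta = np.linspace(-3, 3, 121)
items = [(1.2, -1.0), (1.0, 0.0), (0.8, 1.0)]   # illustrative (a, b) pairs

# Test information sums the item information curves; the standard
# error of measurement is its inverse square root.
tic = sum(item_info(theta, a, b) for a, b in items)
sem = 1 / np.sqrt(tic)
```

Each item's information peaks at θ = b, where P(θ) = Q(θ) = 0.5, so the maximum is D²a²/4.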
