
Accuracy, Reliability, and Validity of Freesurfer Measurements




  1. Accuracy, Reliability, and Validity of Freesurfer Measurements David H. Salat salat@nmr.mgh.harvard.edu

  2. Why Talk About This? • This is not meant to imply that everything is perfect in FreeSurfer processing. • The information here should be used as a guide for how to assess the data in your own projects. • These are general theories that apply to all types of data: structural, functional, cognitive, etc.

  3. What is Accuracy? • Accuracy: the degree of closeness of a measured or calculated quantity to its actual (true) value (e.g. a physical property such as length or thickness) • MRI measures are indirect. We may be able to measure morphometry accurately given the contrast of the MR image; however, measurements based on this contrast may differ from the actual tissue properties.

  4. What is Reliability? • Reliability: the consistency of measures obtained for the same individual on two different trials, typically close together in time to avoid a biological influence on the reliability measure • Reliability of a labeling procedure in the same scan (e.g. hippocampus; usually for manual labeling) • Reliability of the labeling procedure in the same subject on two different scans collected on the same scanner (automated procedures) • Reliability of the labeling procedure in the same subject on two different scans collected on two different scanners (multi-site studies) • Effect reliability: replication of the experiment in an independent sample.
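A minimal sketch of how scan-rescan reliability could be quantified for one measure. The thickness values are hypothetical, and the two statistics shown (percent difference and the intraclass correlation ICC(2,1) of Shrout and Fleiss) are common choices assumed here for illustration, not necessarily the ones used in the studies cited in this talk.

```python
from statistics import mean

def percent_difference(scan1, scan2):
    """Absolute scan-rescan difference as a percent of the two-scan mean."""
    return [abs(a - b) / ((a + b) / 2) * 100 for a, b in zip(scan1, scan2)]

def icc_2_1(scan1, scan2):
    """ICC(2,1): two-way random effects, absolute agreement, single measure."""
    n, k = len(scan1), 2
    rows = list(zip(scan1, scan2))
    grand = mean(list(scan1) + list(scan2))
    # Sums of squares for subjects (rows), sessions (columns), and residual.
    ss_rows = k * sum((mean(r) - grand) ** 2 for r in rows)
    ss_cols = n * sum((mean(c) - grand) ** 2 for c in (scan1, scan2))
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical mean cortical thickness (mm), five subjects scanned twice.
scan1_mm = [2.50, 2.31, 2.72, 2.10, 2.45]
scan2_mm = [2.48, 2.35, 2.70, 2.14, 2.41]

print([round(d, 2) for d in percent_difference(scan1_mm, scan2_mm)])
print(round(icc_2_1(scan1_mm, scan2_mm), 3))
```

The same two functions apply whether the second scan comes from the same scanner or a different one; only the interpretation of the session effect changes.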

  5. What is Validity? • Validity: the extent to which an indirect measurement is representative of what it is supposed to measure. • For example, in fMRI we use blood flow as an indirect measure of neural activity. Is this a valid measure of neural activity?

  6. Validity Examples • Internal validity: Strength of the overall experimental design, study sample size, analysis procedures, etc.? • External validity: Generalize to another sample? (replication) • Ecological validity: Applied in the real world outside of the experimental setting? (clinical application) • Construct validity: Totality of evidence? (do the data fit with what is known?) • Convergent validity: Correlation with other types of measures that it should theoretically be correlated with? (do the data correlate with ‘gold standards’) • Discriminant validity: Not correlated with measures it should not be correlated with? (intracranial volume/age)
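Convergent and discriminant validity can both be checked with a simple correlation. The sketch below uses hypothetical numbers: an automated measure is compared against a manual 'gold standard' (it should correlate) and against a variable assumed to be unrelated (it should not). The variable names and values are illustrative only.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data for six subjects.
automated_mm = [2.1, 2.3, 2.5, 2.7, 2.9, 3.1]     # automated thickness
manual_mm = [2.0, 2.35, 2.45, 2.8, 2.85, 3.2]     # manual 'gold standard'
unrelated_measure = [5, 1, 4, 2, 6, 3]            # assumed unrelated variable

convergent_r = pearson_r(automated_mm, manual_mm)
discriminant_r = pearson_r(automated_mm, unrelated_measure)
print(round(convergent_r, 2), round(discriminant_r, 2))
```

A high first value together with a near-zero second value is the pattern that supports validity; either one alone is weaker evidence.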

  7. Types of Error • Random error: Unknown and unpredictable changes in the measurement • Should be unbiased • Accuracy, reliability, and validity are all limited by error • Systematic error: Predictable offset or scaling of data • Typically comes from some more obvious aspect of the data acquisition/analysis (e.g. there is a global offset in values at 3T relative to 1.5T; this must be considered when combining data across scanners) • Can potentially be identified and corrected
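The distinction between the two error types can be sketched with paired measurements of the same subjects. The values below are hypothetical, and treating the systematic error as a constant offset (rather than a scaling) is an assumption of this example: the mean paired difference estimates the offset, and the standard deviation of the differences estimates the random scatter that remains after correction.

```python
from statistics import mean, stdev

def split_error(reference, measured):
    """Return (systematic offset, random scatter) for paired measurements."""
    diffs = [m - r for r, m in zip(reference, measured)]
    return mean(diffs), stdev(diffs)

def remove_offset(measured, offset):
    """Subtract a constant offset; this cannot reduce the random scatter."""
    return [m - offset for m in measured]

# Hypothetical thickness (mm) for the same five subjects at two field strengths.
thickness_1p5t = [2.40, 2.55, 2.31, 2.62, 2.48]
thickness_3t = [2.52, 2.64, 2.45, 2.71, 2.60]

offset, scatter = split_error(thickness_1p5t, thickness_3t)
corrected_3t = remove_offset(thickness_3t, offset)
residual_offset, residual_scatter = split_error(thickness_1p5t, corrected_3t)
print(round(offset, 3), round(scatter, 3))
```

After correction the residual offset is zero by construction while the scatter is unchanged, which is why systematic error can be corrected and random error cannot.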

  8. Why is this important? • Sensitivity: Poor reliability increases variance across individuals and across timepoints. • Many studies would benefit from the ability to measure minute changes across time. • Interpretation: Validity is directly tied to interpretation. You may have a valid measure of ‘cortical thickness’, but ‘cortical thickness’ might not be a valid measure of degeneration
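One way to see the link between reliability and sensitivity is the attenuation formula from classical test theory, which is brought in here as an assumption and is not part of the original slides: the correlation you can observe between two measures is the true correlation scaled by the square root of the product of their reliabilities. The numbers are hypothetical.

```python
from math import sqrt

def attenuated_correlation(true_r, reliability_x, reliability_y):
    """Expected observed correlation given the reliability of each measure."""
    return true_r * sqrt(reliability_x * reliability_y)

# Hypothetical true brain-behavior correlation of 0.5.
high_reliability = attenuated_correlation(0.5, 0.95, 0.90)
low_reliability = attenuated_correlation(0.5, 0.60, 0.90)
print(round(high_reliability, 3), round(low_reliability, 3))
```

The lower the reliability of either measure, the smaller the effect that remains to be detected, regardless of how strong the underlying relationship is.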

  9. Cortical Reconstruction / Subcortical Segmentation (recon-all) [Workflow diagram: original T1 data is processed by recon-all into two kinds of output data. Surfaces (computer models): thickness, aparc, curv, sulc, jacobian; visualization with tksurfer. Volumes (labeled MRI images): segmentation, parcellation, white matter parcellation; visualization with tkmedit. Both paths support individual-subject region of interest analysis (spreadsheet, stat software) and group comparisons/statistics (Qdec, mri_glmfit).]

  10. Accuracy and Validity of Spherical Averaging for Labeling Structural and Functional Anatomy Use of folding patterns to align subjects. Alternative to Talairach/MNI. Fischl et al., 1999

  11. Anatomic Labeling Matching a manual anatomic label of the central sulcus across individuals. Bruce will talk about matching cytoarchitectonics. Fischl et al., 1999

  12. Functional Labeling Matching functional retinotopic labels of the visual fields across individuals. Fischl et al., 1999

  13. Enhanced fMRI Statistical Power Averaging functional data across subjects on a cognitive task. Fischl et al., 1999

  14. Cortical Thickness (Results fall within expected range) • Consistent with published findings: • crowns of gyri are thicker than the fundi of sulci • sensory areas are among the thinnest in the cortex. Fischl et al., 1999

  15. Values match manual measurements from published imaging data Fischl et al., 1999

  16. Manual Measurements • Age effects with automated procedures replicated with manual measures • Can only be done in regions where folds are appropriate • Calcarine values are also consistent across studies (different scanners) [Figure panels: Calcarine, Orbitofrontal] Salat et al., 2004; Kuperberg et al., 2003

  17. Cortical Thickness Comparison with Postmortem Measures Rosas et al., 2002

  18. Subcortical/Volumetric Segmentation: Automated measures are similar in size and region to manual measures, and predict who will develop AD Fischl et al., 2002

  19. Cortical Parcellations: Compared to Manually Labeled Data • 1 volume and 2 surface based labeling schemes • Percent of subjects labeled correctly at each location across the surface. [Figure panels: Volume Atlas, Surface Atlas, Surface Atlas 2] Fischl et al., 2004; Desikan et al., 2006

  20. White matter Parcellation: same subjects scanned at different times • Most regions within 5% Salat et al., 2008

  21. Comparison across time, scanner, field strength, number of scans, sequence type, scanner upgrade, and scanner manufacturer Han et al., 2006

  22. Effects of Pulse Sequence, Voxel Geometry/resolution, and Parallel Imaging • Wonderlick et al., 2009: Parallel acceleration, increased spatial resolution, high bandwidth multiecho sequence. • Reliability high across imaging parameters. • Significant measurement bias observed between MPR and all isotropic sequences for all cortical regions and some subcortical structures. • Improvements in MRI acquisition technology do not compromise data reproducibility, but consistency should be maintained. • Jovicich et al., 2009: Averaging multiple acquisitions, B1 correction, acquisition sequence (MPRAGE vs. multi-echo-FLASH), scanner upgrades (Sonata-Avanto, Trio-TrioTIM), segmentation atlas (MPRAGE or multi-echo-FLASH) • Minimally affected by different manipulations • Volume measurements across platforms (Siemens Sonata vs. GE Signa) and field strengths (1.5 T vs. 3 T) result in bias, but with variance comparable to that within a scanner • Multi-site studies may not necessarily require a much larger sample to detect a specific effect.
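The sample-size point can be made concrete with the standard normal-approximation formula for comparing two group means, n per group = 2 (z_alpha + z_power)^2 sd^2 / effect^2. This is a rough sketch with hypothetical numbers; the effect size and standard deviations are not taken from the cited studies.

```python
from math import ceil
from statistics import NormalDist

def subjects_per_group(effect, sd, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 * sd ** 2 / effect ** 2)

# Hypothetical: detect a 0.1 mm group difference in thickness.
single_site_n = subjects_per_group(effect=0.1, sd=0.15)
multi_site_n = subjects_per_group(effect=0.1, sd=0.16)  # slightly larger sd
print(single_site_n, multi_site_n)
```

Because the required n scales with the variance, a cross-scanner variance that is only slightly larger than the within-scanner variance adds only a few subjects per group, provided the between-scanner bias is handled separately.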

  23. Replication of Study Results: Split Sample • Concordant results are likely not due to statistical error • Current study with 5 samples used in prior literature assessing the replicability of cortical/subcortical (Fjell et al., 2009; Walhovd et al., 2009) Salat et al., 2004

  24. Replicable WM Parcellation results across sex and hemisphere [Figure panels: Men, Women] Salat et al., 2008

  25. Replication of Effects in Same Participants Across Scanning Conditions Dickerson et al., 2008

  26. Consistent Findings Across 4 samples Used To Identify Regions with Predictive Validity • Regional measures predict who will progress to AD. Dickerson et al., 2008

  27. Conclusions • Any tool used for MR analysis should be rigorously tested for accuracy, reliability, and validity • Most of the measures from Freesurfer have good accuracy, reliability, and validity across a range of conditions • These results are dependent on optimal input data and correct implementation • These data provide confidence, but do not substitute for using similar procedures to check data from each new study

  28. Cross Sequence Parameters • Different pairs of flip angles can be used for reliable measures Fischl et al., 2004
