
fBIRN “Human Phantom” Reproducibility Analysis

fBIRN “Human Phantom” Reproducibility Analysis. Functional Imaging Research of Schizophrenia Testbed (FIRST) Biomedical Informatics Research Network (BIRN) Grant Support: NCRRP41RR13218. fBIRN, March 3-5, 2004, Irvine, CA. Author List. Kelly H. Zou, PhD, Douglas N. Greve, PhD,


Presentation Transcript


  1. fBIRN “Human Phantom” Reproducibility Analysis. Functional Imaging Research of Schizophrenia Testbed (FIRST), Biomedical Informatics Research Network (BIRN). Grant Support: NCRR P41RR13218. fBIRN, March 3-5, 2004, Irvine, CA

  2. Author List: Kelly H. Zou, PhD, Douglas N. Greve, PhD, Meng Wang, MSE, Steven D. Pieper, PhD, Simon K. Warfield, PhD, Nathan S. White, BS, Mark G. Vangel, PhD, Ron Kikinis, MD, William M. Wells III, PhD, and FIRST BIRN. Affiliations: Surgical Planning Laboratory, Brigham and Women’s Hospital; Department of Health Care Policy, Harvard Medical School; Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital; Artificial Intelligence Laboratory, Massachusetts Institute of Technology; Computational Radiology Laboratory, Brigham and Women's Hospital; Functional Imaging Research of Schizophrenia Testbed (FIRST), Biomedical Informatics Research Network (BIRN). Grant Support: NCRR P41RR13218

  3. Study Outline • Multi-Site BIRN Study: 11 sites (MN, UCI, UNC, UCLA…, BWH, MGH) • 5 healthy males as “Human Phantoms” • 2 visits on separate days per site per subject, plus 2 extra visits at one site for 3 of the 5 subjects • 4 Sensory Motor (SM), 2 cognitive (Cog), and 2 breath-hold (BH) runs per visit

  4. Participating Institutions

  5. Goals • Purposes: Is it meaningful to pool the data in order to yield a larger sample size in the next-phase clinical study (schizophrenic subjects vs. normal controls)? How can the differences due to various factors be calibrated?

  6. Introduction • Main problem (Pooling): How to combine multi-site data and to validate the pooling mechanism? • Current problem (Calibration): How to compare and combine human phantom data in the calibration step? https://share.spl.harvard.edu/users/zou

  7. SM Task • Focus on reproducibility of the SM task only • Subjects perform bilateral finger tapping on button boxes (1 dummy button box and 1 actual) in time with a 3 Hz audio cue and a flashing checkerboard square • Subjects press buttons 1 through 4 in consecutive order and then back again, using both hands simultaneously and in sync • Time frames = 85; scan time = 4:06; Days 1 and 2

  8. Scanning Protocol

  9. Data Examined and Compared • Task: Sensory Motor • Site: 5 sites with 1.5T, 4 with 3T, 1 with 4T • Subject ID: #101, 103, 104, 105, 106 • Run: 4 runs, registered and later combined by EM • Day: #101, 103, and 106 were tested on 4 days at Stanford; the other subjects were tested on 2 days per site • Threshold (B-float data): $-\log_{10}(p\text{-value}) \times \operatorname{sign}(F\text{-statistic})$ = 5, 7, and 9

  10. Image Registration • Registration was performed over the repeated runs and across the sites in FreeSurfer • Voxel-to-voxel registration of the anatomical volume with the functional volume was conducted to convert the subject's anatomical volume to the corresponding functional space using a transformation matrix • An algorithm following the TkRegister matrix convention was applied

  11. Image Registration • TkRegister defines the registration matrix to be:

$$
T = \begin{pmatrix}
-d_c & 0 & 0 & (N_c/2)\,d_c \\
0 & 0 & d_s & -(N_s/2)\,d_s \\
0 & -d_r & 0 & (N_r/2)\,d_r \\
0 & 0 & 0 & 1
\end{pmatrix}
$$

where $d_c$, $d_r$, and $d_s$ are the resolutions, and $N_c$, $N_r$, and $N_s$ are the numbers of columns, rows, and slices, respectively.
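As a minimal sketch of the matrix on this slide, the following builds T with NumPy and applies it to a voxel index. The function name, the argument names, and the example volume size are hypothetical and are not taken from the presentation; only the matrix entries follow the slide.

```python
import numpy as np

def tkregister_matrix(n_cols, n_rows, n_slices, d_c, d_r, d_s):
    """Build the 4x4 matrix T from the slide.

    T acts on homogeneous voxel indices (col, row, slice, 1).
    d_c, d_r, d_s are the voxel resolutions; n_cols, n_rows, n_slices
    are the numbers of columns, rows, and slices.
    """
    return np.array([
        [-d_c, 0.0, 0.0, (n_cols / 2) * d_c],
        [0.0, 0.0, d_s, -(n_slices / 2) * d_s],
        [0.0, -d_r, 0.0, (n_rows / 2) * d_r],
        [0.0, 0.0, 0.0, 1.0],
    ])

# Hypothetical functional volume: 64 x 64 in-plane, 35 slices (assumed values).
T = tkregister_matrix(64, 64, 35, d_c=3.4375, d_r=3.4375, d_s=4.0)

# The voxel at the center of the volume maps to the coordinate origin.
center = T @ np.array([32.0, 32.0, 17.5, 1.0])
```

With this convention the translation column cancels exactly at the volume center, which is why `center` comes out at the origin regardless of the resolutions chosen.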

  12. Statistical Variables • Factors affecting the activation patterns:

  13. Statistical Methods •Selection of Threshold: •The activation threshold was selected on the scale of the activation (“B-float”) data •The map was further standardized using the absolute value for each voxel, prior to statistical inferences
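The thresholding described on slides 9 and 13 can be sketched as follows, assuming the B-float map stores −log10(p) multiplied by the sign of the test statistic and that a voxel counts as active when the absolute value reaches the threshold. The function names and the choice of `>=` rather than `>` are assumptions made for illustration, not details from the presentation.

```python
import numpy as np

def signed_log10_p(p_values, statistic):
    """Signed significance map: -log10(p) times the sign of the statistic."""
    # Clip to avoid log10(0) for p-values that underflow to zero.
    p = np.clip(np.asarray(p_values, dtype=float), 1e-300, 1.0)
    return -np.log10(p) * np.sign(statistic)

def activation_mask(signed_map, threshold):
    """Binary activation map: absolute significance at or above the threshold."""
    return np.abs(signed_map) >= threshold

# Toy values for four voxels (illustrative only, not study data).
p_vals = np.array([1e-10, 1e-6, 0.5, 1e-8])
stat = np.array([2.0, -3.0, 1.0, -4.0])
signed = signed_log10_p(p_vals, stat)

# One binary map per threshold used in the slides.
masks = {thr: activation_mask(signed, thr) for thr in (5, 7, 9)}
```

Taking the absolute value first means strongly negative and strongly positive voxels are both retained, which matches the standardization step named on the slide.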

  14. STAPLE EM Flow Chart

  15. Estimation problem • Complete data density: binary ground truth $T_i$ for each voxel $i$. Expert $j$ makes segmentation decisions $D_{ij}$. Expert performance is characterized by sensitivity $p_j$ and specificity $q_j$. • We observe the expert decisions $D$. If we knew the ground truth $T$, we could construct maximum likelihood estimates for each expert’s sensitivity (true positive fraction) and specificity (true negative fraction): $\hat{p}_j = \sum_i T_i D_{ij} / \sum_i T_i$ and $\hat{q}_j = \sum_i (1 - T_i)(1 - D_{ij}) / \sum_i (1 - T_i)$.
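When the ground truth is known, the sensitivity and specificity estimates reduce to simple counting. The sketch below illustrates that case, assuming decisions are stored as a voxels-by-experts binary array; the function name and array layout are hypothetical.

```python
import numpy as np

def sens_spec_given_truth(truth, decisions):
    """Per-expert sensitivity and specificity when the truth is known.

    truth:     shape (n_voxels,), binary ground truth T_i
    decisions: shape (n_voxels, n_experts), binary decisions D_ij
    """
    T = np.asarray(truth, dtype=float)
    D = np.asarray(decisions, dtype=float)
    sensitivity = (T @ D) / T.sum()                    # true positive fraction
    specificity = ((1 - T) @ (1 - D)) / (1 - T).sum()  # true negative fraction
    return sensitivity, specificity

# Toy example: 5 voxels, 2 experts (illustrative values only).
truth = [1, 1, 1, 0, 0]
decisions = [[1, 1], [1, 1], [0, 1], [0, 1], [0, 0]]
p_hat, q_hat = sens_spec_given_truth(truth, decisions)
```

In the toy example the first expert misses one true voxel and the second expert marks one false voxel, so each has exactly one imperfect rate.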

  16. Expectation-Maximization • General procedure for estimation problems that would be simplified if some missing data were available. • Key requirements are specification of: the complete data; the conditional probability density of the hidden data given the observed data. • Observable data $D$ • Hidden data $T$, with probability density $f(T \mid D, \theta)$, where $\theta$ denotes the model parameters (here, the experts' sensitivities and specificities) • Complete data $(D, T)$

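  17. Expectation-Maximization • Solve the incomplete-data log likelihood maximization problem • E-step: estimate the conditional expectation of the complete-data log likelihood function, given the observed data and the current parameter estimates. • M-step: maximize this expected log likelihood with respect to the parameters.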
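The E-step and M-step can be made concrete with a minimal sketch of a binary STAPLE-style EM loop. It follows the standard published form of the algorithm under stated assumptions: a fixed prior probability of activation, identical starting values for every rater, and a fixed number of iterations. The function name and all default values are hypothetical and are not the authors' implementation.

```python
import numpy as np

def staple_binary(decisions, prior=None, n_iter=50, init=0.9, eps=1e-12):
    """Estimate a probabilistic truth map and per-rater performance.

    decisions: shape (n_voxels, n_raters), binary decisions D_ij
    Returns (W, p, q): posterior P(T_i = 1 | D), sensitivities, specificities.
    """
    D = np.asarray(decisions, dtype=float)
    n_raters = D.shape[1]
    # Assumption: if no prior is given, use the overall fraction of positives.
    gamma = D.mean() if prior is None else prior
    gamma = float(np.clip(gamma, eps, 1 - eps))
    p = np.full(n_raters, init)
    q = np.full(n_raters, init)
    for _ in range(n_iter):
        # E-step: posterior probability that each voxel is truly active,
        # computed in the log domain for numerical stability.
        log_a = np.log(gamma) + D @ np.log(p) + (1 - D) @ np.log(1 - p)
        log_b = np.log(1 - gamma) + (1 - D) @ np.log(q) + D @ np.log(1 - q)
        W = 1.0 / (1.0 + np.exp(np.clip(log_b - log_a, -700, 700)))
        # M-step: weighted versions of the known-truth estimates.
        p = np.clip((W @ D) / (W.sum() + eps), eps, 1 - eps)
        q = np.clip(((1 - W) @ (1 - D)) / ((1 - W).sum() + eps), eps, 1 - eps)
    return W, p, q
```

In the setting of this study, each column of `decisions` would correspond to one thresholded run rather than one human expert, and `W` would play the role of the composite activation map. Starting both rates above 0.5 steers the loop away from the label-swapped solution in which "active" and "inactive" trade places.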

  18. Statistical Methods • Level 1 STAPLE EM: an EM algorithm, Simultaneous Truth and Performance Level Estimation (STAPLE), was applied across the 4 runs to optimally derive a composite 3D gold standard activation map. This algorithm combined all of the factors and enabled visualization of the gold standard in 3D Slicer • Level 2 STAPLE EM: a further EM algorithm was applied to compare site-to-site differences • P-values were computed via linear models
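The slides do not specify which linear models were fitted. As one simple possibility, a single-factor linear model (one-way ANOVA) tests whether a performance measure such as sensitivity differs across the levels of one factor. The sketch below uses `scipy.stats.f_oneway`; the helper name and the idea of testing one factor at a time are assumptions made for illustration.

```python
import numpy as np
from scipy.stats import f_oneway

def factor_p_value(values, labels):
    """F statistic and p-value for the effect of one categorical factor.

    values: per-observation measure (e.g. estimated sensitivity)
    labels: factor level of each observation (e.g. subject ID or site)
    """
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    # One group of observations per factor level.
    groups = [values[labels == level] for level in np.unique(labels)]
    result = f_oneway(*groups)
    return float(result.statistic), float(result.pvalue)
```

A model with several factors at once (subject, site, run, visit, field strength) would need a multi-factor fit rather than repeated one-way tests; this sketch only shows the single-factor case.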

  19. Significant Factors • Statistically significant factors: for sensitivity, subject (p=0.01); for specificity, subject (p=0.04) and run (p=0.04)

  20. Mean Activation Percentage

  21. Mean Sensitivity

  22. Mean Specificity

  23. STAPLE EM Over the 4 Runs MGH 104; Visit 1; 2D: Slice #18

  24. STAPLE EM Over the 4 Runs • Visit 1: 3D activation map of the 4 runs, combined by EM (threshold = 9); MGH 103

  25. STAPLE EM Over the 4 Runs • Visit 1: 3D activation map of the 4 runs, combined by EM (threshold = 9); MGH 103

  26. STAPLE EM Over the 4 Runs • Visit 2: 3D activation map of the 4 runs, combined by EM (threshold = 9); MGH 103

  27. STAPLE EM Over the 4 Runs • Visit 2: 3D activation map of the 4 runs, combined by EM (threshold = 9); MGH 103

  28. Summary • Site vs. Subject: The variability across subjects appeared greater than that across sites • Field Strength: 3T and 4T were better than 1.5T, yielding more activation and less variability in sensitivity and specificity • Runs: There was a non-constant effect over the (registered) runs after the rest and task periods

  29. Summary • Visits: Less activation was observed for Visit 2 than for Visit 1, but the activation was more robust and systematic under different thresholds • Extra Visits (Visits 1-4): For the three subjects scanned on 4 visits at Stanford, less activation was observed on the latter two days, with higher specificity and less variability. Variability across runs was quite high • Site vs. Subject: The variability across subjects appeared greater than that across sites

  30. Conclusion • These findings may help develop a calibration plan to minimize the variability introduced by the sites, ultimately enabling us to pool independent functional data of normal and schizophrenic subjects across different institutions

  31. References: fMRI Reproducibility • Brannen JH, Badie B, Moritz CH, Quigley M, Meyerand ME, Haughton VM: Reliability of functional MR imaging with word-generation tasks for mapping Broca's area. American Journal of Neuroradiology 22 (2001) 1711-1718 • Machielsen WCM, Rombouts SARB, Barkhof F, Scheltens P, Witter MP: fMRI of visual encoding: reproducibility of activation. Human Brain Mapping 9 (2000) 156-164 • Le TH, Hu X: Methods for assessing accuracy and reliability in functional MRI. NMR in Biomedicine 10 (1997) 160-164 • Genovese CR, Noll DC, Eddy WF: Estimating test-retest reliability in fMRI I: statistical methodology. Magnetic Resonance in Medicine 38 (1997) 497-507 • Maitra R, Roys SR, Gullapalli RP: Test-retest reliability estimation of functional MRI data. Magnetic Resonance in Medicine 48 (2002) 62-70

  32. References (continued) • Casey BJ, Cohen JD, O'Craven K, Davidson RJ, Irwin W, Nelson CA, Noll DC, Hu X, Lowe MJ, Rosen BR, Truwitt CL, Turski PA: Reproducibility of fMRI results across four institutions using a spatial working memory task. NeuroImage 8 (1998) 249-261 • Wei XC, Yoo S-S, Dickey CC, Zou KH, Guttmann CRG, Panych LP: Functional MRI of auditory verbal working memory: long-term reproducibility analysis. NeuroImage 21 (2004) 1000-1008. EM-Algorithm • Warfield SK, Zou KH, Wells WM III: Validation of image segmentation and expert quality with an expectation-maximization algorithm. MICCAI 2002, LNCS 2488 (2002) 290-297 • Warfield SK, Zou KH, Wells WM III: Simultaneous Truth and Performance Level Estimation (STAPLE): An Algorithm for the Validation of Image Segmentation. IEEE Transactions on Medical Imaging (2004) In Press

  33. References: Visualization • Gering DT, Nabavi A, Kikinis R, Hata N, O'Donnell LJ, Grimson WE, Jolesz FA, Black PM, Wells WM III: An integrated visualization system for surgical planning and guidance using image fusion and an open MR. Journal of Magnetic Resonance Imaging 13 (2001) 967-975. ROC Analysis • Metz CE, Sherman BA, Shen JH: Maximum likelihood estimation of receiver operating characteristic (ROC) curves from continuously-distributed data. Statistics in Medicine 17 (1998) 1033-1053 • Zou KH, Warfield SK, Bharatha A, Tempany CMC, Kaus M, Haker S, Wells WM III, Jolesz FA, Kikinis R: Statistical validation of image segmentation quality based on a spatial overlap index. Academic Radiology 11 (2004) 178-189 • Zou KH, Warfield SK, Fielding JR, Tempany CM, Wells WM III, Kaus MR, Jolesz FA, Kikinis R: Statistical validation based on parametric receiver operating characteristic analysis of continuous classification data. Academic Radiology 10 (2003) 1359-1368

  34. References (continued) • Zou KH, Wells WM III, Kikinis R, Warfield SK: Three validation metrics for automated probabilistic image segmentation of brain tumors. Statistics in Medicine 23 (2004) In Press • Zou KH, Hall WJ, Shapiro DE: Smooth non-parametric receiver operating characteristic (ROC) curves for continuous diagnostic tests. Statistics in Medicine 16 (1997) 2143-2156. fMRI Threshold Selection • Genovese CR, Lazar NA, Nichols T: Thresholding of statistical maps in functional neuroimaging using the false discovery rate. NeuroImage 15 (2002) 870-878. https://share.spl.harvard.edu/users/zou
