1 / 53

Model-driven statistical analysis of fMRI data

Model-driven statistical analysis of fMRI data. Keith Worsley Department of Mathematics and Statistics, Brain Imaging Centre, Montreal Neurological Institute, McGill University www.math.mcgill.ca/keith. References.

dior
Download Presentation

Model-driven statistical analysis of fMRI data

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Model-driven statistical analysis of fMRI data Keith Worsley Department of Mathematics and Statistics, Brain Imaging Centre, Montreal Neurological Institute, McGill University www.math.mcgill.ca/keith

  2. References • Worsley et al. (2002). A general statistical analysis for fMRI data. NeuroImage, 15:1-15. • Liao et al. (2002). Estimating the delay of the response in fMRI data. NeuroImage, 16:593-606. • FMRISTAT: MATLAB package from www.math.mcgill.ca/keith/fmristat

  3. First scan of fMRI data Highly significant effect, T=6.59 1000 hot 890 rest 880 870 warm 500 0 100 200 300 No significant effect, T=-0.74 820 hot 0 rest 800 T statistic for hot - warm effect warm 5 0 100 200 300 Drift 810 0 800 790 -5 0 100 200 300 Time, seconds fMRI data: 120 scans, 3 scans each of hot, rest, warm, rest, hot, rest, … T = (hot – warm effect) / S.d. ~ t110 if no effect

  4. Exploring the data: PCA of time  space Temporal components (sd, % variance explained) 1 105.7, 77.8% 2 26.1, 4.8% Component 3 15.8, 1.7% 4 14.8, 1.5% 0 20 40 60 80 100 120 Frame Spatial components 1 2 Component 3 4 0 2 4 6 8 10 Slice 1: exclude first frames 2: drift 3: long-range correlation or anatomical effect: remove by converting to % of brain 4: signal?

  5. Modeling the data: Choices … • Time domain / frequency domain? • AR / ARMA / state space models? • Linear / non-linear time series model? • Fixed HRF / estimated HRF? • Voxel / local / global parameters? • Fixed effects / random effects? • Frequentist / Bayesian? Compromise: Simple, general, valid, robust, fast statistical analysis

  6. Covariates example: pain perception Alternating hot and warm stimuli separated by rest (9 seconds each). 2 warm warm hot hot 1 0 -1 0 50 100 150 200 250 300 350 Hemodynamic response function: difference of two gamma densities 0.4 0.2 0 -0.2 0 50 Responses = stimuli * HRF, sampled every 3 seconds 2 1 0 -1 0 50 100 150 200 250 300 350 Time, seconds

  7. Linear model for fMRI time series with AR(p) correlated errors • Linear model: • ? ? • Yt = (stimulust * HRF) b + driftt c + errort • AR(p) errors: • ? ? ? • errort = a1 errort-1 + … + ap errort-p + s WNt • ‘White Noise’ unknown parameters

  8. 0.3 0.2 0.1 0 -0.1 First step: estimate the autocorrelation ? AR(1) model: errort = a1errort-1 + s WNt • Fit the linear model using least squares • errort = Yt – fitted Yt • â1 = Correlation ( errort , errort-1) • Estimating errort’schanges their correlation structure slightly, so â1 is slightly biased: Raw autocorrelation Smoothed 15mm Bias corrected â1 ~ -0.05 ~ 0

  9. Hot - warm effect, % Sd of effect, % 1 0.25 0.2 0.5 0.15 0 0.1 -0.5 0.05 -1 0 T = effect / sd, 110 df 6 4 T > 4.93 (P < 0.05, corrected) 2 0 -2 -4 -6 Second step: pre-whiten, refit the linear model Pre-whiten: Yt* = Yt – â1 Yt-1, then refit using least squares:

  10. Higher order AR model? Try AR(3): No correlation biases T up ~12%  more false positives a a a 1 2 3 0.3 0.2 AR(1) seems to be adequate 0.1 0 … has little effect on the T statistics: -0.1 AR(1) AR(2) AR(3) 5 0 -5

  11. Results from 4 runs on the same subject

  12. Mixed effects linear modelfor combining effects from different runs/sessions/subjects: • Ei = effect for run/session/subject i • Si = standard error of effect • Mixed effects model: Ei = covariatesi c + Si WNiF +  WNiR }from Lin. Mod. ? ? Usually 1, but could add group, treatment, age, sex, ... ‘Fixed effects’ error, due to variability within the same run Random effect, due to variability from run to run

  13. REML estimation using the EM algorithm • Slow to converge (10 iterations by default). • Stable (maintains estimate 2 > 0 ), but • 2 biased if 2 (random effect) is small, so: • Re-parameterize the variance model: Var(Ei) = Si2 + 2 = (Si2 –minj Sj2) + (2 +minj Sj2) = Si*2 + *2 • 2 = *2 –minj Sj2 (less biased estimate) ^ ^ ? ? ^ ^

  14. Problem: 4 runs, 3 df for random effects sd  ... Run 1 Run 2 Run 3 Run 4 MULTISTAT 1 Effect, 0 E i -1 0.2 Sd, S 0.1 i 0 5 T stat, 0 E / S i i -5 … very noisy sd: … and T>15.96 for P<0.05 (corrected): … so no response is detected …

  15. Solution: Spatial regularization of the sd • Basic idea: increase df by spatial smoothing (local pooling) of the sd. • Can’t smooth the random effects sd directly, - too much anatomical structure. • Instead, • random effects sd • fixed effects sd • which removes the anatomical structure before smoothing.  ) sd= smooth  fixed effects sd

  16. ^ Average Si  Random effects sd, 3 df Fixed effects sd, 440 df Mixed effects sd, ~100 df 0.2 0.15 0.1 0.05 0 divide multiply Random sd / fixed sd Smoothed sd ratio 1.5 random effect, sd ratio ~1.3 1 0.5

  17. 400 300 200 100 0 0 20 40 Infinity Effective df depends on smoothing FWHMratio2 3/2 FWHMdata2 e.g. dfrandom = 3, dffixed = 4  110 = 440, FWHMdata = 8mm: = + dfratio = dfrandom(2 + 1) 1 1 1 dfeffdfratiodffixed fixed effects analysis, dfeff = 440 dfeff FWHM = 19mm Why 100? If out by 50%, dbn of T not much affected Target = 100 df random effects analysis, dfeff = 3 FWHMratio

  18. Run 1 Run 2 Run 3 Run 4 MULTISTAT 1 Effect, 0 E i -1 0.2 Sd, S 0.1 i 0 5 T stat, 0 E / S i i -5 Final result: 19mm smoothing, 100 effective df … … less noisy sd: … and T>4.93 for P<0.05 (corrected): … and now we can detect a response!

  19. P-values assessed for: • Peaks or local maxima • Spatial extent of clusters of neighbouring voxels above a pre-chosen threshold (~3) • Correct for searching over a pre-specified region (usually the whole brain), which depends on: • number of voxels in the search region (Bonferroni) or • number of resels = volume / FWHM3 in the search region (random field theory) • in practice, take the minimum of the two!

  20. FWHM is spatially varying (non-isotropic) • fMRI data is smoother in GM than WM • VBM data is highly non-isotropic • Has little effect on P-values for local maxima (use ‘average’ FWHM inside search region), but • Has a big effect on P-values for spatial extents: smooth regions → big clusters, rough regions → small clusters, so • Replace cluster volume by cluster resels = volume / FWHM3

  21. FWHM – the local smoothness of the noise voxel size (1 – correlation)1/2 FWHM = (2 log 2)1/2 (If the noise is modeled as white noise smoothed with a Gaussian kernel, this would be its FWHM) Volume FWHM3 P-values depend on resels: resels = Local maximum T = 4.5 Clusters above t = 3.0, search volume resels = 500 0.1 0.1 0.08 0.08 0.06 0.06 P value of cluster P value of local max 0.04 0.04 0.02 0.02 0 0 0 500 1000 0 0.5 1 1.5 2 Resels of search volume Resels of cluster

  22. Resels=1.90 P=0.007 Resels=0.57 P=0.387

  23. Statistical summary: clusters clus vol resel p-val (one) • 1 33992 54.22 0 ( 0) • 2 14150 25.03 0 ( 0) • 3 12382 20.29 0 ( 0) • 4 2538 3.12 0.011 (0.001) • 5 2538 2.77 0.016 (0.001) • 6 1577 2.15 0.035 (0.002) • 7 1000 1.43 0.098 (0.006) • 8 500 1.31 0.119 (0.007) • 9 1000 1.07 0.179 (0.011) • 10 385 0.99 0.208 (0.013)

  24. Statistical summary: peaks • clus peak p-val (one) q-val (i j k) ( x y z ) • 1 12.72 0 ( 0) 0 (59 74 1) ( 10.5 -28.7 24.1) • 1 12.58 0 ( 0) 0 (60 75 1) ( 8.2 -31 23.7) • 1 11.45 0 ( 0) 0 (61 73 2) ( 5.9 -25.3 17.5) • 1 11.08 0 ( 0) 0 (62 66 4) ( 3.5 -6.9 6.3) • 1 10.95 0 ( 0) 0 (61 70 4) ( 5.9 -16.2 4.8) • 1 10.6 0 ( 0) 0 (62 69 3) ( 3.5 -15 12.1) • 2 5.07 0.029 (0.004) 0 (48 69 10) ( 36.3 -7.3 -36.3) • 3 5.06 0.029 (0.004) 0 (73 72 9) (-22.3 -15.3 -30.5) • 3 5.03 0.033 (0.004) 0 (81 63 10) ( -41 6.6 -34.1) • 13 5.02 0.035 (0.005) 0 (88 72 8) (-57.4 -16.4 -23.6) • 6 4.91 0.054 (0.007) 0 (42 69 3) ( 50.4 -15 12.1) • 11 4.91 0.055 (0.007) 0 (69 70 7) (-12.9 -12.9 -15.9) • 9 4.91 0.055 (0.007) 0 (48 46 5) ( 36.3 40.5 6.7) • 1 4.85 0.069 (0.008) 0 (52 93 2) ( 27 -71.6 10.2) • 3 4.82 0.08 (0.009) 0 (79 66 8) (-36.3 -2.5 -21.4) • 3 4.81 0.082 (0.009) 0 (78 65 8) ( -34 -0.2 -21) • 1 4.8 0.086 ( 0.01) 0 (62 59 5) ( 3.5 10.4 1.9) • 3 4.77 0.097 (0.011) 0 (82 61 10) (-43.4 11.2 -33.4) • 1 4.75 0.106 (0.012) 0 (55 71 2) ( 19.9 -20.7 18.3) • 5 4.73 0.114 (0.012) 0 (67 84 2) ( -8.2 -50.8 13.5)

  25. T>4.86

  26. T>4.86 T > 4.93 (P < 0.05, corrected)

  27. T>4.86 T > 4.93 (P < 0.05, corrected)

  28. T>4.86

  29. Efficiency : optimum block design Sd of hot stimulus Sd of hot-warm 0.5 0.5 20 20 0.4 0.4 15 15 Magnitude Optimum design 0.3 0.3 10 10 Optimum design 0.2 0.2 X 5 5 0.1 0.1 0 0 X 0 0 5 10 15 20 5 10 15 20 InterStimulus Interval (secs) (secs) (secs) 1 1 20 20 0.8 0.8 15 15 Delay 0.6 0.6 Optimum design X Optimum design X 10 10 0.4 0.4 5 5 0.2 0.2 (Not enough signal) (Not enough signal) 0 0 0 5 10 15 20 5 10 15 20 Stimulus Duration (secs)

  30. Efficiency : optimum event design 0.5 (Not enough signal) uniform . . . . . . . . . ____ magnitudes ……. delays random .. . ... .. . 0.45 concentrated : 0.4 0.35 0.3 Sd of effect (secs for delays) 0.25 0.2 0.15 0.1 0.05 0 5 10 15 20 Average time between events (secs)

  31. + + How many subjects? • Largest portion of variance comes from the last stage i.e. combining over subjects: sdrun2sdsess2sdsubj2 nrunnsessnsubj nsessnsubj nsubj • If you want to optimize total scanner time, take more subjects. • What you do at early stages doesn’t matter very much!

  32. References • Worsley et al. (2002). A general statistical analysis for fMRI data. NeuroImage, 15:1-15. • Liao et al. (2002). Estimating the delay of the response in fMRI data. NeuroImage, 16:593-606. • FMRISTAT: MATLAB package from www.math.mcgill.ca/keith/fmristat

  33. 0.6 0.4 0.2 0 -0.2 -0.4 -5 0 5 10 15 20 25 t (seconds) Estimating the delay of the response • Delay or latency to the peak of the HRF is approximated by a linear combination of two optimally chosen basis functions: delay basis1 basis2 HRF shift • HRF(t + shift) ~ basis1(t)w1(shift) + basis2(t)w2(shift) • Convolve bases with the stimulus, then add to the linear model

  34. 3 2 1 0 -1 -2 -3 -5 0 5 shift (seconds) • Fit linear model, estimate w1andw2 • Equate w2/w1to estimates, then solve for shift (Hensen et al., 2002) • To reduce bias when the magnitude is small, use • shift / (1 + 1/T2) • where T = w1/ Sd(w1) is the T statistic for the magnitude • Shrinks shift to 0 where there is little evidence for a response. w2 /w1 w1 w2

  35. Shift of the hot stimulus T stat for magnitude T stat for shift Shift (secs) Sd of shift (secs)

  36. Shift of the hot stimulus T stat for magnitude T stat for shift T>4 T~2 Shift (secs) Sd of shift (secs) ~1 sec +/- 0.5 sec

  37. Run 1 Run 2 Run 3 Run 4 MULTISTAT 4 2 Effect, 0 E i -2 -4 2 Sd, 1 S i 0 5 T stat, 0 E / S i i -5 Combining shifts of the hot stimulus (Contours are T stat for magnitude > 4)

  38. Shift of the hot stimulus Shift (secs) T stat for magnitude > 4.93

  39. References • Worsley et al. (2002). A general statistical analysis for fMRI data. NeuroImage, 15:1-15. • Liao et al. (2002). Estimating the delay of the response in fMRI data. NeuroImage, 16:593-606. • FMRISTAT: MATLAB package from www.math.mcgill.ca/keith/fmristat

  40. False Discovery Rate (FDR) • Benjamini and Hochberg (1995), Journal of the Royal Statistical Society • Benjamini and Yekutieli (2001), Annals of Statistics • Genovese et al. (2001), NeuroImage • FDR controls the expected proportion of false positives amongst the discoveries, whereas • Bonferroni / random field theory controls the probability of any false positives • No correction controls the proportion of false positives in the volume

  41. Signal + Gaussian white noise P < 0.05 (uncorrected), T > 1.64 5% of volume is false + Signal True + Noise False + FDR < 0.05, T > 2.82 5% of discoveries is false + P < 0.05 (corrected), T > 4.22 5% probability of any false +

  42. Comparison of thresholds • FDR depends on the ordered P-values: • P1 < P2 < … < Pn. To control the FDR at a = 0.05, find • K = max {i : Pi< (i/n) a}, threshold the P-values at PK • Proportion of true + 1 0.1 0.01 0.001 0.0001 • Threshold T 1.64 2.56 3.28 3.88 4.41 • Bonferroni thresholds the P-values at a/n: • Number of voxels 1 10 100 1000 10000 • Threshold T 1.64 2.58 3.29 3.89 4.42 • Random field theory: resels = volume / FHHM3: • Number of resels 0 1 10 100 1000 • Threshold T 1.64 2.82 3.46 4.09 4.65

  43. P < 0.05 (uncorrected), T > 1.64 5% of volume is false +

  44. FDR < 0.05, T > 2.67 5% of discoveries is false +

  45. P < 0.05 (corrected), T > 4.93 5% probability of any false +

More Related