Neuroimaging Techniques for Machine Learners: Validation and Inference Insights

Neuroimaging for Machine LearnersValidation and inference PRoNTo course May 2012 Christophe Phillips Cyclotron Research Centre, ULg, Belgium http://www.cyclotron.ulg.ac.be

Introduction • Within vs. between subjects analysis • GLM and contrasts • Statistical inference • Multiple comparison problem • Other inference levels • Conclusion

Standard Statistical Analysis (encoding) Input Output Standard univariate approach ... Voxel-wise GLM model estimation Correction for multiple comparisons Independent statistical test at each voxel BOLD signal Univariate statistical Parametric map Time Find the mappinggfrom explanatory variableXto observed dataY g: X  Y

Within vs. between subjects Within subject analysis: • data = functional MRI (fMRI) • modelling BOLD response time series, usually activation task • sometimes with parametric modulation and/or confounding regressor • « first level » analysis • output = raw signal, contrast images and statistical maps

Within vs. between subjects Between subjects analysis: • data = tissue probability, deformation map, contrast image, FDG-PET scan,… • modelling group differences or regressing subjects’ score (confound/interest) • « second level » analysis • output = raw signal, contrast images and statistical maps

PET scan

X = + Y General Linear Model N: # images p: # regressors • GLM defined by: • design matrix X • error term ε distribution

Design matrix example Single subject fMRI time series Group comparison with two-sample t-test

T-test & contrast A contrast selects a specific effect of interest: contrast c = vector of length p. cTβ= linear combination of regression coefficients β. cT = [1 0 0 0 0 …] cTβ = 1xb1 + 0xb2 + 0xb3 + 0xb4 + 0xb5 + . . . cT = [0 -1 1 0 0 …] cTβ = 0xb1+-1xb2 + 1xb3 + 0xb4 + 0xb5 + . . . Under i.i.dassumptions (for error term ε):

Statistics: p-values adjusted for search volume set-level cluster-level voxel-level mm mm mm p c p k p p p T Z p ( ) corrected E uncorrected FWE-corr FDR-corr uncorrected º 0.000 10 0.000 520 0.000 0.000 0.000 13.94 Inf 0.000 -63 -27 15 SPMresults: 0.000 0.000 12.04 Inf 0.000 -48 -33 12 0.000 0.000 11.82 Inf 0.000 -66 -21 6 10 Height threshold T = 3.2057 {p<0.001} 0.000 426 0.000 0.000 0.000 13.72 Inf 0.000 57 -21 12 0.000 0.000 12.29 Inf 0.000 63 -12 -3 voxel-level 0.000 0.000 9.89 7.83 0.000 57 -39 6 20 mm mm mm 0.000 35 0.000 0.000 0.000 7.39 6.36 0.000 36 -30 -15 T Z p ( ) 0.000 9 0.000 0.000 0.000 6.84 5.99 0.000 51 0 48 uncorrected º 0.002 3 0.024 0.001 0.000 6.36 5.65 0.000 -63 -54 -3 0.000 8 0.001 0.001 0.000 6.19 5.53 0.000 -30 -33 -18 30 13.94 Inf 0.000 -63 -27 15 0.000 9 0.000 0.003 0.000 5.96 5.36 0.000 36 -27 9 0.005 2 0.058 0.004 0.000 5.84 5.27 0.000 -45 42 9 12.04 Inf 0.000 -48 -33 12 0.015 1 0.166 0.022 0.000 5.44 4.97 0.000 48 27 24 40 11.82 Inf 0.000 -66 -21 6 0.015 1 0.166 0.036 0.000 5.32 4.87 0.000 36 -27 42 13.72 Inf 0.000 57 -21 12 12.29 Inf 0.000 63 -12 -3 50 9.89 7.83 0.000 57 -39 6 7.39 6.36 0.000 36 -30 -15 60 6.84 5.99 0.000 51 0 48 6.36 5.65 0.000 -63 -54 -3 70 6.19 5.53 0.000 -30 -33 -18 5.96 5.36 0.000 36 -27 9 80 5.84 5.27 0.000 -45 42 9 5.44 4.97 0.000 48 27 24 0.5 1 1.5 2 2.5 5.32 4.87 0.000 36 -27 42 1 Design matrix t-test example: Passive wordlistening versus rest cT = [ 1 0 ] Q: activation during listening ? Null hypothesis:

Classical inference u • The Null Hypothesis H0 • = what we want to disprove (no effect). • The Alternative Hypothesis HA • = outcome of interest.  Significance level α: Acceptable false positive rate α.  threshold uα Null Distribution of T t Observation of test statistic t = a realisation of T P-val  Reject H0 in favour of HA if t > uα p-value = evidence against H0. Null Distribution of T

RSS0 RSS F-test & contrast Null Hypothesis H0: True model is X0 (reduced model) X0 X0 X1 Test statistic: ratio of explained variability and unexplained variability (error) 1 = rank(X) – rank(X0) 2 = N – rank(X) Full model ? or Reduced model?

F-test example: movement related effects

t > 0.5 t > 3.5 t > 5.5 Multiple comparison problem High Threshold Med. Threshold Low Threshold Good SpecificityPoor Power(risk of false negatives) Poor Specificity(risk of false positives)Good Power

Signal Multiple comparison problem Noise Signal+Noise

Use of ‘uncorrected’ p-value, a = 0.1 11.3% 11.3% 12.5% 10.8% 11.5% 10.0% 10.7% 11.2% 10.2% 9.5% Percentage of Null Pixels that are False Positives Multiple comparison problem Use of ‘corrected’ p-value, a=0.1 FWE

More inference levels

Conclusion From a ‘machine learner’ perspectives, GLM is useful for: • Check for « is there any effect? » • Data filtering (i.e. confound removal) • HRF deconvolution for fMRI • Feature selection/masking 19

Thank you for your attention! Any question?

Neuroimaging Techniques for Machine Learners: Validation and Inference Insights

Neuroimaging Techniques for Machine Learners: Validation and Inference Insights

Presentation Transcript

IT/CS 811 Principles of Machine Learning and Inference

Validation of capsule filling machine

The Central Neuroimaging Data Archive Tools for storing, analyzing, and exploring neuroimaging data

IT/CS 811 Principles of Machine Learning and Inference

Neuroimaging for Cognitive Research

Neuroimaging: from image to Inference

Clinical and Advanced Neuroimaging : A Primer for Providers

Technologies for Neuroimaging

IT/CS 811 Principles of Machine Learning and Inference

IT/CS 811 Principles of Machine Learning and Inference

Neuroimaging and intersubjectivity in psychiatry

Inference by permutation of multi-subject neuroimaging studies

Textual entailment inference in machine translation

NeuroImaging

Neuroimaging technologies

Neuroimaging

Developmental Neuroimaging

Validation and Validation Checks

Boosting for prediction and inference

Textual entailment inference in machine translation

IT/CS 811 Principles of Machine Learning and Inference

Inference by permutation of multi-subject neuroimaging studies