
Kernel Methods for fMRI Pattern Prediction 

Yizhao Ni (2), Carlton Chu (1), Craig Saunders (2), John Ashburner (1)
1. Wellcome Trust Centre for Neuroimaging, ION, UCL, London, UK.
2. ISIS Group, School of Electronics and Computer Science, University of Southampton.


Presentation Transcript


  1. Kernel Methods for fMRI Pattern Prediction. Yizhao Ni (2), Carlton Chu (1), Craig Saunders (2), John Ashburner (1). (1) Wellcome Trust Centre for Neuroimaging, ION, UCL, London, UK. (2) ISIS Group, School of Electronics and Computer Science, University of Southampton.

  2. What are we measuring in fMRI?

  3. What are we measuring in fMRI? Blood Oxygen Level Dependent (BOLD) signal
     • neural activity → blood flow → oxyhemoglobin → T2* → MR signal
     • So we can use the change in fMRI signal to infer the neural activity.
     Source: Brief Introduction to fMRI, by Irene Tracey

  4. Does the BOLD signal respond immediately? No.
     • neural activity → blood flow → oxyhemoglobin → T2* → MR signal
     • The BOLD impulse response is also known as the haemodynamic response function (HRF).
     • Predicted BOLD = stimulus (“neural”) ⊗ HRF
     [Figure: HRF following a brief stimulus, with the initial undershoot, peak and undershoot labelled; panels showing stimulus (“neural”), HRF and predicted BOLD]
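
The relationship on this slide, predicted BOLD as the stimulus convolved with the HRF, can be sketched as follows. The double-gamma shape and its parameters (peak near 5 s, undershoot near 15 s, undershoot ratio 1/6) are assumptions in the style of commonly used canonical HRFs, and the TR of 1.75 s is also assumed; none of these values come from the slides.

```python
import math
import numpy as np

def gamma_pdf(t, shape, scale=1.0):
    """Gamma density evaluated on an array of times (zero for t <= 0)."""
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    pos = t > 0
    out[pos] = (t[pos] ** (shape - 1) * np.exp(-t[pos] / scale)
                / (math.gamma(shape) * scale ** shape))
    return out

def double_gamma_hrf(tr, duration=32.0, peak_shape=6.0, under_shape=16.0, ratio=1 / 6):
    """Assumed HRF: a positive gamma bump minus a smaller, later one."""
    t = np.arange(0.0, duration, tr)
    h = gamma_pdf(t, peak_shape) - ratio * gamma_pdf(t, under_shape)
    return h / h.sum()

def predict_bold(stimulus, tr):
    """Convolve the stimulus with the HRF and trim to the scan length."""
    hrf = double_gamma_hrf(tr)
    return np.convolve(stimulus, hrf)[:len(stimulus)]

tr = 1.75                      # assumed repetition time in seconds
stimulus = np.zeros(40)
stimulus[5] = 1.0              # one brief stimulus at scan 5
bold = predict_bold(stimulus, tr)
peak_delay = (np.argmax(bold) - 5) * tr   # seconds from stimulus to BOLD peak
```

The predicted response is zero before the stimulus, peaks a few seconds later, and dips below baseline afterwards, which is the delay the slide is pointing at.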

  5. Encoding and Decoding models
     • Encoding (general linear model): fMRI images and design matrix, voxel-wise parameter estimation, statistical parametric map (SPM)
     • Decoding (regression machine): fMRI images and activation pattern, prediction of the stimuli from images
     Source: Friston, K., Bayesian decoding of brain images, NeuroImage, 2007

  6. Pittsburgh Brain Activity Interpretation Competition 2007
     • Playing VR games in the MRI scanner, with an eye tracker
     • Example features: Face, Fruit, Dog
     • http://www.ebc.pitt.edu

  7. Feature ratings to be predicted

  8. Methods
     • We employed two kernel regression techniques: Kernel Ridge Regression (KRR) and Relevance Vector Regression (RVR).
     • Each rating was trained and predicted independently.
     • Good pre-processing and post-processing play important roles.
     • We achieved very high scores among all groups (Max z' > 0.961, Max r > 0.745).

  9. Data preprocessing: tissue segmentation and masking

  10. Detrending
      • Time course of one voxel in the fMRI volumes
      • C can be any set of basis functions: linear or discrete cosine
      [Figure panels: linear detrend; 8 discrete cosine basis functions]
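
Detrending with a basis set C amounts to projecting each voxel time course onto the space orthogonal to C. The following is a minimal sketch under assumptions: a DCT-II style low-frequency basis that includes a constant column, a residual-forming matrix R = I - C C⁺, and a synthetic time course. The function names are illustrative and are not taken from the slides or from SPM.

```python
import numpy as np

def dct_basis(n_scans, n_basis=8):
    """Columns are the first n_basis discrete cosine functions (column 0 is constant)."""
    t = np.arange(n_scans)
    k = np.arange(n_basis)
    C = np.cos(np.pi * np.outer(2 * t + 1, k) / (2 * n_scans))
    return C / np.linalg.norm(C, axis=0)

def residual_forming_matrix(C):
    """R = I - C C^+ removes any component lying in the span of C."""
    return np.eye(C.shape[0]) - C @ np.linalg.pinv(C)

n_scans = 200
C = dct_basis(n_scans)
R = residual_forming_matrix(C)

# Synthetic voxel: slow linear drift + fast signal of interest + noise.
rng = np.random.default_rng(0)
drift = 0.05 * np.arange(n_scans) + 3.0
signal = np.sin(2 * np.pi * np.arange(n_scans) / 10.0)
voxel = drift + signal + 0.1 * rng.standard_normal(n_scans)

detrended = R @ voxel   # drift and mean removed, fast signal kept
```

Because the basis only contains slow cosines, this acts as a high-pass filter: the drift is removed while the faster signal passes through largely unchanged.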

  11. Creating the Kernel
      • We denote each image as a feature vector x_i.
      • We define a p × m matrix X:
        - m is the number of image volumes (time points)
        - p is the number of voxels in each image
      • The linear kernel matrix is $K = X^{\top} X$.
      • We can calculate the detrended kernel efficiently using the R matrix.
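
One way to read the last point: if R is the residual-forming matrix of the detrending basis (an assumption, since the slide does not define R), then detrending every voxel time course gives X R^T, so the detrended kernel is R K R^T and never requires touching the p-dimensional data again. The sketch below checks this on toy data with a constant-plus-linear basis; the sizes and the basis are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
p, m = 5000, 60                      # voxels, scans (toy sizes)
# Rows are voxel time courses; add a shared drift over time.
X = rng.standard_normal((p, m)) + np.linspace(0.0, 2.0, m)

K = X.T @ X                          # m x m linear kernel between scans

# Assumed detrending basis: constant + linear trend.
C = np.column_stack([np.ones(m), np.arange(m)])
R = np.eye(m) - C @ np.linalg.pinv(C)

# Cheap route: operate on the small m x m kernel only.
K_detrended = R @ K @ R.T

# Expensive route, for comparison: detrend all p voxels, then form the kernel.
X_detrended = X @ R.T
K_direct = X_detrended.T @ X_detrended
```

The cheap route costs a few m × m products, whereas the direct route scales with the number of voxels.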

  12. Ridge Regression
      • The primal form of ridge regression. Note: too big to compute, as p > 100000.
      • The dual form: kernel ridge regression.

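  13. Relevance Vector Regression
      • Model: $\mathbf{y} = \Phi \mathbf{w} + \boldsymbol{\epsilon}$, with $\mathbf{y} = (y_1, y_2, \ldots, y_n)^{\top}$, $\Phi$ the matrix of basis functions, and $\mathbf{w} = (w_1, w_2, \ldots, w_n, b)^{\top}$
      • The noise has unknown variance.
      • The weights have unknown prior precision.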

  14. Introduction
      • We employed two kernel regression techniques: Kernel Ridge Regression (KRR) and Relevance Vector Regression (RVR).
      • Each rating was trained and predicted independently.
      • Good pre-processing and post-processing play important roles.
      • We achieved very high scores among all groups (Max z' > 0.961, Max r > 0.745).

  15. Getting pre-processing right is crucial
      • Masking
      • Rigid alignment (no unwarp) by SPM5
      • No slice-timing correction
      • Detrending with discrete cosine functions (high-pass filter)
      • Smoothing with a Gaussian kernel

  16. Feature Selection
      • Remove voxels that are very unlikely to provide information.
      • According to the neuroimaging literature, gray matter shows a higher BOLD response than white matter and CSF.
      • SPM5 segmentation was run directly on the EPI images.
      [Figure panels: gray matter; smoothed mask]

  17. Detrending
      [Figure panels: linear detrend; 8 discrete cosine basis functions]

  18. Detrending
      • The left Gram matrix is generated from the images pre-processed by the competition committee. The right Gram matrix is generated from the images after DCT detrending, and is smoother. (Subject 13, VR1 and VR2)

  19. Kernel Method
      • The kernel is a similarity measure between scans. For a linear kernel, it is the dot product between two scans.
      • We also used non-linear kernels.
      [Figure panels: linear; RBF (γ = 1.7/1e6); polynomial (d = 2, θ = 1e7)]
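
Both non-linear kernels can be derived from the linear kernel matrix alone, which matters when each scan has over 100000 voxels. The sketch below assumes the common parametrisations exp(-γ‖x_i - x_j‖²) for the RBF kernel and (x_i·x_j + θ)^d for the polynomial kernel; the slides give γ, d and θ but not the exact formulas, so these forms are assumptions, and the toy parameter values differ from the slide values because the data scale differs.

```python
import numpy as np

def rbf_from_linear(K, gamma):
    """RBF kernel from a linear kernel, using ||xi - xj||^2 = Kii + Kjj - 2 Kij."""
    d = np.diag(K)
    sq_dist = d[:, None] + d[None, :] - 2.0 * K
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def poly_from_linear(K, degree, theta):
    """Polynomial kernel (Kij + theta)^degree, applied element-wise."""
    return (K + theta) ** degree

rng = np.random.default_rng(1)
X = rng.standard_normal((1000, 20))   # toy data: voxels x scans
K = X.T @ X                           # linear kernel between scans

K_rbf = rbf_from_linear(K, gamma=1e-3)
K_poly = poly_from_linear(K, degree=2, theta=1.0)
```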

  20. Kernel Regression
      • There are different variants, such as relevance vector regression (RVR), support vector regression (SVR) and kernel ridge regression (KRR).
      • The general formula uses the following symbols: w is the weighting, y is the rating, x is one of the images, b is the bias (scalar), ε is the noise, and N is the number of training samples.
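
The formula itself did not survive in this transcript. With the symbols defined on the slide, the usual form of a kernel regression prediction is presumably the following; this is a reconstruction under that assumption, with K denoting the kernel function from the previous slide.

```latex
y(\mathbf{x}) = \sum_{i=1}^{N} w_i \, K(\mathbf{x}, \mathbf{x}_i) + b + \epsilon
```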

  21. Regression using Kernel Methods. Here N is the number of training samples.
      [Figure labels: training; testing; unused]

  22. Ridge Regression: Primal form
      • Simple linear regression: X is the design matrix (scans × voxels) and y is the target value.
      • The goal is to find the β that minimises the squared error while also keeping the squared norm of β small.

  23. Ridge Regression : Dual form
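
The equations for the two forms are not in the transcript. In the standard formulation, with X as scans × voxels and a regularisation weight λ (the symbol is an assumption), the primal solution is β = (XᵀX + λI)⁻¹Xᵀy and the dual solution is α = (K + λI)⁻¹y with K = XXᵀ. The sketch below checks numerically that both give the same predictions on toy data; the dual only ever inverts a scans × scans matrix.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, lam = 30, 200, 5.0             # scans, voxels, regularisation (toy values)
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)
X_test = rng.standard_normal((4, p))

# Primal: solve a p x p system (infeasible when p > 100000).
beta = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
pred_primal = X_test @ beta

# Dual (kernel ridge regression): solve an n x n system instead.
K = X @ X.T
alpha = np.linalg.solve(K + lam * np.eye(n), y)
pred_dual = (X_test @ X.T) @ alpha   # test-vs-train kernel times dual weights
```

The primal weights can be recovered from the dual ones as β = Xᵀα, which is how a weight volume over voxels can still be produced after training in the dual.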

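  24. Relevance Vector Regression
      • Model: $\mathbf{y} = \Phi \mathbf{w} + \boldsymbol{\epsilon}$, with $\mathbf{y} = (y_1, y_2, \ldots, y_n)^{\top}$, $\Phi$ the matrix of basis functions, and $\mathbf{w} = (w_1, w_2, \ldots, w_n, b)^{\top}$
      • The noise has unknown variance.
      • The weights have unknown prior precision.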

  25. Relevance Vector Regression
      • The objective is to maximise p(y|α,σ²), which is called the marginal likelihood; the procedure is also known as type-II maximum likelihood.
      • The matrix of basis functions is basically the kernel matrix with a column of ones appended at the end.
      • [Equation: the posterior weights]
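
For fixed hyperparameters, the quantities named on this slide have closed forms in the standard RVR formulation. The sketch below uses the conventional notation, which is an assumption here because the slide's symbols were lost: Φ for the basis matrix, μ and Σ for the posterior mean and covariance of the weights, and A = diag(α). It evaluates the log marginal likelihood once; it does not perform the iterative hyperparameter optimisation that a full RVR fit requires.

```python
import numpy as np

def rvr_posterior(K, y, alpha, sigma2):
    """Posterior over weights and log marginal likelihood for fixed alpha, sigma2."""
    n = K.shape[0]
    Phi = np.column_stack([K, np.ones(n)])        # kernel matrix + column of ones
    A = np.diag(alpha)                            # prior precisions of the weights
    Sigma = np.linalg.inv(Phi.T @ Phi / sigma2 + A)
    mu = Sigma @ Phi.T @ y / sigma2               # posterior mean of the weights
    # Marginal covariance of y after integrating the weights out.
    C = sigma2 * np.eye(n) + Phi @ np.diag(1.0 / alpha) @ Phi.T
    _, logdet = np.linalg.slogdet(C)
    log_ml = -0.5 * (n * np.log(2 * np.pi) + logdet + y @ np.linalg.solve(C, y))
    return Phi, mu, Sigma, log_ml

rng = np.random.default_rng(3)
X = rng.standard_normal((15, 50))                 # toy data: scans x voxels
K = X @ X.T
y = rng.standard_normal(15)
alpha = np.full(16, 2.0)                          # one precision per weight, incl. bias
Phi, mu, Sigma, log_ml = rvr_posterior(K, y, alpha, sigma2=0.5)
```

In a full fit, α and σ² would be updated repeatedly to increase `log_ml`, and weights whose precision grows very large would be pruned, leaving the relevance vectors.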

  26. Post-processing
      • Constrained quadratic programming for deconvolution
      • Temporal Gaussian smoothing
      [Figure panels: original prediction of movie 2, subject 14, hits (corr = 0.66); deconvolved data, constrained to lie between 0 and 1; smoothed data (corr = 0.76); reconvolved data (corr = 0.75)]
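
The deconvolve, smooth, reconvolve chain can be sketched as below. Note the substitutions: instead of a quadratic programming solver, this uses projected gradient descent on the same box-constrained least-squares problem, and the six-point HRF, the noise level and the block-shaped rating are toy assumptions rather than the authors' settings.

```python
import numpy as np

def convolution_matrix(hrf, n):
    """Lower-triangular matrix H such that H @ s is the causal convolution of s with hrf."""
    H = np.zeros((n, n))
    for lag, h in enumerate(hrf[:n]):
        H += h * np.eye(n, k=-lag)
    return H

def box_constrained_deconvolve(pred, H, n_iter=2000):
    """Minimise ||H s - pred||^2 subject to 0 <= s <= 1 by projected gradient."""
    step = 1.0 / np.linalg.norm(H, 2) ** 2
    s = np.clip(pred, 0.0, 1.0)
    for _ in range(n_iter):
        s = np.clip(s - step * H.T @ (H @ s - pred), 0.0, 1.0)
    return s

def gaussian_smooth(x, sigma):
    """Temporal smoothing with a normalised Gaussian window and edge padding."""
    radius = int(3 * sigma) + 1
    t = np.arange(-radius, radius + 1)
    w = np.exp(-0.5 * (t / sigma) ** 2)
    w /= w.sum()
    return np.convolve(np.pad(x, radius, mode="edge"), w, mode="valid")

rng = np.random.default_rng(4)
n = 80
hrf = np.array([0.1, 0.35, 0.3, 0.15, 0.07, 0.03])   # assumed toy HRF, sums to 1
H = convolution_matrix(hrf, n)
rating = np.zeros(n)
rating[20:30] = 1.0                                   # toy block-shaped rating
clean = H @ rating
prediction = clean + 0.1 * rng.standard_normal(n)     # stand-in for a raw prediction

deconvolved = box_constrained_deconvolve(prediction, H)
smoothed = gaussian_smooth(deconvolved, sigma=1.0)
reconvolved = H @ smoothed
```

Constraining the deconvolved series to [0, 1] encodes the knowledge that the underlying ratings are bounded, which is what lets the post-processing remove part of the prediction noise.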

  27. Regional Mask
      • Anatomical templates of the visual and auditory cortex from the International Consortium for Brain Mapping (ICBM) were used (www.loni.ucla.edu/ICBM/).
      • The probability templates were non-linearly registered to each individual subject via SPM5 normalization (template → subject: the normalized EPI template is the source, and the ICBM template is then deformed with the same deformation fields).
      [Figure panels: Subject 14 visual cortex, used for “Interior Exterior”; Subject 13 auditory cortex, used for “Dog”]

  28. Predict Instruction
      • A template of “instruction” is created by averaging over the training data.
      • The template is convolved with the predicted rating to find the correct onset point.
      • The prediction is then fitted with the template.
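
The onset search and template fit can be sketched as a sliding cross-correlation followed by a least-squares scale. The template shape, the noise level and the function names below are hypothetical; the slides do not specify how the fit was carried out, so the single scale factor is an assumption.

```python
import numpy as np

def find_onset(prediction, template):
    """Slide the mean-centred template along the prediction; return the best offset."""
    tpl = template - template.mean()
    scores = np.correlate(prediction - prediction.mean(), tpl, mode="valid")
    return int(np.argmax(scores)), scores

def fit_template(prediction, template, onset):
    """Replace the prediction by a least-squares scaled copy of the template at onset."""
    fitted = np.zeros_like(prediction)
    segment = prediction[onset:onset + len(template)]
    scale = segment @ template / (template @ template)
    fitted[onset:onset + len(template)] = scale * template
    return fitted

rng = np.random.default_rng(6)
template = np.array([0.0, 0.5, 1.0, 1.0, 0.5, 0.0])   # hypothetical average shape
prediction = 0.05 * rng.standard_normal(50)           # noisy predicted rating
prediction[17:23] += 0.8 * template                   # true event starts at scan 17

onset, scores = find_onset(prediction, template)
fitted = fit_template(prediction, template, onset)
```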

  29. Predict Requests to Search
      1. Predict “hit something” for three subjects.
      2. Prune most of the points and keep only some high-value peaks.
      3. See which peak falls in which slot, and set the corresponding search request to 1 in that slot.
      • For each “search something” request, we find the 4 most likely slots.
      • Finally, the predicted block is convolved with the HRF.
      Assumptions:
      1. Each request appears 4 times.
      2. There is at least one request per slot.
      3. The requests are the same for all 3 subjects.

  30. Predict Instruction
      • A template of “instruction” is created by averaging over the training data.
      • The template is convolved with the predicted rating to find the correct onset point.
      • The prediction is then fitted with the template.

  31. Predict Velocity and Faces
      • Performance improves when we shift the scans one TR forward.
      • This implies either a shorter hemodynamic delay or other causes (motor preparation?).
      [Figure: cross-validation (Subject 14, train VR1, predict VR2); panels: Faces, Velocity]
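
Choosing the scan-to-rating offset by cross-validation can be sketched as follows. Everything here is illustrative: the data are synthetic with a built-in lag of two scans, the split is into halves rather than VR1 and VR2, and the sign convention (pairing `ratings[t]` with `scans[t + shift]`) is an assumption about what "forward" means on the slide.

```python
import numpy as np

def shift_pairs(scans, ratings, shift):
    """Pair ratings[t] with scans[t + shift]; shift may be negative."""
    n = len(ratings)
    if shift >= 0:
        return scans[shift:], ratings[:n - shift]
    return scans[:n + shift], ratings[-shift:]

def krr_corr(X_tr, y_tr, X_te, y_te, lam=1.0):
    """Train linear kernel ridge regression and return the test correlation."""
    alpha = np.linalg.solve(X_tr @ X_tr.T + lam * np.eye(len(y_tr)), y_tr)
    pred = X_te @ X_tr.T @ alpha
    return np.corrcoef(pred, y_te)[0, 1]

rng = np.random.default_rng(5)
n, p, true_lag = 200, 50, 2
ratings = rng.standard_normal(n)
pattern = rng.standard_normal(p)
scans = 0.5 * rng.standard_normal((n, p))
scans[true_lag:] += np.outer(ratings[:-true_lag], pattern)   # response lags by 2 scans

scores = {}
for shift in range(5):
    X, y = shift_pairs(scans, ratings, shift)
    half = len(y) // 2
    scores[shift] = krr_corr(X[:half], y[:half], X[half:], y[half:])
best_shift = max(scores, key=scores.get)
```

On real data the ratings are usually already convolved with an HRF, so the best shift measures the residual mismatch between the assumed and the actual delay.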

  32. Weight Volume
      [Figure panels: Subject 13, Face; Subject 14, Velocity; Subject 1, Instruction]

  33. Conclusions & Results
      • Linear kernels work well for objective ratings.
      • Non-linear kernels are preferable for subjective (emotional) ratings.
      • Pre-processing and post-processing are crucial.
      • SPM5 is not only an analysis tool, but also a resourceful library containing useful functions.
      Result from 2nd submission:
                               Z Sub1   Z Sub2   Z Sub3   Avg Z   Inv Z of average
      Required Feature         0.909    1.014    0.957    0.960   0.744
      Req + Extra Feature      0.909    1.014    0.959    0.961   0.745
      Max Z: 0.961             Max r (comp score): 0.745
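
The last two columns of the table appear to follow from the first three: the per-subject Z values are averaged, and the average is mapped back to a correlation with the inverse Fisher transform, r = tanh(z). That reading of "Z" as Fisher's z is an assumption, but it reproduces the "Required Feature" row, as this small check shows.

```python
import math

def competition_score(subject_z):
    """Average per-subject Fisher z values, then map back to a correlation."""
    avg_z = sum(subject_z) / len(subject_z)
    return avg_z, math.tanh(avg_z)

avg_z, r = competition_score([0.909, 1.014, 0.957])
print(round(avg_z, 3), round(r, 3))   # 0.96 0.744
```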

  34. Improved PBAIC 2006 results
      COMPETITION SCORE (maximum average correlation across features for each summative index): 0.520813
      SUMMATIVE INDICES (average correlation across features):
                                 Z' Sub1   Z' Sub2   Z' Sub3   Avg Z'   Inv Z' of average
      Base Features              0.552     0.619     0.562     0.577    0.521
      Base + Actor               N/A       N/A       N/A       N/A      N/A
      Base + Actor + Location    N/A       N/A       N/A       N/A      N/A
      Max Z': 0.577              Max r (comp score): 0.521
      The top score last year was 0.515, and we got 0.521!

  35. FIL Team. From left to right:
      • Dr. John Ashburner: the General, currently on leave (Maastricht)
      • Chia-Yueh Carlton Chu: Captain
      • Geoffrey Tan: Medic, busy collecting blood
      • Yizhao Ni: Mercenary from Southampton, the land of kernel methods
