Outline - PowerPoint PPT Presentation

emma
Presentation Transcript

  1. Spatial segmentation of imaging mass spectrometry data with edge-preserving image denoising and clustering Theodore Alexandrov, Michael Becker, Sören Deininger, Günther Ernst, Liane Wehder, Markus Grasmair, Ferdinand von Eggeling, Herbert Thiele, and Peter Maass

  2. Outline • Background on MS Imaging and goals of paper • Methods • Results • Conclusions and Criticism

  3. Outline • Background on MS Imaging and goals of paper • Methods • Results • Conclusions and Criticism

  4. Background: what is MS imaging? • In the words of All-Mighty Wikipedia: • Mass spectrometry imaging is a technique used in mass spectrometry to visualize the spatial distribution of e.g. compounds, biomarker, metabolites, peptides or proteins by their molecular masses. • Or in images:

  5. Goals of this paper: • To propose a new procedure for spatial segmentation of MALDI-imaging datasets. • This procedure clusters all spectra into different groups based on their similarity. • This partition is represented by a segmentation map, which helps to understand the spatial structure of the sample.

  6. Goal: in images… (it is MS Imaging after all)

  7. Why? • Current multivariate algorithms (e.g. PCA) are not designed for MS data and cannot be used to interpret the data directly. • Current clustering algorithms do not take spatial information into account. • Here, we assume that spectra close to each other should be similar.

  8. Outline • Background on MS Imaging and goals of paper • Methods • Results • Conclusions and Criticism

  9. Overview: Pipeline

  10. Datasets • Rat brain coronal section • 80 µm raster • 200 laser shots per position; 20185 spectra • Data acquired: 2.5 kDa-25 kDa • Data considered: 2.5 kDa-10 kDa; 3045 points • Section of neuroendocrine tumor (NET) invading the small intestine • 50 µm raster • 300 laser shots per position; 27360 spectra • Data acquired: 1 kDa-30 kDa • Data considered: 3.2 kDa-18 kDa; 5027 points

  11. Spectra Preprocessing • Baseline correction • TopHat algorithm, minimal baseline width set to 10%, default in ClinProTools • No normalization • No binning • ASCII -> Matlab
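
The baseline-correction step can be illustrated with a generic morphological top-hat on a single spectrum. This is a minimal sketch in plain Python, not the ClinProTools implementation: the function names and the `half_width` parameter are hypothetical, and how the 10% minimal baseline width maps onto a window size is left as an assumption.

```python
def sliding(values, half_width, reduce_fn):
    """Apply reduce_fn (min or max) over a window clipped at the borders."""
    n = len(values)
    return [
        reduce_fn(values[max(0, i - half_width):min(n, i + half_width + 1)])
        for i in range(n)
    ]


def tophat_baseline_correct(intensities, half_width):
    """Subtract a morphological opening (erosion, then dilation) from the signal."""
    eroded = sliding(intensities, half_width, min)   # erosion: local minimum
    baseline = sliding(eroded, half_width, max)      # dilation of the erosion
    return [x - b for x, b in zip(intensities, baseline)]


# Toy spectrum: flat baseline of 10 with one narrow peak (made-up numbers).
spectrum = [10, 10, 10, 15, 10, 10, 10]
corrected = tophat_baseline_correct(spectrum, half_width=1)
```

Peaks narrower than the window survive the subtraction, while anything wider than the window is treated as baseline, which is why the window width matters.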

  12. Peak-Picking • Part 1: conventional peak picking applied to every 10th spectrum. Select 10 peaks per spectrum. • Orthogonal Matching Pursuit (OMP) because it is fast and simple • Gaussian kernel deconvolution • Part 2: keep consensus peaks: • Only keep peaks that appear in at least 1% of the considered spectra • Omit spurious peaks
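
The consensus step (Part 2) can be sketched as a simple counting filter. This is an illustration only: it assumes the per-spectrum peaks have already been picked and mapped to shared m/z indices, the function name and arguments are hypothetical, and rounding the 1% threshold down is an assumption chosen to match the "at least 20 spectra out of the 2019" figure reported for the rat brain. OMP itself is not shown.

```python
from collections import Counter


def consensus_peaks(peak_lists, step=10, min_fraction=0.01):
    """Keep peaks found in at least min_fraction of every step-th spectrum.

    peak_lists: one collection of picked m/z indices per spectrum.
    """
    considered = peak_lists[::step]              # every step-th spectrum
    counts = Counter()
    for peaks in considered:
        counts.update(set(peaks))                # count a peak once per spectrum
    # Assumption: round the threshold down (1% of 2019 spectra -> 20).
    threshold = max(1, int(min_fraction * len(considered)))
    return sorted(mz for mz, n in counts.items() if n >= threshold)


# Toy data: 2000 spectra, so 200 are considered and the threshold is 2.
peak_lists = [[5] for _ in range(2000)]
peak_lists[0] = [5, 7, 99]    # 99 occurs in one considered spectrum only
peak_lists[10] = [5, 7]       # 7 occurs in two considered spectra
peak_lists[3] = [5, 42]       # spectrum 3 is not among the considered ones
kept = consensus_peaks(peak_lists)
```

Here peak 99 is dropped as spurious, peak 7 just reaches the threshold, and peak 42 is never seen because its spectrum was not sampled.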

  13. Edge-preserving denoising of m/z images • Imaging dataset is a reduced datacube with 3 coordinates: x, y, m/z (reduced in m/z dimension by peak picking) • MALDI-imaging data is noisy • Must be able to keep fine anatomical or histological details • Grasmair modification of Total Variation minimizing Chambolle algorithm • Parameter θ between 0.5 and 1: smoothness of resulting image

  14. Edge-preserving denoising of m/z images • Total variation (TV) ~ sum of absolute differences between neighboring pixels • Chambolle algorithm searches for an approximation of the image with small TV • Chambolle algorithm => smoothness adjusted globally by manually choosing a parameter • Grasmair locally adapts denoising parameter of Chambolle
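
To make the total variation definition concrete, the following sketch computes the sum of absolute differences between horizontally and vertically neighboring pixels. It only measures TV; it is not the Chambolle or Grasmair algorithm, and the function name and toy images are made up for illustration.

```python
def total_variation(image):
    """Anisotropic TV: sum of |differences| between 4-connected neighbors."""
    rows, cols = len(image), len(image[0])
    tv = 0.0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:                     # right neighbor
                tv += abs(image[r][c + 1] - image[r][c])
            if r + 1 < rows:                     # lower neighbor
                tv += abs(image[r + 1][c] - image[r][c])
    return tv


# A clean step edge versus the same edge with one perturbed pixel.
clean = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
noisy = [[0, 0, 1, 1],
         [0, 0.5, 1, 1],
         [0, 0, 1, 1]]
tv_clean = total_variation(clean)
tv_noisy = total_variation(noisy)
```

The sharp edge alone contributes a fixed amount to TV, while the perturbed pixel adds to it. This is why minimizing TV tends to remove noise without penalizing edges more than a blurred transition of the same height.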

  15. Clustering • Specify the number of clusters a priori • High Dimensional Discriminant Clustering (HDDC) • Available as a Matlab toolbox • Each cluster is modeled by a Gaussian distribution with its own covariance structure. • HDDC was developed for high-dimensional data (d > 10) • Note: In Matlab HDDC = high-dimensional data clustering
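
How clustering of spectra turns into a segmentation map can be sketched with plain k-means as a stand-in. This is explicitly not HDDC (no Gaussian model, no per-cluster covariance); it only shows the mechanics of labeling each pixel's reduced spectrum and reshaping the labels into an image. The function names, the deterministic initialisation, and the toy data are assumptions.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means on lists of floats; a stand-in, NOT HDDC."""
    # Assumption: deterministic start from points spread evenly over the input.
    step = (len(points) - 1) / max(k - 1, 1)
    centers = [list(points[round(i * step)]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])),
            )
        # Update step: mean of the members (an empty cluster keeps its center).
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def segmentation_map(labels, rows, cols):
    """Reshape row-major pixel labels into a rows x cols map."""
    return [labels[r * cols:(r + 1) * cols] for r in range(rows)]


# Toy 2 x 4 image, two consensus peaks per pixel (made-up intensities).
row = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
pixels = row + row
seg = segmentation_map(kmeans(pixels, k=2), rows=2, cols=4)
```

As in the slides, the number of clusters is fixed in advance; changing `k` changes how finely the map is partitioned.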

  16. Outline • Background on MS Imaging and goals of paper • Methods • Results • Conclusions and Criticism

  17. Rat brain: peak picking • used 2019 spectra out of 20185 (10%) • potential peaks: 373 peaks (red triangles) • consensus peaks: 110 peaks (green triangles) • Present in at least 20 spectra out of the 2019 (1%) • Discarded peaks mostly in low m/z regions • Hypothesize they are noise peaks because MALDI imaging spectra have high baseline in low m/z region.

  18. Rat brain: peak picking • OMP successfully detects major peaks • Gaussian function provides reasonable approximation of peak shape

  19. Rat brain: noise in MALDI-imaging • Strong noise • Noise variance changes within m/z image and between m/z images • Noise variance is linearly proportional to peak intensity

  20. Rat brain: noise in MALDI-imaging

  21. Edge-preserving denoising • Apply Grasmair method to selected 110 consensus peaks • Efficiently removes the noise while not smoothing out edges

  22. Rat brain: segmentation map • Shows anatomical features • Restricted to spatial resolution of MALDI-imaging dataset

  23. Rat brain: importance of edge-preserving denoising • No denoising: borders do not match as well • 3x3 median smoothing: bad edge preservation • 5x5 median smoothing: lose many regions
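
One way median smoothing loses detail can be shown on a toy image: any structure narrower than the window is voted out by its neighbors. The sketch below is a plain 3x3 median filter with windows clipped at the borders; the function name and test images are hypothetical, and it is not the edge-preserving method of the paper.

```python
from statistics import median


def median_filter(image, radius=1):
    """Median over a (2*radius+1)^2 window, clipped at the image borders."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out_row.append(median(window))
        out.append(out_row)
    return out


# A one-pixel-wide vertical structure and a broad step edge.
thin_line = [[0, 0, 1, 0, 0] for _ in range(5)]
step_edge = [[0, 0, 1, 1, 1] for _ in range(5)]
smoothed_line = median_filter(thin_line)
smoothed_step = median_filter(step_edge)
```

The broad step survives, but the thin line disappears entirely. A larger window (for example `radius=2` for 5x5) removes correspondingly larger structures.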

  24. Rat brain: co-localized masses • Find mass values expressed in region
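
A possible way to find masses expressed in a region is to correlate each m/z image with the binary mask of a cluster and keep the highly correlated ones. The slide does not state how co-localization is scored, so the use of Pearson correlation, the `min_corr` cutoff, and all names below are assumptions for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def colocalized_masses(mz_images, labels, cluster, min_corr=0.5):
    """Rank m/z values by correlation of their image with the cluster mask.

    mz_images: dict mapping m/z value -> flattened (row-major) intensity image.
    labels:    flattened segmentation labels, same pixel order.
    """
    mask = [1.0 if lab == cluster else 0.0 for lab in labels]
    scored = [(mz, pearson(img, mask)) for mz, img in mz_images.items()]
    return sorted(
        ((mz, r) for mz, r in scored if r >= min_corr),
        key=lambda item: -item[1],
    )


# Toy data: four pixels, the last two belong to cluster 1 (made-up values).
labels = [0, 0, 1, 1]
mz_images = {
    5000.0: [9, 8, 1, 0],   # high outside the region
    7000.0: [0, 1, 8, 9],   # high inside the region
    9000.0: [5, 5, 5, 5],   # flat everywhere
}
hits = colocalized_masses(mz_images, labels, cluster=1)
```

Only the 7000.0 image follows the mask here; the anti-correlated and the flat image are both rejected.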

  25. Rat brain: the role of parameters: peak picking • 3 main parameters in addition to peak width • Portion of spectra considered for peak picking (each 10th spectrum) • Number of peaks selected for each spectrum (10 peaks) • Percentage of spectra where peak is found for consensus peak list (1%)

  26. Rat brain: the role of parameters: peak picking • Robust to changes of the second and third parameters • Panels: 5, 10, 20 peaks per spectrum; consensus threshold 0.1%, 1%, 5%

  27. Rat brain: the role of parameters: peak picking • Increase of parameter 1 can be compensated by a higher value for parameter 2 • Panels: each 20th spectrum; each 5th spectrum

  28. Rat brain: the role of parameters: denoising and number of clusters • Segmentation maps for • 3 levels of denoising (0.6, 0.7, 0.8) • 3 numbers of clusters (6, 8, 10) • Decreasing the number of clusters merges features • Too much denoising causes loss of structural details

  29. Rat brain: the role of parameters: denoising and number of clusters

  30. Human neuroendocrine tumor dataset

  31. Human neuroendocrine tumor dataset

  32. Outline • Background on MS Imaging and goals of paper • Methods • Results • Conclusions and Criticism

  33. Conclusions • Peak picking: usually done on mean spectrum • 1% consensus better for peaks in small spatial area • Edge-preserving denoising • One study with a moving-average window and one study post hoc to improve classification • Clustering methods • HDDC gives better results than k-means but is significantly slower • Currently, mostly hierarchical clustering is used, which is memory intensive • Importance to cancer studies • Represents a proteomic functional topographic map

  34. Criticism • Didn’t explain why they discarded part of the mass range for which the data was acquired • Dataset reduction by peak picking • Because it is done initially on a per-spectrum basis, it may discard lower-abundance peaks that still show interesting images • Also, because a peak must be present in 1% of the 10% of selected spectra, smaller regions of interest can be missed if the 10% selection is unlucky • Highly parameterized + slow running time would make it hard to run many trials

  35. Thank you

  36. TV-minimization (Grasmair slides)

  37. TV-minimization (Grasmair slides)