Outline

Spatial segmentation of imaging mass spectrometry data with edge-preserving image denoising and clustering. Theodore Alexandrov, Michael Becker, Sören Deininger, Günther Ernst, Liane Wehder, Markus Grasmair, Ferdinand von Eggeling, Herbert Thiele, and Peter Maass.

Presentation Transcript


  1. Spatial segmentation of imaging mass spectrometry data with edge-preserving image denoising and clustering • Theodore Alexandrov, Michael Becker, Sören Deininger, Günther Ernst, Liane Wehder, Markus Grasmair, Ferdinand von Eggeling, Herbert Thiele, and Peter Maass

  2. Outline • Background on MS Imaging and goals of paper • Methods • Results • Conclusions and Criticism

  3. Outline • Background on MS Imaging and goals of paper • Methods • Results • Conclusions and Criticism

  4. Background: what is MS imaging? • In the words of the All-Mighty Wikipedia: • Mass spectrometry imaging is a technique used in mass spectrometry to visualize the spatial distribution of e.g. compounds, biomarkers, metabolites, peptides or proteins by their molecular masses. • Or in images:

  5. Goals of this paper: • To propose a new procedure for spatial segmentation of MALDI-imaging datasets. • This procedure clusters all spectra into different groups based on their similarity. • This partition is represented by a segmentation map, which helps to understand the spatial structure of the sample.
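A minimal sketch of what "segmentation map" means on slide 5: once every spectrum has a cluster label, the labels are placed back at the pixel positions where the spectra were acquired. The function name, the zero-based integer `(x, y)` coordinates, and the `background` value are assumptions made for illustration, not details from the paper.

```python
def segmentation_map(coords, labels, background=-1):
    """Arrange per-spectrum cluster labels on the 2D acquisition grid.

    coords : list of (x, y) integer pixel positions, one per spectrum
             (assumed zero-based; hypothetical input format)
    labels : list of cluster indices, one per spectrum
    """
    width = max(x for x, _ in coords) + 1
    height = max(y for _, y in coords) + 1
    # Pixels where no spectrum was acquired keep the background value.
    grid = [[background] * width for _ in range(height)]
    for (x, y), label in zip(coords, labels):
        grid[y][x] = label
    return grid


# Three spectra on a 2x2 raster; the pixel (0, 1) was not measured.
example = segmentation_map([(0, 0), (1, 0), (1, 1)], [0, 1, 1])
```

Displaying `example` with one color per label gives the kind of map shown on the next slide.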

  6. Goal: in images… (it is MS Imaging after all)

  7. Why? • Current multivariate algorithms (e.g. PCA) are not meant for MS data and cannot be used to directly interpret the data. • Current clustering algorithms do not take spatial information into account. • Here, we assume that spectra close to each other should be similar.

  8. Outline • Background on MS Imaging and goals of paper • Methods • Results • Conclusions and Criticism

  9. Overview: Pipeline

  10. Datasets • Rat brain coronal section • 80 µm raster • 200 laser shots per position; 20185 spectra • Data acquired: 2.5 kDa-25 kDa • Data considered: 2.5 kDa-10 kDa; 3045 points • Section of neuroendocrine tumor (NET) invading the small intestine • 50 µm raster • 300 laser shots per position; 27360 spectra • Data acquired: 1 kDa-30 kDa • Data considered: 3.2 kDa-18 kDa; 5027 points

  11. Spectra Preprocessing • Baseline correction • TopHat algorithm, minimal baseline width set to 10%, default in ClinProTools • No normalization • No binning • ASCII -> Matlab
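As a rough illustration of the baseline-correction step on slide 11, a morphological top-hat can be sketched as the spectrum minus its opening (a sliding minimum followed by a sliding maximum). The flat window, the interpretation of "10%" as a fraction of the spectrum length, and all names are assumptions; the ClinProTools implementation may differ.

```python
def _sliding(values, half, func):
    """Apply func (min or max) over a window of +/- half points."""
    n = len(values)
    return [func(values[max(0, i - half):min(n, i + half + 1)])
            for i in range(n)]


def tophat_baseline(intensities, width_fraction=0.10):
    """Return (corrected, baseline) for one spectrum.

    width_fraction is a hypothetical parameter: the window width as a
    fraction of the number of points in the spectrum.
    """
    half = max(1, int(len(intensities) * width_fraction) // 2)
    eroded = _sliding(intensities, half, min)   # erosion
    baseline = _sliding(eroded, half, max)      # dilation -> opening
    corrected = [v - b for v, b in zip(intensities, baseline)]
    return corrected, baseline


# Flat offset of 5 with one narrow peak of height 10 on top of it.
spectrum = [5.0] * 20
spectrum[10] = 15.0
corrected, baseline = tophat_baseline(spectrum)
```

Because the opening never exceeds the signal, the corrected intensities are non-negative, and features narrower than the window (the peaks) survive while the slowly varying baseline is removed.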

  12. Peak-Picking • Part 1: conventional peak picking applied to every 10th spectrum; select 10 peaks per spectrum. • Orthogonal Matching Pursuit (OMP) because it is fast and simple • Gaussian kernel deconvolution • Part 2: keep consensus peaks: • Only keep peaks that appear in at least 1% of the considered spectra • Omit spurious peaks
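The idea behind Part 1 can be sketched with plain matching pursuit over Gaussian-shaped atoms. This is a simpler, non-orthogonal relative of OMP: OMP additionally re-fits the coefficients of all selected atoms by least squares after each pick, which this sketch omits. The atom width `sigma`, the one-atom-per-sample dictionary, and the function names are assumptions for illustration only.

```python
import math


def gaussian_atom(n, center, sigma):
    """Unit-norm Gaussian peak shape sampled on n points."""
    atom = [math.exp(-0.5 * ((i - center) / sigma) ** 2) for i in range(n)]
    norm = math.sqrt(sum(a * a for a in atom))
    return [a / norm for a in atom]


def matching_pursuit_peaks(spectrum, n_peaks=10, sigma=2.0):
    """Greedily pick the positions of up to n_peaks Gaussian-shaped peaks."""
    n = len(spectrum)
    atoms = [gaussian_atom(n, c, sigma) for c in range(n)]
    residual = list(spectrum)
    picked = []
    for _ in range(n_peaks):
        # Correlate the residual with every atom and take the best match.
        scores = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        best = max(range(n), key=lambda c: scores[c])
        if scores[best] <= 0:
            break
        picked.append(best)
        # Subtract the explained part so the next pick finds a new peak.
        residual = [r - scores[best] * a
                    for r, a in zip(residual, atoms[best])]
    return picked


# Synthetic spectrum: a strong peak at index 20 and a weaker one at index 50.
synthetic = [10 * math.exp(-0.5 * ((i - 20) / 2.0) ** 2)
             + 5 * math.exp(-0.5 * ((i - 50) / 2.0) ** 2) for i in range(80)]
peaks = matching_pursuit_peaks(synthetic, n_peaks=2)
```

The strongest peak is picked first, which matches the slide's point that a fixed, small number of peaks per spectrum favors the major peaks.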

  13. Edge-preserving denoising of m/z images • Imaging dataset is a reduced datacube with 3 coordinates: x, y, m/z (reduced in m/z dimension by peak picking) • MALDI-imaging data is noisy • Must be able to keep fine anatomical or histological details • Grasmair modification of Total Variation minimizing Chambolle algorithm • Parameter θ between 0.5 and 1: smoothness of resulting image

  14. Edge-preserving denoising of m/z images • Total variation (TV) ~ sum of absolute differences between neighboring pixels • Chambolle algorithm searches for an approximation of the image with small TV • Chambolle algorithm => smoothness adjusted globally by manually choosing a parameter • Grasmair locally adapts denoising parameter of Chambolle
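To make slide 14 concrete, here is a sketch of total variation and of Chambolle's dual projection iteration, written for a 1D signal to keep it short. It uses one global parameter `lam` and is not the locally adaptive Grasmair variant used in the paper; `lam`, `tau`, and `n_iter` are hypothetical names, and `lam` is not the same quantity as the θ on slide 13.

```python
def total_variation(signal):
    """Sum of absolute differences between neighboring samples."""
    return sum(abs(b - a) for a, b in zip(signal, signal[1:]))


def _grad(u):
    # Forward differences; the last entry is 0 by convention.
    return [u[i + 1] - u[i] for i in range(len(u) - 1)] + [0.0]


def _div(p):
    # Discrete divergence, the negative adjoint of _grad (needs len(p) >= 2).
    n = len(p)
    return [p[0]] + [p[i] - p[i - 1] for i in range(1, n - 1)] + [-p[n - 2]]


def chambolle_tv_1d(signal, lam=1.0, tau=0.25, n_iter=300):
    """Approximate the minimizer of 0.5 * ||u - signal||^2 + lam * TV(u)."""
    n = len(signal)
    p = [0.0] * n  # dual variable, kept within [-1, 1] by the update below
    for _ in range(n_iter):
        d = _div(p)
        g = _grad([d[i] - signal[i] / lam for i in range(n)])
        p = [(p[i] + tau * g[i]) / (1.0 + tau * abs(g[i])) for i in range(n)]
    d = _div(p)
    return [signal[i] - lam * d[i] for i in range(n)]


# A step edge (0 -> 10) corrupted by alternating +/-0.5 noise.
noisy = [(0.0 if i < 10 else 10.0) + (0.5 if i % 2 == 0 else -0.5)
         for i in range(20)]
denoised = chambolle_tv_1d(noisy, lam=1.0)
```

In this toy case the oscillating noise is flattened while the large jump between the two plateaus is kept, which is the edge-preserving behavior the slide describes. A larger `lam` smooths more, and the locally adaptive variant replaces that single global value with a spatially varying one.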

  15. Clustering • Specify the number of clusters a priori • High Dimensional Discriminant Clustering (HDDC) • Available in a Matlab toolbox • Each cluster is modeled by a Gaussian distribution with its own covariance structure. • HDDC was developed for high-dimensional data (d > 10) • Note: In Matlab HDDC = high-dimensional data clustering
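HDDC itself is too involved for a short example, so the sketch below uses plain k-means as a named stand-in (the conclusions compare HDDC against it). It only illustrates the interface the pipeline needs: feature vectors in, a number of clusters fixed a priori, one label per spectrum out. The farthest-point initialization and all names are assumptions, and k-means does not model a per-cluster covariance structure the way HDDC does.

```python
def _sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=50):
    """Cluster feature vectors into k groups; returns (labels, centers)."""
    # Deterministic farthest-point initialization (an assumption made
    # here so that the example is reproducible).
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(_sqdist(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda j: _sqdist(p, centers[j]))
                  for p in points]
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Two well-separated groups of toy "spectra" with two features each.
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(toy, k=2)
```

In the pipeline, each point would be the vector of denoised intensities of one pixel across the consensus peaks (110 values for the rat brain dataset).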

  16. Outline • Background on MS Imaging and goals of paper • Methods • Results • Conclusions and Criticism

  17. Rat brain: peak picking • used 2019 spectra out of 20185 (10%) • potential peaks: 373 peaks (red triangles) • consensus peaks: 110 peaks (green triangles) • Present in at least 20 spectra out of the 2019 (1%) • Discarded peaks mostly in low m/z regions • Hypothesize they are noise peaks because MALDI imaging spectra have high baseline in low m/z region.
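The numbers on slide 17 can be checked with a short sketch of the consensus step. It assumes that peaks from different spectra have already been mapped to a common m/z index and that the 1% threshold is rounded down (which reproduces the "at least 20 out of 2019" on the slide); the function name and input format are hypothetical.

```python
from collections import Counter


def consensus_peaks(peak_lists, min_fraction=0.01):
    """Keep peaks found in at least min_fraction of the spectra.

    peak_lists : one list of peak positions (m/z indices) per spectrum
    """
    # Count each peak at most once per spectrum.
    counts = Counter(mz for peaks in peak_lists for mz in set(peaks))
    min_count = max(1, int(min_fraction * len(peak_lists)))
    return sorted(mz for mz, c in counts.items() if c >= min_count)


# Every 10th spectrum out of 20185 gives 2019 spectra; 1% of those is 20.
n_considered = len(range(0, 20185, 10))
threshold = int(0.01 * n_considered)

# Toy case: 25 spectra, threshold 10% -> a peak must occur at least twice.
toy_lists = [[100] for _ in range(25)]
toy_lists[0] = [100, 200, 300]
toy_lists[1] = [100, 200]
kept = consensus_peaks(toy_lists, min_fraction=0.1)
```

Peak 300 occurs in a single spectrum and is dropped as spurious, while 200 just reaches the threshold and is kept.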

  18. Rat brain: peak picking • OMP successfully detects major peaks • Gaussian function provides reasonable approximation of peak shape

  19. Rat brain: noise in MALDI-imaging • Strong noise • Noise variance changes within m/z image and between m/z images • Noise variance is linearly proportional to peak intensity

  20. Rat brain: noise in MALDI-imaging

  21. Edge-preserving denoising • Apply Grasmair method to selected 110 consensus peaks • Efficiently removes the noise while not smoothing out edges

  22. Rat brain: segmentation map • Shows anatomical features • Restricted to spatial resolution of MALDI-imaging dataset

  23. Rat brain: importance of edge-preserving denoising • No denoising: borders do not match as well • 3x3 median smoothing: bad edge preservation • 5x5 median smoothing: lose many regions
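For comparison, the 3x3 median smoothing mentioned on slide 23 can be sketched as below. The replicate-border handling and the function name are assumptions. The toy image shows one way fine details get lost: a structure that is only one pixel wide is removed entirely.

```python
from statistics import median


def median_filter_3x3(image):
    """3x3 median filter on a 2D list, replicating border pixels."""
    h, w = len(image), len(image[0])

    def px(r, c):
        # Clamp indices so that border pixels are repeated outward.
        return image[min(max(r, 0), h - 1)][min(max(c, 0), w - 1)]

    return [[median(px(r + dr, c + dc)
                    for dr in (-1, 0, 1) for dc in (-1, 0, 1))
             for c in range(w)]
            for r in range(h)]


# A one-pixel-wide vertical line in an otherwise empty 5x5 m/z image.
thin_line = [[1 if c == 2 else 0 for c in range(5)] for _ in range(5)]
smoothed = median_filter_3x3(thin_line)
```

Every 3x3 window contains at most three line pixels out of nine, so the median is 0 everywhere and the line disappears. A 5x5 window removes even wider structures, consistent with the slide's remark that many regions are lost.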

  24. Rat brain: co-localized masses • Find mass values expressed in region

  25. Rat brain: the role of parameters: peak picking • 3 main parameters in addition to peak width • Portion of spectra considered for peak picking (every 10th spectrum) • Number of peaks selected for each spectrum (10 peaks) • Percentage of spectra where a peak is found for the consensus peak list (1%)

  26. Rat brain: the role of parameters: peak picking • Robust to changes of the second and third parameters • (Figure panels: 5, 10, 20 peaks per spectrum; consensus thresholds of 0.1%, 1%, 5%)

  27. Rat brain: the role of parameters: peak picking • An increase of parameter 1 can be compensated by a higher value for parameter 2 • (Figure panels: every 20th spectrum; every 5th spectrum)

  28. Rat brain: the role of parameters: denoising and number of clusters • Segmentation maps for • 3 levels of denoising (0.6, 0.7, 0.8) • 3 numbers of clusters (6, 8, 10) • Decreasing the number of clusters merges features • Too much denoising causes loss of structural details

  29. Rat brain: the role of parameters: denoising and number of clusters

  30. Human neuroendocrine tumor dataset

  31. Human neuroendocrine tumor dataset

  32. Outline • Background on MS Imaging and goals of paper • Methods • Results • Conclusions and Criticism

  33. Conclusions • Peak picking: usually done on the mean spectrum • 1% consensus is better for peaks in a small spatial area • Edge-preserving denoising • One study with an average moving window and one study post hoc to improve classification • Clustering methods • HDDC gives better results than k-means but is significantly slower • Currently, mostly hierarchical clustering = memory intensive • Importance to cancer studies • Represents a proteomic functional topographic map

  34. Criticism • The authors did not explain why they discarded part of the range for which the data was acquired • Dataset reduction by peak picking • Because it is done initially on a per-spectrum basis, it may discard lower-abundance peaks that still show an interesting image • Also, because a peak must be present in 1% of the 10% selected spectra, smaller regions of interest can be missed if the 10% selection is unlucky • Highly parameterized + slow running time would make it hard to run many trials

  35. Thank you

  36. TV-minimization (Grasmair slides)

  37. TV-minimization (Grasmair slides)
