
Canadian Bioinformatics Workshops

Module 3: Metabolite Identification and Annotation – Part II. David Wishart. Informatics and Statistics for Metabolomics, May 3-4, 2012. www.bioinformatics.ca.



Presentation Transcript


  1. Canadian Bioinformatics Workshops www.bioinformatics.ca

  2. Module #: Title of Module 2

  3. Module 3 Metabolite Identification and Annotation – Part II David Wishart Informatics and Statistics for Metabolomics May 3-4, 2012

  4. Goal of Metabolite Annotation [figure: NMR spectrum, ppm axis from 7 to 1]

  5. Metabolite ID by Spectral Deconvolution (NMR) [figure: mixture spectrum decomposed into the spectra of Compound A, Compound B and Compound C]
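
The idea of spectral deconvolution can be illustrated with a small sketch. It assumes the mixture spectrum is a linear combination of known reference spectra and recovers the contribution of each compound by ordinary least squares. The peak positions, the Lorentzian line shape and the names `lorentzian` and `deconvolve` are made up for illustration; tools such as Chenomx or AutoFit use more elaborate fitting than this.

```python
import numpy as np

def lorentzian(ppm, center, width=0.02):
    """Unit-height Lorentzian line centred at `center` (ppm)."""
    return 1.0 / (1.0 + ((ppm - center) / width) ** 2)

ppm = np.linspace(0.0, 8.0, 2000)

# Hypothetical reference library: peak positions are invented, not real shifts.
library = {
    "Compound A": lorentzian(ppm, 1.3) + lorentzian(ppm, 4.1),
    "Compound B": lorentzian(ppm, 2.0) + lorentzian(ppm, 3.2),
    "Compound C": lorentzian(ppm, 7.3),
}

def deconvolve(mixture, library):
    """Least-squares estimate of each reference spectrum's weight in the mixture."""
    names = list(library)
    basis = np.column_stack([library[n] for n in names])
    coeffs, *_ = np.linalg.lstsq(basis, mixture, rcond=None)
    return dict(zip(names, coeffs))

# Synthetic mixture: 2 parts A, no B, 0.5 parts C.
mixture = 2.0 * library["Compound A"] + 0.5 * library["Compound C"]
estimate = deconvolve(mixture, library)
```

On real spectra the coefficients would additionally need a non-negativity constraint and tolerance for small chemical shift changes, which this sketch leaves out.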

  6. Alternatives to Chenomx • AMIX (Bruker) • AutoFit (automated fitting) • MetaboMiner (2D NMR) • HMDB (NMR spectral match) • PRIMe SpinAssign (NMR spectral matching server) • rNMR and BMRB Peaks Server • CCPN-MP

  7. AutoFit - Automated NMR Profiling

  8. Performance of AutoFit [figure panels: synthetic and real spectra] P. Mercier et al. J Biomol NMR. 2011 Apr;49(3-4):307-23

  9. NMR Compound ID from Mixtures - MetaboMiner [figure: raw TOCSY spectrum and ID’d compounds] http://wishart.biology.ualberta.ca/metabominer/

  10. MetaboMiner Software Design • Standard reference libraries • 225 TOCSY spectra • 488 HSQC spectra • Specialized sub-libraries for CSF, plasma and urine • Algorithms for automatic processing & compound identification • “Minimal signature peaks” • 1D 1H peak list as sanity check • Extra dimensional information for identification • Support for direct spectral annotation

  11. MetaboMiner Performance

  12. NMR Compound ID - HMDB • Peak list from the NMR spectrum of the mixture is submitted to HMDB (http://www.hmdb.ca) • High scoring matches: Phenyllactate, Phenylpyruvate, Phenylacetic acid, Tropic acid, Benzyl alcohol, …

  13. PRIMe SpinAssign http://prime.psc.riken.jp/?action=nmr_search

  14. rNMR http://rnmr.nmrfam.wisc.edu/

  15. BMRB Peaks Server http://www.bmrb.wisc.edu/metabolomics/query_metab.php

  16. CCPN - MP http://www.ccpn.ac.uk/ccpn/projects/metabolomics/

  17. Metabolite ID by GC-MS [figure: GC-MS total ion chromatogram]

  18. Recall: EI-MS Generates Multiple Peaks • EI breaks up molecules in predictable ways [figure label: molecular ion]

  19. GC-MS Spectrum

  20. Recall: GC-MS Analytes are Derivatized [figure label: methoxime]

  21. Metabolite ID by GC-MS • GC-MS is often best for identification of amino acids, organic acids, sugars, fatty acids and molecules with MW<500 • GC has higher resolution and reproducibility than LC • EI-MS is more standardized than soft ionization methods, so EI spectra are more comparable • Most common route is to use AMDIS + NIST database

  22. NIST 11 MS Database • 243,893 EI spectra of 212,961 cmpds • 9934 ion trap MS for 4649 cmpds • 91,557 Qtof & QqQ spectra for 3774 compounds • 224,038 RI values for 21,847 cmpds

  23. NIST MS Search Software

  24. AMDIS (Automated Mass Spectral Deconvolution and Identification System) • Noise analysis: determines background noise level • Component perception: identifies peaks by comparing to noise • Spectral deconvolution: generates a “clean” or model spectrum • Compound identification: identifies compounds via a library search using a match factor

  25. Match Factor (MF) • Measures the similarity of the MS spectrum of the query to the MS spectrum in the reference database • Defined as the normalized dot product of the query and the reference spectra, where Iref corresponds to the intensities of the reference spectrum, Iqry corresponds to the intensities of the query spectrum, M corresponds to the masses (m/z), and w is a weighting term to penalize uncertain peaks
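
A normalized dot product of this kind can be sketched as follows. The exact weighting used by AMDIS and NIST MS Search is not given here, so the form below, MF = (Σ w·M·√(Iqry·Iref))² / (Σ w·M·Iqry · Σ w·M·Iref), is an assumption chosen because it is bounded between 0 and 1 and is insensitive to overall intensity scaling. The function name `match_factor` and the example intensities are hypothetical.

```python
import math

def match_factor(query, reference, weights=None):
    """Mass-weighted, normalized dot product of two centroided spectra.

    query, reference: dicts mapping m/z (M) to intensity (Iqry, Iref).
    weights: optional dict mapping m/z to w in [0, 1]; missing peaks get w = 1.
    Returns a value in [0, 1]; multiply by 100 for a percentage.
    """
    weights = weights or {}
    cross = q_norm = r_norm = 0.0
    for m in set(query) | set(reference):
        w = weights.get(m, 1.0)
        iq = query.get(m, 0.0)      # intensity in the query spectrum
        ir = reference.get(m, 0.0)  # intensity in the reference spectrum
        cross += w * m * math.sqrt(iq * ir)
        q_norm += w * m * iq
        r_norm += w * m * ir
    if q_norm == 0.0 or r_norm == 0.0:
        return 0.0
    return cross ** 2 / (q_norm * r_norm)

# Illustrative spectra (intensities invented for the example).
reference = {73: 100.0, 144: 80.0, 218: 20.0}
query = {73: 95.0, 144: 85.0, 100: 5.0}
mf_percent = 100.0 * match_factor(query, reference)
```

Identical spectra score 1 (100%), spectra with no peaks in common score 0, and down-weighting an uncertain peak through `weights` reduces its influence on both the numerator and the normalization.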

  26. GC-MS Protocol • Prepare a set of external n-alkane standards (8-9 n-alkanes spanning octane to hexadecane) and run as an external calibration standard • Run a “blank sample” containing just the solvent and derivatization agents • Run the sample of interest (under the same conditions as the blank)

  27. GC-MS Protocol External n-alkane standard used for RI calculation
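
The retention index (RI) calculation from an n-alkane ladder can be sketched with the linear interpolation commonly used for temperature-programmed GC: RI = 100 × [n + (N − n) × (t − t_n) / (t_N − t_n)], where n and N are the carbon numbers of the alkanes eluting just before and just after the analyte. Whether AMDIS applies exactly this formula is not stated here, and the function name and retention times below are hypothetical.

```python
from bisect import bisect_right

def retention_index(rt, alkane_rts):
    """Linear retention index of an analyte from bracketing n-alkanes.

    rt: retention time of the analyte.
    alkane_rts: dict mapping carbon number to the retention time of that
                n-alkane, measured under the same conditions as the sample.
    """
    if len(alkane_rts) < 2:
        raise ValueError("need at least two n-alkanes")
    ladder = sorted(alkane_rts.items(), key=lambda kv: kv[1])
    times = [t for _, t in ladder]
    if not times[0] <= rt <= times[-1]:
        raise ValueError("retention time outside the n-alkane calibration range")
    # Position of the alkane eluting just before (or together with) the analyte.
    i = min(bisect_right(times, rt) - 1, len(ladder) - 2)
    (n_lo, t_lo), (n_hi, t_hi) = ladder[i], ladder[i + 1]
    return 100.0 * (n_lo + (n_hi - n_lo) * (rt - t_lo) / (t_hi - t_lo))

# Hypothetical retention times (minutes) for octane, nonane and decane.
alkanes = {8: 4.0, 9: 6.0, 10: 8.0}
ri = retention_index(5.0, alkanes)  # halfway between C8 and C9
```

By construction each n-alkane has an RI of 100 times its carbon number, so octane is 800 and hexadecane is 1600.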

  28. GC-MS Protocol • Create a calibration file using the n-alkane mixture (sets retention indices [RIs] to the standard values) • Analyze the sample data file against the CAL (calibration) file for the alkane mixture (sets and recalculates RIs using the n-alkanes) • Search the NIST database for matches and display the results of the search • Get rid of “false” positives by comparing the “blank” against the sample spectrum
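
The last step, removing “false” positives with the blank, can be sketched as a simple filter. It assumes that library hits from the sample and from the blank are available as (compound name, RI) pairs and that a hit counts as background when the same name appears in the blank within an RI tolerance. The function name, the tolerance of 5 RI units and the example hits are hypothetical, and a careful analysis would also compare the spectra and peak areas themselves.

```python
def remove_blank_hits(sample_hits, blank_hits, ri_tolerance=5.0):
    """Keep only sample hits that do not also appear in the blank run.

    sample_hits, blank_hits: lists of (compound name, retention index) tuples.
    ri_tolerance: maximum RI difference for two hits to count as the same peak
                  (an assumed value, tune for the instrument and method).
    """
    kept = []
    for name, ri in sample_hits:
        in_blank = any(
            name == blank_name and abs(ri - blank_ri) <= ri_tolerance
            for blank_name, blank_ri in blank_hits
        )
        if not in_blank:
            kept.append((name, ri))
    return kept

# Invented example: one derivatization artefact shows up in both runs.
sample = [("Valine", 1210.0), ("Reagent artefact", 1105.0)]
blank = [("Reagent artefact", 1103.5)]
filtered = remove_blank_hits(sample, blank)
```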

  29. Step 1 – Create Calibration File (AMDIS)

  30. Step 2 – Calibrate Sample Spectrum Using CAL-file (AMDIS)

  31. Step 3 – Search NIST Database for Matches (AMDIS) [screenshot: GC peak list and EI-MS spectrum for the peak at 11.597]

  32. Step 3 – Search NIST Database for Matches (Zero in) • 73 & 144 are the 2 most abundant m/z • Peak spectrum: MF = 84% match to valine reference spectrum • Match factor ≥ 60% (if in doubt, compare “blank” and your signal)

  33. Other GC-MS Options • Alternatives to AMDIS • AnalyzerPro (SpectralWorks) • ChromaTOF (Leco) • Evaluated in TrAC Trends in Analytical Chemistry Volume 27, Issue 3, March 2008, Pages 215-227 • Alternatives to NIST08 or NIST11 • Golm Database (Open access) • FiehnLib (Leco, Agilent) • HMDB???

  34. The Golm Database • GC-MS (Quad and TOF) database • Contains MSRI (MS + retention index) or MST data for 1450 identified metabolites • Includes 10,336 spectra linked to analytes • Downloadable libraries compatible with NIST08 and AMDIS software • Primary focus on plant metabolites • Supports compound name and MS queries • MS submissions via NIST08 or AMDIS format

  35. Golm Database http://gmd.mpimp-golm.mpg.de/

  36. Golm Database

  37. The FiehnLib GC-MS Database • 2212 EI MS spectra with RI data for quadrupole & TOF GC-MS • Over 1000 primary metabolites below 550 Da • Covers lipids, amino acids, fatty acids, amines, alcohols, sugars, amino-sugars, sugar alcohols, sugar acids, phosphates, hydroxyl acids, purines, and sterols

  38. Metabolite ID by LC-MS [figure: LC-MS total ion chromatogram]

  39. Levels of Metabolite Identification in MS • 4 levels of metabolite identification • Level 1: Positively identified compounds (confirmed by match to known standard) • Level 2: Putatively identified compounds (match to MS + RT or MS/MS + RT) • Level 3: Compounds putatively identified in a compound class • Level 4: Unknown compounds

  40. Metabolite ID by LC-MS • LC-MS is often best for identification of lipids, bases, amino acids, organic acids, fatty acids and other somewhat hydrophobic molecules • Metabolite ID typically requires both MS and MS/MS data (along with retention time information) and internal standards • Compound ID can be done by high accuracy mass matching and/or by MS/MS matching to spectral databases

  41. Simple MW Search DBs ChEBI (www.ebi.ac.uk/chebi/) PubChem (http://pubchem.ncbi.nlm.nih.gov/) ChemSpider (www.chemspider.com) HMDB (www.hmdb.ca)

  42. PubChem MW Search Available Under “Advanced Search”

  43. PubChem Results

  44. ChEBI MW Search http://www.ebi.ac.uk/chebi/advancedSearchForward.do

  45. Advanced MS Search DBs NIST/AMDIS (http://chemdata.nist.gov) Metlin (http://metlin.scripps.edu/) HMDB (www.hmdb.ca) MassBank (www.massbank.jp)

  46. Advanced MS Search DBs • These databases support not only MW or MW range searches, but also support parent ion searches (positive, negative, neutral), peak list searches (from MS or MS/MS data) as well as MS/MS spectral matching • These DBs are intended more for MS-based metabolomics and compound ID than the simple MW search tools

  47. MS Compound ID - HMDB • Peak list from the LC-MS spectrum is submitted to HMDB (http://www.hmdb.ca) • High scoring matches: Phenyllactate, Phenylpyruvate, Atrolactic acid, Homovanillin, Coumaric acid

  48. MS Compound ID - HMDB • Database of ~100,000 predicted masses from ~10,000 known metabolites • Includes adduct mass calculations for 30+ possible or expected metabolite adducts • Allows selection of different databases (DrugBank, HMDB, FooDB, T3DB), mass tolerance and ionization mode • Designed for mixture deconvolution (i.e. identification of multiple compounds at a time)
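
An adduct-aware mass search with a tolerance and an ionization mode can be sketched as below. This is not the HMDB implementation: the function name `match_mz`, the small adduct table and the two-compound candidate list are assumptions for illustration. The adduct shifts are the usual monoisotopic values for singly charged ions, and the candidate masses are monoisotopic masses computed from the molecular formulae C9H10O3 and C9H8O3.

```python
PROTON = 1.007276  # mass of a proton in Da

# Mass shift added to the neutral monoisotopic mass M for each adduct.
ADDUCTS = {
    "positive": {"[M+H]+": PROTON, "[M+Na]+": 22.989218, "[M+K]+": 38.963158},
    "negative": {"[M-H]-": -PROTON, "[M+Cl]-": 34.969402},
}

# Tiny illustrative candidate list (neutral monoisotopic masses in Da).
CANDIDATES = {
    "Phenyllactic acid": 166.062994,
    "Phenylpyruvic acid": 164.047344,
}

def match_mz(observed_mz, candidates=CANDIDATES, mode="positive", tol_ppm=10.0):
    """Return (name, adduct, error in ppm) for every candidate/adduct pair
    whose expected m/z lies within tol_ppm of the observed m/z."""
    hits = []
    for name, mono_mass in candidates.items():
        for adduct, shift in ADDUCTS[mode].items():
            expected = mono_mass + shift
            error_ppm = (observed_mz - expected) / expected * 1e6
            if abs(error_ppm) <= tol_ppm:
                hits.append((name, adduct, round(error_ppm, 2)))
    # Best (smallest absolute error) first.
    return sorted(hits, key=lambda h: abs(h[2]))

negative_hits = match_mz(165.055718, mode="negative")
positive_hits = match_mz(189.0522, mode="positive")
```

Running the search over every peak in a peak list gives the mixture-style identification described above. Several candidates can fall within the tolerance of one peak, so accurate mass alone generally yields putative rather than positive identifications.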
