
MS/MS Libraries of Identified Peptides and Recurring Spectra in Protein Digests

Lisa Kilpatrick, Jeri Roth, Paul Rudnick, Xiaoyu Yang, Steve Stein. Mass Spectrometry Data Center.


Presentation Transcript


  1. MS/MS Libraries of Identified Peptides and Recurring Spectra in Protein Digests Lisa Kilpatrick, Jeri Roth, Paul Rudnick, Xiaoyu Yang, Steve Stein Mass Spectrometry Data Center

  2. Library searching is not new. Organize for reuse.

  3. MS Library Searching • Hertz, Hites and Biemann, Anal. Chem. (1971). • PBM: McLafferty, Hertel, Villwock, Org. Mass Spectrom. (1974). • SISCOM: Damen, Henneberg, Weimann, Anal. Chim. Acta (1978). • INCOS: Sokolow, Karnofsky, Gustafson, Finnigan Application Report 2 (March 1978). • Stein, Scott, J. Amer. Soc. Mass Spectrom. (1994).

  4. ‘Dot Product’ (cosine of the ‘angle’ between a pair of spectra): sum over all peaks in common, then normalize • Measured = f(m/z, abundance) • Reference = f(m/z, abundance) • f(abundance): weight as you like
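The dot product on this slide can be illustrated with a minimal sketch. It is not the scoring code used for the libraries described here: the unit-width m/z binning, the function names, and the default weighting (square root of abundance, no m/z weighting) are all assumptions chosen for illustration, in keeping with the slide's "weight as you like".

```python
import math


def weighted_vector(peaks, mz_power=0.0, abundance_power=0.5):
    """Bin (m/z, abundance) peaks to the nearest integer m/z and weight them.

    The weighting f = m/z**mz_power * abundance**abundance_power is an
    illustrative assumption, not a value taken from the slides.
    """
    binned = {}
    for mz, abundance in peaks:
        mz_bin = round(mz)
        binned[mz_bin] = binned.get(mz_bin, 0.0) + abundance
    return {b: (b ** mz_power) * (a ** abundance_power) for b, a in binned.items()}


def dot_product_score(measured, reference, **weights):
    """Cosine of the angle between two weighted spectra (0 = no overlap, 1 = identical)."""
    m = weighted_vector(measured, **weights)
    r = weighted_vector(reference, **weights)
    norm = math.sqrt(sum(v * v for v in m.values())) * math.sqrt(
        sum(v * v for v in r.values())
    )
    if norm == 0:
        return 0.0
    # Only peaks present in both spectra contribute to the numerator.
    shared = sum(m[b] * r[b] for b in m.keys() & r.keys())
    return shared / norm


measured = [(175.1, 100.0), (262.2, 40.0), (375.3, 10.0)]
reference = [(175.1, 300.0), (262.2, 120.0), (375.3, 30.0)]
score = dot_product_score(measured, reference)
```

Because the score is normalized, the two example spectra, which differ only by a constant abundance factor, score as identical.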

  5. Traditional GC/MS Library Search

  6. Variability Depends on S/N • ~7,000 Radiodurans peptides, LCQ (PNNL/NCRR) • Medians

  7. Library Searching for Peptides • LIBQUEST (Yates) • Yates et al., Anal. Chem. 1998, 70, 3557 • X!Hunter (Beavis) • Craig et al., J. Proteome Res. 2006, 5, 1843 • BiblioSpec (MacCoss) • Frewen et al., Anal. Chem. 2006, 78, 5678 • Spectral Comparison (Kearney) • Liu et al., Proteome Science 2007, 5:3 • SpectraST (Aebersold) • Lam et al., Proteomics 2007, 7, 655-667 • NIST Peptide Ion Fragmentation Library • June 2006 release (US-HUPO – March 2004)

  8. Why Spectrum Libraries? • More sensitive • Better scoring • Faster • Annotation • Unrestricted precursor ion

  9. Identification by Spectrum Matching is More Sensitive than by Spectrum/Sequence Matching Simple Protein Mix

  10. Spectrum/Spectrum Scores are More Robust than Sequence/Spectrum Scores 99% Confidence Sequence score

  11. Matching Spectra is Faster than Matching Sequence: 0.005 s vs. 6.2 s per query spectrum

  12. Reference Library Building • Extract identified spectra from sequence search • Multiple search engines • Instrument-class specific • Create ‘consensus’ spectra • Two or more matching spectra, also save best • Assign probability of being correct • Refine confidence starting from decoy FDR • Classify peptides – tryptic, missed cleavage, semi, mods • Create searchable spectral library • Resolve conflicts, add annotation
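The "create consensus spectra" step can be sketched as follows. This is only one plausible way to merge replicate spectra, not the method used to build the library: the unit-width m/z bins, base-peak normalization, and the `min_fraction` rule for keeping a peak are assumptions made for illustration. The only detail taken from the slide is that a consensus needs two or more matching spectra.

```python
from collections import defaultdict


def consensus_spectrum(replicates, min_fraction=0.5):
    """Average replicate spectra of one peptide ion into a consensus spectrum.

    replicates: list of spectra, each a non-empty list of (m/z, abundance).
    A peak is kept only if it occurs in at least `min_fraction` of the
    replicates (an assumed rule, not taken from the slides).
    """
    if len(replicates) < 2:
        raise ValueError("a consensus needs two or more matching spectra")

    sums = defaultdict(float)
    counts = defaultdict(int)
    for spectrum in replicates:
        binned = {}
        for mz, abundance in spectrum:
            mz_bin = round(mz)
            binned[mz_bin] = binned.get(mz_bin, 0.0) + abundance
        if not binned:
            continue
        base_peak = max(binned.values())
        for mz_bin, abundance in binned.items():
            # Normalize to the base peak so every replicate has equal weight.
            sums[mz_bin] += abundance / base_peak
            counts[mz_bin] += 1

    n = len(replicates)
    return {b: sums[b] / n for b in sums if counts[b] / n >= min_fraction}


replicates = [
    [(100.0, 10.0), (200.0, 5.0)],
    [(100.2, 20.0), (200.1, 10.0), (300.0, 2.0)],
    [(99.9, 30.0), (200.0, 15.0)],
]
consensus = consensus_spectrum(replicates)
```

In the example, the peak near m/z 300 appears in only one of three replicates and is dropped from the consensus.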

  13. Three Classes of Libraries I. Conventional Target Identification • Peptides (Proteins) II. Identifiable • By unconventional searching III. Not Identifiable • Account for all recurring spectra • QA/QC

  14. I. OMSSA overlap with MS/MS Library Search • Identified spectra (1% FDR) for 1-D Yeast, NCI/CPTAC – Vanderbilt • [Figure: overlap diagram; values 1350, 747, 353, 1752, 318, 833; labels 78K 6/07 and 34K 6/06]

  15. II. Identify What We Can: Derive Class-specific FDR • Tryptic • Simple • Expected missed cleavages • Unexpected missed cleavages • Semitryptic (cleaved tryptic) • No missed cleavage • In source (with parent at same retention) • In sample • Missed cleavage • In source (with parent) • In sample (obey rules) • Uncommon – reject • Others …
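A class-specific FDR can be sketched with a standard target/decoy estimate applied separately to each peptide class. The slides do not give the actual procedure, so the decoys/targets estimator, the function name, and the input layout below are assumptions for illustration only.

```python
def class_specific_thresholds(hits, max_fdr=0.01):
    """Find, per peptide class, the lowest score cutoff with estimated FDR <= max_fdr.

    hits: iterable of (peptide_class, score, is_decoy).
    FDR is estimated as decoys / targets among hits at or above the cutoff
    (a common target/decoy estimate, assumed here rather than taken from
    the slides). Returns {peptide_class: cutoff or None}.
    """
    by_class = {}
    for peptide_class, score, is_decoy in hits:
        by_class.setdefault(peptide_class, []).append((score, is_decoy))

    thresholds = {}
    for peptide_class, rows in by_class.items():
        rows.sort(key=lambda row: row[0], reverse=True)
        targets = decoys = 0
        cutoff = None
        for score, is_decoy in rows:
            if is_decoy:
                decoys += 1
            else:
                targets += 1
            if targets and decoys / targets <= max_fdr:
                cutoff = score
        thresholds[peptide_class] = cutoff
    return thresholds


hits = [
    ("tryptic", 10, False), ("tryptic", 9, False), ("tryptic", 8, False),
    ("tryptic", 7, False), ("tryptic", 6, True), ("tryptic", 5, False),
    ("semitryptic", 10, False), ("semitryptic", 9, True), ("semitryptic", 8, False),
]
thresholds = class_specific_thresholds(hits, max_fdr=0.2)
```

With these toy hits the semitryptic class, which has proportionally more decoys, ends up with a stricter cutoff than the tryptic class; that is the point of deriving the FDR per class rather than over all hits pooled together.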

  16. Atypical Peptide Ions: Use Sequence Search Method • Tryptic only with many mods • Less common: Methylation, Phosphorylation, … • Artifacts: Na, K, Carbamyl • InsPecT/Pevzner (Unidentified, +70) • High charge states, >2 missed cleavages • Use class-specific score thresholds

  17. HSA/Fibrinogen/Transferrin Mix • 6124 Consensus Peptide Spectra: IT, Qtof, TofTof • Ion Trap Peptide Ions: 1300 HSA, 1100 Fibrinogen, 700 Transferrin

  18. contiguous = tryptic, exploded = semitryptic

  19. III. Library of Recurring, Unidentified Spectra • Create consensus spectra • From similar spectra from an experiment • Combine from multiple experiments • Identify spectra in other experiments • QA/QC: Artifacts, in standards, … • Apply other sequencing methods

  20. Assign all Spectra • Identified Spectrum • Matches library peptide or unidentified spectrum • Subset of peaks match library spectrum (impure) • Similar to a matched spectrum (cluster) • Not a Peptide • Low S/N • Maximum/Median <15 • High charge state (many large peaks) • Proteins, large fragments, … • One dominant peak • Stable ion, not peptide • Singly charged (high/low abund < 1.2) • Probable artifact, lower probability of identification • Narrow m/z range • Peptide?
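Two of the "not a peptide" checks above can be sketched as a simple filter over peak abundances. Only the Maximum/Median < 15 cutoff comes from the slide; the `dominant_fraction` threshold, the labels, and the order of the checks are assumptions for illustration, and the charge-state and m/z-range tests are not attempted here.

```python
from statistics import median


def classify_unmatched(abundances, max_median_cutoff=15.0, dominant_fraction=0.8):
    """Label a spectrum that matched nothing in the library.

    abundances: list of peak abundances for one spectrum.
    max_median_cutoff: the slide's low-S/N rule (Maximum/Median < 15).
    dominant_fraction: assumed share of total abundance above which a
        single peak counts as 'dominant' (not a value from the slides).
    """
    total = sum(abundances)
    if not abundances or total <= 0:
        return "empty"
    peak_max = max(abundances)
    if peak_max / total >= dominant_fraction:
        return "one dominant peak"
    mid = median(abundances)
    if mid > 0 and peak_max / mid < max_median_cutoff:
        return "low S/N"
    return "possible peptide"


labels = [
    classify_unmatched([1000, 1, 1, 1, 1]),
    classify_unmatched([10, 9, 8, 7, 6, 5]),
    classify_unmatched([100, 60, 40, 5, 4, 3, 2, 2, 1]),
]
```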

  21. exploded = identified, contiguous = unidentified

  22. exploded = identified, contiguous = unidentified

  23. Library Pipeline of the Future • [Flowchart labels: Mass spectrometer; Garbage filter; Pep. Lib; Unass. Lib; Sequence Search, De Novo, Theoretical Spec, Similarity, ...; outcomes: assigned, No ID, unassigned]

  24. NCI/NIH - CPTAC: Clinical Proteomic Technology Assessment for Cancer http://proteomics.cancer.gov Technology assessment; develop standard protocols and clinical reference sets; and evaluate methods to ensure data reproducibility. Broad Institute of MIT and Harvard, Memorial Sloan-Kettering Cancer Center, Purdue University, University of California, San Francisco, and Vanderbilt University School of Medicine. NCI grants (U24CA126476-01, U24CA126485-01, U24CA126480-01, U24CA126477-01, and U24CA126479-01).

  25. Run-to-Run Chromatographic Reproducibility

  26. Lab-to-Lab Chromatography: YICENQDSISSK • [Figure panels: INCAPS LTQ, Broad Orbitrap, Purdue LTQ, Vandy Orbitrap, Vandy LTQ, NYU Orbitrap, NIST LTQ]

  27. HSA_CAM_SigmaA9511_5H_8MS2_m2_10de_040406_05

  28. Measures of Reproducibility • Identified ions • Unique peptides, Ions, Spectrum counts • Unidentified components • Classify by type, link to origin • Ion cluster analysis • MS1 linked to MS2 • Chromatography • Time evolution of ion clusters

  29. Ion Component Analysis

  30. Ion Component Analysis (Yeast)

  31. Components in Replicate Runs • [Figure legend: ▲▼ run 1, 2; ■ in both; series: total, sampled, identified]
