
Modified Peptide MS/MS Interpretation Karl R. Clauser Broad Institute of MIT and Harvard

Modified Peptide MS/MS Interpretation Karl R. Clauser Broad Institute of MIT and Harvard Bioinformatics for Protein Identification ASMS Fall Workshop Baltimore, MD November 5-6, 2009. Outline. Fixed, variable, mix modifications and search space Multiple rounds of searching


Presentation Transcript


  1. Modified Peptide MS/MS Interpretation Karl R. Clauser Broad Institute of MIT and Harvard Bioinformatics for Protein Identification ASMS Fall Workshop Baltimore, MD November 5-6, 2009

  2. Outline • Fixed, variable, mix modifications and search space • Multiple rounds of searching • Diagnostic marker ions for modifications • Data acquisition methods specific for modifications • Ambiguity in localizing phosphorylation sites • Sample handling chemistry artifacts • Resources for masses/descriptions of known modifications

  3. Fixed, Mix and Variable Modifications • Variable: Allow 2 possibilities for an AA. Allow both in 1 spectrum if more than one location/AA. • Fixed: Redefine the wild type as the modified form. • Mix: Search in 2 cycles. Cycle 1: all KR light. Cycle 2: all KR heavy. DO NOT allow both light and heavy in 1 spectrum.

  4. Variable Modifications Expand the Search Space • Fixed mods only: calculate MH+ with fixed mods only, then apply the precursor mass tolerance filter. • Allow variable mods: apply a precursor MH+ shift range filter and an AA composition filter; for candidates passing the precursor mass filter, calculate MH+ for the variable mod combinations and apply the tolerance filter. • [Figure: precursor MH+ shifts of -256, -176, -160, -97, -81, -80, -32, -16, -2, -1, 0 and 17 mapped to variable mod combinations built from ST, M, N and ^Q counts (3ST, 2ST, 1ST, 2M, 1M, 2N, 1N, ^Q).]
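
To make the growth of the search space concrete, here is a minimal sketch (not taken from any of the search engines named in this talk) that enumerates variable modification combinations and the precursor mass shift each one implies. The monoisotopic deltas are standard values; the per-peptide maxima and the function names are assumptions chosen for illustration.

```python
from itertools import product

# Standard monoisotopic mass deltas in Da.
VARIABLE_MODS = {
    "phospho(ST)": 79.96633,
    "oxidation(M)": 15.99491,
    "deamidation(N)": 0.98402,
}


def mod_combinations(max_counts):
    """Yield (counts, total_shift) for every allowed combination of mods."""
    names = list(max_counts)
    ranges = [range(max_counts[name] + 1) for name in names]
    for counts in product(*ranges):
        shift = sum(n * VARIABLE_MODS[name] for n, name in zip(counts, names))
        yield dict(zip(names, counts)), shift


def candidate_unmodified_masses(precursor_mh, max_counts):
    """Unmodified MH+ values a candidate peptide may have for one precursor."""
    return sorted(
        {round(precursor_mh - shift, 4) for _, shift in mod_combinations(max_counts)}
    )


if __name__ == "__main__":
    # Hypothetical limits: up to 3 phospho, 2 oxidation, 2 deamidation.
    limits = {"phospho(ST)": 3, "oxidation(M)": 2, "deamidation(N)": 2}
    combos = list(mod_combinations(limits))
    print(len(combos), "combinations to test per precursor")
    for counts, shift in combos[:5]:
        print(counts, round(shift, 4))
```

Each extra modification type multiplies the number of unmodified masses that must pass the precursor filter, which is why the slide pairs the shift range filter with an AA composition filter.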

  5. Methods of Constraining Allowed # of Modifications/Peptide • Parent mass shift range: Spectrum Mill • Max number of mods/peptide: Sequest (all mods have same max); X!Tandem (?); Phenyx (each mod can have different max) • Max permutations of mods/peptide: Mascot (cap on permutations/peptide) • Candidate sequence contains sequence tags present in spectrum: Protein Pilot/Paragon

  6. Multiple Rounds of Searching • Round 1: search all proteins; get high confidence peptide hits; 0-1 missed cleavages; minimal number of variable AA modifications • Round 2: limit the search to proteins identified in round 1; semi-/un-specific cleavage; increase the number of modifications; allow for AA substitutions; allow for undefined modifications • Alternate names for similar concept: X!Tandem: refinement; Mascot: Error tolerant; Spectrum Mill: search saved hits, homology mode, unassigned single mass gap; Phenyx: 2-rounds; ProteinPilot/Paragon: thorough ID, fraglet-taglet

  7. X!Tandem - Refinement Search

  8. Mascot - Error Tolerant Search Creasy DM, and Cottrell JS. (2002) Proteomics 2, 1426-1434.

  9. Mascot - Error Tolerant Search Result

  10. Spectrum Mill - Unassigned Single Mass Gap Search

  11. Spectrum Mill - Unassigned mass gap • Wide open precursor mass filter coupled with complementary ion principle • 0 sequence mismatches: bi, bi*, yj, yj*, Pe, Pdb match • 1 sequence mismatch at A: bi*, yj match • 2 sequence mismatches at A and P: yj matches • [Figure: example peptide S A M P L E R with b ions numbered 1-6 and y ions numbered 6-1; spectrum of Relative Abundance vs. Mass (m/z) marking bi, bi*, yj, yj*, Pe and Pdb, with the starred ions offset by the mass gap Pgap.]

  12. Spectrum Mill - Unassigned Single Mass Gap Result • b* ions: Removal of Met (-131.0405) + Acetylation (+42.0106) = -89.0299 • The b*-ions (b-ions plus the precursor mass shift) contain the modification and represent the complements of the detected y-ions. The absence of unmodified b-ions means that the modification is on the N-terminus. • Number of identifications below 5% FDR for particular mass gaps, from an Agilent 6520 Q-Tof LC-MS/MS dataset collected on a HeLa cell lysate digested with trypsin and separated on the basis of peptide isoelectric point into 24 fractions by off-gel electrophoresis:
      Mass Gap (Da) | # IDs | Presumed Modification
      -89 | 153 | Met loss + Acetylation
      -17 | 49 | pyro-Glu, pyro-CamC
      +16 | 12 | Oxidation
      +32 | 28 | Dioxidation
      +42 | 2 | Acetylation
      +57 | 62 | Overalkylation
      +80 | 7 | Phosphorylation
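
The complementary ion principle behind the b* ions can be checked numerically. The sketch below is an illustration only, not Spectrum Mill code: it uses the toy sequence SAMPLER from the previous slide, standard monoisotopic residue masses, and the -89.0299 Da gap quoted above applied to the N-terminus purely as a numeric example. The function names are hypothetical.

```python
PROTON = 1.007276
WATER = 18.010565
# Standard monoisotopic residue masses (Da) for the residues in SAMPLER.
RESIDUE = {
    "S": 87.03203, "A": 71.03711, "M": 131.04049, "P": 97.05276,
    "L": 113.08406, "E": 129.04259, "R": 156.10111,
}


def b_ions(seq, nterm_shift=0.0):
    """Singly charged b1..b(n-1); nterm_shift models a gap on the N-terminus."""
    ions, total = [], PROTON + nterm_shift
    for aa in seq[:-1]:
        total += RESIDUE[aa]
        ions.append(total)
    return ions


def y_ions(seq):
    """Singly charged y1..y(n-1), unaffected by an N-terminal gap."""
    ions, total = [], WATER + PROTON
    for aa in reversed(seq[1:]):
        total += RESIDUE[aa]
        ions.append(total)
    return ions


def precursor_mh(seq, shift=0.0):
    """Singly protonated precursor mass, optionally shifted by a mass gap."""
    return sum(RESIDUE[aa] for aa in seq) + WATER + PROTON + shift


def complements(observed_mh, ions):
    """Complementary ion principle: b_i + y_(n-i) = MH+ + one proton."""
    return [observed_mh + PROTON - m for m in ions]


if __name__ == "__main__":
    gap = -131.0405 + 42.0106  # Met loss + acetylation, as on the slide
    seq = "SAMPLER"
    observed = precursor_mh(seq, gap)
    # Complements of y1..y6 come out as b6*, b5*, ..., b1*.
    b_star = complements(observed, y_ions(seq))
    for star, plain in zip(reversed(b_star), b_ions(seq)):
        print(round(plain, 4), "->", round(star, 4))
```

Because the y ions do not contain the N-terminus, they match the database sequence unchanged, while their complements computed from the observed precursor land on the shifted b* series.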

  13. Phenyx: 2 Rounds

  14. Phenyx: Effect of the parameters for one protein • 1rnd, only 3 fixed mods: 131 valid, 75% cov. • 2rnd, add variable mods: 205 valid, 84% cov. • 2rnd, with all mods and half cleaved: 348 valid, 90% cov.

  15. Phenyx: Use the Annotation in SwissProt, TrEMBL • In the Feature Tables: sequence processing annotations (removal of signal peptides, removal of transit peptides, extraction of active chains); post-translational modifications; sequence variants (splicing variants, sequence mutations) • 57292 variants / 20328 human proteins

  16. Phenyx: Search Annotated PTMs in SwissProt 15 unique spectra

  17. Applied Biosystems ProteinPilot™ Software, Paragon™ Algorithm • Limited de novo sequencing generates Taglets • A large number of short sequence tags ('Taglets') are called. Each Taglet is rated with the chance it is correct, allowing a large number to be used but more likely Taglets to have more influence. • Example: S T G I I T H Y S A; Taglets: STI, TI, AS, YH, TIG, IT, SA, etc… • Shilov et al. Mol Cell Proteomics, 6:1638-1655, (2007).

  18. The Paragon™ Algorithm: Varying Search Space on a Continuum • Taglets for Sequence Temperature Value (STV) • Sequence Tags in Order of Decreasing Certainty: ST, TI, STI, AS, DI, DIN, SE, EQ, NA, SEQ • >DHE3_BOVIN (P00366) Glutamate dehydrogenase 1, mitochondrial precursor (EC 1.4.1.3) (GDH) MYRYLGEALLLSRAGPAALGSASADSAALLGWARGQPAAAPQPGLVPPARRHYSEAAADREDDPNFFKMVEGFFDRGASIVEDKLVEDLKTRETEEQKRNRVRSILRIIKPCNHVLSLSFPIRRDDGSWEVIEGYRAQHSQHRTPCKGGIRYSTDVSVDEVKALASLMTYKCAVVDVPFGGAKAGVKINPKNYTDNELEKITRRFTMELAKKGFIGPGVDVPAPDMSTGEREMSWIADTYASTIGHYDINAHACVTGKPISQGGIHGRISATGRGVFHGIENFINEASYMSILGMTPGFGDKTFVVQGFGNVGLHSMRYLHRFGAKCITVGESDGSIWNPDGIDPKELEDFKLQHGTILGFPKAKIYEGSILEVDCDILIPAASEKQLTKSNAPRVKAKIIAEGANGPTTPEADKIFLERNIMVIPDLYLNAGGVTVSYFEWLNNLNHVSYGRLTFKYERDSNYHLLMSVQESLERKFGKHGGTIPIVPTAEFQDRISGASEKDIVHSGLAYTMERSARQIMRTAMKYNLGLDLRTAAYVNAIEKVFRVYNEAGVTFT • [Figure labels: a segment with cold STV; a segment with warmer STV; the segment with the hottest STV in this protein.]

  19. Controlling Search Space with the Paragon™ Algorithm • Using feature probabilities avoids include/exclude decisions and simplistic rules. • When combined with STVs, search space is dynamic by spectrum and even segment of the database: try only most likely mods for ‘cold’ segments; try only more likely mods for ‘warm’ segments; try all mods for ‘hot’ segments in the database. • Same concept also used with digestion specificity, mass tolerances, etc. • [Figure: Probability of Feature on a scale from 0 to 1.0, listing iTRAQ on K, N-term; MMTS on C; deamidation on N,Q; oxidized M; iTRAQ on Y; pyroglutamic acid of E; dehydration of E,D.]

  20. Pause for Questions

  21. Diagnostic Marker Ions for Modifications (Immonium ions and Neutral Losses from Precursor) • [Figure: CID of phospho Ser yields dehydroalanine with loss of phosphoric acid (98).] • Mass: Modification • P-98 (H3PO4): phospho Ser, Thr • 216, P-80: phospho Tyr • P-64 (SOCH4): oxidized Met • P-43: carbamylated N-term • 204, P-203: N-Acetylglucosamine (GlcNAc)
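
For a multiply charged precursor the "P-98" style losses appear at the precursor m/z minus the neutral mass divided by the charge. The following is a small sketch of that arithmetic using standard monoisotopic masses; the dictionary keys and the function name are hypothetical labels, not part of any tool discussed here.

```python
# Monoisotopic masses (Da) of the neutral losses listed on the slide.
NEUTRAL_LOSSES = {
    "H3PO4 (phospho Ser/Thr)": 97.97690,
    "HPO3 (phospho Tyr)": 79.96633,
    "CH3SOH (oxidized Met)": 63.99829,
    "HNCO (carbamylated N-term)": 43.00581,
    "GlcNAc": 203.07937,
}

# Low-mass marker ions (m/z, singly charged) listed on the slide.
MARKER_IONS = {
    "phospho Tyr immonium": 216.0426,
    "GlcNAc oxonium": 204.0867,
}


def neutral_loss_mz(precursor_mz, charge):
    """Expected m/z of each precursor neutral-loss peak at the given charge."""
    return {
        name: precursor_mz - mass / charge
        for name, mass in NEUTRAL_LOSSES.items()
    }


if __name__ == "__main__":
    # Hypothetical doubly charged precursor at m/z 500.0.
    for name, mz in neutral_loss_mz(500.0, 2).items():
        print(f"{name}: {mz:.4f}")
```

For a 2+ precursor the phosphoric acid loss therefore sits about 49 m/z units below the precursor, not 98.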

  22. Data Acquisition Methods Specific for Modifications • ETD - Electron transfer dissociation • ECD - Electron capture dissociation • MS3 - ion trap • Multi-stage activation - ion trap • Precursor ion scan - triple quadrupole, Q-Tof • Neutral-loss scan - triple quadrupole Review: Boersema, P; Mohammed, S; and Heck, A. Phosphopeptide fragmentation and analysis by mass spectrometry. J. Mass. Spectrom. 2009, 44, 861–878.

  23. Multi-stage Activation in an Ion Trap • Single fill, single isolation, multi activation, single mass analysis • Multi fill, multi isolation, multi activation, multi mass analysis • Schroeder, MJ, Schabanowitz, J, Schwartz, JC, Hunt, DF and Coon JJ. Anal. Chem. 2004, 76, 3590-3598.

  24. Single vs. Multi-stage Activation MS/MS in an Ion Trap (K)L/G/V|S|V/s|P S R(A) Single Activation Multi-stage Activation

  25. Time Considerations for Different Acquisition Strategies Boersema, P; Mohammed, S; and Heck, A: J. Mass. Spectrom. 2009, 44, 861–878.

  26. O-GlcNAcylation • Addition of a single sugar residue: N-Acetylglucosamine (GlcNAc) to serine or threonine residues of nuclear and cytoplasmic proteins. • Present in all multi-cellular organisms • Different from ‘conventional’ glycosylation: • Inside the cell • Transient modification • Enzymes responsible for addition and removal of modification • i.e. analogous to phosphorylation • O-GlcNAc modification and phosphorylation interact / affect each other • Modification is involved in cellular response to nutritional and other stresses • Clear links to Diabetes and Alzheimer Disease and elevated in cancer.

  27. Side-chain Fragmentation Yields Diagnostic Neutral Losses • [Figure: CID of phospho Ser yields dehydroalanine plus phosphoric acid (98); CID of GlcNAcylated Ser yields unmodified Ser plus the GlcNAc oxonium ion (m/z 204).] • In CID, O-GlcNAc bond is more labile than peptide backbone, so neutral-loss of sugar occurs prior to peptide fragmentation. • Site assignment often not possible since an unmodified residue remains following neutral-loss of the sugar (so multi-stage activation is ineffective).

  28. CID/ETD MS/MS of Same Doubly GlcNAcylated Peptide GLAGPTtVPAtKASLLR - Protein bassoon (m/z 687.046, 3+) • [Figure, CID: GlcNAc ion and neutral-loss peaks MH2(2+) -GlcNAc, MH2(2+) -2GlcNAc, MH3(3+) -GlcNAc, MH+ -2GlcNAc. ETD: c10, c11, c13, c14, c16 and z2, z3, z4, z5, z6, z8, z10, z11, z12 ions.] • Mass difference between z10-z11 identifies one site as residue T2941. • Mass difference between c10-c11 identifies other site as residue T2945. • Chalkley, R. J. et al. Proc Natl Acad Sci USA (2009) 106, 22, 8894-8899
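
The site assignment rests on simple bookkeeping: the gap between consecutive c ions (or consecutive z ions) equals the mass of one residue, so a gap of Thr plus GlcNAc marks the modified position. A minimal sketch of that bookkeeping follows, assuming the slide's convention that a lowercase letter marks the GlcNAcylated residue; the function names are hypothetical.

```python
HEXNAC = 203.07937  # monoisotopic mass added by one GlcNAc (Da)
# Standard monoisotopic residue masses (Da) for residues in the peptide.
RESIDUE = {
    "G": 57.02146, "L": 113.08406, "A": 71.03711, "P": 97.05276,
    "T": 101.04768, "V": 99.06841, "K": 128.09496, "S": 87.03203,
    "R": 156.10111,
}


def residue_masses(seq):
    """Residue masses; a lowercase letter carries one GlcNAc."""
    return [
        RESIDUE[aa.upper()] + (HEXNAC if aa.islower() else 0.0) for aa in seq
    ]


def localizing_pairs(seq):
    """For each modified position, the c and z ion pairs that bracket it."""
    n = len(seq)
    masses = residue_masses(seq)
    pairs = []
    for pos, aa in enumerate(seq, start=1):
        if aa.islower():
            c_pair = (f"c{pos - 1}", f"c{pos}")
            z_pair = (f"z{n - pos}", f"z{n - pos + 1}")
            pairs.append((pos, c_pair, z_pair, masses[pos - 1]))
    return pairs


if __name__ == "__main__":
    for pos, c_pair, z_pair, gap in localizing_pairs("GLAGPTtVPAtKASLLR"):
        print(pos, c_pair, z_pair, round(gap, 4))
```

For this peptide the sketch reports the z10/z11 pair for the first site and the c10/c11 pair for the second, matching the ions named on the slide.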

  29. Phospho Site Ambiguity – S/T • Site-localizing ion • L A G G Q/T/S Q|P T T|P L\T s/P Q R • L A G G Q/T/S Q|P T T|P L\t S/P Q R
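
Only the fragments whose cleavage falls between the two candidate residues can tell the positional isomers apart. As a hedged sketch (not the logic of any particular scoring tool), the helper below lists those b and y ions for two candidate sites given as 1-based positions; the function names are assumptions.

```python
def site_determining_ions(length, site_a, site_b):
    """b and y ions that contain exactly one of the two candidate sites."""
    lo, hi = sorted((site_a, site_b))
    # b_i covers residues 1..i, so it separates the sites when lo <= i < hi.
    b = [f"b{i}" for i in range(lo, hi)]
    # y_j covers the last j residues, so it separates them when
    # length - hi + 1 <= j <= length - lo.
    y = [f"y{j}" for j in range(length - hi + 1, length - lo + 1)]
    return b, y


def phospho_position(annotated):
    """1-based position of the single lowercase (modified) residue."""
    return next(i for i, aa in enumerate(annotated, start=1) if aa.islower())


if __name__ == "__main__":
    isomer_1 = "LAGGQTSQPTTPLTsPQR"
    isomer_2 = "LAGGQTSQPTTPLtSPQR"
    sites = phospho_position(isomer_1), phospho_position(isomer_2)
    print(site_determining_ions(len(isomer_1), *sites))
```

With adjacent candidate sites such as T14 and S15 in this 18-residue peptide, a single b ion and a single y ion carry all of the localization evidence, which is why one missing peak can leave the site ambiguous.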

  30. Reliability of LC/MS/MS Phosphoproteomic Literature • “Resulting sequences were inspected manually …. When the exact site of phosphorylation could not be assigned for a given phosphopeptide, it was tabulated as ambiguous.” • “All spectra supporting the final list of assigned peptides used to build the tables shown here were reviewed by at least three people to establish their credibility.” • “Assignment of phosphorylation sites was verified manually with the aid of PEAK Studio (Bioinformatics Solutions) software.” • “All identified phosphopeptides were manually validated, and localization of phosphorylated residues within the individual peptide sequences were manually assigned…”
      Citation | Approach | Instrument | # sites | # ambiguous sites | Scores shown | Site ambig. labeled | Supplem. spectra shown
      Ballif, BA, … Gygi, SP. 2004, MCP, 3, 1093-1101 | 1DGel digest, SCX, LC/MS/MS | LCQ Deca XP | 546 | 86 | yes | yes | no
      Rush, J, … Comb, MJ. 2005, Nat Biotech, 23, 94-101 | digest lysate, pTyr Ab, LC/MS/MS | LCQ Deca XP | 628 | 0 | yes | no | no
      Collins, MO, … Grant, SGN. 2005, J Biol Chem, 280, 5972-5982 | protein IMAC, peptide IMAC, LC/MS/MS | Q-Tof Ultima | 331 | 42 | no | yes | no
      Gruhler, A, … Jensen, ON. 2005, MCP, 4, 310-327 | digest lysate, SCX, IMAC, LC/MS/MS | LTQ-FT | 729 | 0 | yes | no | no

  31. MCP draft Guideline for publishing PTM data http://www.mcponline.org/ III. POST-TRANSLATIONAL MODIFICATIONS Studies focusing on posttranslational modifications require specialized methodology and documentation to assign the presence and the site(s) of modification. No current MS data analysis software is infallible in the automatic assignment of modification sites in peptides, and these analyses are particularly error prone when multiple possible sites within a peptide are being utilized. For these reasons, additional documentation supporting assignment of PTMs is required. In addition to the tabular presentation(s) of the data described in guideline II: • The site(s) of modification within each peptide sequence must be clearly presented. • An indication of the certainty of localization for each PTM: The manner in which the modification was located (by computation or manually) and a description of the software used, if any. • A justification for any localization score threshold employed. • Ambiguous assignments: Peptides containing ambiguous PTM site localizations must be listed in a separate table from those with unambiguous site localizations. In cases where there are multiple modification sites and at least one is ambiguous, then these peptides should be listed with the ambiguous assignments. Ambiguous assignments must be clearly labeled as such. • Examples of ambiguities include: • Modified peptides in which one or more modification sites are ambiguous. • Instances where the peptide sequence is repeated in the same protein so the specific modification site cannot be assigned. • Instances in which the same peptide is repeated in multiple proteins, e.g. paralogs and splice variants (See also Section IV). • Isobaric modifications (e.g., acetylation vs. trimethylation, phosphorylation vs. sulfonation etc), where the possibilities may not be distinguished. Examples of methods able to distinguish between these include mass spectrometric approaches such as accurate mass determination, observation of signature fragment ions (e.g. m/z 79 vs. m/z 80 in negative ion mode for assignment of phosphorylation over sulfation), or biological or chemical strategies. • Annotated, mass labeled spectra: Spectra for ALL modified peptides must be either submitted to a public repository or accompany the manuscript as described in guideline II.

  32. Phosphosite Localization Scoring: Ascore • http://ascore.med.harvard.edu/ • Supports Sequest results only, Linux only • Beausoleil SA, Villen J, Gerber SA, Rush J, Gygi SP (2006) Nat Biotechnol 24:1285–1292.

  33. Phosphosite Localization Scoring • $P = \frac{n!}{k!\,(n-k)!}\; p^{k}\,(1-p)^{n-k} = \frac{n!}{k!\,(n-k)!}\; 0.04^{k}\; 0.96^{n-k}$ • PTM score $= -10 \times \log_{10}(P)$ • p: 0.04 - use the 4 most intense fragment ions per 100 m/z units • n: total number of possible b/y ions in the observed mass range for all possible combinations of PO4 sites in a peptide • k: number of peaks matching n • Olsen, J. V.; Blagoev, B.; Gnad, F.; Macek, B.; Kumar, C.; Mortensen, P.; Mann, M. Cell (2006), 127 (3), 635–48. • Olsen, J.V., and Mann, M. Proc. Natl. Acad. Sci. USA. (2004) 101, 13417–13422.
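
The score on this slide is a single binomial term, the probability of exactly k matches out of n possible ions, converted to a -10 log10 scale. Here is a minimal sketch of that calculation with the standard binomial coefficient n!/(k!(n-k)!); the function name and the example values of n and k are assumptions, and this is not the published implementation.

```python
from math import comb, log10


def ptm_score(n, k, p=0.04):
    """-10 * log10 of the binomial probability of exactly k matches out of n.

    p = 0.04 corresponds to keeping the 4 most intense peaks per 100 m/z.
    """
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    prob = comb(n, k) * p**k * (1 - p) ** (n - k)
    return -10 * log10(prob)


if __name__ == "__main__":
    # Hypothetical peptide with 20 possible b/y ions in the observed range.
    for matched in (2, 5, 10):
        print(matched, round(ptm_score(20, matched), 1))
```

Competing site assignments for one spectrum share the same n, so the assignment with more matched ions receives the higher score.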

  34. True Probability or Just Effective Scores? • Peak selection assumptions • All regions of spectrum equally likely • multiply charged fragments below precursor • some 100-300 m/z values not possible dipeptide AA combinations • Tall and short peak intensities equally diagnostic • Fragment ion type assumptions • All ion types equally probable • Neutral losses ignored, y-H3PO4, y-H2O

  35. Spectral Matching if Modified & Unmodified Peptides Present • ModifiComb - Savitski, MM; Nielsen, ML; and Zubarev, RA. Mol Cell Proteomics 5, 935–948, 2006. • FIG. 1. Identification of a novel modification on a peptide belonging to human saliva PRP. A, 9-min integrated survey scan showing two ions separated by 12.000 Da. B, CAD spectrum of the lowest mass ion in the survey scan identified as peptide GPPQQGGHQQ from PRP. The inset shows the mass deviation of the fragment masses for this identification. C, CAD spectrum of the 12.000-Da peptide. Note the similarity between this spectrum and the one depicted in B. Full sequence cleavage is achieved, and no fragment mass deviates more than 6 mDa.

  36. Software Tools Specialized for Identifying Modifications and Localizing Sites • Ascore: Beausoleil SA, Villen J, Gerber SA, Rush J, Gygi SP. (2006) Nat Biotechnol. 24, 1285–1292. • MaxQuant: Cox J, Mann M. (2008) Nat Biotechnol. 26, 1367–1372. Olsen JV, Blagoev B, Gnad F, Macek B, Kumar C, Mortensen P, Mann M. (2006) Cell. 127, 635–48. • Inspect, MS-Alignment, PTMFinder: Tanner S, Payne S, Dasari S, Shen Z, Wilmarth PA, David L, Loomis WF, Briggs SP, Bafna V. (2008) J Proteome Res. 7, 170–181. Payne S, Yau M, Smolka MB, Tanner S, Zhou H, Bafna V. (2008) J Proteome Res. 7, 3373–3381. Tsur D, Tanner S, Zandi E, Bafna V, Pevzner P. (2005) Nat Biotechnol. 23, 1562–1567. Tanner S, Shu H, Frank A, Wang LC, Zandi E, Mumby M, Pevzner P, Bafna V. (2005) Anal Chem. 77, 4626–4639. • PhosphoScore: Ruttenberg BE, Pisitkun T, Knepper MA, Hoffert JD. (2008) J Proteome Res. 7, 3054–9. • Debunker: Lu B, Ruse C, Xu T, Park SK, Yates J 3rd. (2007) Anal Chem. 79, 1301–10. • SloMo - ETD/ECD: Bailey CM, Sweet SM, Cunningham DL, Zeller M, Heath JK, Cooper HJ. (2009) J Proteome Res. 8, 1965–71. • ModifiComb: Savitski MM, Nielsen ML, Zubarev RA. (2006) Mol Cell Proteomics. 5, 935–48.

  37. Pause for Questions

  38. Expect Woes & Nuisances • Sample Handling Chemistry
      Modification | Mass shift (Da) | Site | Cause
      Carbamylation | +43 | N-term, Lys | urea in digest buffer
      Deamidation | +1 | N -> D | sample in acid
      pyroGlutamic acid | -17 | N-term Q | sample in acid
      pyroCarbamidomethyl Cys | -17 | N-term C | sample in acid
      Oxidized Met | +16 | M | gels
      Cys alkylation reagent | +x | N-term, W | side reaction
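
The nominal shifts in this list follow directly from the elemental change each artifact introduces. The sketch below derives accurate monoisotopic values from standard atomic masses. The carbamidomethyl entry is an assumption standing in for the unspecified "+x" alkylation reagent (it is the shift produced by iodoacetamide); the names are hypothetical.

```python
# Standard monoisotopic atomic masses (Da).
MONO = {"H": 1.00782503, "C": 12.0, "N": 14.00307401, "O": 15.99491462}

# Net elemental change for each artifact (negative counts are atoms lost).
ARTIFACTS = {
    "carbamylation (+HNCO)": {"C": 1, "H": 1, "N": 1, "O": 1},
    "deamidation (-NH, +O)": {"N": -1, "H": -1, "O": 1},
    "pyroglutamic acid from N-term Q (-NH3)": {"N": -1, "H": -3},
    "pyrocarbamidomethyl Cys at N-term (-NH3)": {"N": -1, "H": -3},
    "oxidized Met (+O)": {"O": 1},
    # Assumed example reagent: iodoacetamide adds C2H3NO.
    "carbamidomethylation (+C2H3NO)": {"C": 2, "H": 3, "N": 1, "O": 1},
}


def mass_shift(composition):
    """Monoisotopic mass change for a net elemental composition change."""
    return sum(MONO[element] * count for element, count in composition.items())


if __name__ == "__main__":
    for name, composition in ARTIFACTS.items():
        print(f"{name}: {mass_shift(composition):+.4f}")
```

The deamidation shift of about +0.984 Da is close to, but not the same as, the roughly 1.003 Da spacing between isotope peaks, so accurate precursor mass helps separate a deamidated peptide from a mis-assigned monoisotopic peak.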

  39. Stinkers (b-NH3) & Pyroglutamic Acid (R)q L/Q|L|A|Q|E|A|A\Q\K(R) -17 Da Q to q (R)Q L/Q/L/A|Q/E/A|A Q\K(R) P(m/z)-NH3

  40. Deamidation of Asn +1Da Asn –NH + O = Asp ionsource.com

  41. G S/E/S|G|I|F|T|n\T K 18.35 96.9% +0.007 Da G S/E/S|G|I|F|T|D\T K Deamidation G S/E S\G\I\F\T\N/T K 6.62 43.4% +0.986 Da

  42. Carbamylation from Urea in Digest Buffer +43Da CNHO +43Da

  43. Carbamylated N-term I/G/E|G/T/y/G V|V|Y\K unmodified P(m/z)-CNHO +43 b ions N-term Carbamylated P(m/z)-CNHO-H2O

  44. Unimod Resource for Masses of Modifications http://www.unimod.org/modifications_list.php

  45. Delta Mass Resource for Masses of Modifications http://www.abrf.org/index.cfm/dm.home

  46. RESID Resource for Masses of Modifications http://www.ebi.ac.uk/RESID/

  47. Acknowledgements • Broad Institute of MIT and Harvard: Steven Carr, Philipp Mertins • Pierre-Alain Binz, GeneBio (Phenyx) • Robert Chalkley, University of California San Francisco (O-GlcNAc) • John Cottrell, Matrix Science (Mascot) • Chris Miller, Agilent Technologies (Spectrum Mill) • Sean Seymour, Applied Biosystems (Protein Pilot, Paragon)

  48. iPRG-2010: Proteome Informatics Research Group Study - Phosphopeptide Identification In this study, an LC-MS/MS dataset from a lysate digested with trypsin and enriched for phosphopeptides using strong cation exchange fractionation followed by immobilized metal affinity chromatography (SCX/IMAC) will be provided. Participants are asked to return a list of identified peptides and localized phosphorylation sites. Requests to participate must be submitted by e-mail to iPRG2010@gmail.com prior to Monday, November 30, 2009. Please include the words “iPRG Study 2010 request” in the subject line and provide contact name and affiliation in the body of the message. http://www.abrf.org
