This presentation by Nathan Edwards from Georgetown University Medical Center discusses the innovative approach of top-down protein characterization for identifying bacteria with unsequenced genomes. Leveraging high-resolution mass spectrometry and advanced software tools like ProSightPC, this technique enhances pathogen detection for both clinical and homeland security applications. It offers rapid microorganism identification, aiding in treatment selection for chronic infections and microbiome analysis. The session further emphasizes the potential of protein-based approaches to compete with genomic methods in identifying elusive bacterial species.
Top-down characterization of proteins in bacteria with unsequenced genomes Nathan Edwards Georgetown University Medical Center
Microorganism Identification • Homeland-security/defense applications • Long history of fingerprinting approaches • Clinical applications in strain identification: • Selection of treatment and/or antibiotics • New applications in microbiome analysis: • Bacterial colonies in gut, .... • Chronic wound infections • Compete with genomic approaches? • PCR, Next-gen sequencing • Primary sales-pitch is speed.
Microorganism Identifications • Match spectra with proteome (or genome) sequence for (species) identity • Provides robust match with respect to instrumentation and sample prep • Many bacteria will never be sequenced or "finished"... • Pathogen simulants, for example • ...but many have – about 2500 to date. • Can we use the available sequence to identify proteins from unknown, unsequenced bacteria? • Yes, for some proteins in some organisms!
Intact protein LC-MS/MS • Crude cell lysate • Capillary HPLC, C8 column • LTQ-Orbitrap XL • Precursor scan: resolution 30,000 @ 400 m/z • Data-dependent precursor selection: • 5 most abundant ions • 10 second dynamic exclusion • Charge-state +3 or greater • CAD product ion scan: resolution 15,000 @ 400 m/z
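The data-dependent selection rules on this slide can be sketched roughly as follows. This is a minimal illustration, not the instrument's acquisition logic: the `Peak` type, the `select_precursors` function, and the `mz_tol` window used to decide whether a peak is still excluded are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    mz: float
    charge: int
    intensity: float


def select_precursors(peaks, now, excluded, top_n=5, min_charge=3,
                      exclusion_s=10.0, mz_tol=0.01):
    """Pick up to top_n precursors from one precursor scan.

    excluded maps a previously selected m/z to the time (seconds) at which
    it was selected; it is updated in place. mz_tol is a hypothetical
    matching window, not a value taken from the slides.
    """
    # Forget exclusion entries older than the dynamic exclusion window.
    for mz in [m for m, t in excluded.items() if now - t >= exclusion_s]:
        del excluded[mz]

    def is_excluded(mz):
        return any(abs(mz - m) <= mz_tol for m in excluded)

    # Keep only sufficiently charged ions that are not currently excluded.
    candidates = [p for p in peaks
                  if p.charge >= min_charge and not is_excluded(p.mz)]
    candidates.sort(key=lambda p: p.intensity, reverse=True)

    chosen = candidates[:top_n]
    for p in chosen:
        excluded[p.mz] = now
    return chosen
```

Calling the function once per precursor scan with a shared `excluded` dictionary reproduces the behaviour described: the same ion is not fragmented again until 10 seconds have passed.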
Enterobacteriaceae Protein Sequences • Exhaustive set of all Enterobacteriaceae family protein sequences from • Swiss-Prot, TrEMBL, RefSeq, GenBank, and CMR • ...plus Glimmer3 predictions on RefSeq Enterobacteriaceae genomes • Primary and alternative translation start-sites • Filter for intact mass in range 1 kDa – 20 kDa • 253,626 distinct protein sequences, 256 species • Derived from "Rapid Microorganism Identification Database" (RMIDb.org) infrastructure.
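The intact-mass filter and the "distinct sequences" step might look something like the sketch below. It assumes average (not monoisotopic) residue masses and unmodified proteins; the slides do not say which mass convention the database build used, and `filter_by_mass` is a hypothetical helper, not part of RMIDb.

```python
# Average residue masses (Da) of the 20 standard amino acids.
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_AVG = 18.01528


def average_mass(sequence):
    """Average mass of an unmodified protein: residues plus one water."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in sequence) + WATER_AVG


def filter_by_mass(records, low=1000.0, high=20000.0):
    """Keep distinct sequences whose intact mass lies in [low, high] Da.

    records is an iterable of (name, sequence) pairs. Sequences containing
    nonstandard residue codes are skipped (an assumption of this sketch).
    Returns a dict mapping each kept sequence to the first name seen.
    """
    kept = {}
    for name, seq in records:
        if seq in kept or any(aa not in AVG_RESIDUE_MASS for aa in seq):
            continue
        if low <= average_mass(seq) <= high:
            kept[seq] = name
    return kept
```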
ProSightPC 2.0 • Product ion scan decharging • Enabled by high-resolution fragment ion measurements • THRASH algorithm implementation • Absolute mass search mode • 15 ppm fragment ion match tolerance • 250 Da precursor ion match tolerance • "Single-click" analysis of entire LC-MS/MS datafile.
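To make the 15 ppm fragment tolerance concrete, here is a minimal sketch of matching decharged (neutral) fragment masses against theoretical b- and y-fragments of a candidate sequence. It is not ProSightPC's implementation: the function names are hypothetical, fragments are treated as neutral monoisotopic masses (b = sum of residues, y = sum of residues plus water), and no modifications are considered.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565


def within_ppm(observed, theoretical, tol_ppm=15.0):
    """True if observed is within tol_ppm parts-per-million of theoretical."""
    return abs(observed - theoretical) <= theoretical * tol_ppm * 1e-6


def fragment_masses(sequence):
    """Neutral b- and y-fragment masses for every internal cleavage site."""
    b, y = [], []
    total = 0.0
    for aa in sequence[:-1]:            # N-terminal fragments
        total += MONO[aa]
        b.append(total)
    total = WATER
    for aa in reversed(sequence[1:]):   # C-terminal fragments
        total += MONO[aa]
        y.append(total)
    return b, y


def count_matches(observed, sequence, tol_ppm=15.0):
    """Number of observed neutral masses explained by some b- or y-fragment."""
    b, y = fragment_masses(sequence)
    theoretical = b + y
    return sum(any(within_ppm(o, t, tol_ppm) for t in theoretical)
               for o in observed)
```

At 10 kDa, 15 ppm corresponds to 0.15 Da, which is why this kind of matching depends on the high-resolution fragment measurements mentioned on the slide.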
Other tools • Explored using standard search engines: • Decharge and format as charge +1 spectrum • X!Tandem scoring plugin (ProSight, delta M) • OMSSA, Mascot, etc… • MS-Tools: • MS-Deconv, MS-TopDown, MS-Align, MS-Align+, MS-Align-E!
CID Protein Fragmentation Spectrum from Y. rohdei Match to Y. pestis 50S Ribosomal Protein L32
Phylogeny: Protein vs DNA • [Figure panels: Protein Sequence; 16S-rRNA Sequence]
Identified E. herbicola proteins • 30S Ribosomal Protein S19 • m/z 686.39, z 15+, E-value 1.96e-16, Δ 0.007 • Six proteins identified with |Δ| < 0.02
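The m/z and charge values reported on these slides can be converted to a neutral precursor mass with the usual relation M = z × (m/z − m_proton). The sketch below assumes positive-mode protonated ions; whether the reported m/z is the monoisotopic or the most abundant isotopic peak is not stated on the slide, so the resulting mass should be read as approximate.

```python
PROTON = 1.007276  # mass of a proton in Da


def neutral_mass(mz, z):
    """Neutral mass of an ion observed at m/z with z added protons."""
    return z * (mz - PROTON)


def singly_charged_mz(mz, z):
    """m/z the same species would have at charge +1 (decharged form)."""
    return neutral_mass(mz, z) + PROTON
```

For the S19 precursor above, `neutral_mass(686.39, 15)` gives about 10280.74 Da, which falls inside the 1 kDa – 20 kDa database mass range.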
Identified E. herbicola proteins • DNA-binding protein HU-alpha • m/z 732.71, z 13+, E-value 7.5e-26, Δ -14.128 • Eight proteins identified with "large" |Δ|
Identified E. herbicola proteins • DNA-binding protein HU-alpha • m/z 732.71, z 13+, E-value 1.91e-58 • Use "Sequence Gazer" to find mass shift • ΔM mode can "tolerate" one shift for free!
ProSightPC: ΔM mode • Match a single "blind" mass-shift for free! • [Figure: experimental precursor aligned to the protein sequence, offset by ΔM; matched b-, b'-, y- and y'-ions] • Also: PIITA - Tsai et al. 2009
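The idea behind ΔM mode can be sketched as follows: a single unknown mass shift somewhere in the sequence moves every fragment that contains the shifted site by ΔM, so fragments are accepted either at their theoretical mass or at that mass plus ΔM. This is only an illustration of the principle, not ProSightPC's algorithm; `delta_m_matches` is a hypothetical name, and ΔM is assumed to be the observed minus theoretical precursor mass in Da.

```python
def within_ppm(observed, theoretical, tol_ppm=15.0):
    """True if observed is within tol_ppm parts-per-million of theoretical."""
    return abs(observed - theoretical) <= abs(theoretical) * tol_ppm * 1e-6


def delta_m_matches(observed, b_masses, y_masses, delta_m, tol_ppm=15.0):
    """Count fragments explained without and with one shift of delta_m.

    b_masses and y_masses are theoretical neutral fragment masses of the
    database sequence. Returns (unshifted, shifted): the number of observed
    masses matching b-/y-fragments, and the number matching only the
    shifted b'-/y'-fragments.
    """
    unshifted = list(b_masses) + list(y_masses)
    shifted = [m + delta_m for m in unshifted]   # b'- and y'-fragments

    plain = primed = 0
    for o in observed:
        if any(within_ppm(o, t, tol_ppm) for t in unshifted):
            plain += 1
        elif any(within_ppm(o, t, tol_ppm) for t in shifted):
            primed += 1
    return plain, primed
```

Where the unshifted b-fragments stop matching and the shifted ones start gives a rough bracket on the position of the mass shift, which is the kind of localization the later slides rely on.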
Identified E. herbicola proteins • DNA-binding protein HU-alpha • m/z 732.71, z 13+, E-value 7.5e-26, Δ -14.128 • Extract N- and C-terminus sequence supported by at least 3 b- or y-ions
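One possible reading of "supported by at least 3 b- or y-ions" is sketched below; the slide does not spell out the rule, so this interpretation is an assumption. Here an N-terminal prefix of length k counts as supported by every matched, unshifted b-ion of length k or more (each such ion confirms the mass of a stretch containing the prefix), and likewise for C-terminal suffixes and y-ions. The function name and inputs are hypothetical.

```python
def supported_termini(sequence, b_sites, y_sites, min_ions=3):
    """Return (N-terminal prefix, C-terminal suffix) backed by min_ions ions.

    b_sites: lengths (in residues) of matched unshifted b-ions.
    y_sites: lengths (in residues) of matched unshifted y-ions.
    The longest prefix supported by at least min_ions b-ions ends at the
    min_ions-th largest matched b-ion length; same for y-ions and suffixes.
    """
    def longest(sites):
        ordered = sorted(set(sites), reverse=True)
        return ordered[min_ions - 1] if len(ordered) >= min_ions else 0

    n_len = longest(b_sites)
    c_len = longest(y_sites)
    prefix = sequence[:n_len]
    suffix = sequence[len(sequence) - c_len:] if c_len else ""
    return prefix, suffix
```

The extracted prefix and suffix are the pieces of sequence that could then be passed on to a phylogenetic analysis, leaving out the region where the mass shift lies.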
Phylogenetic placement of E. herbicola • [Figure panels: Cladogram; Phylogram] • phylogeny.fr – "One-Click"
Genome annotation errors • UniProt: E. coli Cell division protein ZapB • 22 (371) E. coli strains • Need ±1500 Da precursor tolerance… MQFRRGMTMSLEVFEKLEAKVQQAIDTITL… 3 (204) 17 (166) 0 (2)
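A rough way to see why a wider precursor tolerance is needed: if the annotated start site is wrong and translation actually begins at a downstream methionine, the intact mass drops by the mass of the skipped residues. The sketch below computes that offset for each internal Met in the visible part of the sequence above (the rest is elided on the slide and is not needed here). It uses average residue masses and ignores N-terminal Met removal and other processing; `alternative_start_offsets` is a hypothetical helper.

```python
# Average residue masses (Da) of the 20 standard amino acids.
AVG = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}


def alternative_start_offsets(n_terminal_region):
    """Mass lost (Da) when translation starts at each internal Met.

    Returns a dict mapping the 0-based index of every Met after the first
    residue to the summed average mass of the residues before it.
    """
    offsets = {}
    lost = 0.0
    for i, aa in enumerate(n_terminal_region):
        if aa == "M" and i > 0:
            offsets[i] = lost
        lost += AVG[aa]
    return offsets


visible_prefix = "MQFRRGMTMSLEVFEKLEAKVQQAIDTITL"  # as shown on the slide
for index, mass in alternative_start_offsets(visible_prefix).items():
    print(f"start at residue {index + 1}: intact mass lower by {mass:.1f} Da")
```

For this prefix the two candidate downstream starts change the intact mass by roughly 776 Da and 1008 Da, both beyond the 250 Da precursor tolerance listed for the ProSightPC search but within ±1500 Da.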
Conclusions • Protein identification for unsequenced organisms. • Identification and localization for sequence mutations and post-translational modifications. • Extraction of confidently established sequence suitable for phylogenetic analysis. • Genome annotation correction. • New paradigm for phylogenetic analysis?
Acknowledgements • Dr. Catherine Fenselau • Avantika Dhabaria, Joe Cannon*, Colin Wynne* • University of Maryland Biochemistry • Dr. Yan Wang • University of Maryland Proteomics Core • Dr. Art Delcher • University of Maryland CBCB • Funding: NIH/NCI