MULTICOM – A Combination Pipeline for Protein Structure Prediction

Presentation Transcript

  1. MULTICOM – A Combination Pipeline for Protein Structure Prediction
     Jianlin Cheng
     Computer Science Department & Informatics Institute, University of Missouri, Columbia, MO, USA

  2. MULTICOM Structure Prediction Pipeline
     Server predictor input: query sequence
     Human predictor input: all CASP8 server models
     Pipeline stages:
       1. Template Identification
       2. Multi-Template Combination
       3. Model Generation
       4. Model Evaluation
       5. Multi-Model Combination
     Output

  3. MULTICOM Structure Prediction Pipeline (Step 1: Template Identification)
     Tools applied to the query sequence:
       • PSI-BLAST
       • HHSearch
       • COMPASS
       • FOLDpro + SPEM
     Result: query-template alignments.
     Goal: find a set of good templates / fragments; generate alternative query-template alignments.

  4. MULTICOM Structure Prediction Pipeline (Step 2: Multi-Template Combination)
     Combination:
       1. Combine the top ranked query-template alignment (QTA) with other significant QTAs
       2. Take fragments from less significant QTAs (template-free)
     Don't try to find the best template. Instead, combine multiple good templates / fragments.

  5. MULTICOM Structure Prediction Pipeline (Step 3: Model Generation)
     Integrative Model Generation:
       • Modeller
       • Rosetta for template-free small domains
     Domain-level combination of template-based and template-free approaches.

  6. MULTICOM Structure Prediction Pipeline (Step 4: Model Evaluation)
     Model ranking by ModelEvaluator.

  7. ModelEvaluator
     Ab initio, sequence-based structural feature predictions are compared with the 3D model:
       • Secondary structure (e.g. EEEECCEEEHHHHHHHHHHHHEEEECCEEEHHHH)
       • Relative solvent accessibility (e.g. eeee-----eeeee----------eeeee------eeeee---eeeeeeee)
       • Contact map
       • Beta-sheet pairing
     The comparisons serve as input features; the output is a predicted GDT-TS score.
     Good models are ranked at the top. Very effective for template-free models.
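
The comparison step on this slide can be illustrated with a minimal sketch. It assumes the predicted and model-derived annotations are already available as equal-length per-residue strings, and it only computes simple agreement fractions. The function names `agreement` and `feature_vector` are hypothetical; the slides do not specify how ModelEvaluator encodes its features or which learning method maps them to a GDT-TS score.

```python
from typing import List


def agreement(predicted: str, observed: str) -> float:
    """Fraction of positions where two equal-length per-residue strings match."""
    if len(predicted) != len(observed):
        raise ValueError("strings must have the same length")
    if not predicted:
        return 0.0
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(predicted)


def feature_vector(pred_ss: str, model_ss: str,
                   pred_rsa: str, model_rsa: str) -> List[float]:
    """Hypothetical two-element feature vector: secondary structure agreement
    and relative solvent accessibility agreement (an assumption, not the
    actual ModelEvaluator feature set)."""
    return [agreement(pred_ss, model_ss), agreement(pred_rsa, model_rsa)]


if __name__ == "__main__":
    # Toy strings: E = strand, H = helix, C = coil; e = exposed, - = buried.
    features = feature_vector("EEEECCHHHH", "EEEECCHHHC",
                              "eeee---eee", "eeee---ee-")
    print(features)
```

A vector like this would then be passed to whatever trained predictor produces the GDT-TS estimate; that part is left out because the slides give no detail on it.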

  8. MULTICOM Structure Prediction Pipeline (Step 5: Multi-Model Combination)
     Global-Local Model Combination:
       1. Start from a top ranked model
       2. Combine it with other models having global similarity (80%, 4Å)
       3. Combine it with the longest similar model fragments
     Modeller iterative modeling; average model.
     Don't try to find the best model. Instead, combine multiple good models / fragments (2-3% improvement).
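
A rough sketch of the global selection step follows. It reads "(80%, 4Å)" as "at least 80% of residues lie within 4 Å of the top ranked model", which is an interpretation, not something the slide states. It also assumes all models are already superimposed and have the same number of residues (one C-alpha coordinate each). The coordinate averaging at the end is only a stand-in for the "average model" idea; the pipeline itself uses Modeller, which is not reproduced here. All function names are hypothetical.

```python
import math
from typing import List, Tuple

Coord = Tuple[float, float, float]
Model = List[Coord]


def fraction_within(a: Model, b: Model, cutoff: float = 4.0) -> float:
    """Fraction of residue pairs closer than `cutoff` (models pre-superimposed)."""
    if len(a) != len(b) or not a:
        raise ValueError("models must be non-empty and of equal length")
    close = sum(math.dist(p, q) <= cutoff for p, q in zip(a, b))
    return close / len(a)


def select_similar(top: Model, others: List[Model],
                   min_fraction: float = 0.8, cutoff: float = 4.0) -> List[Model]:
    """Keep models globally similar to the top ranked one (assumed criterion)."""
    return [m for m in others if fraction_within(top, m, cutoff) >= min_fraction]


def average_model(models: List[Model]) -> Model:
    """Per-residue coordinate average; a simplification of model combination."""
    n = len(models)
    length = len(models[0])
    return [
        tuple(sum(m[i][k] for m in models) / n for k in range(3))
        for i in range(length)
    ]


if __name__ == "__main__":
    top = [(float(i), 0.0, 0.0) for i in range(5)]
    near = [(x, y + 1.0, z) for x, y, z in top]    # every residue 1 Å away
    far = [(x, y + 10.0, z) for x, y, z in top]    # every residue 10 Å away
    chosen = select_similar(top, [near, far])
    print(len(chosen), average_model([top] + chosen)[0])
```

The local step (combining the longest similar fragments from models that fail the global test) would need a per-segment version of the same distance check and is omitted here.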

  9. Good Template-Free Example: T0416_2
     Structure vs. MULTICOM model (GDT = 0.66, RMSD = 2.5)
     Combination of 20 models from: Zhang-Server, Robetta, TASSER, MULTICOM, YASARA, forecast
     Success: very good models are ranked at the top.
     Superposition (red: model). Courtesy of Prof. Joel Sussman.
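
For readers unfamiliar with the GDT values quoted on these example slides, here is a simplified sketch of a GDT-TS style score on the 0 to 1 scale. GDT-TS averages the fraction of C-alpha atoms within 1, 2, 4 and 8 Å of the native structure. Standard implementations search over many superpositions to maximize each fraction; this sketch evaluates only one superposition that is assumed to be given, so it can underestimate the true score. The function name `gdt_ts` is illustrative.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def gdt_ts(model: List[Coord], native: List[Coord],
           thresholds: Sequence[float] = (1.0, 2.0, 4.0, 8.0)) -> float:
    """Mean fraction of residues within each distance threshold,
    computed for a single, already-applied superposition."""
    if len(model) != len(native) or not model:
        raise ValueError("model and native must be non-empty and of equal length")
    dists = [math.dist(p, q) for p, q in zip(model, native)]
    fractions = [sum(d <= t for d in dists) / len(dists) for t in thresholds]
    return sum(fractions) / len(thresholds)


if __name__ == "__main__":
    native = [(float(i), 0.0, 0.0) for i in range(4)]
    # Residues displaced by 0.5, 1.5, 3.0 and 9.0 Å along the y axis.
    offsets = [0.5, 1.5, 3.0, 9.0]
    model = [(x, y + d, z) for (x, y, z), d in zip(native, offsets)]
    print(gdt_ts(model, native))
```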

  10. Good Template-Free Example: T0513_2
      Structure vs. MULTICOM model (GDT = 0.73, RMSD = 2.1)
      Combines Robetta models; the combined model is better than each one of them.
      Success: very good models are ranked at the top, and combination improves modeling.
      Superposition (blue: model).

  11. Not Good Template-Free Example: T0405_1
      Structure (helix bundle) vs. MULTICOM model (GDT = 0.41)
      Superposition by Prof. Sussman (gray: structure, yellow: best model, green: MULTICOM model).
      Failure: ModelEvaluator fails to identify the correct helix orientations.

  12. Concluding Remarks
      • The CASP community can sometimes generate good template-free models (e.g. with Rosetta-based tools)
      • ModelEvaluator can rank good template-free models at the top
      • Iterative global-local combination of models can improve template-free modeling
      • Blending of template-free and template-based modeling

  13. Blending of Template-Free and Template-Based Modeling
      Protein modeling spectrum: 100% TBM, 50% TBM + 50% FM, 100% FM

  14. Acknowledgements
      • CASP8 organizers and assessors
      • CASP8 participants
      • MU colleagues: Dong Xu, Toni Kazic
      • My group: Zheng Wang, Allison Tegge, Xin Deng