
TM PRO & Comparison of Algorithms for “Protein Stability Prediction Upon Mutations”



Presentation Transcript


  1. TM PRO & Comparison of Algorithms for “Protein Stability Prediction Upon Mutations”. Madhavi Ganapathiraju, Graduate Student, Carnegie Mellon University

  2. Overview
     • TMpro evaluations on PDBTM, TMPDB and MPTOPO are complete
     • Additional inputs to TMpro are being studied
       • Yule values (not successful)
       • Evolutionary profile (promising)
     • The TMpro website has been completed
     • Evaluation of algorithms to predict protein stability changes upon mutations

  3. Part 1: TMpro

  4. TMpro Evaluations

  5. The TMpro web server is fully functional!
     • Competition for the TMpro logo
     • Prize: see your logo on the web!

  6. Attempts to overcome confusion with globular soluble helices (1)
     • Yule value features to be added
     • Yule value features that discriminate amino acid neighbor propensities between TM and non-TM helices were computed earlier
     • Tried adding these features as input to the NN predictor, but could not achieve a quantitative improvement
     • I will discuss this in the future, once I have results to present

  7. Attempts to overcome confusion with globular soluble helices (2)
     • Evolutionary profile information
     • Knowledge of a protein's evolutionary profile is known to improve prediction accuracy to a great extent
     • TMpro can predict TM segments without requiring knowledge of the profile
       • Useful when sequence alignments cannot be extracted from known proteins
     • But where the profile is known, we would like to use that additional information

  8. Profile generation
     • Get multiple sequence alignments
     • Compute a position-specific scoring matrix (PSSM) for each protein
       • 21 rows (20 amino acids, and 1 row for gaps)
     • A profile is generated for each protein in the training and test sets
     • Scoring expressions, with C(i,j) the count of symbol i at alignment position j:
         PSSM(i,j) = log( C(i,j) / total counts at position j )
         log( C(i,j) / unigram count of i in the protein )
     • Those of you who have worked with evolutionary analysis before, please give feedback
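
A minimal Python sketch of the first scoring expression, log(C(i,j) / total counts at position j), might look like the following. It is not the code used for the slides. The function name `profile_from_alignment`, the `pseudocount` smoothing, and the rule that unknown symbols are counted as gaps are all assumptions made for illustration. It also assumes that identity dots (".") in BLAST-style output have already been expanded to the query residue.

```python
import math

# 20 standard amino acids plus '-' for gaps: 21 rows, as on the slide.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"


def profile_from_alignment(aligned_seqs, pseudocount=1.0):
    """Return a 21 x L matrix of log(C(i,j) / total counts at position j).

    `pseudocount` is an assumption added here so that unseen symbols do
    not produce log(0); the slide does not mention any smoothing.
    """
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")

    row_of = {sym: r for r, sym in enumerate(ALPHABET)}
    counts = [[pseudocount] * length for _ in ALPHABET]
    for seq in aligned_seqs:
        for j, sym in enumerate(seq.upper()):
            # Assumption: any symbol outside the alphabet is counted as a gap.
            counts[row_of.get(sym, row_of["-"])][j] += 1.0

    profile = [[0.0] * length for _ in ALPHABET]
    for j in range(length):
        total = sum(row[j] for row in counts)
        for r in range(len(ALPHABET)):
            profile[r][j] = math.log(counts[r][j] / total)
    return profile
```

Each column of the returned matrix is a 21-value vector for one alignment position, which matches the 21-row layout described above.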

  9. Doubts: what labels to assign to gaps?

                        --n------n----n------nnn-----n------n-----------------M-----
     2a65      369      --D------E----L------KLS-----R------K-----------------H----- 377
     2A65_A    369      --.------.----.------...-----.------.-----------------.----- 377
     AAC07817  369      --.------.----.------...-----.------.-----------------.----- 377
     YP_001956 364      --E------S----F------G.K-----.------.-----------------T----- 372

                        -M------M------M------M-------M----------M---------MM-------
     2a65      378      -A------V------L------W-------T----------A---------AI------- 385
     2A65_A    378      -.------.------.------.-------.----------.---------..------- 385
     AAC07817  378      -.------.------.------.-------.----------.---------..------- 385
     YP_001956 373      -S------C------.-----------------------------------IL------- 377

     • We have labels for training sequences
     • But when the original sequence has gaps after alignment, how should the labels of the gaps be interpreted?
     • Even TM regions have gaps, as shown above

  10. Doubts: what to do with missing segment information for some sequences?

      XP_659910  47 L-......K.----------...KAP----RSNQV.-..FVAGTMGLASAVGA.AT 86
      AAW43619  100 .....A..A-----------KNP----NTTRNV-..FMVGALGALGASSV.ST 136
      CAB59195   59 ----.N.RP.-A..VIGSARFAYMAWTRVA 83
      XP_466001 107 SKRA.-A.FVLSGGRFIYASLLRLL 130
      AAA20832  103 SKRA.-A.FVLTGGRFVYASLVRLL 126

      • When nothing is shown (gap/alignment) for some sequences, I am counting those positions as gaps

  11. Using profile for prediction
      • Studied independently of TMpro
      • Neural network with 21 input, 21 hidden and 1 output neurons
      • [Figure: experimentally observed locations of TM helices and predicted output (nonmembrane = 0, membrane = 1), plotted against residue number]
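
The 21-21-1 network described above can be sketched as a plain feed-forward pass in Python. This is only an illustration of the layer sizes. The sigmoid activations, the random untrained weights, the choice of one 21-value profile column per residue as input, and the names `init_network` and `forward` are assumptions, not details taken from the slides.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_network(n_in=21, n_hidden=21, seed=0):
    """Create small random weights for a 21-21-1 network (untrained)."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                for _ in range(n_hidden)]
    b_hidden = [0.0] * n_hidden
    w_out = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
    b_out = 0.0
    return w_hidden, b_hidden, w_out, b_out


def forward(network, column):
    """Map one 21-value profile column to a score between 0 and 1.

    Scores near 1 would be read as membrane, near 0 as nonmembrane.
    """
    w_hidden, b_hidden, w_out, b_out = network
    hidden = [sigmoid(sum(w * x for w, x in zip(weights, column)) + bias)
              for weights, bias in zip(w_hidden, b_hidden)]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
```

Calling `forward` once per residue gives a per-residue output trace of the kind plotted against residue number.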

  12. Another output

  13. Computed wavelet transform
      • Mexican hat wavelet, scale = 10
      • The NN architecture needs to be modified
      • But instead I post-processed the neural network output
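
One way such a post-processing step could be sketched in Python is a single-scale Mexican hat (Ricker) wavelet transform of the per-residue network output. The normalization constant, the kernel truncation at five times the scale, the zero padding at the sequence ends, and the function names are assumptions. The slides only state the wavelet and the scale.

```python
import math


def mexican_hat(t):
    """Mexican hat (Ricker) mother wavelet with the usual normalization."""
    norm = 2.0 / (math.sqrt(3.0) * math.pi ** 0.25)
    return norm * (1.0 - t * t) * math.exp(-t * t / 2.0)


def wavelet_response(signal, scale=10.0, half_width=None):
    """Wavelet transform of `signal` at one scale, one value per residue.

    Positions outside the sequence are treated as zero (an assumption).
    """
    if half_width is None:
        half_width = int(5 * scale)  # the kernel is negligible beyond this
    kernel = [mexican_hat(k / scale) / math.sqrt(scale)
              for k in range(-half_width, half_width + 1)]
    n = len(signal)
    response = []
    for i in range(n):
        acc = 0.0
        for k, weight in enumerate(kernel):
            idx = i + k - half_width
            if 0 <= idx < n:
                acc += signal[idx] * weight
        response.append(acc)
    return response
```

At scale 10 the positive lobe of the wavelet spans roughly 20 residues, so a run of high network outputs of about that length gives a positive peak, flanked by negative values on either side.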

  14. Some more wavelet outputs
      • Note that these are from the training data itself
      • Overall performance is yet to be checked

  15. Part 2: Stability upon Mutations

  16. Evaluation of predictions of protein stability changes upon mutations
      • Effects of mutations on 2 TM proteins are available in our group
        • The two proteins are rhodopsin and bacteriorhodopsin
      • Data are available on how much misfolding occurs
        • That is, how the stability of the protein is affected
      • There are algorithms that can also predict these changes
      • We assessed how accurate or reliable the prediction methods are by comparing their results with our experimental data

  17. 3 prediction algorithms
      • I-Mutant 2.0
        • Support vector machine
        • Features: amino acid neighbors in a 9 Å sphere, temperature, pH, relative solvent-accessible surface area
        • http://gpcr2.biocomp.unibo.it/cgi/predictors/I-Mutant2.0/I-Mutant2.0.cgi
      • DFIRE
        • Knowledge-based statistical potentials
        • http://phyyz4.med.buffalo.edu/hzhou/mutation.html
      • FoldX
        • Statistical mechanics; accounts for various energy terms
        • http://fold-x.embl-heidelberg.de:1100/

  18. Authors’ claims in 3 papers

  19. Our results
      • Rhodopsin (PDB: 1U19)
      • Bacteriorhodopsin (PDB: 1QM8)

  20. Bias in the number of mutations that increase/decrease stability
      • Database bias affects the apparent accuracies of the algorithms
      • I-Mutant, for example, predicts a decrease in stability for a majority of the mutations
      • Whether the experimentally studied mutations preserve the natural bias toward stability-decreasing mutations affects the apparent accuracy of the prediction algorithms
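
The effect of this bias can be illustrated with a short Python sketch. A trivial predictor that always answers "decrease" is scored on two hypothetical test sets. The labels below are invented for illustration and are not data from this study or from any of the three algorithms.

```python
def accuracy(predicted, observed):
    """Fraction of mutations whose predicted direction matches experiment."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)


def always_decrease(n):
    """A predictor that says every mutation decreases stability."""
    return ["decrease"] * n


# Two hypothetical sets of 10 mutations each (invented labels).
mostly_destabilizing = ["decrease"] * 8 + ["increase"] * 2
balanced = ["decrease"] * 5 + ["increase"] * 5

print(accuracy(always_decrease(10), mostly_destabilizing))  # 0.8
print(accuracy(always_decrease(10), balanced))              # 0.5
```

The same predictor looks markedly better on the first set only because that set contains more destabilizing mutations, which is why the class balance of a test set has to be reported alongside accuracy.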

  21. Correlation with known data
      • Reported correlations for these methods are quite high (> 0.7)
      • On the data compared here, the correlations are quite low
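
For reference, a correlation of this kind could be computed as below. The slides do not say which correlation coefficient was used, so the Pearson coefficient is an assumption here. The function name `pearson` and its two arguments (predicted and experimentally measured stability changes, in the same mutation order) are illustrative.

```python
import math


def pearson(predicted, measured):
    """Pearson correlation between predicted and measured stability changes."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0.0 or var_m == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_m)
```

A value near 1 means the predictions track the measurements closely. A value near 0 means there is essentially no linear relationship.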

  22. Notes
      • Local installations of blast and netblast are on cologne:
        • /usr1/blast-2.2.13/
        • /usr1/netblast-2.2.13/
      • Java SDK on cologne:
        • /usr1/j2sdk1.4.2_11/

  23. Acknowledgements Judith Klein-Seetharaman Christopher Jon Jursa Pitt Information sciences (for developing web interface)
