
Prediction of Bacterial Effectors using SVM and Naïve Bayes classifier

Sneha Joshi, MU Informatics Institute. November 30, 2009.


Presentation Transcript


  1. Prediction of Bacterial Effectors using SVM and Naïve Bayes classifier. Sneha Joshi, MU Informatics Institute. November 30, 2009.

  2. Effector Prediction
     • What are effectors:
     • Why predict effectors:
       • Prime candidates involved in host-pathogen interaction
       • Modulate host cell functions
     • What are our goals:
       • Develop a classifier to classify pathogenic proteins into effectors or non-effectors
       • Identify important features of the signal
       • Provide potential drug targets

  3. Available Methods
     • Experimental:
       • Translocation assays using fusion proteins of a putative effector with a reporter gene
       • Detection of effectors in the supernatant
       • Prior knowledge is required to screen for effectors experimentally
     • Computational:
       • Homology to known effectors (cannot predict novel effectors)
       • Transcriptional co-regulation
       • Few methods exist, and each is limited to a single secretion system

  4. SVM prediction (flow diagram)
     • Features from full length of protein → SVM 1
     • Features from N-terminal 25 amino acids → SVM 2
     • Features from C-terminal 25 amino acids → SVM 3
     • SVM 1, SVM 2, SVM 3 → Naïve Bayes classifier → Effectors / Non-effectors
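
The combining stage in the diagram above can be sketched as a small Bernoulli Naïve Bayes model over the three SVM outputs. This is only a minimal illustration, assuming each SVM emits a binary vote (1 = effector, 0 = non-effector); the slides do not say whether votes or decision values were passed on. The function names, the Laplace smoothing, and the toy training rows are all hypothetical.

```python
import math

def train_naive_bayes(votes, labels, alpha=1.0):
    """Fit a Bernoulli Naive Bayes model on binary SVM votes.

    votes  : list of tuples, one 0/1 vote per SVM (assumed input format)
    labels : list of 0/1 class labels (1 = effector)
    alpha  : Laplace smoothing constant (an assumption, not from the slides)
    """
    n_features = len(votes[0])
    model = {}
    for c in (0, 1):
        rows = [v for v, y in zip(votes, labels) if y == c]
        prior = (len(rows) + alpha) / (len(labels) + 2 * alpha)
        # Smoothed estimate of P(vote_j = 1 | class c) for each SVM j
        p_one = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[c] = (prior, p_one)
    return model

def predict(model, vote):
    """Return the class with the highest log posterior for one vote tuple."""
    scores = {}
    for c, (prior, p_one) in model.items():
        log_p = math.log(prior)
        for v, p in zip(vote, p_one):
            log_p += math.log(p if v == 1 else 1.0 - p)
        scores[c] = log_p
    return max(scores, key=scores.get)

# Toy data, invented for illustration: columns are (SVM 1, SVM 2, SVM 3) votes.
toy_votes = [(1, 1, 1), (1, 1, 0), (1, 0, 1), (0, 1, 1),
             (0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0)]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

model = train_naive_bayes(toy_votes, toy_labels)
print(predict(model, (1, 1, 0)))  # two of three SVMs vote "effector"
print(predict(model, (1, 0, 0)))  # only one SVM votes "effector"
```

With these toy rows the combiner behaves like a majority vote; on real data the learned per-SVM probabilities would weight the more reliable classifier more heavily.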

  5. Features from Protein sequence (example sequence: MLKYEERKLNNLTLSSFSKVGVSNDARL)
     • Amino acid composition
     • Dipeptide composition
     • Secondary structure
     • Relative solvent accessibility
     • Dielectric constant
     • Charge
     • Polar, non-polar, charged, acidic, basic amino acids
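
Three of the sequence-derived features listed above (amino acid composition, dipeptide composition, and charge) can be computed directly from the sequence, as in the rough sketch below. It uses the example sequence from the slide. The function names are hypothetical, and the charge rule (+1 for K/R, -1 for D/E, histidine ignored) is a simplifying assumption. Secondary structure, solvent accessibility, and dielectric constant need external predictors or lookup tables and are not covered here.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def amino_acid_composition(seq):
    """Fraction of each standard amino acid in seq (20 values)."""
    counts = Counter(seq)
    return {aa: counts.get(aa, 0) / len(seq) for aa in AMINO_ACIDS}

def dipeptide_composition(seq):
    """Fraction of each ordered residue pair among adjacent positions (400 values)."""
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = max(len(seq) - 1, 1)
    return {a + b: pairs.get(a + b, 0) / total
            for a, b in product(AMINO_ACIDS, repeat=2)}

def net_charge(seq):
    """Crude net charge: +1 per K/R, -1 per D/E (assumption: histidine ignored)."""
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative

def terminal_windows(seq, width=25):
    """N-terminal and C-terminal windows, matching the 25-residue regions on the slides."""
    return seq[:width], seq[-width:]

example = "MLKYEERKLNNLTLSSFSKVGVSNDARL"  # sequence shown on the slide
aac = amino_acid_composition(example)
dpc = dipeptide_composition(example)
n_term, c_term = terminal_windows(example)

print(len(example), net_charge(example))
print(n_term)
print(c_term)
```

The same functions can be applied to the full sequence or to either terminal window to build the three separate feature vectors.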

  6. Features from Nucleotide sequence: Distance from known effector

  7. Results: Data

  8. Results: SVM1: Full-length amino acids. Precision = TP/(TP+FP); Recall = TP/(TP+FN)

  9. Results: SVM2: N-terminal 25 amino acids. Precision = TP/(TP+FP); Recall = TP/(TP+FN)

  10. Results: SVM3: C-terminal 25 amino acids. Precision = TP/(TP+FP); Recall = TP/(TP+FN)
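
The precision and recall definitions used on the three results slides can be computed from true and predicted labels as in this small sketch. The label vectors are invented for illustration and are not the presentation's data. Returning 0.0 when a denominator is zero is a convention chosen here.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive class.

    Precision = TP / (TP + FP); Recall = TP / (TP + FN).
    A zero denominator yields 0.0 (a convention assumed for this sketch).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Illustrative labels only: 1 = effector, 0 = non-effector.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)  # 3 TP, 1 FP, 1 FN -> 0.75 0.75
```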

  11. Results • Effect of predicted secondary structure and solvent accessibility on prediction accuracy

  12. Results • Effect of serine on prediction accuracy

  13. Feature Selection
      • Feature space reduction
      • Correlation-based feature selection [1]
      • Hypothesis: Good feature subsets contain features highly correlated with the class yet uncorrelated with each other.
      • Feature space reduced to 36 dimensions for full length, 19 for N-terminal, and 25 for C-terminal.
      [1] Mark Hall, Correlation-based Feature Selection for Machine Learning
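
The hypothesis above is usually expressed through Hall's subset merit, k·r_cf / sqrt(k + k(k−1)·r_ff), where k is the subset size, r_cf the mean feature-class correlation, and r_ff the mean feature-feature correlation. The sketch below scores subsets with that merit and grows a subset greedily. It is illustrative only: absolute Pearson correlation stands in for whatever correlation measure was actually used, plain forward selection stands in for the search strategy, and the toy features `a`, `b`, `c` are invented.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def cfs_merit(columns, labels, subset):
    """Subset merit: k * r_cf / sqrt(k + k * (k - 1) * r_ff)."""
    k = len(subset)
    r_cf = sum(abs(pearson(columns[f], labels)) for f in subset) / k
    if k == 1:
        return r_cf
    pairs = list(combinations(subset, 2))
    r_ff = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

def greedy_cfs(columns, labels):
    """Forward selection: add the feature that most improves merit, stop when none does."""
    selected, best = [], 0.0
    remaining = sorted(columns)
    while remaining:
        scored = [(f, cfs_merit(columns, labels, selected + [f])) for f in remaining]
        candidate, merit = max(scored, key=lambda item: item[1])
        if merit <= best:
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best = merit
    return selected, best

# Toy data: 'b' duplicates 'a'; 'c' is as predictive as 'a' but uncorrelated with it.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "a": [1, 1, 1, 0, 0, 0, 0, 1],
    "b": [1, 1, 1, 0, 0, 0, 0, 1],
    "c": [1, 0, 1, 1, 0, 1, 0, 0],
}

selected, merit = greedy_cfs(columns, labels)
print(selected, round(merit, 3))
```

On this toy input the redundant copy `b` is left out, because adding it raises the feature-feature correlation without adding class information.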

  14. Results after feature selection

  15. Case study: Xanthomonas oryzae
      • Causes leaf blight of rice
      • Has T2SS and T3SS
      • The system detects 2 effector substrates of the type II secretion system along with 6 effectors of the type III secretion system.

  16. Future Work
      • Naïve Bayes Classifier:
      • Application to a biological system: Mycobacterium tuberculosis
      • Evolutionary study of effector proteins
      • Extending beyond bacterial secretion systems
      • Nematode effector proteins

  17. Acknowledgement • This work was supported by NSF Award #0845196 • Dmitry Korkin • Gavin Conant.

  18. Thank You.
