correlating graph-theoretical centrality indices with interface residue propensity




**correlating graph-theoretical centrality indices with interface residue propensity**
or: where do things stick together?
Stefan Maetschke
Teasdale Group

**…a bit more specific**
• Prediction of interface residues
• Protein-RNA interfaces
• Machine learning methods
• Structural information
• Graph-topological features

**something for the visual cortex**
Figure panels: Protein-RNA complex, Binding site, Contact graph
Sources: [JMol, 1R3E_A] [Terribilini et al. 2006] [Jung Library]

**questions**
Most predictors are sequence based:
• What impact does structural information have on prediction accuracy?
• Which features are predictive of interface residues?

**obvious features**
Interface residue…
• is on surface => Accessible surface area
• has to bind => Physico-chemical properties
• must be stabilized => Contact graph topology
• prefers flat surface => not really
• is conserved => maybe not that much

**accessible surface area (ASA)**
http://www.see.ed.ac.uk/~tduren/research/surface_area/
http://www.ysbl.york.ac.uk/~ccp4mg/ccp4mg_help/analysis.html

**physico-chemical properties**
• AAIndex database
• approx. 400 indices
• AUC over 144 protein chains: 4304 binding and 27932 non-binding, sequence similarity < 30%
Figure labels: Hydrophobicity Inside/Outside Conformation Partition Coefficient

**patch type comparison**
• Naïve Bayes
• PSI-BLAST profiles
• AUC
• 5-fold cross-validation
• RB144 data set

**betweenness-centrality (BC)**
Figure labels: s, v, t
http://en.wikipedia.org/wiki/Image:Graph_betweenness.svg

**BC for contact graph**
• 1FJG_K
• AUC = 0.71
• Red: interface residue
• Size: betweenness centrality
Histogram: binned BC over RB144

**combined features**
• WRC: distance-weighted retention coefficient
• BC: betweenness centrality
• ASA: accessible surface area
• 5-fold cross-validation, RB144
• Patch sizes: sequential -> 11, topological -> 19, spatial -> 19

**summary**
• Patch size is critical for sequential patches
• Spatial/topological patches perform better
• Structural information helps, but not much: +5%
• Novelty: centrality indices as predictors
• SVM superior to NB
• Top prediction accuracy, as far as one can tell
• Accuracy in general is still low (MCC < 0.4)

**what’s next…**
• Prediction of disease-associated SNPs
• Graph-spectral methods
• Protein function prediction

**acknowledgments**
• Zheng Yuan – Data sets and much more …
• Karin Kassahn – Aminoacyl-tRNA synthetases
http://en.wikipedia.org/wiki/Aminoacyl_tRNA_synthetase
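
**sketch: betweenness centrality on a contact graph**
The slides use betweenness centrality of a residue contact graph as a feature but do not say how the graph is built or how BC is computed. The following is a minimal sketch under stated assumptions, not the presenter's implementation: residues are reduced to one 3D point each, two residues are in contact when their points lie within a hypothetical cutoff of 8 Å, and BC is computed with Brandes' algorithm for unweighted, undirected graphs. The function names `contact_graph` and `betweenness_centrality` and the toy coordinates are illustrative only.

```python
from collections import deque
from itertools import combinations
import math


def contact_graph(coords, cutoff=8.0):
    """Adjacency sets linking residues whose points lie within `cutoff`.

    ASSUMPTION: one representative point per residue and an 8 angstrom
    cutoff; the slides do not state the contact definition.
    """
    graph = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if math.dist(coords[i], coords[j]) <= cutoff:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def betweenness_centrality(graph):
    """Unnormalized betweenness centrality (Brandes' algorithm)."""
    bc = {v: 0.0 for v in graph}
    for s in graph:
        # Breadth-first search from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in graph}
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in graph}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies, farthest nodes first.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair (s, t) was counted from both endpoints.
    return {v: b / 2.0 for v, b in bc.items()}


# Toy example: five points on a line, 5 angstroms apart, so only
# consecutive points are in contact and the graph is a simple chain.
toy_coords = [(5.0 * i, 0.0, 0.0) for i in range(5)]
toy_bc = betweenness_centrality(contact_graph(toy_coords))
print(toy_bc)  # {0: 0.0, 1: 3.0, 2: 4.0, 3: 3.0, 4: 0.0}
```

In the toy chain the middle point lies on the most shortest paths and gets the highest score. The per-residue scores could then be fed to a classifier alongside features such as ASA.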