
Bayesian Refinement of Protein Functional Site Matching

Bayesian Refinement of Protein Functional Site Matching. Kanti V Mardia, Vysaul B Nyirongo*, Peter J Green, Nicola D Gold, David R Westhead. Presented by Deephan, Mohan.


Presentation Transcript


  1. Bayesian Refinement of Protein Functional Site Matching Kanti V Mardia, Vysaul B Nyirongo*, Peter J Green, Nicola D Gold, David R Westhead Presented by Deephan, Mohan

  2. Presentation Flow Disclaimer: Contrary to the assumption made by the authors, the paper presenter does have a thorough understanding of all the concepts from advanced statistics, graph theory and structural genomics discussed in the paper. • Background • Conventional Methods • Bayesian Refinement • Results • Conclusion

  3. Motivation • Structural Genomics • Structural Site comparison • Functional Site comparison • Knowledge based methods • Similarity Search Algorithms

  4. Protein Functional Site Matching • Modeled as a graph theoretic problem • Shape analysis of Proteins • Crucial for prediction of molecular interactions • Infer functional relationship of proteins • Classification of Binding Patterns • Resource: SITESDB Database • Contains Protein Structural data • Entries formed from PDB (Protein Data Bank)

  5. The Methodology • Graph Similarity Problem • Objective: matching functional sites by comparing amino acid configurations (Cα and Cβ atoms) • Functional site – Graph • Amino acid positions – Vertices • Refining the Graph Match • Application of Bayesian Strategy • Markov Chain Monte Carlo (MCMC) procedure

  6. Need for Bayesian Refinement?? Bayesian Inference: • Complete Distribution of matches • Solution space • Noise Adaptation • Flexibility • Edge over combinatorial methods

  7. Bayesian Model • Common tool used in statistical inference • Based on the joint posterior distribution • Posterior is proportional to the product of prior density and likelihood • Biologically speaking: • Prior Density – distribution of the transformation parameters • Likelihood – related to the matches between functional sites

  8. Representation and Matching • Functional sites X and Y are represented as graphs G1 and G2 • Vertex sets V1 = {Xj, j = 1, 2, ..., m}, V2 = {Yk, k = 1, 2, ..., n} • Xj, Yk – coordinates of the amino acids at the jth and kth positions of X and Y • x1j, y1k – Cα coordinates for X, Y • x2j, y2k – Cβ coordinates for X, Y • x1 = {x1j : j = 1, ..., m}, x2 = {x2j : j = 1, ..., m} • y1 = {y1k : k = 1, ..., n}, y2 = {y2k : k = 1, ..., n}

  9. Graph Theoretic Approach • Objective: creation of the vertex product graph (Hv) • Hv = G1 ○v G2 • VH = V1 x V2 • An edge exists between two vertices vh = (Xj, Yk) and vh' = (Xj', Yk') ∈ VH, with j ≠ j' and k ≠ k', when • 1. the absolute difference between the Cα distances |x1j - x1j'| and |y1k - y1k'|, and • 2. the absolute difference between the Cβ distances |x2j - x2j'| and |y2k - y2k'| are both less than 1.5Å (the matching distance threshold).
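
The edge rule on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `product_graph_edges`, the list-of-tuples coordinate layout and the toy inputs are hypothetical and do not come from the paper; only the j ≠ j', k ≠ k' condition and the 1.5Å threshold are taken from the slide.

```python
import math
from itertools import combinations


def product_graph_edges(x1, x2, y1, y2, threshold=1.5):
    """Build the vertex product graph of two sites.

    x1, x2: C-alpha and C-beta coordinates of site X (lists of 3-tuples).
    y1, y2: C-alpha and C-beta coordinates of site Y (lists of 3-tuples).
    Returns (vertices, edges); a vertex (j, k) pairs residue j of X
    with residue k of Y.
    """
    vertices = [(j, k) for j in range(len(x1)) for k in range(len(y1))]
    edges = set()
    for (j, k), (jp, kp) in combinations(vertices, 2):
        if j == jp or k == kp:
            continue  # an edge requires j != j' and k != k'
        # Difference of intra-site C-alpha distances, then C-beta distances.
        d_alpha = abs(math.dist(x1[j], x1[jp]) - math.dist(y1[k], y1[kp]))
        d_beta = abs(math.dist(x2[j], x2[jp]) - math.dist(y2[k], y2[kp]))
        if d_alpha < threshold and d_beta < threshold:
            edges.add(((j, k), (jp, kp)))
    return vertices, edges
```

Groups of mutually connected vertices in this graph correspond to sets of residue pairings whose internal distances are compatible, which is what the graph-matching step looks for.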

  10. Bayesian Alignment • Matching between the amino acids of X and Y is represented by a matrix M, where Mjk = 1 if the jth amino acid of X corresponds to the kth amino acid of Y, and Mjk = 0 otherwise • The transformation that brings the configurations into alignment is given by • xij = Ayik + τ for Mjk = 1, i = 1, 2 • A – Rotation Matrix, τ – Translation vector
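
As a small illustration of the alignment relation x = Ay + τ and of reading matches out of M, the following sketch may help. The helper names (`rotation_z`, `transform`, `matched_pairs`) are hypothetical, and a rotation about the z-axis is used purely as an example of a rotation matrix A.

```python
import math


def rotation_z(theta):
    """Example rotation matrix A: rotation by theta radians about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def transform(A, tau, y):
    """Apply x = A y + tau to a single 3-D point y."""
    return tuple(sum(A[r][c] * y[c] for c in range(3)) + tau[r] for r in range(3))


def matched_pairs(M):
    """List the (j, k) index pairs for which M[j][k] == 1."""
    return [(j, k) for j, row in enumerate(M) for k, v in enumerate(row) if v == 1]
```

For example, `transform(rotation_z(math.pi / 2), (1.0, 0.0, 0.0), (1.0, 0.0, 0.0))` rotates the point onto the y-axis and then shifts it by one unit along x.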

  11. Bayesian Modeling (contd) Joint Posterior Distribution: • p(A), p(τ) and p(σ) denote the prior distributions for A, τ and σ • |A| – Jacobian of the transformation • Gaussian noise N(0, σ²) is present in the atomic positions x1j and y1k
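
The Gaussian noise assumption can be turned into a log-likelihood for a set of matched residue pairs. The sketch below is not the joint posterior from the paper, which also contains the priors and the Jacobian term; it covers only the Gaussian part, and it assumes independent N(0, σ²) noise on both sites so that each residual coordinate has variance 2σ². The function name is hypothetical.

```python
import math


def gaussian_log_likelihood(pairs, sigma):
    """Log-likelihood of matched pairs under isotropic Gaussian noise.

    pairs: list of (x, y_aligned) 3-D points, where y_aligned = A y + tau.
    Assumption: noise N(0, sigma^2) on both sites, so the residual
    x - y_aligned has variance 2 * sigma^2 per coordinate.
    """
    var = 2.0 * sigma ** 2
    total = 0.0
    for x, y in pairs:
        sq = sum((a - b) ** 2 for a, b in zip(x, y))
        # 3-D isotropic normal log-density of the residual.
        total += -0.5 * sq / var - 1.5 * math.log(2.0 * math.pi * var)
    return total
```

Closer matches give a higher value, so a matching with small residuals is favoured over one with large residuals at the same σ.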

  12. Bayesian Modeling (contd) Side chain orientation: the model is extended by taking into account the relative orientation of Cα and Cβ in matching amino acids

  13. MCMC Refinement Step • Markov Chain Monte Carlo (MCMC) is used to sample the full joint distribution p(M, A, τ, σ, x1, y1, x2, y2) • p(M, A, τ, σ, x1, y1, x2, y2) is a function of the RMSD and the angle for the orientation difference between amino acids
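
To give a feel for how MCMC sampling works, here is a deliberately reduced random-walk Metropolis sketch. It samples only the translation τ, with the matching fixed, the rotation set to the identity and a flat prior; the procedure in the paper also updates M, A and σ. All names, step sizes and iteration counts are illustrative assumptions.

```python
import math
import random


def log_target(tau, xs, ys, sigma):
    """Unnormalised log-posterior of tau: Gaussian residuals, flat prior."""
    var = 2.0 * sigma ** 2  # assumption: N(0, sigma^2) noise on both sites
    total = 0.0
    for x, y in zip(xs, ys):
        total -= sum((x[i] - y[i] - tau[i]) ** 2 for i in range(3)) / (2.0 * var)
    return total


def sample_translation(xs, ys, sigma=0.5, steps=4000, burn_in=1000,
                       step_size=0.2, seed=0):
    """Random-walk Metropolis sampler for the translation vector tau."""
    rng = random.Random(seed)
    tau = [0.0, 0.0, 0.0]
    current = log_target(tau, xs, ys, sigma)
    samples = []
    for t in range(steps):
        proposal = [v + rng.gauss(0.0, step_size) for v in tau]
        candidate = log_target(proposal, xs, ys, sigma)
        # Accept with probability min(1, exp(candidate - current)).
        if candidate >= current or rng.random() < math.exp(candidate - current):
            tau, current = proposal, candidate
        if t >= burn_in:
            samples.append(tuple(tau))
    return samples
```

Averaging the returned samples gives a posterior mean for τ, and their spread gives an indication of the uncertainty, which is the kind of information a single combinatorial solution does not provide.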

  14. Significance of RMSD • RMSD – Root Mean Square Deviation • Matches of lower RMSD over larger numbers of matching residues are more statistically significant • MCMC refinement reduced the RMSD and increased the number of matching residues
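
For reference, the RMSD over a set of matched and already aligned atom pairs can be computed as below. The function name and the input layout are assumptions made for this example.

```python
import math


def rmsd(pairs):
    """Root mean square deviation over matched 3-D point pairs.

    pairs: non-empty list of (x, y_aligned) tuples of 3-D coordinates.
    """
    total = sum(math.dist(x, y) ** 2 for x, y in pairs)
    return math.sqrt(total / len(pairs))
```

Because the sum is divided by the number of pairs, the RMSD alone does not reward longer matches, which is why it is reported together with the number of matching residues.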

  15. Decision tree for refining the graph solution by the MCMC method. Boxes with curved corners show processes and their output, while boxes with sharp corners are branching conditions. The procedure starts with the graph solution MG. The graph solution's RMSD and number of matches are denoted by RMSDG and LG respectively. MCMC is re-iterated until the MCMC solution MB is better. The RMSD and number of matches for MB are denoted by RMSDB and LB respectively. MB and MG are compared using 1) RMSDs and the number of matches or 2) P-values for MG and MB, denoted by PG and PB respectively.
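
The comparison between MG and MB could be coded along the following lines. The slide does not spell out the exact branching conditions, so the rules here are assumptions chosen for illustration: MB counts as better if it has more matches at an RMSD that is no worse, or the same number of matches at a lower RMSD, and the P-value variant simply prefers the smaller P-value.

```python
def better_by_rmsd_and_matches(rmsd_g, l_g, rmsd_b, l_b):
    """Assumed rule 1: compare RMSDs and numbers of matches."""
    if l_b > l_g and rmsd_b <= rmsd_g:
        return True  # more matches, RMSD not worse
    if l_b == l_g and rmsd_b < rmsd_g:
        return True  # same number of matches, lower RMSD
    return False


def better_by_p_value(p_g, p_b):
    """Assumed rule 2: the MCMC solution wins if its P-value is smaller."""
    return p_b < p_g
```

In a refinement loop, MCMC would be rerun from MG until one of these checks returns True or an iteration budget is exhausted.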

  16. Results • Two Binding Sites: • Alcohol dehydrogenase structure (60 amino acids) • 17-β hydroxysteroid dehydrogenase (63 amino acids) • 4 matching studies were performed • Each study was performed with and without considering the physico-chemical properties of the amino acids.

  17. Case 1: Site 1hdx_1 matching against its own SCOP family • 125/145 sites produced significant matches, increased to 131/145 after refinement • RMSD improved from > 1.5Å to less than 1Å • Increase in the number of matching residues

  18. Case 2: 17-β hydroxysteroid dehydrogenase and family • After the MCMC refinement step, significant matches increased from 248 to 318 of 326 sites • Increased number of matching residues at a similar RMSD • RMSD improvement in a minority of the sites

  19. Case 3: Alcohol dehydrogenase and superfamily • Matching sites increased from 200 to 324 • Case 4: Alcohol dehydrogenase and FAD/NAD(P)-binding domain • 12 sites improved after MCMC refinement

  20. Discussion of Results • MCMC refinement step provides significant improvement over graph matching techniques • Success – lack of dependence on strict distance matching criteria • Computationally expensive • Refinement adapts to shape variations in binding sites

  21. Thank You!!!!
