DISTANCE MATRIX-BASED APPROACH TO PROTEIN STRUCTURE PREDICTION

Andrzej Kloczkowski, Robert L. Jernigan, Zhijun Wu, Guang Song, Lei Yang (Iowa State University, USA); Andrzej Kolinski, Piotr Pokarowski (Warsaw University, Poland).


Presentation Transcript


  1. DISTANCE MATRIX-BASED APPROACH TO PROTEIN STRUCTURE PREDICTION • Andrzej Kloczkowski, Robert L. Jernigan, Zhijun Wu, Guang Song, Lei Yang (Iowa State University, USA) • Andrzej Kolinski, Piotr Pokarowski (Warsaw University, Poland)

  2. Matrices containing structural information • Distance matrix $(d_{ij})$ • Matrix of squared distances $D = (d_{ij}^2)$ • Contact matrix $C = (c_{ij})$, with $c_{ij} = 1$ if $d_{ij} < d_{cutoff}$ and $c_{ij} = 0$ otherwise • Laplacian of $C$ (Kirchhoff matrix): $L_C = \mathrm{diag}(\sum_j c_{ij}) - C$
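The four matrices on this slide can be built directly from a set of coordinates. The following is a minimal sketch, assuming Cα-like coordinates in Å, the usual convention that two points are in contact when they are closer than the cutoff, and an illustrative cutoff of 7 Å (the slides do not state a value). The function name `structural_matrices` is hypothetical.

```python
import numpy as np

def structural_matrices(coords, cutoff=7.0):
    """Return (d, D, C, L) for an (n, 3) array of coordinates.

    d: distance matrix, D: squared distances, C: contact matrix,
    L: Laplacian (Kirchhoff matrix) of C. The cutoff is an assumed value.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    D = np.sum(diff ** 2, axis=-1)          # squared distances d_ij^2
    d = np.sqrt(D)                          # distances d_ij
    C = (d < cutoff).astype(int)            # contact if closer than the cutoff
    np.fill_diagonal(C, 0)                  # no self-contacts
    L = np.diag(C.sum(axis=1)) - C          # diag(sum_j c_ij) - C
    return d, D, C, L

# Toy example: three points spaced 3.8 A apart plus one isolated point.
coords = np.array([[0.0, 0.0, 0.0],
                   [3.8, 0.0, 0.0],
                   [7.6, 0.0, 0.0],
                   [20.0, 0.0, 0.0]])
d, D, C, L = structural_matrices(coords, cutoff=7.0)
print(C)
print(L)
```

Each row of the Laplacian sums to zero, which is why it is singular and only a generalized inverse exists.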

  3. $L_C^{-1}$, the generalized inverse of $L_C$, defines the covariance between residue fluctuations in elastic network models. • Similarly, we can define the Laplacian of $D$, $L_D$, and its generalized inverse $L_D^{-1}$.

  4. Spectral decomposition of structural matrices • $A = \sum_k \lambda_k v_k v_k^T$ expresses $A$ through its eigenvalues $\lambda_k$ and corresponding eigenvectors $v_k$

  5. Spectral decomposition of a squared distance matrix • The spectral decomposition of a squared distance matrix is a complete and simple description of a system of points. • It has at most 5 nonzero, interpretable terms: the dominant eigenvector is proportional to $r^2$, the squared distance of each point from the center of mass, and the next three are the principal components of the system of points.
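The "at most 5 nonzero terms" statement can be checked numerically. The sketch below uses a random three-dimensional point set as a stand-in for a protein (an assumption made only for illustration), counts the eigenvalues of the squared distance matrix that are not numerically zero, and reports how strongly the dominant eigenvector correlates with $r^2$.

```python
import numpy as np

def squared_distance_matrix(coords):
    """Return D with D[i, j] = |x_i - x_j|^2 for an (n, 3) coordinate array."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sum(diff ** 2, axis=-1)

rng = np.random.default_rng(0)
coords = rng.normal(size=(30, 3))        # toy point set, not a real protein
coords -= coords.mean(axis=0)            # put the center of mass at the origin

D = squared_distance_matrix(coords)
eigvals, eigvecs = np.linalg.eigh(D)     # D is symmetric, so eigh applies

# Sort the terms by absolute eigenvalue, largest first.
order = np.argsort(-np.abs(eigvals))
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Count eigenvalues that are not numerically zero.
tol = 1e-8 * np.abs(eigvals[0])
n_nonzero = int(np.sum(np.abs(eigvals) > tol))

# Rebuild D from the nonzero terms only: sum_k lambda_k v_k v_k^T.
V = eigvecs[:, :n_nonzero]
D_rebuilt = (V * eigvals[:n_nonzero]) @ V.T

# Correlation of the dominant eigenvector with r^2 (its sign is arbitrary).
r2 = np.sum(coords ** 2, axis=1)
corr = abs(np.corrcoef(eigvecs[:, 0], r2)[0, 1])
print("nonzero terms:", n_nonzero)
print("|correlation| of dominant eigenvector with r^2:", corr)
```

The rank bound follows from writing $D = r^2 \mathbf{1}^T + \mathbf{1} (r^2)^T - 2XX^T$, where $X$ is the centered coordinate matrix: two rank-one terms plus a term of rank at most three.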

  6. CN: contact number • PECM: principal eigenvector of the contact matrix • GNM: fluctuations of residues computed from the Gaussian Network Model (Bahar et al. 1997) • SVR: Support Vector Regression, a variant of SVM for continuous variables • B-factor: temperature factor from X-ray crystallography

  7. B-factor correlates with $r^2$, the squared distance from the center of mass (Petsko 1980) • Fluctuations of residues correlate with the inverse of their contact number (Halle 2002)

  8. Approximation of distance matrices • $A = \sum_k \lambda_k v_k v_k^T$ • We used a nonredundant set of 680 structures from the ASTRAL database • $r^2$ alone approximates structures with a DRMS of 7.3 Å • $r^2$ combined with the first principal component approximates structures with a DRMS of 4.0 Å
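One way to picture this approximation is to truncate the spectral decomposition of the squared distance matrix after k terms and measure the distance root-mean-square deviation (DRMS) from the true distances. The sketch below does this for a random point set. It is an illustration of the idea only: the helper names are hypothetical, the DRMS definition over pairs i < j is an assumption, and the toy data will not reproduce the 7.3 Å and 4.0 Å figures reported for the ASTRAL set.

```python
import numpy as np

def pairwise_distances(coords):
    """Return the (n, n) matrix of Euclidean distances."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))

def rank_k_distances(D, k):
    """Distances rebuilt from the k largest-|eigenvalue| terms of D."""
    eigvals, eigvecs = np.linalg.eigh(D)
    order = np.argsort(-np.abs(eigvals))[:k]
    D_k = (eigvecs[:, order] * eigvals[order]) @ eigvecs[:, order].T
    # A truncated sum can go slightly negative; clip before the square root.
    return np.sqrt(np.clip(D_k, 0.0, None))

def drms(d_a, d_b):
    """Root-mean-square difference of two distance matrices over pairs i < j."""
    iu = np.triu_indices_from(d_a, k=1)
    return float(np.sqrt(np.mean((d_a[iu] - d_b[iu]) ** 2)))

rng = np.random.default_rng(0)
coords = rng.normal(scale=5.0, size=(40, 3))   # toy point set, not a protein
d_true = pairwise_distances(coords)
D = d_true ** 2

errors = {k: drms(d_true, rank_k_distances(D, k)) for k in range(1, 6)}
for k, err in errors.items():
    print(f"{k} term(s): DRMS = {err:.3g}")
```

With all five terms the distances are recovered to numerical precision, consistent with the rank bound for points in three dimensions.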

  9. Current work • Prediction of $r^2$ from the sequence with SVR • Prediction of the first structural component from the sequence

  10. Principal Component Analysis of Multiple HIV-1 Protease Structures • 164 X-ray PDB structures, 28 NMR PDB structures, and 10,000 structures (snapshots) from molecular dynamics simulations were analysed. • Principal component analysis was performed on each of these three datasets. • The results were compared with normal modes computed from the Anisotropic Network Model, an elastic network model that accounts for the anisotropy of residue fluctuations in the protein.
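Principal component analysis of a structural ensemble can be sketched as a singular value decomposition of the mean-centered coordinates. The example below assumes the structures are already superimposed on a common frame and uses a synthetic ensemble (one planted motion plus small noise) instead of the HIV-1 protease data. The function name `ensemble_pca` is hypothetical.

```python
import numpy as np

def ensemble_pca(ensemble):
    """PCA of an ensemble of shape (n_structures, n_atoms, 3).

    Assumes the structures are already superimposed. Returns the variances
    (largest first) and the principal components as rows of length 3*n_atoms.
    """
    n = ensemble.shape[0]
    flat = ensemble.reshape(n, -1)
    centered = flat - flat.mean(axis=0)          # deviations from the mean structure
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variances = s ** 2 / (n - 1)                 # eigenvalues of the covariance matrix
    return variances, vt

# Synthetic ensemble: a reference structure moving along one planted direction.
rng = np.random.default_rng(1)
n_structures, n_atoms = 50, 20
reference = rng.normal(size=n_atoms * 3)
direction = rng.normal(size=n_atoms * 3)
direction /= np.linalg.norm(direction)
amplitudes = rng.normal(scale=2.0, size=n_structures)
noise = rng.normal(scale=0.05, size=(n_structures, n_atoms * 3))
flat = reference + amplitudes[:, None] * direction + noise
ensemble = flat.reshape(n_structures, n_atoms, 3)

variances, pcs = ensemble_pca(ensemble)
overlap = abs(pcs[0] @ direction)                # 1.0 means identical directions
print("fraction of variance in PC1:", variances[0] / variances.sum())
print("overlap of PC1 with planted motion:", overlap)
```

The same overlap measure (a normalized dot product) is what allows principal components to be compared with elastic network normal modes.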

  11. The α-carbon trace of the HIV-1 protease structure

  12. Elastic network models • Rubber elasticity (polymers, Flory) • Intrinsic motions of structures (Tirion 1996) • Simple elastic networks of uniform material • Appropriate for the largest, most important domain motions of proteins, which are independent of many structural details • High-resolution structures are not needed to learn about important motions • Rubbery bodies with well-defined, highly controlled motions

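  13. Elastic Network Models: Calculating Protein Position Fluctuations • $V_{tot}(t) = (\gamma/2)\,\mathrm{tr}[\Delta R(t)^T\,\Gamma\,\Delta R(t)]$ • $\langle \Delta R_i \cdot \Delta R_j \rangle = (1/Z_N) \int (\Delta R_i \cdot \Delta R_j)\,\exp\{-V_{tot}/kT\}\,d\{\Delta R\} = (3kT/\gamma)\,[\Gamma^{-1}]_{ij}$ • $\Gamma$ = Kirchhoff matrix of contacts • Compute normal modes for fluctuations and correlations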
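The covariance formula on this slide can be evaluated with an eigendecomposition of the Kirchhoff matrix, leaving out the zero mode. The following is a minimal sketch under stated assumptions: an idealized helix geometry stands in for a real Cα trace, the 7 Å cutoff is illustrative, $kT/\gamma$ is set to 1 (arbitrary units), and the network is assumed to be connected so that exactly one eigenvalue is zero. The function names are hypothetical.

```python
import numpy as np

def kirchhoff(coords, cutoff=7.0):
    """Kirchhoff (Laplacian) matrix of contacts for an (n, 3) coordinate array."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=-1))
    contacts = (dist < cutoff).astype(float)
    np.fill_diagonal(contacts, 0.0)
    return np.diag(contacts.sum(axis=1)) - contacts

def gnm_covariance(kirchhoff_matrix, kT_over_gamma=1.0):
    """Return (3kT/gamma) * generalized inverse, plus the mode eigenvalues.

    Assumes a connected network, so only the first eigenvalue is zero.
    """
    w, v = np.linalg.eigh(kirchhoff_matrix)      # eigenvalues in ascending order
    modes = v[:, 1:]                             # drop the zero (rigid-body) mode
    inverse = (modes / w[1:]) @ modes.T          # sum_k v_k v_k^T / lambda_k
    return 3.0 * kT_over_gamma * inverse, w

# Idealized helix: 2.3 A radius, 1.5 A rise and 100 degrees per residue.
n = 30
t = np.arange(n)
angle = np.deg2rad(100.0) * t
coords = np.column_stack([2.3 * np.cos(angle), 2.3 * np.sin(angle), 1.5 * t])

G = kirchhoff(coords)
cov, eigenvalues = gnm_covariance(G)
msf = np.diag(cov)                               # mean-square fluctuation per residue
print("terminal vs. central fluctuation:", msf[0], msf[n // 2])
```

In this toy chain the terminal residues have fewer contacts and therefore larger predicted fluctuations than the central ones.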

  14. HIV Reverse Transcriptase: Slowest Motion • Push-pull hinge

  15. Modes of Motion: HIV Protease • Panels: Mode 1, Mode 2, Mode 3 • Three ways to open the flaps

  16. NMR Structures Fit Elastic Networks Better than X-Ray Structures • HIV protease: overlaps between directions of motions (dot products of vectors) • Includes many drug-bound structures • Distortions for drug binding are intrinsic to the protein structure • Results for 164 X-ray and 28 NMR HIV protease structures

  17. Cumulative Overlaps with NMR Motions • NMR agreement is better than X-ray
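The overlap between two directions of motion is a normalized dot product, and a cumulative overlap combines the overlaps of one direction with several modes. The sketch below uses the common root-sum-of-squares definition, which is an assumption here since the slides do not spell out the formula, and it assumes the modes are mutually orthogonal. The function names are hypothetical.

```python
import numpy as np

def overlap(a, b):
    """Absolute cosine between two displacement vectors (0 to 1)."""
    return abs(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def cumulative_overlap(pc, modes):
    """Root-sum-of-squares overlap of one vector with a set of orthogonal modes."""
    return float(np.sqrt(sum(overlap(pc, m) ** 2 for m in modes)))

# Toy check: an orthonormal basis of "modes" and a random "principal component".
rng = np.random.default_rng(3)
modes = np.eye(6)                      # rows are orthonormal mode vectors
pc = rng.normal(size=6)

partial = cumulative_overlap(pc, modes[:3])   # first three modes only
full = cumulative_overlap(pc, modes)          # complete basis
print("cumulative overlap with 3 modes:", partial)
print("cumulative overlap with all modes:", full)
```

With a complete orthonormal set of modes the cumulative overlap reaches 1, so the curve for the first few modes shows how much of an observed motion the slowest modes capture.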

  18. Structural Refinement Using Distribution of Distances • We have developed a method of refining NMR structures using derived distance constraints and mean-force potentials. • The original NMR experimental constraints for the structures were downloaded from BioMagResBank. • The structures were refined using the default dynamic simulated annealing protocol implemented in the CNS software (Brunger et al., Yale University). • We also used mean-force potentials, $E = -kT \ln P(r)$, by adding them to the energy function of the NMR modeling software CNS. • The structures improved significantly (in terms of RMSD, energy, NOEs, etc.) after refinement with the database-derived mean-force potentials.
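A mean-force potential of this kind is conventionally obtained by inverting a distance distribution, $E(r) = -kT \ln P(r)$, so that frequently observed distances receive low energy. The sketch below derives such a potential from a histogram. It is a simplification under stated assumptions: the distances are synthetic instead of database-derived, no reference-state normalization is applied, and the pseudocount that keeps empty bins finite is a hypothetical choice, as is the function name. It does not show how the term is added to CNS.

```python
import numpy as np

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K)

def mean_force_potential(distances, bins, temperature=298.15, pseudocount=1.0):
    """Return bin centers, probabilities P(r) and energies E(r) = -kT ln P(r)."""
    counts, edges = np.histogram(distances, bins=bins)
    counts = counts + pseudocount                 # keeps empty bins finite
    prob = counts / counts.sum()
    energy = -K_B * temperature * np.log(prob)    # kcal/mol
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, prob, energy

# Synthetic distances standing in for a database-derived distribution.
rng = np.random.default_rng(2)
distances = rng.normal(loc=5.0, scale=0.5, size=5000)
bins = np.linspace(2.0, 8.0, 25)

centers, prob, energy = mean_force_potential(distances, bins)
best = int(np.argmin(energy))
print("most favorable distance bin (A):", centers[best])
```

The energy minimum sits at the most populated distance bin, which is the behavior a refinement term of this form relies on.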

  19. CASPR 2006 • We successfully used this method in the CASPR 2006 structure refinement experiment. • The figure shows the application of our method to a model of 1WHZ (70 residues): an improvement from 2.19 Å to 1.80 Å was obtained.

  20. Distance Intervals • The distances between atoms i and j are given with their possible ranges. • NP-hard!

  21. A Generalized Distance Geometry Problem • Figure labels: atoms $i$ and $j$, distance $d_{i,j}$, radii $\Delta r_i$ and $r_j$ • Root mean square fluctuations, B-factors

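  22. Protein 1AX8 (1017 atoms) • Data generation: $f_i$ is the rms fluctuation of atom $i$ in the original structure $y$; $S = \{(i,j) : d_{i,j} = \|y_i - y_j\| < 5\ \text{Å}\}$; $l_{i,j} = d_{i,j} - f_i - f_j$; $u_{i,j} = d_{i,j} + f_i + f_j$ • Problem solved: $r_i$ is the fluctuation radius of atom $i$; $\max_{x,r} \sum_i \Delta r_i^3$ subject to $d_{i,j} = \|x_i - x_j\|$, $l_{i,j} \le d_{i,j} - \Delta r_i - \Delta r_j$, $u_{i,j} \ge d_{i,j} + \Delta r_i + \Delta r_j$ for $(i,j) \in S$ • Computed: RMSD$(x, y) = 3.6 \times 10^{-7}$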
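The data-generation step on this slide (building the pair set $S$ and the bounds $l_{i,j}$, $u_{i,j}$ from a known structure and per-atom fluctuations) can be sketched as follows. The coordinates and fluctuations are toy values, not the 1AX8 data, and the function name `distance_intervals` is hypothetical. The optimization over $x$ and $r$ is not shown.

```python
import numpy as np

def distance_intervals(y, f, cutoff=5.0):
    """Return {(i, j): (lower, upper)} for all pairs closer than the cutoff.

    y: (n, 3) original coordinates, f: per-atom rms fluctuations.
    lower = d_ij - f_i - f_j and upper = d_ij + f_i + f_j.
    """
    y = np.asarray(y, dtype=float)
    f = np.asarray(f, dtype=float)
    bounds = {}
    n = len(y)
    for i in range(n):
        for j in range(i + 1, n):
            d = float(np.linalg.norm(y[i] - y[j]))
            if d < cutoff:                        # pair belongs to the set S
                bounds[(i, j)] = (d - f[i] - f[j], d + f[i] + f[j])
    return bounds

# Toy data: a 3-4-5 right triangle with small fluctuations.
y = np.array([[0.0, 0.0, 0.0],
              [3.0, 0.0, 0.0],
              [3.0, 4.0, 0.0]])
f = np.array([0.5, 0.25, 0.25])

bounds = distance_intervals(y, f)
for pair, (lower, upper) in bounds.items():
    print(pair, lower, upper)
```

The pair (0, 2) is 5 Å apart and is excluded by the strict inequality, so only two intervals are produced in this example.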

  23. Atomic Fluctuations • Figure: original fluctuations $f_i$ compared with computed radii $\Delta r_i$

  24. Acknowledgments • NIH support: • 1R01GM081680-01 (AKlo) • 1R01GM073095-01A2 (RLJ) • 1R01GM072014-01 (RLJ)
