Abstract # 2624

A STATISTICAL ANALYSIS AND APPROACH TO PROTEIN SURFACE MODELING
Luticha Doucette†, James Halavin*, Paul Craig°, Herbert J. Bernstein‡
lutichadoucette@gmail.com, jjhsma@rit.edu, pac8612@rit.edu, yaya@dowling.edu


Presentation Transcript


†RIT Life Sciences, *RIT Mathematical Sciences, °RIT Chemistry, ‡Dowling College, Mathematics and Computer Science
Program No. 978.9, Abstract # 2624

In studying protein-protein interactions, it is important to describe the protein surface accurately, because that is where the interactions occur. The most common surface representation in molecular visualization programs is the Lee-Richards (LR) surface, which is generated by rolling a probe representing a solvent molecule over the van der Waals surface of the protein. This approach is computationally slow because the program must consider two or three atoms at a time. The resulting molecular surface resembles a van der Waals surface, but with reentrant surfaces bridging the gaps between reasonably close atoms. The LR surface algorithm also sometimes incorrectly assigns atoms to the surface of the protein. We are developing a new algorithm that we hope will be both more accurate and faster than current approaches. In the algorithm, atoms are first identified by their accessibility to an imaginary water molecule; for each such atom i, the algorithm then finds the number n of these atoms that lie within distance epsilon of atom i, that is, inside a sphere of radius epsilon centered on atom i. The sphere is divided into 8 sectors, and the algorithm determines how many of the n atoms contained in the sphere fall into each sector. If all 8 sectors contain at least one atom, atom i is considered an inside atom and given a value of 0. If at least one sector contains no atoms, atom i is considered an outside atom and given a value of 1. The algorithm is currently implemented in Minitab as a global macro. Atomic coordinates were extracted using an awk script and saved to text files.
Preliminary results show that the algorithm is flexible and can identify outside atoms in proteins of different shapes. A distinctive pattern also emerged: regardless of the protein, there was a limit on the percentage of atoms found on the outside, and the graphs show identical limits for all proteins studied thus far. The downside of this approach is that it is computationally burdensome. We are currently testing the algorithm with different proteins and investigating how the limits are associated with protein size and shape. Our next step will be to program the algorithm in a different language, such as Python, Java, or C++, to see whether a more robust language can decrease the run time. In conclusion, this new algorithm is effective on a variety of proteins and reveals a distinctive property of proteins, the limits, which may lead to new insights into protein surfaces.

Goals
• Create a faster, more accurate representation of a protein surface
• Implement the algorithm in a more robust language such as Python
• Investigate the implications of limits in protein surfaces

Algorithm
An imaginary surface is created around the center atom at (x(i), y(i), z(i)) and divided into 8 sectors. If at least one sector is empty, value = 1. If all sectors are filled, value = 0.

Materials and Methods
• Created global macros and used an awk script to obtain the coordinates of atoms from PDB files
• Opened Minitab in Windows. A global macro was created and added to the Minitab file menu.
• In Python, reinterpreted the algorithm and tested it on the same PDB files that were tested in Minitab

Results from Minitab
Figure 1. 3D Minitab Scatter Plot of 1UAQ
Figure 1 shows protein 1UAQ with epsilon = 40. Number of outside atoms found = 434.
Figure 2. 3D Scatter Plot of 1UAQ with Change in Epsilon
Figure 2 shows how a change in epsilon changes how many atoms are found. Epsilon = 4.6. Number of outside atoms = 1593.
Figure 3. 3D Scatter Plot of 3CTK
Figure 3 shows protein 3CTK with epsilon set to 70. Maximum radius was found to be 62.927. Number of outside atoms found = 469.
Table 1. Limiting Asymptotic Value for 1UAQ
Table 1 shows that as epsilon is increased, the number of outside atoms found reaches a limit.

Results from Minitab Continued
Figure 4. Plot of Maximum Radius vs Percent Outside
Proteins studied thus far show very similar asymptotes, as seen in Figure 4. All show a sharp decrease until the radius is about 20% of the total number of atoms, and then the curve levels off.
Figure 5. 3D Scatter Plot of 1AV1
The new algorithm is flexible in that it can identify outside atoms in unusually shaped proteins, as seen in Figure 5. This plot is a different representation of a change in epsilon from the one seen in Figures 1 and 2.

Results From Python
Figure 6. 1UAQ Implemented in vPython
From left to right: vPython places a boxel around each atom; the boxel is subdivided into 8 sectors; if other atoms are within those sectors, Python returns false and that atom is eliminated. Only outside atoms are left, with their corresponding boxels shown in yellow.
Figure 7. Python vs. Minitab
On the left is 1UAQ from Python. The curve is not plotted as percent of maximum radius vs percent outside, but it shows some discrepancies between Minitab and Python. Further refinement of the Python implementation should resolve this issue.
Figure 9. 1AV1 Python Implementation 3D scatter
On the left is 1AV1 in the next iteration of the algorithm, done in Python 2.7.5. Blue represents the maximum number of atoms found to be on the outside, while green represents interior atoms. 1649 out of 6588 atoms were found on the outside, which is 25% of the total, consistent with the asymptotic curves in Figure 4 and with the 3D scatter plot in Figure 5. On the right are just the outside atoms, represented in blue.

Conclusion
• The current algorithm is flexible and can handle proteins of many different shapes
• Proteins show similar asymptotic curves for the number of atoms identified as being on the protein surface
• Once implemented in Python, the algorithm runs faster and does not crash as it did in Minitab

Future Plans
• Refine the algorithm in Python
• Test the refined algorithm with the proteins used in Minitab and compare results
• Once the algorithm is refined, expand tests to different types of proteins according to class and size
• Incorporate the algorithm into the existing ProMol extension as another tool
• Publish!

Literature Cited
• Steinkellner, G.; Rader, R.; Thallinger, G. G.; Kratky, C.; Gruber, K. VASCo: computation and visualization of annotated protein surface contacts. BMC Bioinformatics 2009, 10, 32. http://www.biomedcentral.com/1471-2105/10/32 (accessed June 7, 2010).
• Liu, Y. S.; Fang, Y.; Ramani, K. IDSS: deformation invariant signatures for molecular shape comparison. BMC Bioinformatics 2009, 10, 157. http://www.biomedcentral.com/1471-2105/10/157 (accessed June 7, 2010).
• Hoffmann, B.; Zaslavskiy, M.; Vert, J. P.; Stoven, V. A new protein binding pocket similarity measure based on comparison of clouds of atoms in 3D: application to ligand prediction. BMC Bioinformatics 2010, 11, 99. http://www.biomedcentral.com/1471-2105/11/99 (accessed June 7, 2010).
• Bash, P. A.; Pattabiraman, N.; Huang, C.; Ferrin, T. E.; Langridge, R. Van der Waals Surfaces in Molecular Modeling: Implementation with Real-Time Computer Graphics. Science 1983, 222 (4630), 1325-1327. http://www.jstor.org/stable/1691658 (accessed June 8, 2010).
• Kuntz, I.; Blaney, J. M.; Oatley, S. J.; Langridge, R.; Ferrin, T. E. A geometric approach to macromolecule-ligand interactions. J. Mol. Biol. 1982, 161 (2), 269-288. doi:10.1016/0022-2836(82)90153-X (accessed June 8, 2010).
• Connolly, M. L. Solvent-Accessible Surfaces of Proteins and Nucleic Acids. Science 1983, 221 (4612), 709-713. http://www.jstor.org/pss/1691011 (accessed June 8, 2010).
• Bernstein, H. J.; Craig, P. A. Efficient molecular surface rendering by linear-time pseudo-Gaussian approximation to Lee-Richards surfaces (PGALRS). J. Appl. Crystallogr. 2010, 43 (2), 356-361.
• Artymiuk, P. J.; Poirrette, A. R.; Grindley, H. M.; Rice, D. W.; Willett, P. A Graph-theoretic Approach to the Identification of Three-dimensional Patterns of Amino Acid Side-chains in Protein Structures. J. Mol. Biol. 1994, 243 (2), 327-344. doi:10.1006/jmbi.1994.1657. http://www.sciencedirect.com/science/article/pii/S0022283684716573

Acknowledgments
• Scott "JT" Mengel
• Jon Schull
• Eulas Boyd and NSF-LSAMP
• The authors gratefully acknowledge the assistance of current and former students who have worked at Dowling College and at RIT on the SBEVSL project.
Funding: This work has been supported in part by National Science Foundation Division of Undergraduate Education grant 0402408 and National Institute of General Medical Sciences grants 2R15GM078077-02 and 3R15GM078077-02S1. The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies.
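
The sector test described in the abstract and in the Algorithm panel can be illustrated with a short Python sketch. This is not the authors' Minitab macro or their Python code; it is a minimal brute-force version written under several assumptions: the 8 sectors are taken to be the coordinate octants around the center atom, an atom lying exactly on a dividing plane is counted on the non-negative side, the initial water-accessibility filter is skipped, and only ATOM records are read, using the fixed coordinate columns of the PDB format rather than an awk script. The function names `read_pdb_coords`, `sector_index`, `classify_atoms`, and `count_outside` are hypothetical, and `math.dist` requires Python 3.8 or later rather than the Python 2.7.5 mentioned on the poster.

```python
from math import dist  # available in Python 3.8+


def read_pdb_coords(lines):
    """Collect (x, y, z) tuples from ATOM records.

    Uses the fixed PDB columns 31-38, 39-46, and 47-54 for x, y, and z.
    Assumption: HETATM and all other record types are ignored.
    """
    coords = []
    for line in lines:
        if line.startswith("ATOM"):
            coords.append(
                (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            )
    return coords


def sector_index(center, other):
    """Return 0..7, the octant around `center` that contains `other`.

    Assumption: the 8 sectors are coordinate octants; a zero offset
    counts as the non-negative side.
    """
    dx, dy, dz = (o - c for o, c in zip(other, center))
    return ((dx >= 0) << 2) | ((dy >= 0) << 1) | int(dz >= 0)


def classify_atoms(coords, epsilon):
    """Return one flag per atom: 1 = outside, 0 = inside.

    An atom is inside only if every one of the 8 sectors of its
    epsilon-sphere holds at least one other atom.
    """
    flags = []
    for i, center in enumerate(coords):
        occupied = set()
        for j, other in enumerate(coords):
            if i != j and dist(center, other) <= epsilon:
                occupied.add(sector_index(center, other))
        flags.append(0 if len(occupied) == 8 else 1)
    return flags


def count_outside(coords, epsilon):
    """Number of atoms flagged as outside for a given epsilon."""
    return sum(classify_atoms(coords, epsilon))


if __name__ == "__main__":
    # Toy structure: one atom at the origin plus the 8 corners of a cube.
    cube = [(0.0, 0.0, 0.0)] + [
        (x, y, z) for x in (-1.0, 1.0) for y in (-1.0, 1.0) for z in (-1.0, 1.0)
    ]
    for eps in (1.0, 2.0, 10.0):
        print(eps, count_outside(cube, eps))  # 9, then 8, then 8 outside atoms
```

In this toy example the outside count stops changing once epsilon is larger than every interatomic distance, which is the kind of limiting behavior Table 1 reports for 1UAQ. The double loop compares every pair of atoms, so the run time grows quadratically with the number of atoms; a spatial grid or k-d tree would be one way to reduce that, although the poster does not say which approach the authors took.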
