
A Probabilistic Approach to Protein Backbone Tracing in Electron Density Maps

Frank DiMaio, Jude Shavlik (Computer Sciences Department) and George Phillips (Biochemistry Department), University of Wisconsin – Madison, USA.


Presentation Transcript


  1. A Probabilistic Approach to Protein Backbone Tracing in Electron Density Maps • Frank DiMaio, Jude Shavlik (Computer Sciences Department) • George Phillips (Biochemistry Department) • University of Wisconsin – Madison, USA • Presented at the Fourteenth Conference on Intelligent Systems for Molecular Biology (ISMB 2006), Fortaleza, Brazil, August 7, 2006

  2. X-ray Crystallography • [Diagram: X-ray beam → protein crystal → collection plate → FFT → electron density map (“3D picture”)]

  3. Given: Sequence + Electron Density Map

  4. Find: Each Atom’s Coordinates

  5. Our Subtask: Backbone Trace • [Figure: chain of Cα atoms]

  6. The Unit Cell • 3D density function ρ(x,y,z) provided over unit cell • Unit cell may contain multiple copies of the protein

  8. Density Map Resolution • [Figure: example density at 2Å, 3Å, and 4Å resolution, annotated with ARP/wARP (Perrakis et al. 1997), TEXTAL (Ioerger et al. 1999), Resolve (Terwilliger 2002), and “Our focus”]

  9. Overview of ACMI (our method) • Local Match: algorithm searches for sequence-specific 5-mers centered at each amino acid; many false positives • Global Consistency: use probabilistic model to filter false positives; find most probable backbone trace

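  10. 5-mer Lookup and Cluster • [Figure: a 5-mer from the sequence …VKHVLVSPEKIEELIKGY… is looked up in the PDB and the matches are grouped into Cluster 1 and Cluster 2, with weights 0.67 and 0.33] • NOTE: can be done in precompute step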

  11. 5-mer Search • 6D search (rotation + translation) for representative structures in density map • Compute “similarity” by Fourier convolution (Cowtan 2001) • Use tuneset to convert similarity score to probability

  12. Convert Scores to Probabilities • [Flow diagram: 5-mer representative → search density map → scores ti(ui) → match to tuneset (POS and NEG score distributions) → Bayes’ rule → probability distribution over unit cell, P(5-mer at ui | Map)]
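
The score-to-probability step on this slide can be sketched as follows. This is a minimal illustration, not the ACMI implementation: the histogram representation of the POS and NEG score distributions, the add-one smoothing, the bin edges, and the names `histogram_density`, `score_to_probability`, and `prior_pos` are all assumptions made for the example. Only the use of a tuneset and Bayes' rule comes from the slides.

```python
from bisect import bisect_right

def _bin_index(value, edges, n_bins):
    """Map a score to a histogram bin, clamping out-of-range scores."""
    return min(max(bisect_right(edges, value) - 1, 0), n_bins - 1)

def histogram_density(samples, edges):
    """Per-bin probabilities of tuneset scores (add-one smoothing is an assumption)."""
    counts = [1.0] * (len(edges) - 1)  # start at 1 so no bin has zero likelihood
    for s in samples:
        counts[_bin_index(s, edges, len(counts))] += 1.0
    total = sum(counts)
    return [c / total for c in counts]

def score_to_probability(score, pos_hist, neg_hist, edges, prior_pos):
    """Bayes' rule: P(match | score) from P(score | POS), P(score | NEG), and a prior."""
    b = _bin_index(score, edges, len(pos_hist))
    numerator = pos_hist[b] * prior_pos
    denominator = numerator + neg_hist[b] * (1.0 - prior_pos)
    return numerator / denominator

# Hypothetical tuneset scores: true matches (POS) tend to score higher than NEG.
edges = [0.0, 0.25, 0.5, 0.75, 1.0]
pos_hist = histogram_density([0.8, 0.9, 0.85, 0.7], edges)
neg_hist = histogram_density([0.1, 0.2, 0.3, 0.15, 0.6], edges)
p_match = score_to_probability(0.9, pos_hist, neg_hist, edges, prior_pos=0.01)
```

Applying `score_to_probability` to the score at every location of the unit cell would give the per-amino-acid probability distribution that the next slides take as input.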

  13. In This Talk… • Where we are now: for each amino acid in the protein, we have a probability distribution over the unit cell • Where we are headed: find the most probable backbone layout given the density map

  14. Pairwise Markov Field Models • A type of undirected graphical model • Represent joint probabilities as product of vertex and edge potentials • Similar to (but more general than) Bayesian networks • [Figure: graph with evidence y and labels u1, u2, u3]
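
To make "product of vertex and edge potentials" concrete, here is a small sketch on a toy model with discrete labels. The three-vertex graph, the two-label state space, the potential values, and the function names are all hypothetical. Normalizing by brute-force enumeration only works for tiny examples like this one.

```python
from itertools import product

def unnormalized_joint(labels, vertex_pot, edge_pot):
    """Product of one vertex potential per vertex and one edge potential per edge."""
    p = 1.0
    for i, u in enumerate(labels):
        p *= vertex_pot[i][u]
    for (i, j), table in edge_pot.items():
        p *= table[labels[i]][labels[j]]
    return p

def joint_distribution(n_labels, vertex_pot, edge_pot):
    """Normalize by enumerating every labeling (exponential cost; toy sizes only)."""
    n = len(vertex_pot)
    scores = {
        lab: unnormalized_joint(lab, vertex_pot, edge_pot)
        for lab in product(range(n_labels), repeat=n)
    }
    z = sum(scores.values())  # partition function
    return {lab: s / z for lab, s in scores.items()}

# Toy model: three vertices u1, u2, u3 in a chain, two possible labels each.
vertex_pot = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
prefer_same = [[1.0, 0.1], [0.1, 1.0]]  # edge potential favoring equal labels
edge_pot = {(0, 1): prefer_same, (1, 2): prefer_same}
dist = joint_distribution(2, vertex_pot, edge_pot)
best_labeling = max(dist, key=dist.get)
```

In the backbone model on the following slides, each label is a location plus orientation in the unit cell, so the state space is far too large for enumeration.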

  15. Protein Backbone Model • Each vertex is an amino acid • Each label is location + orientation • Evidence y is the electron density map • Each vertex (or observational) potential comes from the 5-mer matching • [Figure: chain of vertices ALA, GLY, LYS, LEU]

  16. Protein Backbone Model • [Figure: chain of vertices ALA, GLY, LYS, LEU] • Two types of edge (or structural) potentials • Adjacency constraints ensure adjacent amino acids are ~3.8Å apart and in the proper orientation

  17. Protein Backbone Model • [Figure: chain of vertices ALA, GLY, LYS, LEU] • Two types of structural (edge) potentials • Adjacency constraints ensure adjacent amino acids are ~3.8Å apart and in the proper orientation • Occupancy constraints ensure nonadjacent amino acids do not occupy same 3D space
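
The two structural potentials can be sketched roughly as below. The functional forms are assumptions chosen for illustration: a Gaussian around the 3.8Å spacing with a made-up width `sigma`, and a hard step with a made-up `min_separation`. The orientation part of the adjacency constraint is omitted entirely. The transcript does not preserve the actual potential definitions.

```python
import math

CA_CA_DISTANCE = 3.8  # ideal spacing (in Å) between adjacent amino acids, from the slide

def adjacency_potential(pos_a, pos_b, sigma=0.3):
    """Favor adjacent residues about 3.8 Å apart.

    Gaussian shape and sigma are assumptions; the orientation term is not modeled.
    """
    d = math.dist(pos_a, pos_b)
    return math.exp(-0.5 * ((d - CA_CA_DISTANCE) / sigma) ** 2)

def occupancy_potential(pos_a, pos_b, min_separation=3.0):
    """Forbid nonadjacent residues from occupying the same space.

    The hard step and the 3.0 Å threshold are assumptions.
    """
    return 0.0 if math.dist(pos_a, pos_b) < min_separation else 1.0

ideal = adjacency_potential((0.0, 0.0, 0.0), (3.8, 0.0, 0.0))      # near the maximum
stretched = adjacency_potential((0.0, 0.0, 0.0), (6.0, 0.0, 0.0))  # strongly penalized
clash = occupancy_potential((0.0, 0.0, 0.0), (1.0, 0.0, 0.0))      # overlapping residues
```

Both functions here depend only on the separation between the two positions. That property is what later allows occupancy messages to be computed as convolutions.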

  18. Backbone Model Potential • Constraints between adjacent amino acids: adjacency potential = (distance term) × (orientation term)

  19. Backbone Model Potential Constraints between nonadjacent amino acids:

  20. Backbone Model Potential Observational (“amino-acid-finder”) probabilities

  21. Probabilistic Inference • Want to find the most probable backbone layout given the density map • Exact methods are intractable • Use belief propagation (BP) to approximate marginal distributions

  22. Belief Propagation (BP) • Iterative, message-passing method (Pearl 1988) • A message from amino acid i to amino acid j indicates where i expects to find j • An approximation to the marginal (or belief) is given as the product of incoming messages
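
A minimal sketch of sum-product message passing, restricted to a chain with discrete labels. The full backbone model also has occupancy edges between nonadjacent residues, which make the graph loopy and BP approximate; those are not modeled here. The toy potentials and the name `chain_beliefs` are hypothetical. As in the standard sum-product formulation, each belief multiplies the vertex potential by the incoming messages.

```python
def normalize(v):
    """Scale a list of nonnegative values so they sum to 1."""
    s = sum(v)
    return [x / s for x in v]

def chain_beliefs(vertex_pot, edge_pot):
    """Sum-product BP on a chain.

    vertex_pot[i][a]  : potential of vertex i taking label a
    edge_pot[i][a][b] : potential of vertex i = a together with vertex i+1 = b
    """
    n, k = len(vertex_pot), len(vertex_pot[0])
    fwd = [[1.0] * k for _ in range(n)]  # fwd[i]: message arriving at i from i-1
    bwd = [[1.0] * k for _ in range(n)]  # bwd[i]: message arriving at i from i+1
    for i in range(1, n):
        sender = [vertex_pot[i - 1][a] * fwd[i - 1][a] for a in range(k)]
        fwd[i] = normalize(
            [sum(sender[a] * edge_pot[i - 1][a][b] for a in range(k)) for b in range(k)]
        )
    for i in range(n - 2, -1, -1):
        sender = [vertex_pot[i + 1][b] * bwd[i + 1][b] for b in range(k)]
        bwd[i] = normalize(
            [sum(edge_pot[i][a][b] * sender[b] for b in range(k)) for a in range(k)]
        )
    # Belief = vertex potential times all incoming messages, renormalized.
    return [
        normalize([vertex_pot[i][a] * fwd[i][a] * bwd[i][a] for a in range(k)])
        for i in range(n)
    ]

prefer_same = [[1.0, 0.1], [0.1, 1.0]]
beliefs = chain_beliefs([[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]], [prefer_same, prefer_same])
```

On a chain a single forward and backward sweep gives exact marginals. On a loopy graph the same updates are repeated until the messages stop changing.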

  23. Belief Propagation Example • [Figure: ALA and GLY]

  24. Technical Challenges • Representation of potentials: store Fourier coefficients in Cartesian space; at each location x, store a single orientation r • Speeding up O(N²X²) naïve implementation, where X = the unit cell size (# Fourier coefficients) and N = the number of residues in the protein

  25. Speeding Up O(N²X²) Implementation • O(X²) computation for each occupancy message: each message must integrate over the unit cell; O(X log X) as multiplication in Fourier space • O(N²) messages computed & stored: approximate N−3 occupancy messages with a single message; O(N) messages using a message product accumulator • Improved implementation: O(NX log X)
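
The O(X²) to O(X log X) step relies on the convolution theorem: when a potential depends only on the offset between two locations, integrating it against a belief over a periodic unit cell is a circular convolution, which becomes a pointwise product in Fourier space. The sketch below shows this on a 1D grid with a small textbook radix-2 FFT written for the example. The 1D grid, the power-of-two length requirement, and all names and values are assumptions; the real computation is over a 3D unit cell.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two. The inverse is unscaled."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def circular_convolve_fft(f, g):
    """O(X log X): multiply the transforms pointwise, then transform back."""
    n = len(f)
    spectrum = [a * b for a, b in zip(fft(f), fft(g))]
    return [v.real / n for v in fft(spectrum, inverse=True)]

def circular_convolve_direct(f, g):
    """O(X^2) reference: sum over every source location for every target location."""
    n = len(f)
    return [sum(f[m] * g[(k - m) % n] for m in range(n)) for k in range(n)]

belief = [0.0, 1.0, 2.0, 0.0, 0.0, 3.0, 0.0, 1.0]  # hypothetical belief on 8 grid points
kernel = [1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5]  # hypothetical offset-only potential
message_fast = circular_convolve_fft(belief, kernel)
message_slow = circular_convolve_direct(belief, kernel)
```

The two results agree up to floating-point error. Only the FFT version scales to unit cells with a large number of grid points.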

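  26. 1XMT at 3Å Resolution • 1.12Å RMSd, 100% coverage • [Figure colored by prob(AA at location); color scale from HIGH to LOW with labels 0.82 and 0.17]

  27. 1VMO at 4Å Resolution • 3.63Å RMSd, 72% coverage • [Figure colored by prob(AA at location); color scale from HIGH to LOW with labels 0.25 and 0.02]

  28. 1YDH at 3.5Å Resolution • 1.47Å RMSd, 90% coverage • [Figure colored by prob(AA at location); color scale from HIGH to LOW with labels 0.27 and 0.02]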

  29. Experiments • Tested ACMI against other map interpretation algorithms: TEXTAL and Resolve • Used ten model-phased maps • Smoothly diminished reflection intensities, yielding 2.5, 3.0, 3.5, and 4.0Å resolution maps

  30. RMS Deviation • [Plot: Cα RMS deviation vs. density map resolution for ACMI, TEXTAL, and Resolve]

  31. Model Completeness • [Plots: % chain traced and % residues identified vs. density map resolution for ACMI, TEXTAL, and Resolve]

  32. Per-protein RMS Deviation • [Plots comparing ACMI RMS error with TEXTAL RMS error and with Resolve RMS error]

  33. Conclusions • ACMI effectively combines weakly-matching templates to construct a full model • Produces an accurate trace even with poor-quality density map data • Reduces computational complexity from O(N²X²) to O(NX log X) • Inference possible for even large unit cells

  34. Future Work • Improve “amino-acid-finding” algorithm • Incorporate sidechain placement / refinement • Manage missing data: disordered regions; only exterior visible (e.g., in CryoEM)

  35. Acknowledgements • Ameet Soni • Craig Bingman • NLM grants 1R01 LM008796 and 1T15 LM007359
