
Probabilistic Methods for Interpreting Electron-Density Maps

Frank DiMaio, University of Wisconsin – Madison, Computer Sciences Department (dimaio@cs.wisc.edu)


Presentation Transcript


  1. Probabilistic Methods for Interpreting Electron-Density Maps • Frank DiMaio • University of Wisconsin – Madison, Computer Sciences Department • dimaio@cs.wisc.edu

  2. 3D Protein Structure [Figure labels: backbone, sidechain, C-alpha]

  3. 3D Protein Structure [Figure labels: … ALA LEU PRO VAL ARG …, ? ? ?]

  4. High-Throughput Structure Determination • Protein-structure determination is important: understanding the function of a protein; understanding mechanisms; targets for drug design • Some proteins produce poor density maps • Interpreting poor electron-density maps is very labor-intensive for humans • I aim to automatically interpret poor-quality electron-density maps

  5. Electron-Density Map Interpretation • GIVEN: 3D electron-density map and (linear) amino-acid sequence

  6. Electron-Density Map Interpretation • FIND: All-atom protein model

  7. Density Map Resolution [Figure: resolution scale at 1.0 Å, 2.0 Å, 3.0 Å, 4.0 Å, annotated with Ioerger et al. (2002), Terwilliger (2003), Morris et al. (2003), and “My focus”]

  8. Thesis Contributions • A probabilistic approach to protein-backbone tracing (DiMaio et al., Intelligent Systems for Molecular Biology, 2006) • Improved template matching in electron-density maps (DiMaio et al., IEEE Conference on Bioinformatics and Biomedicine, 2007) • Creating all-atom protein models using particle filtering (DiMaio et al., under review) • Pictorial structures for atom-level molecular modeling (DiMaio et al., Advances in Neural Information Processing Systems, 2004) • Improving the efficiency of belief propagation (DiMaio and Shavlik, IEEE International Conference on Data Mining, 2006) • Iterative phase improvement in ACMI

  9. ACMI Overview • Phase 1: Local pentapeptide search (ISMB 2006, BIBM 2007): independent amino-acid search; templates model 5-mer conformational space • Phase 2: Coarse backbone model (ISMB 2006, ICDM 2006): protein structural constraints refine local search; a Markov random field (MRF) models pairwise constraints • Phase 3: Sample all-atom models: particle filtering samples high-probability structures; probabilities from the MRF guide particle trajectories


  11. 5-mer Lookup • Sequence: …SAWCVKFEKPADKNGKTE… • ACMI searches the map for each template independently • Spherical-harmonic decomposition allows rapid search of all template rotations [Figure label: Protein DB]

  12. Spherical-Harmonic Decomposition • f(θ, φ)

  13. 5-mer Fast Rotation Search • Electron-density map → sampled region of density in a 5 Å sphere → map region sampled in spherical shells • Pentapeptide fragment from PDB (the “template”) → calculated (expected) density in a 5 Å sphere → template density sampled in spherical shells

  14. 5-mer Fast Rotation Search • Map region sampled in spherical shells → map-region spherical-harmonic coefficients • Template density sampled in spherical shells → template spherical-harmonic coefficients • Fast-rotation function (Navaza 2006, Risbo 1996) → correlation coefficient as a function of rotation

  15. Convert Scores to Probabilities • Scan density map for fragment → correlation coefficients over density map, t_i(u_i) • Bayes’ rule → probability distribution over density map, P(5-mer at u_i | EDM)
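The conversion on slide 15 can be illustrated with a small sketch. The slide only says that Bayes' rule turns correlation scores into probabilities; the Gaussian score models, the prior, and every function and parameter name below are assumptions made for illustration, not ACMI's actual procedure.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def score_to_probability(score, prior, match_mean, match_std, bg_mean, bg_std):
    """Posterior P(fragment here | score) by Bayes' rule.

    Assumes (hypothetically) that scores at true locations and at
    background locations are each normally distributed.
    """
    like_match = gaussian_pdf(score, match_mean, match_std)
    like_bg = gaussian_pdf(score, bg_mean, bg_std)
    evidence = like_match * prior + like_bg * (1.0 - prior)
    return like_match * prior / evidence

# Hypothetical correlation scores at three map locations.
scores = [0.05, 0.30, 0.62]
probs = [
    score_to_probability(s, prior=0.01, match_mean=0.6, match_std=0.1,
                         bg_mean=0.1, bg_std=0.1)
    for s in scores
]
```

With these made-up parameters, a higher correlation score yields a higher posterior, and a score equally likely under both models leaves the prior unchanged.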

  16. ACMI Overview • Phase 1: Local pentapeptide search (ISMB 2006, BIBM 2007): independent amino-acid search; templates model 5-mer conformational space • Phase 2: Coarse backbone model (ISMB 2006, ICDM 2006): protein structural constraints refine local search; a Markov random field (MRF) models pairwise constraints • Phase 3: Sample all-atom models: particle filtering samples high-probability structures; probabilities from the MRF guide particle trajectories

  17. Probabilistic Backbone Model • A trace assigns a position and orientation u_i = {x_i, q_i} to each amino acid i • The probability of a trace U = {u_i} is [equation not preserved in transcript] • This full joint probability is intractable to compute • Approximate it using a pairwise Markov random field

  18. Pairwise Markov-Field Model • Joint probabilities are defined on a graph as a product of vertex and edge potentials [Figure: graph over ALA, GLY, LYS, LEU, SER]

  19. ACMI’s Backbone Model • Observational potentials tie the map to the model [Figure: graph over ALA, GLY, LYS, LEU, SER]

  20. ACMI’s Backbone Model • Adjacency constraints ensure adjacent amino acids are ~3.8 Å apart and in proper orientation • Occupancy constraints ensure nonadjacent amino acids do not occupy the same 3D space [Figure: graph over ALA, GLY, LYS, LEU, SER]
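The model on slides 18 to 20 can be sketched as an unnormalized score built from vertex and edge potentials. Only the ~3.8 Å spacing comes from the slides; the Gaussian form, its width, the 3.0 Å overlap cutoff, and all names are assumptions for illustration, and orientation is ignored for brevity.

```python
import math

CA_SPACING = 3.8  # approximate C-alpha spacing in angstroms, from slide 20

def adjacency_potential(xi, xj, sigma=0.5):
    """High when adjacent residues are about 3.8 A apart (assumed Gaussian)."""
    d = math.dist(xi, xj)
    return math.exp(-0.5 * ((d - CA_SPACING) / sigma) ** 2)

def occupancy_potential(xi, xj, min_sep=3.0):
    """Zero when nonadjacent residues overlap (assumed hard cutoff)."""
    return 0.0 if math.dist(xi, xj) < min_sep else 1.0

def trace_score(positions, observation):
    """Unnormalized product of vertex and edge potentials for one trace."""
    score = 1.0
    for obs in observation:            # vertex (observational) potentials
        score *= obs
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):      # edge potentials over all pairs
            if j == i + 1:
                score *= adjacency_potential(positions[i], positions[j])
            else:
                score *= occupancy_potential(positions[i], positions[j])
    return score

# A straight three-residue chain and a chain folded back onto itself.
straight = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
collapsed = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (0.5, 0.0, 0.0)]
obs = [0.5, 0.5, 0.5]  # hypothetical template-matching probabilities
straight_score = trace_score(straight, obs)
collapsed_score = trace_score(collapsed, obs)
```

The collapsed chain scores zero because its first and third residues violate the occupancy constraint.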

  21. Backbone Model Potential

  22. Backbone Model Potential • Constraints between adjacent amino acids

  23. Backbone Model Potential • Constraints between all other amino-acid pairs

  24. Backbone Model Potential • Observational (“template-matching”) probabilities
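The equations on slides 21 to 24 did not survive in the transcript. One form consistent with the slide captions, a product of observational, adjacency, and occupancy terms, is sketched below. The symbols ψ_i, ψ_adj, and ψ_occ are labels chosen here and are not notation confirmed by the slides.

```latex
P(U \mid \mathrm{EDM}) \;\propto\;
  \prod_{i=1}^{N} \psi_i(u_i)
  \;\times\; \prod_{i=1}^{N-1} \psi_{\mathrm{adj}}(u_i, u_{i+1})
  \;\times\; \prod_{\substack{i<j \\ j-i>1}} \psi_{\mathrm{occ}}(u_i, u_j)
```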

  25. Inferring Backbone Locations • Want to find the backbone layout that maximizes the probability of the trace U

  26. Inferring Backbone Locations • Want to find the backbone layout that maximizes the probability of the trace U • Exact methods are intractable • Use belief propagation (Pearl 1988) to approximate marginal distributions

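  27. Belief Propagation Example • Nodes LYS31 and LEU32 with approximate marginals p̂_LYS31 and p̂_LEU32 • Message m_LYS31→LEU32

  28. Belief Propagation Example • Nodes LYS31 and LEU32 with approximate marginals p̂_LYS31 and p̂_LEU32 • Message m_LEU32→LYS31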
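The two-node exchange on slides 27 and 28 can be sketched with the standard sum-product update on a tiny discrete grid. The three-point grid, the potential values, and the function names are made up for illustration; only the node names come from the slides.

```python
def normalize(values):
    """Scale a list of nonnegative numbers to sum to 1."""
    total = sum(values)
    return [v / total for v in values]

def message(obs_src, edge, incoming_others):
    """Sum-product message from a source node to a destination node.

    edge[a][b] is the edge potential for source state a and destination
    state b; incoming_others is the product of messages into the source
    from every neighbour except the destination.
    """
    pre = [o * m for o, m in zip(obs_src, incoming_others)]
    n_dst = len(edge[0])
    return normalize(
        [sum(pre[a] * edge[a][b] for a in range(len(pre))) for b in range(n_dst)]
    )

# Hypothetical observational potentials over three candidate locations.
obs_lys31 = [0.7, 0.2, 0.1]
obs_leu32 = [0.3, 0.3, 0.4]
# Hypothetical edge potential, indexed [LYS31 state][LEU32 state].
edge = [[0.0, 1.0, 0.2],
        [1.0, 0.0, 1.0],
        [0.2, 1.0, 0.0]]
edge_t = [list(row) for row in zip(*edge)]  # indexed [LEU32][LYS31]
no_other_msgs = [1.0, 1.0, 1.0]             # two nodes: nothing else incoming

m_lys31_to_leu32 = message(obs_lys31, edge, no_other_msgs)
m_leu32_to_lys31 = message(obs_leu32, edge_t, no_other_msgs)

# Approximate marginals: local evidence times incoming message.
belief_lys31 = normalize([o * m for o, m in zip(obs_lys31, m_leu32_to_lys31)])
belief_leu32 = normalize([o * m for o, m in zip(obs_leu32, m_lys31_to_leu32)])
```

On a graph this small, belief propagation is exact, so the beliefs match the marginals of the brute-force joint distribution. On the loopy graph of a full protein they are only approximations.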

  29. Scaling BP to Proteins (DiMaio and Shavlik, ICDM 2006) • Naïve implementation: O(N²G²), where N = number of amino acids in the protein and G = number of points in the discretized density map • O(G²) computation for each message passed, reduced to O(G log G) as a Fourier-space multiplication • O(N²) messages computed and stored: approximating the (N−3) occupancy messages with 1 message gives O(N) messages using a message accumulator • Improved implementation: O(NG log G)
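The O(G²) to O(G log G) step can be sketched as follows, assuming the edge potential depends only on the displacement between two grid points, so that a message is a convolution. The 1-D periodic grid, the radix-2 FFT helper, and all names are simplifications chosen for illustration and are not ACMI's implementation.

```python
import cmath

def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalized); length must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def message_naive(belief, kernel):
    """O(G^2): sum over every source point for every destination point."""
    g = len(belief)
    return [sum(belief[i] * kernel[(j - i) % g] for i in range(g)) for j in range(g)]

def message_fft(belief, kernel):
    """O(G log G): circular convolution as a product in Fourier space."""
    n = len(belief)
    spectrum = [a * b for a, b in zip(fft(belief), fft(kernel))]
    return [(v / n).real for v in fft(spectrum, inverse=True)]

# Hypothetical belief over 8 grid points and a displacement-only kernel.
belief = [0.1, 0.4, 0.2, 0.05, 0.05, 0.1, 0.05, 0.05]
kernel = [0.0, 1.0, 0.2, 0.0, 0.0, 0.0, 0.2, 1.0]
slow = message_naive(belief, kernel)
fast = message_fft(belief, kernel)
```

Both functions compute the same message; only the cost differs as the grid grows.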


  31. Occupancy Message Approximation • To pass a message from i to j, combine the occupancy edge potential with the product of incoming messages to i except from j

  32. Occupancy Message Approximation • “Weak” potentials between nonadjacent amino acids let us approximate the message using the occupancy edge potential and the product of all incoming messages to i

  33. Occupancy Message Approximation [Figure: nodes 1 to 6]

  35. Occupancy Message Approximation • Send outgoing occupancy message product to a central accumulator (ACC) [Figure: nodes 1 to 6 and ACC]

  36. Occupancy Message Approximation • Then, each node’s incoming message product is computed in constant time [Figure: nodes 1 to 6 and ACC]
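The accumulator idea can be sketched in log space. Each node contributes one outgoing occupancy message to a running sum, and a node's incoming product is then recovered by subtracting a fixed number of contributions. Excluding the node itself and its two chain neighbours, the message values, and the function names are assumptions for illustration.

```python
import math

def accumulate(log_msgs):
    """Sum every node's log occupancy message at each grid point."""
    grid = len(log_msgs[0])
    return [sum(m[g] for m in log_msgs) for g in range(grid)]

def incoming_log_product(acc, log_msgs, i):
    """Log product of occupancy messages into node i.

    Subtracts the contributions of node i and its chain neighbours
    (which interact through adjacency, not occupancy), so the work per
    grid point does not depend on the number of nodes.
    """
    excluded = [k for k in (i - 1, i, i + 1) if 0 <= k < len(log_msgs)]
    return [acc[g] - sum(log_msgs[k][g] for k in excluded) for g in range(len(acc))]

# Hypothetical outgoing occupancy messages for 6 nodes on a 3-point grid.
msgs = [[0.9, 0.5, 0.8], [0.7, 0.6, 0.9], [0.8, 0.9, 0.4],
        [0.6, 0.7, 0.7], [0.9, 0.8, 0.5], [0.5, 0.9, 0.9]]
log_msgs = [[math.log(v) for v in m] for m in msgs]
acc = accumulate(log_msgs)
incoming_node2 = [math.exp(v) for v in incoming_log_product(acc, log_msgs, 2)]
```

Working in log space turns the division by excluded messages into a subtraction and avoids underflow when many small values are multiplied.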

  37. BP Output • After some number of iterations, BP gives probability distributions over Cα locations [Figure labels: … ARG LEU PRO ALA VAL …]

  38. ACMI’s Backbone Trace • Independently choose the Cα locations that maximize each approximate marginal distribution

  39. Example: 1XRI • 3.3 Å resolution density map • 39° mean phase error • 0.9009 Å RMSd • 93% complete [Figure legend: prob(AA at location), from HIGH (0.9) to LOW (0.1)]

  40. Testset Density Maps (raw data) [Figure: x-axis, density-map resolution (Å), 1.0 to 4.0; y-axis, density-map mean phase error (deg.), 15 to 75]

  41. Experimental Accuracy [Figure: y-axis, % Cα’s located within 2 Å of some Cα / correct Cα, 0 to 100; series, % backbone correctly placed and % amino acids correctly identified; methods, ACMI, ARP/wARP, Resolve, Textal]

  42. Experimental Accuracy on a Per-Protein Basis [Figure: three panels plotting ACMI % Cα’s located (0 to 100) against ARP/wARP, Resolve, and Textal % Cα’s located (0 to 100)]

  43. ACMI Overview • Phase 1: Local pentapeptide search (ISMB 2006, BIBM 2007): independent amino-acid search; templates model 5-mer conformational space • Phase 2: Coarse backbone model (ISMB 2006, ICDM 2006): protein structural constraints refine local search; a Markov random field (MRF) models pairwise constraints • Phase 3: Sample all-atom models: particle filtering samples high-probability structures; probabilities from the MRF guide particle trajectories

  44. Problems with ACMI • Biologists want the location of all atoms • All Cα’s lie on a discrete grid • Maximum-marginal backbone model may be physically unrealistic • Ignoring a lot of information • Multiple models may better represent conformational variation within the crystal [Figure labels: Probability=0.4, Probability=0.35, Probability=0.25, maximum-marginal structure]

  45. ACMI with Particle Filtering (ACMI-PF) • Idea: Represent the protein using a set of static 3D all-atom protein models

  46. Particle Filtering Overview (Doucet et al. 2000) • Given some Markov process x_{1:K} ∈ X with observations y_{1:K} ∈ Y • Particle filtering approximates some posterior probability distribution over X using a set of N weighted point estimates

  47. Particle Filtering Overview • Markov process gives a recursive formulation • Use importance function q(x_k | x_{0:k-1}, y_k) to grow particles • Recursive weight update
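The weight-update equation on slide 47 was lost in the transcript. In the standard sequential importance sampling formulation, each particle's new weight is proportional to its old weight times p(y_k | x_k) p(x_k | x_{k-1}) / q(x_k | x_{0:k-1}, y_k). The sketch below applies that textbook update to precomputed density values; the function names and numbers are illustrative, not taken from ACMI-PF.

```python
def update_weights(prev_weights, likelihoods, transitions, proposals):
    """One recursive importance-weight update, normalized to sum to 1.

    For each particle: w_k is proportional to
    w_{k-1} * p(y_k | x_k) * p(x_k | x_{k-1}) / q(x_k | x_{0:k-1}, y_k).
    """
    raw = [
        w * like * trans / prop
        for w, like, trans, prop in zip(prev_weights, likelihoods, transitions, proposals)
    ]
    total = sum(raw)
    return [r / total for r in raw]

def effective_sample_size(weights):
    """Common degeneracy diagnostic: 1 / sum of squared normalized weights."""
    return 1.0 / sum(w * w for w in weights)

# Two particles; the proposal equals the transition model here, so the
# update reduces to reweighting by the observation likelihood.
prev = [0.5, 0.5]
likelihoods = [0.2, 0.6]
transitions = [0.3, 0.1]
proposals = [0.3, 0.1]
weights = update_weights(prev, likelihoods, transitions, proposals)
ess = effective_sample_size(weights)
```

A low effective sample size is the usual cue to resample particles before continuing.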

  48. Particle Filtering for Protein Structures • A particle refers to one specific 3D layout of some subsequence of the protein • At each iteration, advance each particle’s trajectory by placing an additional amino acid’s atoms

  49. Particle Filtering for Protein Structures • Alternate extending chain left and right

  50. Particle Filtering for Protein Structures • Alternate extending chain left and right • An iteration alternately places: the Cα position b_{k+1} given b_k; all sidechain atoms s_k given b_{k-1:k+1} [Figure labels: b_{k-1}, b_k, b_{k+1}, s_k]
