Automating Steps in Protein Structure Determination by NMR



  1. Automating Steps in Protein Structure Determination by NMR. CS 296.4, April 13, 2009

  2. Outline
     Background
       Steps in NMR protein structure determination
       The ACE cycle (Assign-Calculate-Evaluate)
       The assignment problem
     Algorithms for automated NOE assignment
       Semi-automated methods
       More-automated methods
     Conclusions

  3. The Steps in Protein Structure Determination by NMR
     1. Sample preparation
     2. Data collection
     3. Data evaluation
     4. Structure calculation
     5. Structure refinement
     6. Structure deposition

  4. The Steps in Protein Structure Determination by NMR
     1. Sample preparation: (a) protein selection, (b) gene engineering, (c) protein expression, (d) protein purification, (e) buffer optimization, (f) isotope labeling
     2. Data collection
     3. Data evaluation
     4. Structure calculation
     5. Structure refinement
     6. Structure deposition (and maybe write a paper and graduate)

  5. The Steps in Protein Structure Determination by NMR
     1. Sample preparation: (a) protein selection, (b) gene engineering, (c) protein expression, (d) protein purification, (e) buffer optimization, (f) isotope labeling
     2. Data collection: (a) HSQC, (b) amide H/D exchange, (c) triple-resonance
     3. Data evaluation
     4. Structure calculation
     5. Structure refinement

  6. The Steps in Protein Structure Determination by NMR
     1. Sample preparation: (a) protein selection, (b) gene engineering, (c) protein expression, (d) protein purification, (e) buffer optimization, (f) isotope labeling
     2. Data collection: (a) HSQC, (b) amide H/D exchange, (c) triple-resonance
     3. Data evaluation: (a) spectrum calculation, (b) peak picking

  7. Automatable Steps in Protein Structure Determination by NMR
     1. Sample preparation
     2. Data collection
     3. Data evaluation
     4. Structure calculation
     5. Structure refinement
     6. Structure deposition

  8. The Assign Calculate Evaluate cycle in automated NOE assignment and structure calculation. Fig. 2 (2003) Progress in NMR Spectroscopy, 43, 105, Guntert.

  9. Automating NOE Assignments and THE Assignment Problem

  10. Automating NOE Assignments and THE Assignment Problem
      There are MANY assignment tasks:
        1. Resonance Assignment
        2. NOE Assignment

  11. Automating NOE Assignments and THE Assignment Problem
      There are MANY assignment tasks:
        1. Resonance Assignment (interpreting data)
        2. NOE Assignment (interpreting data)

  12. Automating NOE Assignments and THE Assignment Problem
      There are MANY assignment tasks:
        1. Resonance Assignment
        2. NOE Assignment
      and one major assignment problem: ambiguous assignments,
      due to the data collection problems of
        1. Completeness
        2. Uniqueness

  13. Automating NOE Assignments and THE Assignment Problem
      There are MANY assignment tasks:
        1. Resonance Assignment
        2. NOE Assignment
      and one major assignment problem: ambiguous assignments,
      due to the data collection problems of
        1. Completeness (missing data points)
        2. Uniqueness (unresolvable data points)

  14. Unambiguously assigning a NOESY cross peak from Fig. 3 (2003) Progress in NMR Spectroscopy, 43, 105, Guntert.

  15. Algorithms for automated NOESY assignment
      Semi-automated methods
        1. ASNO: ASsign NOEs (1993)
        2. SANE: Structure Assisted NOE Evaluation (2001)
      Source: Automated NMR Protein structure calculation, Peter Guntert (2003) Progress in NMR Spectroscopy, 43, 105-125

  16. Algorithms for automated NOESY assignment
      Semi-automated methods
        1. ASNO: ASsign NOEs (1993)
        2. SANE: Structure Assisted NOE Evaluation (2001)
      More-automated methods
        1. NOAH (1995)
        2. ARIA: Ambiguous Restraints Iterative Assignments (1997)
        3. AutoStructure (1999)
        4. KNOWNOE: KNOWledge-based NOE assignments (2002)
        5. CANDID (2002)
      Source: Automated NMR Protein structure calculation, Peter Guntert (2003) Progress in NMR Spectroscopy, 43, 105-125

  17. ASNO (1993) Guntert, Berndt, & Wuthrich
      Input “data”
        1. Protein’s amino acid sequence
        2. Proton resonance assignments
        3. NOESY cross peak list (of pairs (jj))
        4. Set of estimated structures
      User specifies
        1. Max allowed chemical shift error
        2. dmax = max interproton distance causing NOE
        3. nmin = min # of structures with d < dmax

  18. ASNO (1993) Guntert, Berndt, & Wuthrich
      Input “data”
        1. Protein’s amino acid sequence
        2. Proton resonance assignments
        3. NOESY cross peak list (of pairs (jj))
        4. Set of estimated structures
      User specifies
        1. Max allowed chemical shift error
        2. dmax = max interproton distance causing NOE
        3. nmin = min # of structures with d < dmax
      Algorithm steps
        1. For each cross peak: find all possible assignments (¹Hj, ¹Hk)
        2. For each (¹Hj, ¹Hk): n = # of structures with d < dmax
        3. Prune all (¹Hj, ¹Hk) with n < nmin
      User intervention
        1. Manually check and refine NOE assignments (¹Hj, ¹Hk)
        2. Refine set of structures and rerun algorithm
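
The three ASNO algorithm steps can be illustrated with a minimal Python sketch. This is not the published ASNO code: the function names, the dictionary-based data layout, the proton labels, and the example numbers are all assumptions made for illustration. A peak is taken to be a pair of chemical shifts, and each structure is a mapping from proton name to 3D coordinates.

```python
import math
from itertools import product


def candidate_assignments(peak, shifts, tol):
    """Step 1: all proton pairs whose shifts match both peak coordinates within tol."""
    w1, w2 = peak
    first = [p for p, s in shifts.items() if abs(s - w1) <= tol]
    second = [p for p, s in shifts.items() if abs(s - w2) <= tol]
    return [(a, b) for a, b in product(first, second) if a != b]


def asno_filter(peaks, shifts, structures, tol, dmax, nmin):
    """Steps 2 and 3: count supporting structures, prune pairs with n < nmin."""
    kept = {}
    for peak in peaks:
        survivors = []
        for a, b in candidate_assignments(peak, shifts, tol):
            # n = number of structures in which the two protons are closer than dmax
            n = sum(1 for s in structures if math.dist(s[a], s[b]) < dmax)
            if n >= nmin:
                survivors.append((a, b))
        kept[peak] = survivors
    return kept


# Hypothetical toy data: two protons (HA, HB) overlap in chemical shift.
shifts = {"HA": 4.50, "HB": 4.52, "HN": 8.20}
structures = [
    {"HN": (0.0, 0.0, 0.0), "HA": (3.0, 0.0, 0.0), "HB": (8.0, 0.0, 0.0)},
    {"HN": (0.0, 0.0, 0.0), "HA": (0.0, 4.0, 0.0), "HB": (0.0, 9.0, 0.0)},
]
result = asno_filter([(8.20, 4.51)], shifts, structures, tol=0.02, dmax=5.0, nmin=2)
```

In this toy case the peak has two shift-compatible assignments, but only the HN/HA pair is within dmax in both structures, so the HN/HB candidate is pruned. The user-intervention steps (manual checking, refining the structures, rerunning) are outside the sketch.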

  19. Fig. 1 (1993) J Biomol NMR, 3, 601, Guntert, Berndt, & Wuthrich. Demo: Dendrotoxin K, 7 kDa, 57 residues, backbone RMSD = 0.32 Å

  20. SANE (2001) Duggan, Legge, Dyson, & Wright
      Input “data”
        1. Protein’s amino acid sequence
        2. Proton resonance assignments
        3. NOESY cross peak list (of pairs (jj))
      User specifies filters
        1. Distance (set of estimated structures)
        2. Chemical shift (max allowed error)
        3. Secondary structure (unlikely NOE assignments)
        4. Assignment (expected NOE assignments)
        5. NOE contribution (same as in ARIA method)

  21. SANE (2001) Duggan, Legge, Dyson, & Wright
      Input “data”
        1. Protein’s amino acid sequence
        2. Proton resonance assignments
        3. NOESY cross peak list (of pairs (jj))
      User specifies filters
        1. Distance (set of estimated structures)
        2. Chemical shift (max allowed error)
        3. Secondary structure (unlikely NOE assignments)
        4. Assignment (expected NOE assignments)
        5. NOE contribution (same as in ARIA method)
      Algorithm steps
        1. For each cross peak: find all possible assignments (¹Hj, ¹Hk)
        2. Apply the five filters to prune the list of (¹Hj, ¹Hk)
        3. Write unique or ambiguous distance restraints, or violations
      User intervention
        1. Violation analysis

  22. Fig. 1 (2001) J Biomol NMR, 19, 321, Duggan, et al. Demo: LFA-1 I-domain, 21.3 kDa, 183 residues, backbone RMSD = 0.29 Å

  23. NOAH (1995) Mumenthaler & Braun
      Input “data”
        1. Protein’s amino acid sequence
        2. Proton resonance assignments
        3. NOESY cross peak list (of pairs (jj))
        4. Scalar coupling constants (3JNH)
      Algorithm calculates
        1. Distance constraints from NOE assignments
        2. Angle constraints from scalar couplings

  24. NOAH (1995) Mumenthaler & Braun
      Input “data”
        1. Protein’s amino acid sequence
        2. Proton resonance assignments
        3. NOESY cross peak list (of pairs (jj))
        4. Scalar coupling constants (3JNH)
      Algorithm calculates
        1. Distance constraints from NOE assignments
        2. Angle constraints from scalar couplings
      Algorithm uses
        1. Structure-based filter (recognizes correct constraints)
        2. Chemical shift limit (max allowed error)
        3. Error-tolerant target function in DIAMOD (1994), which minimizes the effect of incorrect distance constraints from incorrect NOE assignments

  25. Fig. 1 (1995) J Mol Biol, 254, 465, Mumenthaler & Braun. Demo: 3 proteins ranging from 57 to 74 residues

  26. (1995) J Mol Biol, 254, 465, Mumenthaler & Braun NMRa/b=DEN=57, TEN=74, REP=69 residues

  27. ARIA (1997) Nilges, et al.
      Input “data”
        1. Protein’s amino acid sequence
        2. Proton resonance assignments
        3. NOESY cross peak list (of pairs (jj))
        4. Assignment cutoff, p, which decreases for each cycle
        5. (opt) preliminary structures, manual assignments
        6. (opt) RDCs, scalar couplings, d-angles, S-S or H-bonds
      Algorithm calculates in each cycle
        1. Unique and partial NOE assignments
        2. Unique and ambiguous distance restraints
        3. Merges distance restraints with other input data
        4. Bundle of refined structures (typically 20)

  28. ARIA (1997) Nilges, et al.
      Ambiguous restraints
      An NOE cross peak with more than one possible assignment is considered as a weighted composite of all of them. Ambiguous distance restraints are introduced to incorporate the distance dk of each ambiguous NOE assignment.
      To reduce the number of assignment possibilities, each relative contribution Ck is calculated from dk and the average distance for all possible assignments from the lowest n of 20 conformers from the previous cycle. The largest Ck that add up to the cutoff value, p, for that cycle are kept; the rest are discarded.
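
The contribution-based pruning described for ARIA can be sketched as follows. The slide does not give the formula for Ck, so this sketch assumes the r^-6 weighting commonly used for ambiguous distance restraints (Ck proportional to dk^-6, normalized over all candidates). The function names and example distances are hypothetical, and the distances are assumed to be already averaged over the selected conformers of the previous cycle.

```python
def effective_distance(distances):
    """r^-6 summed distance for one ambiguous restraint (assumed form)."""
    return sum(d ** -6 for d in distances) ** (-1 / 6)


def contributions(distances):
    """Relative contribution Ck of each candidate assignment; values sum to 1."""
    weights = {k: d ** -6 for k, d in distances.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def select_assignments(distances, p):
    """Keep the largest contributions until their running sum reaches the cutoff p."""
    ranked = sorted(contributions(distances).items(), key=lambda kc: kc[1], reverse=True)
    kept, running = [], 0.0
    for k, c in ranked:
        kept.append(k)
        running += c
        if running >= p:
            break
    return kept


# Hypothetical candidate assignments for one cross peak, with distances in Å.
distances = {"a": 3.0, "b": 4.0, "c": 6.0}
kept_loose = select_assignments(distances, p=0.95)   # early cycle, high cutoff
kept_strict = select_assignments(distances, p=0.80)  # later cycle, lower cutoff
```

With these numbers the shortest distance dominates the contributions, so lowering p from 0.95 to 0.80 drops the second candidate and leaves a unique assignment. This mirrors how a cutoff that decreases per cycle turns ambiguous restraints into unique ones.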

  29. Fig. 1 (1997) J Mol Biol, 269, 408, Nilges, et al. Demo: β-spectrin PH domain, 106 residues

  30. Table 1 (1997) J Mol Biol, 269, 408, Nilges, et al. β-spectrin PH domain, 106 residues. MAN data are derived from manual assignments; the 80 ms and 30 ms data differ only in mixing times.

  31. AutoStructure (1999) Moseley & Montelione
      Input “data”
        1. Protein’s amino acid sequence
        2. Proton resonance assignments
        3. NOESY cross peak list (of pairs (jj))
        4. Scalar couplings
        5. Slow amide H/D exchange data
        6. Preliminary structure
        7. Preliminary H-bonded pairs
      Algorithm calculates
        1. Distance restraints
        2. Dihedral angle restraints
        3. H-bonding pairs
        4. Refined structures

  32. Fig. 1 (1999) Curr. Opin. Struct. Biol., 9, 635, Moseley & Montelione (& Y.J. Huang PhD thesis). Basic fibroblast growth factor (127 residues), 10 NMR-derived structures. Backbone RMSD = 0.7 Å between (b) manual and AutoStructure-derived structures

  33. KNOWNOE (2002) Gronwald, et al.
      Input “data”
        1. Protein’s amino acid sequence
        2. Proton resonance assignments
        3. NOESY cross peak list (of pairs (jj))
        4. NOESY cross peak volume probability distribution
        5. Preliminary structure
      User specifies
        1. Max allowed chemical shift error
        2. Initial value of dmax = max interproton distance
        3. Number, N, of current best structures

  34. KNOWNOE (2002) Gronwald, et al.
      Input “data”
        1. Protein’s amino acid sequence
        2. Proton resonance assignments
        3. NOESY cross peak list (of pairs (jj))
        4. NOESY cross peak volume probability distribution
        5. Preliminary structure
      User specifies
        1. Max allowed chemical shift error
        2. Initial value of dmax = max interproton distance
        3. Number, N, of current best structures
      Algorithm, working together with CNS, iteratively will
        1. build A-list of uniquely assigned NOE cross peaks
        2. calculate P(Ak, a | Vo) for all other peaks
        3. add to A-list all peaks with P(Ak, a | Vo) > cutoff (0.8-0.9)
        4. use current A-list to calculate N structures

  35. KNOWNOE (2002) Gronwald, et al.
      The problem of ambiguous assignments is addressed with a Bayesian algorithm based on NOE cross peak volume probability distributions derived from 326 spectra.
      P(Ak, a | Vo) = probability that more than fraction a of cross peak volume Vo is due to assignment k.
      If P(Ak, a | Vo) > cutoff value (typically 0.8 to 0.9), then consider that peak assigned to k for the next cycle.
      These authors state that their algorithm is “Based on the observation that cross peak volume and correct cross peak assignment are not independent of each other”.
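
The cutoff rule for growing the A-list can be sketched as below. This covers only the bookkeeping of one cycle: the probabilities P(Ak, a | Vo) are taken as given inputs rather than computed from volume distributions, and the function name, data layout, peak labels, and default cutoff of 0.85 (a value inside the quoted 0.8 to 0.9 range) are assumptions for illustration, not part of KNOWNOE itself.

```python
def update_a_list(a_list, probabilities, cutoff=0.85):
    """Return a new A-list with every peak whose best assignment exceeds the cutoff.

    a_list:        {peak_id: assignment} for peaks already considered assigned
    probabilities: {peak_id: {assignment: P(Ak, a | Vo)}} for the remaining peaks
    """
    updated = dict(a_list)
    for peak, per_assignment in probabilities.items():
        if peak in updated or not per_assignment:
            continue
        best, p = max(per_assignment.items(), key=lambda kv: kv[1])
        if p > cutoff:
            updated[peak] = best
    return updated


# Hypothetical cycle: one uniquely assigned peak, two ambiguous peaks.
a_list = {"peak1": ("HN 5", "HA 4")}
probabilities = {
    "peak2": {("HN 7", "HB 3"): 0.93, ("HN 7", "HB 12"): 0.05},
    "peak3": {("HA 9", "HG 2"): 0.60, ("HA 9", "HG 30"): 0.40},
}
new_a_list = update_a_list(a_list, probabilities)
```

Here peak2 is promoted to the A-list because one assignment exceeds the cutoff, while peak3 stays ambiguous for the next cycle. In the slides' description, the updated A-list is then used to calculate the next N structures.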

  36. Figures 3 & 4 (2002) J. Biomol. NMR, 23, 271, Gronwald, et al. Probability distributions of distance (left) and volume (right)

  37. CANDID (2002) Hermann, Guntert & Wuthrich
      Input “data”
        1. Protein’s amino acid sequence
        2. Proton resonance assignments
        3. NOESY cross peak list (of pairs (jj))
        4. Previously assigned NOE distance constraints
        5. (opt) other conformational constraints
      User specifies
        1. Max allowed chemical shift error
        2. Cycle-dependent parameters (thresholds, cutoffs, etc.)

  38. from (2002) J. Mol. Biol., 319, 209, Hermann, Guntert, & Wuthrich.

  39. CANDID (2002) Hermann, Guntert & Wuthrich
      Input “data”
        1. Protein’s amino acid sequence
        2. Proton resonance assignments
        3. NOESY cross peak list (of pairs (jj))
        4. Previously assigned NOE distance constraints
        5. (opt) other conformational constraints
      User specifies
        1. Max allowed chemical shift error
        2. Cycle-dependent parameters (thresholds, cutoffs, etc.)
      Algorithm uses
        1. Structure-based filters (like NOAH)
        2. Ambiguous distance constraints (like ARIA)
        3. Network anchoring (new)
        4. Constraint combination (new)

  40. Fig. 1 (2002) J. Mol. Biol., 319, 209, Hermann, Guntert, & Wuthrich.

  41. CANDID (2002) Hermann, Guntert & Wuthrich
      Ways to handle problems caused by no preliminary structure in first cycle
        1. Network anchoring
           “… evaluates the self-consistency of NOE assignments independent of knowledge of the 3D protein structure.”
           “… a sensitive approach for detecting erroneous ‘lonely’ constraints …”
        2. Constraint combination
           “… an extension of the concept of ambiguous NOE assignments.”
           “… reduces the impact of unidentified artifact constraints in the input for the first structure calculation.”
      Result: “The correct fold is obtained in cycle 1 of a de novo structure calculation.”

  42. from (2002) J. Mol. Biol., 319, 209, Hermann, Guntert, & Wuthrich.

  43. Conclusions. Questions?
