1 / 22

Protein Design

Protein Design. Crystal structure of top7 – A novel protein structure created with RosettaDesign. CS273: Final Project Charles Kou charlesk@stanford.edu. http://rosettadesign.med.unc.edu/. What is Protein Design.

maine
Download Presentation

Protein Design

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Protein Design Crystal structure of top7 – A novel protein structure created with RosettaDesign. CS273: Final Project Charles Kou charlesk@stanford.edu http://rosettadesign.med.unc.edu/

  2. What is Protein Design • Opposite of structure prediction: determine low energy sequence that yield given structure. • Computationally difficult: • Search space of 20^n where n = sequence length (20 amino acids) • Major algorithms: Dead-end elimination, genetic algorithms, Monte Carlo, Branch & Bound. http://www.stanford.edu/class/cs273/project/project.html

  3. Major Algorithms • Trade off between thoroughness and computational speed. • Monte Carlo / Genetic Algorithm: • Can sample space with infinite number of solutions • Sidechain identity, side chain orientation and backbone structure can be varied continuously. • No guarantee of reaching global energy minimum. • Dead-End Elimination • Allows only discrete conformations. • Rejection criteria is used to prune the search space. Desjarlais JR, Clarke ND. Computer search algorithms in protein modification and design. Curr Opin Struct Biol. 1998 Aug;8(4):471-5. PubMed

  4. qj qi qN-1 q2 defined over large dimensionalconformation space qN q1 Review: Energy Landscape JC Lantombe, Energy2.ppt

  5. Review: Example Energy Function • E = S bonded terms + S non-bonded terms + S solvation terms • E = (ES + EQ + ES-B + ETor) + (EvdW + Edipole) • Bonded terms - Relatively few • Non-bonded terms- Depend on distances between pairs of atoms -O(n2)  Expensive to compute • Solvation terms- May require computing molecular surface JC Lantombe, Energy2.ppt

  6. Review: Monte Carlo Simulation (MCS) • Random walk through conformation space • At each cycle: • Perturb current conformation at random • Accept step with probability: (Metropolis acceptance criterion) • The conformations generated by an arbitrarily long MCS are Boltzman distributed, i.e., #conformations in V ~ JC Lantombe, Energy2.ppt

  7. Monte Carlo Simulation • Tend to waste time in local min. • May consist of millions of steps. • Energy must be evaluated frequently (computationally expensive). • Use ChainTree to improve performance. Lotan, I., Schwarzer, F., Halperin, D., Latombe, J.C.: Efficient maintenance and self-collision testing for kinematic chains. In: Symposium on Computational Geometry (2002) 43–52 Koehl, P and Levitt, M. De novo protein design. I. In search of stability and specificity. Journal of Molecular Biology, 293, 1161-1181 (1999).

  8. Genetic Algorithm Starts with First generation pool. • Iteratively apply genetic operators (selection, recombination, mutation). • Evloves toward better solution (low energy function). S. M. Larson, J. England, J. DesJarlais, and V. S. Pande. Thoroughly sampling sequence space: large-scale protein design of structural ensembles. Protein Science 11 2804-281 (2002). Protein Science

  9. Selection • Selection function takes into account the value of fitness function. This gives priority to the “fit” organism but also gives chance for “less fit” organisms. http://en.wikipedia.org/wiki/Genetic_algorithm

  10. Selection Method • Roulette Method: probability of selection is proportional to the value of fitness function • Tournament: picks k individuals (tournament size), and choose the individual with probability p. Iterate with probability p*(1-p), then p*(p*(1-p)) … • Higher k = less chance for weaker individual. http://en.wikipedia.org/wiki/Roulette_wheel_selection http://en.wikipedia.org/wiki/Tournament_selection

  11. Recombination, Mutation • Recombination: different segment of the structure which is optimized in parallel can be recombined into the same model. Recombination occurs with a set probability. Otherwise, the population is propogated to the next generation. • Mutation: avoids local minima by mutating the child with a set probability. • Similar to MC: there is no guarantee to converge into global minimum. http://en.wikipedia.org/wiki/Genetic_operator

  12. Genome@home • Genome@Home uses distributed computing and genetic algorithm. • It also incorporates backbone flexibility using Monte Carlo (random perturbation with RMSD<1.0a) which improves the result. http://www.stanford.edu/group/pandegroup/genome/

  13. Dead-end Elimination • Discrete conformational search. • Functionally equivalent to exhustive search. • It uses rejection criteria to prune the search space. • The robustness depends on the discreteness and the rejection criteria used. • Guaranteed convergence to global min. • Initially used for sidechain placement. More difficult for protein design because of high degrees of freedom. Looger LL, Hellinga HW. Generalized dead-end elimination algorithms make large-scale protein side-chain structure prediction tractable: implications for protein design and structural genomics. J Mol Biol. 2001 Mar 16;307(1):429-45. PubMed

  14. Energy of conformation • Reformulation of sidechain placement problem: Amino acid identity is used instead of rotamer. • The general DEE allows residue up to 300. • Energy of conformation is defined as sum of interaction among side chains and sum of interaction of sidechain and the backbone. • Rejection criteria is used and iterated until no more rotamers can be eliminated. Convergence occurs, or reduces the problem sufficiently for exhaustive serach.

  15. DEE filter: Rejection Criteria • Simple Criterion: If lowest energy struct that can be found using a given sidechain rotamer (low energy side chain conformation) is higer than the highest energy struct w/ different rotamer, the first rotamer is eliminated.

  16. DEE filter: Rejection Criteria • Goldstein Criteria:if energy struct containing one rotamer is always lowered by changing to a second one, the first one is eliminated.

  17. DEE filter: Rejection Criteria • Generalized Criterion:residues are added in group, eliminated clusters of rotamers in the groups maybe excluded from the minimum operator, in addition to those which form dead-end clusters with c.

  18. Mean Field Theory • Reduce search space. • Self-consistency is sought by placing amino acids at pre-selected positions in a given structure. • Energy function is minimized by mean field. Voigt CA, Mayo SL, Arnold FH, Wang ZG. Computational method to reduce the search space for directed protein evolution. Proc Natl Acad Sci U S A. 2001 Mar 27;98(7):3778-83. PNAS

  19. Review: Branch & Bound • Set of solutions can be partitioned into subsets (branch) • Upper limit on a subset’s solution can be computed fast (bound) Branch & Bound • Select subset with best possible bound • Subdivide it, and compute a bound for each subset S.Batzoglou, Threading2.ppt

  20. Rosetta Design • Initial backbone designed without regard to side-chain packing. • Iterates between sequence design and backbone optimization using Monte Carlo. • Perturbation in random change in the torsional angles of 1-5 random residue, or substitution of backbone torsonal angles of 1-3 consecutive residues with torsional angles from a structure in the PDB. Sidechain optimization. Accept/reject using Metropolis criterion. • 1.17-a backbone atom RMSD between model and structure. Crystal structure of top7 – A novel protein structure created with RosettaDesign. http://rosettadesign.med.unc.edu/ Kuhlman B, Dantas G, Ireton GC, Varani G, Stoddard BL, Baker D. Design of a novel globular protein fold with atomic-level accuracy. Science. 2003 Nov 21;302(5649):1364-8. PubMed

  21. Using Rosetta Design • Red: PDB 1A1M: Mhc Class I Molecule B*5301 Complexed With Peptide Typdinqml From Gag Protein Of Hiv2 • Blue: Rosetta Stone Designed • Visualized with Deep View / Swiss-PdbViewer. http://us.expasy.org/spdbv/ http://www.rcsb.org/pdb/cgi/explore.cgi?pid=195321117535569&pdbId=1A1M

  22. b.e.a.n.s. • A simple openGL based program was developed to test monte carol and genetic algorithms on designing “chain of jelly beans.” • User is able to vary the initial structure of the “beans” and compare the efficiency of the algorithms via built-in timer.

More Related