
Computational Approaches to Receptor Structure Prediction

Uğur Sezerman, Biological Sciences and Bioengineering Program, Sabancı University, Istanbul.


Presentation Transcript


  1. Computational Approaches to Receptor Structure Prediction • Uğur Sezerman, Biological Sciences and Bioengineering Program, Sabancı University, Istanbul

  2. Determining Protein Structure • There are on the order of 100,000 distinct proteins in the human proteome. • 3D structures have been determined for over 60,000 proteins from all organisms. • This count includes duplicates with different ligands bound, etc. • Coordinates are determined by X-ray crystallography or NMR. Protein Folding

  3. X-Ray Crystallography (figure scale: ~0.5 mm) • The crystal is a mosaic of millions of copies of the protein. • As much as 70% is solvent (water)! • May take months (and a “green” thumb) to grow.

  4. X-Ray Diffraction • The image is averaged over: • Space (many copies) • Time (of the diffraction experiment)

  5. Electron Density Maps • Resolution depends on the quality/regularity of the crystal • R-factor is a measure of “leftover” electron density • Solvent fitting • Refinement

  6. The Protein Data Bank • http://www.rcsb.org/pdb/ • Sample ATOM records:
     ATOM      1  N   ALA E   1      22.382  47.782 112.975  1.00 24.09      3APR 213
     ATOM      2  CA  ALA E   1      22.957  47.648 111.613  1.00 22.40      3APR 214
     ATOM      3  C   ALA E   1      23.572  46.251 111.545  1.00 21.32      3APR 215
     ATOM      4  O   ALA E   1      23.948  45.688 112.603  1.00 21.54      3APR 216
     ATOM      5  CB  ALA E   1      23.932  48.787 111.380  1.00 22.79      3APR 217
     ATOM      6  N   GLY E   2      23.656  45.723 110.336  1.00 19.17      3APR 218
     ATOM      7  CA  GLY E   2      24.216  44.393 110.087  1.00 17.35      3APR 219
     ATOM      8  C   GLY E   2      25.653  44.308 110.579  1.00 16.49      3APR 220
     ATOM      9  O   GLY E   2      26.258  45.296 110.994  1.00 15.35      3APR 221
     ATOM     10  N   VAL E   3      26.213  43.110 110.521  1.00 16.21      3APR 222
     ATOM     11  CA  VAL E   3      27.594  42.879 110.975  1.00 16.02      3APR 223
     ATOM     12  C   VAL E   3      28.569  43.613 110.055  1.00 15.69      3APR 224
     ATOM     13  O   VAL E   3      28.429  43.444 108.822  1.00 16.43      3APR 225
     ATOM     14  CB  VAL E   3      27.834  41.363 110.979  1.00 16.66      3APR 226
     ATOM     15  CG1 VAL E   3      29.259  41.013 111.404  1.00 17.35      3APR 227
     ATOM     16  CG2 VAL E   3      26.811  40.649 111.850  1.00 17.03      3APR 228
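The ATOM records on slide 6 can be read programmatically. The following is a minimal sketch, not a full PDB parser: it assumes the fields are separated by whitespace (true for the records shown, although the real format is column-based), and the names `Atom`, `parse_atom_line` and `distance` are made up for this illustration. It extracts the coordinates and measures the distance between consecutive alpha carbons.

```python
import math
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float


def parse_atom_line(line: str) -> Atom:
    """Parse one ATOM record, assuming whitespace-separated fields."""
    f = line.split()
    if len(f) < 11 or f[0] != "ATOM":
        raise ValueError(f"not an ATOM record: {line!r}")
    return Atom(int(f[1]), f[2], f[3], f[4], int(f[5]),
                float(f[6]), float(f[7]), float(f[8]),
                float(f[9]), float(f[10]))


def distance(a: Atom, b: Atom) -> float:
    """Euclidean distance between two atoms, in the units of the file (angstroms)."""
    return math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))


# Four of the records from the slide (one backbone N and the three CA atoms).
SAMPLE = """\
ATOM 1 N ALA E 1 22.382 47.782 112.975 1.00 24.09 3APR 213
ATOM 2 CA ALA E 1 22.957 47.648 111.613 1.00 22.40 3APR 214
ATOM 7 CA GLY E 2 24.216 44.393 110.087 1.00 17.35 3APR 219
ATOM 11 CA VAL E 3 27.594 42.879 110.975 1.00 16.02 3APR 223
"""

atoms = [parse_atom_line(line) for line in SAMPLE.splitlines()]
alpha_carbons = [a for a in atoms if a.name == "CA"]
ca_distances = [distance(a, b) for a, b in zip(alpha_carbons, alpha_carbons[1:])]

for (a, b), d in zip(zip(alpha_carbons, alpha_carbons[1:]), ca_distances):
    print(f"{a.res_name}{a.res_seq} CA to {b.res_name}{b.res_seq} CA: {d:.2f}")
```

Both CA to CA distances come out close to 3.8 Å, the usual spacing between neighbouring residues along a backbone. For real files a column-based parser or an established library is the safer choice.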

  7. A Peek at Protein Function • Serine proteases – cleave other proteins • Catalytic triad: Asp, His, Ser

  8. Cleaving the peptide bond

  9. Three Serine Proteases • Chymotrypsin – cleaves the peptide bond on the carboxyl side of aromatic (ring) residues (Trp, Phe, Tyr) and of the large hydrophobic residue Met • Trypsin – cleaves after the positively charged residues Lys (K) and Arg (R) • Elastase – cleaves after small residues: Gly, Ala, Ser, Cys
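The specificity rules on slide 9 translate directly into a toy digestion routine. This is a simplified sketch that uses only the residue lists given on the slide; real proteases have further preferences and exceptions that are ignored here, and the function names are invented for the example.

```python
# Residues after which each protease cuts, as listed on the slide.
CLEAVES_AFTER = {
    "chymotrypsin": set("WFYM"),  # Trp, Phe, Tyr, Met
    "trypsin": set("KR"),         # Lys, Arg
    "elastase": set("GASC"),      # Gly, Ala, Ser, Cys
}


def cleavage_sites(sequence: str, protease: str) -> list[int]:
    """Return cut positions: a value i means the bond between residues i-1 and i."""
    residues = CLEAVES_AFTER[protease]
    # The last residue has no peptide bond on its carboxyl side, so skip it.
    return [i + 1 for i, aa in enumerate(sequence[:-1]) if aa in residues]


def digest(sequence: str, protease: str) -> list[str]:
    """Split a one-letter sequence into the fragments left after complete digestion."""
    cuts = [0] + cleavage_sites(sequence, protease) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(cuts, cuts[1:])]


print(digest("AAKGGRWL", "trypsin"))       # ['AAK', 'GGR', 'WL']
print(digest("AAKGGRWL", "chymotrypsin"))  # ['AAKGGRW', 'L']
```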

  10. Specificity Binding Pocket

  11. Protein Folding – Biological perspective • “Central dogma”: sequence specifies structure • Denature – to “unfold” a protein back to a random coil configuration • β-mercaptoethanol – breaks disulfide bonds • Urea or guanidine hydrochloride – denaturants • Also heat or pH • Anfinsen’s experiments: • Denatured ribonuclease • Spontaneously regained enzymatic activity • Evidence that it refolded to the native conformation

  12. Protein Folding Problem • Finding the structure of a protein starting from its amino acid sequence is called the protein folding problem

  13. The Protein Folding Problem • Central question of molecular biology: “Given a particular sequence of amino acid residues (primary structure), what will the tertiary/quaternary structure of the resulting protein be?” • Input: AAVIKYGCAL… • Output: (φ1, ψ1), (φ2, ψ2), … = backbone conformation (no side chains yet)

  14. Folding intermediates • Levinthal’s paradox – Consider a 100-residue protein. If each residue can take only 3 × 3 = 9 positions, there are 9^100 (about 2.7 × 10^95) possible conformations. • Folding must proceed by progressive stabilization of intermediates • Molten globules – most secondary structure formed, but much less compact than the “native” conformation
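The scale of Levinthal's estimate on slide 14 is easy to check numerically. The sketch below uses the slide's figures (100 residues, 9 states each); the sampling rate of 10^13 conformations per second is an assumption added here purely for illustration and does not come from the slides.

```python
STATES_PER_RESIDUE = 3 * 3   # from the slide
N_RESIDUES = 100             # from the slide
SAMPLES_PER_SECOND = 1e13    # assumption: one conformation per ~0.1 ps
SECONDS_PER_YEAR = 3600 * 24 * 365

# Python integers are arbitrary precision, so this is exact.
conformations = STATES_PER_RESIDUE ** N_RESIDUES

# Time needed to visit every conformation once at the assumed rate.
years_to_enumerate = conformations / SAMPLES_PER_SECOND / SECONDS_PER_YEAR

print(f"conformations: {float(conformations):.2e}")
print(f"years for a random exhaustive search: {years_to_enumerate:.1e}")
```

Even with a generous sampling rate, the search time exceeds any biological timescale by dozens of orders of magnitude, which is why the slide concludes that folding must proceed through stabilized intermediates rather than random search.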

  15. Protein Packing • occurs in the cytosol (~60% bulk water, ~40% water of hydration) • involves interaction between secondary structure elements and solvent • may be promoted by chaperones, membrane proteins • tumbles into molten globule states • overall entropy loss is small enough that enthalpy determines the sign of ΔE, which decreases (the loss in entropy from packing is counteracted by the gain from desolvation and reorganization of water, i.e. the hydrophobic effect) • yields tertiary structure

  16. Folding help • Proteins are, in fact, only marginally stable • The native state is typically only 5 to 10 kcal/mol more stable than the unfolded form • Many proteins help in folding • Protein disulfide isomerase – catalyzes shuffling of disulfide bonds • Chaperones – break up aggregates and (in theory) unfold misfolded proteins
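To put the 5 to 10 kcal/mol figure from slide 16 in perspective, it can be converted into a folded-to-unfolded population ratio. This is a rough sketch that assumes a simple two-state equilibrium and a temperature of 298 K; neither assumption is stated on the slides, and the function names are illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def folded_to_unfolded_ratio(delta_g_kcal: float, temperature_k: float = 298.0) -> float:
    """Two-state equilibrium ratio [folded]/[unfolded] for a stability of delta_g_kcal."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


def fraction_unfolded(delta_g_kcal: float, temperature_k: float = 298.0) -> float:
    """Fraction of molecules unfolded at equilibrium under the same assumptions."""
    return 1.0 / (1.0 + folded_to_unfolded_ratio(delta_g_kcal, temperature_k))


for dg in (5.0, 10.0):
    print(f"{dg:4.1f} kcal/mol -> ratio {folded_to_unfolded_ratio(dg):.3g}, "
          f"unfolded fraction {fraction_unfolded(dg):.3g}")
```

Under these assumptions a 5 kcal/mol stability corresponds to a ratio of a few thousand to one. That is comparable to the energy of a handful of hydrogen bonds, so a small perturbation can shift the balance noticeably.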

  17. Forces driving protein folding • It is believed that hydrophobic collapse is a key driving force for protein folding • Hydrophobic core • Polar surface interacting with solvent • Minimum volume (no cavities) • Disulfide bond formation stabilizes the fold • Hydrogen bonds • Polar and electrostatic interactions

  18. Secondary Structure • non-linear • three-dimensional • localized to regions of an amino acid chain • formed and stabilized by hydrogen bonding, electrostatic and van der Waals interactions

  19. Common motifs

  20. The Hydrophobic Core • Hemoglobin A is the protein in red blood cells (erythrocytes) responsible for binding oxygen. • The mutation E6V in the β chain places a hydrophobic Val on the surface of hemoglobin. • The resulting “sticky patch” causes hemoglobin S to agglutinate (stick together) and form fibers, which deform the red blood cell and do not carry oxygen efficiently. • Sickle cell anemia was the first identified molecular disease.

  21. Sickle Cell Anemia • Sequestering hydrophobic residues in the protein core protects proteins from hydrophobic agglutination.

  22. Computational Approaches • Ab initio methods • Threading • Comparative Modelling • Fragment Assembly

  23. Why is ab-initio prediction hard?

  24. Ab-initio protein structure prediction as an optimization problem (figure axes: energy vs. conformation) • Define a function that maps protein structures to some quality measure. • Solve the computational problem of finding an optimal structure.

  25. A dream function • Has a clear minimum in the native structure. • Has a clear path towards the minimum. • A global optimization algorithm should find the native structure. (Chen Keasar, BGU)

  26. An approximate function • Easier to design and compute. • Native structure is not always the global minimum. • Global optimization methods do not converge. • Many alternative models (decoys) should be generated. (Chen Keasar, BGU)

  27. An approximate function • Easier to design and compute. • Native structure is not always the global minimum. • Global optimization methods do not converge. • Many alternative models (decoys) should be generated. • No clear way of choosing among them (figure label: decoy set). (Chen Keasar, BGU)

  28. Fold Optimization • Simple lattice models (HP models) • Two types of residues: hydrophobic (H) and polar (P) • 2-D or 3-D lattice • The only force is hydrophobic collapse • Score = number of H–H contacts

  29. Scoring Lattice Models • H/P model scoring: count noncovalent hydrophobic interactions. • Sometimes: penalize buried polar or surface hydrophobic residues
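The basic H/P score from slides 28 and 29 can be sketched in a few lines. The example assumes a 2-D square lattice and the usual convention that a contact is a pair of H residues on neighbouring lattice sites that are not consecutive in the chain; the optional burial penalties mentioned on the slide are left out, and `hh_contacts` is a name chosen for this illustration.

```python
def hh_contacts(sequence: str, coords: list[tuple[int, int]]) -> int:
    """Count noncovalent H-H contacts for a chain placed on a 2-D square lattice.

    sequence: string of 'H' and 'P', one letter per residue.
    coords:   lattice position of each residue, in chain order.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    index_at = {xy: i for i, xy in enumerate(coords)}
    if len(index_at) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        if abs(x0 - x1) + abs(y0 - y1) != 1:
            raise ValueError("consecutive residues must be lattice neighbours")

    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that every pair is counted exactly once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return contacts


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
line = [(0, 0), (1, 0), (2, 0), (3, 0)]
print(hh_contacts("HPPH", square))  # 1: the two chain ends touch
print(hh_contacts("HPPH", line))    # 0: extended chain, no contacts
```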

  30. What can we do with lattice models? • For smaller polypeptides, exhaustive search can be used • Looking at the “best” fold, even in such a simple model, can teach us interesting things about the protein folding process • For larger chains, other optimization and search methods must be used: • Greedy, branch and bound • Evolutionary computing, simulated annealing • Graph theoretical methods
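For short chains, the exhaustive search mentioned on slide 30 can be written as a depth-first enumeration of self-avoiding walks. This is a minimal sketch on a 2-D square lattice with the plain H-H contact score; the function names are illustrative, and the running time grows roughly threefold per added residue, so it is practical only for short sequences.

```python
MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))


def count_hh(sequence, coords):
    """Number of H-H pairs on adjacent lattice sites that are not chain neighbours."""
    index_at = {xy: i for i, xy in enumerate(coords)}
    total = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        for neighbour in ((x + 1, y), (x, y + 1)):  # each pair counted once
            j = index_at.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                total += 1
    return total


def best_fold(sequence):
    """Enumerate all self-avoiding conformations and return (best score, coordinates)."""
    if len(sequence) < 2:
        return 0, [(0, 0)][:len(sequence)]

    best_score, best_coords = -1, None

    def extend(coords, occupied):
        nonlocal best_score, best_coords
        if len(coords) == len(sequence):
            score = count_hh(sequence, coords)
            if score > best_score:
                best_score, best_coords = score, list(coords)
            return
        x, y = coords[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt in occupied:
                continue  # keep the walk self-avoiding
            coords.append(nxt)
            occupied.add(nxt)
            extend(coords, occupied)
            coords.pop()
            occupied.remove(nxt)

    # Fix the first bond along +x; rotated copies of a fold score the same.
    start = [(0, 0), (1, 0)]
    extend(start, set(start))
    return best_score, best_coords


score, coords = best_fold("HPPHPPH")
print(score, coords)
```

The greedy, branch-and-bound, evolutionary and annealing methods listed on the slide replace this full enumeration with a guided search once chains get longer.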

  31. Learning from Lattice Models • The “hydrophobic zipper” effect: Ken Dill ~ 1997

  32. Threading: Fold recognition • Given: • Sequence: IVACIVSTEYDVMKAAR… • A database of molecular coordinates • Map the sequence onto each fold • Evaluate • Objective 1: improve scoring function • Objective 2: folding
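The "map the sequence onto each fold, then evaluate" loop on slide 32 can be illustrated with a deliberately crude toy. Everything specific here is an assumption made for the example and not the method used in the presentation: each fold is reduced to a string of buried (B) and exposed (E) positions, the score only rewards hydrophobic residues in buried positions and polar residues in exposed ones, and no gaps are allowed. Real threading programs use structure-derived potentials and gapped alignment.

```python
HYDROPHOBIC = set("AVILMFWC")  # assumption: a simple hydrophobic residue set


def position_score(residue: str, environment: str) -> int:
    """+1 if the residue type suits the environment ('B' buried, 'E' exposed), else -1."""
    buried = environment == "B"
    return 1 if (residue in HYDROPHOBIC) == buried else -1


def thread(sequence: str, template: str):
    """Best ungapped placement of sequence on template as (score, offset), or None."""
    best = None
    for offset in range(len(template) - len(sequence) + 1):
        score = sum(position_score(r, template[offset + k])
                    for k, r in enumerate(sequence))
        if best is None or score > best[0]:
            best = (score, offset)
    return best


def rank_folds(sequence: str, library: dict[str, str]):
    """Score the sequence against every template and sort from best to worst."""
    results = []
    for name, template in library.items():
        placement = thread(sequence, template)
        if placement is not None:  # template too short to hold the sequence
            results.append((placement[0], name, placement[1]))
    return sorted(results, reverse=True)


# Hypothetical two-entry fold library.
library = {"fold_a": "BBEEBB", "fold_b": "EEEEEE"}
print(rank_folds("VVKKLL", library))
```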

  33. Protein Fold Families • CATH website: www.cathdb.info

  34. Secondary Structure Prediction
      AGVGTVPMTAYGNDIQYYGQVT…
      A-VGIVPM-AYGQDIQY-GQVT…
      AG-GIIP--AYGNELQ--GQVT…
      AGVCTVPMTA---ELQYYG--T…
      AGVGTVPMTAYGNDIQYYGQVT…
      ----hhhHHHHHHhhh--eeEE…

  35. Secondary Structure Prediction • Easier than folding • Current algorithms can predict secondary structure with 70-80% accuracy • Chou, P.Y. & Fasman, G.D. (1974). Biochemistry, 13, 211-222. • Based on frequencies of occurrence of residues in helices and sheets • PHD – neural network based • Uses a multiple sequence alignment • Rost & Sander, Proteins, 1994, 19, 55-72
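The slide does not say how the 70-80% accuracy is measured. A common choice is per-residue three-state accuracy (often called Q3), and the sketch below assumes that definition. It also assumes the notation of slide 34, where upper and lower case mark helix (H/h) and strand (E/e) and anything else counts as coil.

```python
def to_three_state(symbol: str) -> str:
    """Collapse a prediction symbol to helix (H), strand (E) or coil (C)."""
    upper = symbol.upper()
    return upper if upper in ("H", "E") else "C"


def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose three-state class is predicted correctly."""
    if len(predicted) != len(observed):
        raise ValueError("strings must have the same length")
    if not predicted:
        raise ValueError("empty input")
    matches = sum(to_three_state(p) == to_three_state(o)
                  for p, o in zip(predicted, observed))
    return matches / len(predicted)


# Toy strings invented for the example, not data from the slides.
print(q3_accuracy("--hhHHee", "CCHHHHEC"))  # 0.875 (7 of 8 residues agree)
```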

  36. Chou-Fasman Parameters

  37. Homology Modelling • Using database search algorithms, find the sequence with known structure that best matches the query sequence • Assign the structure of the core regions obtained from the structure database to the query sequence • Find the structure of the intervening loops using loop closure algorithms
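The first step on slide 37, choosing the known structure whose sequence best matches the query, can be illustrated with percent identity over precomputed pairwise alignments. This is a stand-in for a real database search, which would also compute the alignments and assess their significance; the template names and aligned strings below are invented, and alignment coverage is ignored.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over the columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def best_template(alignments: dict[str, tuple[str, str]]) -> tuple[str, float]:
    """Pick the template with the highest identity to the query.

    alignments maps a template name to (aligned query, aligned template).
    """
    identity, name = max((percent_identity(q, t), name)
                         for name, (q, t) in alignments.items())
    return name, identity


# Hypothetical candidates with toy alignments.
candidates = {
    "tmplA": ("ACDE", "AGDE"),
    "tmplB": ("ACDE", "AC-E"),
}
print(best_template(candidates))  # ('tmplB', 100.0)
```

In practice the choice also weighs how much of the query the alignment covers and the quality of the template structure, not identity alone.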

  38. Homology Modeling: How it works • Find template • Align target sequence with template • Generate model: add loops, add sidechains • Refine model

  39. Prediction of Protein Structures • Examples – a few good examples (figure panels: actual vs. predicted structures)

  40. Prediction of Protein Structures • Not so good example

  41. 1esr


  44. How can we predict protein structures? • Figure: decision diagram, “Are we lucky?” (yes / a bit / no), with branches labeled homology, fold recognition, and ab initio

