11/7/05 Protein Structure: Classification, Databases, Visualization

Presentation Transcript


  1. 11/7/05 Protein Structure: Classification, Databases, Visualization D Dobbs ISU - BCB 444/544X: Protein Structure: Classification, Databases, Visualization

  2. Announcements: BCB 544 Projects - Important Dates • Nov 2 Wed noon - Project proposals due to David/Drena • Nov 4 Fri PM - Approvals/responses & tentative presentation schedule to students • Dec 2 Fri noon - Written project reports due • Dec 5,7,8,9 class/lab - Oral Presentations (20') • (Dec 15 Thurs = Final Exam)

  3. Bioinformatics Seminars • Nov 7 Mon 12:10 IG Faculty Seminar in 101 Ind Ed II: Inborn Errors of Metabolism in Humans & Animal Models, Matt Ellinwood, Animal Science, ISU • Nov 10 Thurs 3:40 Com S Seminar in 223 Atanasoff: Computational Epidemiology, Armin R. Mikler, Univ. North Texas http://www.cs.iastate.edu/~colloq/#t3

  4. Bioinformatics Seminars CORRECTION: Next week - Baker Center/BCB Seminars (seminar abstracts available at above link) • Nov 14 Mon 1:10 PM Doug Brutlag, Stanford: Discovering transcription factor binding sites • Nov 15 Tues 1:10 PM Ilya Vakser, Univ Kansas: Modeling protein-protein interactions • Both seminars will be in Howe Hall Auditorium

  5. Protein Structure & Function: Analysis & Prediction • Mon: Protein structure: classification, databases, visualization • Wed: Protein structure: prediction & modeling • Thurs Lab: Protein structure prediction • Fri: Protein-nucleic acid interactions; Protein-ligand docking

  6. Reading Assignment (for Mon-Fri) • Mount Bioinformatics • Chp 10 Protein classification & structure prediction http://www.bioinformaticsonline.org/ch/ch10/index.html • pp. 409-491 • Check Errata: http://www.bioinformaticsonline.org/help/errata2.html • Other? Additional reading assignments for BCB 544

  7. Review last lecture: RNA Structure Prediction Algorithms

  8. RNA structure prediction strategies: Secondary structure prediction 1) Energy minimization (thermodynamics) 2) Comparative sequence analysis (co-variation) 3) Combined experimental & computational

  9. 1) Energy minimization method • What are the assumptions? Native tertiary structure or "fold" of an RNA molecule is (one of) its lowest free energy configuration(s) • Gibbs free energy = ΔG in kcal/mol at 37°C = equilibrium stability of structure; lower (more negative) values are more favorable • Is this assumption valid in vivo? This may not hold, but we don't really know

  10. Gibbs free energy: G • Gibbs free energy (G) is formally defined in terms of the state functions enthalpy & entropy, & the state variable temperature: G = H - TS • ΔG = ΔH - TΔS (for constant temp) • Enthalpy (H) = amount of heat absorbed by a system at constant pressure • Entropy (S) = measure of the amount of disorder or randomness in a system • Note: this is not the same as "entropy" in information theory, but is related, see: http://en.wikipedia.org/wiki/Information_theory
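
As a small worked illustration of ΔG = ΔH - TΔS at constant temperature, the sketch below evaluates ΔG for a folding transition and finds the temperature at which ΔG = 0. The function names and the ΔH and ΔS values are hypothetical, chosen only to make the arithmetic visible; they are not measurements from the lecture.

```python
def gibbs_free_energy(delta_h, delta_s, temp_k):
    """Return dG = dH - T*dS (kcal/mol) at constant temperature temp_k (kelvin)."""
    return delta_h - temp_k * delta_s


def zero_crossing_temperature(delta_h, delta_s):
    """Temperature (kelvin) at which dG = 0, i.e. T = dH / dS."""
    return delta_h / delta_s


# Hypothetical values for a folding transition (assumptions, not data):
DELTA_H = -50.0   # kcal/mol, heat released on folding
DELTA_S = -0.14   # kcal/(mol*K), order gained on folding
T_BODY = 310.15   # kelvin, i.e. 37 C

if __name__ == "__main__":
    dg = gibbs_free_energy(DELTA_H, DELTA_S, T_BODY)
    print(f"dG at 37 C = {dg:.3f} kcal/mol")
    print(f"dG = 0 at T = {zero_crossing_temperature(DELTA_H, DELTA_S):.1f} K")
```

With these made-up numbers ΔG is negative at 37°C, so the folded state would be favored; above the zero-crossing temperature the sign flips.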

  11. Gibbs free energy: ΔG • Gibbs free energy for formation of an RNA or protein structure = ΔG = equilibrium stability of that structure at a specific temperature (kcal/mol at 37°C) • ΔG = -RT ln Keq • R = gas constant
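
To make the relation ΔG = -RT ln Keq concrete, here is a minimal sketch that converts between an equilibrium constant and a free energy at 37°C. The function names are illustrative assumptions; R is the gas constant expressed in kcal/(mol·K).

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
T_BODY = 310.15     # 37 C expressed in kelvin


def delta_g_from_keq(keq, temp_k=T_BODY):
    """dG = -R * T * ln(Keq), in kcal/mol."""
    return -R_KCAL * temp_k * math.log(keq)


def keq_from_delta_g(delta_g, temp_k=T_BODY):
    """Inverse relation: Keq = exp(-dG / (R * T))."""
    return math.exp(-delta_g / (R_KCAL * temp_k))


if __name__ == "__main__":
    for keq in (1.0, 10.0, 1000.0):
        print(f"Keq = {keq:>7.1f}  ->  dG = {delta_g_from_keq(keq):.2f} kcal/mol")
```

At 37°C each tenfold increase in Keq lowers ΔG by roughly 1.4 kcal/mol, which gives a feel for how small the free energy differences between folded and unfolded states can be.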

  12. Nearest-neighbor parameters • Most methods for free energy minimization use nearest-neighbor parameters (derived from experiment) for predicting stability of an RNA secondary structure (in terms of ΔG at 37°C) • Most available software packages use the same set of parameters: Mathews, Sabina, Zuker & Turner, 1999

  13. Energy minimization - calculations: Total free energy of a specific conformation for a specific RNA molecule = sum of incremental energy terms for: • helical stacking (sequence dependent) • loop initiation • unpaired stacking (favorable "increments" are < 0) Fig 6.3 Baxevanis & Ouellette 2005
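
The "sum of increments" idea can be sketched for a single hairpin as follows. This is only an illustration of the bookkeeping: the numbers in both tables are invented placeholders, NOT the experimentally derived nearest-neighbor parameters, and the function and table names are hypothetical. Unpaired-stacking terms are omitted for brevity.

```python
# Placeholder increments in kcal/mol, invented for illustration only.
# Real programs use experimentally derived nearest-neighbor tables.
STACKING = {
    ("GC", "GC"): -3.0,
    ("GC", "AU"): -2.0,
    ("AU", "AU"): -1.0,
}
LOOP_INITIATION = {3: 4.5, 4: 4.0, 5: 4.5}  # keyed by hairpin loop size


def hairpin_free_energy(stem_pairs, loop_size):
    """Sum stacking increments for adjacent base pairs plus one loop penalty.

    stem_pairs: base pairs from the open end of the stem toward the loop,
                e.g. ["GC", "GC", "AU"].
    loop_size:  number of unpaired nucleotides in the hairpin loop.
    """
    total = 0.0
    for upper, lower in zip(stem_pairs, stem_pairs[1:]):
        # Look the stack up in either orientation (a simplification).
        key = (upper, lower) if (upper, lower) in STACKING else (lower, upper)
        total += STACKING[key]          # favorable terms are negative
    total += LOOP_INITIATION[loop_size]  # unfavorable term is positive
    return total


if __name__ == "__main__":
    print(hairpin_free_energy(["GC", "GC", "AU"], loop_size=4))
```

Two favorable stacks (-3.0 and -2.0) are partly offset by the positive loop-initiation term, which is the general pattern: helices pay for the loops that close them.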

  14. But how many possible conformations for a single RNA molecule? • Huge number: Zuker estimates (1.8)^N possible secondary structures for a sequence of N nucleotides; for 100 nts (small RNA…) = 3 × 10^25 structures! • Solution? Not exhaustive enumeration… • Dynamic programming: O(N^3) in time, O(N^2) in space/storage iff pseudoknots excluded; otherwise O(N^6) time, O(N^4) space
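
The sketch below first reproduces the (1.8)^N estimate for N = 100, then shows why dynamic programming avoids enumeration. Note the swap: instead of free energy minimization it uses the simpler Nussinov-style recursion that maximizes the number of base pairs. It is not the Mfold algorithm, but it has the same O(N^3) time and O(N^2) space behavior and likewise excludes pseudoknots. The function names and the `min_loop` parameter are assumptions made for this example.

```python
def estimated_structure_count(n_nucleotides, base=1.8):
    """Rough count of secondary structures using the (1.8)^N estimate."""
    return base ** n_nucleotides


ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
                 ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Nussinov-style DP: maximum number of nested (pseudoknot-free) pairs.

    dp[i][j] holds the best score for the subsequence seq[i..j].
    min_loop is the minimum number of unpaired bases enclosed by a pair.
    """
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]            # O(N^2) storage
    for span in range(min_loop + 1, n):         # fill by increasing length
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]                 # case 1: j is unpaired
            for k in range(i, j - min_loop):    # case 2: j pairs with k
                if (seq[k], seq[j]) in ALLOWED_PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


if __name__ == "__main__":
    print(f"{estimated_structure_count(100):.1e} structures for N = 100")
    print(max_base_pairs("GGGAAAUCC"))
```

Three nested loops over i, j and k give the cubic running time, while the table itself is quadratic in the sequence length.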

  15. Algorithms based on energy minimization • For outline of algorithm used in Mfold, including description of dynamic programming recursion, please visit Michael Zuker's lecture: http://www.bioinfo.rpi.edu/~zukerm/lectures/RNAfold-html • From this site, you may also download Zuker's lecture as either PDF or PS file.

  16. 2) Comparative sequence analysis (co-variation) Two basic approaches: • Algorithms constrained by initial alignment: much faster, but not as robust as unconstrained; base-pairing probabilities determined by a partition function • Algorithms not constrained by initial alignment: genetic algorithms often used for finding an alignment & set of structures
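
One common way to quantify co-variation between two columns of an RNA alignment is mutual information; the slide does not name a specific measure, so treat this as an assumed example rather than the method used by any particular program. The function name and the toy columns are hypothetical.

```python
import math
from collections import Counter


def column_mutual_information(col_i, col_j):
    """Mutual information (bits) between two alignment columns.

    col_i and col_j are equal-length strings holding the residue each
    sequence has at the two positions. High values suggest the positions
    change together, as expected for a conserved base pair.
    """
    if len(col_i) != len(col_j) or not col_i:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_i)
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


if __name__ == "__main__":
    # Toy columns: every sequence keeps a Watson-Crick pair.
    print(column_mutual_information("GCAU", "CGUA"))
    # Fully conserved columns carry no co-variation signal.
    print(column_mutual_information("GGGG", "CCCC"))
```

A perfectly conserved pair scores zero, which is why co-variation analysis needs alignments with enough sequence diversity.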

  17. RNA structure prediction strategies: Tertiary structure prediction • Requires "craft" & significant user input & insight • Extensive comparative sequence analysis to predict tertiary contacts (co-variation), e.g., MANIP - Westhof • Use experimental data to constrain model building, e.g., MC-SYM - Major • Homology modeling using sequence alignment & reference tertiary structure (not many of these!) • Low resolution molecular mechanics, e.g., yammp - Harvey

  18. New Last Time: Protein Structure & Function

  19. Protein Structure & Function • Protein structure - primarily determined by sequence • Protein function - primarily determined by structure • Globular proteins: compact hydrophobic core & hydrophilic surface • Membrane proteins: special hydrophobic surfaces • Folded proteins are only marginally stable • Some proteins do not assume a stable "fold" until they bind to something = intrinsically disordered • Predicting protein structure and function can be very hard (& fun!)

  20. 4 Basic Levels of Protein Structure

  21. Primary & Secondary Structure • Primary • Linear sequence of amino acids • Description of covalent bonds linking aa's • Secondary • Local spatial arrangement of amino acids • Description of short-range non-covalent interactions • Periodic structural patterns: α-helix, β-sheet

  22. Tertiary & Quaternary Structure • Tertiary • Overall 3-D "fold" of a single polypeptide chain • Spatial arrangement of 2° structural elements; packing of these into compact "domains" • Description of long-range non-covalent interactions (plus disulfide bonds) • Quaternary • In proteins with > 1 polypeptide chain, spatial arrangement of subunits

  23. "Additional" Structural Levels • Super-secondary elements • Motifs • Domains • Foldons

  24. New Today: • Protein Structure & Function • Amino acid characteristics • Structural classes & motifs • Protein functions & functional families (not much - more on this later) • Classification • Databases • Visualization

  25. Amino Acids • Each of the 20 different amino acids has a different "R-group" (side chain) attached to Cα

  26. Peptide bond is rigid and planar

  27. Hydrophobic Amino Acids

  28. Charged Amino Acids

  29. Polar Amino Acids
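
The three classes named on slides 27-29 can be sketched as a lookup table. The slides' own figures are not recoverable from this transcript, so the grouping below is an assumption based on one common textbook convention; other sources place Gly, His, Cys, Tyr or Pro differently. All names in the code are hypothetical.

```python
# One common grouping by one-letter code (an assumption; conventions vary).
HYDROPHOBIC = set("AVLIMFWP")
CHARGED = set("DEKR")
POLAR = set("STNQYCHG")


def classify(residue):
    """Return 'hydrophobic', 'charged' or 'polar' for a one-letter code."""
    code = residue.upper()
    if code in HYDROPHOBIC:
        return "hydrophobic"
    if code in CHARGED:
        return "charged"
    if code in POLAR:
        return "polar"
    raise ValueError(f"not a standard amino acid code: {residue!r}")


def composition(sequence):
    """Count how many residues of a sequence fall into each class."""
    counts = {"hydrophobic": 0, "charged": 0, "polar": 0}
    for residue in sequence:
        counts[classify(residue)] += 1
    return counts


if __name__ == "__main__":
    print(composition("MKLSE"))
```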

  30. Certain side-chain configurations are energetically favored (rotamers) • Ramachandran plot: "Allowable" psi & phi angles
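
The phi and psi angles plotted in a Ramachandran plot are dihedral (torsion) angles defined by four consecutive backbone atoms: phi uses C(i-1), N(i), Cα(i), C(i) and psi uses N(i), Cα(i), C(i), N(i+1). A minimal sketch of the dihedral calculation follows, using plain tuples for coordinates. The function names are hypothetical and the test points are toy coordinates, not atoms from a real structure.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _scale(a, s):
    return (a[0] * s, a[1] * s, a[2] * s)


def dihedral_degrees(p0, p1, p2, p3):
    """Signed torsion angle (degrees) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b1 = _scale(b1, 1.0 / math.sqrt(_dot(b1, b1)))  # unit vector along the axis
    # Components of the outer bonds perpendicular to the central bond.
    v = _sub(b0, _scale(b1, _dot(b0, b1)))
    w = _sub(b2, _scale(b1, _dot(b2, b1)))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Toy geometry: outer bonds at a right angle when viewed down the axis.
    print(dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Computing this for every residue of a structure and plotting psi against phi gives the Ramachandran plot.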

  31. Glycine is the smallest amino acid: R group = H atom • Glycine residues increase backbone flexibility because the R group is only an H atom

  32. Proline is cyclic • Proline residues reduce flexibility of polypeptide chain • Proline cis-trans isomerization is often a rate-limiting step in protein folding • Recent work suggests it may also regulate ligand binding in native proteins - Andreotti

  33. Cysteines can form disulfide bonds • Disulfide bonds (covalent) stabilize 3-D structures • In eukaryotes, disulfide bonds are found only in secreted proteins or extracellular domains

  34. Globular proteins have a compact hydrophobic core • Packing of hydrophobic side chains into interior is main driving force for folding • Problem? Polypeptide backbone is highly polar (hydrophilic) due to polar -NH and C=O in each peptide unit; these polar groups must be neutralized • Solution? Form regular secondary structures, e.g., α-helix, β-sheet, stabilized by H-bonds

  35. Exterior surface of globular proteins is generally hydrophilic • Hydrophobic core formed by packed secondary structural elements provides compact, stable core • "Functional groups" of protein are attached to this framework; exterior has more flexible regions (loops) and polar/charged residues • Hydrophobic "patches" on protein surface are often involved in protein-protein interactions

  36. Protein Secondary Structures • α-Helices • β-Sheets • Loops • Coils

  37. a- Helix • Most abundant 2' structure in proteins • Average length = 10 aa's (~10 Angstroms) • Length varies from 5-40 aa's • Alignment of H-bonds creates dipole moment (positive charge at NH end) • Often at surface of core, with hydrophobic residues on inner-facing side, hydrophilic on other side D Dobbs ISU - BCB 444/544X: Protein Structure: Classification, Databases, Visualization

  38. α-helix is stabilized by H-bonds between ~ every 4th residue (C = black, O = red, N = blue)

  39. R-groups are on outside of helix

  40. Types of α-helices • "Standard" α-helix: 3.6 residues per turn • H-bonds between C=O of residue n and NH of residue n + 4 • Helix ends are polar; almost always on surface of protein • Other types of helices? • n + 5 = π helix • n + 3 = 3_10 helix
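
The two numbers on this slide (3.6 residues per turn, H-bonds from residue n to n + 4) are enough to sketch simple helix geometry: each residue is rotated 360 / 3.6 = 100 degrees from the previous one, and changing the offset to 3 or 5 lists the H-bond partners of a 3_10 or π helix. The function names below are hypothetical.

```python
RESIDUES_PER_TURN = 3.6  # standard alpha-helix, from the slide


def wheel_angle(position, residues_per_turn=RESIDUES_PER_TURN):
    """Angular position (degrees, 0-360) of a residue viewed down the helix
    axis, with residue 1 placed at 0 degrees."""
    degrees_per_residue = 360.0 / residues_per_turn
    return ((position - 1) * degrees_per_residue) % 360.0


def backbone_hbonds(n_residues, offset=4):
    """List (n, n + offset) pairs: C=O of residue n bonded to NH of n + offset.

    offset=4 is the alpha-helix; 3 and 5 give the 3_10 and pi helix patterns.
    """
    return [(n, n + offset) for n in range(1, n_residues - offset + 1)]


if __name__ == "__main__":
    print(backbone_hbonds(8))
    for pos in range(1, 9):
        print(pos, round(wheel_angle(pos)))
```

Residues that land on the same side of the wheel (for example positions 1, 4, 5 and 8) face the same direction, which is how a helix at the protein surface can present a hydrophobic face to the core and a hydrophilic face to solvent.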

  41. Certain amino acids are "preferred" & others are rare in α-helices • Ala, Glu, Leu, Met = good helix formers • Pro, Gly, Tyr, Ser = very poor • Amino acid composition & distribution varies, depending on location of helix in 3-D structure

  42. β-Strands & Sheets • H-bonds formed between 5-10 consecutive residues in one portion of chain with another set of 5-10 residues farther down chain • Interacting regions may be adjacent (with short loop between) or far apart • β-sheets usually have all strands either parallel or antiparallel

  43. Antiparallel β-sheet

  44. Antiparallel β-sheet

  45. Parallel β-sheet

  46. Mixed β-sheets also occur

  47. Loops • Connect helices and sheets • Vary in length and 3-D configurations • Are located on surface of structure • Are more "tolerant" of mutations • Are more flexible and can adopt multiple conformations • Tend to have charged and polar amino acids • Are frequently components of active sites • Some fall into distinct structural families (e.g., hairpin loops, reverse turns)

  48. Coils • Regions of 2° structure that are not helices, sheets, or recognizable turns • Intrinsically disordered regions appear to play important functional roles

  49. Globular proteins are built from recurring structural patterns • Motifs or supersecondary structures = combinations of 2° structural elements • Domains = combinations of motifs • Independently folding unit (foldon) • Functional unit

  50. A few common structural motifs • Helix-turn-helix: e.g., DNA binding • Helix-loop-helix: e.g., Calcium binding • β-hairpin: 2 adjacent antiparallel strands connected by short loop • Greek key: 4 adjacent antiparallel strands • β-α-β: 2 parallel strands connected by helix
