1 / 63

CAP5510 – Bioinformatics Protein Structures

CAP5510 – Bioinformatics Protein Structures. Tamer Kahveci CISE Department University of Florida. What and Why?. Proteins fold into a three dimensional shape Structure can reveal functional information that we can not find from sequence Misfolding proteins can cause diseases

Download Presentation

CAP5510 – Bioinformatics Protein Structures

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CAP5510 – BioinformaticsProtein Structures Tamer Kahveci CISE Department University of Florida

  2. What and Why? • Proteins fold into a three dimensional shape • Structure can reveal functional information that we can not find from sequence • Misfolding proteins can cause diseases • Sickle cell anemia, mad cow disease • Used in drug design Hemoglobin Normal v.s. sickled blood cells E → V HIV protease inhibitor

  3. Goals • Understand protein structures • Primary, secondary, tertiary • Learn how protein shapes are • determined • Predicted • Structure comparison (?)

  4. A Protein Sequence >gi|22330039|ref|NP_683383.1| unknown protein; protein id: At1g45196.1 [Arabidopsis thaliana] MPSESSYKVHRPAKSGGSRRDSSPDSIIFTPESNLSLFSSASVSVDRCSSTSDAHDRDDSLISAWKEEFEVKKDDESQNL DSARSSFSVALRECQERRSRSEALAKKLDYQRTVSLDLSNVTSTSPRVVNVKRASVSTNKSSVFPSPGTPTYLHSMQKGW SSERVPLRSNGGRSPPNAGFLPLYSGRTVPSKWEDAERWIVSPLAKEGAARTSFGASHERRPKAKSGPLGPPGFAYYSLY SPAVPMVHGGNMGGLTASSPFSAGVLPETVSSRGSTTAAFPQRIDPSMARSVSIHGCSETLASSSQDDIHESMKDAATDA QAVSRRDMATQMSPEGSIRFSPERQCSFSPSSPSPLPISELLNAHSNRAEVKDLQVDEKVTVTRWSKKHRGLYHGNGSKM

  5. R O H N C C OH H H Amino Acid Composition • Basic Amino AcidStructure: • The side chain, R,varies for each ofthe 20 amino acids Side chain Aminogroup Carboxylgroup

  6. O O The Peptide Bond • Dehydration synthesis • Repeating backbone: N–C –C –N–C –C • Convention – start at amino terminus and proceed to carboxy terminus

  7. Peptidyl polymers • A few amino acids in a chain are called a polypeptide. A protein is usually composed of 50 to 400+ amino acids. • We call the units of a protein amino acid residues. amidenitrogen carbonylcarbon

  8. Side chain properties • Carbon does not make hydrogen bonds with water easily – hydrophobic • O and N are generally more likely than C to h-bond to water – hydrophilic • We group the amino acids into three general groups: • Hydrophobic • Charged (positive/basic & negative/acidic) • Polar

  9. The Hydrophobic Amino Acids

  10. The Charged Amino Acids

  11. The Polar Amino Acids

  12. More Polar Amino Acids And then there’s…

  13. Psi () – the angle of rotation about the C-C bond. Phi () – the angle of rotation about the N-C bond. The planar bond angles and bond lengths are fixed. Planarity of the Peptide Bond

  14. Primary & Secondary Structure • Primary structure = the linear sequence of amino acids comprising a protein:AGVGTVPMTAYGNDIQYYGQVT… • Secondary structure • Regular patterns of hydrogen bonding in proteins result in two patterns that emerge in nearly every protein structure known: the -helix and the-sheet • The location of direction of these periodic, repeating structures is known as the secondary structure of the protein

  15. The Alpha Helix     60°

  16. Properties of the Alpha Helix •     60° • Hydrogen bondsbetween C=O ofresidue n, andNH of residuen+4 • 3.6 residues/turn • 1.5 Å/residue rise • 100°/residue turn

  17. Properties of -helices • 4 – 40+ residues in length • Often amphipathic or “dual-natured” • Half hydrophobic and half hydrophilic • If we examine many -helices,we find trends… • Helix formers: Ala, Glu, Leu, Met • Helix breakers: Pro, Gly, Tyr, Ser

  18. The beta strand (& sheet)    135°  +135°

  19. Properties of beta sheets • Formed of stretches of 5-10 residues in extended conformation • Parallel/aniparallel,contiguous/non-contiguous

  20. Anti-Parallel Beta Sheets

  21. Parallel Beta Sheets

  22. Mixed Beta Sheets

  23. Turns and Loops • Secondary structure elements are connected by regions of turns and loops • Turns – short regions of non-, non- conformation • Loops – larger stretches with no secondary structure. • Sequences vary much more than secondary structure regions

  24. Ramachandran Plot

  25. Levels of Protein Structure • Secondary structure elements combine to form tertiary structure • Quaternary structure occurs in multienzyme complexes

  26. Protein Structure Example Beta Sheet Helix Loop ID: 12as 2 chains

  27. Views of a Protein Wireframe Ball and stick

  28. Views of a protein Spacefill Cartoon CPK colors Carbon = green, black, or grey Nitrogen = blue Oxygen = red Sulfur = yellow Hydrogen = white

  29. Common Protein Motifs

  30. Mostly Helical Folding Motifs • Four helical bundle: • Globin domain:

  31. / Motifs • / barrel:

  32. Open TwistedBeta Sheets

  33. Beta Barrels

  34. Determining the Structure of a Protein Experimental Methods X-ray NMR As of August 2013, structure of > 85,000 proteins are determined

  35. X-Ray Crystallography Discovery of X-rays (Wilhelm Conrad Röntgen, 1895) Crystals diffract X-rays in regular patterns (Max Von Laue, 1912) The first X-ray diffraction pattern from a protein crystal (Dorothy Hodgkin, 1934)

  36. X-Ray Crystallography • Grow millions of protein crystals • Takes months • Expose to radiation beam • Analyze the image with computer • Average over many copies of images • PDB • Not all proteins can be crystallized!

  37. NMR • Nuclear Magnetic Resonance • Nuclei of atoms vibrate when exposed to oscillating magnetic field • Detect vibrations by external sensors • Computes inter-atomic distances. • Requires complex analysis. NMR can be used for short sequences (<200 residues) • More than one model can be derived from NMR.

  38. Determining the Structure of a Protein Computational Methods

  39. The Protein Folding Problem • Central question of molecular biology:“Given a particular sequence of amino acid residues (primary structure), what will the secondary/tertiary/quaternary structure of the resulting protein be?” • Input: AAVIKYGCAL…Output: 11, 22…

  40. Structure v.s. Sequence • Observation: A protein with the same sequence (under the same circumstances) yields the same shape. • Protein folds into a shape that minimizes the energy needed to stay in that shape. • Protein folds in ~10-15 seconds.

  41. Secondary Structure Prediction

  42. Chou-Fasman methods • Uses statistically obtained Chou-Fasman parameters. • For each amino acid has • P(a): alpha • P(b): beta • P(t): turn • f(): additional turn parameter.

  43. Chou-Fasman Parameters

  44. C.-F. Alpha Helix Prediction (1) P(a) P(b) • Find P(a) for all letters • Find 6 contiguous letters, at least 4 of them have P(a) > 100 • Declare these regions as alpha helix

  45. C.-F. Alpha Helix Prediction (2) P(a) P(b) • Extend in both directions until 4 consecutive letters with P(a) < 100 found

  46. C.-F. Alpha Helix Prediction (3) P(a) P(b) • Find sum of P(a) (Sa) and sum of P(b) (Sb) in the extended region • If region is long enough ( >= 5 letters) and P(a) > P(b) then declare the extended region as alpha helix

  47. C.-F. Beta Sheet Prediction • Same as alpha helix replace P(a) with P(b) • Resolving overlapping alpha helix & beta sheet • Compute sum of P(a) (Sa) and sum of P(b) (Sb) in the overlap. • If Sa > Sb => alpha helix • If Sb > Sa => beta sheet

  48. C.-F. Turn Prediction P(a) P(b) P(t) f() • An amino acid is predicted as turn if all of the following holds: • f(i)*f(i+1)*f(i+2)*f(i+3) > 0.000075 • Avg(P(i+k)) > 100, for k=0, 1, 2, 3 • Sum(P(t)) > Sum(P(a)) and Sum(P(b)) for i+k, (k=0, 1, 2, 3)

  49. Other Methods for SSE Prediction • Similarity searching • Predator • Markov chain • Neural networks • PHD • ~65% to 80% accuracy

  50. Tertiary Structure Prediction

More Related