Protein structure prediction

Mason Bially

Types of Structure

  • Primary Structure

    • The linear amino acid sequence.

  • Secondary Structure

    • The local three-dimensional structure.

    • Defined by hydrogen bonding patterns.

  • Tertiary Structure

    • The global three-dimensional structure.

    • Defined in atomic coordinates.

    • Largely determines the protein’s function.

  • Quaternary Structure

    • The arrangement of multiple protein chains (subunits) in a complex.

How do we find Secondary Structure?

  • A couple of algorithms:

    • DSSP (Original, Slight Errors)

    • STRIDE (Newer, Sliding Window)

  • Both require the primary and tertiary structure.

    • Because they work from known atomic coordinates, they assign secondary structure rather than predict it.

  • Finds hydrogen bonds.

    • Uses potential energy functions.

      • Based on amino acid locations and orientations.

      • STRIDE’s energy function is slightly more accurate.

    • Returns one of 8 types of secondary structure for each amino acid.

      • 3 helix types

      • 2 beta-sheet types

      • 2 turn types

      • and ‘other’
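
The hydrogen-bond test can be illustrated with the electrostatic energy used by DSSP (the Kabsch-Sander formula). This is a minimal sketch, assuming backbone atom coordinates in angstroms; the function names `hbond_energy` and `is_hbond` are hypothetical, and STRIDE uses a different, empirical energy function that is not shown here.

```python
import math

# Kabsch-Sander constants: partial charges of 0.42e and 0.20e and a
# dimensional factor of 332, so energies come out in kcal/mol.
Q1, Q2, F = 0.42, 0.20, 332.0
HBOND_CUTOFF = -0.5  # kcal/mol; below this, DSSP counts a hydrogen bond


def hbond_energy(c, o, n, h):
    """Electrostatic energy between a C=O group and an N-H group.

    Each argument is an (x, y, z) tuple in angstroms.
    """
    r_on = math.dist(o, n)
    r_ch = math.dist(c, h)
    r_oh = math.dist(o, h)
    r_cn = math.dist(c, n)
    return Q1 * Q2 * F * (1 / r_on + 1 / r_ch - 1 / r_oh - 1 / r_cn)


def is_hbond(c, o, n, h):
    """True when the pair passes the DSSP energy cutoff."""
    return hbond_energy(c, o, n, h) < HBOND_CUTOFF


# Illustrative (made-up) collinear geometry: C=O ... H-N with O...H = 1.9 A.
near = dict(c=(0, 0, 0), o=(1.23, 0, 0), h=(3.13, 0, 0), n=(4.14, 0, 0))
print(is_hbond(**near))
```

The patterns of hydrogen bonds found this way (for example, repeated i to i+4 bonds) are what the algorithm then maps to helix, sheet, and turn types.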

X-Ray Crystallography

  • Shoot X-rays through a protein crystal; the angles and intensities of the diffracted X-rays are used to determine the structure.

  • Some proteins are challenging to crystallize (e.g., integral membrane proteins).

  • Can handle arbitrarily large sizes.

NMR Protein Spectroscopy

  • Uses nuclear magnetic resonance, a phenomenon in which atomic nuclei in a magnetic field absorb and re-emit electromagnetic radiation.

  • Has difficulty with large proteins.

  • Works on almost anything. (Including proteins with unstable tertiary structure)

Why do we need Structure Prediction?

  • Finding tertiary structure experimentally has problems.

    • Slow and difficult.

    • Some protein structures can’t be determined experimentally.

  • We need to cover more ground, quicker.

    • Drug design.

    • Bioinformatics tool development.

    • More detailed interactome information.

But isn’t it computationally hard?

  • Yes.

  • Secondary structure prediction.

    • Machine learning methods.

  • Tertiary structure prediction.

    • Homology Modeling

    • Fold Recognition (AKA Protein Threading)

    • From scratch (AKA de novo, AKA ab initio)

Basis for Prediction (Comparative Modeling)

  • Protein structure (Secondary and Tertiary) is evolutionarily more conserved than the DNA or amino acid sequence.

    • Structure determines function; changing it would prevent the protein from doing its job.

  • Therefore evolutionarily related proteins will probably share structure with each other.

Secondary Structure Prediction

  • Early attempts. (~60% accuracy)

    • Chou-Fasman

      • Uses the propensity of each amino acid to occur in each type of secondary structure.

    • GOR

      • Bayesian inference applied to the same basic idea.

  • Machine learning methods. (~70% accuracy)

    • Neural networks.

    • Support vector machines.

    • Hidden Markov models.

  • Future.

    • Secondary structure also depends on the environment the protein folds in.

    • Including this information may improve prediction methods.
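
The propensity idea behind Chou-Fasman can be sketched in a few lines. This is a simplified illustration, not the published algorithm: the training data is made up, the three-state labels (H, E, C) are an assumption, and the windowed `predict` function is a hypothetical stand-in for the method's nucleation and extension rules.

```python
from collections import Counter

# Toy training data (made up for illustration): sequences paired with
# per-residue labels, H = helix, E = strand, C = other.
TRAINING = [
    ("AEELLKKA", "HHHHHHHC"),
    ("VIVTVYGS", "EEEEEECC"),
    ("GPSAELLG", "CCCHHHHC"),
]


def propensities(examples):
    """Return {(amino_acid, state): P(state | amino_acid) / P(state)}."""
    pair_counts = Counter()
    aa_counts = Counter()
    state_counts = Counter()
    for seq, labels in examples:
        for aa, state in zip(seq, labels):
            pair_counts[aa, state] += 1
            aa_counts[aa] += 1
            state_counts[state] += 1
    total = sum(state_counts.values())
    return {
        (aa, state): (pair_counts[aa, state] / aa_counts[aa])
        / (state_counts[state] / total)
        for aa in aa_counts
        for state in state_counts
    }


def predict(seq, table, window=3):
    """Label each residue with the state whose mean propensity over a
    centered window is highest. Unseen residues get a neutral 1.0."""
    states = sorted({state for _, state in table})
    half = window // 2
    result = []
    for i in range(len(seq)):
        segment = seq[max(0, i - half): i + half + 1]
        scores = {
            s: sum(table.get((aa, s), 1.0) for aa in segment) / len(segment)
            for s in states
        }
        result.append(max(scores, key=scores.get))
    return "".join(result)


table = propensities(TRAINING)
print(predict("AELLKVIV", table))
```

A propensity above 1 means the amino acid occurs in that state more often than chance would suggest. GOR builds on the same counts but conditions on neighboring residues as well.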

Homology Modeling

  • Requires primary structure and a template tertiary structure.

    • Relies on the idea that if one protein has a specific structure, so do other proteins with similar sequences.

  • Only works with relatively similar sequences.

    • Sequence identity above 50% gives high-quality models.

      • Comparable to low-resolution X-ray crystallography.

    • Sequence identity above 30% gives medium-quality models.

      • Below that, model quality degrades rapidly.

    • Limited by availability of suitable templates.

    • Limited by the ability to accurately align and choose distant templates.

  • Sometimes function/structure will diverge for seemingly similar targets and templates.

    • The method will still generate a model from an incorrect template.
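
The identity thresholds can be made concrete with a small helper. This is a sketch under stated assumptions: identity is computed over aligned, non-gap columns (other definitions use a different denominator), and the function names and example sequences are hypothetical.

```python
def percent_identity(aligned_target, aligned_template, gap="-"):
    """Percent of aligned (non-gap) columns with identical residues."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_target, aligned_template)
        if a != gap and b != gap
    ]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


def expected_quality(identity):
    """Map sequence identity to the rough tiers listed above."""
    if identity > 50:
        return "high"
    if identity > 30:
        return "medium"
    return "low"


# Made-up aligned pair: 8 gap-free columns, 6 of them identical.
identity = percent_identity("MKT-AYIAK", "MKTQAYLAR")
print(identity, expected_quality(identity))
```

The tiers are only a rule of thumb; a high identity over a short or misaligned region can still lead to a poor model.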

Homology Modeling

  • Template selection and Sequence alignment

    • Crucial, but relatively simple if a similar sequence exists (BLAST).

    • For edge cases:

      • PSI-BLAST, HMM, or profile-profile alignment.

  • Model Generation

    • Multiple methods.

    • Construct the model by placing the amino acids where the aligned template suggests.

    • Then refine by going back to the chemistry/physics and fixing errors.

  • Model Assessment

    • Make sure the resulting fold is correct.

    • Detects errors in alignments and template selection.

    • Sometimes chooses the best of many potential models.

Fold Recognition (AKA Protein Threading)

  • Requires primary structure and a library of tertiary structures.

    • Relies on the idea that there are (relatively) few folds (tertiary structure) of proteins.

  • Often feeds final structure back to Homology Modeling techniques as template to get final model.

  • Can use a number of different scoring algorithms.

    • Most popular is free energy.

  • Attempts to find which templates in the library minimize the scoring function.

    • Threading

    • Dynamic Programming. (Optimization technique)

    • Machine Learning.

  • Often finds a large number of results.
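
A toy version of threading with dynamic programming might look like the following. Everything here is an assumption made for illustration: templates are reduced to strings of buried (`B`) and exposed (`E`) sites, the pseudo-energy in `site_energy` is invented, and the gap penalty is arbitrary. Real threading programs use far richer scoring functions, such as statistical potentials.

```python
HYDROPHOBIC = set("AVILMFWC")
GAP = 1.0  # hypothetical penalty for skipping a residue or template site


def site_energy(aa, env):
    """Toy pseudo-energy: hydrophobic residues prefer buried ('B') sites,
    polar residues prefer exposed ('E') sites. Lower is better."""
    fits = (aa in HYDROPHOBIC) == (env == "B")
    return -1.0 if fits else 1.0


def thread(seq, template):
    """Minimum total energy of globally aligning seq onto a template,
    computed with Needleman-Wunsch style dynamic programming."""
    rows, cols = len(seq) + 1, len(template) + 1
    best = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        best[i][0] = i * GAP
    for j in range(1, cols):
        best[0][j] = j * GAP
    for i in range(1, rows):
        for j in range(1, cols):
            best[i][j] = min(
                best[i - 1][j - 1] + site_energy(seq[i - 1], template[j - 1]),
                best[i - 1][j] + GAP,
                best[i][j - 1] + GAP,
            )
    return best[-1][-1]


def rank_templates(seq, library):
    """Sort template names from lowest (best) to highest energy."""
    return sorted(library, key=lambda name: thread(seq, library[name]))


# Made-up two-fold library and target sequence.
library = {"fold_a": "BBEEBB", "fold_b": "EEBBEE"}
print(rank_templates("VLKDIA", library))
```

Ranking the whole library rather than returning a single hit mirrors the point above that threading often yields many candidate folds, which then need assessment.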

How do we know these models work?

  • CASP (Critical Assessment of Techniques for Protein Structure Prediction)

    • Every two years.

    • Runs blind tests of prediction algorithms.

      • In many different categories.

    • Since 1994.

  • Other variations.


  • Mix it all together!

  • Including evolutionary information.

    • Improves alignment.

    • Helps find better folds.

  • Structural information.

    • Predicted secondary structure can help.

  • Mixing with ab initio/de novo methods.



    • By Torsten Schwede and Manuel C Peitsch

  • Images from Wikipedia or sources.