Proteins secondary structure predictions
Structural Bioinformatics

Proteins Secondary Structure Predictions


Structure Prediction Motivation

  • Better understand protein function

  • Broaden homology

    • Detect similar function where sequence differs (only ~50% of remote homologies can be detected based on sequence)

  • Explain disease

    • Explain the effect of mutations

    • Design drugs


Myoglobin – the first high-resolution protein structure

Solved in 1958 by John Kendrew and co-workers at Cambridge University.

Kendrew and Max Perutz won the 1962 Nobel Prize in Chemistry.

“Perhaps the most remarkable features of the molecule are its complexity and its lack of symmetry. The arrangement seems to be almost totally lacking in the kind of regularities which one instinctively anticipates.”


MERFGYTRAANCEAP….

Predicting the three-dimensional structure of a protein from its sequence is very hard (sometimes impossible).

However, we can predict the secondary structure with relatively high accuracy.


What do we mean by Secondary Structure?

Secondary structure elements are the building blocks of the protein structure.


Secondary structure is usually divided into three categories:

  • Alpha helix

  • Beta strand (sheet)

  • Anything else – turn/loop


Alpha Helix: Pauling (1951)

  • A consecutive stretch of 5-40 amino acids (average 10).

  • A right-handed spiral conformation.

  • 3.6 amino acids per turn.

  • Stabilized by hydrogen bonds.

  • One turn (3.6 residues) spans 5.4 Å along the helix axis.


Beta Strand: Pauling and Corey (1951)

  • Different polypeptide chains, or different segments of one chain, run alongside each other and are linked together by hydrogen bonds.

  • Each section is called a β-strand and consists of 5-10 amino acids.


Beta Sheet

Strands lying adjacent to each other form a beta sheet. Neighboring strands can run antiparallel or parallel.

[Figure: antiparallel and parallel beta sheets, labeled with distances of 3.25 Å, 3.47 Å and 4.6 Å]


Loops

  • Connect the secondary structure elements.

  • Have various lengths and shapes.

  • Are located at the surface of the folded protein and may therefore have an important role in biological recognition processes.


Three-Dimensional Tertiary Structure

Describes the packing of alpha helices, beta sheets and random coils with respect to each other at the level of one whole polypeptide chain.


[Figure: secondary and tertiary structure of RBP and a globin]


How do the (secondary and tertiary) structures relate to the primary protein sequence?


Sequence and Structure

  • Early experiments showed that the sequence of a protein is sufficient to determine its structure (Anfinsen).

  • Protein structure is more conserved than protein sequence and more closely related to function.


How Can Different Amino Acid Sequences Determine Similar Protein Structures?

Lesk and Chothia 1980


The Globin Family

Different sequences can result in similar structures (for example, PDB entries 1ecd and 2hhd).


We can learn about the important features which determine structure and function by comparing sequences and structures.


Why is Proline 36 conserved throughout the globin family?


Where are the gaps?

The gaps in the pairwise alignment map to the loop regions.


How are remote homologs related in terms of their structure?

[Figure: structures of retinol-binding protein (RBP), odorant-binding protein, apolipoprotein D and β-lactoglobulin]


PSI-BLAST alignment of RBP and β-lactoglobulin: iteration 3

Score = 159 bits (404), Expect = 1e-38

Identities = 41/170 (24%), Positives = 69/170 (40%), Gaps = 19/170 (11%)

Query: 3 WVWALLLLAAWAAAERD--------CRVSSFRVKENFDKARFSGTWYAMAKKDPEGLFLQ 54

V L+ LA A + S V+ENFD ++ G WY + K

Sbjct: 1 MVTMLMFLATLAGLFTTAKGQNFHLGKCPSPPVQENFDVKKYLGRWYEIEKIPASFE-KG 59

Query: 55 DNIVAEFSVDETGQMSATAKGRVRLLNNWDVCADMVGTFTDTEDPAKFKMKYWGVASFLQ 114

+ I A +S+ E G + K V + ++ +PAK +++++ +

Sbjct: 60 NCIQANYSLMENGNIEVLNKELSPDGTMNQVKGE--AKQSNVSEPAKLEVQFFPL----- 112

Query: 115 KGNDDHWIVDTDYDTYAVQYSCRLLNLDGTCADSYSFVFSRDPNGLPPEA 164

+WI+ TDY+ YA+ YSC + ++ R+P LPPE

Sbjct: 113 MPPAPYWILATDYENYALVYSCTTFFWL--FHVDFFWILGRNPY-LPPET 159
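The identity and gap counts in a BLAST-style header can be recomputed from the aligned strings. The sketch below is a minimal illustration, assuming two equal-length aligned strings with `-` as the gap character; `alignment_stats` is a hypothetical helper name, not part of any BLAST tool. It is applied only to the last block of the alignment above, so its counts cover 50 columns and not the full 170.

```python
def alignment_stats(query, subject):
    """Count identical columns and gap columns in two equal-length aligned strings."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    pairs = list(zip(query, subject))
    # A column is an identity when both residues match and neither is a gap.
    identities = sum(q == s and q != "-" for q, s in pairs)
    # A column is a gap column when either sequence has a gap character.
    gaps = sum(q == "-" or s == "-" for q, s in pairs)
    return identities, gaps, len(pairs)


# Last block of the alignment (Query 115-164, Sbjct 113-159).
query = "KGNDDHWIVDTDYDTYAVQYSCRLLNLDGTCADSYSFVFSRDPNGLPPEA"
subject = "MPPAPYWILATDYENYALVYSCTTFFWL--FHVDFFWILGRNPY-LPPET"

identities, gaps, length = alignment_stats(query, subject)
print(f"{identities}/{length} identical, {gaps}/{length} gap columns")
```

Counting "Positives" as well would additionally need a substitution matrix such as BLOSUM62, which is left out here.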


[Figure: structures of the retinol-binding protein and β-lactoglobulin]


Structure Prediction: Motivation

  • Hundreds of thousands of gene sequences have been translated to proteins (GenBank, Swiss-Prot, PIR)

  • Only about 50,000 solved protein structures

  • Experimental methods are time consuming and not always possible

  • Goal: Predict protein structure based on sequence information


Prediction Approaches

  • Two-stage

    1. Primary (sequence) to secondary structure

    2. Secondary to tertiary structure

  • One-stage

    - Primary to tertiary structure


According to the most simplified model:

  • In a first step, the secondary structure is predicted based on the sequence.

  • The secondary structure elements are then arranged to produce the tertiary structure, i.e. the structure of a protein chain.

  • For molecules which are composed of different subunits, the protein chains are arranged to form the quaternary structure.


Secondary Structure Prediction

  • Given a primary sequence

    ADSGHYRFASGFTYKKMNCTEAA

    what secondary structure will it adopt?


Secondary Structure Prediction Methods

  • Chou-Fasman / GOR Method

    • Based on amino acid frequencies

  • Machine learning methods

    • PHDsec and PSIpred

  • HMM (Hidden Markov Model)


Chou and Fasman (1974)

Name            P(a)   P(b)   P(turn)
Alanine          142     83       66
Arginine          98     93       95
Aspartic Acid    101     54      146
Asparagine        67     89      156
Cysteine          70    119      119
Glutamic Acid    151     37       74
Glutamine        111    110       98
Glycine           57     75      156
Histidine        100     87       95
Isoleucine       108    160       47
Leucine          121    130       59
Lysine           114     74      101
Methionine       145    105       60
Phenylalanine    113    138       60
Proline           57     55      152
Serine            77     75      143
Threonine         83    119       96
Tryptophan       108    137       96
Tyrosine          69    147      114
Valine           106    170       50

The values give the propensity of an amino acid to be part of a certain secondary structure (e.g. Proline has a low propensity of being in an alpha helix or beta sheet → breaker)

Success rate of about 50%
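To make the use of the propensity table concrete, here is a minimal sketch that averages the three propensities over a short segment and reports the favoured state. The one-letter keys are the standard codes for the amino acids listed above. The helper names (`mean_propensity`, `favoured_state`) and the "highest mean wins" rule are illustrative assumptions and much simpler than the full Chou-Fasman procedure.

```python
# Propensities from the Chou-Fasman table: one-letter code -> (P(a), P(b), P(turn)).
PROPENSITY = {
    "A": (142, 83, 66),  "R": (98, 93, 95),   "D": (101, 54, 146),
    "N": (67, 89, 156),  "C": (70, 119, 119), "E": (151, 37, 74),
    "Q": (111, 110, 98), "G": (57, 75, 156),  "H": (100, 87, 95),
    "I": (108, 160, 47), "L": (121, 130, 59), "K": (114, 74, 101),
    "M": (145, 105, 60), "F": (113, 138, 60), "P": (57, 55, 152),
    "S": (77, 75, 143),  "T": (83, 119, 96),  "W": (108, 137, 96),
    "Y": (69, 147, 114), "V": (106, 170, 50),
}
STATES = ("helix", "strand", "turn")


def mean_propensity(segment):
    """Average P(a), P(b) and P(turn) over a segment, keyed by state name."""
    totals = [0, 0, 0]
    for residue in segment:
        for i, value in enumerate(PROPENSITY[residue]):
            totals[i] += value
    return {state: total / len(segment) for state, total in zip(STATES, totals)}


def favoured_state(segment):
    """Return the state with the highest mean propensity (simplified rule)."""
    means = mean_propensity(segment)
    return max(means, key=means.get)


# Hypothetical test peptides, chosen only to illustrate the three states.
for peptide in ("AEMLK", "VIYVT", "GNPSG"):
    print(peptide, favoured_state(peptide), mean_propensity(peptide))
```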


Secondary Structure Method Improvements

‘Sliding window’ approach

  • Most alpha helices are ~12 residues long

  • Most beta strands are ~6 residues long

  • Look at all windows of size 6/12

  • Calculate a score for each window. If the score exceeds a threshold → predict that the window is an alpha helix/beta sheet

TGTAGPOLKCHIQWMLPLKK
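The sliding-window idea can be sketched as follows, using the P(a) column of the Chou-Fasman table as the per-residue score. The threshold of 100, the neutral score of 100 for letters missing from the table (such as the `O` in the example sequence) and the function names are assumptions made for illustration, not values from the presentation.

```python
# Helix propensities P(a) from the Chou-Fasman table, keyed by one-letter code.
HELIX = dict(zip(
    "ARDNCEQGHILKMFPSTWYV",
    [142, 98, 101, 67, 70, 151, 111, 57, 100, 108,
     121, 114, 145, 113, 57, 77, 83, 108, 69, 106],
))


def window_scores(sequence, propensity, size, neutral=100):
    """Mean propensity of every window of `size` residues, in start order."""
    scores = []
    for start in range(len(sequence) - size + 1):
        window = sequence[start:start + size]
        # Letters not in the table get a neutral score (assumption).
        total = sum(propensity.get(res, neutral) for res in window)
        scores.append(total / size)
    return scores


def predict(sequence, propensity, size, threshold, symbol):
    """Label every residue covered by a window whose score exceeds `threshold`."""
    labels = ["-"] * len(sequence)
    for start, score in enumerate(window_scores(sequence, propensity, size)):
        if score > threshold:
            labels[start:start + size] = symbol * size
    return "".join(labels)


seq = "TGTAGPOLKCHIQWMLPLKK"
print(seq)
print(predict(seq, HELIX, size=12, threshold=100, symbol="H"))
```

The same function can be run with a window of 6 and the P(b) column to mark candidate beta strands.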


Improvements since the 1980s

  • Adding information from conservation in multiple sequence alignments (MSA)

  • Smarter algorithms (e.g. machine learning, HMM)

Success rate → 75%-80%


Machine learning approach for predicting Secondary Structure (PHD, PSIpred)

Step 1:

Generating a multiple sequence alignment

[Figure: the query is searched against SwissProt and aligned with the matching subject sequences]


Step 2:

Additional sequences are added using a profile. We end up with an MSA which represents the protein family.

[Figure: the seed MSA of the query and subject sequences]
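A sequence profile summarizes each column of the MSA. The sketch below computes plain per-column residue frequencies for a toy alignment. The alignment and the `column_profile` helper are hypothetical, and real tools such as PSI-BLAST build weighted log-odds scoring matrices, not raw frequencies, so this only illustrates the general idea.

```python
from collections import Counter


def column_profile(msa):
    """Per-column residue frequencies for equal-length aligned sequences.

    Gap characters ('-') are ignored when counting.
    """
    length = len(msa[0])
    if any(len(row) != length for row in msa):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for column in zip(*msa):
        counts = Counter(res for res in column if res != "-")
        total = sum(counts.values())
        profile.append({res: n / total for res, n in counts.items()} if total else {})
    return profile


# Hypothetical toy alignment, not taken from the presentation.
toy_msa = ["MKV-LA", "MKVQLA", "MRIQLG", "MKV-IA"]
for position, freqs in enumerate(column_profile(toy_msa), start=1):
    print(position, freqs)
```

Conserved columns show a single dominant residue, while variable columns spread their frequency over several residues.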


Step 3:

The sequence profile of the protein family is compared (by machine learning methods) to sequences with known secondary structure.

[Figure: the MSA built from the seed and a set of known structures are fed to the machine learning approach]


HMM approach for predicting Secondary Structure (SAM)

  • An HMM enables us to calculate the probability of assigning a sequence to a secondary structure

TGTAGPOLKCHIQWML
HHHHHHHLLLLBBBBB

p = ?


The HMM table holds three kinds of probabilities, for example:

  • The probability of beginning with an α-helix

  • The probability of observing Alanine as part of a β-sheet

  • The probability of an α-helix residue being followed by another α-helix residue

  • The probability of observing a residue which belongs to an α-helix followed by a residue belonging to a turn = 0.15

Table built according to a large database of known secondary structures


  • The above table enables us to calculate the probability of assigning a secondary structure to a protein

  • Example

TGQ
HHH

p = 0.45 × 0.041 × 0.8 × 0.028 × 0.8 × 0.0635 ≈ 0.000020995
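The product in the example is the joint probability of a residue sequence and a state path under an HMM: a start probability, then alternating emission and transition probabilities. The sketch below is a minimal version of that calculation. The assignment of each factor to a parameter type (0.45 as the start probability for helix, 0.8 as the helix-to-helix transition, the rest as emissions for T, G and Q) is an assumption read off the example, and `path_probability` is a hypothetical helper name.

```python
def path_probability(sequence, states, start, transition, emission):
    """Joint probability of a residue sequence and a state path under an HMM."""
    if len(sequence) != len(states):
        raise ValueError("sequence and state path must have the same length")
    # First position: start probability times emission probability.
    p = start[states[0]] * emission[states[0]][sequence[0]]
    # Each later position: transition from the previous state, then emission.
    for i in range(1, len(sequence)):
        p *= transition[states[i - 1]][states[i]]
        p *= emission[states[i]][sequence[i]]
    return p


# Only the entries needed for the TGQ / HHH example (assumed mapping).
start = {"H": 0.45}
transition = {"H": {"H": 0.8}}
emission = {"H": {"T": 0.041, "G": 0.028, "Q": 0.0635}}

p = path_probability("TGQ", "HHH", start, transition, emission)
print(f"p = {p:.4e}")
```

With a complete table, the same function can score any candidate state path, and the most probable path is usually found with the Viterbi algorithm instead of by enumerating all paths.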

