Knowledge-based protocols for protein structure prediction


Knowledge-based protocols for protein structure prediction: from protein threading to solvent accessibility prediction and back to protein structure prediction by threading. Jarek Meller, Division of Biomedical Informatics


Knowledge-based protocols for protein structure prediction: from protein threading to solvent accessibility prediction and back to protein structure prediction by threading

Jarek Meller

Division of Biomedical Informatics,

Children’s Hospital Research Foundation

& Department of Biomedical Engineering, UC

JM - http://folding.chmcc.org


Outline of the talk

  • Protein structure and complexity of conformational search: from de novo structure prediction to similarity based methods

  • Protein structure prediction by sequence-to-structure matching (threading and fold recognition)

  • Secondary structure and solvent accessibility prediction

  • Improving fold recognition and de novo simulations with accurate solvent accessibility prediction

  • A story from our backyard: predicting interaction between pVHL and RNA Pol II


Polypeptide chains: backbone and side-chains

[Figure: polypeptide chain with the N-ter and C-ter ends labeled]


Distinct chemical nature of amino acid side-chains

[Figure: polypeptide chain from N-ter to C-ter with labeled side-chains PHE, CYS, VAL, GLU, ARG]


Hydrogen bonds and secondary structures

β-strand

α-helix


Tertiary structure and long range contacts: annexin


Domains, interactions, complexes: VHL


Multiple alignment and PSSM


Protein folding problem

  • The protein folding problem consists of predicting the three-dimensional structure of a protein from its amino acid sequence

  • Hierarchical organization of protein structures helps to break the problem into secondary structure, tertiary structure and protein-protein interaction predictions

  • Computational approaches for protein structure prediction: similarity based and de novo methods


Ab initio (or de novo) folding simulations

  • Ab initio folding simulations consist of conformational search with an empirical scoring function (“force field”) to be maximized (minimized)

  • Computational bottleneck: exponential search space and sampling problem (global optimization!)

  • Fundamental problem: inaccuracy of empirical force fields and scoring functions (folding potentials)

  • Importance of mixed protocols, such as Rosetta by D. Baker and colleagues (Monte Carlo fragment assembly)
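The conformational search described in the bullets above can be illustrated with a generic Metropolis Monte Carlo loop. This is only a minimal sketch under stated assumptions: it is not Rosetta's fragment assembly, and the names `metropolis_accept`, `monte_carlo_minimize`, `score` and `propose` are hypothetical helpers introduced here for illustration. The scoring function is treated as an energy to be minimized.

```python
import math
import random


def metropolis_accept(delta_e, temperature, rng=random):
    """Metropolis criterion for a move that changes the score by delta_e."""
    if delta_e <= 0.0:
        return True  # moves that do not raise the energy are always accepted
    # Uphill moves are accepted with Boltzmann-like probability.
    return rng.random() < math.exp(-delta_e / temperature)


def monte_carlo_minimize(score, propose, start, steps=1000, temperature=1.0, seed=0):
    """Generic search: 'score' is the energy, 'propose' makes a trial move."""
    rng = random.Random(seed)
    current, e_current = start, score(start)
    best, e_best = current, e_current
    for _ in range(steps):
        candidate = propose(current, rng)
        e_candidate = score(candidate)
        if metropolis_accept(e_candidate - e_current, temperature, rng):
            current, e_current = candidate, e_candidate
            if e_current < e_best:
                # Remember the lowest-energy state visited so far.
                best, e_best = current, e_current
    return best, e_best


# Toy usage: a one-dimensional "conformation" with a quadratic energy.
best, e_best = monte_carlo_minimize(
    score=lambda x: (x - 3.0) ** 2,
    propose=lambda x, rng: x + rng.uniform(-1.0, 1.0),
    start=0.0,
    steps=500,
    temperature=0.1,
)
```

In a real folding simulation the state would be a chain conformation and the search space grows exponentially with chain length, which is the sampling bottleneck noted above.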


Similarity based approaches to structure prediction: from sequence alignment to fold recognition

  • High level of redundancy in biology: sequence similarity is often sufficient to use the “guilt by association” rule: if similar sequence then similar structure and function

  • Multiple alignments and family profiles can detect evolutionary relatedness at much lower sequence similarity, which is hard to detect with pairwise sequence alignments: Psi-BLAST by S. Altschul et al.

  • Many structures are already known (see PDB) and one can match sequences directly with structures to enhance structure recognition: fold recognition (not for new folds!)

  • For both fold recognition and de novo simulation, prediction of intermediate attributes such as secondary structure or solvent accessibility helps to achieve better sensitivity and specificity


Why “fold recognition”?

  • Divergent (common ancestor) vs. convergent (no ancestor) evolution

  • PDB: virtually all proteins with 30% seq. identity have similar structures; however, most similar structures share only up to 10% seq. identity!
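The sequence identity figures quoted above depend on how identity is counted. As a minimal sketch, assuming two sequences that are already aligned and use "-" for gaps, one common convention divides the number of identical positions by the number of columns in which both sequences have a residue. The function name `percent_identity` is a hypothetical helper, and other denominators (for example the length of the shorter sequence) are also used in practice.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences contribute a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Toy usage with two short, already aligned fragments.
identity = percent_identity("ACDE-GH", "ACDFWGH")
```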


Going beyond sequence similarity: threading and fold recognition

When sequence similarity is not detectable, use a library of known structures to match your query with target structures.

One needs a scoring (“energy”) function that measures compatibility between sequences and structures.


Scoring alternative conformations with empirical (knowledge-based) folding potentials

Ideally, each misfolded structure should have an energy higher than the native energy, i.e.:

$E_{\mathrm{misfolded}} - E_{\mathrm{native}} > 0$

[Figure: energy (E) levels of misfolded structures compared with the native structure]
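The energy-gap condition above can be checked directly once a scoring function has assigned energies to the native structure and to a set of misfolded decoys. The sketch below assumes the energies are already computed; `energy_gaps` and `native_is_recognized` are hypothetical helper names, not part of any threading package.

```python
def energy_gaps(native_energy, decoy_energies):
    """Return E_misfolded - E_native for every decoy."""
    return [e - native_energy for e in decoy_energies]


def native_is_recognized(native_energy, decoy_energies):
    """True if every misfolded structure scores strictly above the native one."""
    return all(gap > 0.0 for gap in energy_gaps(native_energy, decoy_energies))


# Toy usage with made-up energies in arbitrary units.
recognized = native_is_recognized(-12.0, [-9.5, -11.0, -3.2])
```

Knowledge-based potentials are typically tuned so that this condition holds for as many native/decoy pairs as possible in a training set.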


Simple contact model for protein structure prediction

Each amino acid is represented by a point in 3D space, and two amino acids are said to be in contact if their distance is smaller than a cutoff distance, e.g. 7 Å.
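A minimal sketch of this contact definition is given below. It assumes one representative point per residue (for example the Cα atom or a side-chain centre, which the slide does not specify) given as (x, y, z) tuples in Å. The `min_separation` parameter is an added assumption: contact models often skip residues that are close along the chain, and the default of 1 keeps every pair, as in the definition above.

```python
import math


def contact_map(coords, cutoff=7.0, min_separation=1):
    """Return the set of residue index pairs (i, j), i < j, that are in contact."""
    contacts = set()
    n = len(coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            # Two residues are in contact if their points are closer than the cutoff.
            if math.dist(coords[i], coords[j]) < cutoff:
                contacts.add((i, j))
    return contacts


# Toy usage: four points on a line, spaced like consecutive C-alpha atoms
# except for the last one, which is far away.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
toy_contacts = contact_map(toy_coords)
```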


Sequence-to-structure matching with contact models

  • Generalized string matching problem: aligning a string of amino acids against a string of “structural sites” characterized by other residues in contact

  • Finding an optimal alignment with gaps using inter-residue pairwise models:

    $E = \sum_{k<l} e_{kl}$,

    is NP-hard because of the non-local character of scores at a given structural site (the identity of the interaction partners may change depending on the location of gaps in the alignment)

    R.H. Lathrop, Protein Eng. 7 (1994)
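Evaluating the pairwise energy for one fixed alignment is straightforward; it is the search over alignments that is hard. The sketch below assumes the template is described by a set of contacting site pairs and the alignment by a mapping from structural site to the residue type placed there (sites left empty by a gap are simply absent). The function `threading_energy` and the two-letter hydrophobic/polar (H/P) potential are illustrative assumptions, not the potentials used in the talk.

```python
def threading_energy(site_contacts, site_to_residue, pair_energy):
    """Sum e_kl over template contacts whose two sites are both occupied."""
    total = 0.0
    for k, l in site_contacts:
        a = site_to_residue.get(k)
        b = site_to_residue.get(l)
        if a is None or b is None:
            continue  # a site left empty by the alignment contributes nothing
        total += pair_energy[frozenset((a, b))]
    return total


# Hypothetical H/P contact potential: only H-H contacts are favourable.
hp_potential = {frozenset("H"): -1.0, frozenset("HP"): 0.0, frozenset("P"): 0.0}

# Template with four sites and two contacts.
contacts = {(0, 2), (1, 3)}

# Sequence HPHP placed on sites 0..3 without gaps.
ungapped = threading_energy(contacts, {0: "H", 1: "P", 2: "H", 3: "P"}, hp_potential)

# Sequence HPH with site 1 skipped: the partner of site 0 is now P, not H.
gapped = threading_energy(contacts, {0: "H", 2: "P", 3: "H"}, hp_potential)
```

The second call shows the non-local effect: moving a gap changes which residue sits at site 2, and therefore changes the score contributed at site 0.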


Hydrophobic contact model and sequence-to-structure alignment

[Figure: a hydrophobic/polar (HP) sequence, e.g. HPHPP, aligned with gaps (-) against structural sites]

  • Approaches to this further instance of the global optimization problem:

    • Heuristics (e.g. frozen environment approximation)

    • “Profile” or local scoring functions (folding potentials)


Implementing threading protocols: LOOPP

  • LOOPP in CAFASP4:

    • About average for all fold recognition targets (missing some easy targets, recognized by PsiBlast)

    • Third best server in the category of difficult targets

    • Best predictions among the servers for 3 difficult targets

  • Further improvements necessary to make the predictions more robust

  • Joint work with Ron Elber


Using sequence similarity, predicted secondary structures and contact potentials: fold recognition protocols

  • In practice, fold recognition methods are often mixtures of sequence matching and threading, with compatibility between a sequence and a structure measured by:

    • sequence alignment

    • contact potentials

    • predicted secondary structures (compared to the secondary structure of a template)


Predicting 1D protein profiles from sequences: secondary structures and solvent accessibility

a) Multiple alignment and family profiles improve prediction of local structural propensities

b) Use of advanced machine learning techniques, such as Neural Networks or Support Vector Machines, improves results as well

B. Rost and C. Sander were the first to achieve more than 70% accuracy in three-state (H, E, C) classification, applying a) and b).

SABLE server: http://sable.cchmc.org

POLYVIEW server: http://polyview.cchmc.org
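The three-state accuracy quoted above is usually reported as the percentage of residues whose predicted state (H, E or C) matches the observed one. A minimal sketch, assuming the predicted and observed states are given as equal-length strings; `q3_accuracy` is a hypothetical helper name and does not reproduce the evaluation code of any of the servers listed.

```python
def q3_accuracy(predicted, observed):
    """Percentage of residues with correctly predicted state (H, E or C)."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not observed:
        raise ValueError("empty input")
    if set(predicted) - set("HEC") or set(observed) - set("HEC"):
        raise ValueError("states must be H, E or C")
    correct = sum(1 for p, o in zip(predicted, observed) if p == o)
    return 100.0 * correct / len(observed)


# Toy usage on a made-up eight-residue fragment.
accuracy = q3_accuracy("HHHHCCEE", "HHHCCCEE")
```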


Predicting 1D protein profiles from sequences: secondary structures and solvent accessibility

[Figure: structural profile derived from PDB compared with predictions by Sable, PsiPred and Prof]

Relative solvent accessibility prediction is typically cast as a classification problem
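To make the classification setting concrete: relative solvent accessibility is the observed accessible surface area of a residue divided by a reference maximum for that residue type, and the classes are obtained by thresholding the result. The sketch below is an assumption-laden illustration. The function names are hypothetical, the reference maximum is passed in rather than taken from any particular table, and the 25% threshold is only one of several values used in practice.

```python
def relative_solvent_accessibility(asa, max_asa):
    """RSA in percent: observed surface area relative to a reference maximum."""
    if max_asa <= 0.0:
        raise ValueError("max_asa must be positive")
    # Observed areas can slightly exceed the reference value; cap at 100%.
    return min(100.0, 100.0 * asa / max_asa)


def rsa_class(rsa_percent, threshold=25.0):
    """Two-class projection of a real-valued RSA (threshold is an assumption)."""
    return "exposed" if rsa_percent >= threshold else "buried"


# Toy usage: a residue exposing 30 square angstroms out of a reference 120.
rsa = relative_solvent_accessibility(30.0, 120.0)
label = rsa_class(rsa)
```

A regression model predicts the real-valued RSA directly, and any two-class assignment can then be recovered afterwards by applying a threshold of choice.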


Variability in surface exposure for structurally equivalent residues does not support classification


Neural Network-based regression for relative solvent accessibility (RSA) prediction

...SDEWACSGNTL...


Accuracy of predictions depends on the level of surface exposure: error measures and fine tuning


Overall accuracy of different regression models

Non-linear models: Rafal Adamczak; Linear models: Michael Wagner;

Datasets and servers: Aleksey Porollo and Rafal Adamczak


Regression vs. two-class classification


Predicting transmembrane domains


Now back to threading and folding simulations

  • Applications in filtering out incorrect models in both de novo simulations and fold recognition

  • Domain structure prediction, protein-protein interactions

  • Better sensitivity in finding correct matches in threading: one story as an example


Modeling the RNA Polymerase II Interaction with the von Hippel-Lindau Protein: from experimental clues to structure prediction and back to experiment.

Jarek Meller

Children’s Hospital Research Foundation

Joint work with M. Czyzyk-Krzeska and her group, College of Medicine, University of Cincinnati


A play of life (script and beyond):

  • Stage: protein society or proteome

  • Rules of life: proteins are assembled and degraded:

    nursery (ribosome) vs. police and guillotine (ubiquitination and proteasome)

  • Social order: one look at the equilibrium in the system:

Army of scribes (middle class proteins)

Transcription

Translation

Law and oppression

Holy scriptures (DNA)

Temple priests (selected proteins)

“I think we need to adjust the interpretation of the script…” (regulation of replication and transcription)


Hypoxia-induced stabilization of Hif-1α

Graphics from R.K. Bruick and S.L. McKnight, Science 295


Experimental clues:

  • Observation: correlation between pVHL levels and transcript elongation of the tyrosine hydroxylase gene (M. Czyzyk-Krzeska)

  • Could pVHL influence transcription by interacting with elongation complex co-factors?

  • Where to start? Experiment without a model is usually not a very good idea. Could in silico study and bioinformatics help?


Searching for pVHL interaction targets:

  • Hif-1α ODD interacts with pVHL, so other pVHL targets should have domains structurally resembling that of the Hif-1α ODD

  • Use the Hif-1α ODD sequence as a query in order to find other structures that are compatible with it

[Figure labels: Rpb1, Rpb6, Hif-1α ODD, pVHL, Pro-OH]


RNA Polymerase II in the act of transcription, Gnatt, Kornberg et al., Science 292 (2001)


[Figure labels: C-ter Rpb1, Rpb6]

The C-terminus of Rpb1 and Rpb6 form a pocket on the surface of the RNA Polymerase II complex. The C-ter of Rpb1 and Rpb6 are shown as cartoons.


Could the Hif ODD fragment resemble the C-terminal fragment of RNA Polymerase II?

  • A motif similar to that of ODD found, but that could occur by chance. We used sequence alignments and threading to measure similarity between these fragments.

  • Sequences about 25% identical for a short fragment of about 50 aa – not significant.

  • Predicted secondary structures similar.

  • Suggestive but still not significant similarity.

  • However, a weak match between the adjacent Rpb6 and the consecutive part of the Hif-1α sequence was observed in threading (3D-PSSM, Loopp).

  • Prediction: the ODD shares 3D structure with C-ter fragment of Rpb1 and Rpb6.

  • Implication: VHL is likely to interact with Rpb1/Rpb6!


Experimental results (MCK):

  • RNA Pol II peptides suggested by computational analysis do bind to pVHL and this binding is controlled by hydroxylation of the critical PRO residue.

  • Co-immunoprecipitations of hyper-phosphorylated RNA Pol II and pVHL observed: interaction confirmed.

  • Ubiquitination of Rpb1 confirmed.

  • Biological meaning?
