Modeling Protein Function

MED260

Philip E. Bourne

Department of Pharmacology, UCSD

[email protected]

http://www.sdsc.edu/pb

Slides on-line at:

http://www.sdsc.edu/pb/edu/med260/med260.ppt

MED260 Modeling Protein Function - October 11, 2006


Agenda

  • Why model protein function?

  • Where does it fit as a technique in modern medical research?

  • The data deluge as a motivator

  • The extent of what can be modeled

  • Ontologies – establishing order from chaos

  • Examples of what can be learnt

  • Accuracy – a word of caution

Why Model Protein Function

  • The rate of discovery of new proteins far outweighs our ability to functionally characterize them

  • Functional discovery of new proteins has implications in:

    • Drug discovery

    • Biomarker identification

    • Understanding of biological processes

    • Identification of disease states and treatment regimes

Where Does It Fit as a Technique in Modern Medical Research?

[Figure: Scientific Research & Discovery. Levels of biological complexity (organisms, organs, cells, macromolecules/biopolymers, atoms & molecules), each shown with example units, a representative discipline and a representative technology. Labels include anatomy, physiology, cell biology, proteomics, genomics and medicinal chemistry; migratory sensors, ventricular modeling, electron microscopy, X-ray crystallography and protein docking; MRI, heart, neuron, structure, sequence and protease inhibitor.]


[Figure: Scientific Research & Discovery diagram of the levels of biological complexity (organisms, organs, cells, macromolecules/biopolymers, atoms & molecules) with their example units, representative disciplines and representative technologies, here with Translational Medicine added.]


The Ability to Model Protein Function Influences and can be Influenced by Any Level of Biological Complexity - Examples

  • Genome - rapid increase in sequenced genomes provides new raw material

  • Proteome – large increase in the number of 3D structures highlights new functions

  • Interactome – identification of a binding partner points to a new function

  • Metabolome – isolation of a protein within a metabolic pathway

  • Cell - localization points to function

  • Organ – gene expression in heart tissue points to function

  • Organism – different physiology observed in species can be related to protein functions

[Figure: Scientific Research & Discovery diagram of the levels of biological complexity (organisms, organs, cells, macromolecules/biopolymers, atoms & molecules), here carrying the annotation "We will focus here" alongside the structure and sequence labels.]


At All Levels We Are Being Driven by Data

Biological Experiment → Data → Information → Knowledge → Discovery

Collect → Characterize → Compare → Model → Infer

[Figure: The Data Deluge. Axis and scale labels: complexity (sub-cellular, cellular, organ, higher life), technology, data, computing power (1 to 100000), # people/web site, and year (90 to 05). Landmarks shown include sequencing technology (1 small genome/month), ESTs, gene chips, the E. coli, yeast, C. elegans and human genomes, virus structure, ribosome assembly, genetic circuits, a metabolic pathway model of E. coli, neuronal modeling, cardiac modeling, brain mapping and virtual communities.]

The Data Deluge


Metagenomics: A First Look

  • New type of genomics

  • New data (and lots of it) and new types of data

  • 17M new (predicted!) proteins: 4-5x growth in just a few months, and much more coming

  • New challenges and exacerbation of old challenges


Metagenomics: First Results

  • More than 99.5% of the DNA in every environment studied represents unknown organisms

  • Culturable organisms are the exception, not the rule

  • Most genes represent distant homologs of known genes, but there are thousands of new families

  • Everything we touch turns out to be a gold mine

  • Environments studied:

    • Water (ocean, lakes)

    • Soil

    • Human body (gut, oral cavity, human microbiome)


Metagenomics New Discoveries: Environmental (Red) vs. Currently Known PTPases (Blue)

[Figure: environmental PTPases (red) compared with currently known PTPases (blue); labels 1 to 4 and "Higher eukaryotes".]


The Good News and the Bad News

  • Good news

    • Data pointing towards function are growing at near exponential rates

    • IT can handle it on a per dollar basis

  • Bad news

    • Data are growing at near exponential rates

    • Quality is highly variable

    • Accurate functional annotation is sparse


Genomes - 2004

  • We all know about the human – what is not so well known is:

    • 191 completed microbial genomes

    • 44 archaea

    • 727 bacteria

    • 785 eukaryotes (complete or in progress)

    • Viroids ….


Proteome

  • We are reasonably good, but not perfect, at finding proteins in genomes with intergenic regions (e.g., alternative initiation codons cause problems)

  • Regulatory elements provide a different set of challenges

  • We are not so good at assigning functions to those proteins

  • Moreover the devil is in the details

The Extent of What Can Be Modeled

Estimated Functional Roles (by % of Proteins) of the Proteome in a Complex Organism

Functional Nomenclature Needs to be Consistent for Orderly Progress – Enter EC and GO

  • EC classifies all enzymes - http://www.chem.qmul.ac.uk/iubmb/enzyme/

  • Gene Ontology Consortium characterizes gene products by molecular function, biological process and cellular component (location) - http://www.geneontology.org/
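To make the idea of a consistent, hierarchical nomenclature concrete, here is a minimal sketch that splits an EC number into its four levels and names the top-level enzyme class. The function name `parse_ec_number` and the returned dictionary layout are assumptions made for illustration; they are not part of any EC or GO tool.

```python
# Top-level EC classes. Class 7 (Translocases) was added to the EC scheme
# after this lecture was given; it is included so current numbers parse.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}


def parse_ec_number(ec: str) -> dict:
    """Split an EC number such as 'EC 2.7.11.11' into its hierarchy levels.

    Hypothetical helper for illustration only.
    """
    text = ec.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    fields = text.split(".")
    if len(fields) != 4:
        raise ValueError(f"expected four dot-separated fields, got {ec!r}")
    top = int(fields[0])
    if top not in EC_CLASSES:
        raise ValueError(f"unknown top-level EC class: {top}")
    # Each prefix of the number names a progressively narrower group.
    levels = [".".join(fields[:i]) for i in range(1, 5)]
    return {"class": EC_CLASSES[top], "levels": levels}


if __name__ == "__main__":
    # 2.7.11.x is the protein-serine/threonine kinase sub-subclass.
    print(parse_ec_number("EC 2.7.11.11"))
```

Because every prefix of an EC number is itself a meaningful group, two annotations can be compared at whatever depth the evidence supports.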

Ontologies – establishing order from chaos


Functional Coverage of the Human Genome

40% covered

http://function.rcsb.org:8080/pdb/function_distribution/index.html


Step 1. Learn What You Can from the Protein Sequence

  • Find it

  • Pay attention to the quality of the functional annotation – errors are transitive

  • Understand its 1-D structure – domain organization, {signatures, fingerprints}
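As an illustration of looking for signatures in the 1-D structure, the sketch below converts a simple PROSITE-style pattern into a regular expression and scans a sequence with it. The helper names `prosite_to_regex` and `find_signature` are hypothetical, the converter handles only a small subset of the pattern syntax, and the sequence is a short toy fragment. The pattern used is the widely quoted P-loop (ATP/GTP-binding site motif A) signature.

```python
import re

# One pattern element: x, a residue, [allowed], or {forbidden},
# optionally followed by a repeat such as (4) or (2,4).
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.strip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            core = "."                          # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"      # any residue except these
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


def find_signature(sequence: str, pattern: str) -> list:
    """Return (1-based start, matched text) for each non-overlapping hit."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    p_loop = "[AG]-x(4)-G-K-[ST]"
    toy_sequence = "MTEYKLVVVGAGGVGKSALTIQ"    # illustrative fragment only
    print(prosite_to_regex(p_loop))           # [AG].{4}GK[ST]
    print(find_signature(toy_sequence, p_loop))  # [(10, 'GAGGVGKS')]
```

A hit of this kind is only a hint: short signatures match by chance, so it should be weighed against the quality of the annotation behind it.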

Examples of what can be learnt

Step 2. Is there a 3D Structure? If so What Can You Learn from That?

  • Find it

  • Understand it

  • Characterize it

  • Understand its function(s) – these follow a power law at the fold level – some folds are promiscuous (many functions) others are solitary or of unknown function

[Figure: (a) myoglobin, (b) hemoglobin, (c) lysozyme, (d) transfer RNA, (e) antibodies, (f) viruses, (g) actin, (h) the nucleosome, (i) myosin, (j) ribosome. Courtesy of David Goodsell, TSRI]


First, Why Bother with Structure? An Example: Protein Kinase A

This “molecular scene” for cAMP-dependent protein kinase depicts years of collective knowledge.

Beyond basics, only the atomic coordinates are captured by the PDB.

Functional annotation requires the literature.


What Did That Picture Tell Us?

  • Two domains with associated functions: ATP binding and substrate binding

  • Conserved residues and their spatial locations reveal details of ATP and substrate binding and of the mechanism of the phosphotransfer reaction

  • So is structure the answer to functional modeling?


Question: So is structure the answer to functional modeling?

Answer: Partly. The number of unique protein sequences still outnumbers the number of unique structures by 100:1.

Enter Structural Genomics

Enter Structure Prediction


The Structural Genomics Pipeline (X-ray Crystallography)

Basic Steps

  • Target Selection

  • Crystallomics: Isolation, Expression, Purification, Crystallization

  • Data Collection

  • Structure Solution

  • Structure Refinement

  • Functional Annotation

  • Publish


Structural Genomics Will Give Us...

  • Good news

    • More structures (definitely)

    • New folds (some but not as anticipated)

    • New understanding of specific diseases and pathways (maybe)

    • Representatives from each major protein family (maybe)

  • Bad news

    • Many new structures that are functionally unclassified (definitely)

What About Structure Prediction?

  • Current rule

    We will be able to predict a structure when we know all the structures 


Why Is Structure Prediction So Hard?

[Figure: % sequence identity vs. alignment length for a random 1000 structurally similar PDB polypeptide chains with z > 4.5, with the Twilight Zone and the Midnight Zone marked.]
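The twilight and midnight zones describe ranges of pairwise sequence identity where structural similarity can no longer be read off the sequences. A minimal sketch of the underlying calculation follows. The function names are hypothetical, and the cut-offs (35% and 20%) are assumed round numbers chosen for illustration, not values taken from the slide. The sketch also ignores alignment length, which the plot shows matters.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment
    in which neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def identity_zone(pid: float,
                  twilight_upper: float = 35.0,   # assumed cut-off
                  midnight_upper: float = 20.0    # assumed cut-off
                  ) -> str:
    """Label a percent identity using the assumed zone boundaries."""
    if pid >= twilight_upper:
        return "above twilight zone"
    if pid >= midnight_upper:
        return "twilight zone"
    return "midnight zone"


if __name__ == "__main__":
    # Toy alignment: 8 ungapped columns, 5 of them identical.
    a = "MKT-AYIAKQ"
    b = "MRTLAY-SKE"
    pid = percent_identity(a, b)
    print(pid, identity_zone(pid))  # 62.5 above twilight zone
```

Pairs that land in the lower zones are the ones where homology modeling gives way to threading or ab initio approaches.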

Approaches to Structure Prediction

  • Homology modeling

  • Threading (aka fold recognition)

  • Ab initio

  • How well do we do? – see CASP

  • Consensus servers

    • Eva - http://cubic.bioc.columbia.edu/eva/

    • LiveBench - http://bioinfo.pl/meta/

Step 3. What Can Be Got from Structure When You Have it?

From Structural Bioinformatics, ed. Bourne and Weissig, p. 394, Wiley 2002

Specific Example

  • Mj0577 – putative ATP molecular switch

    Mj0577 is an open reading frame (ORF) of previously unknown function from Methanococcus jannaschii. Its structure was determined at 1.7Å (Figure 7a) (Zarembinski et al, 1998). The structure contains a bound ATP molecule, picked up from the E. coli host. The presence of bound ATP led to the proposition that Mj0577 is either an ATPase, or an ATP-binding molecular switch. Further experimental work showed that Mj0577 cannot hydrolyse ATP by itself, and can only do so in the presence of M. jannaschii crude cell extract. Therefore it is more likely to act as a molecular switch, in a process analogous to ras-GTP hydrolysis in the presence of GTPase activating protein.

From Structural Bioinformatics, ed. Bourne and Weissig, p. 402, Wiley 2002

Step 4. Proteins Do Not Function in Isolation But are Part of Complex Interaction Networks

http://www.genome.jp/kegg/


Accuracy - A Word of Caution

  • Errors are transitive

    • Proteins A and B are observed to have similar functions through sequence homology

    • Proteins B and C are observed to have similar functions through sequence homology

    • Is protein A related to protein C?

    • Up to 30% of current annotation may be wrong
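One way such errors arise can be sketched with multi-domain proteins: A and B share one domain, B and C share another, yet A and C have nothing in common. The toy example below uses invented domain architectures and a hypothetical `naive_transfer` function. It is an assumption-laden illustration of the A, B, C argument, not a description of any real annotation pipeline.

```python
# Hypothetical domain architectures; names are illustrative only.
DOMAINS = {
    "A": {"kinase"},
    "B": {"kinase", "SH2"},
    "C": {"SH2"},
}


def shares_domain(p: str, q: str) -> bool:
    """True if the two proteins have at least one domain in common."""
    return bool(DOMAINS[p] & DOMAINS[q])


def naive_transfer(source: str, function: str, chain: list) -> dict:
    """Copy an annotation along a chain whenever neighbours share a domain,
    without checking which domain supports the similarity."""
    annotations = {source: function}
    for previous, current in zip(chain, chain[1:]):
        if previous in annotations and shares_domain(previous, current):
            annotations[current] = annotations[previous]
    return annotations


if __name__ == "__main__":
    result = naive_transfer("A", "protein kinase activity", ["A", "B", "C"])
    print(result)                    # C inherits A's function via B
    print(shares_domain("A", "C"))   # False: A and C share no domain
```

Here C ends up labeled with A's function even though the two share no domain, which is the kind of mistake that then propagates to every protein later annotated from C.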


Questions?


Demo of Steps 1-4

  • Step 1. Learn What You Can from the Protein Sequence

  • Step 2. Is there a 3D Structure? If So, What Can You Learn from That?

  • Step 3. What Can Be Got from Structure When You Have it?

  • Step 4. Proteins Do Not Function in Isolation But are Part of Complex Interaction Networks
