PREDICTING PROTEIN STRUCTURE AND BEYOND …. P. V. Balaji, Biotechnology Center, I.I.T., Bombay. Organization of the talk: 1. Why predict the structure? 2. Methods for structure prediction. 3. What next? Genome Size is not Proportional to the Complexity of the Organism. Complexity.

Presentation Transcript
slide1

PREDICTING PROTEIN STRUCTURE AND BEYOND ….

P. V. Balaji

Biotechnology Center

I.I.T., Bombay

slide2

Organization of the talk

1. Why predict the structure?

2. Methods for structure prediction

3. What next?

slide4

The Molecular Logic of Life is the Same

English

  • 26-letter alphabet
  • Only one grammar
  • Extremely diverse literature

Genome

  • 4-letter alphabet
  • Only one grammar
  • Extremely diverse organisms

Biochemically, all living things – animals, plants, bacteria, viruses, etc. – are remarkably similar

slide5

Genome Sequencing and Analysis: One of the Key Steps in Deciphering the Logic of Life

Even minute details have to be analyzed

"Hang him, not let him go" vs. "Hang him not, let him go"

Humans: NeuNAc (–CH3)

Chimpanzees: NeuNGc (–CH2OH)

slide6

Innovations in Technology Have Made Genome Sequencing a Routine Affair

Genome sequencing

Completed: ~70 organisms

In the pipeline: Several more

“ … it is unlikely that the base sequence of more than a few percent of such a complex DNA will ever be determined …”

C W Schmid & W R Jelinek, Science, June 1982

slide7

One Aspect of Genome Sequence Analysis is to Assign Functions to Proteins

(Reverse Genetics)

Proteins are workhorses of the cell

Are involved in every aspect of living systems

slide8

Function of a Protein can be Defined at Different Levels

Example: Lysozyme

Biochemical level: Hydrolyzes C–O bond

Physiological level: Breaks down the cell wall

Cellular level: Defense against infection

Different Analysis Tools Provide Functions at Different Levels

slide9

Hallmark of Proteins: Specificity

Know exactly which small molecule (ligand) they should bind to or interact with

Also know which part of a macromolecule they should bind to

slide10

Origin of Specificity

Function is critically dependent on structure

1ruv.pdb

slide11

Structure – Key to Dissect Function

Structure

  • Location of mutants, conserved residues, SNPs
  • Clefts (active sites)
  • Dynamics (breathing)
  • Surface shape & charge
  • Antigenic sites, surface patches
  • Crystal packing
  • Functional oligomerization
  • Relative juxtaposition
  • Fold
  • Interaction interfaces
  • Catalytic clusters
  • Motifs
  • Catalytic mechanism
  • Evolutionary relationships

slide12

Sequence Determines Structure

1 KETAAAKFERQHMDSSTSAASSSNYCNQMMKSRNLTKDRCKPVNTFVHES

LADVQAVCSQKNVACKNGQTNCYQSYSTMSITDCRETGSSKYPNCAYKTT

QANKHIIVACEGNPYVPVHFDASV 124

1ruv.pdb

Christian B. Anfinsen: Nobel Prize in Chemistry (1972)

slide13

How Does Sequence Specify Structure?

[Diagram labels: Sequence, Structure, Function, Functional Genomics, "?"]

The Protein Folding Problem (second half of the genetic code)

Structure has to be determined experimentally

slide14

Experimental Methods of Structure Determination

X-ray crystallography

  • Provides a static picture
  • Solubilization of the over-expressed protein
  • Obtaining crystals that diffract

Nuclear Magnetic Resonance spectroscopy

  • Provides a dynamic picture
  • Size limit is a major factor
  • Solubilization of the over-expressed protein

slide15

Limitations of Experimental Methods: Consequences

Annotated proteins in the databank: ~ 100,000

Total number including ORFs: ~ 700,000

Proteins with known structure: ~5,000 !

Dataset for analysis

An ORF, or Open Reading Frame, is a region of a genome that potentially codes for a protein

ORFs have been identified by whole-genome sequencing efforts

ORFs with no known function are termed orphans
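The ORF definition above can be made concrete with a small scan for start-to-stop stretches. This is only a minimal sketch: it assumes the standard start codon ATG and the three standard stop codons, looks at the forward strand only, and uses a made-up toy sequence. The function name `find_orfs` and the `min_codons` threshold are illustrative choices, not something from the talk.

```python
# Minimal ORF scan on the forward strand only (an assumption for brevity;
# genome annotation pipelines also scan the reverse strand).
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=2):
    """Return sorted (start, end, frame) tuples for ATG...stop stretches.

    `end` is exclusive and includes the stop codon; `min_codons` counts
    the codons before the stop, including the ATG itself.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START:
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None                   # look for the next ATG
    return sorted(orfs)

toy = "CCATGGCTTGGTAACC"  # hypothetical toy sequence, not real genomic data
for start, end, frame in find_orfs(toy):
    print(frame, toy[start:end])
```

Whether such a stretch is a real gene, and what the protein does, is exactly what the scan cannot say; that is where the "orphan" label comes from.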

slide16

Structural Biology Consortia: Brute Force Approach Towards Structure Elucidation

  • Aim to solve about 400 structures a year
  • Employ battalions of Ph.D.s and postdoctoral researchers
  • Large-scale expression and crystallization attempts
  • Basic strategies remain the same
  • No (known) new tricks
  • “Unrelenting” targets will be ignored

+ Enhances the statistical base for inferring sequence – structure relationships

slide17

Predicting Protein Structure:

1. Comparative Modeling

(formerly, homology modeling)

Homologous

KQFTKCELSQNLYDIDGYGRIALPELICTMFHTSGYDTQAIVENDESTEYGLFQISNALWCKSSQSPQSRNICDITCDKFLDDDITDDIMCAKKILDIKGIDYWIAHKALCTEKLEQWLCEKE

KVFGRCELAAAMKRHGLDNYRGYSLGNWVCAAKFESNFNTQATNRNTDGSTDYGILQINSRWWCNDGRTPGSRNLCNIPCSALLSSDITASVNCAKKIVSDGNGMNAWVAWRNRCKGTDVQAWIRGCRL

Share Similar Sequence

1alc

?

Use as template & model

8lyz

slide18

Comparative Modeling

Basis

  • Structure is much more conserved than sequence during evolution
  • The higher the similarity, the higher the confidence in the modeled structure

Limited applicability

  • A large number of proteins and ORFs have no similarity to proteins with known structure
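Since confidence in a comparative model rests on sequence similarity, a first step is to align the two sequences and measure identity. The sketch below is a plain Needleman-Wunsch global alignment with a match/mismatch/linear-gap score. Those scoring values are arbitrary assumptions; practical tools use substitution matrices such as BLOSUM62 and affine gap penalties. The two sequences are copied from the comparative modeling slide, and the names `SEQ_A` and `SEQ_B` are neutral labels chosen here.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of a and b; returns the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal path.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))

def percent_identity(aligned_a, aligned_b):
    """Identical columns as a percentage of all aligned columns."""
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)

# Sequences as printed on the comparative modeling slide.
SEQ_A = ("KQFTKCELSQNLYDIDGYGRIALPELICTMFHTSGYDTQAIVENDESTEYGLFQISNALW"
         "CKSSQSPQSRNICDITCDKFLDDDITDDIMCAKKILDIKGIDYWIAHKALCTEKLEQWLCEKE")
SEQ_B = ("KVFGRCELAAAMKRHGLDNYRGYSLGNWVCAAKFESNFNTQATNRNTDGSTDYGILQINSRWW"
         "CNDGRTPGSRNLCNIPCSALLSSDITASVNCAKKIVSDGNGMNAWVAWRNRCKGTDVQAWIRGCRL")

al_a, al_b = needleman_wunsch(SEQ_A, SEQ_B)
print(f"{percent_identity(al_a, al_b):.1f}% identity over {len(al_a)} columns")
```

The identity value depends on the scoring scheme and on the denominator chosen (aligned columns here), so it should be read as a rough indicator rather than a threshold for model quality.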

slide19

Predicting Protein Structure:

Alternative Methods

  • Threading or Fold Recognition
  • Ab initio

  • Both these methods depend heavily on the analysis of known protein structures
  • In addition, establishing sequence – structure relationships is also important
  • Input from people trained in statistics, pattern recognition and related areas of computer science is critical

slide20

Statistical Analysis of Protein Structures: Microenvironment Characterization

Describe structures at multiple levels of detail using a comprehensive set of properties

  • Atom-based properties: type, hydrophobicity, charge
  • Residue-based properties: type, hydrophobicity
  • Chemical group: hydroxyl, amide, carbonyl, etc.
  • Secondary structure: α-helix, β-strand, turn, loop
  • Other properties: VDW volume, B-factor, mobility, solvent accessibility
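One way to picture a microenvironment is as the set of atoms within some radius of a site, summarized by the kinds of properties listed above. The sketch below does that for a handful of invented atoms. The atom records, the `microenvironment` and `shell_profile` names, and the choice of radii are all assumptions made for illustration; the talk does not specify how the properties are computed.

```python
import math
from collections import Counter

# Hypothetical toy atoms (coordinates in angstroms); not from any PDB entry.
ATOMS = [
    {"element": "O", "group": "hydroxyl",    "charge":  0.0, "xyz": (1.0, 0.0, 0.0)},
    {"element": "N", "group": "amide",       "charge":  0.0, "xyz": (0.0, 2.0, 0.0)},
    {"element": "O", "group": "carboxylate", "charge": -1.0, "xyz": (0.0, 0.0, 3.0)},
    {"element": "C", "group": "methyl",      "charge":  0.0, "xyz": (6.0, 0.0, 0.0)},
    {"element": "N", "group": "amino",       "charge":  1.0, "xyz": (0.0, 8.0, 0.0)},
]

def microenvironment(center, atoms, radius):
    """Summarize the atoms lying within `radius` of `center`."""
    near = [a for a in atoms if math.dist(center, a["xyz"]) <= radius]
    return {
        "n_atoms": len(near),
        "elements": Counter(a["element"] for a in near),
        "groups": Counter(a["group"] for a in near),
        "net_charge": sum(a["charge"] for a in near),
    }

def shell_profile(center, atoms, radii):
    """Atom counts at growing radii: a crude multi-level description."""
    return [microenvironment(center, atoms, r)["n_atoms"] for r in radii]

site = (0.0, 0.0, 0.0)
print(microenvironment(site, ATOMS, radius=4.0))
print(shell_profile(site, ATOMS, radii=[2.0, 4.0, 7.0, 10.0]))
```

Collecting such summaries over many known structures is what would give the statistical base; the feature set here is deliberately tiny.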

slide21

Predicting Protein Structure:

2. Threading or Fold Recognition

Basis

  • It is estimated that there are only around 1,000 to 10,000 stable folds in nature
  • Irrespective of its amino acid sequence, a protein has to adopt one of these folds

  • Fold recognition is essentially finding the best fit of a sequence to a set of candidate folds
  • Select the best sequence-fold alignment using a fitness scoring function
  • NP-complete problem

slide22

Fold of a Protein

Refers to the spatial arrangement of its secondary structural elements (α-helices and β-strands)

1l45.pdb

4bcl.pdb

1mbl.pdb

α/β-barrel

β-barrel

α/β-sandwich

slide23

Threading: Basic Strategy

[Diagram labels: Query sequence (dhgakdflsdfjaslfkjsdlfjsdfjasd), Library of folds, Template, Spatial Interactions, Scoring & selection]
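The threading strategy (slide a query over each fold in a library, score the fit, keep the best) can be sketched in a heavily simplified form. Here each fold is reduced to a string of per-position environments, "B" for buried and "E" for exposed, and the score rewards hydrophobic residues in buried positions and polar residues in exposed ones. The fold library, the hydrophobic residue set and the scoring rule are all hypothetical. The sketch also allows no gaps and ignores pairwise spatial interactions, so it sidesteps the hard part of the real problem.

```python
# Assumed (simplified) set of hydrophobic residues, one-letter codes.
HYDROPHOBIC = set("AVILMFWC")

def position_score(residue, env):
    """+1 if the residue type suits the environment, else -1."""
    buried = env == "B"
    hydrophobic = residue in HYDROPHOBIC
    return 1 if buried == hydrophobic else -1

def thread(query, template):
    """Best gapless placement of query on template as (score, offset)."""
    best = None
    for offset in range(len(template) - len(query) + 1):
        s = sum(position_score(r, template[offset + k])
                for k, r in enumerate(query))
        if best is None or s > best[0]:
            best = (s, offset)
    return best  # None if the template is shorter than the query

def recognize_fold(query, library):
    """Return (fold_name, (score, offset)) for the best-scoring fold."""
    hits = {}
    for name, template in library.items():
        hit = thread(query, template)
        if hit is not None:
            hits[name] = hit
    return max(hits.items(), key=lambda kv: kv[1][0])

# Hypothetical fold library: burial patterns, not real structures.
LIBRARY = {
    "fold_A": "BEBEBEBEBE",
    "fold_B": "BBEEBBEEBBEE",
    "fold_C": "EEEEBBBBEEEE",
}

print(recognize_fold("DDLLVVKK", LIBRARY))
```

The toy query has a hydrophobic core flanked by polar residues, so it fits the template with a buried middle stretch best. A realistic scoring function would be derived from statistics over known structures, which is why the growing structure database matters for this method.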

slide24

Predicting Protein Structure:

3. Ab Initio Methods

[Flowchart labels: Sequence, Prediction, Secondary structure, Tertiary structure, Energy Minimization, Low energy structures, Mean field potentials, Validation, Predicted structure]
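The "low energy structures" step can be illustrated with a toy model that is not the method of the talk: the two-dimensional HP lattice model, in which each residue is either hydrophobic (H) or polar (P), the chain is a self-avoiding walk on a square grid, and each non-bonded H-H contact lowers the energy by one. The sketch below finds a minimum-energy conformation by exhaustive enumeration, which is only feasible for very short chains (length at least 2). The example sequence is arbitrary.

```python
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def energy(seq, coords):
    """-1 for every pair of H residues adjacent on the grid but not in sequence."""
    index_at = {c: i for i, c in enumerate(coords)}
    e = 0
    for i, (x, y) in enumerate(coords):
        if seq[i] != "H":
            continue
        for dx, dy in MOVES:
            j = index_at.get((x + dx, y + dy))
            # j > i + 1 counts each pair once and skips chain neighbours.
            if j is not None and j > i + 1 and seq[j] == "H":
                e -= 1
    return e

def fold(seq):
    """Exhaustively search self-avoiding walks; return (best_energy, coords)."""
    best_e, best = 0, None

    def extend(coords):
        nonlocal best_e, best
        if len(coords) == len(seq):
            e = energy(seq, coords)
            if best is None or e < best_e:
                best_e, best = e, list(coords)
            return
        x, y = coords[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in coords:          # keep the walk self-avoiding
                coords.append(nxt)
                extend(coords)
                coords.pop()

    # Fix the first bond to remove trivially equivalent orientations.
    extend([(0, 0), (1, 0)])
    return best_e, best

e, conformation = fold("HPHPPHHPHH")  # arbitrary example sequence
print(e, conformation)
```

Real ab initio methods replace the grid with continuous coordinates, the contact count with physical or mean-field potentials, and enumeration with sampling or minimization, but the logic of searching conformations for low energy is the same.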

slide25

Small molecules and/or metal ions are an integral part of certain proteins

1a6g.pdb

Predicting the structure of such proteins is an entirely different challenge

slide26

Proof of the Pudding: CASP Meetings

Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction – 4

Predictions; not Post-dictions

Easy and medium targets: ~100% success

Hard targets: ~50% success

Significant increase from CASP3

slide27

OK, I can predict the structure correctly! Is that it?

Well, no!!

Detailed biochemical characterization is required

Strict structure – function correlation exists only for a subset of proteins

Some folds (ferredoxin, TIM barrel, …) are very popular – several protein families, with diverse functions, adopt these folds

Despite high similarity in sequence and structure, proteins may act on different substrates (and hence have different functions) due to subtle changes in the active site (β1,3-GalT and β1,3-GlcNAcT)

slide28

Inferring Function from Structure: Caveats

Similar structure, mutually exclusive function: Lysozyme & α-lactalbumin (8lyz.pdb, 1alc.pdb)

Same function, completely different structures: Carbonic anhydrases from M. thermophila and mouse (1thj.pdb, 1dmx.pdb)

Gal1p – Kinase as well as regulator of Gal-gene expression

Gal3p – 70% similar; does not have kinase activity

“Moonlighting” proteins – one structure(?), multiple functions

Glyceraldehyde 3-phosphate dehydrogenase:

  • Glycolysis
  • Binding protein for plasmin, fibronectin and lysozyme
  • Transcriptional control of gene expression, DNA replication and repair
  • Flocculation

slide29

Same fold, different oligomerization

Dimerization

Tetramerization

ConA

ConA

PNA

PNA, GSIV

slide30

Ligand Induced Conformational Changes are Quite Common

Binding of first substrate redefines the active site and creates the binding pocket for the second substrate and the metal ion

Flexible loop

After

Before

slide31

Take Home Message

  • Predicting protein structure is a key component of genome sequence analysis
  • Structure is a very important link in deciphering function
  • Are new tools required? Or is a larger training dataset required?

slide32

Acknowledgement

  • Organizers for giving me this opportunity
  • Sujatha and Jayadeva Bhat for helping me put together this talk

A Few Useful Links

http://guitar.rockefeller.edu/modeller/modeller.html

http://www.biochem.ucl.ac.uk/bsm/cath-new/index.html

http://predictioncenter.llnl.gov/

http://insulin.brunel.ac.uk