
Bologna Winter School 2007

Protein Function. Bologna Winter School 2007. Basic questions: How do proteins evolve changed or novel functions? Given the amino acid sequences of proteins inferred from genomic sequences, how can we assign functions to them? Genomics gives us many new protein sequences.


Presentation Transcript


  1. Protein Function Bologna Winter School 2007

  2. Basic questions: • How do proteins evolve changed or novel functions? • Given the amino acid sequences of proteins inferred from genomic sequences, how can we assign functions to them?

  3. Genomics gives us many new protein sequences • Often there is little experimental information about the proteins themselves • What can we deduce about proteins from their amino acid sequences? … from the amino acid sequence of one protein alone? … from comparisons of amino acid sequences of related proteins from different species?

  4. What properties of proteins do we want to learn about and how do we measure and analyse them? • amino acid sequence • three-dimensional structure • FUNCTION • expression pattern • regulation

  5. Can we learn these properties by studying purified proteins in isolation? • amino acid sequence – yes, in principle • three-dimensional structure -- certainly • FUNCTION -- ?????? • expression pattern – yes if we had to • regulation – probably not

  6. How do we learn these? • amino acid sequence – genomic sequences • three-dimensional structure – X-ray, NMR, ... modelling • FUNCTION – experiment? inference? • expression pattern – microarrays • regulation – ChIP-chip experiments

  7. Does knowledge about related proteins help? • amino acid sequence – possibly • three-dimensional structure – molecular replacement (MR), modelling • FUNCTION – YES! BUT, HOW?? • expression pattern – maybe • regulation – maybe

  8. Function is difficult • Sequence determines structure determines function • From knowing sequence and structure of one protein alone, can we deduce its function? • Identify binding site? • Identify catalytic residues? • Identify ligand? • Analogy to drug-design problem.

  9. Given a protein structure can we predict function directly? • Sometimes… To some extent … • What are reasonable goals? • Sometimes structure gives general idea, guiding laboratory work to pin it down • Some examples from H. influenzae structural genomics project

  10. HI1679 • α/β-hydrolase fold, putative remote homology to L-2-haloacid dehalogenases • Several substrates tried. • HI1679 cleaved 6-phosphogluconate, phosphotyrosine

  11. HI1434 • related to a region in tRNA synthetases. • contains putative binding site, likely to bind nucleotide • no specific ligand has yet been identified

  12. Nuclear Transport Factor-2 • Protein known to be involved in trafficking across nuclear membrane • Crystal structure determined • Mechanism of function not obvious • ???

  13. NTF-2 homologous to scytalone dehydratase • Alexei Murzin spotted a similarity of fold between NTF-2 and scytalone dehydratase • This structure shows scytalone dehydratase binding an inhibitor

  14. Scytalone dehydratase Scytalone dehydratase is an enzyme in the pathway for melanin synthesis

  15. NTF-2 Superposition

  16. Search for ligands • On the basis of the structural similarity, many ligands were designed and tested • So far, none has shown any binding or catalyzed reactivity • Conclusion: structural similarity is useful guide to hypotheses about function, but doesn’t always work …

  17. But many similar proteins have similar functions, don't they? • In many cases closely-related proteins have closely-related functions. • Example: human and horse haemoglobin • 43 residue differences out of 287 (α+β chains, 141 + 146 residues) • 85% residue identity • SAME FUNCTION

  18. Function assignment from homology? • OK, if the sequences differ greatly then the function may differ • But if the sequences are similar, the functions will be the same – WON'T THEY? • Well, sometimes ...

  19. 'Homology modelling' of function? • Sequence determines structure determines function • Small changes in sequence produce small changes in structure • BUT: dependence of function on sequence (and even on structure) doesn't have simple ‘topology’

  20. Similar sequences produce similar structures

  21. Recruitment • In many cases, similar proteins retain similar functions (example: mammalian globins) • Distantly-related proteins can retain function or diverge in function • But closely-related proteins can have very different functions • Even identical proteins can carry out different functions

  22. Avian eye-lens proteins • In the duck, crystallins have identical sequences to liver enolase and lactate dehydrogenase • They never see the substrates in the eye • In other birds, sequences have changed enough to lose catalytic activity. This shows that enzymatic activity is not necessary in the eye

  23. Proteinase do = DegP • Chaperone at low temperatures • Proteinase at high temperatures • Logic: moderate stress – try to rescue proteins • more extreme stress – give up and recycle

  24. Function annotation in databases • Proteins appear in databases when their sequences are known • Annotation of function? • Experimental evidence for function • Transfer of function from homologue • How well does this work? • How can we tell? • Requires measure of distance between functions

  25. Two goals of this kind of work • To study how protein function diverges as amino acid sequence diverges • To evaluate the accuracy of transfer of annotation among homologous proteins Problems associated with goal 2 make goal 1 harder

  26. How do proteins change function as their sequences diverge • Divergence v. recruitment • Divergence: • Change in specificity (chymotrypsin, trypsin) • Change in regulation (myoglobin, haemoglobin) • Related functions with similar mechanisms (adaptation of catalytic site) (Gerlt & Babbitt)

  27. Gene duplication and divergence • General way to develop new functions • Very old theory about how metabolic pathways developed – new protein developed to provide substrate for current initial step: • Now growing on B (B→C→D…→ATP) • Medium runs out of B. • B→C enzyme duplicates, diverges to catalyze A→B • Now you can grow on A (A→B→C→D…→ATP) • Attractive because: • B→C enzyme has binding site for B • explains gene organization in operon • WRONG: mechanism of A→B in general different from B→C, needs different structure, catalytic residues

  28. Derivation of function from coordinated analysis of sequence and structure • Homologous proteins may have diverged in sequence and function (leave aside recruitment) • Assume no strong sequence similarity to protein of known function • Align sequences • Use structure to get better alignments • Check for conservation of binding site, catalytic residues

  29. Structure-based function assignment • Extract functional residues from structures of known function • Residues contributing to function of entire homologous family conserved in whole family • Residues contributing to specific function of subfamily conserved only in subfamily
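The family/subfamily distinction on slide 29 can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length and that subfamily membership is known; the function names and the toy alignment are hypothetical, and this is not the procedure of any of the published methods.

```python
def conserved_columns(seqs):
    """Return {column index: residue} for alignment columns in which
    every sequence has the same non-gap residue."""
    conserved = {}
    for i, column in enumerate(zip(*seqs)):
        residues = set(column)
        if len(residues) == 1 and "-" not in residues:
            conserved[i] = column[0]
    return conserved


def classify_columns(subfamilies):
    """subfamilies: dict mapping subfamily name -> list of aligned sequences.

    Returns (family_wide, subfamily_specific):
      family_wide        columns conserved across every sequence
      subfamily_specific per subfamily, columns conserved within it
                         but not across the whole family
    """
    all_seqs = [s for seqs in subfamilies.values() for s in seqs]
    family_wide = conserved_columns(all_seqs)
    subfamily_specific = {
        name: {i: r for i, r in conserved_columns(seqs).items()
               if i not in family_wide}
        for name, seqs in subfamilies.items()
    }
    return family_wide, subfamily_specific


# Toy alignment (made-up sequences, for illustration only).
toy = {
    "sub_A": ["GHSDK", "GHSEK", "GHSDR"],
    "sub_B": ["GHTDK", "GHTNK", "GHTQK"],
}
family_wide, subfamily_specific = classify_columns(toy)
print(family_wide)
print(subfamily_specific)
```

Columns in `family_wide` are candidates for the function shared by the whole family; columns in `subfamily_specific` are candidates for specificity. Real methods weight by phylogeny and tolerate conservative substitutions, which this sketch does not.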

  30. Several groups have applied these ideas • Cohen & Lichtarge, ‘Evolutionary Trace Method’ (J. Mol. Biol. 1996) • Irving, Whisstock, Lesk (Proteins 2001) • Hannenhalli & Russell (J. Mol. Biol. 2000) • Sternberg and coworkers (PNAS 2004, Phil. Trans. Roy. Soc. 2006) • See also: Automated Function Prediction, ISMB Special Interest Group Meeting, 2005

  31. How could we test predictions of function?

  32. How to measure distance between functions? • For sequences and structures, there are natural measures of divergence • Sequence: count identical residues • Structures: r.m.s.d. of well-fitting parts (Specialists may argue about details, or propose alternatives, but basically the answers aren't too different.) • Function: no natural measure of difference
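The two "natural" measures named on slide 32 can be written down in a few lines. This is a minimal sketch under stated assumptions: the sequences are already aligned (gaps as "-"), identity is counted over columns with no gap, and the two coordinate sets are already optimally superposed and restricted to the well-fitting residues. Other conventions for the identity denominator exist.

```python
import math


def percent_identity(a, b):
    """Percent identical residues over the gap-free columns of two
    aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no gap-free columns to compare")
    identical = sum(1 for x, y in pairs if x == y)
    return 100.0 * identical / len(pairs)


def rmsd(p, q):
    """Root-mean-square deviation between two equal-length lists of
    (x, y, z) coordinates. Assumes the sets are already superposed;
    no fitting is done here."""
    if len(p) != len(q) or not p:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum((a - b) ** 2 for u, v in zip(p, q) for a, b in zip(u, v))
    return math.sqrt(squared / len(p))


identity = percent_identity("ACDE-F", "ACDQGF")
deviation = rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                 [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)])
print(identity, deviation)
```

Nothing comparably simple exists for function, which is the point of the last bullet.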

  33. Enzyme Commission / EC numbers • (EC numbers NOT European Commission) • Authorized by International Union of Biochemistry and Commission on Enzyme Nomenclature • EC set up by International Union of Biochemistry in 1955. • Report in 1961, modified 1964, several supplements since then. • Published as book, now available on web

  34. What does EC classify • Enzyme nomenclature • Classification of reactions catalysed by enzymes • NOT a set of assignment of function to proteins – That is a different task • (Note that Gene Ontology – another classification scheme – also does not assign functions to proteins)

  35. Enzyme Commission numbers • Four-level hierarchy • Example: isopentenyl-diphosphate ∆-isomerase EC number 5.3.3.2: • 5 = general category (of isomerases) • 5.3 = intramolecular oxidoreductases • 5.3.3 = enzymes that transpose C=C bonds • 5.3.3.2 = specific reaction • EC classifies reactions, names enzymes that catalyse reactions, does not name proteins.
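Because EC numbers form a four-level hierarchy, one simple way to compare two of them is to count how many leading levels agree. The sketch below assumes that convention; the function names are hypothetical, and a "-" placeholder (used for incompletely specified EC numbers) is treated as matching nothing.

```python
def parse_ec(ec):
    """Split an EC number such as 'EC 5.3.3.2' into its four levels."""
    text = ec.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four levels in EC number: {ec!r}")
    return tuple(parts)


def shared_levels(ec1, ec2):
    """Number of leading EC levels (0 to 4) on which two EC numbers agree."""
    count = 0
    for a, b in zip(parse_ec(ec1), parse_ec(ec2)):
        if a != b or a == "-":
            break
        count += 1
    return count


print(shared_levels("EC 5.3.3.2", "5.3.3.2"))  # same reaction
print(shared_levels("EC 5.3.3.2", "5.3.3.1"))  # same sub-subclass
print(shared_levels("EC 5.3.3.2", "5.3.1.1"))  # same subclass only
print(shared_levels("EC 5.3.3.2", "1.1.1.1"))  # different class
```

This compares reactions, not proteins: a protein that catalyses several reactions carries several EC numbers, so a protein-level comparison needs a rule for sets.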

  36. Gene Ontology • EC limited to enzymes • Gene Ontology consortium produced new, more general classification of protein function • Three independent categories: • Molecular function (overlaps EC) • Biological process • Subcellular location • GO: not tree structure, directed acyclic graph

  37. Gene Ontology project • Initiated by Michael Ashburner (early 1990’s). • Has since grown, become de facto standard • References: • Lewis, S.E. (2004). Gene Ontology: looking backwards and forwards. Genome Biology 6:103. • Ashburner, M. (2006). Won for All: How the Drosophila Genome Was Sequenced. Cold Spring Harbor Laboratory Press.

  38. What is an ontology? • Specification of how to describe a body of knowledge • Nomenclature (fixed vocabulary) • Rules of syntax of terms • Types of relationships among entities: • ‘Is a’: for instance: ‘A cat is a mammal.’ • ‘Part of’: for instance: ‘A tail is part of a cat.’

  39. What is an ontology? • Types of relationships among entities: • ‘Is a’: for instance: ‘A cat is a mammal.’ • ‘Part of’: for instance: ‘A tail is part of a cat.’ • Note that ‘A cat is a mammal. A mammal is an animal’ implies that ‘A cat is an animal’ • But ‘A tail is part of a cat. A cat is a mammal.’ does NOT imply that a tail is a mammal.
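The inference rules on slide 39 can be checked mechanically. The sketch below encodes the cat example as two small relation tables (hypothetical names). It treats ‘is a’ as transitive, and, as an added assumption taken from how GO-style reasoning composes relations, lets one ‘part of’ step followed by ‘is a’ steps yield ‘part of’, never ‘is a’.

```python
# Direct relations from the slide's example.
IS_A = {"cat": {"mammal"}, "mammal": {"animal"}}
PART_OF = {"tail": {"cat"}}


def is_a_ancestors(term):
    """All terms reachable from `term` by following 'is a' links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in IS_A.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_a(child, parent):
    """True if 'child is a parent' follows by transitivity of 'is a'."""
    return parent in is_a_ancestors(child)


def part_of(part, whole):
    """True if `part` is directly part of some X and X is `whole`
    or X 'is a' `whole` (one 'part of' step, then 'is a' steps)."""
    for direct in PART_OF.get(part, ()):
        if direct == whole or whole in is_a_ancestors(direct):
            return True
    return False


print(is_a("cat", "animal"))      # follows from cat -> mammal -> animal
print(is_a("tail", "mammal"))     # does not follow
print(part_of("tail", "mammal"))  # a tail is part of a mammal
```

Keeping the relation type attached to each edge is what prevents the invalid conclusion that a tail is a mammal.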

  40. Gene Ontology • EC limited to enzymes • Gene Ontology consortium produced new, more general classification of protein function • Three independent categories: • Molecular function (overlaps EC) • Biological process • Subcellular location • GO: not tree structure, directed acyclic graph

  42. GO classification of isopentenyl-diphosphate ∆-isomerase

  43. Several groups have measured relationship between sequence divergence and functional divergence using EC classification • Example: Todd, Orengo & Thornton, JMB 2001 • For enzymes, sequence identity > 40%, all four EC numbers conserved • sequence identity > 30% three levels of EC numbers conserved for 70% of pairs • How can this work be extended to GO classification?

  44. Several groups have measured relationship between sequence divergence and functional divergence using EC classification • How to define metric on functions? • Distal GO-IDs • How to measure distance between SETS of GO-IDs

  45. How to define metric on functions?

  46. Distal GO-IDs

  47. How to measure distance between SETS of GO-IDs
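The transcript does not preserve the metric that slides 45 to 47 defined, so the following is only one plausible sketch, not the authors' method. Assumptions, all hypothetical: the ontology is a rooted directed acyclic graph given as a child-to-parents table; the distance between two terms is the shortest path through a common ancestor; the "distal" terms of a set are those that are not ancestors of another term in the set; and the distance between two sets is the average best-match distance between their distal terms. The term names are made up and are not real GO identifiers.

```python
from collections import deque

# Toy DAG: each term maps to its direct parents (one term has two parents).
PARENTS = {
    "T:root": [],
    "T:binding": ["T:root"],
    "T:catalysis": ["T:root"],
    "T:ion_binding": ["T:binding"],
    "T:calcium_binding": ["T:ion_binding"],
    "T:kinase": ["T:catalysis"],
    "T:ca_dependent_kinase": ["T:kinase", "T:calcium_binding"],
}


def up_distances(term):
    """Minimum number of parent steps from `term` to each of its
    ancestors (the term itself is included at distance 0)."""
    dist = {term: 0}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for parent in PARENTS[current]:
            if parent not in dist:
                dist[parent] = dist[current] + 1
                queue.append(parent)
    return dist


def distal(terms):
    """Most specific terms: drop any term that is an ancestor of another."""
    terms = set(terms)
    covered = set()
    for t in terms:
        covered |= set(up_distances(t)) - {t}
    return terms - covered


def term_distance(a, b):
    """Shortest path between two terms through a common ancestor."""
    ua, ub = up_distances(a), up_distances(b)
    common = set(ua) & set(ub)
    if not common:
        raise ValueError("terms share no ancestor")
    return min(ua[c] + ub[c] for c in common)


def set_distance(s1, s2):
    """Average best-match distance between the distal terms of two sets."""
    d1, d2 = distal(s1), distal(s2)
    if not d1 or not d2:
        raise ValueError("both term sets must be non-empty")
    forward = [min(term_distance(a, b) for b in d2) for a in d1]
    backward = [min(term_distance(a, b) for a in d1) for b in d2]
    return (sum(forward) + sum(backward)) / (len(forward) + len(backward))


protein_1 = {"T:binding", "T:ion_binding", "T:calcium_binding"}
protein_2 = {"T:catalysis", "T:kinase"}
print(distal(protein_1))
print(set_distance(protein_1, protein_2))
```

Reducing each annotation set to its distal terms first avoids counting a specific term and its implied ancestors as separate functions.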

  48. Dependence of function divergence on sequence divergence: the EF-hand family [Figure; axes: fraction of pairs, GO distance]
