BMMB597E Protein Evolution

tara

Presentation Transcript


  1. BMMB597E Protein Evolution: Protein classification

  2. Protein families
     • The first protein structures determined by X-ray crystallography, myoglobin and haemoglobin, were solved (in 1959-60) before the amino acid sequences were determined
     • It came as a surprise that the structures were quite similar
     • Soon it became clear, on the basis of both sequences and structures, that there were families of proteins

  3. myoglobin haemoglobin

  4. 50 years earlier, there were some hints …
     • E.T. Reichert & A.P. Brown. The differentiation and specificity of corresponding proteins and other vital substances in relation to biological classification and organic evolution: the crystallography of hemoglobins. (Carnegie Institution of Washington, 1909)
     • Crystallography 3 years before the discovery of X-ray diffraction?

  5. Reichert and Brown studied interfacial angles in haemoglobin crystals
     • Steno’s law (1669): different crystals of the same substance may have different sizes and shapes, but the angles between faces are constant for each substance
     • They found that the angles differed from species to species
     • Similarities in values of interfacial angles were consistent with the classical taxonomic tree
     • They even found differences between oxy- and deoxyhaemoglobin

  6. Most premature scientific result ever?
     • These results implied:
       • That proteins adopted (or at least could adopt) unique structures, to form a crystal
       • That protein structures varied between species
       • That this variation was parallel with the evolution of the species
       • That proteins could change structure as a result of changes in state of ligation
     • In 1909!

  7. M.O. Dayhoff
     • Pioneer of bioinformatics
     • Collected protein sequences
     • First curated ‘database’
     • Recognized that proteins form families, on the basis of amino acid sequences
     • Computational sequence alignments
     • First evolutionary tree
     • First amino-acid substitution matrix (PAM; later largely superseded by BLOSUM)

  8. Can relationships among proteins be extended beyond families?
     • Families = sets of proteins with such obvious similarities that we assume that they are related
     • One question: how much similarity do we need to believe in a relationship?
     • How far can evolution go?
     • Convergent evolution?
     • Cautionary tale: chymotrypsin / subtilisin

  9. Chymotrypsin-subtilisin
     • Both are proteolytic enzymes
     • Chymotrypsin is mammalian
     • Subtilisin is from B. subtilis
     • Both have catalytic triads
     • Same function, same mechanism
     • Sequences 12% similar (near noise level)
     • However, structures show them to be unrelated
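
A figure such as "12% similar" is a pairwise score computed over an alignment. The following is a minimal sketch, assuming the score is percent identity over aligned columns in which neither sequence has a gap (other conventions exist and give different numbers). The function name and the toy sequences are illustrative and do not come from the lecture.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0   # columns with a residue in both sequences
    identical = 0  # compared columns where the residues match
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy aligned pair (made up for illustration): 4 matches over 5 gap-free columns.
toy_score = percent_identity("ACDE-G", "ACQEFG")
print(f"{toy_score:.1f}% identical")
```

With only 20 amino acids, two unrelated sequences still match at a small fraction of positions by chance, which is why a value near 12% cannot by itself establish or exclude a relationship.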

  10. Chymotrypsin / Subtilisin

  11. Catalytic triad in serine proteinases

  12. Chymotrypsin and subtilisin have similar catalytic triads

  13. How can we classify proteins that belong to families?
      • Align sequences
      • Calculate a phylogenetic tree (there are various ways to do this; all depend on the sequence alignment)
      • Usually, the phylogenetic tree of homologous proteins from different species follows the phylogenetic tree based on classical taxonomy
      • That is reassuring
      • But what happens as divergence proceeds?
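
The align-then-build-a-tree procedure can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method used in the lecture: the sequences are taken as already aligned, distance is the simple fraction of differing residues (p-distance), and the tree is built by UPGMA (average linkage), chosen only for brevity. UPGMA assumes roughly constant rates of change, which real protein families often violate. The four toy sequences and their species labels are made up.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of differing residues over gap-free aligned columns."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma(names, dist):
    """Average-linkage clustering; returns the tree as nested tuples.

    dist maps frozenset({name_i, name_j}) to the distance between them.
    """
    clusters = {name: (name, 1) for name in names}  # key -> (subtree, leaf count)
    d = dict(dist)
    while len(clusters) > 1:
        # Merge the closest pair of current clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        (tree_a, n_a), (tree_b, n_b) = clusters.pop(a), clusters.pop(b)
        merged = f"({a},{b})"
        # Distance to every other cluster is the size-weighted average.
        for c in clusters:
            d[frozenset((merged, c))] = (
                n_a * d[frozenset((a, c))] + n_b * d[frozenset((b, c))]
            ) / (n_a + n_b)
        clusters[merged] = ((tree_a, tree_b), n_a + n_b)
    return next(iter(clusters.values()))[0]


# Hypothetical pre-aligned toy sequences (not real proteins).
alignment = {
    "human": "MKTAYIAKQR",
    "mouse": "MKTAYIAKQK",
    "chicken": "MKSAYLAKQK",
    "yeast": "MRSGFLSRNK",
}
names = list(alignment)
dist = {
    frozenset((a, b)): p_distance(alignment[a], alignment[b])
    for a, b in combinations(names, 2)
}
tree = upgma(names, dist)
print(tree)
```

In this toy case the two most similar sequences are joined first and the most divergent one is attached last, which is the pattern one hopes to see when a protein tree follows the taxonomic tree.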

  14. How can we classify proteins that do not obviously belong to families?
      • Base this on structure rather than sequence
      • Structural similarities are maintained as divergence proceeds, better than sequence similarities
      • For closely related proteins, expect no difference between sequence-based and structure-based classification
      • How far can classification be extended?

  15. SCOP: Structural Classification of Proteins
      • Idea of A.G. Murzin, based on old work by C. Chothia and M. Levitt
      • Even if two proteins are not obviously homologous, they may share structural features, to a greater or lesser degree
      • For instance, the secondary structures of some proteins are only α-helices
      • Others have β-sheets but no α-helices

  16. SCOP
      • SCOP is a database that gives a hierarchical classification of all protein domains
      • Recall that a domain is a compact subunit of a protein structure that ‘looks as if’ it would have independent stability
      [Figure: fragment of fibronectin]

  17. Dissection of structure into domains
      • It is not always quite so obvious how to divide a protein into domains
      • There is some (not a lot of) room for argument
      • Note that sometimes the chain passes back and forth between domains
      • In these cases one or both domains do not consist entirely of a consecutive set of residues
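
Because a domain need not be one consecutive run of residues, a practical representation stores each domain as a list of residue ranges rather than a single start and end. The following is a minimal sketch under that assumption; the class, the helper function and the residue numbers are hypothetical and do not describe any real protein.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Domain:
    name: str
    # Inclusive (start, end) residue ranges; more than one range means the
    # chain leaves the domain and later returns to it.
    segments: Tuple[Tuple[int, int], ...]

    def is_contiguous(self) -> bool:
        return len(self.segments) == 1

    def contains(self, residue: int) -> bool:
        return any(start <= residue <= end for start, end in self.segments)

    def size(self) -> int:
        return sum(end - start + 1 for start, end in self.segments)


def domains_overlap(a: Domain, b: Domain) -> bool:
    """True if any residue is assigned to both domains."""
    return any(
        s1 <= e2 and s2 <= e1
        for s1, e1 in a.segments
        for s2, e2 in b.segments
    )


# Hypothetical two-domain chain: the chain starts in domain 1, crosses into
# domain 2, then returns to complete domain 1.
domain_1 = Domain("domain 1", ((1, 90), (250, 330)))
domain_2 = Domain("domain 2", ((91, 249),))

print(domain_1.is_contiguous(), domain_2.is_contiguous())
print(domain_1.size(), domain_2.size())
```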

  18. lactoferrin

  19. The SCOP, CATH and DALI databases classify protein structures
      • SCOP (Structural Classification of Proteins)
      • CATH (Class, Architecture, Topology, Homologous superfamily)
      • DALI Database
      • These web sites have many useful features:
        • information-retrieval engines, including search by keyword or sequence
        • presentation of structure pictures
        • links to other related sites, including bibliographical databases

  20. SCOP http://www.scop.mrc-lmb.cam.ac.uk
      • SCOP organizes protein structures in a hierarchy according to evolutionary origin and structural similarity.
      • Domains are extracted from the Protein Data Bank entries.
      • Sets of domains are grouped into families: sets of domains for which similarities in structure, function and sequence imply a common evolutionary origin.

  21. The SCOP hierarchy
      • Families that share a common structure, or even a common structure and a common function, but lack adequate sequence similarity – so that the evidence for evolutionary relationship is suggestive but not compelling – are grouped into superfamilies.
      • Superfamilies that share a common folding topology, for at least a large central portion of the structure, are grouped as folds.
      • Finally, each fold group falls into one of the general classes.

  22. Major classes in SCOP
      • α – secondary structure all helical
      • β – secondary structure all sheet
      • α/β – contain β-α-β supersecondary structure
      • α+β – helices and sheets, but in different parts of the structure
      • ‘small proteins’ – which often have little secondary structure and are held together by disulphide bridges or ligands (for instance, wheat-germ agglutinin)

  23. Summary of SCOP hierarchy
      • Class
      • Fold
      • Superfamily
      • Family
      • Domain

  24. SCOP classification of flavodoxin
      Protein: Flavodoxin from Clostridium beijerinckii [TaxId: 1520]
      Lineage:
        Root: scop
        Class: Alpha and beta proteins (a/b) [51349]
          Mainly parallel beta sheets (beta-alpha-beta units)
        Fold: Flavodoxin-like [52171]
          3 layers, a/b/a; parallel beta-sheet of 5 strand, order 21345
        Superfamily: Flavoproteins [52218]
        Family: Flavodoxin-related [52219]
          binds FMN
        Protein: Flavodoxin [52220]
        Species: Clostridium beijerinckii [TaxId: 1520] [52226]
      PDB Entry Domains:
        5nul, complexed with fmn; mutant, chain a [31191]
        2fax, complexed with fmn; mutant, chain a [31194]
        … many others

  25. Clostridium beijerinckii flavodoxin (stereo pair)

  26. Flavodoxin and NADPH-cytochrome P450 reductase: same superfamily, different family

  27. Flavodoxin and CHEY: same fold, different superfamily

  28. Flavodoxin and spinach ferredoxin reductase: same class, different folds

  29. Flavodoxin in the SCOP hierarchy
      To give some idea of the nature of the similarities expressed by the different levels of the hierarchy:
      • Flavodoxin from Clostridium beijerinckii and NADPH-cytochrome P450 reductase are in the same superfamily, but different families.
      • Flavodoxin and the signal transduction protein CHEY are in the same fold category, but different superfamilies.
      • Flavodoxin and spinach ferredoxin reductase are in the same class (α/β) but have different folds.
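
The three comparisons can be expressed as one operation on the hierarchy: find the deepest level at which two lineages still agree. The following is a minimal sketch, assuming each domain is described by a (class, fold, superfamily, family) record. The flavodoxin entry uses the names from its SCOP lineage; the entries for the other three proteins use placeholder labels, because only their relationship to flavodoxin is stated here, not their actual SCOP names.

```python
from typing import NamedTuple, Optional

LEVELS = ("class", "fold", "superfamily", "family")


class ScopLineage(NamedTuple):
    cls: str
    fold: str
    superfamily: str
    family: str


def deepest_shared_level(a: ScopLineage, b: ScopLineage) -> Optional[str]:
    """Name of the lowest level at which two lineages still agree, or None."""
    shared = None
    for level, x, y in zip(LEVELS, a, b):
        if x != y:
            break
        shared = level
    return shared


flavodoxin = ScopLineage("a/b", "Flavodoxin-like", "Flavoproteins", "Flavodoxin-related")
# Placeholder labels below are assumptions standing in for the real SCOP names.
p450_reductase = ScopLineage("a/b", "Flavodoxin-like", "Flavoproteins", "family X (placeholder)")
chey = ScopLineage("a/b", "Flavodoxin-like", "superfamily Y (placeholder)", "family Y (placeholder)")
ferredoxin_reductase = ScopLineage("a/b", "fold Z (placeholder)", "superfamily Z (placeholder)", "family Z (placeholder)")

for name, other in [
    ("NADPH-cytochrome P450 reductase", p450_reductase),
    ("CHEY", chey),
    ("spinach ferredoxin reductase", ferredoxin_reductase),
]:
    print(name, "->", deepest_shared_level(flavodoxin, other))
```

Comparing from the top down and stopping at the first disagreement reflects the nesting of the hierarchy: two domains cannot share a family without also sharing the superfamily, fold and class above it.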

  30. CATH presents a classification scheme similar to that of SCOP
      • CATH = Class, Architecture, Topology, Homologous superfamily, the levels of its hierarchy.
      • In CATH, proteins with very similar structures, sequences and functions are grouped into sequence families.
      • A homologous superfamily contains proteins for which similarity of sequence and structure gives evidence of common ancestry.
      • A topology or fold family comprises sets of homologous superfamilies that share the spatial arrangement and connectivity of helices and strands.
      • Architectures are groups of proteins with similar arrangements of helices and sheets, but with different connectivity. For instance, different four-α-helix bundles with different connectivities would share the same architecture but not the same topology in CATH.
      • The general classes of architectures in CATH are: α, β, α-β (subsuming the α/β and α+β classes of SCOP), and domains of low secondary structure content.

  31. Do different classification schemes agree?
      • To classify protein structures (or any other set of objects) you need to be able to measure the similarities among them.
      • The measure of similarity induces a tree-like representation of the relationships.
      • CATH, SCOP, DALI and the others agree, for the most part, on what is similar, and the tree structures of their classifications are therefore also similar.
      • However, even an objective measure of similarity does not specify how to define the different levels of the hierarchy.
      • These are interpretative decisions, and any apparent differences in the names and distinctions between the levels disguise the underlying general agreement about what is similar and what is different.
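
The point that a similarity measure fixes the tree but not the levels can be seen in a small sketch. It assumes single-linkage grouping (two items fall in the same group if a chain of pairs at or above the cutoff connects them), and the similarity scores are made up for illustration. None of this is taken from SCOP, CATH or DALI.

```python
def groups_at_cutoff(items, similarity, cutoff):
    """Single-linkage groups of items for a given similarity cutoff."""
    parent = {x: x for x in items}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in similarity.items():
        if score >= cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for x in items:
        groups.setdefault(find(x), set()).add(x)
    return sorted(sorted(g) for g in groups.values())


# Made-up pairwise similarity scores between four hypothetical domains.
items = ["A", "B", "C", "D"]
similarity = {
    ("A", "B"): 0.90,
    ("A", "C"): 0.60,
    ("B", "C"): 0.55,
    ("A", "D"): 0.20,
    ("B", "D"): 0.25,
    ("C", "D"): 0.30,
}

for cutoff in (0.8, 0.5, 0.1):
    print(cutoff, groups_at_cutoff(items, similarity, cutoff))
```

The same scores give three different partitions depending on the cutoff. Two curators who agree on every score can still draw the line between, say, family and superfamily at different heights in the tree.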
