
CS177 Lecture 6 Computational Aspects of Protein Structure II

CS177 Lecture 6 Computational Aspects of Protein Structure II. Tom Madej 10.17.05. Research news (Nature 10.21.04). Another milestone for the Human Genome Project. Fills in approx. 99% of the “gene rich” portion of the genome (10% more than the 2001 drafts).


Presentation Transcript


  1. CS177 Lecture 6: Computational Aspects of Protein Structure II. Tom Madej, 10.17.05

  2. Research news (Nature 10.21.04) • Another milestone for the Human Genome Project. • Fills in approx. 99% of the “gene rich” portion of the genome (10% more than the 2001 drafts). • Only 341 remaining gaps, formerly hundreds of thousands. • New estimate of the number of genes: 20,000-25,000. • Megabase deletions result in viable mice! • Researchers deleted 1.5 Mb and 0.8 Mb portions of the mouse genome, non-coding regions, and the mice seem to be fine!

  3. Nature Oct. 21, 2004, 931-945

  4. Overview of lecture • Protein structure • General principles • Structure hierarchy • Supersecondary structures • Superfolds and examples: TIM barrels, OB fold • Protein structure comparison algorithms • VAST (Vector Alignment Search Tool) • CE (Combinatorial Extension) • Protein fold classification databases • SCOP (Structural Classification of Proteins) • CATH (Class, Architecture, Topology, Homologous superfamily)

  5. General principles • Most protein structures are composed of two types of regular structural elements interconnected by less well-structured regions. • Regular secondary structure elements (SSEs): α-helices and β-strands. • Irregular regions: loops or coil. • A pair of SSEs positioned next to each other in space may be parallel or anti-parallel.

  6. General principles (cont.) • Helices are stabilized by “internal” hydrogen bonds. • Hydrogen bonds will form between an adjacent pair of strands. • Strands will form larger structures such as β-sheets or β-barrels. • Due to the residue side chains, there are favored packing angles between helices/helices, helices/sheets, and sheets/sheets.

  7. Examples of protein architecture • Architecture refers to the arrangement and orientation of SSEs, but not to the connectivity. • Examples shown: a β-sheet with all pairs of strands parallel, and a β-sheet with all pairs of strands anti-parallel.

  8. Examples of protein topology • Topology refers to the manner in which the SSEs are connected. • Example shown: two β-sheets (all parallel) with different topologies.
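
The distinction between architecture and topology on slides 7 and 8 can be illustrated with a small sketch. The encoding here is an assumption made for illustration, not something from the lecture: a sheet is written as a list of `(strand_number, direction)` pairs in spatial order across the sheet, where `strand_number` is the strand's position along the sequence and `direction` is `+1` or `-1`. The function names are hypothetical.

```python
def neighbor_pattern(sheet):
    """Return 'P' (parallel) or 'A' (anti-parallel) for each spatially adjacent pair."""
    return ["P" if d1 == d2 else "A"
            for (_, d1), (_, d2) in zip(sheet, sheet[1:])]


def same_architecture(sheet1, sheet2):
    """Same arrangement and orientation of strands, ignoring connectivity.

    A sheet can be read from either edge, so the reversed pattern also counts.
    """
    p1, p2 = neighbor_pattern(sheet1), neighbor_pattern(sheet2)
    return p1 == p2 or p1 == p2[::-1]


def same_topology(sheet1, sheet2):
    """Same architecture and the same sequence order of strands across the sheet."""
    o1 = [n for n, _ in sheet1]
    o2 = [n for n, _ in sheet2]
    return same_architecture(sheet1, sheet2) and (o1 == o2 or o1 == o2[::-1])


# Two all-parallel four-stranded sheets (made-up examples).
sheet_a = [(1, +1), (2, +1), (3, +1), (4, +1)]   # strands in sequence order
sheet_b = [(2, +1), (1, +1), (3, +1), (4, +1)]   # strands 1 and 2 swapped in space

print(same_architecture(sheet_a, sheet_b))  # both sheets are all-parallel
print(same_topology(sheet_a, sheet_b))      # but the strands are connected differently
```

This simplified encoding ignores features such as the handedness of crossover connections, so it is only a starting point for thinking about topology diagrams.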

  9. Exercise • Take a look at 1r7sA in Cn3D. • Draw a topology diagram showing the way the strands are connected.

  10. Angles between SSEs in contact • The data on the next 3 slides give the cosine of the angle between a pair of SSE vectors. • The SSEs were required to be “in contact”, i.e. within 10 Å of each other. • Note: The SSEs are not necessarily consecutive in the sequence!
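
The quantities on this slide can be computed as in the following minimal sketch. It assumes each SSE is represented by a vector given as a `(start, end)` pair of 3D points. The slide does not say how the 10 Å distance was measured, so the contact test below (minimum distance between points sampled along the two segments) is an assumption, and all function names are hypothetical.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def cos_angle(sse1, sse2):
    """Cosine of the angle between two SSE vectors, each given as (start, end)."""
    v1 = sub(sse1[1], sse1[0])
    v2 = sub(sse2[1], sse2[0])
    return dot(v1, v2) / (norm(v1) * norm(v2))


def sample(sse, n=11):
    """Return n evenly spaced points along the segment from start to end."""
    start, end = sse
    return [tuple(s + (e - s) * i / (n - 1) for s, e in zip(start, end))
            for i in range(n)]


def in_contact(sse1, sse2, cutoff=10.0):
    """Assumed contact test: any pair of sampled points within `cutoff` angstroms."""
    return any(norm(sub(p, q)) <= cutoff
               for p in sample(sse1) for q in sample(sse2))


# Made-up coordinates: two neighboring anti-parallel strands and a distant helix.
strand_a = ((0.0, 0.0, 0.0), (0.0, 0.0, 15.0))
strand_b = ((4.8, 0.0, 15.0), (4.8, 0.0, 0.0))
helix_far = ((40.0, 0.0, 0.0), (40.0, 15.0, 0.0))

print(cos_angle(strand_a, strand_b), in_contact(strand_a, strand_b))
print(cos_angle(strand_a, helix_far), in_contact(strand_a, helix_far))
```

A cosine near +1 corresponds to a parallel pair, near -1 to an anti-parallel pair, and near 0 to a perpendicular pair.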

  11. General packing of SSEs… • SSEs tend to be oriented either parallel or anti-parallel to each other. • For strand-strand packing there is a stronger tendency to be parallel or anti-parallel than for helix-helix packing. • For helix-strand packing there is a strong tendency to be anti-parallel. • This applies to SSEs that are relatively close to each other.

  12. Examples of structures formed by β-strands • Triosephosphate isomerase 7timA • Retinol binding protein 1rbp • Porin 1oh2P

  13. Higher level organization • A single protein may consist of multiple domains. Examples: 1liy A, 1bgc A. The domains may or may not perform different functions. • Proteins may form higher-level assemblies. Useful for complicated biochemical processes that require several steps, e.g. processing/synthesis of a molecule. Example: 1l1o chains A, B, C.

  14. Example: Replication Protein A • RPA binds to ssDNA and is involved in recombination, replication, and repair. • It is a heterotrimer, consisting of three subunit proteins that bind together. See structure 1l1o. • E. Bochkareva et al. The EMBO Journal (2002) 21 1855-1863

  15. Supersecondary structures • β-hairpin • α-hairpin • βαβ-unit • β4 Greek key • βα Greek key

  16. Supersecondary structure: simple units G.M. Salem et al. J. Mol. Biol. (1999) 287 969-981

  17. Supersecondary structure: Greek key motifs G.M. Salem et al. J. Mol. Biol. (1999) 287 969-981

  18. Examples of β4 Greek key motif • 1hk0 Human Gamma-D Crystallin; residues 32 thru 64 in domain 1. • OB fold (we’ll see this fold later).

  19. Examples of βα Greek key motif • 1bgw Topoisomerase; residues 487 thru 540 in domain 5. • 1ris Ribosomal protein S6.

  20. Protein folds • There is a continuum of similarity! • Fold definition: two folds are similar if they have a similar arrangement of SSEs (architecture) and connectivity (topology). Sometimes a few SSEs may be missing. • Fold classification: To get an idea of the variety of different folds, one must adjust for sequence redundancy and also try to correctly assign homologs that have low sequence identity (e.g. below 25%).
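
The 25% sequence-identity figure mentioned above can be made concrete with a short sketch. It assumes the two sequences have already been aligned (gaps written as `-`) and computes identity over columns where neither sequence has a gap. Other denominators are also in use, so this choice is an assumption, as are the function names and the example sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over alignment columns without gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def is_low_identity(aligned_a, aligned_b, threshold=25.0):
    """True if the pair falls below the identity threshold (25% by default)."""
    return percent_identity(aligned_a, aligned_b) < threshold


# Made-up aligned fragments, for illustration only.
seq_a = "MKT-AYIAKQR"
seq_b = "MRSLAY-GKNR"
print(percent_identity(seq_a, seq_b))
print(is_low_identity(seq_a, seq_b))
```

Pairs flagged as low identity are the ones where sequence comparison alone is unreliable and structural comparison becomes more useful for assigning homologs.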

  21. Superfolds (Orengo, Jones, Thornton) • Distribution of fold types is highly non-uniform. • There are about 10 types of folds, the superfolds, to which about 30% of the other folds are similar. There are many examples of “isolated” fold types. • Superfolds are characterized by a wide range of sequence diversity and span a range of non-similar functions. • The evolutionary relationships of the superfolds remain a research question, i.e. do they arise by divergent or convergent evolution?

  22. Superfolds and examples • Globin: 1hlm sea cucumber hemoglobin; 1cpcA phycocyanin; 1colA colicin • α-up-down: 2hmqA hemerythrin; 256bA cytochrome B562; 1lpe apolipoprotein E3 • Trefoil: 1i1b interleukin-1β; 1aaiB ricin; 1tie erythrina trypsin inhibitor • TIM barrel: 1timA triosephosphate isomerase; 1ald aldolase; 5rubA rubisco • OB fold: 1quqA replication protein A 32kDa subunit; 1mjc major cold-shock protein; 1bcpD pertussis toxin S5 subunit • α/β doubly-wound: 5p21 Ras p21; 4fxn flavodoxin; 3chy CheY • Immunoglobulin: 2rhe Bence-Jones protein; 2cd4 CD4; 1ten tenascin • UB αβ roll: 1ubq ubiquitin; 1fxiA ferredoxin; 1pgx protein G • Jelly roll: 2stv tobacco necrosis virus; 1tnfA tumor necrosis factor; 2ltnA pea lectin • Plaitfold (split αβ sandwich): 1aps acylphosphatase; 1fxd ferredoxin; 2hpr histidine-containing phosphocarrier

  23. TIM barrels • Classified into 21 families in the CATH database. • Mostly enzymes, but participate in a diverse collection of different biochemical reactions. • There are intriguing common features across the families, e.g. the active site is always located at the C-terminal end of the barrel.

  24. N. Nagano et al. J. Mol. Biol. (2002) 321 741-785

  25. TIM barrel evolutionary relationships (Nagano, Orengo, Thornton) • Sequence analysis with advanced programs such as PSI-BLAST and IMPALA has identified further relationships among the families. • Further interesting similarities have been observed from careful comparison of structures, e.g. a phosphate binding site commonly formed by loops 7, 8 and a small helix. • In summary, there is evidence for evolutionary relationships between 17 of the 21 families.

  26. OB (oligonucleotide/oligosaccharide-binding) fold • 5-stranded β-barrel with Greek key topology. • All OB folds have the same binding face that is involved in their biochemistry.

  27. V. Arcus Curr. Opinion Struct. Biol. (2002) 12 794-801

  28. OB evolutionary relationships • SCOP lists 9 superfamilies. • Bacterial enterotoxin superfamily consists of two families, almost certainly evolutionarily related. • Nucleic acid-binding superfamily has 11 families; if they are evolutionarily related, the ancestral protein would come from the LUCA (Last Universal Common Ancestor). • Evidence for common ancestry of all OB folds is probably weaker than for TIM barrels.

  29. Protein structure comparison • How to compare 3D protein structures? • Analogous computational considerations to sequence comparison, e.g. accuracy, efficiency for database searches, statistical significance of results, etc. • Additional complication: working with atomic coordinates in 3D space!

  30. Some protein structure comparison methods • VAST (Vector Alignment Search Tool, NCBI) • CE (Combinatorial Extension, RCSB/PDB) • DALI (EBI)

  31. VAST outline • Parse protein structures into SSEs (helices and strands). • Fit vectors to SSEs. • To compare a pair of proteins, attempt to superpose as many vectors as possible, subject to constraints. • Evaluate the vector alignment for statistical significance (compute an E-value). • If the vector alignment is significant, then proceed to a more detailed residue-to-residue alignment (“refined alignment”).
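
The "fit vectors to SSEs" step can be illustrated with a minimal sketch. This is not VAST's actual fitting procedure, which the slide does not describe. It is one common way to fit a line to a set of points: take the principal axis of the Cα coordinates, found here by power iteration on the 3x3 scatter matrix. The function name and the test coordinates are hypothetical.

```python
import math


def fit_sse_vector(coords, iterations=100):
    """Fit a vector to a list of (x, y, z) points; returns (start, end).

    Assumes the points are ordered from N- to C-terminus and that the
    first-to-last displacement is not perpendicular to the principal axis.
    """
    if len(coords) < 2:
        raise ValueError("need at least two points")
    n = len(coords)
    centroid = tuple(sum(p[i] for p in coords) / n for i in range(3))
    centered = [tuple(p[i] - centroid[i] for i in range(3)) for p in coords]
    # 3x3 scatter matrix of the centered coordinates.
    scatter = [[sum(p[i] * p[j] for p in centered) for j in range(3)]
               for i in range(3)]
    # Start from the first-to-last displacement so the fitted vector keeps
    # the N-to-C direction of the SSE.
    v = tuple(coords[-1][i] - coords[0][i] for i in range(3))
    length = math.sqrt(sum(x * x for x in v))
    if length == 0.0:
        raise ValueError("first and last points coincide")
    v = tuple(x / length for x in v)
    # Power iteration converges to the dominant eigenvector (principal axis).
    for _ in range(iterations):
        w = tuple(sum(scatter[i][j] * v[j] for j in range(3)) for i in range(3))
        length = math.sqrt(sum(x * x for x in w))
        if length == 0.0:
            break
        v = tuple(x / length for x in w)
    # Project the first and last points onto the axis to get the endpoints.
    t_first = sum(centered[0][i] * v[i] for i in range(3))
    t_last = sum(centered[-1][i] * v[i] for i in range(3))
    start = tuple(centroid[i] + t_first * v[i] for i in range(3))
    end = tuple(centroid[i] + t_last * v[i] for i in range(3))
    return start, end


# Made-up zigzag points roughly along the z axis, loosely resembling a strand.
points = [(0.5 * (-1) ** i, 0.0, 3.3 * i) for i in range(6)]
start, end = fit_sse_vector(points)
print(start, end)
```

The resulting `(start, end)` pairs are the kind of input a vector-level comparison would work with before any residue-level refinement.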

  32. Two proteins with vectors assigned to SSEs: 3chy and 1ipfA

  33. VAST comparison of 3chy and 1ipfA Vector superposition Refined alignment

  34. SCOP (Structural Classification of Proteins) • http://scop.mrc-lmb.cam.ac.uk/scop/ • Levels of the SCOP hierarchy: • Family: clear evolutionary relationship • Superfamily: probable common evolutionary origin • Fold: major structural similarity
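
The nesting of the SCOP levels listed above (family within superfamily within fold) can be sketched as a small data structure. The class and function names are hypothetical. The fold and superfamily labels are taken from the OB-fold slide, while the family labels are placeholders rather than actual SCOP entries.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScopEntry:
    fold: str
    superfamily: str
    family: str


def closest_shared_level(a: ScopEntry, b: ScopEntry) -> Optional[str]:
    """Return the most specific level two entries share, or None."""
    if a.fold != b.fold:
        return None            # no major structural similarity recorded
    if a.superfamily != b.superfamily:
        return "fold"          # major structural similarity only
    if a.family != b.family:
        return "superfamily"   # probable common evolutionary origin
    return "family"            # clear evolutionary relationship


# Placeholder family names; fold and superfamily labels follow the OB-fold slide.
p1 = ScopEntry("OB fold", "Nucleic acid-binding", "family A")
p2 = ScopEntry("OB fold", "Nucleic acid-binding", "family B")
p3 = ScopEntry("OB fold", "Bacterial enterotoxin", "family C")

print(closest_shared_level(p1, p2))
print(closest_shared_level(p1, p3))
```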

  35. CATH (Class, Architecture, Topology, Homologous superfamily) • http://www.biochem.ucl.ac.uk/bsm/cath/
