CS177 Lecture 6 Computational Aspects of Protein Structure II

Tom Madej, 10.17.05

Research news (Nature 10.21.04)
  • Another milestone for the Human Genome Project.
    • Fills in approx. 99% of the “gene rich” portion of the genome (10% more than the 2001 drafts).
    • Only 341 remaining gaps, formerly hundreds of thousands.
    • New estimate of the number of genes: 20,000-25,000.
  • Megabase deletions result in viable mice!
    • Researchers deleted 1.5 Mb and 0.8 Mb portions of the mouse genome, non-coding regions, and the mice seem to be fine!
Overview of lecture
  • Protein structure
    • General principles
    • Structure hierarchy
    • Supersecondary structures
    • Superfolds and examples: TIM barrels, OB fold
  • Protein structure comparison algorithms
    • VAST (Vector Alignment Search Tool)
    • CE (Combinatorial Extension)
  • Protein fold classification databases
    • SCOP (Structural Classification of Proteins)
    • CATH (Class, Architecture, Topology, Homologous superfamily)
General principles
  • Most protein structures are composed of two types of regular structural elements interconnected by less well-structured regions.
  • Regular secondary structure elements (SSEs): α-helices and β-strands.
  • Irregular regions: loops or coil.
  • A pair of SSEs positioned next to each other in space may be parallel or anti-parallel.
General principles (cont.)
  • Helices are stabilized by “internal” hydrogen bonds.
  • Hydrogen bonds will form between an adjacent pair of strands.
  • Strands will form larger structures such as β-sheets or β-barrels.
  • Due to the residue side chains, there are favored packing angles between helices/helices, helices/sheets, and sheets/sheets.
Examples of protein architecture
  • β-sheet with all pairs of strands parallel
  • β-sheet with all pairs of strands anti-parallel
  • Architecture refers to the arrangement and orientation of SSEs, but not to the connectivity.
Examples of protein topology
  • Topology refers to the manner in which the SSEs are connected.
  • Two β-sheets (all parallel) with different topologies.
  • Take a look at 1r7sA in Cn3D.
  • Draw a topology diagram showing the way the strands are connected.
Angles between SSEs in contact
  • The data on the next 3 slides give the cosine of the angle between a pair of SSE vectors.
  • The SSEs were required to be “in contact”, i.e. within 10 Å of each other.
  • Note: The SSEs are not necessarily consecutive in the sequence!
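
As a rough illustration of the quantity described above, the following sketch computes the cosine of the angle between two SSE vectors and applies a 10 Å contact cutoff. The slide does not say how the distance between SSEs is measured; using the distance between SSE midpoints is an assumption made here, and all coordinates are made up.

```python
import math

def cosine_between(u, v):
    """Cosine of the angle between two 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def in_contact(mid_u, mid_v, cutoff=10.0):
    """Assumed contact test: SSE midpoints lie within `cutoff` angstroms."""
    return math.dist(mid_u, mid_v) <= cutoff

# Hypothetical direction vectors for two neighboring strands.
strand_a = (1.0, 0.0, 0.0)
strand_b = (-1.0, 0.0, 0.0)

print(cosine_between(strand_a, strand_b))            # -1.0 (anti-parallel)
print(in_contact((0.0, 0.0, 0.0), (0.0, 4.8, 0.0)))  # True
```

A cosine near +1 corresponds to a parallel pair and a cosine near -1 to an anti-parallel pair.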
General packing of SSEs…
  • SSEs tend to be oriented either parallel or anti-parallel to each other.
  • For strand-strand packing there is a stronger tendency to be parallel or anti-parallel than for helix-helix packing.
  • For helix-strand packing there is a strong tendency to be anti-parallel.
  • This applies to SSEs that are relatively close to each other.
Examples of structures formed by β-strands
  • Triosephosphate isomerase 7timA
  • Retinol binding protein 1rbp
  • Porin 1oh2P
Higher level organization
  • A single protein may consist of multiple domains. Examples: 1liy A, 1bgc A. The domains may or may not perform different functions.
  • Proteins may form higher-level assemblies. Useful for complicated biochemical processes that require several steps, e.g. processing/synthesis of a molecule. Example: 1l1o chains A, B, C.
Example: Replication Protein A

RPA binds to ssDNA, is involved in recombination, replication, and repair.

It is a heterotrimer, consisting of three subunit proteins that bind together.

See structure 1l1o.

E. Bochkareva et al. The EMBO Journal (2002) 21 1855-1863

Supersecondary structures
  • β-hairpin
  • α-hairpin
  • βαβ-unit
  • β4 Greek key
  • βα Greek key
Supersecondary structure: simple units

G.M. Salem et al. J. Mol. Biol. (1999) 287 969-981

Supersecondary structure: Greek key motifs

G.M. Salem et al. J. Mol. Biol. (1999) 287 969-981

Examples of β4 Greek key motif
  • 1hk0 Human Gamma-D Crystallin; residues 32 thru 64 in domain 1.
  • OB fold (we’ll see this fold later).
Examples of βα Greek key motif
  • 1bgw Topoisomerase; residues 487 thru 540 in domain 5.
  • 1ris Ribosomal protein S6.
Protein folds
  • There is a continuum of similarity!
  • Fold definition: two folds are similar if they have a similar arrangement of SSEs (architecture) and connectivity (topology). Sometimes a few SSEs may be missing.
  • Fold classification: To get an idea of the variety of different folds, one must adjust for sequence redundancy and also try to correctly assign homologs that have low sequence identity (e.g. below 25%).
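
To make the "low sequence identity" criterion concrete, here is a minimal sketch that computes percent identity from two already-aligned sequences and flags pairs below a threshold. The convention used (identical residues divided by aligned, ungapped columns) is only one of several in use and is an assumption here; the function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)

def is_low_identity(aligned_a, aligned_b, threshold=25.0):
    """True when the pair falls below the identity threshold (default 25%)."""
    return percent_identity(aligned_a, aligned_b) < threshold

# Toy aligned sequences (made up for illustration).
print(percent_identity("AAAA", "AAAT"))         # 75.0
print(is_low_identity("ACDEFGHI", "AWWWWWWW"))  # True (12.5% identity)
```

Pairs flagged this way are the ones where sequence alone is unlikely to reveal homology, so structural comparison carries most of the evidence.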
Superfolds (Orengo, Jones, Thornton)
  • Distribution of fold types is highly non-uniform.
  • There are about 10 types of folds, the superfolds, to which about 30% of the other folds are similar. There are many examples of “isolated” fold types.
  • Superfolds are characterized by wide sequence diversity and span a range of dissimilar functions.
  • It is a research question as to the evolutionary relationships of the superfolds, i.e. do they arise by divergent or convergent evolution?
Superfolds and examples
  • Globin: 1hlm sea cucumber hemoglobin; 1cpcA phycocyanin; 1colA colicin
  • α-up-down: 2hmqA hemerythrin; 256bA cytochrome B562; 1lpe apolipoprotein E3
  • Trefoil: 1i1b interleukin-1β; 1aaiB ricin; 1tie erythrina trypsin inhibitor
  • TIM barrel: 1timA triosephosphate isomerase; 1ald aldolase; 5rubA rubisco
  • OB fold: 1quqA replication protein A 32kDa subunit; 1mjc major cold-shock protein; 1bcpD pertussis toxin S5 subunit
  • α/β doubly-wound: 5p21 Ras p21; 4fxn flavodoxin; 3chy CheY
  • Immunoglobulin: 2rhe Bence-Jones protein; 2cd4 CD4; 1ten tenascin
  • UB αβ roll: 1ubq ubiquitin; 1fxiA ferredoxin; 1pgx protein G
  • Jelly roll: 2stv tobacco necrosis virus; 1tnfA tumor necrosis factor; 2ltnA pea lectin
  • Plaitfold (Split αβ sandwich): 1aps acylphosphatase; 1fxd ferredoxin; 2hpr histidine-containing phosphocarrier
TIM barrels
  • Classified into 21 families in the CATH database.
  • Mostly enzymes, but participate in a diverse collection of different biochemical reactions.
  • There are intriguing common features across the families, e.g. the active site is always located at the C-terminal end of the barrel.
TIM barrel evolutionary relationships (Nagano, Orengo, Thornton)
  • Sequence analysis with advanced programs such as PSI-BLAST and IMPALA has identified further relationships among the families.
  • Further interesting similarities observed from careful comparison of structures, e.g. a phosphate binding site commonly formed by loops 7, 8 and a small helix.
  • In summary, there is evidence for evolutionary relationships between 17 of the 21 families.
OB (oligonucleotide/oligosaccharide-binding) fold
  • 5-stranded β-barrel with Greek key topology.
  • All OB folds have the same binding face that is involved in their biochemistry.
OB evolutionary relationships
  • SCOP lists 9 superfamilies.
  • Bacterial enterotoxin superfamily consists of two families, almost certainly evolutionarily related.
  • Nucleic acid-binding superfamily has 11 families; if they are evolutionarily related, the ancestral protein would come from the LUCA (Last Universal Common Ancestor).
  • Evidence for common ancestry of all OB folds is probably weaker than for TIM barrels.
Protein structure comparison
  • How to compare 3D protein structures?
  • Analogous computational considerations to sequence comparison, e.g. accuracy, efficiency for database searches, statistical significance of results, etc.
  • Additional complication: working with atomic coordinates in 3D space!
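
One common way to score a comparison once coordinates are involved is the root-mean-square deviation (RMSD) over paired atoms. The sketch below is a generic illustration, not the scoring used by any particular method in this lecture. It assumes the two structures have already been superposed and that the atoms are already paired in order; the coordinates are made up.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of paired (x, y, z) points.

    Assumes the structures are already superposed; no rotation or
    translation is applied here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

# Two hypothetical C-alpha traces, offset by 1 angstrom along z.
trace_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
trace_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(trace_a, trace_b))  # 1.0
```

Finding the pairing and the superposition is the hard part of the problem; this function only evaluates a candidate answer.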
Some protein structure comparison methods
  • VAST (Vector Alignment Search Tool, NCBI)
  • CE (Combinatorial Extension, RCSB/PDB)
  • DALI (EBI)
VAST outline
  • Parse protein structures into SSEs (helices and strands).
  • Fit vectors to SSEs.
  • To compare a pair of proteins attempt to superpose as many vectors as possible, subject to constraints.
  • Evaluate the vector alignment for statistical significance (compute an E-value).
  • If the vector alignment is significant then proceed to a more detailed residue-to-residue alignment (“refined alignment”).
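
The "fit vectors to SSEs" step can be pictured with a small sketch. The outline does not say how VAST performs the fit, so the approach below is an assumption chosen for simplicity: the SSE axis runs from the centroid of the first few Cα atoms to the centroid of the last few. The function names, the `window` parameter, and the coordinates are all hypothetical.

```python
import math

def centroid(points):
    """Mean position of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def fit_sse_vector(ca_coords, window=3):
    """Approximate an SSE by a (start, end) pair of points.

    start/end are centroids of the first/last `window` C-alpha atoms.
    This is a simplification for illustration, not VAST's actual fit.
    """
    if len(ca_coords) < window:
        raise ValueError("SSE is shorter than the averaging window")
    return centroid(ca_coords[:window]), centroid(ca_coords[-window:])

def unit_direction(start, end):
    """Unit vector pointing from start to end."""
    length = math.dist(start, end)
    return tuple((e - s) / length for s, e in zip(start, end))

# Made-up zigzag "strand" running along the x axis.
strand = [(float(i), float(i % 2), 0.0) for i in range(7)]
start, end = fit_sse_vector(strand)
print(unit_direction(start, end))  # (1.0, 0.0, 0.0)
```

Reducing each SSE to a single vector is what makes the first comparison stage cheap: a protein becomes a handful of vectors instead of hundreds of residues.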

[Figure: VAST comparison of 3chy and 1ipfA. Panels: vector superposition; refined alignment.]

SCOP (Structural Classification of Proteins)
  • http://scop.mrc-lmb.cam.ac.uk/scop/
  • Levels of the SCOP hierarchy:
    • Family: clear evolutionary relationship
    • Superfamily: probable common evolutionary origin
    • Fold: major structural similarity
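
The three levels listed above nest inside each other, which can be sketched as a small data structure. This is an illustrative model only, not SCOP's file format or API; the class name, the function name, and the labels such as "fold-X" are hypothetical placeholders.

```python
from dataclasses import dataclass

# Meaning of each level, as given in the list above.
LEVEL_MEANING = {
    "family": "clear evolutionary relationship",
    "superfamily": "probable common evolutionary origin",
    "fold": "major structural similarity",
}

@dataclass(frozen=True)
class ScopPlacement:
    """Where a domain sits in the hierarchy (labels are placeholders)."""
    fold: str
    superfamily: str
    family: str

def closest_shared_level(a, b):
    """Most specific level shared by two placements, or None if no fold match."""
    if a.fold != b.fold:
        return None
    if a.superfamily != b.superfamily:
        return "fold"
    if a.family != b.family:
        return "superfamily"
    return "family"

dom1 = ScopPlacement("fold-X", "sf-1", "fam-a")
dom2 = ScopPlacement("fold-X", "sf-1", "fam-b")
level = closest_shared_level(dom1, dom2)
print(level, "->", LEVEL_MEANING[level])
# superfamily -> probable common evolutionary origin
```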