BMI 731. Protein Structures and Related Database Searches. Protein. DNA (Genotype). Biology … Protein…. A single amino acid substitution in a protein causes sickle-cell disease…. What the.....!?. Why do we care about structure?.


BMI 731

Protein Structures and Related Database Searches

Why do we care about structure?
  • In the factory of living cells, proteins are the workers, performing a variety of biological tasks.
  • Each protein has a particular 3-D structure that determines its function.
  • Protein structure is more conserved than protein sequence, and more closely related to function.
  • Sequence -> Structure -> Function
Structural Information
  • Protein Data Bank: maintained by the Research Collaboratory of Structural Bioinformatics (RCSB)
    • http://www.rcsb.org/pdb/
    • > 15,000 structures of proteins
    • Also contains structures of protein/nucleic acid complexes, nucleic acids, and carbohydrates
  • Most structures are determined by X-ray crystallography. Other methods are NMR and electron microscopy (EM). Some structures are also theoretically predicted.
Protein?
  • Proteins are linear heteropolymers: one or more polypeptide chains
  • Building blocks: 20(?) amino acid residues.
  • Chain lengths range from a few tens to thousands of residues
  • Three-dimensional shapes (“fold”) adopted vary enormously.
Basic measurements on structures…
  • Bond lengths
  • Bond angles
  • Dihedral (torsion) angles
Bond Length
  • The distance between two bonded atoms is essentially constant
  • Depends on the “type” of the bond
  • Varies from about 1.0 Å (C-H) to about 1.5 Å (C-C)
  • BOND LENGTH IS A FUNCTION OF THE POSITION OF TWO ATOMS.
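As a minimal sketch of the statement above, a bond length can be computed as the Euclidean distance between two atomic positions. The function name and the coordinates are illustrative assumptions, not taken from the slides.

```python
import math

def bond_length(atom_a, atom_b):
    """Distance in Å between two atoms given as (x, y, z) tuples."""
    return math.dist(atom_a, atom_b)

# Hypothetical coordinates: a carbon at the origin and a hydrogen 1.09 Å away along x.
carbon = (0.0, 0.0, 0.0)
hydrogen = (1.09, 0.0, 0.0)
print(round(bond_length(carbon, hydrogen), 2))
```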
Bond Angle…
  • All bond angles are determined by the chemical makeup of the atoms involved, and are nearly constant.
  • Depends on the type of atom and the number of electrons available for bonding.
  • Ranges from 100° to 180°
  • BOND ANGLE IS A FUNCTION OF THE POSITION OF THREE ATOMS.
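A small sketch of how a bond angle follows from three atomic positions: the angle at the central atom is taken between the two bond vectors that leave it. The function name and the test coordinates are assumptions for illustration.

```python
import math

def bond_angle(atom_a, atom_b, atom_c):
    """Angle in degrees at atom_b, between the bonds b->a and b->c."""
    ba = [a - b for a, b in zip(atom_a, atom_b)]
    bc = [c - b for c, b in zip(atom_c, atom_b)]
    dot = sum(x * y for x, y in zip(ba, bc))
    norm = math.hypot(*ba) * math.hypot(*bc)
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))

# Hypothetical right-angle arrangement around an atom at the origin.
print(bond_angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
```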
Dihedral Angles
  • These are usually variable
  • Range from 0-360° in molecules
  • Most famous are φ, ψ, ω and χ
  • DIHEDRAL ANGLES ARE A FUNCTION OF THE POSITION OF FOUR ATOMS.

http://www.colby.edu/chemistry/OChem/DEMOS/dihedral.html

Dihedral Angles

A torsion angle is defined by 4 atoms, A, B, C and D.

When atoms A, B, C and D are main-chain atoms (i.e., the carbonyl carbon, C1; the alpha carbon, C2 or C-alpha; and the amide nitrogen, N), there are three repeating torsion angles along the backbone chain, called phi, psi and omega.

http://bmbiris.bmb.uga.edu/wampler/tutorial/prot2.html
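The four-atom definition can be sketched as follows. This uses a common atan2 formulation of the torsion angle, which returns values in (-180°, 180°] rather than 0-360°; the function names and test coordinates are assumptions for illustration.

```python
import math

def _sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dihedral(a, b, c, d):
    """Torsion angle in degrees about the B-C bond, defined by atoms A, B, C, D."""
    b1, b2, b3 = _sub(b, a), _sub(c, b), _sub(d, c)
    n1 = _cross(b1, b2)  # normal to the plane through A, B, C
    n2 = _cross(b2, b3)  # normal to the plane through B, C, D
    y = math.hypot(*b2) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Hypothetical geometry: B-C along z, A on the x axis, D rotated a quarter turn.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

For backbone angles, the same function would be applied to consecutive main-chain atoms (for example C, N, C-alpha, C for phi), assuming their coordinates are available.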

Ramachandran / phi-psi plot

http://www.biochem.ucl.ac.uk/~roman/procheck/manual/examples/plot_01.html

Levels of Structure…

1 - Primary structure

2 - Secondary structure

3 - Tertiary structure

4 - Quaternary structure

Primary structure…
  • This is simply the amino acid sequence of each polypeptide chain
Secondary structure
  • Local organization of protein backbone: α-helix, β-strand (which assemble into β-sheet), turn and interconnecting loop.
The α-helix
  • One of the most closely packed arrangements of residues.
  • Turn: 3.6 residues per turn
  • Pitch: 5.4 Å/turn
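The two helix parameters above imply a rise and a rotation per residue. A minimal sketch of that arithmetic follows; the constant and function names are illustrative, and the length estimate is only approximate.

```python
RESIDUES_PER_TURN = 3.6
PITCH_ANGSTROM = 5.4  # axial advance per full turn

# Axial rise contributed by each residue (about 1.5 Å).
rise_per_residue = PITCH_ANGSTROM / RESIDUES_PER_TURN
# Rotation about the helix axis per residue (about 100 degrees).
rotation_per_residue = 360.0 / RESIDUES_PER_TURN

def helix_length(n_residues):
    """Approximate axial length in Å of an ideal α-helix with n_residues residues."""
    return n_residues * rise_per_residue

print(rise_per_residue, rotation_per_residue, helix_length(10))
```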
The β-sheet
  • Backbone almost fully extended, loosely packed arrangement of residues.
Tertiary structure…
  • Packing the secondary structure elements into a compact spatial unit
  • “Fold” or domain: this is the level at which structure prediction is currently possible.
Quaternary structure…
  • Assembly of homo- or heteromeric protein chains.
  • Usually the functional unit of a protein, especially for enzymes
Classification…
  • Class
  • Fold/Architecture
  • Superfamily
Databases of structural classification
  • SCOP
    • Murzin AG, Brenner SE, Hubbard T, Chothia C
    • Structural Classification of Proteins
    • Manual assembly by inspection
    • All nodes are annotated (e.g., all-α, α/β)
    • Structural similarity search using 3dSearch (Singh and Brutlag)
  • CATH
    • Dr. C.A. Orengo, Dr. A.D. Michie, etc
    • Class-Architecture-Topology-Homologous superfamily
    • Manual classification at Architecture level
    • Automated topology classification using the SSAP algorithm
    • No structural similarity search
Databases of structural classification
  • FSSP
    • L.L. Holm and C. Sander
    • Fully automated using the DALI algorithm (Holm and Sander)
    • No internal node annotations
    • Structural similarity search using DALI
  • Pclass
    • A. Singh, X. Liu, J. Chang, D. Brutlag
    • Fully automated using the LOCK and 3dSearch algorithms
    • All internal nodes automatically annotated with common terms
    • JAVA based classification browser
    • Structural similarity search using 3dSearch
Why Structure Alignment?
  • For homologous proteins (similar ancestry), this provides the “gold standard” for sequence alignment and elucidates the common ancestry of the proteins.
  • For nonhomologous proteins, allows us to identify common substructures of interest.
  • Allows us to classify proteins into clusters, based on structural similarity.
How do we recognize structural similarities?
  • By eye (Alexei Murzin)
    • SCOP: the gold standard for structure classification!
  • Algorithmically
    • Growth of the PDB demands automated techniques for classification and fold detection

Algorithms for Structure Alignment
  • Distance based methods
    • DALI (Holm and Sander): Aligning scalar distance plots
    • STRUCTAL (Gerstein and Levitt): Dynamic programming using pairwise inter-molecular distances
    • SSAP (Orengo and Taylor): Dynamic programming using intra-molecular vector distance
  • Vector based methods
    • VAST (Bryant): Graph theory based secondary structure alignment
    • 3dSearch (Singh and Brutlag): Fast secondary structure index lookup
  • Both vector and distance based
    • LOCK (Singh and Brutlag): Hierarchically uses both secondary structure vectors and atomic distances
DALI
  • Based on aligning 2-D intra-molecular distance matrices
  • Computes the best subset of corresponding residues from the two proteins such that similarity between the 2-D distance matrices is maximized.
  • Searches the space of possible residue alignments using a Monte Carlo algorithm
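The distance-matrix idea can be sketched as follows. This is a simplified illustration, not the DALI scoring function: it builds the intra-molecular distance matrix of each structure and reports the mean absolute difference between corresponding distances for a given residue alignment. All names and coordinates are assumptions.

```python
import math

def distance_matrix(coords):
    """2-D matrix of pairwise distances between residues (e.g. Cα positions)."""
    return [[math.dist(p, q) for q in coords] for p in coords]

def matrix_difference(coords_a, coords_b, alignment):
    """Mean absolute difference between intra-molecular distances of aligned residues.

    alignment is a list of (i, j) pairs: residue i of A corresponds to residue j of B.
    Lower values mean the two distance matrices agree better on this alignment.
    """
    da, db = distance_matrix(coords_a), distance_matrix(coords_b)
    total, count = 0.0, 0
    for x in range(len(alignment)):
        for y in range(x + 1, len(alignment)):
            (i1, j1), (i2, j2) = alignment[x], alignment[y]
            total += abs(da[i1][i2] - db[j1][j2])
            count += 1
    return total / count if count else 0.0

# Hypothetical structures: B is A translated, so the distance matrices are identical.
struct_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
struct_b = [(x + 10.0, y - 5.0, z + 2.0) for x, y, z in struct_a]
print(matrix_difference(struct_a, struct_b, [(0, 0), (1, 1), (2, 2)]))
```

Because only intra-molecular distances are compared, the measure does not depend on how the two structures are positioned or oriented relative to each other, which is why no superposition is needed at this stage.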
VAST: Vector Alignment Search Tool
  • Aligns only secondary structure elements (SSE)
  • Represents each SSE as a vector
  • Finds all possible pairs of vectors from the two structures that are similar
  • Uses a graph-theory algorithm to find the maximal subset of similar vectors
  • Overall alignment score is based on the number of similar pairs of vectors between the two structures.
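A rough sketch of the vector representation described above, not the VAST implementation: each SSE is reduced to a start-to-end vector, and pairs of SSEs in one structure are matched to pairs in the other when their relative angle and midpoint distance agree within tolerances. The function names, tolerances, and coordinates are all assumptions, and the graph-theory step that selects a maximal consistent subset is omitted.

```python
import math
from itertools import combinations

def sse_vector(start, end):
    """Vector from the first to the last point of a secondary structure element."""
    return tuple(e - s for s, e in zip(start, end))

def angle_between(u, v):
    """Angle in degrees between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    cos_t = dot / (math.hypot(*u) * math.hypot(*v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

def pair_geometry(sse_1, sse_2):
    """(inter-vector angle, midpoint distance) for two SSEs given as (start, end)."""
    (s1, e1), (s2, e2) = sse_1, sse_2
    mid1 = tuple((a + b) / 2 for a, b in zip(s1, e1))
    mid2 = tuple((a + b) / 2 for a, b in zip(s2, e2))
    return angle_between(sse_vector(s1, e1), sse_vector(s2, e2)), math.dist(mid1, mid2)

def similar_pairs(sses_a, sses_b, angle_tol=20.0, dist_tol=2.0):
    """List ((i, j), (k, l)) where SSE pair (i, j) of A resembles pair (k, l) of B."""
    matches = []
    for i, j in combinations(range(len(sses_a)), 2):
        ang_a, dist_a = pair_geometry(sses_a[i], sses_a[j])
        for k, l in combinations(range(len(sses_b)), 2):
            ang_b, dist_b = pair_geometry(sses_b[k], sses_b[l])
            if abs(ang_a - ang_b) <= angle_tol and abs(dist_a - dist_b) <= dist_tol:
                matches.append(((i, j), (k, l)))
    return matches

# Hypothetical SSEs: structure B is structure A shifted by 10 Å along each axis.
sses_a = [((0, 0, 0), (5, 0, 0)), ((0, 5, 0), (0, 5, 5))]
sses_b = [((10, 10, 10), (15, 10, 10)), ((10, 15, 10), (10, 15, 15))]
print(similar_pairs(sses_a, sses_b))
```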
LOCK
  • Define local secondary structures
  • Find an initial superposition by using DP to align secondary structure vectors.
  • Use a greedy algorithm to find nearest neighbors and minimize RMSD between the Cα atoms from query and target.
  • Find the core of aligned Cα atoms and minimize RMSD between them.
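The nearest-neighbor and RMSD steps can be sketched as below. This is a simplified illustration that assumes the two structures have already been superposed; it does not perform the superposition or the iterative refinement that LOCK itself carries out. The function names, the 3.0 Å cutoff, and the coordinates are assumptions.

```python
import math

def greedy_pairs(query, target, cutoff=3.0):
    """Greedily pair Cα atoms, closest pairs first, using each atom at most once."""
    candidates = sorted(
        (math.dist(q, t), i, j)
        for i, q in enumerate(query)
        for j, t in enumerate(target)
    )
    used_q, used_t, pairs = set(), set(), []
    for dist, i, j in candidates:
        if dist > cutoff:
            break  # remaining candidates are even farther apart
        if i not in used_q and j not in used_t:
            pairs.append((i, j))
            used_q.add(i)
            used_t.add(j)
    return sorted(pairs)

def rmsd(query, target, pairs):
    """Root-mean-square deviation over the paired atoms."""
    squared = sum(math.dist(query[i], target[j]) ** 2 for i, j in pairs)
    return math.sqrt(squared / len(pairs))

# Hypothetical Cα traces: two atoms nearly coincide, the third has no close partner.
query_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
target_ca = [(0.1, 0.0, 0.0), (3.9, 0.0, 0.0), (20.0, 0.0, 0.0)]
core = greedy_pairs(query_ca, target_ca)
print(core, round(rmsd(query_ca, target_ca, core), 3))
```

Restricting the RMSD to the paired atoms mirrors the idea of a "core": atoms without a close partner are left out rather than allowed to dominate the deviation.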
Where is the data?
  • GenBank
  • DB are equivalent
  • RefSeq: NCBI Reference Sequences
    • http://www.ncbi.nlm.nih.gov/LocusLink/refseq.html
  • GenPept Database
    • http://inn.weizmann.ac.il/databanks/genpept.html
  • Swiss-Prot
    • http://www.expasy.org/sprot/
    • STATS: http://www.expasy.org/sprot/relnotes/relstat.html
  • Protein Data Bank
    • http://www.rcsb.org/pdb/
  • PIR International Protein Sequence Database
    • http://pir.georgetown.edu/pirwww/search/textpsd.shtml
  • NCBI Entrez Protein
    • http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Protein

A flow chart for structure prediction
  • Input: protein sequence
  • Steps: database similarity search; protein family, domain, cluster analysis; structural analysis; 3D comparative modeling; 3D analysis in laboratory
  • Decision points (yes/no): Does sequence align with protein of known 3D structure? Relationship to known structure? Is there a predicted structure?
  • Output: predicted three-dimensional structure

Images…
  • 3-dimensional model showing the electron density in a molecule of buckminsterfullerene, an allotrope of carbon (C60).
Images…

Computer generated image, showing 3-D structure of uteroglobin, a protein secreted in the uterus of mammals.

Images… (NMR… EPR…)

A computer image of the charge density over the molecule chymosin, an important enzyme in cheese making. Overall negative charge is depicted as red, overall positive charge is shown in blue.

Thanks

Thanks to Selnur Erdal for preparing initial versions of these slides.