
Biology 4900

Biology 4900. Biocomputing. Chapter 10. Protein Analysis and Proteomics. Composition of living organisms. 5 major components Proteins Nucleic acids Lipids (fats) Water Carbohydrates. Pevsner, Bioinformatics and Functional Genomics, 2009. Roles of DNA and Proteins.

yukio

Presentation Transcript


  1. Biology 4900 Biocomputing

  2. Chapter 10 Protein Analysis and Proteomics

  3. Composition of living organisms 5 major components Proteins Nucleic acids Lipids (fats) Water Carbohydrates Pevsner, Bioinformatics and Functional Genomics, 2009

  4. Roles of DNA and Proteins • If we think of constructing an organism like building a house, DNA would be the blueprint and proteins would be most of the construction materials • Protein functions include: • Structural roles (e.g., actin in the cytoskeleton) • Enzyme catalysts (e.g., trypsin, a serine protease) • Intra- and intercellular transporters • Molecular signaling • Cellular regulation (e.g., Nrf2) Pevsner, Bioinformatics and Functional Genomics, 2009

  5. Amino Acids • Organic compounds with amino and carboxylate functional groups • Each AA has a unique side chain (R) attached to the alpha (α) carbon • Crystalline solids with high melting points • Highly soluble in water • Exist as dipolar, charged zwitterions (ionic form) • Exist as either L- or D- enantiomers • Almost without exception, biological organisms use only the L enantiomer Seager SL, Slabaugh MR, Chemistry for Today: General, Organic and Biochemistry, 7th Edition, 2011; Berg JM, Tymoczko JL, Stryer L, Biochemistry, 5th Edition, 2002

  6. Formation of Peptides/Proteins • Proteins and polypeptides are biochemical compounds consisting of amino acids • Chains of amino acids bonded together by peptide bonds between the carboxyl and amino groups of adjacent amino acid residues • Proteins • Longer and more complex than polypeptides • Typically folded into a globular or fibrous form • Structure facilitates a biological function Peptide linkages Amino acid Protein Polypeptide

  7. Proteins have different levels of structure • Primary (1°): Sequence of amino acids • Determines 3D structure • Secondary (2°): H-bonding interactions between AA residues begin to produce regular, identifiable structures • Alpha (α) helices • Beta (β) strands • Random coil • Tertiary (3°): Overall structure of single protein in 3 dimensions • Quaternary (4°): Assemblies of multiple polypeptides and/or proteins http://protein-pdb.com/2011/10/04/primary-protein-structure/

  8. Protein Secondary Structure Seager SL, Slabaugh MR, Chemistry for Today: General, Organic and Biochemistry, 7th Edition, 2011

  9. Proteins 2° Structure: The α-helix • Backbone N-H groups form H-bonds with C=O group four residues away in sequence • AA’s in an α helix arranged in a right-handed helix • Each amino acid residue is rotated 100° relative to previous residue in helix • Helix has 3.6 residues per turn http://simplygeology.wordpress.com/tag/s-waves/
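
The two numbers on this slide follow from each other: a full turn is 360°, so a rotation of 100° per residue gives 3.6 residues per turn. A minimal Python sketch of that arithmetic, which also lists the angular position of successive residues viewed down the helix axis (the function names are illustrative and not from the source):

```python
# Helix geometry from the slide: each residue is rotated 100 degrees
# relative to the previous residue.
ROTATION_PER_RESIDUE_DEG = 100.0

def residues_per_turn(rotation_deg: float = ROTATION_PER_RESIDUE_DEG) -> float:
    """Number of residues needed to complete one 360-degree turn."""
    return 360.0 / rotation_deg

def wheel_angles(n_residues: int,
                 rotation_deg: float = ROTATION_PER_RESIDUE_DEG) -> list:
    """Angular position (0-360 degrees) of each residue around the helix axis."""
    return [(i * rotation_deg) % 360.0 for i in range(n_residues)]

print(residues_per_turn())  # 3.6
print(wheel_angles(5))      # [0.0, 100.0, 200.0, 300.0, 40.0]
```

Residues i and i+4 end up only 40° apart around the axis, which fits the slide's statement that backbone H-bonds connect residues four positions apart.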

  10. Proteins 2° Structure: The β-sheet • Beta (β) sheets formed by H-bond connected strands • β strands are elongated helices without helical H-bonds • β Sheets may be parallel or antiparallel http://www.chembio.uoguelph.ca/educmat/phy456/456lec01.htm

  11. Proteins 2° Structure: Random Coils and Loops • Proteins typically contain regions lacking either sheet or helical structures. These regions may be classified as: • Random Coils • Loops • Loops may perform important structural and functional roles, including: • Connecting β strands to form antiparallel sheets • Increasing flexibility (hinge motion) • Binding metal ions or other biomolecules to alter protein function http://www.chembio.uoguelph.ca/educmat/phy456/456lec01.htm

  12. Proteins 3° Structure • Protein function determined by 3D shape • Tertiary structure results from residue interactions: • H-bonding • Disulfide Bridges • Salt Bridges • Hydrophobic Interactions Seager SL, Slabaugh MR, Chemistry for Today: General, Organic and Biochemistry, 7th Edition, 2011

  13. Proteins 3° Structure • Polar and charged residues tend to be on surface of protein, exposed to water, while hydrophobic residues tend to be buried Seager SL, Slabaugh MR, Chemistry for Today: General, Organic and Biochemistry, 7th Edition, 2011
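
One common way to estimate which stretches of a sequence are more likely to be buried is to score residues on a hydropathy scale. The sketch below assumes the Kyte-Doolittle scale, which is not mentioned in the slides, and the function names and window size are illustrative choices. It only gives a rough indication; it does not predict 3D structure.

```python
# Kyte-Doolittle hydropathy values (assumed scale, not from the slides).
# Positive values are hydrophobic, negative values are hydrophilic.
KD = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}

def mean_hydropathy(seq: str) -> float:
    """Average hydropathy of a sequence (often called the GRAVY score)."""
    return sum(KD[aa] for aa in seq) / len(seq)

def window_hydropathy(seq: str, window: int = 5) -> list:
    """Mean hydropathy of each sliding window along the sequence."""
    return [mean_hydropathy(seq[i:i + window])
            for i in range(len(seq) - window + 1)]

# Made-up sequence: a hydrophobic stretch flanked by charged residues.
toy = "DEKRLLVVIIAKDER"
profile = window_hydropathy(toy)
```

Windows with high positive means are candidates for buried regions, and strongly negative windows are more likely to be exposed to water.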

  14. Proteins 4° Structure • Functional proteins may contain two or more polypeptide chains held together by the same forces that control 3° structure: • H-bonding • Disulfide Bridges • Salt Bridges • Hydrophobic Interactions • Each chain is a subunit of structure • Each subunit has its own 1°, 2° and 3° structure Seager SL, Slabaugh MR, Chemistry for Today: General, Organic and Biochemistry, 7th Edition, 2011

  15. Proteins are Large Macromolecules • Proteins are extremely large • MW of glucose is 180 u, compared with 65,000 u for hemoglobin • Proteins synthesized inside cells remain inside cells • The presence of intracellular proteins in blood or urine can be used to test for certain diseases Seager SL, Slabaugh MR, Chemistry for Today: General, Organic and Biochemistry, 7th Edition, 2011
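
The size comparison on this slide can be checked with a short calculation. The sketch below computes the molecular weight of glucose from its formula, C6H12O6, using standard atomic masses (not given in the slides), and compares it with the hemoglobin value quoted above.

```python
# Standard atomic masses in u (assumed reference values, not from the slides).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# Glucose is C6H12O6.
GLUCOSE = {"C": 6, "H": 12, "O": 6}

def molecular_weight(formula: dict) -> float:
    """Sum of atomic masses weighted by the atom counts in the formula."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())

glucose_mw = molecular_weight(GLUCOSE)
HEMOGLOBIN_MW = 65_000  # value quoted on the slide

ratio = HEMOGLOBIN_MW / glucose_mw
print(f"glucose: {glucose_mw:.1f} u")                           # glucose: 180.2 u
print(f"hemoglobin is about {round(ratio)}x heavier")           # about 361x
```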

  16. Protein Functions • Catalytic Function: • Enzymes are proteins that catalyze biological reactions • Structural function: • Most human structural materials (excluding bone) are composed of proteins • Collagen (bundled helices) • 25-35% of total protein in body • Tendons • Ligaments • Skin • Cornea • Cartilage • Bone • Blood vessels • Gut • Keratin (bundled helices) • Chief constituent of hair, skin, fingernails http://www.imb-jena.de/~rake/Bioinformatics_WEB/proteins_classification.html

  17. Protein Functions • Storage Function: • Storage of small molecules or ions • Ovalbumin • Main protein in egg whites • Can be broken down into amino acids for use by developing embryos • Ferritin • Globular complex of 24 protein subunits • Buffers iron concentration in cells Ovalbumin (chicken egg white) ferritin http://www.stagleys.demon.co.uk/explorers/genesandproteins/page6.html; http://ferritin.blogspot.com/

  18. Protein Functions • Protective Function: • Protection against external foreign substances • Antibodies (immunoglobulins) • Very large proteins • Combine with and destroy viruses and bacteria • Blood clotting (coagulation) • Thrombin • Protease responsible for platelet aggregation and formation of fibrin Harris, L. J., Larson, S. B., Hasel, K. W., Day, J., Greenwood, A., McPherson, A. Nature 1992, 360, 369-372; http://courses.washington.edu/conj/immune/antibody.htm; http://www.colorado.edu/intphys/Class/IPHY3430-200/014blood.htm

  19. Protein Functions • Regulatory Function: • Protein hormones • Insulin • Protein hormone that directs cells in the liver, muscle, and fat to take up glucose from the blood and store it as glycogen • Forms hexamer bound together by Zn Insulin http://en.wikipedia.org/wiki/File:InsulinHexamer.jpg; Seager SL, Slabaugh MR, Chemistry for Today: General, Organic and Biochemistry, 7th Edition, 2011

  20. Protein Functions • Nerve impulse transmission: • Rhodopsin • Protein found in the rod cells of the retina • Converts light events into nerve impulses sent to the brain http://cherfan2010biology12assessment.wikispaces.com/The+Retina

  21. Protein Functions • Movement function: • Proteins involved in muscle contraction • Myosin • Actin http://www.sigmaaldrich.com/life-science/metabolomics/enzyme-explorer/learning-center/structural-proteins/actin.html

  22. Protein Functions • Transport function: • Transport ions or molecules throughout the body • Serum albumin: Transports fatty acids between fat and other tissues • Hemoglobin: Transports O2 from lungs to other tissues (e.g., muscles) • Transferrin: Transports iron in blood plasma Serum albumin hemoglobin transferrin http://en.wikipedia.org/ ; http://www.pdb.org/pdb/101/motm.do?momID=37

  23. Protein Databases • NCBI RefSeq • UniProt/Swiss-Prot, TrEMBL (merged with PIR) (http://www.ebi.ac.uk/uniprot/) • Ensembl (http://useast.ensembl.org/index.html) • Protein Data Bank Some of these databases have been consolidated over the years. Efforts are being made to develop community standards for reporting protein data → HUPO

  24. The Human Proteome Organisation (HUPO) Proteomics Standards Initiative (PSI) http://www.psidev.info/ • HUPO organized into working groups that focus on different aspects of protein research • Gel Electrophoresis • Mass Spectrometry • Molecular Interactions • Protein Modifications • Proteomics Informatics • Sample Processing • Goals: Defining standards for proteomic data representation to facilitate the comparison, exchange, and verification of data • Controlled vocabularies • MIAPE: Minimum information about a proteomics experiment

  25. Techniques to Identify Proteins Direct Protein Sequencing – Edman degradation • Useful for identifying short sequences (<50 residues) for protein concentrations of 1-10 picomoles http://en.wikibooks.org/wiki/Structural_Biochemistry/Proteins/Protein_sequence_determination_techniques; http://en.wikipedia.org/wiki/Edman_degradation

  26. Techniques to Identify Proteins Mass Spectrometry • Proteins are digested into fragments by enzymes • The fragments are passed through an LC column, then sprayed into the MS through a narrow, positively charged nozzle that converts them into gas-phase ions • Mass-to-charge ratios of the fragment ions are measured to determine the amino acid sequence • Unlike Edman degradation, MS does not have an absolute upper size limit for proteins, but larger proteins are computationally more difficult to sequence. http://www.magnet.fsu.edu/education/tutorials/tools/ionization_esi.html
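
To make the link between mass-to-charge ratio and sequence concrete, here is a minimal sketch of the arithmetic. It assumes standard monoisotopic residue masses (reference values, not from the slides) and uses the made-up peptide "PEPTIDE". The function names are illustrative, and real sequencing software handles much more (modifications, several ion series, noise).

```python
# Monoisotopic residue masses in Da (assumed reference values).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-termini)
PROTON = 1.00728   # mass of each charge-carrying proton

def peptide_mass(seq: str) -> float:
    """Neutral monoisotopic mass: sum of residue masses plus one water."""
    return sum(MONO[aa] for aa in seq) + WATER

def mz(seq: str, charge: int = 1) -> float:
    """Mass-to-charge ratio of the peptide carrying `charge` protons."""
    return (peptide_mass(seq) + charge * PROTON) / charge

def b_ion_ladder(seq: str) -> list:
    """Singly charged N-terminal fragment (b-ion) m/z values, b1..b(n-1)."""
    ladder, running = [], 0.0
    for aa in seq[:-1]:
        running += MONO[aa]
        ladder.append(running + PROTON)
    return ladder

ladder = b_ion_ladder("PEPTIDE")
# The gap between consecutive fragment ions equals one residue mass,
# which is how a residue can be read off from a pair of peaks.
gaps = [later - earlier for earlier, later in zip(ladder, ladder[1:])]
```

Each entry in `gaps` matches the mass of one residue, so a ladder of fragment peaks spells out the sequence. Leucine and isoleucine have identical masses, so they cannot be told apart by mass alone.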

  27. Outline: Protein analysis and proteomics Perspectives on Individual proteins Perspective 1: Protein families (domains and motifs) Perspective 2: Physical properties (3D structure) Perspective 3: Localization Perspective 4: Function

  28. Perspective 1: Protein domains and motifs Page 389

  29. Definitions Signature: a protein category such as a domain or motif Domain: a region of a protein that can adopt a 3D structure (a fold) Examples: • zinc finger domain • immunoglobulin domain Family: a group of proteins that share a domain Motif (or fingerprint): A short, conserved region of a protein; typically 10 to 20 contiguous amino acid residues Pevsner, Bioinformatics and Functional Genomics, 2009

  30. 15 most common domains (human), with the number of proteins containing each • Zn finger, C2H2 type: 1093 • Immunoglobulin: 1032 • EGF-like: 471 • Zn-finger, RING: 458 • Homeobox: 417 • Pleckstrin-like: 405 • RNA-binding region RNP-1: 400 • SH3: 394 • Calcium-binding EF-hand: 392 • Fibronectin, type III: 300 • PDZ/DHR/GLGF: 280 • Small GTP-binding protein: 261 • BTB/POZ: 236 • bHLH: 226 • Cadherin: 226 Page 391 Source: Integr8 at EBI website

  31. EBI Integr8 site • Go to the Integr8 site: http://www.ebi.ac.uk/proteome/ • Browse species; choose Homo sapiens. • Click “Proteome analysis” • Click on “Genomics Statistics” to obtain a variety of statistics, such as common repeats, domains, and average protein length

  32. Integr8: AA Composition Source: Integr8 at EBI website (updated 7/09)
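
Amino acid composition is simply the percentage of each residue type in a sequence or set of sequences. A minimal sketch, assuming a made-up sequence and an illustrative function name (this is not how Integr8 itself computed its statistics):

```python
from collections import Counter

def aa_composition(seq: str) -> dict:
    """Percentage of each amino acid in a sequence, most frequent first."""
    counts = Counter(seq.upper())
    total = sum(counts.values())
    return {aa: 100.0 * n / total for aa, n in counts.most_common()}

# Made-up sequence for illustration only.
toy_composition = aa_composition("MKALLLAGSSEEKL")
for aa, percent in toy_composition.items():
    print(f"{aa}: {percent:.1f}%")
```

For a whole proteome, the same counting would be applied to all sequences concatenated together.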

  33. Analysis of full-length proteins [fragments excluded] • Avg protein length: 412 +/- 548 amino acid residues • Size range: 4 - 34942 amino acid residues Source: Integr8 at EBI website (updated 7/09)
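
A standard deviation larger than the mean, together with a range of 4 to 34942 residues, suggests a strongly right-skewed length distribution: most proteins are modest in size and a few are enormous. The sketch below shows the effect with a made-up list of lengths (only the minimum and maximum are taken from the slide; the other values are invented for illustration).

```python
import statistics

# Made-up protein lengths. Only 4 and 34942 come from the slide.
lengths = [4, 120, 250, 300, 412, 500, 800, 34942]

mean_len = statistics.mean(lengths)
sd_len = statistics.stdev(lengths)
median_len = statistics.median(lengths)

print(f"mean = {mean_len:.0f}, sd = {sd_len:.0f}, median = {median_len:.0f}")
```

With a single very long protein in the list, the mean and standard deviation are pulled far above the median, which is why the median is often a more representative summary for skewed data like this.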

  34. Definitions of a domain According to InterPro at EBI (http://www.ebi.ac.uk/interpro/): A domain is an independent structural unit, found alone or in conjunction with other domains or repeats. Domains are evolutionarily related. According to SMART (http://smart.embl-heidelberg.de): A domain is a conserved structural entity with distinctive secondary structure content and a hydrophobic core. Homologous domains with common functions usually show sequence similarities. Page 390

  35. Varieties of protein domains Extending along the length of a protein Occupying a subset of a protein sequence Occurring one or more times Pevsner, Bioinformatics and Functional Genomics, 2009

  36. Example of a protein with domains: Methyl CpG binding protein 2 (MeCP2) MBD TRD The protein includes a methylated DNA binding domain (MBD) and a transcriptional repression domain (TRD). MeCP2 is a transcriptional repressor. Mutations in the gene encoding MeCP2 cause Rett Syndrome, a neurological disorder affecting girls primarily. Pevsner, Bioinformatics and Functional Genomics, 2009

  37. Blastp search for MeCP2 (human) These domains comprise a family and are homologous, even if the rest of the protein is quite different

  38. Example of a multidomain protein: HIV-1 pol • Multi-domain proteins such as HIV-1 gag-pol are common • Pol (NP_789740), 995 amino acids long • Gag-Pol (NP_057849), 1435 amino acids • cleaved into three proteins with distinct activities: • -- aspartyl protease • -- reverse transcriptase • -- integrase • We will explore HIV-1 pol through UniProt. Pevsner, Bioinformatics and Functional Genomics, 2009

  39. www.uniprot.org • Three protein databases merged to form UniProt: • SwissProt • TrEMBL (translated European Molecular Biology Lab) • Protein Information Resource (PIR) • You can search for information on your favorite protein there; a BLAST server is provided. Pevsner, Bioinformatics and Functional Genomics, 2009

  40. ExPASy UniProt/SwissProt • Go to ExPASy (http://www.expasy.ch/) • Enter search name or SwissProt accession number. • Ex. Search for HIV-1 gag-pol

  41. EMBL-EBI UniProt (TrEMBL, PIR, SwissProt) • Go to EMBL-EBI • Enter search name or accession number. • Ex. Search for HIV-1 gag-pol (Screenshot annotations: “Extensive results”; “Select This”)

  42. Results of Search, UniProtKB • Sequence • Secondary Structure • Link to PDB 3D Structure • Links to databases (Pfam, PROSITE)

  43. From UniProtKB to Pfam

  44. Pfam Features Integrase Zinc binding domain Integrase core domain

  45. Pfam Features: Domains Select This

  46. Pfam Features: Domains (Calmodulin, O16305) Students to perform this in class: • Search for EFHand (PF00036) • Select link to InterPro (Figure labels: Calmodulin; EF Hand-like domain; EF Hand 1 (binding site)) Motifs are typically subsets of domains

  47. Definition of a motif • Motif (or fingerprint): A short, conserved region of a protein (10 to 20 amino acids). • Simple motifs include (but are not limited to): • transmembrane domains • phosphorylation sites • calcium-binding sites • These do not imply homology when found in a group of proteins. • PROSITE (www.expasy.org/prosite) is a dictionary of motifs. • In PROSITE, a pattern is a qualitative motif description (a protein either matches a pattern, or not). • In contrast, a profile is a quantitative motif description. We will encounter profiles in Pfam, ProDom, SMART, and other databases. Pevsner, Bioinformatics and Functional Genomics, 2009
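
Because a pattern is qualitative (a sequence either matches or it does not), it maps naturally onto a regular expression. The sketch below converts a pattern written in PROSITE-style syntax into a Python regex. The converter covers only the basic syntax (`x`, `[...]`, `{...}`, `(n)`, `(n,m)`, `<`, `>`); the pattern and sequence shown are illustrative, and the function names are not part of any PROSITE tool.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a basic PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # {ABC} means "any residue except A, B or C".
        element = re.sub(r"\{([A-Z]+)\}", r"[^\1]", element)
        # x means "any residue".
        element = element.replace("x", ".")
        # (n) or (n,m) is a repeat count.
        element = re.sub(r"\((\d+(?:,\d+)?)\)", r"{\1}", element)
        # < and > anchor the pattern to the N- or C-terminus.
        element = element.replace("<", "^").replace(">", "$")
        parts.append(element)
    return "".join(parts)

def find_motif(pattern: str, seq: str) -> list:
    """0-based start positions of non-overlapping matches in seq."""
    regex = re.compile(prosite_to_regex(pattern))
    return [m.start() for m in regex.finditer(seq)]

# Illustrative pattern: N, then anything but P, then S or T, then anything but P.
print(prosite_to_regex("N-{P}-[ST]-{P}"))                 # N[^P][ST][^P]
print(find_motif("N-{P}-[ST]-{P}", "MKNGSAANPTQNFTL"))    # [2, 11]
```

A profile, by contrast, would assign a position-specific score to every residue and report a number rather than a yes/no answer, so it cannot be expressed as a plain regex.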

  48. Calcium-binding protein sequence patterns

  49. Perspective 2: Physical properties of proteins

  50. Physical properties of proteins Many websites are available for the analysis of individual proteins. ExPASy is an excellent resource. The accuracy of these programs varies. Predictions based on primary amino acid sequence (such as molecular weight prediction) are likely to be more trustworthy. For many other properties (such as posttranslational modification of proteins by specific sugars), experimental evidence may be required rather than prediction algorithms. Pevsner, Bioinformatics and Functional Genomics, 2009
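
Molecular weight is a good example of a property that follows directly from the primary sequence: it is the sum of the residue masses plus one water molecule for the free termini. The sketch below assumes standard average residue masses (reference values, not from the slides) and ignores posttranslational modifications, which is exactly where sequence-only predictions become less reliable. The function name is illustrative.

```python
# Average residue masses in Da (assumed reference values).
AVG = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886,
    "C": 103.1388, "E": 129.1155, "Q": 128.1307, "G": 57.0519,
    "H": 137.1411, "I": 113.1594, "L": 113.1594, "K": 128.1741,
    "M": 131.1926, "F": 147.1766, "P": 97.1167, "S": 87.0782,
    "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER_AVG = 18.01528  # one water per chain for the free N- and C-termini

def predicted_mw(seq: str) -> float:
    """Predicted average molecular weight (Da) of an unmodified polypeptide."""
    return sum(AVG[aa] for aa in seq.upper()) + WATER_AVG

print(f"{predicted_mw('G'):.2f}")   # 75.07 (free glycine)
print(f"{predicted_mw('GG'):.2f}")  # 132.12 (glycylglycine)
```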
