Introduction to Proteomics

What is Proteomics?
Proteomics: a newly emerging field of life science research that uses high-throughput (HT) technologies to display, identify and/or characterize all the proteins in a given cell, tissue or organism (i.e., the proteome).

3 Kinds of Proteomics
• Expressional Proteomics
  • Electrophoresis, Protein Chips, DNA Chips, SAGE
  • Mass Spectrometry, Microsequencing
• Functional Proteomics
  • HT Functional Assays, Ligand Chips
  • Yeast 2-hybrid, Deletion Analysis, Motif Analysis
• Structural Proteomics
  • High throughput X-ray Crystallography/Modelling
  • High throughput NMR Spectroscopy/Modelling

Expressional Proteomics
[Figure: 2-D gel; QTOF mass spectrometry]

Expressional Proteomics
[Figure: prostate tumor vs. normal]

Why Expressional Proteomics?
• Concerned with the display, measurement and analysis of global changes in protein expression
• Monitors global changes arising from application of drugs, pathogens or toxins
• Monitors changes arising from developmental, environmental or disease perturbations
• Applications in medical diagnostics and therapeutic drug monitoring

Examples
• Jungblut PR et al., “Proteomics in Human Disease: Cancer, Heart and Infectious Disease” Electrophoresis 20:100-110 (1999)
• Zhukov TA et al., “Discovery of distinct protein profiles specific for lung tumors and pre-malignant lung lesions by SELDI” Lung Cancer 40:267-279 (2003)
• Ghaemmaghami S et al., “Global analysis of protein expression in yeast” Nature 425:737-741 (2003)

Functional Proteomics

Functional Proteomics (in vitro)
• Multi-well plate readers
• Full automation/robotics
• Fluorescent and/or chemiluminescent detection
• Small volumes (µL)
• Up to 1536 wells/plate
• Up to 200,000 tests/day
• Mbytes of data/day
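The throughput figures above imply a plate-handling rate that is easy to check. The following is a minimal sketch of that arithmetic, assuming one test per well and round-the-clock operation; neither assumption is stated on the slide, and the function names are illustrative only.

```python
import math

WELLS_PER_PLATE = 1536   # from the slide
TESTS_PER_DAY = 200_000  # from the slide


def plates_per_day(tests: int, wells: int) -> int:
    """Plates needed per day, assuming exactly one test per well."""
    return math.ceil(tests / wells)


def seconds_per_plate(plates: int, hours_per_day: float = 24.0) -> float:
    """Time available per plate, assuming continuous operation."""
    return hours_per_day * 3600 / plates


plates = plates_per_day(TESTS_PER_DAY, WELLS_PER_PLATE)
print(plates)                            # 131
print(round(seconds_per_plate(plates)))  # 660
```

Under these assumptions a single instrument would need to process a full 1536-well plate about every 11 minutes, which is why full automation appears on the same list.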

Functional Proteomics
• In silico methods (bioinformatics)
• Genome-wide Protein Tagging
• Genome-wide Gene Deletion or Knockouts
• Random Tagged Mutagenesis or Transposon Insertion
• Yeast two-hybrid Methods
• Protein (Ligand) Chips

Why Functional Proteomics?
• Concerned with the identification and classification of protein functions, activities, locations and interactions at a global level
• To compare organisms at a global level so as to extract phylogenetic information
• To understand the network of interactions that take place in a cell at a molecular level
• To predict the phenotypic response of a cell or organism to perturbations or mutations

Examples
• Uetz P et al., “A Comprehensive Analysis of Protein-Protein Interactions in Saccharomyces cerevisiae” Nature 403:623-627 (2000)
  • First example of whole proteome analysis
  • 957 putative interactions
  • 1004 of 6100 predicted proteins involved

Examples
• Huh WK et al., “Global analysis of protein localization in budding yeast” Nature 425:686-691 (2003)
  • Used a collection of yeast strains expressing full-length, chromosomally tagged green fluorescent protein (GFP) fusion proteins
  • Localized 75% of the yeast proteome into 22 distinct subcellular localization categories
  • Provided localization information for 70% of previously unlocalized proteins

Examples
• Edwards JS & Palsson BO, “Systems properties of the H. influenzae Rd metabolic genotype” J. Biol. Chem. 274:17410-17416 (1999)
  • First example of metabolic/phenotypic prediction using proteome-wide information
  • Converting sequence data to differential equations so as to predict biology/behavior

Structural Proteomics • High Throughput protein structure determination via X-ray crystallography, NMR spectroscopy or comparative molecular modeling

Structural Proteomics: The Goal

Structural Proteomics: The Motivation
[Figure: number of known sequences vs. known structures; axis from 0 to 200,000]

The Protein Fold Universe
How Big Is It? 500? 2000? 10000? ∞?

Protein Structure Initiative
• Organize all known protein sequences into sequence families
• Select family representatives as targets
• Solve the 3D structures of these targets by X-ray or NMR
• Build models for the remaining proteins via comparative (homology) modeling
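The first two steps above (grouping sequences into families and picking one representative per family) can be illustrated with a toy greedy clustering. This is a minimal sketch and not the procedure used by the Protein Structure Initiative: the similarity score is the generic string ratio from Python's `difflib`, standing in for a real alignment-based sequence identity, and the 0.5 threshold and the example sequences are invented for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Crude stand-in for sequence identity (assumption: no real alignment)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def cluster_into_families(sequences: dict, threshold: float = 0.5) -> dict:
    """Greedy clustering: longest sequences are considered first; each sequence
    joins the first family whose representative is similar enough, otherwise
    it founds a new family and becomes its representative."""
    families = {}  # representative id -> list of member ids
    ordered = sorted(sequences.items(), key=lambda kv: -len(kv[1]))
    for seq_id, seq in ordered:
        for rep_id in families:
            if similarity(sequences[rep_id], seq) >= threshold:
                families[rep_id].append(seq_id)
                break
        else:
            families[seq_id] = [seq_id]
    return families


# Hypothetical toy sequences: A and B differ at one position, C is unrelated.
toy = {
    "A": "MKTAYIAKQRQISFVK",
    "B": "MKTAYIAKQRQISFVR",
    "C": "GGGGGGGGGGGGGGGG",
}
families = cluster_into_families(toy)
targets = list(families)  # representatives to solve by X-ray or NMR
to_model = [m for rep, members in families.items() for m in members if m != rep]
print(targets)   # ['A', 'C']
print(to_model)  # ['B']
```

In this picture only the representatives need an experimental structure; every other family member is a candidate for comparative modeling against its representative.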

Protein Structure Initiative
• Organize and recruit interested structural biologists and structural biology centres from around the world
• Coordinate target selection
• Develop new kinds of high throughput techniques
• Solve, solve, solve, solve…

Why Structural Proteomics?
• Structure → Function
• Structure → Mechanism
• Structure-based Drug Design
• Solving the Protein Folding Problem
• Keeps Structural Biologists Employed

Structural Proteomics - Status
• 20 registered centres (~30 organisms)
• 82700 targets have been selected
• 52705 targets have been cloned
• 29855 targets have been expressed
• 12311 targets are soluble
• 1493 X-ray structures determined
• 502 NMR structures determined
• 1743 structures deposited in PDB
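The status counts above describe an attrition funnel, and the yield at each step follows directly from the numbers. The sketch below computes step-wise and cumulative yields; treating "deposited in PDB" as the stage that follows "soluble" is a simplification made here, since the slide does not list the intermediate steps.

```python
# Counts taken from the status slide; stage labels are shortened here.
PIPELINE = [
    ("selected", 82700),
    ("cloned", 52705),
    ("expressed", 29855),
    ("soluble", 12311),
    ("deposited in PDB", 1743),
]


def stage_yields(pipeline):
    """Return (stage, step_yield, cumulative_yield) for each stage after the first."""
    start = pipeline[0][1]
    rows = []
    for (_, prev_count), (stage, count) in zip(pipeline, pipeline[1:]):
        rows.append((stage, count / prev_count, count / start))
    return rows


yields = stage_yields(PIPELINE)
for stage, step, cumulative in yields:
    print(f"{stage:<18} step {step:6.1%}   overall {cumulative:6.1%}")
```

With these figures about 2% of selected targets reach the PDB, and the lowest step yield is the last one, from soluble protein to deposited structure.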

Structural Proteomics - Status
• 543 structures deposited by Riken
• 265 structures deposited by Mid-West
• 187 structures deposited by North-East
• 179 structures deposited by New York
• 178 structures deposited by JCSG (UCSD)
• 52 structures deposited by Berkeley
• 31 structures deposited by Montreal/Kingston

Bioinformatics & Proteomics
[Diagram labels: Agriculture, Medicine, Bioinformatics, Proteomics, Genomics]

Bioinformatics & Functional Proteomics
• How to classify proteins into functional classes?
• How to compare one proteome with another?
• How to include functional/activity/pathway information in databases?
• How to extract functional motifs from sequence data?
• How to predict phenotype from proteotype?

Bioinformatics & Expressional Proteomics
• How to correlate changes in protein expression with disease?
• How to distinguish important from unimportant changes in expression?
• How to compare, archive and retrieve gel data?
• How to rapidly and accurately identify proteins from MS and 2D gel data?
• How to include expression information in databases?

Bioinformatics & Structural Proteomics
• How to predict 3D structure from 1D sequence?
• How to determine function from structure?
• How to classify proteins on the basis of structure?
• How to recognize 3D motifs and patterns?
• How to use bioinformatics databases to help in 3D structure determination?
• How to predict which proteins will express well or produce stable, folded molecules?

Homework
• Download RasMol
• Download a PDB file from the Protein Data Bank
• Report the following characteristics of the protein from the PDB file as opened in RasMol
  • Protein name
  • Sequence
  • Number of:
    • Chains
    • Bonds
    • Amino acids
    • Alpha helices
    • Beta strands
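Most of the characteristics listed above can be cross-checked by reading the PDB file directly as text. The sketch below is one possible way to do that, assuming the standard fixed-column PDB layout (TITLE, ATOM, HELIX and SHEET records); the function names and the tiny made-up structure are hypothetical, and only the columns the parser actually reads are laid out faithfully in the demo lines. Bond counts are left to RasMol, since bonds within standard residues are not listed individually in the file.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def summarize_pdb(lines):
    """Collect name, per-chain sequence and secondary-structure counts."""
    title_parts = []
    residues = {}  # (chain, residue number, insertion code) -> residue name
    helices = strands = 0
    for line in lines:
        record = line[:6].strip()
        if record == "TITLE":
            title_parts.append(line[10:].strip())
        elif record == "ATOM":
            key = (line[21:22], line[22:26].strip(), line[26:27].strip())
            residues.setdefault(key, line[17:20].strip())
        elif record == "HELIX":
            helices += 1   # one HELIX record per helix
        elif record == "SHEET":
            strands += 1   # one SHEET record per strand
        elif record == "ENDMDL":
            break          # use only the first model of a multi-model file
    chains = {}
    for (chain, _, _), name in residues.items():
        chains[chain] = chains.get(chain, "") + THREE_TO_ONE.get(name, "X")
    return {
        "name": " ".join(title_parts),
        "chains": chains,
        "n_chains": len(chains),
        "n_amino_acids": sum(1 for n in residues.values() if n in THREE_TO_ONE),
        "n_helices": helices,
        "n_strands": strands,
    }


def demo_atom(serial, atom, res, chain, resseq):
    """Build a minimal ATOM line with chain and residue fields in the right columns."""
    return f"ATOM  {serial:>5} {atom:<4} {res:>3} {chain}{resseq:>4}"


demo_lines = [
    "TITLE     HYPOTHETICAL EXAMPLE",
    "HELIX    1",
    "SHEET    1",
    "SHEET    2",
    demo_atom(1, "N", "MET", "A", 1),
    demo_atom(2, "CA", "MET", "A", 1),
    demo_atom(3, "N", "LYS", "A", 2),
    demo_atom(4, "N", "GLY", "B", 1),
]
summary = summarize_pdb(demo_lines)
print(summary)
```

For a downloaded file, the same function could be applied with `summarize_pdb(open(path))`, where `path` is wherever the PDB file was saved; the counts can then be compared with what RasMol reports on loading.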
