Genome Function Project

Genome Function Project. UCSC, George Church, 24 Aug 2001. We thank for support: Government and private grant agencies: NHLBI, NSF, ONR, DOE, DARPA, HHMI, Lipper, Armenise. Corporate collaborators & sponsors: Affymetrix, GTC, Mosaic, Aventis, Dupont. Post-Structural Genomics Data.


Presentation Transcript


  1. Genome Function Project. UCSC, George Church, 24 Aug 2001. We thank for support: Government and private grant agencies: NHLBI, NSF, ONR, DOE, DARPA, HHMI, Lipper, Armenise. Corporate collaborators & sponsors: Affymetrix, GTC, Mosaic, Aventis, Dupont

  2. Post-Structural Genomics Data: gcggatttagctcagttgggag agcgc cagact gaaga tttgga ggtcctgtgtt cgatc cacagaattcgcacca

  3. Post-300 Genome Sequences: 0.5 to 7 Mbp; 10 Mbp to 1000 Gbp (figure)

  4. Function Genomics Measures & Models (diagram labels: Environment, Metabolites, Interactions, RNA, DNA, Protein, Growth rate, Expression)

  5. Exponential technologies: 1993 first browser; 1994 commercial www

  6. Agenda
     1. mapping human variation (haplotype map)
     2. obtaining a complete and validated set of human genes, including
        - multiple alleles, transcripts, protein or structural RNA products
        - regulatory elements
     3. understanding the diversity of life through genomic analysis of many organisms, and understanding how one organism works by comparative genomics with others
        - how genomes evolved
     4. creating a new quantitative systems biology, beyond drawing circles and arrows on paper and labeling them with names nobody can remember
        - mapping the key interactions
        - mathematical/computational models of pathways and systems
        - dealing with multiple levels from atoms to cells

  7. In vitro minigenome
     - Steve Blackwell, HMS: pure IF, EF
     - Tony Forster, BWH: tRNAs & modified bases
     - Manz Ehrenberg, Dieter Soll: tRNA-synthetases
     - Josh LaBaer, HMS-HIP: Expression constructs
     - Jingdong Tian, HMS: Protein synthesis
     - Rob Mitra & Xiaohua Huang, HMS: Polymerases, RCA
     - Gloria Culver, Iowa State: ribosomal proteins & rRNA
     - Harry Noller, UCSC: ribosomes

  8. In vitro minigenome
     A) From atoms to evolving minigenomes and cells. This could improve in vitro transcription/translation/replication systems and conceptually link atomic (mutational) changes, via molecular and systems modeling, to population evolution. The synthesis of pure systems of proteins with natural or novel modifications would be of great significance. This could give an incredible focus to structural genomics.
     B) From cells to tissues. Modeling the effects of combinations of membrane signals and genome-programming on RNA and protein expression profiles would allow, among other things, manipulating stem-cell fate and stability. Stability would be key to both cell culture and to long-term avoidance of cancerous stem-cell proliferation. The ability of "programmed" cells to replace or augment small molecule drugs could be rigorously assessed.
     C) From tissues to systems. Computational programming of cell and tissue morphology can develop quantitative concepts in complexity, chaos, robustness, and evolvability to engineer useful models, such as sensor-effector neural feedback systems where macro aspects of the system determine the past (Darwinian) or future (prosthetic) function of the altered genomes.

  9. Grand Challenges: goals (& details)
     • The Manhattan Project ’43-45: Nuclear chain reaction (without igniting the atmosphere)
     • The Apollo Project ’62-69: Send a person to the moon (& back)
     • The Smallpox Eradication ’66-77: from the whole globe (including freezers)
     • The Human Genome Project ’90-05: 3 billion bases (at 99.99% accuracy & searchable)

  10. Grand Challenges: goals (& details)
     • The Manhattan Project ’43-45: Nuclear chain reaction (without igniting the atmosphere)
     • The Apollo Project ’62-69: Send a person to the moon (& back)
     • The Smallpox Eradication ’66-77: from the whole globe (including military freezers?)
     • The Human Genome Project ’90-05: 3 billion bases (at 99.99% accuracy with comparisons)
     • The BioSystems Project ’02- ??

  11. Potential BioSystems Project Challenges
     Programming smart biomaterials:
     1. 0.1 nanometer positioning at 1 kHz in a 50 nm cube (Foresight Feynman Challenge)
     2. I/O to sub-nano memory in DNA
     Programming cells & populations:
     3. 10 sec. mini-cell cycle, 85 kbp genome
     4. Bioremediation microbial populations
     Programming ourselves:
     5. Drug structure-activity prioritization
     6. Universal, non-aging human stem cells

  13. Why the genome project worked
     - Sequence searching: Ulam’61-74, Staden’79, Lipman’87, Myers’87, Green’93...
     - Polymer synthesis & sequencing: Hood’75-00, Hunkapiller’77-00, Carruthers’79...
     - Chemistry: Tabor’93, Karger’94, Mathies’96, Mullis’84...
     - Shotgun & mapping: Sanger’77, Brenner’72-02, Sulston’90, Olson’80-00...
     - Infrastructure: Wada’82, DeLisi’84, Gilbert’87, Watson’88, Venter’91...

  14. Metrics for structural & functional data
     Columns: data type; Automate; Data quality; Model quality; Similarity search
     - X-ray diffraction; 1960; resolution < 0.2 nm; |o-c|/o, R < 0.2; DALI, etc.
     - Sequence bp; 1988; discrepancy < 0.01%; conserved proteins; BLAST
     - Expression; 1999; cc, t-test; shared motifs, shared function; Biclustering
     - Interact/growth; (none given); outliers; optimality; as above?

  15. Types of Systems Interaction Models (increasing scope, decreasing resolution; from nm-fs to km-yr)
     - Quantum Electrodynamics: subatomic
     - Quantum mechanics: electron clouds
     - Molecular mechanics: spherical atoms (nm-fs)
     - Master equations: stochastic single molecules
     - Fokker-Planck: approx. stochastic
     - Macroscopic rates ODE: Concentration & time (C,t)
     - Flux Balance Optima: dCik/dt, optimal steady state
     - Thermodynamic models: dCik/dt = 0, k reversible reactions
     - Steady State: Σ dCik/dt = 0 (sum over k reactions)
     - Metabolic Control Analysis: d(dCik/dt)/dCj (i = chem. species)
     - Spatially inhomogeneous: dCi/dx
     - Population dynamics: as above (km-yr)
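
The steady-state entry on this slide says that, for each chemical species i, the contributions dCik/dt from all k reactions sum to zero. A minimal Python sketch of that check follows. The three-reaction chain (uptake, A to B, export), its stoichiometry, and the flux values are hypothetical illustrations, not data from the talk.

```python
# Hypothetical toy network (not from the talk): uptake -> A -> B -> export.
# Each row is a chemical species i; each column is a reaction k.
STOICHIOMETRY = {
    "A": [1, -1, 0],   # produced by uptake, consumed by A -> B
    "B": [0, 1, -1],   # produced by A -> B, consumed by export
}


def net_rates(stoichiometry, fluxes):
    """Return dCi/dt = sum over k of dCik/dt for every species i."""
    return {
        species: sum(coeff * flux for coeff, flux in zip(row, fluxes))
        for species, row in stoichiometry.items()
    }


def is_steady_state(stoichiometry, fluxes, tol=1e-9):
    """True when every species has a (numerically) zero net rate."""
    rates = net_rates(stoichiometry, fluxes)
    return all(abs(rate) <= tol for rate in rates.values())


balanced = [2.0, 2.0, 2.0]     # every reaction carries the same flux
unbalanced = [2.0, 1.0, 1.0]   # uptake outpaces conversion, so A accumulates

print(net_rates(STOICHIOMETRY, balanced))
print(is_steady_state(STOICHIOMETRY, unbalanced))
```

A flux-balance model would add an objective to optimize over the flux vectors that pass this check; that optimization step is not shown here.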

  16. Sources of Data for BioSystems Modeling
     - Capillary electrophoresis, $300,000 (DNA Sequencing): 0.4 Mb/day
     - Chromatography-Mass Spectrometry (e.g. peptide LC-ESI-MS): 20 Mb/day
     - Microarray scanners (e.g. RNA): 300 Mb/day
     Reagent costs:
     - Electrophoresis (DNA Sequencing): 10 ul per 0.5 Kb
     - Microarray reactions: 10 ul per 1000 Kb
     Intel CMOS microscope: $99

  17. RNA quantitation. Aach, Rindone, Church (2000) Genome Research 10: 431-445.
     • Microarrays (1): experiment vs. control per ORF; R/G ratios; R, G values; quality indicators
     • Affymetrix (2): 25-mers, PM and MM per ORF; averaged PM-MM; “presence”; feature statistics
     • SAGE (3): counts of SAGE 14-mer sequence tags for each ORF; concatamers
     (1) DeRisi et al., Science 278:680-686 (1997)
     (2) Lockhart et al., Nat Biotech 14:1675-1680 (1996)
     (3) Velculescu et al., Serial Analysis of Gene Expression, Science 270:484-487 (1995)
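
Two of the measures named on this slide are simple enough to sketch in Python: the averaged PM-MM difference for an oligonucleotide probe set, and the R/G ratio for a two-color microarray spot (shown here as a log2 ratio, which is a presentation choice and not something the slide specifies). All intensity values below are made up for illustration, and the function names are hypothetical.

```python
import math


def average_difference(pm, mm):
    """Mean of (PM - MM) over the probe pairs of one ORF."""
    if len(pm) != len(mm) or not pm:
        raise ValueError("need equal-length, non-empty PM and MM lists")
    return sum(p - m for p, m in zip(pm, mm)) / len(pm)


def log2_ratio(red, green):
    """log2 of the R/G intensity ratio for one two-color spot."""
    return math.log2(red / green)


# Hypothetical intensities for four probe pairs of a single ORF.
pm = [820.0, 640.0, 910.0, 700.0]
mm = [210.0, 190.0, 260.0, 180.0]

print(average_difference(pm, mm))
print(log2_ratio(1200.0, 300.0))
```

Real pipelines also apply background correction, normalization, and outlier handling, none of which is modeled in this sketch.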

  18. Array opportunities
     • 22 bp ds-RNAi array modulates single cell type
     • Drug array time-release or photo-release
     • Primer pair arrays for haplotyping
     • Gene & genome synthesis (DARPA)

  19. Polypeptide arrays
     - Photo-deprotect peptides (Affymax)
     - Piezo or contact spotting (Harvard-CGR, Stanford)
     - Phage or ribosome display capture (Bulyk)
     - In situ ribosomal synthesis (Tian)
     Harvard Inst. Proteomics, FLEXGene consortium

  20. 1st Round of PCR. Single molecule from library. Primer A has 5’ immobilizing (Acrydite) modification. Primer is extended by polymerase. (Figure labels: A, A’, B)

  21. Sequence polonies by sequential, fluorescent single-base extensions
     1. Remove 1 strand of DNA.
     2. Hybridize Universal Primer.
     3. Add Red (Cy3) dTTP.
     4. Wash; Scan Red Channel.
     (Figure labels: 3’, 5’, B, B’; sequence A G T C G T G . . . .)

  22. Sequence polonies by sequential, fluorescent single-base extensions
     5. Add Green (FITC) dCTP.
     6. Wash; Scan Green Channel.
     (Figure labels: 3’, 5’, B, B’; sequence C G A T C G C G T . . .)

  23. Polony Template
     T A T T G T T A A A G T G T G T C C T T T G T C G A T A C T G G T A …5’
     3’ P’ A T A A C A A T T T C A C A C A G G A A A C A G C T A T G A C C A T 5’ P
     Primer Extension: 26 cycles, 34 Nucleotides
     Mean Intensity, FITC (C), CY3 (T): 58, 0.5; 40, 6.5; 0.3, 48; 0.4, 43
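
The two sequences on this slide can be checked against each other and against the "26 cycles, 34 Nucleotides" figure. The Python sketch below assumes that the second sequence is the strand built by primer extension and that one extension cycle incorporates a whole run of identical bases. Both are interpretations of the slide, not statements from it.

```python
from itertools import groupby

# Sequences copied from the slide with spaces removed.
TEMPLATE = "TATTGTTAAAGTGTGTCCTTTGTCGATACTGGTA"
EXTENDED = "ATAACAATTTCACACAGGAAACAGCTATGACCAT"  # assumed to be the extended strand

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Base-by-base complement, keeping the written order."""
    return "".join(COMPLEMENT[base] for base in seq)


def homopolymer_runs(seq):
    """Collapse a sequence into (base, run_length) pairs."""
    return [(base, len(list(group))) for base, group in groupby(seq)]


runs = homopolymer_runs(EXTENDED)

print(complement(TEMPLATE) == EXTENDED)  # are the strands complementary as written?
print(len(EXTENDED), len(runs))          # nucleotides, homopolymer runs
```

Under these assumptions the strands are complementary as written, and the 34 nucleotides collapse into 26 runs, which matches the cycle count on the slide.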

  24. Polony haplotyping (figure panels: Trans, Cis)

  25. Function Genomics Measures & Models (diagram labels: Environment, Metabolites, RNAi, Insertions, SNPs, RNA, DNA, Protein, Growth rate; microbes, stem cells, cancer cells, multicellular organisms)

  26. Competition among multiple mutations & multiple homologous domains
     Selective disadvantage in minimal media (figure labels: gene; probes; values, as extracted):
     - lysC; 1 2; 10.4
     - thrA; 1 2 3; 1.1 6.7
     - metL; 1 2 3; 1.8 1.8

  27. Multiple mutations per gene: correlation between two selection experiments

  28. Comparison of selection data with FBO predictions (scale up from 79 to 488 genes)
     Columns: predictions; number of genes; negatively selected; not negatively selected
     - essential; 143; 80; 63
     - reduced growth rate; 46; 24; 22
     - non essential; 299; 119; 180
     P-value Chi Square = 0.004
     > Novel duplicates? < Position effects?
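
The P-value on this slide can be checked from the counts it lists. The sketch below assumes the test was a Pearson chi-square test of independence on the 3 x 2 table of prediction class against selection outcome; the slide does not say which variant was used. With 2 degrees of freedom the chi-square survival function has the closed form exp(-x/2), so no statistics library is needed.

```python
import math

# Counts from the slide: (negatively selected, not negatively selected).
OBSERVED = {
    "essential": (80, 63),
    "reduced growth rate": (24, 22),
    "non essential": (119, 180),
}


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    total = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(rows, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, dof


def p_value_two_dof(stat):
    """Chi-square upper-tail probability; this closed form holds only for dof == 2."""
    return math.exp(-stat / 2)


stat, dof = chi_square(OBSERVED)
p = p_value_two_dof(stat)
print(round(stat, 2), dof, round(p, 4))
```

Under that assumption the statistic is about 11.0 and the P-value rounds to 0.004, in line with the slide.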

  29. Function Genomics Measures & Models Environment Metabolites RNA Protein DNA Expression

  30. RNA quantitation (Frequently Asked Questions)
     - Is less than a 2-fold RNA-ratio ever important? Yes; 1.5-fold in trisomies.
     - Why oligonucleotides rather than cDNAs? Alternative RNAs, gene families.
     - Using a subset of the genome or ratios to various control RNAs? Trouble for later (meta) analyses.

  31. Lpp mRNA start & structure. See: Selinger et al., Nat Biotech

  32. Oligo selection
     Flowchart labels, in extracted order (figure courtesy of Adnan Derti): gene sequences; generate candidate oligos; predict cross-hybridization; filter & select oligos; experimental results; parameters (Tm, length, ...); gene-specific oligos; background sequences; generate chip layout; generate control, border oligos; controls, text, border oligos; chip layout
     • PGA/Smith group already designing software for oligo selection
     • Church Lab / Lipper Center has additional tools
       • Unique oligos (cu-15s)
       • RNA string matching program
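
The flowchart steps on this slide (generate candidate oligos, predict cross-hybridization, filter and select by parameters such as Tm and length) can be sketched in a few lines of Python. This is an illustrative toy, not the software the slide refers to. It uses the Wallace rule, Tm = 2(A+T) + 4(G+C), as a rough melting-temperature estimate for short oligos, and it treats an exact substring match in a background sequence as a stand-in for a cross-hybridization prediction. The sequences, function names, and thresholds are all hypothetical.

```python
def wallace_tm(oligo):
    """Rough Tm in degrees C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    at = sum(oligo.count(base) for base in "AT")
    gc = sum(oligo.count(base) for base in "GC")
    return 2 * at + 4 * gc


def candidate_oligos(gene, length):
    """Every window of the given length along the gene sequence."""
    return [gene[i:i + length] for i in range(len(gene) - length + 1)]


def select_oligos(gene, background, length, tm_min, tm_max):
    """Keep candidates inside the Tm window that do not occur in the background."""
    selected = []
    for oligo in candidate_oligos(gene, length):
        if not tm_min <= wallace_tm(oligo) <= tm_max:
            continue  # fails the Tm parameter filter
        if any(oligo in other for other in background):
            continue  # exact match elsewhere: crude cross-hybridization proxy
        if oligo not in selected:
            selected.append(oligo)
    return selected


# Hypothetical toy sequences, far shorter than real genes.
gene = "ATGGCGTACGTTAGC"
background = ["TTGGCGTACGAA"]

oligos = select_oligos(gene, background, length=8, tm_min=22, tm_max=26)
print(oligos)
```

A realistic design tool would score near matches and secondary structure as well; exact matching only removes the most obvious non-unique candidates.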

  33. Combinatorial arrays for binding constants (EGR1). HMS: Martha Bulyk, Xiaohua Wang, Martin Steffen; MRC: Yen Choo. (Figure label: ds-DNA array)

  34. Combinatorial arrays for binding constants (figure labels: pVIII, pIII, Antibodies, Phage, Combinatorial DNA-binding protein domains, ds-DNA array)

  35. Combinatorial arrays for binding constants. Martha Bulyk et al. (Figure labels: Phycoerythrin - 2º IgG, Phage, Combinatorial DNA-binding protein domains, ds-DNA array)

  36. Interactions of Adjacent Basepairs in EGR1 Zinc Finger DNA Recognition Isalan et al., Biochemistry (‘98) 37:12026-12033

  37. Wildtype EGR1 Microarray (figure labels: high [DNA]; (+) ctrl sequence for wt binding; etc.; alignment oligos)

  38. Motifs weight all 64 Ka app
     Finger sequences: Wildtype RSDHLTT; RGPDLAR; REDVLIR; LRHNLET; KASNLVS
     Triplets and values, in extracted order: TGG 2.8 nM; GCG 16 nM; 2.5 nM; TAT 5.7 nM; AAA, AAT, ACT, AGA, AGC, AGT, CAT, CCT, CGA, CTT, TTC, TTT; AAT 240 nM

  39. For more information: arep.med.harvard.edu