
BioSystems Synthesis: New optima demand new technologies

BioSystems Synthesis: New optima demand new technologies. 17-Sep-2003 Virtual Conference on Genomics & Bioinformatics. Thanks to: DOE GtL DARPA BioComp PhRMA NHLBI. Harvard MIT DOE GtL Center. C.Ting. Collaborating PIs: Chisholm, Polz, Church, Kolter, Ausubel, Lory, Kucherlapati.


Presentation Transcript


  1. BioSystems Synthesis: New optima demand new technologies 17-Sep-2003 Virtual Conference on Genomics & Bioinformatics Thanks to: DOE GtL DARPA BioComp PhRMA NHLBI

  2. Harvard-MIT DOE GtL Center. C. Ting. Collaborating PIs: Chisholm, Polz, Church, Kolter, Ausubel, Lory, Kucherlapati

  3. Improving Models & Measures Why model? “Killer Applications”: Share, Search, Merge, Check, Design (e.g. sequence & 3D alignment)

  4. Biosystems Integrating Measures & Models Environment Metabolites RNAi Insertions SNPs DNA Proteins RNA Replication rate interactions Microbes Cancer & stem cells Darwinian optima In vitro replication Small multicellular organisms

  5. Why improve measurements?
     Human genomes: (6 billion)^2 = 10^19 bp
     Immune & cancer genome changes: >10^10 bp per time point
     RNA ends & splicing: in situ 10^12 bits/mm^3
     Biodiversity: environmental & lab evolution
     Compact storage: 10^5 now to 10^17 bits/mm^3 eventually
     & How? ($1K per genome, 10^8-10^13 bits/$)
     • The issue is not speed, but integration.
     • Cost per 99.99% bp: including reagents, personnel, equipment/5yr, overhead/sq.m
     • Sub-µm scale: 1 µm^3 = 1 femtoliter (10^-15 L)
     • Instruments should match GHz / $2K CPU

  6. Examples of cost bottlenecks Affymetrix $30M? microfabricator limited by chemical reaction rate to one set of chips per day. (~10000X CPU cost) Electrophoresis limited to 4000 bp/capillary/day. Fixed cost ratio of capillaries to CPUs. (~1e9X CPU cost)

  7. Projected costs determine when biosystems data overdetermination is feasible. In 1984, pre-HGP (φX, pBR322, etc.): 0.1 bp/$, which would have been $30B per human genome. In 2002 (de novo full vs. resequencing), ABI/Perlegen/Lynx: $300M vs. $3M, 10^3 bp/$ (4 log improvement). Other data I/O (e.g. video): 10^13 bits/$.
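The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the 3 x 10^9 bp genome size and the function names are assumptions chosen for illustration, not values or names taken from the slides.

```python
import math

# Assumption: a human genome of roughly 3e9 base pairs (not stated on the slide).
GENOME_BP = 3e9

def cost_per_genome(bp_per_dollar, genome_bp=GENOME_BP):
    """Dollars needed to sequence one genome at a given throughput in bp/$."""
    return genome_bp / bp_per_dollar

def log_improvement(old_bp_per_dollar, new_bp_per_dollar):
    """Improvement in bp/$ expressed in powers of ten ('logs')."""
    return math.log10(new_bp_per_dollar / old_bp_per_dollar)

cost_1984 = cost_per_genome(0.1)   # 1984 rate from the slide: 0.1 bp/$
cost_2002 = cost_per_genome(1e3)   # 2002 resequencing rate from the slide: 10^3 bp/$
print(f"1984: ${cost_1984:.3g}, 2002: ${cost_2002:.3g}, "
      f"improvement: {log_improvement(0.1, 1e3):.1f} logs")
```

Under that assumed genome size, the two rates reproduce the slide's $30B and $3M figures and the 4 log improvement.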

  8. Steeper than exponential growth. [Plot: instructions per second over time; 1965 Moore's law of integrated circuits; 1999 Kurzweil's law.] http://www.faughnan.com/poverty.html http://www.kurzweilai.net/meme/frame.html?main=/articles/art0184.html

  9. Why single molecules? (1) Integrate from cells/genomes/RNAs to data (2) Geometry, “cis-ness” on a molecule, complex, or cell. e.g. DNA Haplotypes & RNA splice-forms (3) Asynchronous dNTP incorporation

  10. Polymerase colonies (Polonies) along a DNA or RNA molecule. HMS: Shendure, Zhu, Butty, Williams. Wash U: Mitra. Ambergen: Olejnik. U. Del: Edwards, Merritt.

  11. Polymerase colony (polony) PCR in a gel. [Figure labels: single molecule from library; primer A has a 5' immobilizing Acrydite; primer is extended by polymerase; 1st round of PCR.] Mitra & Church, Nucleic Acids Res. 27: e34

  12. Sequence polonies by sequential, fluorescent single-base extensions
      • Hybridize universal primer
      • Add red (Cy3) dTTP. Wash.
      • Add green (FITC) dCTP
      • Wash; scan
      [Figure residue: strands B and B' with 3'/5' ends; sequence C G A T C G C G T ...]

  13. Inexpensive, off-the-shelf equipment: automated slide fluidics, $4K; MJR in situ cycler, $10K; microarray scanner, $26K-100K.

  14. Human Haplotype: CFTR gene, 45 kbp. Rob Mitra, Vincent Butty, Jay Shendure, Ben Williams

  15. Quantitative removal of Fluorophores Rob Mitra

  16. Sequencing multiple polonies. Template ST30: 3' TCACGAGT; complementary strand: AGTGCTCA. Bases added, in order: (C) A G T (C) (A) G (T) C (A) (G) T C A. Rob Mitra
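The base-addition pattern on this slide can be reproduced with a small simulation. The sketch below assumes that parentheses mark additions where no base was incorporated (the slide does not say so explicitly) and that one addition extends through any run of identical bases; the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def simulate_extensions(template_3to5, bases_added):
    """Simulate sequential single-base additions against a template.

    Returns the extended strand and, for each addition, how many bases
    were incorporated (0 means the added base did not match).
    """
    # Bases the polymerase must add, in order, reading the template 3'->5'.
    needed = [COMPLEMENT[b] for b in template_3to5]
    pos = 0
    counts = []
    for base in bases_added:
        n = 0
        # Assumption: a single addition extends through a homopolymer run.
        while pos < len(needed) and needed[pos] == base:
            pos += 1
            n += 1
        counts.append(n)
    return "".join(needed[:pos]), counts

def render(bases_added, counts):
    """Write non-incorporated additions in parentheses, as on the slide."""
    return " ".join(b if n else f"({b})" for b, n in zip(bases_added, counts))

strand, counts = simulate_extensions("TCACGAGT", "CAGTCAGTCAGTCA")
print(strand)
print(render("CAGTCAGTCAGTCA", counts))
```

With the slide's template and a repeating C, A, G, T addition order, the simulated incorporation pattern matches the parenthesized sequence shown above, which is consistent with the assumed reading of the parentheses.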

  17. Multiple Image Alignment • Metric based on optimal coincidence of high intensity noise pixels over a matrix of local offsets • (0.4 pixel precision)
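A coincidence-based alignment metric of this kind can be sketched as a brute-force offset search. This is a simplified illustration, not the authors' implementation: it searches one global, integer-pixel offset, whereas the slide refers to a matrix of local offsets and 0.4 pixel precision. The pixel sets and function names are made up for the example.

```python
def coincidence(ref, moving, dy, dx):
    """Count bright pixels of `moving` that land on bright pixels of `ref`
    after shifting `moving` by (dy, dx)."""
    return sum((r + dy, c + dx) in ref for r, c in moving)

def best_offset(ref, moving, max_shift=3):
    """Try every integer offset within +/- max_shift and return the one
    that maximizes the coincidence count."""
    candidates = [(dy, dx)
                  for dy in range(-max_shift, max_shift + 1)
                  for dx in range(-max_shift, max_shift + 1)]
    return max(candidates, key=lambda o: coincidence(ref, moving, *o))

# Hypothetical bright-pixel coordinates (row, col) in a reference image.
ref = {(5, 5), (7, 9), (12, 3), (2, 14)}
# The same pixels as seen in a second image displaced by (-2, +1).
moving = {(r - 2, c + 1) for r, c in ref}

print(best_offset(ref, moving))
```

Using sets of bright-pixel coordinates keeps each coincidence count proportional to the number of bright pixels rather than to the image area; sub-pixel precision would need interpolation on top of this.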

  18. 1 micron bead sequences. Correct signatures are pseudocolored red, white, yellow; noise signatures blue; and “guide” beads green.

  19. Polony exclusion principle & single pixel sequences. Mitra & Shendure

  20. Biosystems Integrating Measures & Models Environment Metabolites RNAi Insertions SNPs DNA Proteins RNA Replication rate interactions Microbes Cancer & stem cells Darwinian optima In vitro replication Small multicellular organisms

  21. CD44 Exon Combinatorics (Zhu & Shendure)
      Alternatively spliced cell adhesion molecule
      Specific variable exons are up- or down-regulated in various cancers
      Controversial prospective diagnostic / prognostic marker (>1000 papers)
      Can full isoforms resolve controversy and/or act as superior markers?
      Eph4 = murine mammary epithelial cell line
      Eph4bDD = stable transfection of Eph4 with MEK-1 (tumorigenic)

  22. Algorithm for RNA Polony Finding
      1. Search the signature image for qualified ‘objects’:
         a. > 50 connected pixels with the same signature value
         b. ‘solidity’ of > 0.50
         c. long axis / short axis ratio < 3
         OR
         a. > 25 connected pixels with the same signature value
         b. ‘solidity’ of > 0.80
         c. long axis / short axis ratio < 1.5
      2. Search for internal regional maxima within each object (lest two adjacent polonies with the same signature get counted as one)
      3. Assign centroid locations as qualified individual ‘polonies’
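The qualification test in step 1 can be written as a short predicate. This sketch assumes the per-object measurements have already been computed by some image-analysis step, and that 'solidity' means object area divided by convex-hull area (a common definition that the slide does not spell out). The `ObjectProps` class and its field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ObjectProps:
    n_pixels: int      # connected pixels sharing one signature value
    solidity: float    # assumed: object area / convex-hull area
    axis_ratio: float  # long axis / short axis

def is_qualified(obj):
    """Apply the two alternative rule sets from step 1 of the slide."""
    # Larger objects may be less compact and more elongated.
    large = obj.n_pixels > 50 and obj.solidity > 0.50 and obj.axis_ratio < 3
    # Smaller objects must be compact and nearly round.
    small = obj.n_pixels > 25 and obj.solidity > 0.80 and obj.axis_ratio < 1.5
    return large or small

candidates = [
    ObjectProps(n_pixels=120, solidity=0.60, axis_ratio=2.0),  # passes the first rule set
    ObjectProps(n_pixels=30, solidity=0.90, axis_ratio=1.2),   # passes the second rule set
    ObjectProps(n_pixels=30, solidity=0.60, axis_ratio=1.2),   # too small for the first, too loose for the second
    ObjectProps(n_pixels=200, solidity=0.90, axis_ratio=4.0),  # too elongated for either
]
print([is_qualified(c) for c in candidates])
```

Steps 2 and 3 (splitting objects at internal regional maxima and taking centroids) are not covered by this sketch.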

  23. RNA exon polony examples

  24. RNA exon examples, auto-regridded & quantitated. V1 V2 V3 V4 V5 V6 V7 V8 V9 V10

  25. Summary of Counts (RNA isoforms). Eph4 = murine mammary epithelial cell line. Eph4bDD = stable transfection of Eph4 with MEK-1 (tumorigenic). Jun Zhu

  26. Polony Flavors
      • Replica plating of DNA images [Mitra et al. NAR 1999]
      • Alternative RNA splicing combinatorics [Zhu et al. Science 2003]
      • Long range haplotyping [Mitra et al. PNAS 2003]
      • Precise SNP-mutant & mRNA ratios [Merritt et al. NAR 2003]
      • Fluorescent in situ Sequencing (FISSEQ) [Mitra et al. Anal. Biochem. 2003]
      • Tumor LOH [Butz et al. BMC Biotech. 2003]
      • Polony models [Aach & Church, submitted to JTB 2003]
      • http://arep.med.harvard.edu/Polonator/

  27. Biosystems Integrating Measures & Models Environment Metabolites RNAi Insertions SNPs DNA Proteins RNA Replication rate interactions Microbes Cancer & stem cells Darwinian optima In vitro replication Small multicellular organisms

  28. Comparison of predicted with observed protein properties (abundance, localization, postsynthetic modifications), E. coli. Link et al. 1997 Electrophoresis 18:1259-313 (Pub)

  29. Multidimensional peptide measures (optionally, protein separation steps). [Figure labels: 3rd, 2nd]

  30. Prochlorococcus Proteogenomic Map. Numbers on top are in base pairs. 1700 ORFs are predicted. The Proteomic Model is based on mass spectrometry of peptides at 24 h time points. The Difference Map indicates new peptide regions. The 6 colors represent ORFs in the 6 reading frames. (Harvard-MIT GtL: Jaffe, Church, Lindell, Chisholm, et al.)

  31. Circadian time-series (Prochlorococcus): RNA & protein quantitation. [Plot residue: axes RNA (3 AM) vs. RNA (3 AM); R^2 = .992, R^2 = .635; linear regression R^2 = .1] (Harvard-MIT GtL: Jaffe, Church, Lindell, Chisholm, et al.)

  32. In vivo crosslinking DNA-binding proteins

  33. RNAs & Proteomics Integration: Next steps
      • Detect a higher fraction of peptides (currently ~80% proteins, 87% peptides max, 19% average)
      • Comparative proteomics (e.g. high vs. low light adapted)
      • Smoother time-series
      • Degradation

  34. Biosystems Integrating Measures & Models Environment Metabolites RNAi Insertions SNPs DNA Proteins RNA Replication rate interactions Microbes Cancer & stem cells Darwinian optima In vitro replication Small multicellular organisms

  35. Synthetic Biology • Test or manipulate optimality • Program minimal cells (100kbp) • Nanobiotechnology - new polymers • Manage complex systems • e.g. stem cells & ocean ecology

  36. Suboptimality of mutants: integrating growth rate & flux data. Minimization of Metabolic Adjustment (MoMA) for the analysis of non-optimal metabolic phenotypes. Daniel Segre, Dennis Vitkup

  37. MoMA/FBA REFERENCES
      - Haemophilus influenzae metabolism (Schilling and Palsson, J. Theor. Biol. 2000)
      - Escherichia coli metabolic network and gene deletions (Edwards and Palsson, PNAS 2000, BMC Bioinf. 2000)
      - Helicobacter pylori (Edwards, Schilling, Covert, Church, Palsson, J. Bact. 2002)
      - Escherichia coli MOMA (Segre, Vitkup, & Church, PNAS 2003)

  38. Fluxes include transport, & a growth flux. [Diagram labels: V_trans, membrane, V_syn, V_deg, X_i, V_growth.] X_i = const., so Σ_j v_j = 0. Growth: c_1 X_1 + c_2 X_2 + ... + c_m X_m -> Biomass

  39. Biomass Composition. [Chart: coefficient in growth reaction per metabolite; labeled metabolites include ATP, GLY, LEU, ACCOA, NADH, FAD, SUCCOA, COA.]

  40. Flux Balance Analysis core. Find max{Growth} using simplex. Null(S) = {v : Sv = 0}
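Written out as a linear program, the FBA step on this slide takes roughly the following form. The flux bounds α_j and β_j are an assumption added here to make the problem bounded; the slide itself shows only the steady-state constraint Sv = 0 and the growth objective.

```latex
\begin{aligned}
\max_{v} \quad & v_{\text{growth}} \\
\text{subject to} \quad & S\,v = 0 && \text{(steady state: } v \in \operatorname{Null}(S)\text{)} \\
& \alpha_j \le v_j \le \beta_j && \text{for each flux } j \text{ (assumed bounds)}
\end{aligned}
```

Both the objective and the constraints are linear in v, which is why the simplex method applies.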

  41. Can we use flux analysis to say something about suboptimal states?

  42. Flux ratios at each branch point yield optimal polymer composition for replication. x, y are two of the 100s of flux dimensions.

  43. Projection can leave the mutant feasible space, so quadratic programming (QP) is used to find the nearest point.
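The nearest-point search can be stated as a quadratic program. This is a sketch in assumed notation: v^{WT} denotes the wild-type flux vector, k the reaction removed in the mutant, and α_j, β_j flux bounds; none of these symbols appear on the slide.

```latex
\begin{aligned}
\min_{v} \quad & \lVert v - v^{WT} \rVert^{2} = \sum_{j} \bigl(v_j - v^{WT}_j\bigr)^{2} \\
\text{subject to} \quad & S\,v = 0 \\
& \alpha_j \le v_j \le \beta_j \quad \text{for each flux } j \\
& v_k = 0 \quad \text{(deleted reaction)}
\end{aligned}
```

The objective is quadratic while the constraints remain linear, so the solution is the point of the mutant feasible space closest, in Euclidean distance, to the wild-type optimum.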

  44. 12C 13C Flux Ratio Data

  45. Flux data, C-limited. [Figure: three scatter plots of predicted fluxes vs. experimental fluxes, with numbered flux points 1-18. Panels: WT (LP), r=0.91, p=8e-8; Δpyk (LP) and Δpyk (QP), with reported correlations r=0.56, p=7e-3 and r=-0.06, p=6e-1.]

  46. Flux data (MOMA & FBA)

  47. Competitive growth data. On minimal media: negative / small selection effect. χ^2 p-values: 4x10^-3, 1x10^-5. Novel redundancies. Position effects.

  48. Replication rate of a whole-genome set of mutants Badarinarayana, et al. (2001) Nature Biotech.19: 1060

  49. Replication rate challenge met: multiple homologous domains. [Figure: selective disadvantage in minimal media, by probe. lysC (domains 1, 2): 10.4; thrA (domains 1, 2, 3): 1.1, 6.7; metL (domains 1, 2, 3): 1.8, 1.8.]
