Accurate Multiplex Polony Sequencing of an Evolved Bacterial Genome - PowerPoint PPT Presentation

ariadne
Presentation Transcript

  1. Accurate Multiplex Polony Sequencing of an Evolved Bacterial Genome Jay Shendure, Gregory J. Porreca, Nikos B. Reppas, Xiaoxia Lin, John P. McCutcheon, Abraham M. Rosenbaum, Michael D. Wang, Kun Zhang, Robi D. Mitra, George M. Church. Science 2005, vol. 309, pp. 1728-32

  2. Authors • George M. Church: BA 1973, Duke University; PhD 1984, Harvard University; Professor of Genetics at Harvard Medical School; Director of the Harvard/MIT DOE Genomes-to-Life Center; Director of the Harvard/MIT/WashU NHGRI CEGS • Jay Shendure: MD-PhD student in the Church Lab

  3. Background - Sanger Sequencing • Developed by Frederick Sanger • Uses ddNTPs to terminate DNA synthesis • DNA fragments are run on a gel and analyzed

  4. Sequence analysis - Then • Use radioactively labeled dNTPs • Run samples in four columns • Expose your film • Manually determine sequence www.carnegieinstitution.org
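
As a rough illustration of how a four-lane gel is read, the toy model below assigns each terminated fragment to the lane of its final base, then reads the bands from shortest to longest. The function names are hypothetical and the chemistry is idealized (one fragment per position, no band compression).

```python
def sanger_lanes(new_strand):
    """Return, per ddNTP lane, the lengths of fragments that terminate in that base."""
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        # A fragment of this length ends in `base`, so it appears in that lane.
        lanes[base].append(length)
    return lanes

def read_gel(lanes):
    """Read the sequence by walking the bands from shortest to longest fragment."""
    bands = sorted((length, base) for base, lengths in lanes.items() for length in lengths)
    return "".join(base for _, base in bands)

lanes = sanger_lanes("GATTACA")
print(lanes)
print(read_gel(lanes))
```

Reading the shortest band first recovers the synthesized strand in its 5' to 3' order, which is what the manual film reading on this slide amounts to.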

  5. Sequence analysis - Now • Use fluorescently labeled ddNTPs • Run samples together in a capillary • Fluorescence detected with a laser • Computer determines sequence www.jgi.doe.gov

  6. The Problem • Electrophoretic methods may very well be reaching their limits • New sequencing methods need to be developed if we are to achieve the goal of the $1000 genome

  7. The Problem • Electrophoretic methods may very well be reaching their limits • New sequencing methods need to be developed if we are to achieve the goal of the $1000 genome • The Solution: Use a cyclic array method

  8. Step One: Generate DNA Fragments • Genomic DNA is sheared into ~1 kb fragments (purified from a gel) • Fragments are circularized using a “universal linker” containing MmeI cut sites • Rolling circle amplification • MmeI cuts downstream of its recognition site, so digestion yields 17-18 bp genomic DNA tags flanking the linker • Universal sequences are then ligated to the 5’ and 3’ ends of the fragment Fig. 1a

  9. Step Two: Emulsion PCR • Critically dilute DNA into a PCR mix containing 1 µm paramagnetic beads that carry one of the PCR primers • Make a water-in-oil emulsion, creating mini reaction chambers that contain one bead and a single DNA fragment • PCR results in each bead being coated with copies of a single fragment Fig. 1b
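
To see why the dilution matters, the sketch below assumes that template molecules land in droplets according to a Poisson distribution. That model and the mean occupancies tried here are assumptions for illustration, not figures from the slides.

```python
import math

def droplet_occupancy(lam):
    """Return (P(0), P(1), P(>1)) for an assumed Poisson mean of lam templates per droplet."""
    p0 = math.exp(-lam)
    p1 = lam * math.exp(-lam)
    return p0, p1, 1.0 - p0 - p1

for lam in (0.05, 0.1, 1.0):  # hypothetical mean templates per droplet
    p0, p1, p_multi = droplet_occupancy(lam)
    # Among droplets holding any template, the share holding exactly one.
    clonal = p1 / (p1 + p_multi)
    print(f"lam={lam}: empty={p0:.3f} single={p1:.3f} multi={p_multi:.4f} clonal={clonal:.3f}")
```

Under this model, a low mean occupancy leaves most droplets empty but makes nearly every occupied droplet clonal, which is the trade-off that the enrichment step on the next slide compensates for.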

  10. Step Three: Enrichment & Monolayering • Non-magnetic, low density “capture” beads allow for enrichment of the amplified fraction of PCR beads via centrifugation • Beads are then immobilized and mounted for automated sequencing Fig. 1c

  11. Step Four: Cycles of Sequencing and Imaging • Sequencing is done in automated cycles • A computer algorithm identifies each bead and determines which color it fluoresces in each round Fig. 1d

  12. Supplementary Fig. 7a+b How Sequencing Works • Beads are incubated with one of four anchor primers flanking the genomic DNA tags • Excess primer is washed away

  13. Supplementary Fig. 7a+b How Sequencing Works • Next, a mixture of degenerate nonamers is ligated to the anchor primer • Each nonamer is specific at only one position, the “query position”, and is labeled with one of four fluorophores according to the base at that position • Excess nonamers are washed away

  14. Supplementary Fig. 7a+b How Sequencing Works • Image • Strip off the ligated primer and repeat with a new anchor • Accurate for the first 6 bases in the 5’-3’ direction and 7 bases in the 3’-5’ direction • Identifies 26 bp per amplicon • Map reads against the reference genome
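
The ligation cycles described on the last three slides can be sketched as a toy simulation. The color names, the example tag, and the function names are hypothetical; the sketch also ignores strand complementarity and simply treats the color of each cycle as reporting the base at the query position.

```python
# Hypothetical color assignment; the actual fluorophores are not named on the slides.
COLOR_OF_BASE = {"A": "blue", "C": "green", "G": "yellow", "T": "red"}
BASE_OF_COLOR = {color: base for base, color in COLOR_OF_BASE.items()}

def ligation_cycle(tag, query_pos):
    """One cycle: the bead takes the color of the nonamer pool matching the base
    at query_pos (1-based, counted from the anchor primer)."""
    return COLOR_OF_BASE[tag[query_pos - 1]]

def read_tag(tag, n_positions):
    """Run one cycle per query position and translate the observed colors to bases."""
    colors = [ligation_cycle(tag, pos) for pos in range(1, n_positions + 1)]
    return "".join(BASE_OF_COLOR[color] for color in colors)

tag = "GATTACAGGC"  # made-up genomic tag next to the anchor primer
print(read_tag(tag, 6))

# Reading 6 bases one way and 7 the other on each of the two tags
# accounts for the 26 bp per amplicon quoted on the slide.
bases_per_amplicon = 2 * (6 + 7)
print(bases_per_amplicon)
```

Because the primer is stripped after each image, every query position is interrogated independently, so an error in one cycle does not propagate to the next.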

  15. Raw Data Acquisition and Base Calling • Top: bright-field image of beads • Bottom: false-colored depiction of four fluorescent images acquired during a single ligation reaction • Image represents 0.01% of total area Fig. 2a + b

  16. Raw Data Acquisition and Base Calling • Tetrahedral representation of the data obtained from a single image cycle • Data points clustered around the 4 potential base calls for a single position on the amplicon Fig. 2c
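
The slide shows bead signals clustering near four points, one per base. A much simplified stand-in for that idea, not the authors' algorithm, is to normalize the four channel intensities of a bead and call the dominant channel. The channel order and the intensity values below are made up for illustration.

```python
def call_base(intensities, channels="ACGT"):
    """Call the base whose channel holds the largest share of the total signal.

    Returns (base, share); share near 1.0 means the bead sits close to one
    corner of the tetrahedron, share near 0.25 means an ambiguous signal.
    """
    total = sum(intensities)
    if total <= 0:
        return "N", 0.0  # no usable signal on this bead
    shares = [value / total for value in intensities]
    best = max(range(len(shares)), key=lambda i: shares[i])
    return channels[best], shares[best]

print(call_base([10, 200, 15, 5]))   # one channel dominates
print(call_base([50, 55, 48, 52]))   # nearly flat: low-confidence call
```

The returned share could serve as a crude quality score, for example to discard beads whose signal is spread evenly across channels.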

  17. Results Table 1 • ~1.16 million of 1.6 million reads were accurately mapped to the reference genome • ~30.1 million bases of resequencing data with a raw accuracy of 99.7% • 91.4% of genome covered
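
The figures on this slide can be cross-checked with a little arithmetic. The sketch assumes each mapped read contributes the 26 bp per amplicon quoted earlier; the variable names are illustrative.

```python
mapped_reads = 1_160_000      # ~1.16 million, from the slide
total_reads = 1_600_000       # 1.6 million, from the slide
bases_per_read = 26           # bp identified per amplicon (slide 14)
reported_bases = 30_100_000   # ~30.1 million, from the slide
raw_accuracy = 0.997

mapped_fraction = mapped_reads / total_reads
implied_bases = mapped_reads * bases_per_read
expected_raw_errors = reported_bases * (1 - raw_accuracy)

print(f"fraction of reads mapped: {mapped_fraction:.1%}")
print(f"bases implied by mapped reads: {implied_bases / 1e6:.2f} million")
print(f"raw errors implied by 99.7% accuracy: about {expected_raw_errors:,.0f}")
```

The implied base count agrees with the reported ~30.1 million to within rounding of the read count, and the raw error rate implies tens of thousands of miscalled bases, which is why consensus across overlapping reads matters on the following slides.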

  18. Results Table 1 • 100 random single nucleotide changes (SNCs) were added to the reference sequence • Data represent 2 independent sets of SNCs • All SNCs present in the dataset were identified wherever there was at least 2x coverage
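
A minimal sketch of coverage-gated variant calling, assuming ungapped reads already placed on the reference and a deliberately strict rule (every read at the position must agree). This is an illustration of the 2x coverage idea, not the consensus method used in the paper; the reference, reads, and function name are invented.

```python
from collections import Counter, defaultdict

def call_snc(reference, reads, min_coverage=2):
    """Return {position: (ref_base, called_base)} for positions where at least
    min_coverage reads overlap and all of them agree on a non-reference base.

    reads is a list of (start, sequence) pairs, 0-based, aligned without gaps.
    """
    pile = defaultdict(Counter)
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pile[start + offset][base] += 1
    calls = {}
    for pos, counts in pile.items():
        depth = sum(counts.values())
        base, support = counts.most_common(1)[0]
        if depth >= min_coverage and support == depth and base != reference[pos]:
            calls[pos] = (reference[pos], base)
    return calls

reference = "ACGTACGTAC"
reads = [(0, "ACGAAC"), (2, "GAACGT"), (6, "GTAT")]
print(call_snc(reference, reads))
```

In this toy input the mismatch at position 3 is seen by two reads and is called, while the mismatch at position 9 is seen only once and is ignored as a possible raw error.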

  19. Results Fig. 3c+e • Searching the data set for aberrantly mapped mate-paired tags identified rearrangements • C: 776 bp deletion • E: heterogeneous inversion • These were known features
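
The mate-pair logic can be sketched as a simple classifier. It assumes that the two tags of a consistent pair map to the same strand about 1 kb apart (the fragment size from Step One); the tolerance, the labels, and the function name are hypothetical, and circular-genome wraparound is ignored.

```python
def classify_pair(pos1, strand1, pos2, strand2, expected=1000, tolerance=300):
    """Label a mate pair by how its mapped positions compare with the expected spacing."""
    if strand1 != strand2:
        return "inversion candidate"
    distance = abs(pos2 - pos1)
    if distance > expected + tolerance:
        # Tags map farther apart on the reference than in the sample:
        # sequence between them is missing from the sample.
        return "deletion candidate"
    if distance < expected - tolerance:
        return "insertion candidate"
    return "consistent"

# A ~1 kb fragment spanning a 776 bp deletion maps ~1776 bp apart on the reference.
print(classify_pair(10_000, "+", 11_776, "+"))
print(classify_pair(10_000, "+", 11_020, "+"))
print(classify_pair(10_000, "+", 11_000, "-"))
```

A real analysis would require several independent pairs supporting the same breakpoint before reporting a rearrangement, since a single aberrant pair can also arise from a mapping error.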

  20. Table 2 Results • Detected 3 novel polymorphisms in the evolved strain • These were confirmed by Sanger sequencing

  21. Conclusions • Current high-throughput centers sequence at a rate of 20 bases per instrument second and at a cost of $1.00 per kb • This method has an overall sequencing rate of ~140 bp/s and a cost of $0.08 per kb; in-house enzyme production could lower the cost further • Open source technology • Reasonably cheap to set up: equipment costs ~$140,000
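
The comparison on this slide reduces to two ratios, computed below from the quoted figures only; the variable names are illustrative.

```python
conventional_rate = 20      # bases per instrument second
polony_rate = 140           # bp per second (approximate)
conventional_cost = 1.00    # dollars per kb
polony_cost = 0.08          # dollars per kb

speedup = polony_rate / conventional_rate
cost_ratio = conventional_cost / polony_cost

print(f"throughput gain: about {speedup:.0f}x")
print(f"cost reduction: about {cost_ratio:.1f}x")
```

On the quoted numbers, the method is roughly seven times faster per instrument and roughly twelve times cheaper per kb than the electrophoretic baseline.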