Massively Parallel High Throughput DNA Sequencing: Automation for Microbial Community, Gene Expression and de novo Deciphering of New Genomes Bruce A. Roe, Ph.D., George Lynn Cross Research Professor of Chemistry and Biochemistry, Advanced Center for Genome Technology, Stephenson Research and Technology Center, University of Oklahoma
A Brief History of Long Read Automated DNA Sequencing Instruments: ABI and 454/Roche [Figure: million bases per run versus date of introduction, 1994-2008] • ABI 370/377: 40 Kb/run • ABI 3700: 200 Kb/run • ABI 3730: 1 Mb/run • 454 GS20: 30 Mb/run • 454/Roche GS-FLX: 100 Mb/run • 454/Roche GS-FLX-XLR: 1 Gb/run
454 GS-FLX Sequencer • Pico-scale sequencing reactions • 2 Core Techniques: • Emulsion PCR • Pyrosequencing
Emulsion PCR • Micro-reactors • Water-in-oil emulsion generates millions of micelles. • Each micelle contains all reagents/templates for a PCR reaction. • ~10 Million individual PCR reactions in a single tube.
Load Beads into 454 PicoTiter Plate [Figure: 44 μm wells; panels labeled Load beads into PicoTiter Plate, Centrifugation, Load Enzyme Beads]
Pyrosequencing [Figure: DNA bead with annealed primer and DNA polymerase, next to an enzyme bead carrying sulfurylase and luciferase] • (1) DNA polymerase adds a nucleotide (dNTP, e.g. dTTP) • (2) Pyrophosphate (PPi) is released • (3) Sulfurylase creates ATP from PPi and APS • (4) Luciferase hydrolyses ATP to oxidize luciferin and produce light (light + oxyluciferin) • (5) CCD camera detects bursts of light
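As a rough illustration of the cycle on this slide, the sketch below computes the ideal (noise-free) light signal expected from each nucleotide flow for a known read. The cyclic flow order T, A, C, G and the function name `ideal_flowgram` are assumptions made for this example; they are not stated on the slides.

```python
def ideal_flowgram(read, flow_order="TACG"):
    """Return the ideal signal (number of bases incorporated) for each flow.

    `read` is the strand being synthesized, 5'->3'. The flow order is an
    assumption for this sketch.
    """
    unknown = set(read) - set(flow_order)
    if unknown:
        raise ValueError(f"bases not covered by the flow order: {sorted(unknown)}")

    signals = []
    pos = 0   # next base of the read still to be synthesized
    flow = 0  # index of the current nucleotide flow
    while pos < len(read):
        nucleotide = flow_order[flow % len(flow_order)]
        run = 0
        # Polymerase extends through the whole homopolymer run in one flow,
        # releasing one PPi (and so one unit of light) per base added.
        while pos < len(read) and read[pos] == nucleotide:
            run += 1
            pos += 1
        signals.append(run)
        flow += 1
    return signals


print(ideal_flowgram("TTCTGCGAA"))
```

A flow that does not match the next template position gives a signal of 0, and a homopolymer run gives a proportionally larger signal, which is why signal height rather than presence alone has to be measured.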
Base Calling via Flowgram TTCTGCGAA
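A minimal sketch of flowgram base calling, assuming a cyclic T, A, C, G flow order and signals already normalized so that 1.0 corresponds to a single incorporated base. The signal values below are invented for illustration and chosen to reproduce the TTCTGCGAA example; real base callers also model noise and signal decay, which this sketch ignores.

```python
FLOW_ORDER = "TACG"  # assumed cyclic nucleotide flow order


def call_bases(signals, flow_order=FLOW_ORDER):
    """Convert normalized, non-negative flow signals into a base sequence."""
    bases = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]
        # Round to the nearest whole number of incorporated bases.
        homopolymer_length = int(signal + 0.5)
        bases.append(nucleotide * homopolymer_length)
    return "".join(bases)


# Hypothetical normalized signals, one per flow (T, A, C, G, T, A, ...).
example_signals = [2.04, 0.08, 0.97, 0.05, 1.02, 0.11, 0.06,
                   0.95, 0.09, 0.04, 1.03, 0.98, 0.07, 1.96]
print(call_bases(example_signals))
```

Because each call is a rounding decision on signal height, long homopolymer runs are the hardest case: the relative difference between, say, 7 and 8 bases is much smaller than between 1 and 2.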
Types of Libraries • 454/Roche • Shotgun • Random 250+bp reads • Paired-End • 25-250bp ends of a circularized DNA molecule • Amplicon • PCR product for SNP discovery • Roe Lab • Combined Paired-End and Shotgun approach • Best of both worlds
Our Combined Paired End & Shotgun DNA Preparation Protocol Overview • Shear to 2-4 Kbp fragments on the Hydroshear • Quantitate on Caliper AMS-90 or by real-time PCR • DNA end repair and linker ligation as in the paired-end protocol • Cleave the terminal linkers with EcoRI • Ligate to circularize the DNA • Shear to ~500 bp fragments in the nebulizer, but eliminate the enrichment step for fragments containing linker
Our Combined Paired End/Shotgun DNA Preparation Protocol Overview (cont) • Quantitate on Caliper AMS-90 or by real-time PCR • DNA end repair, adaptor ligation, adaptor end repair • Amplification (emPCR) • Pyrosequencing of the combined linker-containing (paired end) and shotgun fragments on the 454/Roche GS-FLX
Assembly of Sequence Reads from Our Combined Paired-End/Shotgun Protocol • Separate based on inclusion or exclusion of middle linker • Those sequences containing a middle linker are further separated based on the length of the read to either end of the linker sequence • ~15% of the total reads contain the middle linker sequence • Assembly of the reads by Newbler • Convert paired ends for ordering and orienting • *.454f and *.454r
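The first separation step above can be sketched as follows. The linker sequence and the minimum flank length are placeholders invented for this example (the slides give neither), and exact string matching is a simplification; real reads would need matching that tolerates sequencing errors.

```python
LINKER = "ACGTACGTACGTACGT"  # placeholder, NOT the real middle-linker sequence
MIN_FLANK = 15               # hypothetical minimum usable flank length (bases)


def classify_read(read, linker=LINKER, min_flank=MIN_FLANK):
    """Classify a read by the presence and position of the middle linker.

    Returns (kind, first_sequence, second_sequence):
      "shotgun"      - no linker found; the whole read is returned
      "paired"       - linker found with two usable flanks (left, right)
      "single_flank" - linker found but only one flank is long enough
    """
    pos = read.find(linker)
    if pos == -1:
        return ("shotgun", read, None)

    left = read[:pos]
    right = read[pos + len(linker):]
    if len(left) >= min_flank and len(right) >= min_flank:
        return ("paired", left, right)

    # Keep the longer flank so the sequence is not lost to the assembly.
    longer = left if len(left) >= len(right) else right
    return ("single_flank", longer, None)


print(classify_read("A" * 20 + LINKER + "C" * 20)[0])
```

In this sketch the two flanks of a "paired" read would then be written out as the forward and reverse members of a pair for ordering and orienting contigs.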
Automation of the Shotgun Library Preparation Steps • Why automate? • Time • Reproducibility • What are the obstacles? • Reaction Cleanup • Qiagen MinElute centrifuge columns are difficult to automate, so replace those steps with Agencourt SPRI magnetic beads and add a magnetic station to the Zymark SciClone bed • Enzyme Stability and Storage • Build an enzyme cooling station on the Zymark SciClone bed
SPRI Bead Technology • Solid Phase Reversible Immobilization • Carboxyl coated magnetic particles suspended in a solution of 10% PEG and 1.25M NaCl • Reversibly binds DNA • Hawkins, et al. (1994) DNA purification and isolation using a solid-phase. Nucleic Acids Research, 22(21):4543-4544 http://www.agencourt.com/products/spri_reagents/ampure/
DNA Purification through the Qiagen MinElute Columns vs. Agencourt SPRI Magnetic Beads [Figure: comparison of Agencourt SPRI magnetic beads with the Qiagen MinElute centrifuge column] • At least a 30% increase in yield with the SPRI beads • The SPRI beads are also easier to automate
Homemade 96 well Magnetic Plate for Purification of the SPRI Beads Inverted 96 well DNA sequencing plate with cylindrical magnets
Enzyme Chilling Station • Plastic rack fitted with Swagelok fittings and tubing for cooling.
Zymark SciClone Deck Arrangement [Figure: deck positions labeled Enzyme Mixes, Waste, EtOH, Buffers, Magnet, Sample, SPRI Beads and three Shakers]
Automated Library Making on the Caliper-Zymark SciClone To view this automation, get our quicktime movie 454ZymarkPrep.mov
We also have increased the average read length from 250 to >315 bases by increasing the number of flows and the amounts of reagents • Slightly dilute the Substrate, Inhibitor and Apyrase by transferring 2.5 ml from one of the Buffer CB bottles to each respective tube in the reagent tube-tray • Add 174 µl (as opposed to 164 µl) from the tube of apyrase to the apyrase buffer tube in the reagent tube-tray • Transfer 150 ml of Buffer CB from bottle 3 (at the back of the cassette) to bottle 0 (at the front of the cassette) • Modify the run script to allow for 130 flow cycles
Summary - Methods • For library preparation, it is possible to: • incorporate both shotgun and paired-end reads in the same library • replace the Qiagen MinElute centrifuge columns with Agencourt SPRI beads and build (or buy) an enzyme chilling station to facilitate automating the library-making process • eliminate the single-stranded DNA preparation steps • It also is possible to: • break the emulsion after emPCR using centrifugation rather than a Swinlock filter containing a sieving fabric • significantly increase the read length by increasing the volumes of the FLX reagents and the number of flow cycles • reuse the PicoTiter plate after cleaning by sonication • All our protocols are available on our lab protocol web site: http://www.genome.ou.edu/proto.html
Applications • Whole Genome Sequencing • Pooled samples • Plant viruses • Plant fungi • BAC-based genomic sequencing • EST Libraries • Bacterial Communities
Novel cDNA pooling strategy • Add tags to the PCR primer sequences to allow for deconvolution of viral sequences post sequencing • cDNA samples are pooled in sets with 24 unique individual tags after a two step PCR
Strategy for preparing cDNA ready for 454 sequencing from dsRNA [Figure: primer annealing and extension diagram; random hexamer primer = CCTTCGGATCCTCC + NNNNNN; example four-base tagged primer = AGAG + CCTTCGGATCCTCC] • Anneal with Random Hexamer Primers followed by Reverse Transcriptase PCR Reaction • Additional Rounds of RT PCR with Random Hexamer Primers • RNAse Treatment to Remove any Excess Random Hexamer Primers followed by a second Taq Polymerase PCR with one of the 24 four-base Tagged Primers • Amplified Product Ready for Ligating 454 A and B Primers
Uniquely Tagged cDNA Sample on the 454 [Figure: read layout] • 454 tag (TCAG) • TGP unique tag (GACA) • TGP common primer (CCTTCGGATCCTCC) • RT-PCR sequence
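A minimal sketch of the post-sequencing deconvolution, assuming each read starts with the 454 tag, then the four-base unique tag, then the common primer, then the RT-PCR sequence. The tag-to-sample mapping and sample names are hypothetical; GACA and AGAG are the only tags shown on the slides. Exact matching is a simplification, and the random-hexamer-derived bases at the start of the insert are left untrimmed.

```python
from collections import defaultdict

KEY = "TCAG"                      # 454 tag shown on the slide
COMMON_PRIMER = "CCTTCGGATCCTCC"  # TGP common primer shown on the slide
TAG_LENGTH = 4                    # four-base unique tags


def split_read(read):
    """Return (unique_tag, insert), or None if the read lacks the expected layout."""
    tag_start = len(KEY)
    primer_start = tag_start + TAG_LENGTH
    primer_end = primer_start + len(COMMON_PRIMER)
    if not read.startswith(KEY):
        return None
    if read[primer_start:primer_end] != COMMON_PRIMER:
        return None
    return read[tag_start:primer_start], read[primer_end:]


def demultiplex(reads, tag_to_sample):
    """Group read inserts by sample; unrecognized reads go to 'unassigned'."""
    bins = defaultdict(list)
    for read in reads:
        parsed = split_read(read)
        if parsed is None or parsed[0] not in tag_to_sample:
            bins["unassigned"].append(read)
        else:
            tag, insert = parsed
            bins[tag_to_sample[tag]].append(insert)
    return dict(bins)


# Hypothetical mapping: sample names are invented for this example.
tags = {"GACA": "sample_01", "AGAG": "sample_02"}
reads = [
    KEY + "GACA" + COMMON_PRIMER + "ATGCATGC",
    KEY + "AGAG" + COMMON_PRIMER + "GGGTTT",
    KEY + "TTTT" + COMMON_PRIMER + "AAAA",  # tag not in the mapping
]
print(demultiplex(reads, tags))
```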
10 Day Contour Clamped Homogeneous Electrophoretic Field (CHEF) Gels for Chromosome Isolation [Figure: CHEF gel; S. pombe size markers at 5.7, 4.6 and 3.5 Mb; Po OkAlf-8 in all 4 lanes; Chr. # 1-7] • Excise individual chromosomal bands, freeze at -20°C and then melt by heating to 65°C • Mix 500 µl aliquots of TE-saturated phenol and melted gel and re-freeze at -20°C • Centrifuge at 2500 RPM in a table-top centrifuge at -20°C • Remove the aqueous layer and extract any residual phenol twice with water-saturated ether • Precipitate with 2.5 vol of 95% ethanol/acetate, wash with 70% ethanol and dry the DNA • Dissolve the DNA in 10 µl of 10:0.1 TE
10 Day Contour Clamped Homogeneous Electrophoretic Field (CHEF) Gels for Chromosome Isolation: eluted & amplified chromosomes on a 1% agarose gel [Figure: gel lanes labeled BAC, Hind3, 1-7, Hind3] • The Qiagen REPLI-g Mini kit was used to amplify the chromosomes • 2.5 µl of the purified chromosomal DNA was mixed with 2.5 µl of Qiagen denaturation buffer for 3 minutes at 25°C, followed by mixing with 5 µl of Qiagen neutralization buffer • A master mix containing 10 µl nuclease-free water, 29 µl reaction buffer (containing dNTPs and exonuclease-resistant primers) and 1 µl of Qiagen's DNA polymerase was added to the treated chromosomal DNA and incubated at 30°C overnight • The amplified chromosomal DNA product then was verified by electrophoresis on a 1% agarose gel and subjected to the mixed shotgun/paired-end sequencing, where over 90% of the sequences matched in our CRR database
Summary of our use of CHEF gels for chromosome isolation and subsequent amplification for sequencing • Using our long-established freeze/thaw phenol extraction protocol, individual chromosomes can be purified from chromosome-grade agarose CHEF gels and then amplified using the Qiagen REPLI-g Mini kit • Sequence data can be obtained after library making, emPCR and massively parallel pyrosequencing on the 454/Roche GS-FLX, with over 90% of the sequences matching our target genome/fungal database
Strategy of adding the 454/Roche MID-based Tags prior to BAC Pooling • BAC growth in 96 deep-well microtiter plates • Robotic BAC isolation via the cleared lysate protocol using the Hydra robot • Shear each BAC individually and create the paired-end libraries on the Zymark SciClone robot • Individually tagged A linkers are added with B linkers prior to pooling 12 tagged libraries, followed by emPCR and half-plate sequencing of each pool
Strategy of adding the 454/Roche MID-based Tags prior to BAC Pooling • 12 uniquely tagged individual shotgun libraries would be pooled and sequenced on each half of a 454/Roche GS-FLX PicoTiter plate, i.e. 24 tagged libraries per full plate • 24 BACs of 150 Kb each require 3.6 Mb for 1x sequence coverage • With >75 Mb of DNA sequence obtained per full plate, >20x coverage is obtained for each of the 24 pooled BACs • 96 BACs would therefore require 4 full-plate runs on the 454/Roche GS-FLX, and no ABI 3730 runs are needed to deconvolute the individual BACs because each BAC is individually tagged • The BACs then are easily closed and finished using PCR-based methods.
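The coverage arithmetic on this slide can be checked with a short calculation. It assumes every BAC in the pool is equally represented in the reads, which is an idealization; the 75 Mb figure is the slide's lower bound on yield per full plate.

```python
BAC_SIZE_BP = 150_000                 # 150 Kb per BAC (from the slide)
BACS_PER_FULL_PLATE = 24              # 12 tagged libraries per half plate
YIELD_PER_FULL_PLATE_BP = 75_000_000  # ">75 Mb" per full plate (lower bound)


def pooled_coverage(n_bacs, bac_size_bp, yield_bp):
    """Average fold coverage per BAC, assuming even representation in the pool."""
    return yield_bp / (n_bacs * bac_size_bp)


def plates_needed(total_bacs, bacs_per_plate):
    """Number of full-plate runs needed (ceiling division)."""
    return -(-total_bacs // bacs_per_plate)


pool_size_bp = BACS_PER_FULL_PLATE * BAC_SIZE_BP
coverage = pooled_coverage(BACS_PER_FULL_PLATE, BAC_SIZE_BP, YIELD_PER_FULL_PLATE_BP)
print(f"1x coverage of the pool: {pool_size_bp / 1e6:.1f} Mb")
print(f"coverage per BAC: {coverage:.1f}x")
print(f"full plates for 96 BACs: {plates_needed(96, BACS_PER_FULL_PLATE)}")
```

Uneven pooling would lower the coverage of the least-represented BACs, so quantitating each tagged library before pooling matters for hitting the per-BAC target.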
Analysis of ordered and oriented combined shotgun/paired-end results [Figure legend] • Vector repeat sequences missing in the 454 data but present in the 3730 data and/or obtained by PCR-based closure • 454/Roche GS-FLX only assembled sequences • Phrap-assembled ABI-3730 and 454/Roche GS-FLX sequences • Un-joined 454 data, often with no missing base, but joined by 454 paired-ends and spanned by 3730 or PCR-based sequences • Our present strategy is to use the combined shotgun/paired-end pyrosequencing approach on the 454/Roche GS-FLX followed by PCR-based closure methods.
Acknowledgments • Collaborators • Plant Virus studies • Oklahoma State University: Ulrich Melcher, Vijay Muthamukar • Noble Foundation: Marilyn Roossinck, Guoan Shen, Byoung Min, Rick Nelson, Tracy Feldman • Phymatotrichopsis omnivora aka Cotton Root Rot Fungi • Oklahoma State University: Steve Marek • Noble Foundation: Carolyn Young • Medicago truncatula • University of Minnesota: Nevin Young, Roxanne Denny, Steven Cannon (now at Iowa State), Arvind Bhari, Shelly Wang • The JCV Institute: Chris Town, Foo Cheung • The John Innes Institute, UK: Giles Oldroyd & Sanger Institute: Jane Rogers • Toulouse/INRA & Genoscope, France: Fredric Debelle, Francis Quetier • Munich Bioinformatics Center IMGAG Consortium: Claus Mayer • Funding from the NSF Plant Genome, Microbial and EPSCoR Programs and the USDA
OU Genome Center Personnel Automation Graham Fares Doug Simone Nature gives up her secrets to the prepared mind, driving innovation www.genome.ou.edu/proto.html