
How to make the GSL work for you: Platforms, Tools, and Services
Jennifer Schaff, Interim Director, Genomic Sciences Laboratory
The GSL is located in the Partners II building, suite 2100 (Centennial Campus).
Hours are 8:30am to 4:45pm. Call ahead for assistance around lunch time.
gsl.cals.ncsu.edu
Three available sequencing platforms

Platform          Reads/Run      Read Length   Data Yield
ABI 3730          96-well        750-800       80Kb
Roche GS FLX      1.2 million    350-400       400-600Mb
Illumina GAIIx    20 million     36-108        over 2Gb
Three available sequencing platforms

Platform          Reads/Run      Read Length   Data Yield
ABI 3730          96-well        750-800       80Kb
Roche GS FLX      1.2 million    400           400-600Mb
Illumina GAIIx    20 million     36-108        over 2Gb

Platform          Throughput     Price per Experiment   Price per Base
ABI 3730          Low - Mid      $                      $$$
Roche GS FLX      High           $$$                    $$
Illumina GAIIx    Very High      $$                     $
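As a rough cross-check of the yield column, data yield per run is approximately reads per run times read length. The sketch below is illustrative only: the function name is hypothetical, and the single read length picked from each quoted range is an assumption, not a GSL figure.

```python
# Rough yield estimate: reads per run x read length (in bases).
# Read counts come from the platform table; the read length chosen
# within each quoted range is an assumption for illustration.

def estimated_yield_bases(reads_per_run: int, read_length: int) -> int:
    """Return the approximate number of bases produced by one run."""
    return reads_per_run * read_length

platforms = {
    "ABI 3730": (96, 800),                # 96-well plate, upper end of 750-800
    "Roche GS FLX": (1_200_000, 400),     # 1.2 million reads of about 400 bases
    "Illumina GAIIx": (20_000_000, 108),  # 20 million reads at the longest setting
}

for name, (reads, length) in platforms.items():
    megabases = estimated_yield_bases(reads, length) / 1e6
    print(f"{name}: ~{megabases:.2f} Mb per run")
```

Under these assumptions the estimates land close to the quoted yields: about 77 Kb for the ABI 3730, 480 Mb for the GS FLX, and just over 2 Gb for the GAIIx.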
How BigDye Terminating Chemistry Works
[Figure: animation of dye-terminator sequencing. Template: GCATGCTGACTGATCGTAGCTAGCT. Labeled fragments of increasing length are shown, each ending in a dye-labeled base: C, CT, CTG, CTGA, ... CTGACTAGCATCGATCGT]
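The idea behind dye-terminator sequencing can be sketched in a few lines: every extending strand stops at a random position when a labeled terminator is incorporated, and sorting the fragments by length reads out the sequence one base at a time. This is a toy simulation, not the instrument's base-calling method; the function names, the termination probability, and the number of molecules are all assumptions.

```python
import random

def chain_termination_fragments(new_strand: str, n_molecules: int = 2000,
                                p_terminate: float = 0.1, seed: int = 0) -> dict:
    """Simulate dye-terminator extension of many copies of one strand.

    At each position a labeled terminator is incorporated with probability
    p_terminate (assumed value), which ends that fragment. Returns a mapping
    of fragment length -> terminal (dye-labeled) base.
    """
    rng = random.Random(seed)
    fragments = {}
    for _ in range(n_molecules):
        for pos, base in enumerate(new_strand, start=1):
            if rng.random() < p_terminate:
                fragments[pos] = base  # the dye colour identifies this base
                break
    return fragments

def read_sequence(fragments: dict) -> str:
    """Order fragments by length (as electrophoresis does) and read the dyes.

    A length with no fragment would be silently skipped, so enough molecules
    are needed to cover every position.
    """
    return "".join(fragments[length] for length in sorted(fragments))

synthesized = "CTGACTAGCATCGATCGT"  # longest fragment shown on the slide
print(read_sequence(chain_termination_fragments(synthesized)))
```

With enough molecules every fragment length is represented, so the dyes read in length order reproduce the synthesized strand.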
How GS FLX 454 Chemistry Works http://www.454.com/products-solutions/multimedia-presentations.asp
Sanger sequencing, GS FLX 454, GAIIx
[Figure: a long example nucleotide sequence (beginning CGGAATAGTCTGTAGACGACTTCC...) shown with the three platform labels]
Single Molecule Sequencing
Helicos - Short reads, ~50 bases
• Targeted sequencing
• Whole genome resequencing
• Digital transcriptome
Pacific Biosciences - Long reads, ~1000 bases
• Full length transcriptome sequencing
• Whole genome resequencing
• ~Digital transcriptome
Choosing the right platform
[Figure: a DNA region shown with longer-read coverage and with shorter-read coverage]
Longer reads assemble more easily, and you need less coverage. However, with about 4x more data per run at 5-10x lower cost, shorter-read sequencing may be more attractive, and for a transcriptome it also gives an idea of relative expression levels.
Choosing the right platform: Paired End Sequencing
[Figure: paired-end reads on DNA, with labels "Repetitive region" and "Alternative splicing"]
Sample Preparation – Starting Material*
GS FLX (454)
• Genomic Sequencing – 500ng
• Transcript Sequencing – 200ng selected mRNA
Illumina
• Genomic Sequencing – 500ng
• Transcript Sequencing – 1 to 10µg total RNA
*Amount of starting material relates to having enough to QC
Special Considerations for Transcript Sequencing
Transcript Sequencing
OLD METHOD (GS FLX 454): Introducing ‘mutations’ into the polyA tail while synthesizing and amplifying cDNA
• Requires amplification
• Does not work well for either platform
NEW METHOD: Chemical fragmentation of mRNA
• Fragmentation of mRNA (GS FLX 454) or total RNA (GAIIx)
• Starting material is 200ng** (GS FLX 454) or 1µg (GAIIx)
• Cannot normalize your transcripts
Looking for SNPs (and other Polymorphisms)
Parents: VW8 and VW9 (subspecies of M. hapla)
• AFLP: 4% of fragments are polymorphic
• Infection on different plant species
• tomato w/R, in root: VW8- VW9+
• reproduction on tomato w/R: VW8+ VW9+
• reproduction on bean: VW8+ VW9+
• reproduction on bean w/R: VW8- VW9+
• reproduction on nightshade w/R: VW8- VW9+
• Aggregation: VW8+ VW9-
• Small or no galls: VW8- VW9+
Progeny: 183 F2 lines
Dr. Williamson, UC Davis - M. hapla linkage map VW8xVW9: AFLP/PCR markers
[Figure: linkage map showing linkage groups LG1 to LG15]
Unmapped markers: ETCA/MCTC-185, ECGG/MACA-232, AF41a/b, ECGG/MGA-150, ECGG/MAT-134, ECAT/MTG-125, ECAG/MACA-185, ECAA/MTG-133, AF16a/b, AF19A/B
Additional marker labels in the figure: EACC/MACT-120, EACC/MACT-105, EACC/MACC-100, AF12, AF28a/b, ECA/MTA-80, EAT/MACT-95
Assembly Mh 1.3 of M. hapla (VW9) genome
Opperman et al., PNAS Sept 30th 2008

Total Number Bases           586,990,600
High Quality Reads           824,425
Genome covered by contigs    >95%
Genomic Coverage             10.4X
Avg Read Length              712 (±199)
Total Reads                  1,013,681

Assembly Statistics
Scaffolds >2kb               1,523
Scaffold length              53,578,246
Median scaffold length       83,645
Gaps                         1,522

* Data assembled using Arachne (Genome Res. 2003 13:91-96)
VW8* sequencing data

                          Genomic DNA (pooled F2 lines)
Reads                     244,757
Bases (Mb)                61.1
Ave length                250
Coverage                  1.1x
Ave Scaffold Coverage*    49%

*from 0.9 to 100%
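The coverage figures can be cross-checked with simple arithmetic: coverage is total sequenced bases divided by genome size. The sketch below assumes the genome size implied by the VW9 assembly statistics (total bases divided by the quoted 10.4X coverage); that derived genome size and the function name are assumptions for illustration, not values stated on the slides.

```python
def coverage(total_bases: float, genome_size: float) -> float:
    """Average depth of coverage: sequenced bases per genome base."""
    return total_bases / genome_size

# Genome size implied by the VW9 assembly (assumption: bases / quoted coverage).
vw9_total_bases = 586_990_600
vw9_quoted_coverage = 10.4
implied_genome_size = vw9_total_bases / vw9_quoted_coverage  # about 56.4 Mb

# VW8 data: reads x average read length, about 61.2 Mb.
vw8_total_bases = 244_757 * 250

print(round(coverage(vw8_total_bases, implied_genome_size), 1))  # prints 1.1
```

The result agrees with the 1.1x coverage quoted for the VW8 data.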
Looking for SNPs (and other Polymorphisms)
Parents: VW8 and VW9
• AFLP: 4% of fragments are polymorphic
• Infection on different plant species
• tomato w/R, in root: VW8- VW9+
• reproduction on tomato w/R: VW8+ VW9+
• reproduction on bean: VW8+ VW9+
• reproduction on bean w/R: VW8- VW9+
• nightshade w/R: VW8- VW9+
• Aggregation: VW8+ VW9-
• Small or no galls: *VW8- VW9+
F2 lines differ in aggregation behavior
[Figure: lines A54, A16, C61, A31; no buffer, 24hr]
Aggregation tendency maps to LG8 (LOD 6, based on 82 F2 lines, using JoinMap)
[Figure: linkage group with the "aggregation" locus labeled]
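For readers unfamiliar with LOD scores: a LOD of 6 means the data are 10^6 times more likely under linkage than under independent assortment. The sketch below uses the textbook formula for phase-known meioses as an illustration only. It is not JoinMap's calculation, F2 populations need a more involved likelihood, and the counts in the example are hypothetical.

```python
import math

def lod_score(recombinants: int, total: int, theta: float) -> float:
    """Textbook LOD for phase-known meioses (simplified illustration).

    LOD = log10( L(theta) / L(0.5) ), where theta is the recombination
    fraction (0 < theta <= 0.5) and L is the binomial likelihood of
    observing `recombinants` out of `total` informative meioses.
    """
    non_recombinants = total - recombinants
    log_l_linked = (recombinants * math.log10(theta)
                    + non_recombinants * math.log10(1 - theta))
    log_l_unlinked = total * math.log10(0.5)
    return log_l_linked - log_l_unlinked

# Hypothetical counts, not taken from the M. hapla data set.
print(f"LOD = {lod_score(recombinants=2, total=20, theta=0.1):.2f}")

# Converting a LOD score back to an odds ratio.
print(f"LOD 6 corresponds to odds of {10 ** 6:,} to 1")
```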
Target Sequence Capture
[Figure label: aggregation]
The Agilent Technologies SureSelect™ Platform for Target Enrichment
Focus your next-gen sequencing on DNA that matters
Now enabling even more next-generation sequencing users with an expanded portfolio of target enrichment products!
Underlying Technology of SureSelect™ Target Enrichment System
Broad paper on the cover of Nature Biotechnology, February 2009
Agilent SureSelect™ Platform: Enabling Products for the Next-Generation Sequencing Workflow, Page 45
Agilent’s SureSelect™ Platform: New Options
• SureSelect DNA Capture Array (Agilent 60mer Array): developed in collaboration with Cold Spring Harbor, Dr. Greg Hannon et al.
• SureSelect Target Enrichment System*: developed in collaboration with the Broad Institute, Dr. Chad Nusbaum et al.
[Figure labels: 1-5 µg gDNA (with WGA); 20 µg gDNA (unamplified); 1-3 µg gDNA]
SureSelect™ Target Enrichment System: Thermodynamic Equilibrium Displacement
Solid phase (array):
• More pond, higher volume
• Pond excess dominates equilibrium
• Small fraction of pond is captured (bias)
• High sensitivity to GC content → Redesign
• [Low] + solid phase = slower equilibrium
Solution phase:
• More bait, low volume
• Bait excess dominates equilibrium
• Large fraction of pond is captured (less bias)
• Less sensitivity to GC content
• [High] + solution phase = fast equilibrium
[Figure labels: 10µg; a few pmol; 0.5µg; µg scale; Array; Prepped libraries; 24 hours; 72 hours]
                         SureSelect™ Target Enrichment System       SureSelect™ DNA Capture Array
Throughput               High                                       Low
Study Sizes              10-1,000s samples                          1-10 samples (iterative designs)
gDNA Input               3 µg                                       3 µg
Amplified library        500 ng                                     20 µg
Automation compatible    Yes                                        No
Capture of Target DNA    3.3+ Mb (2x tiling)                        1Mb (20x tiling)
                         (Custom 120-mer baits)                     (Custom 60-mer baits)

Target enrichment products are tailored to suit customer project needs
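The capture figures can be turned into a rough bait count. The sketch below assumes that "Nx tiling" means each target base is covered by N baits on average. That reading is an assumption, not something the vendor material here states, and the function name is hypothetical.

```python
import math

def approx_bait_count(target_bases: int, tiling: int, bait_length: int) -> int:
    """Approximate number of baits needed to tile a target.

    Assumes `tiling` is the average number of baits covering each target
    base, so total bait bases = target_bases * tiling.
    """
    return math.ceil(target_bases * tiling / bait_length)

# Solution-phase system: 3.3 Mb target, 2x tiling, 120-mer baits.
print(approx_bait_count(3_300_000, tiling=2, bait_length=120))

# Capture array: 1 Mb target, 20x tiling, 60-mer baits.
print(approx_bait_count(1_000_000, tiling=20, bait_length=60))
```

Under this assumption the solution-phase design works out to roughly 55,000 baits and the array design to roughly 333,000 probes. Treat both as order-of-magnitude estimates only.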
Sequencing technologies – the next generation. Nature Reviews Genetics, Volume 11, January 2010