Genome Biology for Programmers Lecture Series: Illumina Sequencing

Genome Biology for Programmers Lecture Series: Illumina Sequencing. Chris Daum JGI Illumina Group Lead April 1, 2011. Outline. Workflow Overview Process Science Sample Prep & qPCR quantification Cluster Generation Sequencing Sequencer instruments: GA & HiSeq Illumina Developments

Presentation Transcript


  1. Genome Biology for Programmers Lecture Series: Illumina Sequencing Chris Daum JGI Illumina Group Lead April 1, 2011

  2. Outline • Workflow Overview • Process Science • Sample Prep & qPCR quantification • Cluster Generation • Sequencing • Sequencer instruments: GA & HiSeq • Illumina Developments • Illumina quality & continuous improvement

  3. Illumina Workflow • Sample Preparation • Sample Quantification • Clustering • Sequencing • Analysis

  4. Sample Preparation Library Preparation – Main Goals: • Prepares sample nucleic acids for sequencing • Many library types and creation procedures exist • However, all preparation results in the same general template structure: • Double-stranded DNA flanked by two different adapters • Variables include: • Sequencing application & starting material (e.g. gDNA, mRNA, Mate Pair, Active Chromatin, ChIP-Seq) • Insert size • Adapter type • Index for multiplexing

  5. Example Sample Prep Workflow: TruSeq Paired-end Library (RNA and DNA)

  6. Library Quantification - qPCR • Real-time qPCR allows accurate quantification of DNA templates: • qPCR is based on the detection of a fluorescent reporter molecule whose signal increases as PCR product accumulates with each cycle of amplification • By using primers specific to the Illumina universal adapters in a qPCR reaction containing library template, only cluster-forming templates will be amplified and quantified

  7. Library Quantification - qPCR • Cq (cycle of quantification): the cycle at which amplicon fluorescence crosses the detection threshold • Plot a standard curve (Cq vs. log initial concentration) using controls, and use it to determine the concentration of the library • Phases of qPCR: the geometric phase, in which amplicons double every cycle, gives the greatest precision & accuracy for quantitation • Take home: qPCR mimics what happens on the surface of the flowcell during cluster generation and allows optimal loading concentrations to be determined.
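
The standard-curve step on slide 7 can be sketched in a few lines of Python. This is a minimal illustration, not the JGI or Illumina procedure: the control concentrations, the Cq values, and the function names are all hypothetical, and a plain least-squares line is assumed for the fit of Cq against log10 of the initial concentration.

```python
import math

def fit_standard_curve(concentrations, cq_values):
    """Least-squares fit of Cq = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def concentration_from_cq(cq, slope, intercept):
    """Invert the standard curve to estimate a library's concentration."""
    return 10 ** ((cq - intercept) / slope)

def amplification_efficiency(slope):
    """Efficiency implied by the slope; 1.0 means doubling every cycle."""
    return 10 ** (-1.0 / slope) - 1.0

# Hypothetical control dilutions (pM) and their measured Cq values.
standards_pm = [20.0, 2.0, 0.2, 0.02]
standards_cq = [10.0, 13.32, 16.64, 19.96]

slope, intercept = fit_standard_curve(standards_pm, standards_cq)
library_pm = concentration_from_cq(13.32, slope, intercept)
efficiency = amplification_efficiency(slope)
```

A slope near -3.32 corresponds to amplicons doubling every cycle (the geometric phase), which is why the efficiency computed from these made-up controls comes out close to 1.0.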

  8. Cluster Generation • Process occurs on cBot instrument: • Aspirates DNA samples into flow cell • Automates the formation of amplified clonal clusters from the DNA single molecules • 1000x amplification generates clusters • Hybridizes sequencing primer(s)

  9. Illumina cBot • Cluster Generation 2.0 • Automated system significantly reduces workload for generation of flowcells • Compact design saves lab space • Reagent cartridge reduces prep time

  10. Flowcell

  11. Cluster Generation Prep • Prepare reagents and denature & dilute library: • The goal is to reach the cluster density that maximizes yield (bp); this is achieved via optimized loading concentrations as determined by qPCR • Considerations: • Too low density: Fewer clusters, less sequence generated • Too high density: Overlapping clusters, removed by analysis filters, poor quality

  12. Cluster Generation Chemistry • Cluster generation Chemistry: • Hybridization • Amplification • Linearization • Blocking • Primer hybridization

  13. Cluster Generation Chemistry • Hybridize Sample fragments & extend:

  14. Cluster Generation Chemistry • Bridge Amplification:

  15. Cluster Generation Chemistry • Linearization, Blocking & Sequencing Primer Hybridization:

  16. Sequencing • Main Goals: • Translate the chemical information of the nucleotides into fluorescence information which can be captured optically • The optical information is then transformed into text, which can be searched, aligned, or otherwise mined for biologically relevant data

  17. Sequencing Workflow

  18. Sequencing by Synthesis • Clustered Flowcell is loaded on Illumina sequencer:

  19. Sequencing Chemistry: First Cycle Base Incorporation • To initiate the first sequencing cycle, add all 4 fluorescently labeled reversible terminators and DNA polymerase enzyme to the flowcell. • The complementary nucleotide will be added to the first position of each cluster. • A laser is then used to excite the attached fluorophore.

  20. Sequencing Chemistry: First Cycle Imaging

  21. Sequencing Chemistry: Cycle 2 and so on…
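
For programmers, the step from fluorescence to text described on slides 16 through 21 can be pictured with a deliberately naive base caller. This is only a sketch under strong assumptions: each cycle is assumed to yield one intensity per base channel in A, C, G, T order, the brightest channel wins, and the intensity values and function names are hypothetical. It is not Illumina's base-calling algorithm, which involves considerably more signal correction.

```python
CHANNELS = "ACGT"  # assumed channel order for this sketch

def call_base(intensities):
    """Return the base of the brightest channel, or 'N' if there is a tie."""
    best = max(intensities)
    winners = [base for base, value in zip(CHANNELS, intensities) if value == best]
    return winners[0] if len(winners) == 1 else "N"

def call_read(cycles):
    """Call one base per sequencing cycle and join them into a read."""
    return "".join(call_base(cycle) for cycle in cycles)

# Hypothetical intensities for one cluster over four cycles (A, C, G, T).
cluster_cycles = [
    (910.0, 35.0, 40.0, 22.0),   # A channel brightest
    (18.0, 30.0, 875.0, 41.0),   # G channel brightest
    (25.0, 25.0, 25.0, 25.0),    # no clear signal, so the call is N
    (12.0, 20.0, 33.0, 940.0),   # T channel brightest
]

read = call_read(cluster_cycles)
```

Each cycle contributes one character, so the read length equals the number of cycles run.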

  22. Sequencing Read 2 • Resynthesis of second strand for Read 2 occurs on sequencer without removing flowcell:

  23. Index for Multiplex Sequencing • Sample multiplexing involves 3 reads: • A: Sample Read 1 is sequenced • B: Read 1 product is removed and the Index Read is sequenced • C: Template strand is used to generate the complementary strand, and sample Read 2 is sequenced • Analysis software identifies the index sequence from each cluster so that sample Reads 1 & 2 can be assigned to a single sample
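
The assignment step on slide 23 amounts to a lookup keyed on the index read. The sketch below assumes exact index matching and an in-memory list of (Read 1, Index Read, Read 2) tuples per cluster; the index sequences, sample names, and the `demultiplex` function are illustrative and are not taken from the Illumina analysis software.

```python
from collections import defaultdict

def demultiplex(clusters, index_to_sample):
    """Group (read1, read2) pairs by the sample that owns each index read."""
    by_sample = defaultdict(list)
    for read1, index_read, read2 in clusters:
        # Clusters whose index matches no known sample are kept separately.
        sample = index_to_sample.get(index_read, "undetermined")
        by_sample[sample].append((read1, read2))
    return dict(by_sample)

# Hypothetical index-to-sample sheet and clusters.
index_to_sample = {"ATCACG": "sample_A", "CGATGT": "sample_B"}
clusters = [
    ("ACGTAC", "ATCACG", "TTGACA"),
    ("GGCATA", "CGATGT", "CCATGG"),
    ("TTTAGC", "ATCACG", "GATTAC"),
    ("CAGTCA", "NNNNNN", "AGTCAG"),
]

assigned = demultiplex(clusters, index_to_sample)
```

Exact matching is the simplest possible policy; a tolerant matcher that accepts one mismatch would be a natural extension, but that is a design choice not described in the slides.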

  24. Illumina HiSeq2000 Sequencer Nifty Lights

  25. HiSeq2000 Reagents

  26. 1 HiSeq = 2 GAs

  27. HiSeq2000 Fluidics Fluidics were the Achilles heel of the GA, and the HiSeq has 2X the fluidics

  28. HiSeq2000 Fluidics

  29. FY11 Service Metrics: Pareto

  30. HiSeq: Temperature control • 3 mechanisms: • Heat extraction via liquid coolant • Flow cell temperature control via Peltier • Reagent temperature maintained via cooled compartment • Reagent Chiller: • All reagents cooled to 4 °C • Condensation pump runs for 30 sec every 4 min • Flow cell sits on Peltier blocks and is water cooled (heat extraction from underneath)

  31. HiSeq Flowcell Loading

  32. HiSeq Imaging

  33. HiSeq Optics

  34. HiSeq Lasers

  35. HiSeq Software Interface

  36. HiSeq Software Interface

  37. HiSeq – Real Time Metrics

  38. HiSeq vs GA

  39. Cost & Throughput Comparison • Notes: • Throughput metrics are averages from runs performed in FY11 to date for each of the run types • Italicized HiSeq Bases & Reads throughput metrics are estimates based on the 2x100 run type, since we have limited data on other run types • Only vendor reagent costs are shown here; library creation and overhead costs are not included, but are roughly equal and mostly independent of run type • Cost per million reads goes up with the longer run types, but the read length increases as well, which makes each read more valuable for some assembly applications • The HiSeq 2x150 run type is not yet supported, and the current HiSeq chemistry has worse quality than the GA beyond 80-100 bases • The HiSeq platform is still new and we are experiencing a higher number of hardware failures than on the GA; Illumina does replace reagents for failed runs, and we rerun failed flowcells immediately whenever possible.

  40. HiSeq Development Coming in early Summer:

  41. HiSeq Development

  42. HiSeq Development

  43. Introducing MiSeq

  44. MiSeq: all-in-one

  45. MiSeq: Fast, low throughput

  46. Providing Quality Sequence • Incident Reporting & Resolution (JIRA) • Troubleshooting Procedures • Throughput Goals & Metrics • Continuous Improvement - Lean Six Sigma • Failure Tracking & SPC Charts; RQC • Instrument Status & real-time run monitoring • Instrument Utilization & Efficiency

  47. LLNL – Six Sigma Training • Tools and methodologies to: • Improve work quality • Improve process efficiencies & eliminate waste • Improve employee and customer satisfaction • Lean Six Sigma is about: • Eliminating waste and improving process flow • Focusing on reducing variation and improving process yield by following a problem-solving approach using statistical tools

  48. What is Six Sigma? • A Six Sigma process is literally one that’s statistically 99.99966% successful. • This is not always cost effective to achieve, so as a methodology it’s about gaining control of a process and implementing improvements.
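
The 99.99966% figure on slide 48 can be reproduced from the normal distribution. The sketch below assumes the usual Six Sigma convention of a one-sided specification limit 6 standard deviations from the target, with the process mean allowed to drift 1.5 standard deviations toward it; that convention is not stated in the slides, and the function names are hypothetical.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def process_yield(sigma_level=6.0, shift=1.5):
    """Fraction of output within spec, assuming a one-sided limit and a
    mean that has drifted `shift` standard deviations toward that limit."""
    return normal_cdf(sigma_level - shift)

yield_fraction = process_yield()
defects_per_million = (1.0 - yield_fraction) * 1e6
yield_percent = round(yield_fraction * 100, 5)
```

Under these assumptions the yield rounds to 99.99966%, or about 3.4 defects per million opportunities.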

  49. What is Six Sigma? • Six Sigma is a data-driven problem-solving approach in which process inputs (Xs) are identified and optimized to impact the output (Y) • The output is a function of the inputs and process: Y = f(X) • Y: output • f: function • X: variables that must be controlled to consistently predict Y
