
High-Throughput Sequencing

High-Throughput Sequencing. Advanced Genomic Data Analysis, BIOS 691-804, 2012. Mark Reimers. Outline: What can we do with next-generation sequencing? Sequence variations; quantitative data. What technologies are now available and coming? Roche, Illumina, SOLiD, Ion Torrent, etc.


Presentation Transcript


  1. High-Throughput Sequencing Advanced Genomic Data Analysis BIOS 691-804, 2012 Mark Reimers

  2. Outline • What can we do with next-generation sequencing? • Sequence variations • Quantitative data • What technologies are now available and coming? • Roche, Illumina, SOLiD, Ion Torrent, etc…

  3. What is High-Throughput Sequencing? • Generating many thousands or millions of short (30 to 1,000 base) sequences by sequencing parts of longer (200+ base) DNA fragments • Most reads are single-end • Paired-end reads on opposite strands can be made by most technologies
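The relationship between a fragment and its paired-end reads can be sketched in a few lines of Python. This is only an illustration: the fragment sequence, the read length, and the function names `reverse_complement` and `paired_end_reads` are hypothetical and not part of the lecture.

```python
from typing import Tuple

# Complement table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def paired_end_reads(fragment: str, read_length: int) -> Tuple[str, str]:
    """Return (read1, read2) for one fragment.

    read1 is read from the 5' end of the forward strand. read2 is read from
    the 5' end of the opposite strand, so it equals the reverse complement of
    the last `read_length` bases of the fragment.
    """
    if read_length > len(fragment):
        raise ValueError("read_length exceeds fragment length")
    read1 = fragment[:read_length]
    read2 = reverse_complement(fragment[-read_length:])
    return read1, read2


fragment = "ACGTTGCAAGGCTTAACCGGTATC"  # hypothetical 24-base fragment
read1, read2 = paired_end_reads(fragment, 8)
print(read1)  # ACGTTGCA
print(read2)  # GATACCGG
```

A single-end run would keep only `read1`; the middle of the fragment is never sequenced in either case.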

  4. Full genome sequencing

  5. Exome sequencing

  6. Targeted Re-sequencing

  7. RNA-seq

  8. ChIP-seq

  9. DNA methylation profiling • Conversion: mC → C, C → U • After PCR: C → C, U → T • PCR + Seq
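The conversion pattern on this slide (methylated C stays C, unmethylated C becomes U and is read as T after PCR) matches standard bisulfite treatment, which the slide does not name explicitly. A minimal sketch of how methylation could be read back from such data follows; the sequences, the set of methylated positions, and all function names are hypothetical, and only a single strand is modeled.

```python
def bisulfite_convert(seq, methylated_positions):
    """Convert unmethylated C to U; methylated C (by 0-based index) stays C."""
    converted = []
    for i, base in enumerate(seq):
        if base == "C" and i not in methylated_positions:
            converted.append("U")
        else:
            converted.append(base)
    return "".join(converted)


def pcr_readout(converted):
    """After PCR amplification, every U is read as T."""
    return converted.replace("U", "T")


def call_methylation(reference, read):
    """For each C in the reference, report True if the read still shows C."""
    return {i: read[i] == "C" for i, base in enumerate(reference) if base == "C"}


reference = "ACGTCCGA"   # hypothetical reference, Cs at positions 1, 4, 5
methylated = {1}         # assume only the first C is methylated
converted = bisulfite_convert(reference, methylated)
read = pcr_readout(converted)
print(converted)                          # ACGTUUGA
print(read)                               # ACGTTTGA
print(call_methylation(reference, read))  # {1: True, 4: False, 5: False}
```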

  10. Mapping of chromatin interactions (HiC) (courtesy Elemento lab)

  11. HTS Technologies • Roche-454 • Illumina • SOLiD • Ion Torrent • Newer Technologies • Outlook

  12. 454 Life Sciences • Founded by Jonathan Rothberg as a secret project (code-named ‘454’) within CuraGen

  13. Roche 454 Sequencing (Metzker, Nat Rev Genet 2010)

  14. Roche 454 Sequencing

  15. Roche 454 Peak Heights Data

  16. Advantages & Drawbacks • PRO • Long reads are uniquely identifiable • Relatively quick: ~20 hours total • CON • Cost is relatively high • Frequent errors in runs of identical bases • Frequent G-A transitions

  17. Best Uses of Roche 454 • De novo small genome (prokaryote or small eukaryote genome) sequencing • Metagenomics by 16S profiling • Used to be best for metagenomics by random sequencing • New long reads from Illumina are competitive • Targeted re-sequencing

  18. Illumina (Solexa) Genome Analyzer and Flow Cell

  19. Illumina On-Chip Amplification

  20. Illumina (Solexa) Sequencing

  21. Paired-End Illumina Method • Paired-end reads are easy on Illumina because the clusters are generated by ligated linkers • Different linkers and primers are attached to each end

  22. Advantages & Drawbacks • PRO • Very high throughput • Most widespread technology so that comparisons seem easier • CON • Sequencing representation biases, especially at beginning • Slow – up to a week for a run

  23. Best Uses of Illumina • Expression analysis (RNA-Seq) • Chromatin Immunoprecipitation (ChIP-Seq) • Metagenomics by random sequencing

  24. SOLiD: Sequencing by Oligonucleotide Ligation and Detection

  25. SOLiD History • George Church licensed his ‘polony’ technique to Agencourt Personal Genomics • ABI acquired the SOLiD technology from Agencourt in 2006

  26. SOLiD Preparation Steps • Prepare either single or ‘mate-pair’ library from DNA fragments • Attach library molecules to beads; amplify library by emulsion PCR • Modify 3’ ends of clones; attach beads to surface

  27. Emulsion PCR • Emulsion PCR isolates individual DNA molecules along with primer-coated beads in aqueous droplets within an oil phase. A polymerase chain reaction (PCR) then coats each bead with clonal copies of the DNA molecule. The bead is immobilized for sequencing.

  28. ABI SOLiD Sequencing Cycle

  29. SOLiD Reads Each Base Twice • Most bases are matched by two primers in different ligation cycles

  30. SOLiD Color Coding Scheme • Blue is the color of homopolymer runs • If you translate color reads directly into base reads, a single error in the color calls corrupts every base call downstream of it, much like a frame-shift • It is therefore best to convert the reference sequence into color-space • There is one unambiguous conversion of a base reference sequence into color-space, but there are four possible conversions of a color string into base strings, one for each choice of first base
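The color-space argument can be made concrete with a small sketch. It assumes the usual two-bit di-base code (A=0, C=1, G=2, T=3, color = XOR of adjacent bases), with color 0 standing for blue; the numeric color labels, the example reference, and the function names are illustrative choices, not taken from the slides.

```python
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def bases_to_colors(seq):
    """Encode each adjacent base pair as one color (0-3): XOR of the 2-bit codes."""
    return [BASE_TO_BITS[a] ^ BASE_TO_BITS[b] for a, b in zip(seq, seq[1:])]


def colors_to_bases(first_base, colors):
    """Decode a color string; the result depends on the assumed first base."""
    bases = [first_base]
    for color in colors:
        bases.append(BITS_TO_BASE[BASE_TO_BITS[bases[-1]] ^ color])
    return "".join(bases)


reference = "ATGGCA"                 # hypothetical reference
colors = bases_to_colors(reference)  # one unambiguous encoding
print(colors)                        # [3, 1, 0, 3, 1]

# Four possible decodings, one per choice of first base.
for first in "ACGT":
    print(first, colors_to_bases(first, colors))

# A single wrong color call changes every base downstream of it.
bad_colors = list(colors)
bad_colors[1] = 2
print(colors_to_bases("A", bad_colors))  # ATCCGT (reference is ATGGCA)
```

Under this code a homopolymer such as AAAA encodes as all zeros, which is why one color marks homopolymer runs. Aligning in color-space turns a single color error into a single mismatch instead of a corrupted tail.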

  31. Advantages & Drawbacks • PRO • Very high throughput • Di-base ligation ensures built-in accuracy check • Low error rate for low-coverage • Can handle repetitive regions easily • CON • Strong cycle-dependent biases (can be modeled and partly overcome – see Wu et al, Nature Methods, 2011) • Low quality color calls (Phred < 20) are common • Reported problems with paired ends – most mapped tags don’t map to the same chromosome
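For reference, a Phred score Q corresponds to an error probability p = 10^(-Q/10), so Phred 20 means a 1% chance that the call is wrong. The sketch below applies that standard definition; the example quality values and function names are hypothetical.

```python
import math


def phred_to_error_prob(q):
    """Convert a Phred quality score to the probability that the call is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert an error probability to a Phred quality score."""
    return -10 * math.log10(p)


def expected_errors(qualities):
    """Expected number of wrong calls in a read, given per-call Phred scores."""
    return sum(phred_to_error_prob(q) for q in qualities)


print(phred_to_error_prob(20))            # about 0.01
print(phred_to_error_prob(10))            # about 0.1
print(expected_errors([30, 20, 20, 10]))  # about 0.121
```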

  32. Ion Torrent Sample Prep • Emulsion PCR loads copies of unique sequences onto beads • One bead is deposited in each well of a micro-machined plate

  33. An Ion Torrent Chip

  34. When a nucleotide is incorporated into a strand of DNA by a polymerase, a hydrogen ion is released (from promotional material)

  35. Ion Torrent Sequencing Process • As in 454, nucleotides are washed over the nascent strand in a prescribed sequence • Each time a nucleotide is incorporated, hydrogen ions are released and detected

  36. Ion Torrent Advantages & Drawbacks (figure: Loading Density) • PRO • Very high throughput potential • Very fast (an afternoon) • CON • Homopolymer runs are still a problem • Very uneven loading of sequences wastes a lot of real estate on the chips • No prospect of paired-ends • Not many applications yet for their error model
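Flow-based sequencing, as used by both 454 and Ion Torrent, can be sketched as an idealized, noise-free simulation: each flow offers one nucleotide, and the signal is the number of bases incorporated in that flow. The flow order "TACG", the example sequence, and the function names are assumptions made for illustration.

```python
def flowgram(sequence, flow_order="TACG"):
    """Return a list of (nucleotide, signal) pairs for an ideal flow sequencer.

    `sequence` is the strand being synthesized. The signal for a flow is the
    length of the run of that nucleotide at the current position (0 if the
    next base is a different nucleotide).
    """
    if any(base not in flow_order for base in sequence):
        raise ValueError("sequence contains a base missing from flow_order")
    signals = []
    pos = 0
    flow = 0
    while pos < len(sequence):
        nucleotide = flow_order[flow % len(flow_order)]
        run = 0
        while pos < len(sequence) and sequence[pos] == nucleotide:
            run += 1
            pos += 1
        signals.append((nucleotide, run))
        flow += 1
    return signals


def flowgram_to_sequence(signals):
    """Rebuild the sequence from ideal (nucleotide, signal) pairs."""
    return "".join(nucleotide * run for nucleotide, run in signals)


signals = flowgram("TTACGGGA")
print(signals)
# [('T', 2), ('A', 1), ('C', 1), ('G', 3), ('T', 0), ('A', 1)]
print(flowgram_to_sequence(signals))  # TTACGGGA
```

In a real instrument the run length has to be inferred from a noisy peak height or voltage rather than read off as an integer, which is consistent with the homopolymer errors listed as drawbacks for both 454 and Ion Torrent.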

  37. Newest Machine – Ion Proton • $150K per machine • Ion Proton I chip has 165 million sensors • Intended for exomes • Ion Proton II chip has 660 million sensors • 50X more than 318 chip • Claim $1K genome this year

  38. Newer Technologies • Complete Genomics • Pacific Biosciences • Oxford Nanopore

  39. Complete Genomics • Service company only – no equipment sales • ~$4,000 per human genome (2011 price) • DNA Nanoball technology generates paired-end sequences plated at high density • Sequenced by ligation

  40. Pacific Biosciences • Single-molecule real-time (SMRT) sequencing by circular strand technology using semiconductor technology • Long reads promised at under $200 per genome • High error rates reported

  41. Oxford Nanopore • Single-molecule sequencing by threading DNA through a protein nanopore • GridION is a general technology for sequencing polymers by measuring current – can do polypeptides also
