Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington - PowerPoint PPT Presentation

slide1 l.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington PowerPoint Presentation
Download Presentation
Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington

play fullscreen
1 / 67
Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington
468 Views
Download Presentation
fountain
Download Presentation

Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

  1. Overview of SNP Genotyping Debbie Nickerson Department of Genome Sciences University of Washington debnick@u.washington.edu

  2. SNP Genotyping - Overview • Project Rationale • Genotyping Strategies/Technical Leaps • Data Management/Quality Control

  3. SNP Project Rationale • Heritability • Power - Number of Individuals • Number of SNPs - Candidate Gene, Pathway, Genome • 5-10 SNPs, 400 to 1,000, 10K, 500K • DNA requirements • Cost

  4. C C C Allele-Specific Hybridization T a r g e t G A F a i l t o h y b r i d i z e H y b r i d i z e + d d C T P C Polymerase Extension T a r g e t G A C i n c o r p o r a t e d C F a i l s t o i n c o r p o r a t e C C C Oligonucleotide Ligation T a r g e t G A L i g a t e F a i l t o l i g a t e Fluorescence Polarization Sequenom C C C Taqman T a r g e t G A D e g r a d e F a i l t o d e g r a d e SNPlex Parallele Illumina SNP Genotyping Matched Mis-Matched P r o b e a n d T a r g e t C A l l e l e T A l l e l e Eclipse Dash Molecular Beacon Affymetrix

  5. Size Analysis by Electrophoresis Arrays - Custom or Universal SNP Typing Formats Scale Microtiter Plates - Fluorescence Low eg. Taqman - Good for a few markers - lots of samples - PCR prior to genotyping Medium eg. SNPlex - Intermediate Multiplexing reduces costs - Genotype directly on genomic DNA - new paradigm for high throughput High eg. Illumina, ParAllele, Affymetrics - Highly multiplexed - 96, 1,500 SNPs and beyond (500K+)

  6. Defining the scale of the genotyping project is key to selecting an approach: 5 to 10 SNPs in a candidate gene - Many approaches (expensive ~ 0.60 per SNP/genotype) 48 ( to 96) SNPs in a handful of candidate genes (~ 0.25 to 0.30 per SNP/genotype) 384 - 1,536 SNPs - cost reductions based on scale (~0.08 - 0.15 per SNP/genotype) 300,000 to 500,000 SNPs defined format (~0.002 per SNP/ genotype) 10,000-20,000 SNPs - defined and custom formats (~0.03 per SNP/genotype) 1000 individuals $6,000 $~29,000 $57,600-122,880 $800,000 $>250,000

  7. Many Approaches to Genotype a Handful of SNPs PCR region prior to SNP genotyping - Adds to cost - Many use modified primers - the more modified, the higher the cost • Taqman • Single base extension - Fluorescence Polarization • Sequenom - Mass Spec • Eclipse • Dash • Molecular Beacons

  8. Taqman Genotyping with fluorescence-based homogenous assays (single-tube assay) = 1 SNP/ tube

  9. Genotype Calling - Cluster Analysis SNP 1252 - T SNP 1252 - C

  10. Genotyping by Mass Spectrometry - 24

  11. Technological Leap - No advance PCR Universal PCR after preparing multiple regions for analysis - Several based on primer specific on genomic DNA followed by PCR of the ligated products - different strategies and different readouts. SNPlex, Illumina, Parallele (Affymetrix) Also, reduced representation - Affymetrix - cut with restriction enzyme, then ligate linkers and amplify from linkers and follow by chip hybridization to read out.

  12. A P P G G Genomic DNATarget C A SNPlex Assay - 48 SNPs Universal PCR Priming site Allele Specific Sequence ZipCode1 Universal PCR Priming site Locus Specific Sequence ZipCode2 1. Ligation Ligation Product Formed C (Homozygote shown in this case) 2. Clean-up

  13. 3. Multiplexed Universal PCR Univ. PCR Primer Biotin Univ. PCR Primer PCR & ZipChute Hybridization 4. Capture double stranded DNA- microtiter plate (Streptavidin) 5. Denature double stranded DNA 6. Wash away one strand 7.Zip Chute Hybridization •

  14. Detection 9. Characterize on Capillary Sequencer SNP 1 SNP 2

  15. SNPlex Readout ZipChuten Position n N(n) T n ~ 48/lane ~2000 lanes/day ~96,000 genotypes/day Zipchute3 NNN Position 3 T Zipchute2 NN Position 2 A Zipchute1 N Position 1 C

  16. Genotyping - Universal Tag Readouts Multiplexed G A C T L o c u s 2 S p e c i f i c S e q u e n c e L o c u s 1 S p e c i f i c S e q u e n c e T a g 2 s e q u e n c e c T a g 2 s e q u e n c e T a g 1 s e q u e n c e c T a g 1 s e q u e n c e S u b s t r a t e S u b s t r a t e B e a d o r C h i p B e a d o r C h i p B e a d A r r a y C h i p A r r a y T a g 1 T a g 2 T a g 3 T a g 4 Multiplex ~96 - 20,000 SNPs Not dependent on primary PCR ParAllele Affymetrix Illumina

  17. Arrays - High Density Genotyping Thousands of SNPs and Beyond • “Bead” Arrays - Illumina • Manufactured by self-assembly • Beads identified by decoding

  18. Sentrix™ Platform • Sentrix™ 96 Multi-array Matrix matches standard microtiter plates (96 - 1536 SNPs/well) • Up to ~140,000 assays per matrix

  19. Fluorescent Image of BeadArray ~ 3 micron diameter beads ~ 5 micron center-to-center ~50,000 features on ~1.5 mm diameter bundle Currently: up to 1,536 SNPs genotyped per bundle - at least 30 beads per code - many internal replicates

  20. Universal forward Sequences (1, 2) Universal reverse sequence 5’ 3’ 3’ 5’ Illumicode ™ Sequence tag Allele specific Sequence Locus specific Sequence Illumina Assay - 3 Primers per SNP G (1-20 nt gap) A C Genomic DNA template T SNP

  21. A Universal PCR Sequence 1 Universal PCR Sequence 2 illumiCode’ Address Allele Specific Extension & Ligation G Universal PCR Sequence 3’ Allele-Specific Extension and Ligation Polymerase Ligase [T/A] [T/C] Genomic DNA

  22. Cy3 Universal Primer 1 Cy5 Universal Primer 2 Universal Primer P3 PCR with Common Primers GoldenGate™ AssayAmplification A illumiCode #561 Amplification Template

  23. illumiCode #561 illumiCode #217 illumiCode #1024 /\/\/\/ /\/\/\/ /\/\/\/ /\/\/\/ /\/\/\/ T/T A/A C/T Hybridization to Universal IllumiCodeTM

  24. BeadArray Reader • Confocal laser scanning system • Resolution, 0.8 micron • Two lasers 532, 635 nm • Supports Cy3 & Cy5 imaging • Sentrix Arrays (96 bundle) and Slides for 100k fixed formats

  25. Process Controls Mismatch High AT/GC Gender Gap Second Hyb First Hyb Contamination

  26. Illumina Readout for Sentrix Array > 1,000 SNPs Assayed on 96 Samples

  27. Genotyping - Universal Tag Readouts Multiplexed G A C T L o c u s 2 S p e c i f i c S e q u e n c e L o c u s 1 S p e c i f i c S e q u e n c e T a g 2 s e q u e n c e c T a g 2 s e q u e n c e T a g 1 s e q u e n c e c T a g 1 s e q u e n c e S u b s t r a t e S u b s t r a t e B e a d o r C h i p B e a d o r C h i p B e a d A r r a y C h i p A r r a y T a g 1 T a g 2 T a g 3 T a g 4 Multiplex ~96 - 20,000 SNPs Not dependent on primary PCR ParAllele Affymetrix Illumina

  28. Parallele - Defined and Custom Formats • - Intermediate Strategy • Multiplex ~ 20,000 SNPs • - Affymetrix readout Universal Arrays

  29. Parallele Technology (MIP) Molecular Inversion Probes (MIP)

  30. Affymetrix’s Chip

  31. Whole Genome Association Strategies Two Platforms Available Different Designs - Affymetrix - Illumina

  32. Affymetrix GeneChip Mapping 500K Array Set

  33. Affymetrix Assays 100,000 to 500,000 fixed formats Whole genome strategy Not all regions - unique Cheap but costly and throughput an issue

  34. 48 individuals Call rate, concordance ~650K SNPs 500K: Content Optimized SNP Selection • Initial Selection: 48 people • 2.2M SNPs • 25 million genotypes • 16 each Caucasian, African, Asian • All HapMap samples • Maximize performance: Second selection over 400 people • 270 HapMap Samples • 130 diversity samples • Accuracy • HW, Mendel error, reproducibility • Call rates • Maximize information content: • Prioritize SNPs based on LD & HapMap (Broad Institute) ~2,200,000 SNPs From Public & Perlegen 400 samples Call rate, accuracy LD 500K SNPs

  35. The Assay - Details Optimized for 250-2000bp http://www.affymetrix.com/products/arrays/specific/100k.affx

  36. 40 probes are used per SNP

  37. High Throughput Chip Formats

  38. 80% genome coverage of Mapping 500K • 500K run on 270 HapMap samples • Pairwise r2 analysis for common SNPs (MAF>0.05) • Robust coverage across populations r2=0.8 • CEPH, Asian ~66% • Yoruba ~45% • 2 & 3 marker predictors (multimarker) further increase coverage

  39. Mapping 500K Set • >500K SNP’s • 2 array set • Performance • 93-98% call rate range (>95% average) • >99.5% concordance with HapMap Genotypes, 99.9% reproducibility • SNP lists, annotation and genotype data available without restriction at Affymetrix.com

  40. Illumina - Infinium I & II 10K - 300K

  41. Two haptens/colors Bead U A C WGA target Infinium II AssaySingle Base Extension

  42. HumanHap-1 Genotyping BeadChip Content Maximize coverage of human variation by choosing tag SNPs to uniquely identify haplotypes. Tag SNP selection process: • Examine HapMap Phase I SNPs with MAF > 0.05 in CEU • Bin SNPs in high LD with one another using ldSelect (Carlson, et al. 2004) 3. Select tag SNP with highest design score for each bin.

  43. 317K Content Selection • Tag SNPs • r2≥ 0.80 for bins containing SNPs within 10kb of genes or in evolutionarily conserved regions (ECRs) • r2≥ 0.70 for bins containing SNPs outside of genes or ECRs. • Additional Content • ~8,000 nsSNPs • ~1,500 tag SNPs selected from high density SNP data in the MHC region • Total 317,503 loci

  44. HumanHap300 Genomic Coverage by Population