SNP Resources: Finding SNPs, Databases and Data Extraction

Debbie Nickerson, debnick@u.washington.edu, SeattleSNPs


Presentation Transcript


  1. SNP Resources: Finding SNPs, Databases and Data Extraction Debbie Nickerson debnick@u.washington.edu SeattleSNPs

  2. Complex inheritance/disease. Diagram: a variant gene, many other genes, and the environment all contribute to disease. Examples: diabetes, heart disease, schizophrenia, obesity, multiple sclerosis, celiac disease, cancer, asthma, autism. Two hypotheses: (1) common disease/common variant? (2) common disease/many rare variants?

  3. Human Genetic Variation. [Figure: classes of genomic variation plotted by frequency against size, on a scale from 1 bp to 1 chromosome: single nucleotide polymorphisms, small indels, copy-number variants, structural variation (duplications, deletions, inversions, insertions), cytogenetic.] • Gene-rich, e.g. immune response, drug metabolism • Abundant

  4. Total sequence variation in humans • Population size: 6×10⁹ (diploid) • Mutation rate: 2×10⁻⁸ per bp per generation • Expected “hits”: 240 for each bp • Every variant compatible with life exists in the population, BUT most are vanishingly rare • Compare 2 haploid genomes: 1 SNP per 1331 bp* *The International SNP Map Working Group, Nature 409:928-933 (2001)
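The "240 hits per bp" figure on this slide follows from multiplying the three quantities it lists. A minimal sketch of that arithmetic, with illustrative variable names that are not from the slides:

```python
# Figures taken from the slide; the names are illustrative only.
population_size = 6e9      # people
chromosome_copies = 2      # diploid: two copies of every base pair per person
mutation_rate = 2e-8       # mutations per bp per generation


def expected_hits_per_bp(n_people, copies, rate):
    """Expected number of new mutations at one base pair in one generation."""
    return n_people * copies * rate


hits = expected_hits_per_bp(population_size, chromosome_copies, mutation_rate)
print(round(hits))
```

With the slide's values the product is 240, which is why every single-base change that is compatible with life is expected to exist somewhere in the population.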

  5. Building Maps of Single Nucleotide Polymorphisms (SNPs) ATTCGGCATGAA / ATTCGGGATGAA Developed in two overlapping phases: SNP Discovery, SNP Genotyping

  6. Finding SNPs: Sequence-based SNP Mining. [Diagram: DNA sequencing sources and overlap strategies: mRNA, cDNA library, EST overlap; BAC library, BAC overlap; genomic RRS library, shotgun overlap; random shotgun, align to reference.] Random sequence overlap, SNP discovery (G/C): GTTACGCCAATACAGGATCCAGGAGATTACC / GTTACGCCAATACAGCATCCAGGAGATTACC • > 11 million SNPs • Validated: 5.6 million SNPs
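The two overlapping reads on this slide differ at a single base, which is the core idea of sequence-based SNP mining. A minimal sketch, assuming the reads are already aligned and of equal length; the function name is hypothetical, and real pipelines also weigh base quality and alignment uncertainty:

```python
def find_mismatches(seq_a, seq_b):
    """Return (1-based position, base in seq_a, base in seq_b) for each difference."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


# The two reads shown on the slide.
read_1 = "GTTACGCCAATACAGGATCCAGGAGATTACC"
read_2 = "GTTACGCCAATACAGCATCCAGGAGATTACC"

candidates = find_mismatches(read_1, read_2)
print(candidates)  # [(16, 'G', 'C')]
```

The single reported difference is the G/C candidate SNP highlighted on the slide.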

  7. Increasing Sample Size Improves SNP Discovery. [Figure: fraction of SNPs discovered (0.0 to 1.0) against minor allele frequency (MAF, 0.0 to 0.5), with curves for 2, 8, 16, 24, 48, and 96 chromosomes. Labels: Candidate Gene Sequencing; HapMap, based on ~6-8 random chromosomes; New: 1000 Genome Program.] Example with 2 chromosomes: GTTACGCCAATACAGGATCCAGGAGATTACC / GTTACGCCAATACAGCATCCAGGAGATTACC
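The shape of the curves on this slide can be approximated with a simple model. This is a sketch under the assumption that chromosomes are sampled independently from a population in which the minor allele has frequency `maf`: a SNP is discovered only if both alleles appear in the sample. The function name is hypothetical, and this is not necessarily how the original figure was computed.

```python
def discovery_probability(maf, n_chromosomes):
    """Probability that both alleles of a SNP appear among n sampled chromosomes."""
    all_major = (1.0 - maf) ** n_chromosomes   # every chromosome carries the major allele
    all_minor = maf ** n_chromosomes           # every chromosome carries the minor allele
    return 1.0 - all_major - all_minor


# Sample sizes taken from the slide's curve labels.
for n in (2, 8, 16, 24, 48, 96):
    print(n, round(discovery_probability(0.05, n), 3))
```

Under this model, a SNP with a 5% minor allele frequency is usually missed when only two chromosomes are compared and is found almost always with 96, which matches the qualitative message of the slide.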

  8. Genotype - Phenotype Studies You have a candidate gene/region/pathway of interest and samples ready to study: • What SNPs are available? • How do I find the common SNPs? • What is the validation/quality of the SNPs? • Are these SNPs informative in my population/samples? • What information can I download? • How do I pick the “best” SNPs? - Dana Crawford

  9. Minimal SNP information for genotyping/characterization • What is the SNP? Flanking sequence and alleles, in FASTA format: >snp_name ACCGAGTAGCCAG[A/G]ACTGGGATAGAAC • dbSNP reference SNP # (rs #) • Where is the SNP mapped? Exon, promoter, UTR, etc. • How was it discovered? Method • What assurances do you have that it is real? Validated how? • What population: African, European, etc.? • What is the allele frequency of each SNP? Common (>5%) or rare • Are other SNPs associated (redundant)? • Is genotyping data for control populations available?
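The flanking-sequence record on this slide (a `>` header line, then left flank, bracketed alleles, and right flank) is simple to parse. A minimal sketch, assuming one SNP per record and two alleles separated by `/`; the function name and the returned field names are hypothetical:

```python
import re


def parse_snp_record(text):
    """Split a FASTA-style SNP record into its name, flanks, and alleles."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("record must start with a '>' header line")
    name = lines[0][1:].strip()
    body = "".join(lines[1:])  # the sequence may be wrapped over several lines
    match = re.fullmatch(
        r"([ACGTN]*)\[([ACGT-]+)/([ACGT-]+)\]([ACGTN]*)",
        body,
        flags=re.IGNORECASE,
    )
    if match is None:
        raise ValueError("expected LEFTFLANK[allele1/allele2]RIGHTFLANK")
    left, allele_1, allele_2, right = match.groups()
    return {
        "name": name,
        "left_flank": left,
        "alleles": (allele_1, allele_2),
        "right_flank": right,
    }


# The example record from the slide.
record = parse_snp_record(""">snp_name
ACCGAGTAGCCAG
[A/G]
ACTGGGATAGAAC
""")
print(record)
```

Keeping the flanks and alleles as separate fields makes it easy to check flank length before submitting a SNP for assay design.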

  10. Finding SNPs: Databases and Extraction How do I find and download SNP data for analysis/genotyping? • 1. SeattleSNPs - Candidate gene website • 2. Other web applications • GVS • HapMap Genome Browser • 3. Entrez Gene • - dbSNP • - Entrez SNP

  11. Finding SNPs: Databases and Extraction How do I find and download SNP data for analysis/genotyping? • 1. SeattleSNPs - Candidate gene website • 2. Other web applications • GVS • HapMap Genome Browser • 3. Entrez Gene • - dbSNP • - Entrez SNP

  12. Finding SNPs: SeattleSNPs Candidate Genes pga.gs.washington.edu

  13. Finding SNPs: SeattleSNPs Candidate Genes Example - PCSK9

  14. Finding SNPs: SeattleSNPs Candidate Genes

  15. Finding SNPs: SeattleSNPs Candidate Genes

  16. AD ED

  17. SNP_pos <tab> Ind_ID <tab> allele1 <tab> allele2 (repeat for all individuals, then repeat for the next SNP)
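The four-column, tab-delimited layout on this slide can be read line by line to tally alleles per SNP. A minimal sketch, assuming biallelic SNPs and no missing-data codes; the function names, sample IDs, and genotypes below are made up for illustration and are not SeattleSNPs data:

```python
from collections import Counter, defaultdict


def allele_counts(genotype_text):
    """Tally alleles per SNP position from SNP_pos/Ind_ID/allele1/allele2 lines."""
    counts = defaultdict(Counter)
    for line in genotype_text.splitlines():
        line = line.strip()
        if not line:
            continue
        snp_pos, _ind_id, allele_1, allele_2 = line.split("\t")
        counts[int(snp_pos)].update([allele_1, allele_2])
    return counts


def minor_allele_frequency(counter):
    """Frequency of the rarer allele; 0.0 if only one allele was observed."""
    if len(counter) < 2:
        return 0.0
    return min(counter.values()) / sum(counter.values())


# Hypothetical data: two SNPs, three individuals each.
example = (
    "100\tD001\tA\tG\n"
    "100\tD002\tA\tA\n"
    "100\tD003\tG\tG\n"
    "250\tD001\tC\tC\n"
    "250\tD002\tC\tT\n"
    "250\tD003\tC\tC\n"
)

counts = allele_counts(example)
for position in sorted(counts):
    print(position, dict(counts[position]), minor_allele_frequency(counts[position]))
```

Because each line carries its own SNP position and individual ID, the file can be processed as a stream without first loading a full genotype matrix.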

  18. PolyPhen (Polymorphism Phenotyping): structural protein characteristics and evolutionary comparison • SIFT (Sorting Intolerant From Tolerant): evolutionary comparison of non-synonymous SNPs

  19. Finding SNPs: SeattleSNPs Candidate Genes pga.gs.washington.edu

  20. Finding SNPs: Databases and Extraction How do I find and download SNP data for analysis/genotyping? • 1. SeattleSNPs - Candidate gene website • 2. Other web applications • GVS • HapMap Genome Browser • 3. Entrez Gene • - dbSNP • - Entrez SNP

  21. GVS: Genome Variation Server http://gvs.gs.washington.edu/GVS/ • Provides rapid analysis of 4.5 million genotyped SNPs from dbSNP and the HapMap • Mapped to human genome build 36 (hg18) • Displays genotype data in text and image formats • Displays tagSNPs or clusters of informative SNPs in text and image formats • Displays linkage disequilibrium (LD) in text and image formats • Online tutorial provided at OpenHelix.com
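Linkage disequilibrium between two SNPs is commonly summarised as r². A minimal sketch of the textbook calculation, assuming phased haplotypes are available as (allele at SNP 1, allele at SNP 2) pairs; the function name and example data are hypothetical, and this is not presented as the way GVS computes LD:

```python
def r_squared(haplotypes, allele_a, allele_b):
    """r^2 between two biallelic SNPs from a list of phased two-SNP haplotypes."""
    n = len(haplotypes)
    p_a = sum(1 for h in haplotypes if h[0] == allele_a) / n    # freq of allele_a at SNP 1
    p_b = sum(1 for h in haplotypes if h[1] == allele_b) / n    # freq of allele_b at SNP 2
    p_ab = sum(1 for h in haplotypes if h == (allele_a, allele_b)) / n
    d = p_ab - p_a * p_b                                        # disequilibrium coefficient D
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d * d / denominator


# Hypothetical haplotypes: alleles always travel together (complete LD).
linked = [("A", "C")] * 4 + [("G", "T")] * 4
# Hypothetical haplotypes: all four combinations equally common (no LD).
unlinked = [("A", "C")] * 2 + [("A", "T")] * 2 + [("G", "C")] * 2 + [("G", "T")] * 2

print(r_squared(linked, "A", "C"))
print(r_squared(unlinked, "A", "C"))
```

Unphased genotype data would first need haplotype frequencies to be estimated, which this sketch does not attempt.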

  22. GVS: Genome Variation Server LDLR http://gvs.gs.washington.edu/GVS/

  23. GVS: Genome Variation Server

  24. GVS: Genome Variation Server • Table of genotypes • Image of visual genotypes

  25. GVS: Genome Variation Server Genotypes displayed in prettybase table and visual genotype graphic

  26. GVS: Genome Variation Server

  27. GVS: Genome Variation Server • High-density genic coverage (SeattleSNPs): SNP discovery at 1/200 bp • Low-density genome coverage (HapMap): SNPs at ~1/1000 bp • Dense genotypes around a candidate gene can be integrated with broader HapMap genotypes

  28. GVS: Genome Variation Server Dense genotypes around a candidate gene can be integrated with lower-density HapMap genotypes

  29. GVS: Genome Variation Server A. Common samples, combined variations B. Combined samples, common variations C. Combined samples, combined variations

  30. GVS: Genome Variation Server A. Common samples, combined variations

  31. GVS: Genome Variation Server B. Combined samples, common variations (SeattleSNPs and HapMap)

  32. GVS: Genome Variation Server C. Combined samples, combined variations
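One way to read the three merge options (A, B, C) is as set operations on sample IDs and variation IDs, with "common" meaning intersection and "combined" meaning union. The sketch below encodes that reading; it is an interpretation of the slide labels, not the GVS implementation, and every identifier and ID in it is made up:

```python
def merge_plan(samples_1, variations_1, samples_2, variations_2,
               sample_mode, variation_mode):
    """Return the sorted sample IDs and variation IDs kept under each merge mode."""
    pick = {"common": set.intersection, "combined": set.union}
    samples = pick[sample_mode](set(samples_1), set(samples_2))
    variations = pick[variation_mode](set(variations_1), set(variations_2))
    return sorted(samples), sorted(variations)


# Hypothetical data sets: a dense candidate-gene set and a sparser genome-wide set.
dense_samples, dense_variations = ["S1", "S2", "S3"], ["v1", "v2", "v3"]
sparse_samples, sparse_variations = ["S2", "S3", "S4"], ["v2", "v4"]
args = (dense_samples, dense_variations, sparse_samples, sparse_variations)

option_a = merge_plan(*args, sample_mode="common", variation_mode="combined")
option_b = merge_plan(*args, sample_mode="combined", variation_mode="common")
option_c = merge_plan(*args, sample_mode="combined", variation_mode="combined")
print(option_a)
print(option_b)
print(option_c)
```

Under this reading, option A gives the most complete genotype matrix (no sample lacks a whole data set), while option C keeps the most data at the cost of many missing genotypes.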

  33. Finding SNPs: Databases and Extraction How do I find and download SNP data for analysis/genotyping? • 1. SeattleSNPs - Candidate gene website • 2. Other web applications • GVS • HapMap Genome Browser • 3. Entrez Gene • - dbSNP • - Entrez SNP

  34. www.hapmap.org

  35. Finding SNPs: HapMap Browser

  36. Finding SNPs: HapMap Browser • HapMap data sets are useful because individual genotype data in deeply sampled populations can be used to determine optimal genotyping strategies (tagSNPs) or perform population genetic analyses (linkage disequilibrium) • Data are specific to the HapMap project (not all dbSNP) • HapMap data are available in dbSNP • The browser allows visualization of the data and direct access to SNP data, individual genotypes, and LD analysis; data can be saved in formats for Haploview
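The tagSNP idea mentioned on this slide is to genotype one SNP per group of highly correlated SNPs. A minimal greedy sketch, assuming pairwise r² values have already been computed and using 0.8 as an illustrative threshold; the function name, SNP labels, and r² values are hypothetical, and this is not presented as the algorithm used by HapMap, GVS, or Haploview:

```python
def greedy_tag_snps(snps, r2, threshold=0.8):
    """Repeatedly pick the SNP that tags the most still-untagged SNPs."""

    def linked(a, b):
        if a == b:
            return True
        # Pairs missing from r2 are treated as uncorrelated.
        return r2.get((a, b), r2.get((b, a), 0.0)) >= threshold

    untagged = set(snps)
    bins = []
    while untagged:
        # Ties are broken by the order of the input list.
        best = max(
            (s for s in snps if s in untagged),
            key=lambda s: sum(linked(s, t) for t in untagged),
        )
        members = {t for t in untagged if linked(best, t)}
        bins.append((best, sorted(members)))
        untagged -= members
    return bins


# Hypothetical SNP labels and pairwise r^2 values.
snps = ["snpA", "snpB", "snpC", "snpD"]
r2 = {
    ("snpA", "snpB"): 0.95,
    ("snpB", "snpC"): 0.85,
    ("snpA", "snpC"): 0.60,
    ("snpC", "snpD"): 0.10,
}

bins = greedy_tag_snps(snps, r2)
print(bins)  # [('snpB', ['snpA', 'snpB', 'snpC']), ('snpD', ['snpD'])]
```

Here two genotyped SNPs (snpB and snpD) stand in for all four, which is the cost saving that tagSNP selection aims for.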

  37. Finding SNPs: Databases and Extraction How do I find and download SNP data for analysis/genotyping? • 1. SeattleSNPs - Candidate gene website • 2. Other web applications • GVS • HapMap Genome Browser • 3. Entrez Gene • - dbSNP • - Entrez SNP

  38. NCBI - Database Resource PCSK9 www.ncbi.nlm.nih.gov

  39. Finding SNPs using NCBI databases http://www.ncbi.nlm.nih.gov/

  40. Default View cSNPs

  41. Finding SNPs using NCBI databases http://www.ncbi.nlm.nih.gov/

  42. PCSK9
