Genome-wide association studies: what they can and can't tell us about disease biology

kalkin

Presentation Transcript


  1. Genome-wide association studies: what they can and can't tell us about disease biology • Hunter Fraser [Title slide background: a DNA sequence]

  2. Causes of disease • Environmental • Genetic

  3. Causes of disease • Environmental • Correlation vs. causation • Genetic

  6. Causes of disease • Environmental • Correlation vs. causation • Infinite possibilities • Genetic

  7. Polymorphisms • SNPs • Insertions/deletions • Copy-number variation • Inversions • Translocations

  8. Causes of disease • Environmental • Correlation vs. causation • Infinite possibilities • Genetic • Causality clear (in properly designed study)

  9. Causes of disease • Environmental • Correlation vs. causation • Infinite possibilities • Genetic • Causality clear (in properly designed study) • Finite number of polymorphisms (~10^7 common)

  10. Genome-wide association studies (GWAS) • Polymorphisms are the basis for the genetic component of variability in disease risk • Genetic component (heritability) often explains >50% of the variation in disease risk • Goal: for every polymorphism, determine which diseases it affects

  11. Genome-wide association studies (GWAS) [Figure: diseases vs. polymorphisms]

  15. Genome-wide association studies (GWAS) • Main idea: look for genetic differences between people with and without a disease • First, need a method able to genotype thousands to millions of polymorphisms at once: microarrays • Second, need to know where the common polymorphisms are: the HapMap project • Third, need HUGE cohorts of people to detect subtle allele frequency differences
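
The "subtle allele frequency differences" point can be made concrete with a minimal sketch. It assumes a simple allelic chi-square test on a 2x2 table of allele counts (cases vs. controls); the slides do not say which test was used, and the function name and the counts below are made up for illustration.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Rows are cases/controls, columns are alternate/reference allele counts.
    Returns (chi-square statistic, p-value).
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_sums = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical SNP: allele frequency 0.32 in cases vs. 0.30 in controls.
# 2,000 cases and 2,000 controls (4,000 alleles per group):
small = allelic_chi_square(1280, 2720, 1200, 2800)
# Same frequencies with 20,000 cases and 20,000 controls:
large = allelic_chi_square(12800, 27200, 12000, 28000)

print("small cohort: chi2=%.2f p=%.3g" % small)
print("large cohort: chi2=%.2f p=%.3g" % large)
```

With identical allele frequencies, only the larger cohort pushes the p-value below the conventional genome-wide threshold of 5e-8, which is why cohort size matters so much when millions of polymorphisms are tested at once.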

  16. Genome-wide association studies (GWAS)

  17. Genome-wide association studies (GWAS) [Plot; x-axis: genomic position]

  18. Genome-wide association studies (GWAS) • End result of a successful GWAS:

  19. Genome-wide association studies (GWAS) • End result of a successful GWAS: • What does this actually tell us?

  20. Genome-wide association studies (GWAS) • End result of a successful GWAS: • What does this actually tell us? • How to predict disease risk from genotype? • What polymorphisms cause disease? • What genes are involved?

  21. What do GWAS tell us? • How to predict disease risk from genotype? • Potentially yes, but in practice GWAS explain only a few percent of the genetic component (“Missing heritability”) • What polymorphisms cause disease? • No, GWAS use “tag SNPs” in linkage disequilibrium with causal polymorphisms
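
To illustrate why a tag SNP can show an association without being causal, here is a minimal sketch of the standard r² measure of linkage disequilibrium between two biallelic sites. It assumes phased haplotypes with alleles coded 0/1; the function name and the haplotype counts are hypothetical.

```python
def ld_r_squared(haplotypes):
    """r^2 linkage disequilibrium between two biallelic sites.

    haplotypes: list of (a, b) tuples, one per chromosome, where a is the
    allele at the tag SNP and b the allele at the second site, coded 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at site A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at site B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return d * d / denom


# Hypothetical sample of 100 chromosomes in which the tag SNP allele
# usually, but not always, travels with the causal allele.
sample = [(1, 1)] * 45 + [(1, 0)] * 5 + [(0, 1)] * 5 + [(0, 0)] * 45
print(ld_r_squared(sample))  # approximately 0.64
```

The higher r² is, the better the tag SNP's genotype predicts the causal variant's genotype, so the tag SNP inherits the disease association even though it has no effect itself.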

  22. What do GWAS tell us? • What genes are involved? • An important issue for our understanding of disease biology • Candidate genes are nearly always guessed, and the guesses are biased by prior knowledge • There is almost never direct evidence implicating a particular gene, and most hits are intergenic • Transcriptional enhancers can act at long distances, making this a nontrivial problem

  23. Inferring disease genes • Need an unbiased, systematic method to infer disease genes from GWAS hits • One solution: integrate results with separate GWAS for gene expression “traits” (eQTL) [Diagram: DNA, genotyping array; RNA, expression array]

  24. eQTL mapping

  25. eQTL mapping • What does this tell us about disease genes? • Either coincidence, or SNP X affects disease Z via its effect on gene Y, making Y a “disease gene” [Diagram: SNP X, expression of gene Y, disease Z]
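
The integration step can be sketched as a simple join between two result tables. This is only an illustration under assumed data structures: `gwas_hits` and `eqtl_map` are hypothetical dictionaries, and the identifiers are placeholders taken from the slide's SNP X / gene Y / disease Z example, not real results.

```python
def candidate_genes(gwas_hits, eqtl_map):
    """Propose candidate disease genes by joining GWAS hits with eQTL results.

    gwas_hits: dict mapping disease name -> iterable of associated SNP ids.
    eqtl_map:  dict mapping SNP id -> iterable of genes whose expression
               is associated with that SNP.
    Returns a dict mapping disease -> set of candidate genes.
    """
    result = {}
    for disease, snps in gwas_hits.items():
        genes = set()
        for snp in snps:
            genes.update(eqtl_map.get(snp, ()))
        if genes:  # keep only diseases with at least one eQTL-supported gene
            result[disease] = genes
    return result


# Placeholder identifiers, for illustration only.
gwas_hits = {"disease_Z": ["snp_X", "snp_W"], "disease_Q": ["snp_V"]}
eqtl_map = {"snp_X": ["gene_Y"], "snp_U": ["gene_T"]}

print(candidate_genes(gwas_hits, eqtl_map))  # {'disease_Z': {'gene_Y'}}
```

An overlap found this way is a hypothesis rather than proof: as the slide notes, the shared SNP could still be a coincidence.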

  26. Affymetrix exon arrays • ~6 million probes • Cover nearly all exons in the human genome

  27. Alternative splicing

  28. The data set • Exon arrays on lymphoblastoid cell lines • 89 YRI (Yoruban from Nigeria) and 87 CEPH (European-American from Utah), genotyped at >3 million SNPs (HapMap)

  29. The analysis • Compute transcript variation for each exon • Compare transcript variation patterns to SNP genotypes • Integrate with disease GWAS results

  32. Comparing transcripts to SNPs • Compare each exon to all HapMap SNPs • Strongest correlations are local: SNPs lie near the exon(s) they affect

  33. Comparing transcripts to SNPs • Calculate correlation between each exon’s expression level and all SNPs within 100 kb

  35. Comparing transcripts to SNPs • Calculate correlation between each exon’s expression level and all SNPs within 100 kb [Diagram: SNPs in a 100 kb window on either side of the exon]
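
A minimal sketch of this local scan follows. It makes several assumptions that are not from the slides: genotypes are coded as 0/1/2 copies of one allele, the exon is reduced to a single coordinate, and plain Pearson correlation stands in for whatever statistic the study used. All names and numbers are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def cis_associations(exon_pos, exon_expr, snps, window=100_000):
    """Correlate one exon's expression with every SNP within `window` bp.

    exon_expr: expression level per individual.
    snps: list of (snp_id, position, genotypes), genotypes coded 0/1/2 in the
          same individual order as exon_expr.
    Returns (snp_id, r) pairs sorted by decreasing |r|.
    """
    results = []
    for snp_id, pos, genos in snps:
        if abs(pos - exon_pos) > window:
            continue  # outside the local window
        if len(set(genos)) < 2:
            continue  # monomorphic in this sample, correlation undefined
        results.append((snp_id, pearson_r(genos, exon_expr)))
    return sorted(results, key=lambda pair: abs(pair[1]), reverse=True)


# Made-up data for six individuals.
expr = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
snps = [
    ("snp_a", 1_050_000, [0, 0, 1, 1, 2, 2]),  # inside window, tracks expression
    ("snp_b", 1_300_000, [0, 1, 2, 0, 1, 2]),  # outside the 100 kb window
    ("snp_c", 950_000, [2, 0, 1, 0, 1, 2]),    # inside window, weak correlation
]
for snp_id, r in cis_associations(1_000_000, expr, snps):
    print(snp_id, round(r, 2))
```

Restricting the scan to nearby SNPs keeps the number of tests per exon small, which is consistent with the earlier observation that the strongest correlations are local.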

  36. Significant associations: 1,061 exons, ~10 SNPs/exon

  37. Top hit: IRF5 • A transcription factor that acts downstream of Toll-like receptors, and a cause of lupus

  38. Top hit: IRF5 • A transcription factor that acts downstream of Toll-like receptors, and a cause of lupus [Figure label: probeset]

  39. Top hit: IRF5 • A transcription factor that acts downstream of Toll-like receptors, and a cause of lupus [Figure: probeset; r = 0.97; no overlap between genotypes]

  40. Second hit: OAS1 (chr12) • 2’,5’-oligoadenylate synthetase 1 • Splice site mutations contribute to viral infection susceptibility and T1D (Bonnevie-Nielsen et al., 2005) • Three known splice variants of exons 5/6

  41. Second hit: OAS1 (chr12) [Figure label: probeset]

  42. Second hit: OAS1 (chr12) [Figure: probeset; r = 0.93]

  43. Second hit: OAS1 (chr12) [Figure: two probesets; r = 0.93]

  44. The analysis • Compute transcript variation for each exon • Compare transcript variation patterns to SNP genotypes • Integrate with disease GWAS results

  45. Genome-wide association studies • Could some disease-associated SNPs influence disease through splicing? • Compiled list of 68 disease SNPs

  46. Genome-wide association studies • Could some disease-associated SNPs influence disease through splicing? • Compiled list of 68 disease SNPs • 4 overlaps with splicing SNPs (0.1 expected)

  47. Genome-wide association studies • Could some disease-associated SNPs influence disease through splicing? • Compiled list of 68 disease SNPs • 4 overlaps with splicing SNPs (0.1 expected) • All 4 are from autoimmune diseases! Of 24 autoimmune SNPs, 0.04 overlaps expected • Suggests tissue specificity of polymorphic transcript variation
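
To get a feel for how surprising 4 overlaps are when 0.1 (or 0.04) are expected, here is a rough sketch that treats the overlap count as Poisson-distributed. The Poisson model is an assumption made for illustration; the slides do not state how the expectation or its significance was calculated.

```python
import math


def poisson_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected)."""
    below = sum(
        math.exp(-expected) * expected ** k / math.factorial(k)
        for k in range(observed)
    )
    return 1.0 - below


# All 68 disease SNPs: 4 overlaps observed, 0.1 expected.
print("all disease SNPs: %.2g" % poisson_tail(4, 0.1))
# 24 autoimmune SNPs: 4 overlaps observed, 0.04 expected.
print("autoimmune SNPs:  %.2g" % poisson_tail(4, 0.04))
```

Under this simplified model both probabilities are far below 1 in 10,000, which supports reading the overlap as more than chance.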

  48. Autoimmune-associated SNP • One SNP was associated with multiple autoimmune conditions: type 1 diabetes (T1D) and Crohn's disease (CD) • Located near PTPN2, a tyrosine phosphatase involved in immune regulation • PTPN2 is known to have two splice forms, only one of which has a nuclear localization signal (NLS) (Ibarra-Sanchez et al., 2000) • The two isoforms have very different target proteins

  49. Autoimmune-associated SNP • Could this SNP cause autoimmune diseases by changing the ratio of PTPN2 splice forms? [Figure labels: probeset, PTPN2]
