1 / 63

Genome-wide association studies (GWAS)

Genome-wide association studies (GWAS). Thomas Hoffmann. Outline for GWAS. Review / Overview Design Analysis QC Prostate cancer example Imputation Replication & Meta-analysis Advanced analysis intro (more next lecture) Limitations & “missing heritability” Gene/pathway tests

Download Presentation

Genome-wide association studies (GWAS)

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Genome-wide association studies (GWAS) Thomas Hoffmann

  2. Outline for GWAS • Review / Overview • Design • Analysis • QC • Prostate cancer example • Imputation • Replication & Meta-analysis • Advanced analysis intro (more next lecture) • Limitations & “missing heritability” • Gene/pathway tests • Polygenic models

  3. Outline for GWAS • Review / Overview • Design • Analysis • QC • Prostate cancer example • Imputation • Replication & Meta-analysis • Advanced analysis intro (more next lecture) • Limitations & “missing heritability” • Gene/pathway tests • Polygenic models

  4. Manolio et al., Clin Invest 2008

  5. Candidate Gene or GWAS Recap: Association studies (guilt by association) Hirschhorn & Daly, Nat Rev Genet 2005

  6. GWAS Microarray Assay ~ 0.7 - 5M SNPs (keeps increasing) Affymetrix, http://www.affymetrix.com

  7. Genotype calls Bad calls! Good calls!

  8. Outline for GWAS • Review / Overview • Design • Analysis • QC • Prostate cancer example • Imputation • Replication & Meta-analysis • Advanced analysis intro (more next lecture) • Limitations & “missing heritability” • Gene/pathway tests • Polygenic models

  9. Genome-wide assocation studies (GWAS)

  10. One- and two-stage GWA designs Two-Stage Design One-Stage Design SNPs SNPs nsamples Stage 1 Samples Samples Stage 2 nmarkers

  11. One-Stage Design SNPs Samples Two-Stage Design Joint analysis Replication-based analysis SNPs SNPs 1 1 Stage 1 Stage 1 Samples Samples Stage 2 Stage 2 2 2

  12. Multistage Designs • Joint analysis has more power than replication • p-value in Stage 1 must be liberal • Lower cost—do not gain power • CaTs power calculator: http://www.sph.umich.edu/csg/abecasis/CaTS/index.html

  13. Genome-wide Sequence Studies • Trade off between number of samples, depth, and genomic coverage. More later in “Next generation sequencing” (NGS) lecture Goncalo Abecasis

  14. Near-term sequencing design choices • For example, between: • Sequencing few subjects with extreme phenotypes: • e.g., 200 cases, 200 controls, 4x coverage. Then follow-up in larger population. • 10M SNP chip based on 1,000 genomes. • 5K cases, 5K controls. • Which design will work best…? More later in “Next generation sequencing” (NGS) lecture

  15. GWAS Microarray Only assay SNPs designed into array (0.7-5 million) Much cheaper (so many more subjects) Genotypes currently more reliable GWAS Sequencing “De novo” discovery (particularly good for rare variants) More expensive (but costs are falling) (many less subjects) Need much more expansive IT support Lots of interesting interpretation problems (field rapidly evolving) Design choices

  16. Exome Microarray Only assay SNPs designed into array (~300K+custom); in exons only and that could affect protein coding function Cheapest (so many more subjects) Genotypes currently more reliable (some question about rarest, but preliminary results good) Exome Sequencing “De novo” discovery (particularly good for rare variants); %age of exons only More expensive than microarrays, less expensive than gwas sequencing Need more expansive IT support Lots of interesting interpretation problems Design choices

  17. Size of study Visscher, AJHG 2012,

  18. Size of study Visscher, AJHG 2012,

  19. Biggest studies... • GWAS Microarray: 100,000 People in the Kaiser RPGEH, still to be analyzed (Hoffmann et al., Genomics, 2011a&b) • Sequencing: 1000 Genomes Project (though not disease focused, low coverage issues) • Exome Sequencing: GO ESP (12,031 subjects, for exome microarray design)

  20. Outline for GWAS • Review / Overview • Design • Analysis • QC • Prostate cancer example • Imputation • Replication & Meta-analysis • Advanced analysis intro (more next lecture) • Limitations & “missing heritability” • Gene/pathway tests • Polygenic models

  21. QC Steps • Filter SNPs and Individuals • MAF, Low call rates • Test for HWE among controls & within ethnic groups. Use conservative alpha-level. • Check for relatedness. Identity-by-state calculations. • Check genotype gender • Filter Mendelian inhertance (family-based, or potentially cryptics, if large enough sample)

  22. Check for relatedness, e.g., HapMap Pemberton et al., AJHG 2010

  23. GWAS analysis • Most common approach: look at each SNP one-at-a-time. • Possibly add in multi-marker information. • Further investigate / report top SNPs only. • Or backwards replication… P-values

  24. GWAS analysis • Additive coding of SNP most common, just a covariate in a regression framework • Dichotomous phenotype: logistic regression • Continuous phenotype: linear regression • Correct for multiple comparisons • e.g., Bonferroni, 1 million gives =5x10-8 • more next time • Adjust for potential population stratification • principal components (PC’s), on best performing SNPs • software usually does LD filter (e.g. Eigensoft)

  25. Adjusting for PC’s (recap) Balding, Nature Reviews Genetics 2010

  26. Adjusting for PC's • Li et al., Science 2008

  27. Adjusting for PC's • Razib, Current Biology 2008

  28. Adjusting for PC's • Wang, BMC Proc 2009

  29. QQ-plots and PC adjustment Wang, BMC Proc 2009

  30. Quantile-quantile (QQ) plot

  31. Example: GWAS of Prostate Cancer chromosome http://cgems.cancer.gov Multiple prostate cancer loci on 8q24 Witte, Nat Genet 2007

  32. Prostate Cancer Replications Witte, Nat Rev Genet 2009 Modest ORs

  33. Prostate Cancer Replications Witte, Nat Rev Genet 2009 Modest ORs

  34. SNPs Missed in Replication? 24,223 smallest P-value! Witte, Nat Rev Genet, 2009

  35. Population Attributable Risks for GWAS Smoking & lung cancer BRCA1 & Breast cancer Jorgenson & Witte, 2009

  36. Imputation of SNP Genotypes • Combine data from different platforms (e.g., Affy & Illumina) (for replication / meta-analysis). • Estimate unmeasured or missing genotypes. • Based on measured SNPs and external info (e.g., haplotype structure of HapMap). • Increase GWAS power (impute and analyze all), e.g. Sick sinus syndrome, most significant was 1000 Genomes imputed SNP (Holm et al., Nature Genetics, 2011) • HapMap as reference, now 1000 Genomes Project?

  37. Imputation Example Study Sample HapMap/ 1K genomes • http://www.shapeit.fr/, http://mathgen.stats.ox.ac.uk/impute/impute_v2.html • http://faculty.washington.edu/browning/beagle/beagle.html • http://www.sph.umich.edu/csg/abecasis/MACH/download/ Gonçalo Abecasis

  38. Identify Match with Reference • http://www.shapeit.fr/, http://mathgen.stats.ox.ac.uk/impute/impute_v2.html • http://faculty.washington.edu/browning/beagle/beagle.html • http://www.sph.umich.edu/csg/abecasis/MACH/download/ Gonçalo Abecasis

  39. Phase chromosomes, impute missing genotypes • http://www.shapeit.fr/, http://mathgen.stats.ox.ac.uk/impute/impute_v2.html • http://faculty.washington.edu/browning/beagle/beagle.html • http://www.sph.umich.edu/csg/abecasis/MACH/download/ Gonçalo Abecasis

  40. Imputation Application TCF7L2 gene region & T2D from the WTCCC data Observed genotypes black Imputed genotypes red. Chromosomal Position Marchini Nature Genetics2007 http://www.stats.ox.ac.uk/~marchini/#software

  41. Replication • To replicate: • Association test for replication sample significant at 0.05 alpha level • Same mode of inheritance • Same direction • Sufficient sample size for replication • Non-replications not necessarily a false positive • LD structures, different populations (e.g., flip-flop) • covariates, phenotype definition, underpowered

  42. Meta-analysis • Combine multiple studies to increase power • Either combine p-values (Fisher’s test), • or z-scores (better)

  43. (Meta-analysis)Example: GWAS of Prostate Cancer chromosome http://cgems.cancer.gov Multiple prostate cancer loci on 8q24 Witte, Nat Genet 2007

  44. Replication & Meta-analysis

  45. Meta-analysis

  46. Outline for GWAS • Review / Overview • Design • Analysis • QC • Prostate cancer example • Imputation • Replication & Meta-analysis • Advanced analysis intro (more next lecture) • Limitations & “missing heritability” • Gene/pathway tests • Polygenic models

  47. Limitations of GWAS Example: AUC for Breast Cancer Risk 58%: Gail model (# first degree relatives w bc, age menarche, age first live birth, number of previous biopsies) + age, study, entry year 58.9%: SNPs 61.8%: Combined Wacholder et al., NEJM 2010 • Not very predictive Witte, Nat Rev Genet 2009

  48. Limitations of GWAS • Not very predictive • Explain little heritability • Focus on common variation • Many associated variants are not causal

  49. Where's the heritability? Visccher, AJHG 2011

More Related