On genome-wide association studies (GWAS)

  • association
  • linkage disequilibrium
  • population structure
  • case/control design
  • single nucleotide polymorphism data

[Figure: a SNP shown on two chromosomes, with the labels "Chromosome 1", "Chromosome 2" and "SNP". Sequences shown, with the variable base set apart:]

TTCAGTCAGATCC T AGCCC
TTCAGTCAGATCC C AGCCC
AAGTCAGTCTAGG G TCGGG
AAGTCAGTCTAGG A TCGGG


Population structure explained part of the significant +11.2% inflation of test statistics we observed in an analysis of 6,322 nonsynonymous SNPs in 816 cases of type 1 diabetes and 877 population-based controls from Great Britain. The remainder of the inflation resulted from differential bias in genotype scoring between case and control DNA samples, which originated from two laboratories, causing false-positive associations.

Nature Genetics 37, 1243–1246 (2005)

Published online: 9 October 2005 | doi:10.1038/ng1653

Population structure, differential bias and genomic control in a large-scale, case-control association study

David G Clayton, Neil M Walker, Deborah J Smyth, Rebecca Pask, Jason D Cooper, Lisa M Maier, Luc J Smink, Alex C Lam, Nigel R Ovington, Helen E Stevens, Sarah Nutland, Joanna M M Howson, Malek Faham, Martin Moorhead, Hywel B Jones, Matthew Falkowski, Paul Hardenbol, Thomas D Willis & John A Todd


Genomic Control (Devlin and Roeder)

  • premise: population structure inflates the variance of the test statistic under the null
  • ideally, Y_i^2 ~ chi-square(1)
  • with structure, Y_i^2 ~ lambda * chi-square(1), where lambda is the inflation factor
  • so use T_i = Y_i^2 / lambda.hat
  • lambda.hat = median(Y_i^2) / [null median], where the null median is that of chi-square(1), about 0.455
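
The correction above can be sketched in a few lines of Python. This is a minimal illustration on simulated statistics, not the authors' code: the function name `genomic_control`, the constant name, and the simulated inflation of 1.1 are all assumptions made for the example.

```python
import random
import statistics

# Median of a chi-square distribution with 1 degree of freedom (approximate).
CHI2_1DF_MEDIAN = 0.4549364

def genomic_control(chi2_stats):
    """Return (lambda_hat, corrected statistics) for 1-df test statistics Y_i^2."""
    lambda_hat = statistics.median(chi2_stats) / CHI2_1DF_MEDIAN
    corrected = [y2 / lambda_hat for y2 in chi2_stats]  # T_i = Y_i^2 / lambda.hat
    return lambda_hat, corrected

# Simulated null statistics with a known inflation factor (hypothetical data):
# the square of a standard normal draw is chi-square(1), then scaled by lambda.
rng = random.Random(0)
true_lambda = 1.1
stats = [true_lambda * rng.gauss(0.0, 1.0) ** 2 for _ in range(50000)]

lambda_hat, corrected = genomic_control(stats)
print(f"estimated lambda: {lambda_hat:.3f}")
```

Using the median rather than the mean keeps the estimate robust to the few SNPs that are truly associated, since those sit in the upper tail and barely move the median.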

Handling population structure

  • genomic control (Devlin & Roeder)
  • structured association (Pritchard et al.)
  • principal components (Price et al.)


Nature 447, 661-678 (7 June 2007) | doi:10.1038/nature05911; Received 26 March 2007; Accepted 11 May 2007

Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls

The Wellcome Trust Case Control Consortium

  • UK population; European ancestry
  • seven diseases; 50 research groups (BD, CAD, CD, HT, RA, T1D, T2D)
  • 2,000 cases per disease
  • 3,000 common controls (two distinct sets)
  • Affymetrix 500K mapping array set

Quality Control

  • 16,179 samples included (809 dropped for reasons such as contamination and non-Caucasian ancestry)
  • 469,557 SNPs included (93.8%)
  • average call rate 99.63%
  • 392,575 SNPs have minor allele frequency (MAF) > 1%
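
As a rough illustration of the two per-SNP quantities in this list, the sketch below computes a call rate and a minor allele frequency from genotype calls. The 0/1/2 coding, the `None` marker for a missing call, the function names, and the 0.95 call-rate threshold are assumptions for the example; only the MAF > 1% cut-off comes from the slide.

```python
def snp_summary(calls):
    """Return (call_rate, maf) for one SNP.

    calls: one entry per sample; 0, 1 or 2 copies of a reference allele,
    or None where the genotype could not be called.
    """
    observed = [g for g in calls if g is not None]
    call_rate = len(observed) / len(calls)
    if not observed:
        return call_rate, 0.0
    freq = sum(observed) / (2.0 * len(observed))  # frequency of the counted allele
    maf = min(freq, 1.0 - freq)                   # frequency of the rarer allele
    return call_rate, maf

def passes_qc(calls, min_call_rate=0.95, min_maf=0.01):
    """Hypothetical filter: min_call_rate is an assumed value, not from the study."""
    call_rate, maf = snp_summary(calls)
    return call_rate >= min_call_rate and maf > min_maf

# Ten samples, one missing call (made-up data).
example = [0, 1, 2, None, 0, 0, 1, 0, 0, 0]
call_rate, maf = snp_summary(example)
print(call_rate, round(maf, 4), passes_qc(example))
```

Taking the minimum of the two allele frequencies means the result does not depend on which allele happened to be counted.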

There may be important population structure that is not well captured by current geographical region of residence. Present implementations of strongly model-based approaches such as STRUCTURE (refs 11, 12) are impracticable for data sets of this size, and we reverted to the classical method of principal components (refs 13, 14), using a subset of 197,175 SNPs chosen to reduce inter-locus linkage disequilibrium. Nevertheless, four of the first six principal components clearly picked up effects attributable to local linkage disequilibrium rather than genome-wide structure. The remaining two components show the same predominant geographical trend from NW to SE but, perhaps unsurprisingly, London is set somewhat apart.
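
To make the principal components idea concrete, here is a small sketch on simulated genotypes from two groups with different allele frequencies. It is not the consortium's procedure: the data are simulated with independent SNPs (so no pruning for linkage disequilibrium is needed), the scaling (g - 2p) / sqrt(p(1 - p)) is one commonly used normalization assumed here, and all function names are hypothetical. The first component is found by power iteration so that only the standard library is required.

```python
import math
import random

def standardize(genotypes):
    """Center and scale each SNP of an individuals x SNPs matrix of 0/1/2 counts."""
    n = len(genotypes)
    columns = []
    for j in range(len(genotypes[0])):
        col = [row[j] for row in genotypes]
        p = sum(col) / (2.0 * n)              # estimated allele frequency
        if p <= 0.0 or p >= 1.0:
            continue                          # a monomorphic SNP carries no information
        sd = math.sqrt(p * (1.0 - p))
        columns.append([(g - 2.0 * p) / sd for g in col])
    return columns                            # one list of length n per retained SNP

def first_component(columns, n, iterations=60, seed=1):
    """Leading eigenvector of X X^T by power iteration (X = individuals x SNPs)."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(iterations):
        w = [0.0] * n
        for col in columns:
            s = sum(c * x for c, x in zip(col, v))   # one entry of X^T v
            for i, c in enumerate(col):
                w[i] += s * c                        # accumulate X (X^T v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def simulate(n_per_group=20, n_snps=300, seed=0):
    """Two groups whose allele frequencies differ at every SNP (made-up data)."""
    rng = random.Random(seed)
    genotypes = [[] for _ in range(2 * n_per_group)]
    for _ in range(n_snps):
        p_a = rng.uniform(0.2, 0.8)
        p_b = min(0.95, max(0.05, p_a + rng.uniform(-0.3, 0.3)))
        for i in range(2 * n_per_group):
            p = p_a if i < n_per_group else p_b
            genotypes[i].append((rng.random() < p) + (rng.random() < p))
    return genotypes

genotypes = simulate()
columns = standardize(genotypes)
pc1 = first_component(columns, len(genotypes))
mean_a = sum(pc1[:20]) / 20
mean_b = sum(pc1[20:]) / 20
print(f"mean PC1 score, group A: {mean_a:.3f}; group B: {mean_b:.3f}")
```

In this simulation the two groups fall on opposite sides of zero on the first component. Scores like these are what would be entered as covariates in an association test.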


The overall effect of population structure on our association results seems to be small, once recent migrants from outside Europe are excluded. Estimates of over-dispersion of the association trend test statistics (usually denoted λ; ref. 15) ranged from 1.03 and 1.05 for RA and T1D, respectively, to 1.08–1.11 for the remaining diseases. Some of this over-dispersion could be due to factors other than structure, and this possibility is supported by the fact that inclusion of the two ancestry informative principal components as covariates in the association tests reduced the over-dispersion estimates only slightly (Supplementary Table 6), as did stratification by geographical region. This impression is confirmed on noting that P values with and without correction for structure are similar (Supplementary Fig. 9). We conclude that, for most of the genome, population structure has at most a small confounding effect in our study, and as a consequence the analyses reported below do not correct for structure. In principle, apparent associations in the few genomic regions identified in Table 1 as showing strong geographical differentiation should be interpreted with caution, but none arose in our analyses.
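
The over-dispersion estimates quoted here are computed from 1-df trend test statistics, one per SNP. As a hedged illustration, the sketch below implements the Cochran-Armitage trend test for a single SNP from genotype counts in cases and controls; the function names, the additive scores (0, 1, 2) and the example counts are assumptions for the example, not values from the study.

```python
import math

def trend_test(case_counts, control_counts, scores=(0, 1, 2)):
    """Cochran-Armitage trend statistic (1 df) for a 2 x 3 genotype table.

    case_counts, control_counts: numbers of individuals with 0, 1 and 2
    copies of the allele of interest.
    """
    totals = [r + s for r, s in zip(case_counts, control_counts)]
    n_cases = sum(case_counts)
    n_controls = sum(control_counts)
    n = n_cases + n_controls
    sum_rx = sum(x * r for x, r in zip(scores, case_counts))
    sum_nx = sum(x * t for x, t in zip(scores, totals))
    sum_nx2 = sum(x * x * t for x, t in zip(scores, totals))
    numerator = n * (n * sum_rx - n_cases * sum_nx) ** 2
    denominator = n_cases * n_controls * (n * sum_nx2 - sum_nx ** 2)
    return numerator / denominator

def p_value_1df(chi2):
    """Upper-tail probability of a chi-square(1) variable."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Made-up counts: the allele is more common in cases than in controls.
chi2 = trend_test([10, 20, 30], [30, 20, 10])
# Proportional counts in cases and controls give no trend at all.
chi2_null = trend_test([10, 20, 10], [20, 40, 20])
print(chi2, p_value_1df(chi2), chi2_null)
```

The statistic equals the sample size times the squared correlation between case status and genotype score. Dividing the median of such statistics across all SNPs by the chi-square(1) median gives the over-dispersion estimate.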