Topic #7 Single-Locus Association Studies: Case-Control Studies. University of Wisconsin Genetic Analysis Workshop June 2011. Outline. Case-Control Study: Two-allele, single locus model Alternative Tests for Association Quantitative Outcomes: Two-allele, single locus model

Topic #7 Single-Locus Association Studies: Case-Control Studies

University of Wisconsin

Genetic Analysis Workshop

June 2011


Outline

  • Case-Control Study:

    • Two-allele, single locus model

    • Alternative Tests for Association

  • Quantitative Outcomes:

    • Two-allele, single locus model

    • Alternative Tests

  • Multiple Testing (Topic #8):




Tests (Models) of Association

  • Genotype: Distribution of 3 genotypes differs in the two groups (unstructured alternative)

    Standard χ² on 2 df

  • Recessive: Relative frequency of A1A1 differs in two groups

    χ² on 1 df

  • Dominant: Relative frequency of A2A2 differs in two groups

    χ² on 1 df
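
The three tests above differ only in how the 2 × 3 genotype table is collapsed before a Pearson χ² is computed. A minimal Python sketch of that idea follows; the genotype counts are hypothetical (they are not the data behind the slides' examples) and the helper names are chosen for illustration only.

```python
# Hypothetical genotype counts in the order A1A1, A1A2, A2A2 (illustration only).
cases = [30, 120, 150]
controls = [15, 90, 195]


def chi_square(table):
    """Pearson chi-square statistic for an r x c table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def collapse(counts, keep):
    """Collapse three genotype counts into [genotype `keep`, all other genotypes]."""
    return [counts[keep], sum(counts) - counts[keep]]


# Genotype test: full 2 x 3 table, 2 df.
genotype_chi2 = chi_square([cases, controls])
# Recessive test: A1A1 versus the rest, 1 df.
recessive_chi2 = chi_square([collapse(cases, 0), collapse(controls, 0)])
# Dominant test: A2A2 versus the rest, 1 df.
dominant_chi2 = chi_square([collapse(cases, 2), collapse(controls, 2)])

print(round(genotype_chi2, 2), round(recessive_chi2, 2), round(dominant_chi2, 2))
```

For these hypothetical counts the statistics come out at roughly 15.16 (genotype), 5.41 (recessive) and 13.81 (dominant), which shows how the choice of model can change the strength of evidence from the same table.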


Case-Control Example: Genotype Test

Test #1: Compare genotype frequency in cases and controls

Test: χ²(2 df) = 27.7; p < 10⁻⁵


Case-Control Example: Recessive Test

Test #2: Rate of E4/E4 in cases and controls

Test: χ²(1 df) = 5.46; p = .019


Case-Control Example: Dominant Test

Test #3: Compares rate of +/+ in Cases and Controls

Test: χ²(1 df) = 27.3; p < 10⁻⁶


Simple Association: More Tests

  • Trend (Cochran-Armitage): Regress proportion of cases on # of risk alleles (here E4)

  • Allele Test: Count alleles rather than individuals (assumes HWE)

  • Case-only design: Test whether cases are in HWE

  • Logistic Model
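
The trend, allele, and case-only HWE tests in this list can all be computed directly from genotype counts. The sketch below uses hypothetical counts (not the slides' data), scores each genotype by its number of A1 alleles, and uses the standard closed-form expressions for the Cochran-Armitage statistic and the 2 × 2 Pearson χ²; function names are illustrative.

```python
# Hypothetical genotype counts in the order A1A1, A1A2, A2A2 (illustration only).
cases = [30, 120, 150]
controls = [15, 90, 195]
scores = [2, 1, 0]  # number of A1 (risk) alleles carried by each genotype


def trend_chi2(cases, controls, scores):
    """Cochran-Armitage trend statistic (1 df) with the given genotype scores."""
    totals = [r + s for r, s in zip(cases, controls)]
    n, n_cases = sum(totals), sum(cases)
    sxr = sum(x * r for x, r in zip(scores, cases))
    sxn = sum(x * t for x, t in zip(scores, totals))
    sxxn = sum(x * x * t for x, t in zip(scores, totals))
    numerator = n * (n * sxr - n_cases * sxn) ** 2
    denominator = n_cases * (n - n_cases) * (n * sxxn - sxn ** 2)
    return numerator / denominator


def allele_counts(genotypes):
    """Convert genotype counts (A1A1, A1A2, A2A2) to allele counts [A1, A2]."""
    return [2 * genotypes[0] + genotypes[1], genotypes[1] + 2 * genotypes[2]]


def allele_chi2(cases, controls):
    """Pearson chi-square (1 df) on the 2 x 2 table of allele counts."""
    a, b = allele_counts(cases)
    c, d = allele_counts(controls)
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def hwe_chi2(genotypes):
    """Goodness-of-fit chi-square (1 df) for Hardy-Weinberg proportions."""
    n = sum(genotypes)
    p = (2 * genotypes[0] + genotypes[1]) / (2 * n)
    q = 1 - p
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    return sum((o - e) ** 2 / e for o, e in zip(genotypes, expected))


trend = trend_chi2(cases, controls, scores)
allelic = allele_chi2(cases, controls)
case_only_hwe = hwe_chi2(cases)
print(round(trend, 2), round(allelic, 2), round(case_only_hwe, 2))
```

With these counts the trend statistic is 15.0, the allele statistic is 16.0, and the case-only HWE statistic is about 0.68. The allele test counts 2N chromosomes rather than N individuals, which is why it relies on HWE.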


Trend Test (Cochran & Armitage)

Test #4: Cochran-Armitage Trend Test

Test: χ²(1 df) = 25.3; p < 10⁻⁵


Allele Test

Test #5: Allele Frequency Comparison

Test: χ²(1 df) = 26.8; p < 10⁻⁶

Sasieni, P. D. (1997). From genotypes to genes: Doubling the sample size. Biometrics, 53(4), 1253-1261.


Case-Only Design: Test for HWE

Test #6: Departure from HWE

Test: χ²(1 df) = 0.4; p = .52

(Little power for multiplicative model)



And there are more: A 7th (and later 8th!) test

  • Log-additive/logistic:


Advantage of Logistic Framework

  • Can easily accommodate covariates

  • Can accommodate alternative models (e.g., dominance or recessive models) with dummy variables

  • Test of H0: βi = 0 is very nearly the same as the allele test
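
One common way to write the log-additive logistic model and its extensions is sketched below. The notation is an assumption for illustration, not taken from the slides: G is the number of risk alleles (0, 1, 2), Z is a vector of covariates, H is a dummy variable equal to 1 for heterozygotes, and β1 plays the role of the coefficient tested in H0 above.

```latex
\begin{align*}
\operatorname{logit} P(\text{case} \mid G, Z)
  &= \beta_0 + \beta_1 G + \boldsymbol{\gamma}^{\top} Z,
  \qquad G \in \{0, 1, 2\} \\
\text{OR per risk allele} &= e^{\beta_1} \\
\operatorname{logit} P(\text{case} \mid G, H)
  &= \beta_0 + \beta_1 G + \beta_2 H
  \qquad \text{(adds a dominance deviation)}
\end{align*}
```

Under this parameterization each additional risk allele multiplies the odds of being a case by the same factor, which is what "log-additive" refers to; testing β2 = 0 in the last line asks whether the heterozygote departs from that pattern.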


Genetic Determinants of Human Ageing and Longevity Project

  • Aim:

    • Identify genetic variants associated with extreme longevity

  • Basic Design:

    • 1200 cases (Danish 1905 cohort) and 800 controls (MADT)

    • Candidate-gene approach: 168 genes

  • Genotyping:

    • 1536 SNPs using Illumina's GoldenGate array



Summary of plink Data Cleaning of GCLC_Clean

  • Start: 1200 Cases, 800 Controls, 13 SNPs

  • Eliminate:

    • 293 (203 cases/90 controls) individuals with > 10% missing

    • 1 SNP eliminated because > 10% missing

    • 1 SNP eliminated for failing HWE at p < .001

    • 1 SNP eliminated due to low MAF

  • Final sample: 997 cases, 710 controls and 13 SNPs in GCLC


plink Implementation of Association Tests

  • Basic association test (allelic):

    plink --file gclc_clean --assoc

    (generates plink.assoc)


plink Association Output (plink.assoc)

CHR  SNP         BP        A1  F_A      F_U      A2  CHISQ    P          OR
6    rs7742367   53469235  G   0.169    0.1472   A   2.906    0.08826    1.178
6    rs670548    53474948  G   0.3507   0.39     A   5.504    0.01898    0.8447
6    rs661603    53478066  G   0.4626   0.4085   A   9.752    0.001791   1.246
6    rs16883912  53481730  A   0.1093   0.09437  G   1.988    0.1585     1.177
6    rs572496    53485578  A   0.5005   0.4458   G   9.952    0.001607   1.246
6    rs617066    53491877  A   0.3296   0.2648   G   16.52    4.82e-005  1.365
6    rs2100375   53493434  A   0.3539   0.3077   G   7.936    0.004846   1.232
6    rs531557    53497954  T   0.4769   0.433    A   6.421    0.01128    1.194
6    rs16883966  53505685  G   0.05308  0.0346   A   6.506    0.01075    1.564
6    rs4712035   53509062  C   0.1745   0.1711   G   0.06685  0.796      1.024
6    rs2397147   53509546  G   0.432    0.388    A   6.608    0.01015    1.2
6    rs534957    53514310  G   0.3258   0.338    C   0.5596   0.4544     0.9464
6    rs675908    53521259  G   0.3246   0.3383   A   0.7009   0.4025     0.94

Nominally significant at p < .05: rs670548, rs661603, rs572496, rs617066, rs2100375, rs531557, rs16883966, rs2397147


plink Implementation of Association Tests

  • Genetic model based tests (genotype, trend, dominant, recessive):

    plink --file gclc_clean --model

    (generates plink.model)


Association ‘Model’ Tests for 13 GCLC SNPs

Highlighted:

  • In red: nominally significant at p < .05

  • In blue: significant after Bonferroni correction, p < .004 (i.e., .05/13)
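
The Bonferroni threshold quoted here is simply .05 divided by the number of SNPs tested. As an illustrative sketch, the same rule can be applied to the allelic-test p-values reported in the plink.assoc output earlier in this presentation (the model-test p-values themselves are not reproduced in this transcript, so the allelic ones stand in for them).

```python
# Allelic-test p-values copied from the plink.assoc output in this presentation.
p_values = {
    "rs7742367": 0.08826, "rs670548": 0.01898, "rs661603": 0.001791,
    "rs16883912": 0.1585, "rs572496": 0.001607, "rs617066": 4.82e-05,
    "rs2100375": 0.004846, "rs531557": 0.01128, "rs16883966": 0.01075,
    "rs4712035": 0.796, "rs2397147": 0.01015, "rs534957": 0.4544,
    "rs675908": 0.4025,
}

alpha = 0.05
# Bonferroni: divide the family-wise alpha by the number of tests (13 SNPs).
threshold = alpha / len(p_values)

nominal = [snp for snp, p in p_values.items() if p < alpha]
bonferroni = [snp for snp, p in p_values.items() if p < threshold]

print(f"Bonferroni threshold: {threshold:.4f}")
print("Nominal:", nominal)
print("Bonferroni:", bonferroni)
```

On the allelic test, 8 of the 13 SNPs are nominally significant but only rs661603, rs572496 and rs617066 fall below the corrected threshold of about .0038.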


Low Frequency SNPs

  • Within the 13 GCLC SNPs, rs16883966 had MAF < .05 (.049 in Danish 1905 and .037 in MADT)

  • For this SNP, the test statistic could not be computed for the Genotype, Dominant, and Recessive models because of low cell frequencies (Exp < .05)


plink Implementation of Association Tests

  • Fisher exact test (the 8th!):

    plink --file gclc_clean --fisher

    (generates plink.assoc.fisher)

  • Logistic:

    plink --file gclc_clean --logistic

    (generates plink.assoc.logistic)





Reparameterized Single-locus Model


Genotypic Values

[Figure: scale of genotypic values. The genotype means u11, u12, u22 (genotypes A1A1, A1A2, A2A2) are re-expressed with the two homozygotes at -a and +a and the heterozygote at d.]

d is the dominance parameter; when d = 0, the locus is additive



Additive Genetic Variance

Note: d contributes to additive variance whenever q is not equal to .5


Dominance Genetic Variance

Note: There is dominance variance only when d is not 0
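
Both notes follow from the standard single-locus variance decomposition. The expressions below are the textbook (Falconer) forms, written under assumptions that are not stated in this transcript: genotypic values of +a, d and -a for A1A1, A1A2 and A2A2, allele frequencies p for A1 and q = 1 - p for A2, and Hardy-Weinberg proportions.

```latex
\begin{align*}
\alpha &= a + d\,(q - p)
  && \text{(average effect of an allele substitution)} \\
\sigma^2_A &= 2pq\,\alpha^2 = 2pq\,\bigl[a + d\,(q - p)\bigr]^2
  && \text{(additive genetic variance)} \\
\sigma^2_D &= (2pq\,d)^2
  && \text{(dominance genetic variance)}
\end{align*}
```

The term d(q - p) vanishes only when q = .5 (or d = 0), so d enters the additive variance at any other allele frequency, while the dominance variance is zero exactly when d = 0.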



Complete Additivity

Slope of regression line = a

Additive genetic variance = regression variance

[Figure: genotypic values plotted against 0, 1, 2 on the horizontal axis, with the fitted regression line]



Partial Dominance

Slope of regression line = a

Dominance = Residual Variance

Additive genetic variance = regression variance

[Figure: genotypic values plotted against 0, 1, 2 on the horizontal axis, with the fitted regression line]



Complete Dominance

Dominance = Residual Variance

Slope of regression line = a

Additive genetic variance = regression variance

[Figure: genotypic values plotted against 0, 1, 2 on the horizontal axis, with the fitted regression line]





Some Conclusions

  • Dominance effects contribute to additive genetic variance

  • Even with complete Mendelian dominance, additive variance typically exceeds dominance variance (exception would be overdominance)
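
These conclusions can be checked numerically. The sketch below assumes the standard parameterization (genotypic values +a, d, -a for A1A1, A1A2, A2A2; frequency p for A1; Hardy-Weinberg proportions), none of which is spelled out in this transcript, and confirms that the additive and dominance components sum to the total genotypic variance.

```python
def variance_components(p, a, d):
    """Return (additive, dominance) genetic variance for one biallelic locus."""
    q = 1 - p
    alpha = a + d * (q - p)          # average effect of an allele substitution
    additive = 2 * p * q * alpha ** 2
    dominance = (2 * p * q * d) ** 2
    return additive, dominance


def total_genotypic_variance(p, a, d):
    """Variance of genotypic values computed directly from HWE genotype frequencies."""
    q = 1 - p
    freqs = [p * p, 2 * p * q, q * q]
    values = [a, d, -a]
    mean = sum(f * v for f, v in zip(freqs, values))
    return sum(f * (v - mean) ** 2 for f, v in zip(freqs, values))


# Illustrative scenarios with a = 1 and p = 0.5 (values chosen for the example).
scenarios = {
    "additive (d = 0)": (0.5, 1.0, 0.0),
    "partial dominance (d = 0.5)": (0.5, 1.0, 0.5),
    "complete dominance (d = a)": (0.5, 1.0, 1.0),
    "overdominance (d = 2a)": (0.5, 1.0, 2.0),
}

results = {}
for name, (p, a, d) in scenarios.items():
    va, vd = variance_components(p, a, d)
    results[name] = (va, vd)
    print(f"{name}: VA = {va:.3f}, VD = {vd:.3f}, "
          f"total = {total_genotypic_variance(p, a, d):.3f}")
```

At p = .5 with complete dominance the additive variance (.5) is twice the dominance variance (.25); only in the overdominance scenario does the dominance variance exceed the additive variance.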


Power Calculation in Quanto for Quantitative Trait

  • In a study of 1000 unrelated individuals, what is our power to detect a single locus effect?

    • Strength of genetic effect (R²g)?

    • Risk allele frequency?


Quanto G Power Calculation

  • Outcome/Design:

    • Continuous; Independent Individuals

  • Hypothesis:

    • Gene Only

  • Gene:

    • Allele Frequency .10 to .90 by .20

    • Additive model

  • Outcome Model:

    • R²g = .001 to .019 by .002

  • Power:

    • Sample Size = 1000 to 1000 by 0

    • Type I error rate = .05, two-sided

  • Calculate


Computed Power for N = 1000 (Minor Allele = Risk Allele)

[Figure: computed power plotted against % variance accounted for]
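
A rough feel for these power values can be had without Quanto. The sketch below is a normal approximation to the two-sided test of a single regression slope, not Quanto's own algorithm; it takes the noncentrality as sqrt(N · R²g / (1 − R²g)) and uses the same grid of R²g values as the settings above.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n, r2, alpha=0.05):
    """Approximate power of a two-sided test for a locus explaining r2 of the variance."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha / 2)
    noncentrality = sqrt(n * r2 / (1 - r2))
    # Probability of exceeding the critical value in either tail.
    return z.cdf(noncentrality - critical) + z.cdf(-noncentrality - critical)


n = 1000
r2_grid = [0.001 + 0.002 * k for k in range(10)]  # .001 to .019 by .002
power_table = [(r2, approx_power(n, r2)) for r2 in r2_grid]

for r2, power in power_table:
    print(f"R2g = {r2:.3f}  ({100 * r2:.1f}% of variance)  power ~ {power:.2f}")
```

Under this approximation, power at N = 1000 is about .17 for a locus explaining 0.1% of the variance and rises to roughly .89 at 1%.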


Association with a Quantitative Phenotype

  • Genotype: 10 SNP markers in the COMT gene, including rs4680

  • Sample: 7235 participants in MCTFR longitudinal research

  • Phenotype: General externalizing composite (having an overall mean of ~ 0.0, SD ~ .36)

    plink --bfile comt --pheno ext.dat --mpheno 2 --missing-phenotype -99.0 --assoc --qt-means


Output: plink.qassoc

CHR  SNP        BP        NMISS  BETA       SE        R2          T        P
22   rs4646312  18328337  7233   -0.003598  0.006141  4.747e-005  -0.5859  0.558
22   rs165656   18328863  7232   -0.01252   0.005983  0.0006056   -2.093   0.03637
22   rs165722   18329013  7235   -0.01346   0.005974  0.0007017   -2.254   0.02424
22   rs2239393  18330428  7233   -0.003556  0.006125  4.662e-005  -0.5806  0.5615
22   500437     18330763  7232   -0.004062  0.006127  6.079e-005  -0.663   0.5074
22   rs4680     18331271  7234   -0.01358   0.005973  0.0007139   -2.273   0.02305
22   rs4646316  18332132  7235   -0.002434  0.007201  1.58e-005   -0.3381  0.7353
22   rs165774   18332561  7235   0.009351   0.006543  0.0002823   1.429    0.153
22   rs174699   18334458  7235   -0.0124    0.01288   0.0001281   -0.9626  0.3358
22   rs165599   18336781  7233   -0.004997  0.006435  8.337e-005  -0.7765  0.4375

Nominally significant at p < .05: rs165656, rs165722, rs4680
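
The columns of this output are tied together by simple regression identities: T is BETA divided by SE, and T² equals R2 · (NMISS − 2) / (1 − R2). The short check below, a sketch with illustrative function names, applies both identities to the three nominally significant rows of the table above.

```python
from math import copysign, sqrt

# (SNP, NMISS, BETA, SE, R2, T) copied from the plink.qassoc output above.
rows = [
    ("rs165656", 7232, -0.01252, 0.005983, 0.0006056, -2.093),
    ("rs165722", 7235, -0.01346, 0.005974, 0.0007017, -2.254),
    ("rs4680", 7234, -0.01358, 0.005973, 0.0007139, -2.273),
]


def t_from_beta(beta, se):
    """t statistic as the regression slope divided by its standard error."""
    return beta / se


def t_from_r2(r2, n, beta):
    """t statistic recovered from R-squared and sample size, signed like the slope."""
    return copysign(sqrt(r2 * (n - 2) / (1 - r2)), beta)


checks = []
for snp, n, beta, se, r2, t_reported in rows:
    t_slope = t_from_beta(beta, se)
    t_r2 = t_from_r2(r2, n, beta)
    checks.append((snp, t_slope, t_r2, t_reported))
    print(f"{snp}: BETA/SE = {t_slope:.3f}, from R2 = {t_r2:.3f}, reported T = {t_reported}")
```

Both routes reproduce the reported T values to within rounding, and the R2 column makes the effect sizes concrete: even the strongest SNP here accounts for less than 0.1% of the phenotypic variance.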


Output: plink.qassoc.means (rs4680)


Simple Association: Conclusions

  • Power depends on which test is used

    • In the absence of a strong hypothesis, most use tests that assume heterozygote risk is intermediate (trend, logistic, allelic)

    • While the trend test is generally preferred, logistic (~allelic) has advantages in generalizability

  • We now need to worry about multiple testing!

