## Sequential Kernel Association Tests for the Combined Effect of Rare and Common Variants


**Sequential Kernel Association Tests for the Combined Effect of Rare and Common Variants**
Journal club (Nov/13), SH Lee

**Introduction**
• Sequence data
• Rare and unidentified variants
• Groupwise association tests
• Omnibus tests
• Burden test, CMC test, SKAT test
• Up-weighting for rare, down-weighting for common
• Rare/common variants tested separately

**Introduction**
• This study develops a joint test of rare and common variants
• Combining burden/SKAT tests for rare/common variants
• Can be applied to
  • whole exome sequencing + GWAS
  • deep resequencing of GWAS loci
• Basically can analyse all variants, including rare, low-frequency and common variants
• Simulation (type I error, power)
• Real data: CD and autism

**Materials and Methods**
Definition of rare/common
• MAF < 0.01: rare
• MAF 0.01-0.05: low frequency
• MAF > 0.05: common
Or
• MAF < 1/sqrt(2*n): rare
• MAF > 1/sqrt(2*n): common
• n = 500: rare if MAF < 0.031
• n = 10000: rare if MAF < 0.007

**Materials and Methods**
• Testing for the overall effect of rare and common variants
• Burden test for rare variants
• SKAT test for common variants
• Weighted-sum statistics
• Fisher's method of combining the p-values

**Weighted-sum statistics**
• Within a region (e.g. a gene) having m variants
• g(*) is a linear or logistic link function
• Alpha is for covariates
• X is an n x m matrix
• Beta is the vector of regression coefficients, treated as a random variable

**Weighted sum score test (Variance component score test)**
• Taking the first derivative of the log-likelihood with respect to the variance τ
• P-value from κχ²_ν, where κ is the scale parameter and ν is the degrees of freedom

**Weighted sum score test (Variance component score test)**
Wu et al (2010) AJHG 86: 929; Liu et al (2008) BMC Bioinformatics 8: 292; Lin (1997) Biometrika 84: 309; White (1982) Econometrica 50: 1

**Weighted sum score test (Variance component score test)**
• ρ: the correlation between regression coefficients
• If the coefficients are perfectly correlated (ρ = 1), they will all be the same after weighting, and one should collapse the variants first before running the regression, i.e., the burden test
• If the regression coefficients are unrelated to each other, one should use SKAT
Lee et al. (2012) AJHG 91: 224

**Burden-C, SKAT-C**
• Combined test statistic for rare and common variants
• Weighting beta(p,1,25) for rare,
• beta(p,0.5,0.5) for common
• Partitioning rare and common variants

**Other methods**
• Burden-A, SKAT-A
  • Adaptively combining rare/common
  • Searching φ for the minimum p-value
• Burden-F, SKAT-F
  • Fisher's combination method

**Simulation**
• Sequence data for 10,000 haplotypes over a 1 Mb region
• Calibrated model for the European population
• Randomly sampled a region of 5 or 25 kb and simulated data for 1000-5000 individuals
• Proportion of cases in the sample is 0.5

**Type I error**
• The proposed methods agree with the expectation

**Power (separation cut-off)**
• Using the Burden-C test
• Power with different separation cut-offs
• 1/sqrt(2n) is used from here on

**Power (proposed methods)**
• Power for 8 different tests
• The proposed combination tests outperform the others

**Power**
• Rare/common causal variants (models 1, 2, 3, 6)
• The combination methods perform better

**Power**
• Common causal variants (model 5)
• The combination methods perform better
• Rare causal variants (model 4)
• The combination methods perform similarly

**Power (proposed methods)**
• The proposed combination methods outperform CMC for all 6 disease models
• The proposed combination methods outperform the original SKAT for all 6 disease models

**Power**
• For models 1-4, which include only risk variants
• SKAT is better than Burden when the proportion of risk variants is small (10%)
• Burden is better than SKAT when the proportion of risk variants is large (30%)

**Power**
• Models 1-3, which include both rare and common variants
• SKAT-F is better than Burden-F regardless of the proportion of risk variants
• Model 5, which includes only common risk variants
• SKAT is better than Burden regardless of the proportion of risk variants

**Power**
• Adaptive tests (SKAT-A, Burden-A)
• Perform worse than SKAT-C and Burden-C
• Results for a region of size 5 kb were similar

**Real data**
• CD NOD2 sequence data
• 453 cases, 103 controls
• 60 single nucleotide variations (9 of them have MAF > 0.05)
• Because only pooled frequency counts were available for each variant, sequencing data were simulated
• Autism LRP2 sequencing data
• 430 cases, 379 controls

**Real data**
• The combination methods are more powerful than the others

**Discussion**
• The proposed combination methods
• Partitioning rare/common
• Powerful approach
• Better than CMC (rare/common partitioning)
• Better than the original Burden and SKAT tests
• Can be extended to family-based designs

**Discussion**
• T1D HLA region
• SKAT (2.7e-43)
• Wald test (6.7e-49)
• Likelihood ratio test (8.9e-221)
• LD between regions
• Multiple different components within a region

**Linear SKAT vs individual variant test statistics**
• Linear SKAT (lower) and the individual variant test (upper) are equivalent
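
Three ingredients from the Materials and Methods slides can be sketched in a few lines of Python: the 1/sqrt(2n) cut-off separating rare from common variants, the group-specific beta weights, and Fisher's method for combining a rare-variant and a common-variant p-value (the idea behind Burden-F and SKAT-F). This is a minimal illustration, not the authors' implementation. The function names are hypothetical, "beta(p, a, b)" is assumed to mean the Beta(a, b) density evaluated at the minor allele frequency p, and the Fisher step assumes the two p-values are independent, which may not hold exactly when rare and common variants are in LD.

```python
import math


def rare_cutoff(n):
    """MAF cut-off 1/sqrt(2n) for a sample of n individuals (2n chromosomes)."""
    return 1.0 / math.sqrt(2.0 * n)


def beta_density(p, a, b):
    """Beta(a, b) density evaluated at p, for 0 < p < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(p) + (b - 1.0) * math.log(1.0 - p))


def variant_weight(maf, n):
    """Assign a variant to the rare or common group and return its weight.

    Assumed weighting: beta(p, 1, 25) for rare, beta(p, 0.5, 0.5) for common.
    """
    if maf < rare_cutoff(n):
        return "rare", beta_density(maf, 1.0, 25.0)
    return "common", beta_density(maf, 0.5, 0.5)


def fisher_combine(p_rare, p_common):
    """Fisher's method for two p-values, assumed independent.

    Under the null, -2 * (ln p1 + ln p2) follows a chi-square distribution
    with 4 degrees of freedom, whose upper tail at x is exp(-x/2) * (1 + x/2).
    """
    stat = -2.0 * (math.log(p_rare) + math.log(p_common))
    return math.exp(-stat / 2.0) * (1.0 + stat / 2.0)


if __name__ == "__main__":
    for n in (500, 10000):
        print(n, rare_cutoff(n))
    print(variant_weight(0.005, 500))
    print(variant_weight(0.20, 500))
    print(fisher_combine(0.05, 0.05))
```

For n = 500 and n = 10000 the cut-off evaluates to about 0.0316 and 0.0071, consistent with the thresholds quoted on the slide. The beta(p, 1, 25) weight falls steeply as MAF increases, so the rarest variants dominate the rare group, whereas beta(p, 0.5, 0.5) is comparatively flat over the common range.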