MESA FS SC Meeting

mdozier
Presentation Transcript


  1. MESA FS SC Meeting: Candidate Wide Association Study (CWAS), aka ‘Data Mining’. Joe Mychaleckyj

  2. CG Panel 1+2: Summary

     SNPs                      CG1     CG2     TOTAL
     Picked                    1536    1535*   3071
     Typed                     1440    1467    2907
     Typed, Not Dropped        1440    1442    2882   (available for CWAS)
     * One duplicate SNP

     AIMs                      CG1     CG2     TOTAL
     Picked                    97      112     209
     Typed                     96      106     202
     Typed, Not Dropped        96      103     199

     Annotated, Unique Genes per Panel
                               CG1     CG2     TOTAL
     Typed, Not Dropped SNPs   119     123     230

  3. CWAS: Why?
     • CG1 + CG2 = 2882 SNPs
     • Analyze all SNPs irrespective of genome location or putative gene assignment
     • Inference of gene function based on genome is crude
     • SNPs lie in regions with co-located genes, where assignment to a single gene is imprecise and misleading (coordinated gene regulation)
     • GWAS and multi-candidate gene studies are fast becoming the standard for disease and trait gene mapping publications
     • Results in press faster

  4. MESA CWAS Process (diagram)

     Phenotype Class File (e.g., ECG; N phenotypes: Pheno 1, Pheno 2, ...), one row per ID:
       1001  12  2.41  4.77
       1002  5.32  2.99
       1003  6  1.69  4.13
       1004  25  3.04  2.87

     Master Genetics File, one row per ID:
       1001  AA GT AA CC CT
       1002  AA GG AT CC CT
       1003  AG GG AA CC CC
       1004  AA GG AA CT CT

     CWAS: 1 per phenotype
       Pheno 1: SNP1 -0.93, SNP2 0.87, SNP3 1.10, SNP4 0.97
       Pheno 2: SNP1 1.22, SNP2 1.02, SNP3 -3.1, SNP4 -0.7
       Pheno 3: SNP1 1.34, SNP2 1.22, SNP3 0.61, SNP4 0.65
       ... etc.

     Candidate Gene Files, one set per phenotype (Pheno 1, Pheno 2, Pheno 3): SNPs grouped by gene (Gene 1, Gene 2, Gene 3, Gene 4; e.g., SNP106, SNP107, SNP108)
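The process slide pairs a phenotype class file with the master genetics file by participant ID and produces one CWAS per phenotype. A minimal sketch of that join follows, assuming in-memory dictionaries in place of the real files. The names `PHENOTYPES`, `GENOTYPES`, `SNP_NAMES`, `build_dataset`, and `build_all_datasets` are hypothetical, and the phenotype values are illustrative only, not MESA data.

```python
# Hypothetical in-memory stand-ins for the two input files.
# Phenotype values are illustrative; None marks a missing measurement.
PHENOTYPES = {
    1001: {"pheno1": 2.41, "pheno2": 4.77},
    1002: {"pheno1": 5.32, "pheno2": 2.99},
    1003: {"pheno1": 1.69, "pheno2": None},
    1004: {"pheno1": 3.04, "pheno2": 2.87},
}

# Genotype layout mirrors the slide's example rows; SNP names are assumed.
SNP_NAMES = ["SNP1", "SNP2", "SNP3", "SNP4", "SNP5"]
GENOTYPES = {
    1001: ["AA", "GT", "AA", "CC", "CT"],
    1002: ["AA", "GG", "AT", "CC", "CT"],
    1003: ["AG", "GG", "AA", "CC", "CC"],
    1004: ["AA", "GG", "AA", "CT", "CT"],
}


def build_dataset(phenotype_name, phenotypes, genotypes, snp_names):
    """Join one phenotype to the genotypes by participant ID.

    Participants with a missing phenotype value or no genotype row are skipped.
    """
    rows = []
    for pid in sorted(phenotypes):
        y = phenotypes[pid].get(phenotype_name)
        if y is None or pid not in genotypes:
            continue
        rows.append({
            "id": pid,
            "y": y,
            "genotypes": dict(zip(snp_names, genotypes[pid])),
        })
    return rows


def build_all_datasets(phenotypes, genotypes, snp_names):
    """Build one analysis dataset per phenotype (one CWAS per phenotype)."""
    names = sorted({name for record in phenotypes.values() for name in record})
    return {
        name: build_dataset(name, phenotypes, genotypes, snp_names)
        for name in names
    }


datasets = build_all_datasets(PHENOTYPES, GENOTYPES, SNP_NAMES)
for name, rows in datasets.items():
    print(name, len(rows), "participants")
```

Keeping the genetics data in a single master structure and joining per phenotype means every phenotype class is analyzed against the same genotype calls.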

  5. Phenotype Class Status

  6. Phenotype Class Status TOTALS 886 98 *

  7. Analytical Pipeline
     • Use the same curated phenotype and genetic data sets that are available (split into CGs) for investigators
     • Baseline models, within 4 ethnic group strata
     • Y ~ age + sex + site
     • Additive (1 df) + Genotype (2 df) tests
     • Filter on MAF > 0.05 to remove rare alleles
     • Full (common + rare) SNP data is available if a CWAS group requests it, but test statistics may be misleading
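The pipeline's MAF filter and its two genotype codings can be sketched as follows. This is an illustration under stated assumptions, not the MESA pipeline itself: it assumes biallelic SNPs stored as two-character genotype strings such as "AG", and the function names are hypothetical. The additive coding yields one predictor (1 df); the genotype coding yields two indicator predictors (2 df). Either would be added to the baseline covariates age, sex, and site.

```python
from collections import Counter

MAF_THRESHOLD = 0.05  # threshold from the slide: keep SNPs with MAF > 0.05


def allele_counts(genotypes):
    """Count alleles across genotype strings such as 'AG'; None means missing."""
    counts = Counter()
    for g in genotypes:
        if g is not None:
            counts.update(g)  # adds both allele characters
    return counts


def minor_allele_frequency(genotypes):
    """Return the MAF of one biallelic SNP (0.0 if monomorphic or all missing)."""
    counts = allele_counts(genotypes)
    if len(counts) < 2:
        return 0.0
    return min(counts.values()) / sum(counts.values())


def minor_allele(genotypes):
    """Return the less frequent allele, or None if the SNP is monomorphic."""
    counts = allele_counts(genotypes)
    if len(counts) < 2:
        return None
    return min(counts, key=counts.get)


def additive_code(genotype, minor):
    """Additive coding (1 df): number of copies of the minor allele (0, 1, 2)."""
    return genotype.count(minor)


def genotype_code(genotype, minor):
    """Genotype coding (2 df): indicators for heterozygote and minor homozygote."""
    n = genotype.count(minor)
    return (1 if n == 1 else 0, 1 if n == 2 else 0)


def passes_maf_filter(genotypes, threshold=MAF_THRESHOLD):
    """True when the SNP is common enough to be tested."""
    return minor_allele_frequency(genotypes) > threshold


snp = ["AA", "AT", "AA", "AA"]  # toy genotypes for a single SNP
m = minor_allele(snp)
print(minor_allele_frequency(snp), passes_maf_filter(snp))
print([additive_code(g, m) for g in snp])
print([genotype_code(g, m) for g in snp])
```

With rare alleles the minor homozygote indicator is often all zeros or nearly so, which is one reason test statistics on unfiltered SNPs can mislead.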

  8. What’s in a CWAS Package of Analyses for a Phenotype?
     2 classes of tests:
     • Additive (1 df) + Genotype (2 df)
     Summary table of the top N=50 SNPs with rankings and summary statistics
     • Stratified by ethnic group
     • Sorted by additive model p-value
     • Includes the ranking for each ethnic group (e.g., the SNP ranked #1 for AFA may be #240 for CHN)
     4 quantile-quantile (QQ) plots: one per ethnic stratum, for each test
     4 Genome Association Plots (GAPs): one per ethnic stratum, for each test
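The top-N summary table and the QQ plot coordinates can be sketched as below. This assumes p-values are held in a dictionary per ethnic stratum; the function names, the choice to sort by a single stratum, and the `(i + 0.5) / m` plotting position for expected quantiles are assumptions made for illustration, and the p-values are invented.

```python
import math


def rank_within_stratum(pvalues):
    """Map SNP -> rank within one stratum (1 = smallest p-value)."""
    ordered = sorted(pvalues, key=pvalues.get)
    return {snp: i + 1 for i, snp in enumerate(ordered)}


def top_n_table(pvalues_by_stratum, sort_stratum, n=50):
    """Top-n SNPs by p-value in one stratum, with their rank in every stratum."""
    ranks = {s: rank_within_stratum(p) for s, p in pvalues_by_stratum.items()}
    reference = pvalues_by_stratum[sort_stratum]
    top = sorted(reference, key=reference.get)[:n]
    return [
        {
            "snp": snp,
            "p": reference[snp],
            "ranks": {s: ranks[s].get(snp) for s in ranks},
        }
        for snp in top
    ]


def qq_points(pvalues):
    """(expected, observed) -log10(p) pairs under a uniform null.

    Assumes every p-value is strictly greater than 0.
    """
    observed = sorted(pvalues)
    m = len(observed)
    return [
        (-math.log10((i + 0.5) / m), -math.log10(p))
        for i, p in enumerate(observed)
    ]


# Invented additive-model p-values for two of the four strata.
pvalues = {
    "AFA": {"SNP1": 0.001, "SNP2": 0.2, "SNP3": 0.05},
    "CHN": {"SNP1": 0.9, "SNP2": 0.01, "SNP3": 0.3},
}
for row in top_n_table(pvalues, "AFA", n=2):
    print(row)
print(qq_points(list(pvalues["AFA"].values())))
```

Reporting each SNP's rank in every stratum makes it easy to spot signals that appear in only one ethnic group, as in the slide's AFA versus CHN example.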

  9. Tabular Results in topN file

  10. CWAS, Like GWAS, Is Fraught with Risk
     • Interpretation: caveat lector
     • Do the test statistics appear reasonable?
     • 1 df vs 2 df tests, CIs, standard errors, etc.
     • Is there evidence of genotyping bias or errors?
     • Are allele effects consistent (even if not significant)?
     • Are the results confounded (comorbidities or correlated traits)?
     • Is the gene (or genes) reasonable: is there independent evidence of association, or gene expression data, to support a putative physiological role?
     • Is there sufficient power?

  11. CWAS To Do
     • Rerun with genetically determined ancestry adjustment (currently self-reported)
     • Rerun models with missing non-baseline covariates
     • Lipids: complete CWAS incorporating multiple imputation of lipid levels adjusted for lipid medications
     • Run analyses for new phenotype classes and for classes with primary outcomes still TBD
     • Distribute results as per Genetics Committee directives
     • NB: Ancillary study groups (e.g., Lung, Eye) may have separate analysis/writing group policies
