Multiple Comparisons Measures of LD

Multiple Comparisons Measures of LD. Jess Paulus, ScD, January 29, 2013. Today’s topics: multiple comparisons, measures of linkage disequilibrium, D’ and r², r² and power. Multiple testing & significance thresholds. Concern about multiple testing


Presentation Transcript


  1. Multiple Comparisons; Measures of LD. Jess Paulus, ScD, January 29, 2013

  2. Today’s topics • Multiple comparisons • Measures of linkage disequilibrium • D’ and r² • r² and power

  3. Multiple testing & significance thresholds • Concern about multiple testing • Standard thresholds (p<0.05) will lead to a large number of “significant” results • Vast majority of which are false positives • Various approaches to handling this statistically

  4. Possible Errors in Statistical Inference (2×2 table: unobserved truth in the population vs. what is observed in the sample) • Reject H0 (conclude SNP prevents DM) when Ha is true (SNP prevents DM): true positive (1 – β) • Reject H0 when H0 is true (no association): false positive, Type I error (α) • Fail to reject H0 (conclude no association) when Ha is true: false negative, Type II error (β) • Fail to reject H0 when H0 is true: true negative (1 – α)

  5. Probability of Errors • α: also known as the “level of significance”; the probability of a Type I error, i.e. rejecting the null hypothesis when it is in fact true (a false positive); typically set at 5% • p value: the probability, if the null hypothesis is true, of obtaining by chance alone a result as extreme as or more extreme than the one found in your study

  6. Type I Error (α) in Genetic and Molecular Research • A genome-wide association scan of 500,000 SNPs, even if no SNP is truly associated, is expected to yield: • 25,000 false positives by chance alone using α = 0.05 • 5,000 false positives by chance alone using α = 0.01 • 500 false positives by chance alone using α = 0.001
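
The counts on this slide are simply the number of tests multiplied by α. A minimal Python sketch of that arithmetic, assuming every SNP tested is truly null (the function name is illustrative, not from the slides):

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of false positives when all n_tests null hypotheses are true."""
    return n_tests * alpha


N_SNPS = 500_000  # number of SNPs in the scan, taken from the slide

# Expected false-positive counts at the three thresholds on the slide.
expected = {
    alpha: expected_false_positives(N_SNPS, alpha)
    for alpha in (0.05, 0.01, 0.001)
}

for alpha, count in expected.items():
    print(f"alpha = {alpha}: about {count:,.0f} false positives expected")
```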

  7. Multiple Comparisons Problem • Multiple comparisons (or "multiple testing") problem occurs when one considers a set, or family, of statistical inferences simultaneously • Type I errors are more likely to occur • Several statistical techniques have been developed to attempt to adjust for multiple comparisons • Bonferroni adjustment
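
One way to see why Type I errors become more likely is to compute the chance of at least one false positive across the whole family of tests. The sketch below assumes the tests are independent and that every null hypothesis is true; both are simplifying assumptions not stated on the slide, and the function name is illustrative.

```python
def family_wise_error_rate(n_tests: int, alpha: float) -> float:
    """Probability of at least one false positive among n_tests independent
    tests, each run at level alpha, when all null hypotheses are true."""
    return 1.0 - (1.0 - alpha) ** n_tests


# The chance of at least one false positive grows quickly with the number of tests.
for m in (1, 5, 20, 100):
    print(f"{m:>3} tests at alpha = 0.05: FWER = {family_wise_error_rate(m, 0.05):.3f}")
```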

  8. Adjusting alpha • Standard Bonferroni correction • Test each SNP at the α* = α/m1 level • Where m1 = number of markers tested • Assuming m1 = 500,000, the Bonferroni-corrected threshold is α* = 0.05/500,000 = 1×10⁻⁷ • Conservative when the tests are correlated • Permutation or simulation procedures may increase power by accounting for test correlation
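
The Bonferroni rule above can be sketched in a few lines of Python. The function names and the example p-values are made up for illustration; only the α/m1 rule and the 500,000-SNP figure come from the slide.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level alpha* = alpha / m1."""
    return alpha / n_tests


def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag each p-value that passes the Bonferroni-corrected threshold,
    taking the number of tests to be the number of p-values supplied."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# Threshold for the genome-wide scan on the slide.
gwas_threshold = bonferroni_threshold(0.05, 500_000)
print(f"alpha* = {gwas_threshold:.1e}")

# Hypothetical p-values for three SNPs; the corrected threshold is 0.05 / 3.
flags = significant_after_bonferroni([0.001, 0.02, 0.04])
print(flags)
```

Because the threshold depends only on the number of tests, it ignores correlation between SNPs, which is why the slide calls it conservative.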

  9. Measures of LD Jess Paulus, ScD January 29, 2013

  10. Haplotype definition • Haplotype: an ordered sequence of alleles at a subset of loci along a chromosome • Moving from examining single genetic markers to sets of markers

  11. Measures of linkage disequilibrium • Basic data: table of haplotype frequencies • [Figure: 16 sampled haplotypes at two loci: 8 AG, 6 ag, 2 Ag, 0 aG]

  12. D’ and r² are most common • Both measure correlation between two loci • D’ (D prime): ranges from 0 [no LD] to 1 [complete LD] • r² (r squared): also ranges from 0 to 1; it is the squared correlation between alleles at the two loci on the same chromosome

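  13. D • The deviation of the observed frequency of a haplotype from the frequency expected if the two loci were independent is a quantity called the linkage disequilibrium (D); for haplotype AG, D = p(AG) – p(A)·p(G) • If two alleles are in LD, it means D ≠ 0 • If D’ = 1 (D at its maximum possible magnitude for the given allele frequencies), there is complete dependency between loci • Linkage equilibrium means D = 0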
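
As a sketch of the definition above, D can be computed from haplotype counts as the observed frequency of one haplotype minus the product of its allele frequencies. The counts below are tallied from the haplotypes shown on the slides (8 AG, 2 Ag, 0 aG, 6 ag); the function name is illustrative.

```python
def linkage_disequilibrium_d(n_AG: int, n_Ag: int, n_aG: int, n_ag: int) -> float:
    """D = p(AG) - p(A) * p(G), computed from the four haplotype counts."""
    n = n_AG + n_Ag + n_aG + n_ag
    p_AG = n_AG / n            # observed frequency of haplotype AG
    p_A = (n_AG + n_Ag) / n    # frequency of allele A at the first locus
    p_G = (n_AG + n_aG) / n    # frequency of allele G at the second locus
    return p_AG - p_A * p_G    # observed minus expected under independence


d = linkage_disequilibrium_d(n_AG=8, n_Ag=2, n_aG=0, n_ag=6)
print(f"D = {d}")
```

Here p(AG) = 8/16 and p(A)·p(G) = (10/16)(8/16), so D is positive and the loci are not in linkage equilibrium.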

  14. Haplotype counts from the figure (16 chromosomes): AG = 8, ag = 6, Ag = 2, aG = 0 (allele counts: A = 10, a = 6, G = 8, g = 8) • r² = (8×6 – 0×2)² / (10×6×8×8) = 0.6 • D’ = (8×6 – 0×2) / (8×6) = 1
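
The two results on this slide can be checked with a short Python sketch. It uses the standard definitions D’ = |D| / Dmax and r² = D² / (pA·pa·pG·pg), which the slide does not spell out, and haplotype counts tallied from the figure (8 AG, 2 Ag, 0 aG, 6 ag). The function name is illustrative.

```python
def ld_measures(n_AG: int, n_Ag: int, n_aG: int, n_ag: int):
    """Return (D, D', r^2) for two biallelic loci from haplotype counts."""
    n = n_AG + n_Ag + n_aG + n_ag
    p_A = (n_AG + n_Ag) / n
    p_G = (n_AG + n_aG) / n
    p_a, p_g = 1.0 - p_A, 1.0 - p_G

    d = n_AG / n - p_A * p_G

    # D_max is the largest |D| possible given the allele frequencies;
    # which pair of products bounds it depends on the sign of D.
    if d >= 0:
        d_max = min(p_A * p_g, p_a * p_G)
    else:
        d_max = min(p_A * p_G, p_a * p_g)

    denom = p_A * p_a * p_G * p_g
    # Both measures are undefined if either locus has only one allele.
    d_prime = abs(d) / d_max if d_max > 0 else float("nan")
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, d_prime, r2


d, d_prime, r2 = ld_measures(n_AG=8, n_Ag=2, n_aG=0, n_ag=6)
print(f"D = {d:.4f}, D' = {d_prime:.2f}, r^2 = {r2:.2f}")
```

With these counts the sketch gives D’ = 1 and r² = 0.6, matching the slide: one of the four haplotypes (aG) is absent, so D’ is at its maximum, but r² is below 1 because the allele frequencies at the two loci differ.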

  15. r² and power • r² is directly related to study power • A low r² means that a large sample size is required to detect the LD between the markers • r²*N is the “effective sample size” • If a marker M and causal gene G are in LD, then a study with N cases and controls that measures M (but not G) will have the same power to detect an association as a study with r²*N cases and controls that directly measured G

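  16. r² and power • Example: • N = 1000 (500 cases and 500 controls) genotyped at the marker • r² = 0.4 • Effective sample size = r²*N = 0.4 × 1000 = 400 • If you had genotyped the causal gene directly, you would have needed a total of only N = 400 (200 cases and 200 controls) for the same power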
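
The effective-sample-size rule in this example is a single multiplication, and it can be inverted to ask how many subjects a marker-based study needs. A minimal sketch, with illustrative function names:

```python
def effective_sample_size(n_marker_study: float, r2: float) -> float:
    """Sample size of a study genotyping the causal variant directly that has
    the same power as a marker-based study of n_marker_study subjects."""
    return r2 * n_marker_study


def required_marker_sample_size(n_direct_study: float, r2: float) -> float:
    """Subjects needed when genotyping the marker to match the power of a
    direct study of n_direct_study subjects (requires r2 > 0)."""
    if r2 <= 0:
        raise ValueError("r2 must be positive")
    return n_direct_study / r2


n_effective = effective_sample_size(1000, 0.4)
n_needed = required_marker_sample_size(400, 0.4)
print(f"effective N = {n_effective:.0f}, marker-study N needed = {n_needed:.0f}")
```

Read in the other direction, the slide's example says that a direct study of 400 subjects corresponds to a marker-based study of 1000 when r² = 0.4.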

  17. Today’s topics • Multiple comparisons • Measures of linkage disequilibrium • D’ and r² • r² and power
