This presentation covers the concept of Sequential Multiple Decision Procedures (SMDP) for analyzing PGRN mini-GAW data. Topics include the history of SMDP, Sequential Probability Ratio Test (SPRT), and the Haseman-Elston Method. The idea is to sequentially increase sample sizes, make decisions at each step, and plan experiments efficiently. The text also discusses independent testing, multiple decision tests, and the SNP-Drug interaction model. The SMDP approach aims to identify signals effectively while controlling statistical errors. Various statistical methods and strategies are outlined for analyzing genetic data with a focus on genetic regions and phenotypes.
Sequential Multiple Decision Procedures (SMDP) for PGRN mini-GAW Data
Q. Zhang and M.A. Province
Division of Biostatistics, Washington University School of Medicine
PGRN AWSII, Chicago, April 28, 2005
Brief History of SMDP
• Sequential Probability Ratio Test (SPRT): Wald, 1947
• SMDP: Bechhofer, Kiefer, Sobel, 1968
• SMDP Haseman-Elston Method (ASP): Province, 2000
Idea 1: Sequential
• Start from a small sample size (n0)
• Increase the sample size one observation at a time: n0, n0+1, n0+2, …, n0+i, …
• Use the sequential information: do the analysis at each step
• Stop when a conclusion is reached
• Plan the experiments for the next stage and save resources
• Use the residual/extra data for validation
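The sequential idea can be sketched as a simple loop. This is only an illustration of growing the sample from n0 and stopping early; the function name `sequential_analysis` and the `should_stop` callback are hypothetical, and the toy stopping rule in the usage lines is a placeholder, not the SMDP stopping rule from the slides.

```python
from typing import Callable, List, Sequence, Tuple


def sequential_analysis(
    data: Sequence[float],
    n0: int,
    should_stop: Callable[[Sequence[float]], bool],
) -> Tuple[int, List[float]]:
    """Grow the sample one observation at a time, starting from n0.

    Returns (n_used, holdout): the sample size at which the stopping rule
    fired and the unused observations, which can be kept for validation.
    If the rule never fires, all data are used and the holdout is empty.
    """
    n = n0
    while n <= len(data):
        # Re-run the analysis on the first n observations.
        if should_stop(data[:n]):
            return n, list(data[n:])
        n += 1
    return len(data), []


# Toy usage with invented data and a placeholder rule (mean >= 1.5).
toy_data = [0.0, 0.0, 3.0, 3.0, 3.0, 3.0]
n_used, holdout = sequential_analysis(
    toy_data, n0=2, should_stop=lambda s: sum(s) / len(s) >= 1.5
)
```

Passing the stopping rule in as a callback keeps the loop independent of the particular test statistic, so a real rule could be swapped in without changing the sampling logic.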
Idea 2: Multiple Decision
• Independent tests: SNP1 → Test 1, SNP2 → Test 2, SNP3 → Test 3, …, SNPn → Test n
• Multiple decision: SNP1, SNP2, SNP3, …, SNPn are evaluated together in a single procedure
• SMDP divides populations into two groups and guarantees that there is a real difference (D) between the two groups with probability P*.
• Tradeoff between false positive rate and false negative rate.
Model: Regression Model
Y = α + βX + ε
• X is either the SNP main effect or the SNP x Drug/Treatment interaction (SNP x ALCOHOL or SNP x IVY)
• SNP genotype coding: 11 => 0, 12 => 1, 22 => 2
• Phenotypes (Y): NPUBS, NDRIVEL, PCTDRIVEL, RIVALSIDE, GOTGRANTS
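A minimal sketch of the regression model Y = α + βX + ε, using the genotype coding from the slide (11 => 0, 12 => 1, 22 => 2). The helper names, the toy phenotype values, and the choice to build the interaction term as the product of the coded genotype and the drug variable are assumptions made for illustration; the slides do not specify how the interaction term was constructed.

```python
from typing import List, Sequence, Tuple

# Genotype coding taken from the slide.
GENOTYPE_CODE = {"11": 0, "12": 1, "22": 2}


def code_genotypes(genotypes: Sequence[str]) -> List[int]:
    """Map genotype labels to 0/1/2."""
    return [GENOTYPE_CODE[g] for g in genotypes]


def interaction_term(coded: Sequence[float], drug: Sequence[float]) -> List[float]:
    """Assumed form of the SNP x Drug term: element-wise product."""
    return [c * d for c, d in zip(coded, drug)]


def fit_simple_regression(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for Y = alpha + beta * X; returns (alpha, beta)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx  # assumes X is not constant
    alpha = mean_y - beta * mean_x
    return alpha, beta


# Toy usage with invented values (not PGRN mini-GAW data).
genotypes = ["11", "12", "22", "12"]
phenotype = [1.0, 3.0, 5.0, 3.0]
x_main = code_genotypes(genotypes)
alpha, beta = fit_simple_regression(x_main, phenotype)
```

For an interaction analysis under the same assumption, `interaction_term(x_main, drug)` would replace `x_main` as the predictor, where `drug` is a hypothetical list of ALCOHOL or IVY values.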
Methods
Fixed sample: Regression (entire data)
• Regular P value
• Bonferroni corrected P value
• FDR corrected P value
SMDP (Sequential)
• Sample sizes n0, n0+1, n0+2, …, n0+h
• Q[h]1, Q[h]2, Q[h]3, Q[h]4, Q[h]5, …: sequential sum of squares of regression errors
• Stopping rule
• P: probability that a real difference (D) between two groups exists
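The two corrections used in the fixed-sample analysis can be sketched as follows. The slides do not say which FDR procedure was used; the Benjamini-Hochberg step-up adjustment shown here is an assumption, chosen because it is a common default. The function names and the example p-values are invented for illustration.

```python
from typing import List, Sequence


def bonferroni(pvals: Sequence[float]) -> List[float]:
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    m = len(pvals)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy usage with invented p-values.
pvals = [0.01, 0.04, 0.03, 0.005]
p_bonf = bonferroni(pvals)
p_fdr = benjamini_hochberg(pvals)
```

Both functions return adjusted values in the original order of the input, so each can be compared directly against a chosen significance threshold per SNP.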
SMDP Summary
• Tests and identifies all signals simultaneously: no multiple comparisons
• Tight control of statistical errors (Type I, II)
• Efficient: uses sequential information, with the “minimal” N needed to find significant signals
• Reliable: saves the rest of N for validation
Fixed sample regression: Region 2 (28 SNPs, IVY-SNP), RIVALSIDE
Fixed sample regression: Region 1 (74 SNPs, ALCOHOL-SNP), PCTDRIVEL
Fixed sample regression of 5 phenotypes on genotypes of 188 SNPs