
Exploring complex diseases using genome-wide association: challenges and strategies

Li Jin, Ph.D., Fudan University; CAS-MPG Partner Institute for Computational Biology. Presented at HGM2006, Helsinki.


Presentation Transcript


  1. Exploring complex diseases using genome-wide association: challenges and strategies. Li Jin, Ph.D., Fudan University; CAS-MPG Partner Institute for Computational Biology. HGM2006, Helsinki

  2. Positional Cloning. Codons shown: AGC (Ser), GGC (Gly)

  3. Linkage Disequilibrium; Linkage

  4. Daly et al., Nature Genetics, 2001

  5. Genome-wide Association Study; Candidate Gene/Region Association Study
     • Select tagSNPs
     • Genotyping tagSNPs
     • Association analysis
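
The last step on this slide, association analysis, is commonly carried out one SNP at a time with a chi-square test on allele counts. The sketch below is a minimal illustration under that assumption; the function name and the example counts are hypothetical and are not taken from the talk.

```python
import math


def allelic_chi_square(case_minor, case_major, ctrl_minor, ctrl_major):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Rows are cases and controls, columns are minor and major alleles.
    Assumes every row and column total is non-zero.
    """
    table = [[case_minor, case_major], [ctrl_minor, ctrl_major]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts: 60/40 minor/major alleles in cases, 40/60 in controls.
    print(allelic_chi_square(60, 40, 40, 60))
```

In a genome-wide setting this test would be repeated for every genotyped tagSNP, which is what creates the multiple-testing burden discussed later in the talk.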

  6. Challenges
     • Adjustment for multiple testing and power
     • Portability of tagging SNPs between populations
     • Population stratification
     • Mapping the mutation
     • Exploring gene-gene interaction

  7. Challenges
     • Adjustment for multiple testing and power
     • Portability of tagging SNPs between populations
     • Population stratification
     • Mapping the mutation
     • Exploring gene-gene interaction

  8. Multiple Testing
     • Large number of SNPs
     • Number of tagging SNPs remains large (10⁶)
     • Multiple testing problem:
       • Stringent p-value (10⁻⁶–10⁻⁷)
       • Freimer and Sabatti (2004)
     • Sample size and power
     • Association:
       • Linear transformation: T is invariant
       • Nonlinear transformation
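
To make the scale of the multiple-testing problem concrete, the sketch below computes a Bonferroni-adjusted per-test threshold for several numbers of tests. Bonferroni is used here only as the simplest familiar adjustment; the slide does not say which correction underlies its 10⁻⁶–10⁻⁷ range, and the family-wise level of 0.05 and the function name are assumptions.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


if __name__ == "__main__":
    # Assumed family-wise level; the numbers of tests are illustrative.
    for n_tests in (10**5, 5 * 10**5, 10**6):
        print(n_tests, bonferroni_threshold(0.05, n_tests))
```

With 10⁶ tests at a family-wise level of 0.05, the per-test threshold is 5 × 10⁻⁸, which is why sample size and power become the limiting factors.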

  9. Motivation Statistics based on Statistics based on Low Power Higher Power?

  10. Nonlinear Transformations (table columns: Function, Derivative). Functions listed: Entropy, Exponential, Polynomial, Sigmoid, Gaussian, Reciprocal
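
The formulas on this slide did not survive extraction, so the following is only a sketch of how a nonlinear transformation of allele frequency can be turned into a case-control statistic using the delta method. The form of the statistic, the specific function definitions (for example, entropy taken as -p ln p), and all names are assumptions for illustration and are not the speaker's definitions.

```python
import math

# Assumed transformation forms: each entry is (f, f'), both functions of an
# allele frequency p with 0 < p < 1.
TRANSFORMS = {
    "identity": (lambda p: p, lambda p: 1.0),
    "entropy": (lambda p: -p * math.log(p), lambda p: -math.log(p) - 1.0),
    "exponential": (lambda p: math.exp(p), lambda p: math.exp(p)),
    "reciprocal": (lambda p: 1.0 / p, lambda p: -1.0 / p**2),
}


def transformed_statistic(p_case, p_ctrl, n_case, n_ctrl, name):
    """Squared difference of f(p) between groups over its delta-method variance.

    n_case and n_ctrl are numbers of individuals (2n chromosomes each).
    Undefined when both derivatives are zero (e.g. entropy at p = 1/e).
    """
    f, f_prime = TRANSFORMS[name]
    var_case = p_case * (1.0 - p_case) / (2 * n_case)
    var_ctrl = p_ctrl * (1.0 - p_ctrl) / (2 * n_ctrl)
    # Delta method: Var[f(p_hat)] is approximately f'(p)^2 * Var[p_hat].
    denom = f_prime(p_case) ** 2 * var_case + f_prime(p_ctrl) ** 2 * var_ctrl
    return (f(p_case) - f(p_ctrl)) ** 2 / denom


if __name__ == "__main__":
    # Hypothetical allele frequencies and sample sizes.
    for name in TRANSFORMS:
        print(name, transformed_statistic(0.6, 0.4, 100, 100, name))
```

With the identity function the statistic reduces to the ordinary squared difference of allele frequencies over its variance, which gives a baseline against which the nonlinear choices can be compared.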

  11. Power (Case-Control): Expected noncentrality parameters of the nonlinear test statistics (NA = NG = 100, PD = 0.5)

  12. Association Studies: Association test of MMP-2 gene with esophageal carcinoma (Yu C, et al. Cancer Res 2004, 64: 7622-7628)
     P values:
     • entropy: 3.2 × 10⁻⁸
     • exponential: 2.3 × 10⁻⁷
     • polynomial: 1.9 × 10⁻⁷
     • sigmoid: 2.0 × 10⁻⁷
     • reciprocal: 5.1 × 10⁻⁶
     • χ²: 7.0 × 10⁻⁶

  13. Challenges
     • Adjustment for multiple testing and power
     • Portability of tagging SNPs between populations
     • Population stratification
     • Mapping the mutation
     • Exploring gene-gene interaction

  14. How are LD patterns compared between populations? (figure labels: Pop A, Pop B, Target SNP)
     • Step 1: Infer haplotype blocks for each population
     • Step 2: Compare the boundaries of LD blocks between populations


  16. Factors Influencing Block Inferences
     • Sample size
     • Criterion and thresholds
     • Genotyping error
     • Gene flow
     • Search algorithm

  17. Af As Eu ? Daic (Thai)

  18. Samples: European 40; Uyghur 45; Hmong 46; Han 50; African American 48; Wa 45; Zhuang 44; Samoan 50

  19. SNP Selection and Genotyping
     • Selected from dbSNP (build 117)
     • Most of them are double-hits
     • 26,112 SNPs on chromosome 21
     • 1 SNP for every 1.3 kb (Golden Path b.34)
     • Illumina BeadLab platform
     • 17 oligonucleotide primer sets
     • Three QA criteria
       • Samples
       • SNP: trios & duplicates
       • SNP: Hardy-Weinberg Expectation
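
The third QA criterion, agreement with Hardy-Weinberg expectation, is usually checked per SNP from genotype counts. The sketch below uses a plain chi-square goodness-of-fit test with 1 degree of freedom; the slide does not state which test or cutoff was applied, so the test choice, the function name, and the example counts are assumptions.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of genotype counts against Hardy-Weinberg proportions.

    Assumes both alleles are observed, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical genotype counts with a strong heterozygote deficit.
    print(hwe_chi_square(40, 20, 40))
```

A SNP with a very small p-value here would typically be flagged for possible genotyping error before any association analysis.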

  20. Zhuang, Han, Wa, Hmong, African American, Samoan, Uyghur, European

  21. Phylogeny of Human Populations, by Genetic Distance (FST). Tree labels: Hmong, Zhuang, Han, Wa, Uyghur, European, African

  22. Measurement of LD Sharing (figure labels: Pop A, Pop B, 200 kb, Target SNP)
     • SNPs present in both Pop A & Pop B
     • SNPs with MAF ≥ 0.1 were included
     • In LD, if r² ≥ c (c = 0.1, 0.5, 0.8)
     • a = # LD in A; b = # LD in B; c = # LD in A & B
     • SAB = c/a; SBA = c/b
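
The sharing measures on this slide can be sketched as follows. The reading used here is an assumption: for one target SNP, a and b count the neighbouring SNPs in LD with it in Pop A and Pop B, and c counts those in LD in both. The function names, the dictionary layout, and the example values are hypothetical.

```python
def r_squared(p_ab, p_a, p_b):
    """r^2 between two biallelic SNPs from haplotype and allele frequencies."""
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


def ld_sharing(r2_pop_a, r2_pop_b, threshold):
    """Return (a, b, c, SAB, SBA) for one target SNP.

    r2_pop_a and r2_pop_b map a neighbouring SNP id to its r^2 with the
    target SNP in each population. Only SNPs present in both are used.
    """
    shared = set(r2_pop_a) & set(r2_pop_b)
    in_ld_a = {s for s in shared if r2_pop_a[s] >= threshold}
    in_ld_b = {s for s in shared if r2_pop_b[s] >= threshold}
    a, b, c = len(in_ld_a), len(in_ld_b), len(in_ld_a & in_ld_b)
    s_ab = c / a if a else float("nan")
    s_ba = c / b if b else float("nan")
    return a, b, c, s_ab, s_ba


if __name__ == "__main__":
    # Hypothetical r^2 values around one target SNP.
    pop_a = {"s1": 0.9, "s2": 0.6, "s3": 0.2, "s4": 0.7}
    pop_b = {"s1": 0.8, "s2": 0.3, "s3": 0.55, "s4": 0.9, "s5": 0.9}
    print(ld_sharing(pop_a, pop_b, threshold=0.5))
```

Because SAB and SBA use different denominators, the measure is not symmetric: a population with fewer LD relationships overall can share most of them with one that has many, but not the reverse.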

  23. SAB ~ FST In non-Africans FST increases with time after divergence (t)

  24. Correlation of LD between Populations = corr(a, b) (figure labels: Pop A, Pop B, 200 kb, Target SNP)
     • a = # LD in A; b = # LD in B; c = # LD in A & B
     • SAB = c/a; SBA = c/b
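
A minimal sketch of corr(a, b), assuming it denotes the Pearson correlation, taken across target SNPs, between the per-target LD counts a (in Pop A) and b (in Pop B). The slide does not specify the correlation coefficient, so that choice and the example counts are assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical LD counts for five target SNPs in two populations.
    a_counts = [12, 5, 9, 20, 3]
    b_counts = [10, 6, 7, 18, 4]
    print(pearson(a_counts, b_counts))
```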

  25. Correlation of LD Between Populations and Genetic Distance (FST)

  26. Portability of tagging SNPs (RAB) (figure labels: Pop A, Pop B)
     • RAB = (Number of SNPs captured by tagSNPs) / (Total number of SNPs)
     • Portability from A to B = RAB
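
The ratio RAB can be sketched as below, assuming that tagSNPs are chosen in Pop A and that a SNP in Pop B counts as captured when it is itself a tag or has r² at or above a threshold with at least one tag in Pop B. The 0.8 default, the function name, and the data layout are assumptions; the slide gives only the ratio.

```python
def portability(tag_snps, all_snps, r2_in_b, threshold=0.8):
    """Fraction of SNPs in Pop B captured by tagSNPs selected in Pop A.

    r2_in_b maps frozenset({snp_i, snp_j}) to the pairwise r^2 in Pop B.
    Pairs missing from the mapping are treated as r^2 = 0.
    """
    tags = set(tag_snps)
    captured = 0
    for snp in all_snps:
        if snp in tags:
            captured += 1  # a tag trivially captures itself
        elif any(r2_in_b.get(frozenset((snp, tag)), 0.0) >= threshold for tag in tags):
            captured += 1
    return captured / len(all_snps)


if __name__ == "__main__":
    # Hypothetical example: one tag SNP and four SNPs typed in Pop B.
    snps = ["s1", "s2", "s3", "s4"]
    r2_b = {
        frozenset(("s1", "s2")): 0.9,
        frozenset(("s1", "s3")): 0.4,
        frozenset(("s3", "s4")): 0.95,
    }
    print(portability({"s1"}, snps, r2_b))
```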

  27. R can be estimated using FST
     • 1 − RAB ~ FST
     • FST can be estimated using a small number of SNPs
     • Conclusion: R can be approximately estimated by typing a small number of SNPs
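
As a rough illustration of this conclusion, the sketch below estimates FST from allele frequencies at a handful of SNPs and converts it into an approximate portability. The estimator (a heterozygosity-based ratio with equal population weights) is one common choice and not necessarily the one used in the talk, and reading "~" as rough equality, so that R is about 1 − FST, is an assumption.

```python
def fst(freqs_a, freqs_b):
    """Heterozygosity-based FST for two populations, pooled over biallelic SNPs.

    freqs_a and freqs_b hold the frequency of the same allele at each SNP.
    Assumes at least one SNP is polymorphic in the pooled sample.
    """
    numerator = 0.0
    denominator = 0.0
    for p_a, p_b in zip(freqs_a, freqs_b):
        p_bar = (p_a + p_b) / 2.0
        h_total = 2.0 * p_bar * (1.0 - p_bar)
        h_within = (2.0 * p_a * (1.0 - p_a) + 2.0 * p_b * (1.0 - p_b)) / 2.0
        numerator += h_total - h_within
        denominator += h_total
    return numerator / denominator


def approximate_portability(freqs_a, freqs_b):
    """Rough portability estimate, assuming 1 - R is about equal to FST."""
    return 1.0 - fst(freqs_a, freqs_b)


if __name__ == "__main__":
    # Hypothetical allele frequencies at four SNPs in two populations.
    pop_a = [0.10, 0.45, 0.30, 0.80]
    pop_b = [0.15, 0.40, 0.35, 0.70]
    print(fst(pop_a, pop_b), approximate_portability(pop_a, pop_b))
```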

  28. t RAB FST HGM2006, Helsinki

  29. Conclusions
     • Substantial LD sharing between populations: ancestral LDs
     • tagSNPs are generally portable between populations, at least within Asia
     • Portability from one population to another can be estimated empirically using a small set of SNPs

  30. Challenges
     • Adjustment for multiple testing and power
     • Portability of tagging SNPs between populations
     • Population stratification
     • Mapping the mutation
     • Exploring gene-gene interaction

  31. Population Stratification
     • 209 languages belonging to 6 linguistic families
     • Consistent observation of south-north differentiation
     • Affects the power of association studies: false positives
     • Different loci show different levels of differentiation: Is there an adequate adjustment?

  32. Individual tree: Chromosome 21, 20,288 SNPs

  33. Cluster Decomposition of Chinese Populations

  34. Geographic Genetic Clines Based on Principal Components
     • Y chromosomes: 143 populations
     • mtDNA: 91 populations
     • CODIS STRs: 79 populations
     • HLA-A: 107 populations

  35. Distributions of mtDNA Haplogroups

  36. Distributions of Y Haplogroups

  37. All haplogroups; Major haplogroups; All haplogroups

  38. Uyghurs

  39. Uyghurs

  40. Population Stratification
     • Different loci show different levels of differentiation
     • Admixture does exist in at least some of the populations
     • Adjustment for population stratification using average differentiation is not adequate
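
One widely used correction of this average-based kind is genomic control, which rescales every test statistic by a single genome-wide inflation factor. The talk does not name the method it has in mind, so using genomic control as the example is an assumption, and the function and variable names below are hypothetical.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(chi2_values):
    """Return (lambda, adjusted statistics) using the median-based inflation factor.

    Lambda is floored at 1 so that statistics are never inflated.
    """
    lam = max(1.0, statistics.median(chi2_values) / CHI2_1DF_MEDIAN)
    return lam, [value / lam for value in chi2_values]


if __name__ == "__main__":
    # Hypothetical 1-df chi-square statistics from a stratified sample.
    observed = [0.3, 0.9, 1.4, 2.2, 6.5, 0.7, 1.1]
    print(genomic_control(observed))
```

Because one factor is applied to every locus, SNPs that are more differentiated between subpopulations than the genome-wide average remain under-corrected, which is the concern raised on this slide.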

  41. Challenges
     • Adjustment for multiple testing and power
     • Portability of tagging SNPs between populations
     • Population stratification
     • Mapping the mutation
     • Exploring gene-gene interaction

  42. Perfect Phylogeny Approach
     • No recombination or recurrent mutation
     • No loop in network
     • Not necessarily continuous
     • Objective: Group SNPs into PP sets: PP(A), PP(B), PP(C)

  43. Inference of Phylogeny (figure labels: 1 2 4 3 1 2 3 4 5)
     • site 1: (1, 2, 3) | (4, 5)
     • site 2: (2, 3) | (1, 4, 5)
     • site 3: (1, 2, 3, 5) | (4)
     • site 4: (2) | (1, 3, 4, 5)
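
Whether a group of SNPs can sit on one perfect phylogeny can be checked with the pairwise compatibility (four-gamete) condition: two binary sites are compatible when at least one of the four intersections of their sides is empty, and a set of binary sites fits a single tree exactly when every pair is compatible. The sketch below applies that check to the four sites listed on this slide. It illustrates the general condition and is not the speaker's algorithm; the function names are hypothetical.

```python
from itertools import combinations

SAMPLES = frozenset({1, 2, 3, 4, 5})

# One side of each bipartition from the slide; the other side is the complement.
SITES = {
    "site 1": frozenset({1, 2, 3}),
    "site 2": frozenset({2, 3}),
    "site 3": frozenset({1, 2, 3, 5}),
    "site 4": frozenset({2}),
}


def compatible(side_x, side_y, samples=SAMPLES):
    """True if the two bipartitions can be placed on the same tree."""
    sides_x = (side_x, samples - side_x)
    sides_y = (side_y, samples - side_y)
    # Compatible when at least one of the four intersections is empty.
    return any(not (x & y) for x in sides_x for y in sides_y)


def is_perfect_phylogeny_set(sites, samples=SAMPLES):
    """True if every pair of sites in the mapping is compatible."""
    return all(
        compatible(sites[a], sites[b], samples) for a, b in combinations(sites, 2)
    )


if __name__ == "__main__":
    print(is_perfect_phylogeny_set(SITES))
```

A site that splits the samples as (1, 4) versus (2, 3, 5) would conflict with site 1, since all four intersections are then non-empty; such a site would have to go into a different PP set.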

  44. Comparison of Different Algorithms

  45. Inference of Phylogeny (figure labels: 1 2 4 3 1 2 3 4 5)
     • site 1: (1, 2, 3) | (4, 5)
     • site 2: (2, 3) | (1, 4, 5)
     • site 3: (1, 2, 3, 5) | (4)
     • site 4: (2) | (1, 3, 4, 5)

  46. Identification of Disease Mutation (figure labels: PP(A), PP(B), PP(C))
     • Each PP allows a stepwise search to localize the most likely branch (edge) of the mutation.
     • The best PP can be determined based on the likelihood (with adjustment for degrees of freedom)

  47. Challenges
     • Adjustment for multiple testing and power
     • Portability of tagging SNPs between populations
     • Population stratification
     • Mapping the mutation
     • Exploring gene-gene interaction

  48. A Study of CAD
     • Coronary Atherosclerosis in Chinese Populations
     • 123 candidate genes belonging to several pathways, including antioxidant, inflammation, and coagulation
     • 1,518 tagSNPs typed
     • 916 samples (492 cases and 424 controls)
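
To give a sense of the search space for gene-gene interaction in a study of this size, the arithmetic below counts the possible pairwise tests from the figures on this slide. The exhaustive pairwise design and the Bonferroni-style threshold at a family-wise level of 0.05 are assumptions for illustration; the slide does not state how interactions were tested.

```python
from math import comb

# Figures taken from the slide.
N_GENES = 123
N_TAG_SNPS = 1518
N_CASES = 492
N_CONTROLS = 424

# Number of unordered pairs if every pair were tested (assumed design).
snp_pairs = comb(N_TAG_SNPS, 2)
gene_pairs = comb(N_GENES, 2)

# Assumed family-wise level for a Bonferroni-style per-test threshold.
ALPHA = 0.05
snp_pair_threshold = ALPHA / snp_pairs

if __name__ == "__main__":
    print("samples:", N_CASES + N_CONTROLS)
    print("SNP pairs:", snp_pairs)
    print("gene pairs:", gene_pairs)
    print("per-test threshold for SNP pairs:", snp_pair_threshold)
```

The 1,518 tagSNPs give 1,151,403 SNP pairs and the 123 genes give 7,503 gene pairs, so even a candidate-gene study faces a multiple-testing burden comparable to a genome-wide scan once interactions are considered.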


  50. PON2 GPX3 CD36 MMP8 PON1 SOD2 ACE PON3 DSCR1 ITGA2 PDGFC TXN TGFB3 MSR1 ITGB1 PDGFB SELL CCR2 ITGA6 NFKB1 VEGF NPR3 LAMA4 IL1B MMP9 EDN1 SELE Anti-oxidation Pathway TXN GCLM GSR HMOX1 Inflammatory Pathway MMP9 With-PW interaction GSS NOS3 Between-PW interaction
