Association analysis

Association analysis. Shaun Purcell Boulder Twin Workshop 2004. Overview. Candidate gene association Haplotypes and linkage disequilibrium Linkage and association Family-based association. What is association?. Categorical traits disease susceptibility genes Continuous traits

Presentation Transcript


  1. Association analysis Shaun Purcell Boulder Twin Workshop 2004

  2. Overview • Candidate gene association • Haplotypes and linkage disequilibrium • Linkage and association • Family-based association

  3. What is association? • Categorical traits • disease susceptibility genes • Continuous traits • quantitative trait loci, QTL

  4. Disease traits
             Case   Control
       AA    n1     n2
       Aa    n3     n4
       aa    n5     n6
     Is there a difference in allele/genotype frequency between cases and controls?

  5. Disease traits
             Case   Control   Genotype frequency
       AA    30     25        p²
       Aa    50     50        2p(1-p)
       aa    20     25        (1-p)²
     Is there a difference in allele/genotype frequency between cases and controls?
     Test for independence: χ², p-value
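
The test for independence on this table can be sketched in a few lines. This is a minimal illustration, assuming a Pearson chi-square test on the 3 x 2 genotype-by-status table; the function name `chi_square_independence` is hypothetical and not part of the original slides.

```python
import math

# Genotype counts from the slide: rows are AA, Aa, aa; columns are (cases, controls).
counts = {"AA": (30, 25), "Aa": (50, 50), "aa": (20, 25)}

def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x 2 table."""
    rows = list(table.values())
    col_totals = [sum(row[j] for row in rows) for j in range(2)]
    grand_total = sum(col_totals)
    stat = 0.0
    for row in rows:
        row_total = sum(row)
        for j, observed in enumerate(row):
            expected = row_total * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(rows) - 1) * (2 - 1)
    return stat, df

chi2, df = chi_square_independence(counts)
# For exactly 2 df the chi-square upper-tail probability is exp(-x/2).
p_value = math.exp(-chi2 / 2)
print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
```

With these counts the statistic is small relative to 2 df, so the genotype test gives no evidence of a frequency difference in this illustrative table.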

  6. Disease traits
     • Additive model (1 df)
     • Dominant model for A (1 df)
     • General model (2 df)
     Effect sizes calculated as odds ratios
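
As a rough illustration of effect sizes as odds ratios, the sketch below reuses the 30/50/20 versus 25/50/25 counts from the previous slide. The helper `odds_ratio` and the allele-counting comparison used for the additive-style contrast are assumptions made for this example, not the slides' own implementation.

```python
# Genotype counts (cases, controls) taken from the disease-trait table.
counts = {"AA": (30, 25), "Aa": (50, 50), "aa": (20, 25)}

def odds_ratio(exposed, reference):
    """Odds of being a case in `exposed` divided by the odds in `reference`.

    Each argument is a (cases, controls) pair.
    """
    (case_e, ctrl_e), (case_r, ctrl_r) = exposed, reference
    return (case_e * ctrl_r) / (ctrl_e * case_r)

# General model (2 df): one odds ratio per non-reference genotype.
or_AA = odds_ratio(counts["AA"], counts["aa"])
or_Aa = odds_ratio(counts["Aa"], counts["aa"])

# Dominant model for A (1 df): carriers of A (AA or Aa) versus aa.
carriers = tuple(x + y for x, y in zip(counts["AA"], counts["Aa"]))
or_dominant = odds_ratio(carriers, counts["aa"])

# Allele-based comparison (1 df): count A and a alleles in cases and controls.
alleles_A = tuple(2 * counts["AA"][j] + counts["Aa"][j] for j in range(2))
alleles_a = tuple(2 * counts["aa"][j] + counts["Aa"][j] for j in range(2))
or_allelic = odds_ratio(alleles_A, alleles_a)

print(or_AA, or_Aa, or_dominant, or_allelic)
```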

  7. Quantitative traits
     [Figure: trait distributions by genotype aa, Aa, AA]
       ID    Y      G    A    D
       001   0.34   aa   -1   0
       002   1.23   Aa    0   1
       003   1.66   Aa    0   1
       004   2.74   AA    1   0
       005   1.33   AA    1   0
       …     …      …    …    …
     Y = aA + dD + e
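
The regression on this slide can be illustrated with the five example rows. This is a sketch only: it adds an intercept `mu`, which the slide's equation does not show, and relies on the fact that with three genotype groups and three parameters the least-squares fit reproduces the genotype means. The function name is hypothetical.

```python
from statistics import mean

# The five example rows from the slide: (trait Y, genotype G).
data = [(0.34, "aa"), (1.23, "Aa"), (1.66, "Aa"), (2.74, "AA"), (1.33, "AA")]

# Additive (A) and dominance (D) codes as given on the slide.
CODES = {"aa": (-1, 0), "Aa": (0, 1), "AA": (1, 0)}

def fit_additive_dominance(rows):
    """Least-squares estimates for Y = mu + a*A + d*D + e.

    The intercept mu is an assumption added for this sketch. The model is
    saturated, so the estimates follow directly from the genotype means.
    All three genotypes must be present in `rows`.
    """
    means = {g: mean(y for y, geno in rows if geno == g) for g in CODES}
    mu = (means["AA"] + means["aa"]) / 2   # midpoint of the two homozygotes
    a = (means["AA"] - means["aa"]) / 2    # additive effect
    d = means["Aa"] - mu                   # dominance deviation
    return mu, a, d

mu, a, d = fit_additive_dominance(data)
print(f"mu = {mu:.4f}, a = {a:.4f}, d = {d:.4f}")
```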

  8. Some web resources
     • BGIM: http://statgen.iop.kcl.ac.uk/bgim/ (introductory tutorials on twin analysis, primer on maximum likelihood, Mx language)
     • GxE moderator models: http://statgen.iop.kcl.ac.uk/gxe/
     • Power calculation: http://statgen.iop.kcl.ac.uk/gpc/
     • Case/control association tools: http://statgen.iop.kcl.ac.uk/gpc/model/

  9. Relative risk P(D|AA) / P(D|aa) labelled RR(AA) P(D|Aa) / P(D|aa) labelled RR(Aa)
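
To make the relative-risk definitions concrete, the sketch below converts a disease prevalence, an allele frequency and the two relative risks into genotype-specific disease probabilities. It assumes Hardy-Weinberg genotype frequencies; the function name and the numeric inputs are hypothetical values chosen for illustration.

```python
def penetrances(prevalence, freq_A, rr_Aa, rr_AA):
    """Return P(D|aa), P(D|Aa), P(D|AA).

    Assumes Hardy-Weinberg proportions, with RR(Aa) and RR(AA) both
    defined relative to the aa genotype as on the slide.
    """
    p, q = freq_A, 1 - freq_A
    # Prevalence K = P(D|aa) * (q^2 + 2pq*RR(Aa) + p^2*RR(AA)).
    f_aa = prevalence / (q * q + 2 * p * q * rr_Aa + p * p * rr_AA)
    return f_aa, f_aa * rr_Aa, f_aa * rr_AA

# Hypothetical inputs: 5% prevalence, allele frequency 0.2, multiplicative risks.
f_aa, f_Aa, f_AA = penetrances(0.05, 0.2, 2.0, 4.0)
print(f_aa, f_Aa, f_AA)
```

Averaging the three penetrances over the genotype frequencies recovers the prevalence, which is a quick sanity check on the inputs.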

  10. Genetic models

  11. Tests

  12. Multiple samples • Constrain frequencies across samples • Constrain effects across samples • Can test genetic models with effects and/or frequencies constrained to be equal • Can perform tests of homogeneity of effects and/or frequencies across samples

  13. An example: 2 case/control samples • Population frequency 5%

  14. Homogeneous effects across samples; homogeneous allele frequencies across samples
               Sample 1                  Sample 2
       Model   p      RR(Aa)  RR(AA)    p      RR(Aa)  RR(AA)    -2LL
       -----   -----  ------  ------    -----  ------  ------    -------
       Gen     0.367  1.979   3.663     0.367  1.979   3.663     793.143
       Mult    0.367  1.911   3.651     0.367  1.911   3.651     793.199
       Dom     0.401  1.990   1.990     0.401  1.990   1.990     802.927
       Rec     0.405  1.000   1.921     0.405  1.000   1.921     805.064
       None    0.442  1.000   1.000     0.442  1.000   1.000     815.628

  15. Heterogeneous effects across samples; homogeneous allele frequencies across samples
               Sample 1                  Sample 2
       Model   p      RR(Aa)  RR(AA)    p      RR(Aa)  RR(AA)    -2LL
       -----   -----  ------  ------    -----  ------  ------    -------
       Gen     0.367  1.235   2.136     0.367  2.890   5.547     786.498
       Mult    0.367  1.440   2.073     0.367  2.282   5.208     788.262
       Dom     0.401  1.216   1.216     0.401  2.936   2.936     796.422
       Rec     0.405  1.000   1.519     0.405  1.000   2.195     803.849
       None    0.443  1.000   1.000     0.443  1.000   1.000     815.628

  16. TESTS OF GENETIC MODELS -- ASSUMING EQ EFFECTS & EQ FREQS
      =========================================================
       Gen  vs None (2 df) : 22.485   p = 0.000
       Mult vs None (1 df) : 22.429   p = 0.000
       Dom  vs None (1 df) : 12.701   p = 0.000
       Rec  vs None (1 df) : 10.564   p = 0.001
       Gen  vs Mult (1 df) :  0.056   p = 0.813
       Gen  vs Dom  (1 df) :  9.784   p = 0.002
       Gen  vs Rec  (1 df) : 11.921   p = 0.001

      TESTS OF GENETIC MODELS -- ASSUMING UNEQ EFFECTS & EQ FREQS
      ===========================================================
       Gen  vs None (4 df) : 29.130   p = 0.000
       Mult vs None (2 df) : 27.366   p = 0.000
       Dom  vs None (2 df) : 19.205   p = 0.000
       Rec  vs None (2 df) : 11.779   p = 0.003
       Gen  vs Mult (2 df) :  1.764   p = 0.414
       Gen  vs Dom  (2 df) :  9.925   p = 0.007
       Gen  vs Rec  (2 df) : 17.351   p = 0.000

      TESTS OF EQUAL EFFECTS -- ASSUMING EQ FREQS
      ===========================================
       w/ Gen model  (2 df) : 6.645   p = 0.036
       w/ Mult model (1 df) : 4.938   p = 0.026
       w/ Dom model  (1 df) : 6.505   p = 0.011
       w/ Rec model  (1 df) : 1.215   p = 0.270
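
Each test statistic above is the difference in -2LL between two nested models, referred to a chi-square distribution. The sketch below reproduces a few of them from the -2LL values listed for the equal-effects, equal-frequencies case. The helper names `chi2_sf` and `lrt` are hypothetical, and the tail-probability function only covers 1 df and even df, which is all these tables need.

```python
import math

# -2 log-likelihoods from the homogeneous-effects, homogeneous-frequencies table.
minus2ll = {"Gen": 793.143, "Mult": 793.199, "Dom": 802.927,
            "Rec": 805.064, "None": 815.628}

def chi2_sf(x, df):
    """Upper-tail chi-square probability for df = 1 or any positive even df."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df > 0 and df % 2 == 0:
        half = x / 2
        series = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * series
    raise ValueError("this sketch only handles df = 1 or positive even df")

def lrt(full, reduced, df):
    """Likelihood ratio statistic and p-value for two nested models."""
    stat = minus2ll[reduced] - minus2ll[full]
    return stat, chi2_sf(stat, df)

for full, reduced, df in [("Gen", "None", 2), ("Mult", "None", 1),
                          ("Gen", "Mult", 1), ("Gen", "Dom", 1)]:
    stat, p = lrt(full, reduced, df)
    print(f"{full} vs {reduced} ({df} df): {stat:.3f}  p = {p:.3f}")
```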

  17. Indirect association Genotyped markers QTL Ungenotyped markers

  18. Recombination Homologous chromosomes in one parent Paternal chromosome Maternal chromosome Recombination event during meiosis Recombinant gamete transmitted, harboring mutation

  19. Recombination Homologous chromosomes in one parent Paternal chromosome Maternal chromosome No recombination event during meiosis Nonrecombinant gamete transmitted, not harboring mutation

  20. Linkage: affected sib pairs Paternal chromosome Maternal chromosome First affected offspring, no recombination Second affected offspring, recombinant gamete IBD sharing from this one parent (0 or 1) 1 0

  21. Association analysis • Mutation occurs on a ‘red’ chromosome

  22. Association analysis • Mutation occurs on a ‘red’ chromosome

  23. Association analysis • Association due to ‘linkage disequilibrium’

  24. Haplotypes
            A     a
       M    AM    aM
       m    Am    am
     This individual has aa and Mm genotypes and am and aM haplotypes

  25. Haplotypes
            A     a
       M    AM    aM
       m    Am    am
     This individual has Aa and Mm genotypes and AM and am haplotypes … but given only genotype data, consistent with Am/aM as well as AM/am

  26. Haplotypes
            A     a
       M    AM    aM
       m    Am    am
     This individual has AA and Mm genotypes and AM and Am haplotypes
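
The phase ambiguity described in the three haplotype slides can be made explicit by enumerating every haplotype pair consistent with a pair of unphased genotypes. This is a small sketch; the function name is hypothetical.

```python
from itertools import product

def compatible_haplotype_pairs(geno1, geno2):
    """All unordered haplotype pairs consistent with two unphased genotypes.

    Each genotype is a two-character string such as "Aa" or "Mm".
    """
    pairs = set()
    orientations1 = [(geno1[0], geno1[1]), (geno1[1], geno1[0])]
    orientations2 = [(geno2[0], geno2[1]), (geno2[1], geno2[0])]
    # Pair the first allele at locus 1 with the first allele at locus 2,
    # trying both orientations of each locus.
    for (x1, x2), (y1, y2) in product(orientations1, orientations2):
        pairs.add(tuple(sorted([x1 + y1, x2 + y2])))
    return sorted(pairs)

print(compatible_haplotype_pairs("aa", "Mm"))  # a single configuration
print(compatible_haplotype_pairs("Aa", "Mm"))  # two configurations: phase is ambiguous
print(compatible_haplotype_pairs("AA", "Mm"))  # a single configuration
```

Only the double heterozygote returns more than one configuration, which is why haplotypes have to be estimated rather than read off the genotypes.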

  27. Equilibrium haplotype frequencies
            A     a
       M    pr    ps    p
       m    qr    qs    q
            r     s

  28. Linkage disequilibrium
            A         a
       M    pr + D    ps - D    p
       m    qr - D    qs + D    q
            r         s
     DMAX = Min(qs, pr) when D < 0, or Min(ps, qr) when D > 0
     D’ = D / DMAX
     r² = D² / (pqrs)
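
A short numeric sketch of these measures, assuming the table layout above (p is the frequency of M, r the frequency of A, and the AM haplotype has frequency pr + D). It uses the standard definition r² = D² / (pqrs) and picks DMAX according to the sign of D. The function name and the example frequencies are hypothetical.

```python
def ld_measures(freq_AM, p, r):
    """Return D, D' and r^2 from the AM haplotype frequency.

    p is the frequency of allele M and r the frequency of allele A.
    Both loci are assumed to be polymorphic (0 < p < 1 and 0 < r < 1).
    """
    q, s = 1 - p, 1 - r
    d = freq_AM - p * r                  # AM = pr + D
    if d >= 0:
        d_max = min(p * s, q * r)        # keeps aM = ps - D and Am = qr - D >= 0
    else:
        d_max = min(p * r, q * s)        # keeps AM = pr + D and am = qs + D >= 0
    d_prime = d / d_max
    r_squared = d * d / (p * q * r * s)
    return d, d_prime, r_squared

# Hypothetical frequencies: P(AM) = 0.5, P(M) = 0.6, P(A) = 0.7.
d, d_prime, r_squared = ld_measures(0.5, 0.6, 0.7)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r2 = {r_squared:.3f}")
```

The example shows how D' and r² can differ considerably for the same D, since they are scaled by different quantities.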

  29. Haplotype analysis
     • Estimate haplotypes from genotypes
     • Associate haplotypes with trait
       Haplotype   Freq.   Odds Ratio
       AAGG        40%     1.00*
       AAGT        30%     2.21
       CGCG        25%     1.07
       AGCT        5%      0.92
       * baseline, fixed to 1.00

  30. Linkage and association [Figure: linkage panels plot sib correlation against IBD (0, 1, 2) at the QTL and at the marker, labelled RF; association panels plot trait against QTL genotype and marker genotype (aa, Aa, AA), labelled LD]

  31. Variance Components
     • Means (ASSOCIATION): M1, M2
     • Variance-covariance matrix (LINKAGE):
         V1    C21
         C12   V2

  32. Variance Components
     • Means (ASSOCIATION): M1 + bG1, M2 + bG2
       b = regression coef.; G = individual’s genotype
     • Variance-covariance matrix (LINKAGE):
         V1               C21 + q(π - ½)
         C12 + q(π - ½)   V2
       q = regression coef.; π = IBD sharing (0, ½, 1)

  33. Components of a Genetic Theory [Figure: genotypes (G) transmitted across generations over time, giving rise to phenotypes (P)]
     • POPULATION MODEL
       • Allele & genotype frequencies
       • Demographics & population history
       • Linkage disequilibrium, haplotype structure
     • TRANSMISSION MODEL
       • Mendelian segregation
       • Identity by descent & genetic relatedness
     • PHENOTYPE MODEL
       • Biometrical model of quantitative traits
       • Additive & dominance components

  34. Linkage without association [Pedigree figure; marker genotypes shown: 3/5, 2/6, 3/5, 2/6, 3/6, 3/2, 5/6, 5/2] Both families are ‘linked’ with the marker, but a different allele is involved.

  35. Linkage and association [Pedigree figure; marker genotypes shown: 3/5, 2/6, 3/6, 2/4, 4/6, 2/6, 3/6, 3/2, 5/6, 6/2, 6/6, 6/6] All families are ‘linked’ with the marker, and allele 6 is ‘associated’ with disease. Linkage is just association within families.

  36. Association without linkage [Figure: controls and cases drawn from two populations; marker genotypes shown: 6/6, 6/2, 3/5, 3/4, 3/6, 5/6, 2/4, 3/2, 3/6, 2/2, 4/6, 2/6, 2/5, 5/2] Allele 6 is more common in the GREEN population. The disease is more common in the GREEN population. The result is a ‘spurious association’.

  37. TDT
     • Transmission disequilibrium test
     • test for linkage and association
     [Pedigree figure: parent and offspring genotypes aa, Aa, AA]
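
The slide does not give the test statistic. As a hedged sketch, the usual form of the TDT is a McNemar-type chi-square on transmissions from heterozygous parents to affected offspring: (b - c)² / (b + c) with 1 df, where b and c count transmissions of A and of a. The function name and the counts below are hypothetical.

```python
import math

def tdt(transmitted_A, transmitted_a):
    """McNemar-type TDT statistic and p-value (1 df).

    transmitted_A: Aa parents transmitting A to an affected child.
    transmitted_a: Aa parents transmitting a to an affected child.
    """
    b, c = transmitted_A, transmitted_a
    stat = (b - c) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(stat / 2))  # 1-df chi-square upper tail
    return stat, p_value

# Hypothetical counts: 60 transmissions of A versus 40 of a.
stat, p_value = tdt(60, 40)
print(f"TDT chi2 = {stat:.2f}, p = {p_value:.4f}")
```

Only heterozygous parents contribute, because homozygous parents transmit the same allele regardless of linkage or association.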

  38. TDT (“A” disease allele)
       Parental mating type     AA x Aa        aa x Aa
       Offspring genotype       AA     Aa      Aa     aa
       Additive                 +      -       +      -
       Dominant                 0.5    0.5     +      -
       Recessive                +      -       0.5    0.5

  39. Between and within components [Figure: B and W components shown for Sib1 and Sib2] Sib1 = B - W, Sib2 = B + W

  40. Between and within components • Fulker et al (1999) Note : W = S1 – B

  41. Parental genotypes
     • Use parental genotypes to generate B
     • Examples
       • AA from AAxAA: W = 0
       • Aa from AAxAa: W = -0.5
       • Aa from AaxAa: W = 0
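
The three examples can be reproduced by taking B as the expected offspring genotype score given the parents and W as the offspring's deviation from it. The sketch assumes the additive coding used earlier in the slides (AA = 1, Aa = 0, aa = -1) and Mendelian segregation; the function names are hypothetical.

```python
from itertools import product

# Additive genotype scores, following the A column of the quantitative-trait slide.
SCORE = {"AA": 1.0, "Aa": 0.0, "aa": -1.0}

def between_component(parent1, parent2):
    """Expected offspring score given both parental genotypes (B)."""
    scores = []
    # Each parent transmits either allele with probability 1/2.
    for allele1, allele2 in product(parent1, parent2):
        child = "".join(sorted(allele1 + allele2))  # normalise "aA" to "Aa"
        scores.append(SCORE[child])
    return sum(scores) / len(scores)

def within_component(child, parent1, parent2):
    """Deviation of the child's score from the parental expectation (W)."""
    return SCORE[child] - between_component(parent1, parent2)

print(within_component("AA", "AA", "AA"))
print(within_component("Aa", "AA", "Aa"))
print(within_component("Aa", "Aa", "Aa"))
```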

  42. assoc.mx
     • Sibling pair sample
     • B and W components precalculated in input file
     • Single SNP genotype
     • Quantitative trait

  43. assoc.dat
       s1       s2       g1   g2   b      w1     w2
       -0.007   -0.972   -1   0    -0.5   -0.5   0.5
       -0.829   -0.196   1    1    1      0      0
       0.369    0.645    1    1    1      0      0
       0.318    1.55     0    1    0.5    -0.5   0.5
       1.52     0.910    0    0    0      0      0
       -0.948   -1.55    1    1    1      0      0
       0.596    -0.394   1    0    0.5    0.5    -0.5
       -1.91    -0.905   0    1    0.5    -0.5   0.5
       0.499    0.940    1    0    0.5    0.5    -0.5
       -1.17    -1.29    1    0    0.5    0.5    -0.5
       -0.16    -1.81    1    1    1      0      0
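
The b, w1 and w2 columns in this file are consistent with taking the sib-pair mean genotype score as the between component and each sib's deviation from that mean as the within component. The sketch below checks this against a few rows; it is an inference from the listed data, and the function name is hypothetical.

```python
def sib_pair_components(g1, g2):
    """Between (b) and within (w1, w2) components from two sib genotype scores."""
    b = (g1 + g2) / 2
    return b, g1 - b, g2 - b

# (g1, g2, b, w1, w2) copied from selected rows of assoc.dat.
rows = [
    (-1, 0, -0.5, -0.5, 0.5),
    (1, 1, 1, 0, 0),
    (0, 1, 0.5, -0.5, 0.5),
    (0, 0, 0, 0, 0),
    (1, 0, 0.5, 0.5, -0.5),
]

for g1, g2, b, w1, w2 in rows:
    # Each recomputed triple should match the precalculated columns.
    print((g1, g2), sib_pair_components(g1, g2), (b, w1, w2))
```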

  44. ! Mx script for QTL association: sib pairs, univariate
      Group 1 : Calc NG=2
      Begin Matrices;
      ! ** Parameters
        B Full 1 1 free   ! association : between component
        W Full 1 1 free   ! association : within component
        M Full 1 1 free   ! mean
        S Full 1 1 free   ! Shared residual variance
        N Full 1 1 free   ! Nonshared residual variance
      ! ** Definition variables **
        C Full 1 1        ! association : between
        X Full 1 1        ! association : within, sib 1
        Y Full 1 1        ! association : within, sib 2
      End Matrices;
      ! ** Uncomment for B=W model
      ! Equate W 1 1 1 B 1 1 1
      ! Starting values
      Matrix B 0
      Matrix W 0
      Matrix M 0
      Matrix S 0.5
      Matrix N 0.5
      End

  45. Group2 : Data Group
      Data NI=7 NO=0
      RE file=assoc.dat
      Labels Sib1 Sib2 g1 g2 b w1 w2
      Select Sib1 Sib2 b w1 w2 /
      Definition b w1 w2 /
      Matrices = Group 1
      Means M + B*C + W*X | M + B*C + W*Y /
      Covariance S + N | S _
                 S | S + N /
      Specify C b /
      Specify X w1 /
      Specify Y w2 /
      End
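
For readers without Mx, the Means and Covariance statements can be read as a bivariate normal model for each sib pair. The sketch below evaluates -2 log-likelihood for one pair under that reading. It is an illustrative translation rather than what Mx does internally; the function name is hypothetical, and the argument names mirror the Mx matrices and definition variables.

```python
import math

def pair_minus2ll(s1, s2, b, w1, w2, M, B, W, S, N):
    """-2 log-likelihood of one sib pair under a bivariate normal model.

    Means:      M + B*b + W*w1  and  M + B*b + W*w2
    Covariance: [[S + N, S], [S, S + N]]
    """
    mu1 = M + B * b + W * w1
    mu2 = M + B * b + W * w2
    var, cov = S + N, S
    det = var * var - cov * cov
    r1, r2 = s1 - mu1, s2 - mu2
    # Quadratic form r' Sigma^-1 r for a 2 x 2 matrix with equal variances.
    quad = (var * r1 * r1 - 2 * cov * r1 * r2 + var * r2 * r2) / det
    return 2 * math.log(2 * math.pi) + math.log(det) + quad

# First row of assoc.dat, evaluated at the script's starting values.
value = pair_minus2ll(-0.007, -0.972, -0.5, -0.5, 0.5,
                      M=0.0, B=0.0, W=0.0, S=0.5, N=0.5)
print(value)
```

Summing this quantity over all pairs and minimising it over B, W, M, S and N corresponds to the fit function that the script asks Mx to optimise.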

  46. Models
     B & W:
       B Full 1 1 free
       W Full 1 1 free
       !Equate W 1 1 1 B 1 1 1
     B = W:
       B Full 1 1 free
       W Full 1 1 free
       Equate W 1 1 1 B 1 1 1
     B:
       B Full 1 1 free
       W Full 1 1
       !Equate W 1 1 1 B 1 1 1
     B=W=0:
       B Full 1 1
       W Full 1 1
       !Equate W 1 1 1 B 1 1 1

  47. Tests
       Test                         HA       H0
       Standard association test    B = W    B=W=0
       Test of stratification       B & W    B = W
       Robust association test      B & W    B
