
Haplotype Trees


idola

Presentation Transcript


  1. Haplotype Trees: Using The Evolutionary History of Small DNA Regions To Investigate Common Diseases

  2. Replication; Coalescence

  3. Unrooted Haplotype Tree

  4. Statistical Vs. Maximum Parsimony. Example haplotypes: A = AGCT, B = TGCT, C = TACT, D = AAGG
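Both parsimony approaches start from the number of sites at which two haplotypes differ. The sketch below counts those differences for the four example haplotypes on this slide. The function and variable names are illustrative and do not come from the presentation.

```python
from itertools import combinations

# The four example haplotypes shown on the slide.
haplotypes = {"A": "AGCT", "B": "TGCT", "C": "TACT", "D": "AAGG"}

def n_differences(s, t):
    """Number of sites at which two equal-length haplotypes differ."""
    if len(s) != len(t):
        raise ValueError("haplotypes must have the same length")
    return sum(x != y for x, y in zip(s, t))

# Pairwise mutational differences between every pair of haplotypes.
pairwise = {
    (a, b): n_differences(haplotypes[a], haplotypes[b])
    for a, b in combinations(haplotypes, 2)
}

for (a, b), d in pairwise.items():
    print(f"{a}-{b}: {d} difference(s)")
```

A and B, and B and C, are one mutation apart, so they would be joined by single steps. D differs from every other haplotype at three or four of the four sites, which is presumably the kind of long connection on which the two approaches can disagree.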

  5. The Apo-protein E Haplotype Tree [Figure: haplotype tree with numbered haplotypes (1 to 31), branches labeled by the mutated site (e.g. 560, 624, 832, 3937), and the e2, e3 and e4 labels marked on the tree.]

  6. What Use Are Haplotype Trees? • Provides an Interpretive Framework When Integrated With Other Analyses • Evolutionary History Generates Hypotheses About Current Significance • Provides a Powerful Tool For Detecting Current Genotype-Phenotype Associations

  7. A Haplotype Tree Can Provide an Interpretive Framework When Integrated With Other Analyses

  8. Hamon and Sing estimated interactions for all 53 pairs of ApoE sites for lnApoE variability in North Karelia, Females [Figure: bar chart of R2 X 100 (axis 0.0 to 8.0) for each pair of sites; labeled pairs: 560-1163**, 560-832**, 560-2440**, 832-1163**, 3937-4075.]

  9. Parallel Mutations At Site 560; Sites Identified By Hamon and Sing That “Interact” With Site 560 [Figure: the ApoE haplotype tree with these sites highlighted.]

  10. Evolutionary Hypothesis: Two Functional Mutations (Occurring On A Specific Haplotype Background) Have Created Three Allelic Clades For the Phenotype Of ln(ApoE); the Red, Blue and Black Clades [Figure: the ApoE haplotype tree colored by clade.]

  11. The Red Clade Is Uniquely Defined By These Two Sites; The Blue Clade Is Uniquely Defined By These Two Sites [Figure: the ApoE haplotype tree with the defining sites marked.]

  12. The Red Clade Is Not Uniquely Defined By These Two Sites Due to Homoplasy [Figure: the ApoE haplotype tree with the two sites marked.]

  13. The Apo-protein E Haplotype Tree: Sites 560 and 624 Fall into an Alu Repeat [Figure: the ApoE haplotype tree.]

  14. Single SNP Analysis of lnApoE in North Karelia, females [Figure: gene map with a 1 Kb scale bar, Exons 1 to 4, and sites 73, 308, 471, 545, 560, 624, 832, 1163, 1522, 1575, 1998, 2440, 2907, 3106, 3673, 3937, 4036, 4075, 4951, 5229a, 5229b and 5361; * indicates a significant single site effect.]

  15. The Single SNP Analysis Identifies Sites With A Weaker Phenotypic Association Because It Cannot Deal With Homoplasy At Site 560 [Figure: the ApoE haplotype tree.]

  16. The Single SNP Analysis Identifies Sites With A Weaker Phenotypic Association Because It Cannot Deal With Homoplasy At Site 560. There is a deliberate attempt to find SNPs that are polymorphic in most or all populations and that have high heterozygosities; that is, SNPs just like the one at Site 560. [Figure: the ApoE haplotype tree.]

  17. Linkage Disequilibrium Is Frequently Used in Association Studies, But Is Also Frequently Misinterpreted. Haplotype Trees Can Aid In Understanding The Proper Biological Interpretation

  18. ApoE Gene: Stengård et al. (1996) showed in a longitudinal study that the amino acid replacement alleles at ApoE have a major impact on mortality due to CAD.

  19. Apoprotein E Gene Region [Figure: gene map on a scale from 0 to 5.5 showing Exons 1 to 4 and the positions of the polymorphic sites.]

  20. Apoprotein E Gene Region: These Two Sites Are in Disequilibrium [Figure: gene map with Exons 1 to 4 and the two sites marked.]

  21. The Apo-protein E Haplotype Tree [Figure: the ApoE haplotype tree.]

  22. The Apo-protein E Haplotype Tree: These haplotypes Are T at Site 832 & C At Site 3937; These haplotypes Are G at Site 832 & T At Site 3937 [Figure: the ApoE haplotype tree with the two groups of haplotypes marked.]
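The disequilibrium between two biallelic sites such as 832 and 3937 can be quantified from haplotype counts. The sketch below computes the standard coefficients D and r². The function name and the counts are hypothetical and are not taken from the ApoE data.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r2) for two biallelic sites given the four haplotype counts.

    A/a are the alleles at the first site, B/b the alleles at the second.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    if n == 0:
        raise ValueError("at least one haplotype is required")
    p_AB = n_AB / n                      # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / n              # allele frequency of A
    p_B = (n_AB + n_aB) / n              # allele frequency of B
    d = p_AB - p_A * p_B                 # disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2

# Hypothetical counts: only two of the four possible haplotypes are observed,
# as when every haplotype is either T-C or G-T at the two sites.
d, r2 = ld_stats(n_AB=10, n_Ab=0, n_aB=0, n_ab=10)
print(d, r2)
```

When only two of the four haplotypes occur, r² equals 1, so a marker site carries the same association signal as the functional site regardless of the physical distance between them.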

  23. Apoprotein E Gene Region: Site 3937 Is An Amino Acid Polymorphism That Affects ApoE Function and CAD [Figure: gene map with Exons 1 to 4 and site positions.]

  24. Apoprotein E Gene Region: Suppose Only This Portion Was Sequenced. Site 3937 Is An Amino Acid Polymorphism That Affects ApoE Function and CAD [Figure: gene map with Exons 1 to 4, site positions, and the sequenced portion marked.]

  25. Apoprotein E Gene Region: Suppose Only This Portion Was Sequenced. Site 832 Would Appear to Have The Strongest Association with ApoE Function and CAD. Site 3937 Is An Amino Acid Polymorphism That Affects ApoE Function and CAD [Figure: gene map with Exons 1 to 4, site positions, and the sequenced portion marked.]

  26. Apoprotein E Gene Region: Suppose Only This Portion Was Sequenced. Site 832 Would Have The Strongest Association with ApoE Function and CAD [Figure: gene map with Exons 1 to 4, site positions, and the sequenced portion marked.]

  27. Apoprotein E Gene Region: Suppose Only This Portion Was Sequenced. Site 832 Would Have The Strongest Association with ApoE Function and CAD. Would You Infer From This Association That the Marker Closest to the Functional Site Was Here? [Figure: gene map with Exons 1 to 4, site positions, and the sequenced portion marked.]

  28. Haplotype Trees Estimate an Evolutionary History That Can Generate Hypotheses About The Current Significance of Genetic Variation

  29. LPL Tree

  30. Detecting Recombination Events in LPL
      Sites are numbered 1 to 69 and shown in blocks of ten.

      a=3, b=5, k=3, p=0.0179, crossover between sites 13 and 29:
        2JNR    CAGTTTCCCT CAGCACGATC GCAATTGCAC CTCAATGTAT AGTTGTAACC GAGTCCGCAT AACTATAGG
        5NR     CAGTTTATCT CACCACGATA GCAATTGCAC CTCAATGTAT AGTTGTAACC GAGTCCGCAT AACTATAGG
        Node a  CAGTTTATCT CACCACGATC GCAATTGCTC TTTAATGTAT AGTTGTAACC GAATCAGCAT AACTATAGG

      a=2, b=7, k=2, p=0.0278, crossover between sites 16 and 19:
        Node d  CAGTTTATCT CACCACGATC GCAACTGCTC TTTAATGTAT AGTTGTAACC GAATCAGCAT AACTATAGG
        11J     CAGTATATCT CACCATGATC GCAACTGCTC TTTAATGTAT AGTTGTAACC GAATCAGCAT AACTATAGG
        Node e  CAGTATATCT CACCATGAGC GCAATTGCAC TTTAA?GTAT AGTTGTAACC GAATCAGCAT CACTGGAGA

      Third alignment (no statistics shown on the slide):
        11J     CAGTATATCT CACCATGATC GCAACTGCTC TTTAATGTAT AGTTGTAACC GAATCAGCAT AACTATAGG
        Node e  CAGTATATCT CACCATGAGC GCAATTGCAC TTTAA?GTAT AGTTGTAACC GAATCAGCAT CACTGGAGA
        T-1     CAGTTTATCT CACCACGAGC GCAATTGCAC TTTAA?GTAT AGTTGTAACC GAATCAGCAT CACTGGAGA
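The slide does not define a, b, k or p. The sketch below rests on one reading of them, which is an assumption: a and b count the sites at which the two putative parents differ and the candidate recombinant matches one parent or the other, and p is the chance, 1/C(a+b, a), that all a matches fall to the left of all b matches. Under that reading the code reproduces the counts, crossover intervals and p-values printed for the first two events. The function names are illustrative.

```python
from math import comb

def seq(blocks):
    """Remove the spaces that separate the blocks of ten sites."""
    return blocks.replace(" ", "")

def classify_sites(parent1, parent2, candidate):
    """For each site where the parents differ, record which parent the candidate matches."""
    calls = []
    for pos, (x, y, c) in enumerate(zip(parent1, parent2, candidate), start=1):
        if x == y or "?" in (x, y, c):
            continue                      # uninformative or unresolved site
        if c == x:
            calls.append((pos, 1))
        elif c == y:
            calls.append((pos, 2))
    return calls

def crossover_summary(parent1, parent2, candidate):
    """Return (a, b, crossover interval, clustered?, p) under the assumed reading."""
    calls = classify_sites(parent1, parent2, candidate)
    labels = [who for _, who in calls]
    a, b = labels.count(1), labels.count(2)
    clustered = labels == sorted(labels)  # all parent-1 matches precede parent-2 matches
    interval = (calls[a - 1][0], calls[a][0])
    p = 1 / comb(a + b, a)                # assumed formula, not stated on the slide
    return a, b, interval, clustered, p

s_2jnr  = seq("CAGTTTCCCT CAGCACGATC GCAATTGCAC CTCAATGTAT AGTTGTAACC GAGTCCGCAT AACTATAGG")
s_5nr   = seq("CAGTTTATCT CACCACGATA GCAATTGCAC CTCAATGTAT AGTTGTAACC GAGTCCGCAT AACTATAGG")
s_nodea = seq("CAGTTTATCT CACCACGATC GCAATTGCTC TTTAATGTAT AGTTGTAACC GAATCAGCAT AACTATAGG")
s_noded = seq("CAGTTTATCT CACCACGATC GCAACTGCTC TTTAATGTAT AGTTGTAACC GAATCAGCAT AACTATAGG")
s_11j   = seq("CAGTATATCT CACCATGATC GCAACTGCTC TTTAATGTAT AGTTGTAACC GAATCAGCAT AACTATAGG")
s_nodee = seq("CAGTATATCT CACCATGAGC GCAATTGCAC TTTAA?GTAT AGTTGTAACC GAATCAGCAT CACTGGAGA")

event1 = crossover_summary(s_nodea, s_2jnr, s_5nr)   # 5NR as the candidate recombinant
event2 = crossover_summary(s_nodee, s_noded, s_11j)  # 11J as the candidate recombinant
print(event1)
print(event2)
```

Site 20 in 5NR matches neither putative parent, so it is ignored here as a mutation private to 5NR.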

  31. Linkage Disequilibrium & The Recombinational Hotspot in LPL

  32. Haplotype Network in 5’ Region of LPL

  33. Haplotype Network in 3’ Region of LPL

  34. [Figure panels:] Positive (Diversifying) Selection or Subdivision; Positive (Directional) Selection or Bottleneck; Neutral Genetic Drift, Expanding Population Size; Neutral Genetic Drift, Stable Population Size; Negative Selection

  35. Peeled Haplotype Network of LPL

  36. Evolutionary Inferences On LPL • 5’ End Subject to Directional Selection, With A Selective Sweep Enhanced By Recombination • 3’ End Subject to Diversifying Selection • Implies That Most Current Polymorphisms With Functional Significance Are In 3’ End

  37. Haplotype Trees Provide a Powerful Tool For Detecting Current Genotype-Phenotype Associations • Nested Clade Analysis • Tree Scanning

  38. Nested Clade Analysis • The Nested Clade Method, Published In 1987, Uses A Haplotype Tree As A Tool For Discovering Gene/Phenotype Associations • Nests The Haplotypes in The Tree Into Evolutionary Clades (Branches) • The Resulting Nested Design Provides Asymptotic Independence And A Priori Contrasts For Detecting Phenotypic Associations.
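The nested design can be illustrated with a generic two-level nested sum-of-squares decomposition: variation among clades, among haplotypes within clades, and among individuals within haplotypes. This is a minimal sketch of the general idea, not the exact procedure of the 1987 method. The haplotype names, clade assignments and phenotype values are hypothetical.

```python
from collections import defaultdict

# Hypothetical phenotype measurements, grouped by the haplotype each individual carries.
phenotypes = {
    "h1": [4.1, 3.9, 4.0],
    "h2": [4.3, 4.2],
    "h3": [6.0, 6.2, 5.9],
    "h4": [6.1, 6.3],
}
# Hypothetical nesting: each haplotype is placed in one 1-step clade.
clade_of = {"h1": "1-1", "h2": "1-1", "h3": "1-2", "h4": "1-2"}

def mean(xs):
    return sum(xs) / len(xs)

def nested_sums_of_squares(phenotypes, clade_of):
    """Partition the total sum of squares across the levels of the nested design."""
    all_values = [x for vals in phenotypes.values() for x in vals]
    grand = mean(all_values)
    by_clade = defaultdict(list)
    for hap, vals in phenotypes.items():
        by_clade[clade_of[hap]].extend(vals)
    # Among clades: clade means against the grand mean.
    ss_clades = sum(len(v) * (mean(v) - grand) ** 2 for v in by_clade.values())
    # Among haplotypes within clades: haplotype means against their clade mean.
    ss_haps = sum(
        len(vals) * (mean(vals) - mean(by_clade[clade_of[hap]])) ** 2
        for hap, vals in phenotypes.items()
    )
    # Within haplotypes: individuals against their haplotype mean.
    ss_within = sum((x - mean(vals)) ** 2 for vals in phenotypes.values() for x in vals)
    ss_total = sum((x - grand) ** 2 for x in all_values)
    return ss_clades, ss_haps, ss_within, ss_total

ss_clades, ss_haps, ss_within, ss_total = nested_sums_of_squares(phenotypes, clade_of)
print(ss_clades, ss_haps, ss_within, ss_total)
```

In this made-up data set almost all of the variation lies between the two clades, which is the pattern that would point to a phenotypically relevant mutation on the branch separating them.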

  39. The Drosophila Adh Haplotype Tree

  40. The Drosophila Adh Haplotype Tree [Figure: tree with clades 1-1 through 1-11 marked.]

  41. The Drosophila Adh Haplotype Tree [Figure: tree with clades 2-1 through 2-5 marked.]

  42. The Drosophila Adh Haplotype Tree [Figure: tree with clades 3-1 and 3-2 marked.]

  43. Results of Nested Analysis of Variance of Adh Activity Using The Adh Haplotype Tree [Figure: tree with significance markers; *** Significant at 0.1% Level, ** Significant at 1% Level.]

  44. Functional Allelic Categories from the Nested Analysis of Variance of Adh Activity [Figure: tree with significance markers.]

  45. Phenotypic Distributions Identified Through Nested Clade Analysis

  46. Nested Clade Analyses • Greater Statistical Power By Focusing On Fewer Comparisons • Greater Biological Power In Detecting Mutations With Phenotypic Effects • Deals With High Levels of Genetic Variation Through Pooling Into Clades • Deals With Linkage Disequilibrium Through Haplotypes And Tree Branches • Useful In Ultimately Identifying Causative Mutations

  47. Nested Clade Analyses • Although Nesting Is Common In Statistics and Evolutionary Biology, It Is Unfamiliar and Daunting To Others • The Analysis Finds Phenotypic Associations With Haplotypes or Groups of Haplotypes: Does Not Deal Directly With Dominance Effects Or Genotypes. • Is Inherently A Single Locus (Or Smaller) Analysis: Does Not Deal Directly With Epistasis

  48. Tree Scanning A New Method for Using Haplotype Trees At Candidate Loci To Investigate Genotype-Phenotype Associations.
