1 / 14

Fill ing Missing Genotypes Using Haplotypes

Fill ing Missing Genotypes Using Haplotypes. Genotypes / Haplotypes. Genotypes indicate how many copies of each allele were inherited Haplotypes indicate which alleles are on which chromosome Observed genotypes partitioned into the two unknown haplotypes

Download Presentation

Fill ing Missing Genotypes Using Haplotypes

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Filling Missing Genotypes Using Haplotypes

  2. Genotypes / Haplotypes • Genotypes indicate how many copies of each allele were inherited • Haplotypes indicate which alleles are on which chromosome • Observed genotypes partitioned into the two unknown haplotypes • Pedigree haplotyping uses relatives • Population haplotyping finds matching allele patterns

  3. Filling missing genotypes • Predict unknown SNP from known • Measure 3,000, predict 43,000 SNP • Measure 50,000, predict 500,000 • Measure each haplotype at highest density only a few times • Predict dam from progeny SNP • Increase reliabilities for less cost

  4. Haplotyping Programfindhap.f90 • Begin with population haplotyping • Divide chromosomes into segments, ~250 SNP / segment • List haplotypes by genotype match • Similar to fastPhase, IMPUTE • End with pedigree haplotyping • Detect crossover, fix noninheritance • Impute nongenotyped ancestors

  5. Computer Requirements500,000 markers, 33,414 animals

  6. Recent Program Revisions • Improved imputation and GEBV reliability since 9WCGALP paper • Changes since January 2010 • Use known haplotype if second is unknown • Use current instead of base frequency • Combine parent haplotypes if crossover is detected • Begin search with parent or grandparent haplotypes • Store 2 most popular progeny haplotypes

  7. Example Bull: O-StyleUSA137611441, Sire = O-Man • Read genotypes and pedigrees • Write haplotype segments found • List paternal / maternal inheritance • List crossover locations

  8. O-Style HaplotypesChromosome 15

  9. Pedigree HaplotypingAB allele coding Genotypes: OMan BB,AA,AA,AB,AA,AB,AB,AA,AA,AB Ostyle BB,AA,AA,AB,AB,AA,AA,AA,AA,AB Haplotypes: OStyle (pat) B A A _ A A A A A _ OStyle (mat) B A A _ B A A A A _

  10. Allele and Segment Coding • Genotypes • 0 = BB, 1 = AB or BA, 2 = AA • 5 = missing • Haplotypes • 0 = B, 1 = not known, 2 = A • Segment storage (example) • O-Style has haplotype numbers 5 and 8 • O-Man has haplotype numbers 8 and 21 • O-Style got haplotype number 5 from dam

  11. Most Frequent Haplotypes1st segment of chromosome 15 1 5.16% 022222222020020022002020200020000200202000022022222202220 2 4.37% 022020220202200020022022200002200200200000200222200002202 3 4.36% 022020022202200200022020220000220202200002200222200202220 4 3.67% 022020222020222002022022202020000202220000200002020002002 5 3.66% 022222222020222022020200220000020222202000002020220002022 6 3.65% 022020022202200200022020220000220202200002200222200202222 7 3.51% 022002222020222022022020220200222002200000002022220002220 8 3.42% 022002222002220022022020220020200202202000202020020002020 9 3.24% 022222222020200000022020220020200202202000202020020002020 10 3.22% 022002222002220022002020002220000202200000202022020202220 Most frequent haplotype in Holsteins had 4,316 copies = .0516 * 41,822 animals * 2 chromosomes each

  12. Population Haplotyping Steps • Put first genotype into haplotype list • Check next genotype against list • Do any homozygous loci conflict? • If haplotype conflicts, continue search • If match, fill any unknown SNP with homozygote • 2nd haplotype = genotype minus 1st haplotype • Search for 2nd haplotype in rest of list • If no match in list, add to end of list • Sort list to put frequent haplotypes 1st

  13. Check New Genotype Against List1st segment of chromosome 15 Check genotype: 022112222011221022021110220010110212202000102020120002021 5.16% 022222222020020022002020200020000200202000022022222202220 4.37% 022020220202200020022022200002200200200000200222200002202 4.36% 022020022202200200022020220000220202200002200222200202220 3.67% 022020222020222002022022202020000202220000200002020002002 3.66% 022222222020222022020200220000020222202000002020220002022 3.65% 022020022202200200022020220000220202200002200222200202222 3.51% 022002222020222022022020220200222002200000002022220002220 3.42% 022002222002220022022020220020200202202000202020020002020 3.24% 022222222020200000022020220020200202202000202020020002020 3.22% 022002222002220022002020002220000202200000202022020202220 Subtract 1st haplotype from genotype to get 2nd: 022002222002220022022020220020200202202000202020020002020

  14. Conclusions • Missing genotypes can be filled easily • Population and pedigree haplotyping can both process long segments efficiently • Imputing 500,000 SNP for 33,414 Holsteins required 3 Gbyte memory, 3 CPU hours • Program findhap.f90 implemented for April 2010 routine evaluation • Several recent improvements to accuracy • Ready to include lower or higher density genotypes in evaluations

More Related