
Imputing HLA Alleles from SNPs

Imputing HLA Alleles from SNPs. CSCI 8980 Dave Roe Mar 18, 2011. Overview. HLA SNPs Allele imputation LDMhc algorithm Conclusions and Applications. HLA. Human Leukocyte Antigen Major Histocompatibility Complex (MHC) Gene names: A, B, C, DR, DP, DQ.


Presentation Transcript


  1. Imputing HLA Alleles from SNPs CSCI 8980 Dave Roe Mar 18, 2011

  2. Overview • HLA • SNPs • Allele imputation • LDMhc algorithm • Conclusions and Applications

  3. HLA • Human Leukocyte Antigen • Major Histocompatibility Complex (MHC) • Gene names: A, B, C, DR, DP, DQ • Allele names: A*02, B*07:02:01, etc. • More digits imply greater resolution (higher coverage of the gene) Image source: Wikipedia

  4. HLA (cont.) • Genes encode proteins that present molecules (antigens) on the surface of cells to immune system cells (leukocytes) Image source: Carolyn Hurley (Georgetown)

  5. HLA (cont.) • Help immune system recognize viruses, parasites, bacteria, etc. • Cancer • Autoimmune diseases [Figure labels: T-cell, TCR, HLA, Infected Cell, Healthy Cell] Image source: Steven Mack (CHORI)

  6. HLA (cont.) • Most human genes have only a few (5-10) variants (alleles) • The HLA region is the most polymorphic region of the human genome [Figure: map of the class II loci (DP, DQ, DR) and class I loci (B, C, A); figure labels in slide order: 400 kb, 50 kb, 1100 kb, 100 kb, 1270 kb; 141, 112, 809, 1800, 829, 1193] Source: Steven Mack (CHORI)

  7. HLA (cont.) [Figure: loci DP, DQ, DR, B, C, A] • Alleles (gene variants) on the same chromosome can be inherited together as a haplotype • Consider the number of possible protein variants for each HLA gene: ~4 trillion possible unique A-C-B-DR haplotypes • Since everyone has two copies of each chromosome: ~16 trillion trillion unique A-C-B-DR haplotype pairs (genotypes) • But it isn’t that complicated, because inheritance occurs in haplotypes Source: Steven Mack (CHORI)
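The step from haplotypes to haplotype pairs on this slide is a squaring step. A quick check of the arithmetic, assuming the slide's rounded figure of about 4 trillion haplotypes and assuming the slide counts ordered pairs (counting unordered pairs would give roughly half as many):

```python
# Rounded figure from the slide: ~4 trillion possible haplotypes.
n_haplotypes = 4 * 10**12

# Each person carries two haplotypes, so ordered pairs number n * n.
ordered_pairs = n_haplotypes ** 2

# Unordered pairs: the genotype does not record which copy is which.
unordered_pairs = n_haplotypes * (n_haplotypes + 1) // 2

print(f"ordered pairs:   {ordered_pairs:.1e}")    # 16 trillion trillion
print(f"unordered pairs: {unordered_pairs:.1e}")  # about half of that
```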

  8. HLA (cont.) [Figure: two chromosome copies, each showing loci DP, DQ, DR, B, C, A] • Genotype: A*03:01, A*03:01, B*08:01, B*35:02, C*04:01, C*07:01, DR*03:01, DR*11:04 • Haplotypes (figure labels): 03:01 08:01 04:01 03:01 03:01 11:04 07:01 35:02 Source: adapted from Steven Mack (CHORI)
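A genotype alone does not say which alleles travel together on one chromosome. The sketch below enumerates every haplotype pair consistent with the genotype on this slide. The function name `possible_haplotype_pairs` is hypothetical, and the A-C-B-DR locus order is only a display convention.

```python
from itertools import product

# Unphased genotype from the slide: two alleles per locus.
genotype = {
    "A": ("03:01", "03:01"),
    "C": ("04:01", "07:01"),
    "B": ("08:01", "35:02"),
    "DR": ("03:01", "11:04"),
}

def possible_haplotype_pairs(genotype):
    """Return every distinct unordered haplotype pair consistent with the genotype."""
    loci = list(genotype)
    pairs = set()
    # For each locus, choose which of the two alleles sits on the first chromosome.
    for choice in product((0, 1), repeat=len(loci)):
        hap1 = tuple(genotype[locus][c] for locus, c in zip(loci, choice))
        hap2 = tuple(genotype[locus][1 - c] for locus, c in zip(loci, choice))
        # A frozenset ignores which chromosome is listed first.
        pairs.add(frozenset([hap1, hap2]))
    return pairs

pairs = possible_haplotype_pairs(genotype)
for pair in sorted(tuple(sorted(p)) for p in pairs):
    print(" / ".join("-".join(hap) for hap in pair))
```

With one homozygous locus and three heterozygous loci, several phasings remain possible, which is why phased reference haplotypes matter later in the talk.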

  9. HLA (cont.) Image source: http://en.wikipedia.org/wiki/File:Migration_map4.png

  10. SNPs: Single Nucleotide Polymorphisms • Allele-level gene typing: all SNPs in a gene • Relatively cheap • Used as markers for • Imputing higher (allelic) resolution information • Finding case-control differences (e.g., GWAS: genome-wide association studies) • Example sequences: CCTGTAATGTCCCCCCTTGTACGTTAAATTT CGTGTAATGCGCCCCCTTGTACGTCAAATTT
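To make the two example sequences concrete, here is a minimal sketch that lists the positions where they differ. The helper name `find_snps` is hypothetical, and the sequences are assumed to be already aligned base for base.

```python
def find_snps(seq_a, seq_b):
    """Return (position, base_a, base_b) for each mismatch; positions are 0-based.

    zip stops at the shorter input, so the sequences are assumed pre-aligned.
    """
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]

# The two sequences shown on the slide.
seq1 = "CCTGTAATGTCCCCCCTTGTACGTTAAATTT"
seq2 = "CGTGTAATGCGCCCCCTTGTACGTCAAATTT"

for pos, base1, base2 in find_snps(seq1, seq2):
    print(f"position {pos}: {base1} -> {base2}")
```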

  11. SNPs (cont.) Image source: http://www.iavireport.org/archives/2007/Pages/IAVI-Report-11%284%29-perspective.aspx

  12. SNPs (cont.)

  13. SNPs (cont.)

  14. LDMhc Approach • Imputation of allele-level typings from SNPs • Optimized for HLA • Reference set: Collections of SNPs from samples with known HLA alleles • Select most informative SNPs • Type those SNPs on experiment samples • Associate SNP typings with reference set to impute HLA alleles
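The last steps of this workflow can be pictured with a toy lookup. This is a simplified counting sketch, not the LDMhc model itself. The reference haplotypes, the selected positions, and the names `build_lookup` and `impute` are all made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical reference set: (phased SNP haplotype, known HLA allele).
reference = [
    ("ACGTA", "C*06:02"),
    ("ACGTA", "C*06:02"),
    ("ACCTA", "C*07:01"),
    ("TCCTG", "C*04:01"),
    ("TCCTG", "C*04:01"),
    ("TCGTG", "C*07:01"),
]

# Hypothetical choice of informative SNP positions (0-based).
selected = [0, 4]

def build_lookup(reference, selected):
    """Count which alleles co-occur with each SNP pattern at the selected positions."""
    lookup = defaultdict(Counter)
    for snps, allele in reference:
        pattern = "".join(snps[i] for i in selected)
        lookup[pattern][allele] += 1
    return lookup

def impute(sample_pattern, lookup):
    """Return (most frequent allele, its share among matching reference haplotypes)."""
    counts = lookup.get(sample_pattern)
    if not counts:
        return None  # pattern never seen in the reference set
    allele, n = counts.most_common(1)[0]
    return allele, n / sum(counts.values())

lookup = build_lookup(reference, selected)
print(impute("AA", lookup))  # a pattern seen in the reference set
print(impute("GG", lookup))  # a pattern absent from the reference set
```

In this toy data the two selected SNPs leave some ambiguity, which is the motivation for choosing the SNP set carefully.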

  15. LDMhc: Statistical Model • Probability that a haplotype carries an allele at a locus • Goal is to optimize selected SNPs (SL)
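One way to picture "optimizing the selected SNPs" is to score a candidate SNP subset by how well it predicts the allele on the reference haplotypes, then grow the subset greedily. The sketch below swaps in simple counting and a greedy search for illustration. It is not the statistical model of the cited papers, and the data and function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical reference set: (phased SNP haplotype, known HLA allele).
reference = [
    ("ACGTA", "C*06:02"),
    ("ACGTA", "C*06:02"),
    ("ACCTA", "C*07:01"),
    ("TCCTG", "C*04:01"),
    ("TCCTG", "C*04:01"),
    ("TCGTG", "C*07:01"),
]

def allele_probabilities(reference, selected):
    """Estimate P(allele | SNP pattern at the selected positions) by counting."""
    counts = defaultdict(Counter)
    for snps, allele in reference:
        counts[tuple(snps[i] for i in selected)][allele] += 1
    return {
        pattern: {a: n / sum(c.values()) for a, n in c.items()}
        for pattern, c in counts.items()
    }

def score(reference, selected):
    """Fraction of reference haplotypes whose allele is the top call for their pattern."""
    probs = allele_probabilities(reference, selected)
    hits = 0
    for snps, allele in reference:
        dist = probs[tuple(snps[i] for i in selected)]
        if max(dist, key=dist.get) == allele:
            hits += 1
    return hits / len(reference)

def greedy_select(reference, n_snps):
    """Add one SNP position at a time, keeping the one that raises the score most."""
    n_positions = len(reference[0][0])
    selected = []
    for _ in range(n_snps):
        candidates = [i for i in range(n_positions) if i not in selected]
        best = max(candidates, key=lambda i: score(reference, selected + [i]))
        selected.append(best)
    return selected

chosen = greedy_select(reference, 2)
print(chosen, score(reference, chosen))
```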

  16. Reference Data Set • 2500 samples (each w/2 haplotypes) • 7733 SNPs per haplotype • Provides (phased) SNP haplotypes • Provides allele to SNP haplotype associations

  17. LDMhc: SNP Selection via HMM • States/Transitions: SNPs along the chromosome that define haplotypes Source: Dilthey et al. 2011

  18. LDMhc: Validation of SNP Selection • Applied new SNP selection method to an earlier experiment • Threshold on certainty of calls (e.g., 90%) • Improvement of 44%, due to call rate more than accuracy • Helps, but increasing the size of the reference panel helps more (more samples, I think, not more SNPs)
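The distinction between call rate and accuracy under a certainty threshold can be sketched as follows. The predictions and true alleles are made-up illustrative values, and the function name `call_rate_and_accuracy` is hypothetical.

```python
def call_rate_and_accuracy(predictions, truth, threshold=0.9):
    """predictions: list of (allele, certainty); truth: list of true alleles.

    Call rate is the share of samples whose certainty meets the threshold.
    Accuracy is measured only on those called samples.
    """
    called = [
        (allele, true_allele)
        for (allele, certainty), true_allele in zip(predictions, truth)
        if certainty >= threshold
    ]
    call_rate = len(called) / len(truth)
    if not called:
        return call_rate, float("nan")
    accuracy = sum(allele == true_allele for allele, true_allele in called) / len(called)
    return call_rate, accuracy

# Hypothetical imputation output and true typings.
predictions = [("A*02:01", 0.99), ("A*01:01", 0.95), ("A*03:01", 0.60), ("A*02:01", 0.92)]
truth = ["A*02:01", "A*01:01", "A*03:01", "A*24:02"]

call_rate, accuracy = call_rate_and_accuracy(predictions, truth)
print(f"call rate {call_rate:.2f}, accuracy {accuracy:.2f}")
```

A method can improve mostly by raising the call rate, meaning more samples clear the threshold, even if accuracy among called samples changes little.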

  19. LDMhc: Validation of Imputation • Split reference set • 2/3: training/reference • 1/3: validation Source: Dilthey et al. 2011
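A 2/3 to 1/3 split of the reference set can be sketched like this. The random shuffle and fixed seed are assumptions, since the slide does not say how the split was drawn, and the sample count of 2500 is taken from the Reference Data Set slide.

```python
import random

def split_reference(samples, train_fraction=2 / 3, seed=0):
    """Shuffle a copy of the samples and split it into training and validation parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Stand-in sample identifiers; real data would carry SNP haplotypes and HLA alleles.
sample_ids = list(range(2500))
train, validation = split_reference(sample_ids)
print(len(train), len(validation))
```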

  20. Application to Disease Association • Applied to previous psoriasis study • C*06:02 is key risk factor • Recreated the result • More significant than any single SNP • Results • Aggregation/synergy creates information Source: Dilthey et al. 2011

  21. Software Application • Local GUI for input preparation and QC • Submit to remote server for imputation Source: Dilthey et al. 2011

  22. Conclusions (1/2) • Provides accurate, high-resolution imputation of HLA • Weakness: the most important information is itself imputed (phasing of SNPs; association of alleles to SNPs) • Can be improved and might lead to even greater accuracy • Race specific (plans to expand)

  23. Conclusions (2/2) • Application to transplantation • Transplantation registry needs • Large donor pool • High resolution allelic typings • Potential use for recruitment typings

  24. Acknowledgements • Dilthey, A. T., Moutsianas, L., Leslie, S., McVean, G. (2011): "HLA*IMP - An integrated framework for imputing classical HLA alleles from SNP genotypes." Bioinformatics Advance Access, doi: 10.1093/bioinformatics/btr061. • Leslie, S. et al. (2008): "A statistical method for predicting classical HLA alleles from SNP data." Am J Hum Genet 82(1): 48-56. • Application: https://oxfordhla.well.ox.ac.uk/hla/tool/main • Slides/images: Steven Mack, Children’s Hospital Oakland; Carolyn Hurley, Georgetown University; Loren Gragert, NMDP
