
2003

2003. 23 Allele pairs 355 Characters. 2013. 856 Allele pairs 15,778 Characters. Some things change. 2003.

hanne

Presentation Transcript


  1. 2003

  2. 23 Allele pairs 355 Characters

  3. 2013

  4. 856 Allele pairs 15,778 Characters. Some things change.

  5. 2003

  6. GCTCYCACTCCATGAGGTATTTCTYCACATCCGTGTCCCGGCCCGGCCGCGGGGAGCCCCGCTTCATCGCMGTGGGCTACGTGGACGACACGCAGTTCGTGCGGTTCGACAGCGACGCCGCGAGCCAGAGGATGGAGCCGCGGGCGCCGTGGATAGAGCAGGAGGGKCCGGAGTATTGGGACGRGGAGACASGGAAAGTGAAGGCCCACTCACAGACTSACCGAGWGRACCTGSGGAYCSYGCKCSGCTACTACAACCAGAGCGAGGCCGGTTCTCACACCSTCCAGAKGATGTWTGGCTGCGACGTGGGGTCGGACKGGCGCTTCCTCCGCGGGTACCACCAGTACGCCTACGACGGCAAGGATTACATCGCCCTGAAAGAGGACCTGCGCTCTTGGACCGCGGCGGACATGGCRGCTCAGAYCACCAAGCRCAAGTGGGAGGCGGCCCATGTGGCGGAGCAGYWGAGAGCCTACCTGGAGGGCACGTGCGTGGASKGGCTCCGCAGATACCTGGAGAACGGGAAGGAGACGCTGCAGCGCACGGGCTCYCACTCCATGAGGTATTTCTYCACATCCGTGTCCCGGCCCGGCCGCGGGGAGCCCCGCTTCATCGCMGTGGGCTACGTGGACGACACGCAGTTCGTGCGGTTCGACAGCGACGCCGCGAGCCAGAGGATGGAGCCGCGGGCGCCGTGGATAGAGCAGGAGGGKCCGGAGTATTGGGACGRGGAGACASGGAAAGTGAAGGCCCACTCACAGACTSACCGAGWGRACCTGSGGAYCSYGCKCSGCTACTACAACCAGAGCGAGGCCGGTTCTCACACCSTCCAGAKGATGTWTGGCTGCGACGTGGGGTCGGACKGGCGCTTCCTCCGCGGGTACCACCAGTACGCCTACGACGGCAAGGATTACATCGCCCTGAAAGAGGACCTGCGCTCTTGGACCGCGGCGGACATGGCRGCTCAGAYCACCAAGCRCAAGTGGGAGGCGGCCCATGTGGCGGAGCAGYWGAGAGCCTACCTGGAGGGCACGTGCGTGGASKGGCTCCGCAGATACCTGGAGAACGGGAAGGAGACGCTGCAGCGCACGG 546 Characters

  7. 2013

  8. GCTCYCACTCCATGAGGTATTTCTYCACATCCGTGTCCCGGCCCGGCCGCGGGGAGCCCCGCTTCATCGCMGTGGGCTACGTGGACGACACGCAGTTCGTGCGGTTCGACAGCGACGCCGCGAGCCAGAGGATGGAGCCGCGGGCGCCGTGGATAGAGCAGGAGGGKCCGGAGTATTGGGACGRGGAGACASGGAAAGTGAAGGCCCACTCACAGACTSACCGAGWGRACCTGSGGAYCSYGCKCSGCTACTACAACCAGAGCGAGGCCGGTTCTCACACCSTCCAGAKGATGTWTGGCTGCGACGTGGGGTCGGACKGGCGCTTCCTCCGCGGGTACCACCAGTACGCCTACGACGGCAAGGATTACATCGCCCTGAAAGAGGACCTGCGCTCTTGGACCGCGGCGGACATGGCRGCTCAGAYCACCAAGCRCAAGTGGGAGGCGGCCCATGTGGCGGAGCAGYWGAGAGCCTACCTGGAGGGCACGTGCGTGGASKGGCTCCGCAGATACCTGGAGAACGGGAAGGAGACGCTGCAGCGCACGGGCTCYCACTCCATGAGGTATTTCTYCACATCCGTGTCCCGGCCCGGCCGCGGGGAGCCCCGCTTCATCGCMGTGGGCTACGTGGACGACACGCAGTTCGTGCGGTTCGACAGCGACGCCGCGAGCCAGAGGATGGAGCCGCGGGCGCCGTGGATAGAGCAGGAGGGKCCGGAGTATTGGGACGRGGAGACASGGAAAGTGAAGGCCCACTCACAGACTSACCGAGWGRACCTGSGGAYCSYGCKCSGCTACTACAACCAGAGCGAGGCCGGTTCTCACACCSTCCAGAKGATGTWTGGCTGCGACGTGGGGTCGGACKGGCGCTTCCTCCGCGGGTACCACCAGTACGCCTACGACGGCAAGGATTACATCGCCCTGAAAGAGGACCTGCGCTCTTGGACCGCGGCGGACATGGCRGCTCAGAYCACCAAGCRCAAGTGGGAGGCGGCCCATGTGGCGGAGCAGYWGAGAGCCTACCTGGAGGGCACGTGCGTGGASKGGCTCCGCAGATACCTGGAGAACGGGAAGGAGACGCTGCAGCGCACGG 546 Characters Some things remain the same

  9. 02:01/02:09/02:36/02:43N = 02:TBF; 24:02/24:03/24:09N/24:11N/24:13/24:33 = 24:APFW. 23 Allele pairs, 355 Characters.

  10. 856 allele pairs in; 02:01/02:01L/02:01Q/02:04/02:09/02:12/02:36/02:37/02:43N/02:58/02:66/02:70/02:75/02:76/02:83N/02:86/02:87/02:89/02:90/02:97/02:110/02:132/02:134/02:140/02:226N/02:241/02:252/02:256/02:266/02:291/02:294/02:305N/02:327/02:329/02:356N/02:357/02:397; 24:02/24:02L/24:02Q/24:03/24:09N/24:11N/24:13/24:14/24:28/24:29/24:30/24:33/24:40N/24:52/24:76/24:79/24:83N/24:93/24:104/24:144/24:150/24:153/24:154/24:155N/24:163N/24:183N/24:207/24:218; 11,132 allele pairs out.

  11. P Groups: 112 pairs, 1891 Characters. 856 allele pairs in, 856 allele pairs out. Functional basis. GL codes also provide lossless compression.
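The difference between "856 in, 11,132 out" and "856 in, 856 out" can be illustrated with a toy encoder. This is only a sketch: the function names are hypothetical, and the delimiters ('+' between the two alleles of a genotype, '|' between alternative genotypes, '/' between alternative alleles) are assumed from the published GL String grammar rather than taken from the slides. Collapsing each side into a single allele list implies the full cross product on decoding, whereas listing the genotypes explicitly round-trips exactly.

```python
def encode_genotype_list(pairs):
    """Explicit form: one 'a+b' term per observed pair, joined by '|'."""
    return "|".join(f"{a}+{b}" for a, b in pairs)


def decode_genotype_list(gl):
    """Inverse of encode_genotype_list."""
    return [tuple(term.split("+")) for term in gl.split("|")]


def encode_allele_lists(pairs):
    """Collapsed form: two '/'-separated allele lists joined by '+'.

    Which allele was paired with which is not recorded.
    """
    first = sorted({a for a, _ in pairs})
    second = sorted({b for _, b in pairs})
    return "/".join(first) + "+" + "/".join(second)


def decode_allele_lists(gl):
    """Decoding the collapsed form yields every combination of the two lists."""
    left, right = gl.split("+")
    return [(a, b) for a in left.split("/") for b in right.split("/")]


# Two observed pairs (allele names borrowed from slide 9 for illustration).
observed = [("02:01", "24:02"), ("02:09", "24:03")]

explicit = encode_genotype_list(observed)
collapsed = encode_allele_lists(observed)

print(explicit)                                # 02:01+24:02|02:09+24:03
print(len(decode_genotype_list(explicit)))     # 2 pairs in, 2 pairs out
print(collapsed)                               # 02:01/02:09+24:02/24:03
print(len(decode_allele_lists(collapsed)))     # 2 pairs in, 4 pairs out
```

The explicit form is longer, which is the trade-off the slides quantify in characters; the collapsed form is shorter but admits pairs that were never observed.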

  12. Heterozygous Ambiguity. Current Database: 856. Number of valid sequence combinations: 2^24 = 16,777,216. We use the database to make assumptions about the relationship between bases that were not measured experimentally. The use of phasing information has a much larger impact on data integrity than the allele pairs suggest. Number of valid combinations using amplicon: 2. Number of valid combinations using shotgun: 1.
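The 2^24 figure is what one would expect if 24 heterozygous positions were each free to resolve to either of two bases independently of the others. The sketch below makes that arithmetic concrete, assuming every heterozygous position is written as a standard two-base IUPAC code; the function names are illustrative and the 24-position count is inferred from the slide's exponent, not counted from the sequence.

```python
from itertools import product

# Standard IUPAC codes for two-base ambiguity.
IUPAC_PAIRS = {
    "R": "AG", "Y": "CT", "S": "CG",
    "W": "AT", "K": "GT", "M": "AC",
}


def count_resolutions(consensus):
    """Number of unambiguous sequences if each two-base code resolves independently."""
    ambiguous = sum(1 for base in consensus if base in IUPAC_PAIRS)
    return 2 ** ambiguous


def expand(consensus):
    """Yield every unambiguous sequence consistent with the consensus."""
    choices = [IUPAC_PAIRS.get(base, base) for base in consensus]
    for combo in product(*choices):
        yield "".join(combo)


fragment = "YCACCAAGCR"  # first ten bases of the combined sequence on slide 14
print(count_resolutions(fragment))  # 4
print(list(expand(fragment)))
print(2 ** 24)                      # 16777216
```

Without phase information, all four expansions of the fragment are equally admissible; phased reads reduce that set to the sequences actually observed.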

  13. Amplification Hybrids. Loci with islands of variability are susceptible to hybridisation during amplification. When the variable regions are bridged by sequence with high homology, it is possible to produce clonal sequences that are combinations of the real allele sequences. This has the highest impact in systems where multiple loci are amplified in the same reaction, making it more difficult to distinguish between distinct alleles and experimental artefacts.
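A minimal sketch of how such a hybrid might be recognised, assuming the two real allele sequences are already known, are the same length, and the hybrid contains a single switch from the first allele to the second with no sequencing errors. The function name is hypothetical, and the example sequences are the two phased sequences shown on slide 14. Real artefacts can switch in either direction or more than once, which this check does not cover.

```python
def hybrid_breakpoints(clone, allele_a, allele_b):
    """Positions i at which clone == allele_a[:i] + allele_b[i:].

    Returns an empty list when the clone is identical to either parent
    or cannot be explained by a single a-to-b switch.
    """
    if not (len(clone) == len(allele_a) == len(allele_b)):
        raise ValueError("sequences must be the same length")
    if clone in (allele_a, allele_b):
        return []
    return [
        i for i in range(1, len(clone))
        if clone == allele_a[:i] + allele_b[i:]
    ]


shared = "CAAGTGGGAGGCGGCCCATGTGGCGGAGCAG"  # homologous bridge between the variable ends
allele_a = "TCACCAAGCG" + shared + "CT"
allele_b = "CCACCAAGCA" + shared + "TA"
hybrid = "TCACCAAGCG" + shared + "TA"       # start of allele_a, end of allele_b

points = hybrid_breakpoints(hybrid, allele_a, allele_b)
print(points[0], points[-1])  # the switch could lie anywhere in the shared bridge
```

The breakpoint cannot be located more precisely than the homologous stretch, which is why long identical bridges between variable regions make hybrids hard to spot.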

  14. Communicating Data. Combined Sequence (UNLINKED): YCACCAAGCR CAAGTGGGAGGCGGCCCATGTGGCGGAGCAG YW. HARP (LINKED): TCACCAAGCG CAAGTGGGAGGCGGCCCATGTGGCGGAGCAG CT. Short Clones (PHASED / UNPHASED / PHASED, PARTIALLY LINKED): TCACCAAGCG CAAGTGGGAGGCGGCCCATGTGGCGGAGCAG CT; CCACCAAGCA TA. Long Clones / Paired End (PHASED, LINKED): TCACCAAGCG CAAGTGGGAGGCGGCCCATGTGGCGGAGCAG CT; CCACCAAGCA CAAGTGGGAGGCGGCCCATGTGGCGGAGCAG TA.

  15. Phase Breaks. Short Clones / Amplicon (PHASED / UNPHASED / PHASED, PARTIALLY LINKED): TCACCAAGCG CAAGTGGGAGGCGGCCCATGTGGCGGAGCAG CT; CCACCAAGCA TA. Why were these polymorphisms unlinked? Experimental: read length insufficient to cover the distance; separated amplifications; dropout of one allele between positions. Software: rejection of sequences based on observation of uncharacterised allele; correct handling of insertions/deletions.
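The read-length cause can be sketched in code: two polymorphic positions are only linked if at least one read spans both. The example below assumes reads are given as a start coordinate plus a gap-free sequence on a common reference; the function name and read layout are illustrative, not taken from any particular pipeline, and insertions/deletions are deliberately ignored.

```python
from collections import Counter


def link_support(reads, pos_a, pos_b):
    """Count the base pairs seen at (pos_a, pos_b) in reads that cover both positions.

    reads: iterable of (start, sequence) in 0-based reference coordinates.
    An empty result means no read spans both positions, i.e. a phase break.
    """
    counts = Counter()
    for start, seq in reads:
        end = start + len(seq)
        if start <= pos_a < end and start <= pos_b < end:
            counts[(seq[pos_a - start], seq[pos_b - start])] += 1
    return counts


shared = "CAAGTGGGAGGCGGCCCATGTGGCGGAGCAG"
allele_a = "TCACCAAGCG" + shared + "CT"
allele_b = "CCACCAAGCA" + shared + "TA"

# Short reads: each covers only one end of the region.
short_reads = [(0, allele_a[:20]), (0, allele_b[:20]),
               (30, allele_a[30:]), (30, allele_b[30:])]
# Long reads: each covers the whole region.
long_reads = [(0, allele_a), (0, allele_b)]

print(link_support(short_reads, 9, 41))  # Counter() -> unlinked
print(link_support(long_reads, 9, 41))   # G with C, A with T -> linked
```

The per-pair counts double as the support figure for a reported link: a pair seen in many reads is better established than one seen once.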

  16. Quality Metrics. A/C/G/T count per position. A/C/G/T mean confidence value per position. For phase relationships, the number of sequences that were used to establish the link.
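As a rough illustration of the first two metrics, the sketch below tallies, for each reference position, how many times each base was observed and the mean confidence of those observations. It assumes gap-free reads supplied as (start, bases, qualities) with one integer confidence value per base; the function name and input layout are hypothetical.

```python
from collections import Counter, defaultdict


def position_metrics(reads):
    """Return {position: {base: (count, mean_confidence)}}.

    reads: iterable of (start, bases, quals); quals holds one integer per base.
    """
    counts = defaultdict(Counter)
    qual_sums = defaultdict(Counter)
    for start, bases, quals in reads:
        if len(bases) != len(quals):
            raise ValueError("each base needs exactly one confidence value")
        for offset, (base, qual) in enumerate(zip(bases, quals)):
            pos = start + offset
            counts[pos][base] += 1
            qual_sums[pos][base] += qual
    return {
        pos: {base: (n, qual_sums[pos][base] / n) for base, n in per_base.items()}
        for pos, per_base in counts.items()
    }


reads = [
    (0, "AC", [30, 20]),
    (0, "AT", [40, 10]),
    (1, "C", [40]),
]
metrics = position_metrics(reads)
print(metrics[0])  # {'A': (2, 35.0)}
print(metrics[1])  # {'C': (2, 30.0), 'T': (1, 10.0)}
```

Reporting the counts alongside the mean confidence lets a recipient distinguish a well-supported heterozygous call from a single low-confidence observation.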

  17. Beyond HLA. The same principles will apply from single exons to loci to haplotypes to analysis from exomes, transcriptomes and genomes. The sequence data describes itself: experimental conditions may assist in locus assignment, but where there is assignment ambiguity, that ambiguity should remain in the presented data. Beyond Sequence Data. Extensible systems, such as HML and the IMGT/HLA XML formats, provide an ideal mechanism for communicating additional data. Such systems can be readily modified to suit the needs of particular projects and allow linkage to centrally stored and served data. Sequence origin, raw data, serological properties and even allele names could then be retrieved on demand, and data segregated according to the most suitable set of criteria, without affecting the basic format.
