
Rare-Allele Detection Using Compressed Se(que)nsing

Noam Shental, Department of Computer Science, The Open University of Israel (shental@openu.ac.il). Or Zuk, Broad Institute of MIT and Harvard. In collaboration with: Amnon Amir.


Presentation Transcript


  1. Rare-Allele Detection Using Compressed Se(que)nsing Noam Shental Department of Computer Science, The Open University of Israel shental@openu.ac.il

  2. Rare-Allele Detection Using Compressed Se(que)nsing. Or Zuk, Broad Institute of MIT and Harvard. In collaboration with: Amnon Amir, Department of Physics of Complex Systems, Weizmann Institute of Science; Noam Shental, Department of Computer Science, The Open University of Israel.

  3. Rare recessive genetic diseases. Genotype and phenotype: Normal is Healthy; Carrier is Healthy!; Affected is Sick.

  4. Nationwide carrier screen

  5. Large scale carrier screen (rates vary across ethnic groups)

  6. Published Genome-Wide Associations through 12/2009: 658 published GWA at p < 5×10^-8 [NHGRI GWAS Catalog, www.genome.gov/GWAStudies]

  7. What Associations are Detected? [T.A. Manolio et al. Nature 2009]

  8. Specific mutations: HEXA gene on chromosome 15; over 100 mutations are known.

  9. Specific mutations - notation. Reference genome: …AGCGTTCT… (“A”). Single-nucleotide polymorphism (SNP): …AGTGTTCT… (“B”). Insertion/deletion (InDel): …AGGTTCT (“B”). Carrier test screen: amplify a sample of DNA and then test it. The fraction of B’s out of tested alleles is 0 for “AA” and 1/2 for “AB”.

  10. Naïve approach: one test per individual. Collect DNA samples and apply 9 independent tests. Genotypes in the example: seven AA, two AB. Fraction of B’s out of tested alleles, per test: 0 0 0 0 1/2 0 0 0 1/2

  11. Compressed sensing based group testing: measure the fraction of B’s in pooled samples with Next Generation Sequencing Technology, then infer/reconstruct the carriers by compressed sensing, using a few tests instead of 9.
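
A minimal noiseless sketch of the pooling idea, for illustration only. The pool design (`pools`), the function names, and the exhaustive search are assumptions, not taken from the slides: the search simply enumerates every genotype vector with entries in {0, 1, 2} and keeps the sparsest ones consistent with the pooled fractions. It stands in for a real compressed sensing solver (which would typically use an L1-type relaxation) and is only feasible for very small cohorts. The hand-picked design happens to identify this particular example with 6 pools; it is not a general-purpose design.

```python
from fractions import Fraction
from itertools import product

def measure(pools, x):
    """Ideal fraction of B alleles in each pool (2 alleles per individual)."""
    return [Fraction(sum(x[i] for i in pool), 2 * len(pool)) for pool in pools]

def sparsest_consistent(pools, y, n):
    """Brute force: all x in {0,1,2}^n matching y, keeping the fewest nonzeros.

    Assumes noiseless measurements, so at least one x is consistent.
    """
    consistent = [x for x in product((0, 1, 2), repeat=n)
                  if measure(pools, x) == y]
    fewest = min(sum(v != 0 for v in x) for x in consistent)
    return [list(x) for x in consistent if sum(v != 0 for v in x) == fewest]

# Hypothetical design: 9 individuals on a 3x3 grid, pooled by
# three rows, two columns, and the main diagonal (6 pools, not 9 tests).
pools = [
    [0, 1, 2], [3, 4, 5], [6, 7, 8],   # rows
    [0, 3, 6], [1, 4, 7],              # two columns
    [0, 4, 8],                         # main diagonal
]

# Two heterozygous carriers (AB = 1 copy of B), seven AA individuals.
x_true = [0, 0, 0, 0, 1, 0, 0, 0, 1]
y = measure(pools, x_true)
recovered = sparsest_consistent(pools, y, n=9)
print(recovered)
```

With real sequencing data the equality test would have to be replaced by a noise-tolerant fit, since the measured fractions are only estimates.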

  12. Example: arXiv:0909.0400v1

  13. Rare allele identification in a CS framework # rare alleles individuals in the pool

  14. Measuring device: NGST. Platforms: Roche/454, Illumina Solexa, Helicos, Applied Biosystems SOLiD.

  15. NGST output: “reads” (each line = one “read”). Illumina: a few million reads per lane. 454: almost 1 million.

  16. NGST – targeted sequencing We measure the number of reads containing B out of the total number of reads.

  17. Model formulation. Ideal measurement: the fraction of “B” reads. NGST measurement: 1. Sampling noise: a finite number of reads, r, is obtained from each site, and the frequency is estimated from those reads; r is itself a random variable. 2. Technical errors: read errors (0.5-1%) and DNA preparation errors. Parts of this modeling appeared in P. Prabhu & I. Pe’er, Genome Research, July 2009.
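
A rough simulation sketch of the two noise sources named on this slide. Everything specific here is an assumption for illustration: the function name, the symmetric read-error model (each read is flipped between A and B with probability `read_error`), and the fixed read count (the slide notes that r is itself random, which this sketch ignores). DNA preparation errors are not modeled.

```python
import random

def simulate_estimate(true_fraction, reads, read_error=0.01, seed=0):
    """Estimate the B fraction of a pool from a finite number of noisy reads.

    true_fraction: ideal fraction of B alleles in the pool
    reads:         number of reads covering the site (held fixed here)
    read_error:    assumed probability that a read reports the wrong allele
    """
    rng = random.Random(seed)
    # Probability that a single read is reported as B under the assumed
    # symmetric error model: true B read kept, or true A read flipped.
    p_b = true_fraction * (1 - read_error) + (1 - true_fraction) * read_error
    b_reads = sum(rng.random() < p_b for _ in range(reads))
    return b_reads / reads

# One carrier in a pool of 3 individuals gives an ideal fraction of 1/6.
estimate = simulate_estimate(1 / 6, reads=100_000, read_error=0.01, seed=1)
print(estimate)
```

Under these assumptions the estimate concentrates around `p_b` rather than the ideal fraction, which is one way read errors bias the measurement even at high coverage.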

  18. Unique properties of this application: 1. Measurement noise is pool dependent. 2. The sensing matrix is known only up to noise: DNA preparation errors, potential technical problems. 3. Potential constraints on the matrix M: sparseness; total amount of DNA.

  19. Current work: Dor Yeshorim. In collaboration with Y. Erlich, CSHL. 8000 DNA samples.

  20. Conclusions • Generic approach that puts together sequencing and CS for identifying rare allele carriers. • The method naturally deals with all possible scenarios of multiple carriers and heterozygous or homozygous rare alleles. • Much higher efficiency over the naive approach. • Direction for improvement: • x is trinary (0,1,2): how does one incorporate this into optimization? • Dependence among loci

  21. Related Work • Erlich et al.

  22. Other Applications • Compressed Sensing / Sparse Reconstruction can be used for other problems in genomics. • Other problems: • Bacterial Community Reconstruction • Direction for improvement: • x is trinary (0,1,2): how does one incorporate this into optimization? • Dependence among loci
