Update on an investigation of somatic mutations in eMERGE datasets

Update on an investigation of somatic mutations in eMERGE datasets. Ken Kaufman, CCHMC, 6-20-19. Somatic Mutations. RNA sequence analysis reveals macroscopic somatic clonal expansion across normal tissues. Keren Yizhak, François Aguet, Jaegil Kim, et al. Science 07 Jun 2019:

Presentation Transcript


  1. Update on an investigation of somatic mutations in eMERGE datasets. Ken Kaufman, CCHMC, 6-20-19

  2. Somatic Mutations
     • "RNA sequence analysis reveals macroscopic somatic clonal expansion across normal tissues." Keren Yizhak, François Aguet, Jaegil Kim, et al. Science, 07 Jun 2019: Vol. 364, Issue 6444, eaaw0726

  3. Mutations: Germline vs. Somatic [diagram of +/+ and +/- genotypes; labeled fractions: 50%, 25%]

  4. Germ Line

  5. Somatic

  6. Somatic Mutation Pipeline
     • Samtools pileup
       • Ref and alt alleles detected
       • Each base sequenced
     • Calculate alt allele ratio (alt/depth)
     • Filter
       • Ratio 1% to 30%
       • Amino-acid altering
       • Depth of 10
       • Not in duplicated regions
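
The filter step on this slide could be sketched roughly as follows. This is a minimal illustration, not the presenter's actual code: the `Site` record, its field names, and the function names are hypothetical, "Depth of 10" is assumed to mean a minimum depth of 10 reads, and the amino-acid and duplicated-region flags are assumed to come from separate annotation steps that the slide does not describe.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """One sequenced position (hypothetical record; field names are assumptions)."""
    depth: int                   # total reads covering the position
    alt_count: int               # reads supporting the alternate allele
    amino_acid_altering: bool    # assumed to come from an annotation step
    in_duplicated_region: bool   # assumed to come from a region lookup


def alt_ratio(site: Site) -> float:
    """Alt allele ratio as on the slide: alt reads divided by depth."""
    return site.alt_count / site.depth if site.depth else 0.0


def is_candidate(site: Site,
                 min_ratio: float = 0.01,
                 max_ratio: float = 0.30,
                 min_depth: int = 10) -> bool:
    """Apply the four slide filters; thresholds default to the slide's values."""
    return (
        site.depth >= min_depth                          # assumed: depth of at least 10
        and min_ratio <= alt_ratio(site) <= max_ratio    # ratio 1% to 30%
        and site.amino_acid_altering                     # amino-acid altering
        and not site.in_duplicated_region                # not in duplicated regions
    )


if __name__ == "__main__":
    mosaic_like = Site(depth=100, alt_count=20,
                       amino_acid_altering=True, in_duplicated_region=False)
    het_like = Site(depth=100, alt_count=50,
                    amino_acid_altering=True, in_duplicated_region=False)
    print(is_candidate(mosaic_like), is_candidate(het_like))
```

A site with an alt ratio near 50% looks like an ordinary heterozygous germline call, so the upper bound of 30% excludes it, while the lower bound of 1% presumably screens out sequencing noise.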

  7. Samples processed
     • eMERGE 3 set A
       • 16,170 samples screened
       • 801 candidate somatic mutations in 773 samples
     • PGX: ~10,000 samples processed
       • Initially 2,798 samples, 4,403 variations
       • Filtered
         • 555 candidate somatic mutations in 541 samples
         • 66 samples with 2 or more candidates (5 highest)
         • 58 candidates found in 2 or more samples (4 highest)
         • 419 candidates found in 1 sample
         • 252 of the 555 candidate somatic mutations are found in the ExAC database
           • MAF 0.009 to 8.2 × 10⁻⁶

  8. Characteristics of Somatic Mutations
     • 178 genes
     • 488 type A variants
       • 483 non-synonymous
       • 2 insertions
       • 3 deletions
     • 61 type B (loss-of-function) variants
       • 24 stop-gain
       • 1 stop-loss
       • 21 frameshift
       • 3 initiation codon
       • 12 splicing

  9. Functional Predictions

  10. Genes with Most Somatic Mutations

  11. GATK [figure; values shown: 150, 555, 75]

  12. Alt allele ratio [plot; axis labels: Number of Variants, Average Ratio]

  13. Validation
      • Obtained DNA for 11 samples from Vanderbilt and Northwestern (thank you very much!)
      • Sanger sequencing of PCR-amplified product (with controls)
      • Real-time PCR
      • Digital droplet PCR

  14. Validation Results

  15. Validation
      • PGX
        • 9,142 samples screened (6.1%)
        • 555 candidate somatic mutations in 541 samples
      • eMERGE 3 set A
        • 16,170 samples screened (5.0%)
        • 801 candidate somatic mutations in 773 samples
      • eMERGE 3 set B
        • Downloading BAM files from DNAnexus (in progress)
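
The slide does not say how the percentages in parentheses were derived. One reading that reproduces both figures is candidate somatic mutations divided by samples screened; the short check below works through that arithmetic. The interpretation and the names `cohorts` and `candidate_rate_percent` are assumptions made for illustration.

```python
# Counts copied from the slide; the rate definition is an assumption.
cohorts = {
    "PGX": {"candidates": 555, "samples_screened": 9142},
    "eMERGE 3 set A": {"candidates": 801, "samples_screened": 16170},
}


def candidate_rate_percent(candidates: int, samples_screened: int) -> float:
    """Candidate somatic mutations per 100 samples screened."""
    return 100.0 * candidates / samples_screened


if __name__ == "__main__":
    for name, counts in cohorts.items():
        rate = candidate_rate_percent(**counts)
        print(f"{name}: {rate:.1f}%")
    # prints:
    # PGX: 6.1%
    # eMERGE 3 set A: 5.0%
```

Dividing by the number of samples carrying a candidate instead (541 and 773) gives about 5.9% and 4.8%, which does not match the slide, so the per-candidate reading seems the more likely one.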

  16. Validation Strategy (iGENOMX Riptide) [diagram; labels: SNP, Adapter Sequence, biotinylated ddNTP]

  17. Strand-displacing Extension [diagram; labels: SNP, Adapter, Barcode, Random Sequence]

  18. Low-cycle PCR [diagram; label: SNP]

  19. Sequencing
      • Samples from 96 to 960
      • Sequencers
        • MiSeq, 25M reads
          • 96 samples
          • 96 targets
          • ~270X coverage
        • HiSeq, 900M reads (3 lanes)
          • 960 samples
          • 960 targets
          • ~100X coverage

  20. Sample Status

  21. Current Status
      • Process remaining eMERGE 3 data set
      • Obtain samples for validation
      • Finalize validation strategy
      • Contact Ken Kaufman (Kenneth.Kaufman@cchmc.org) or Paul Gecaine (Paul.Gecaine@cchmc.org) to participate

  22. Acknowledgements
      • DNAnexus
        • Andrew Carroll
        • John Didion
      • eMERGE consortium
        • Vanderbilt
        • Northwestern
        • University of Washington
        • Baylor (Richard Gibbs group)
      • CCHMC
        • John Harley
        • Scott Richards
        • Paul Gecaine
        • Beth Cobb
        • Cindy Prows
        • Bahram Namjou-Khales
      • Contact Ken Kaufman (Kenneth.Kaufman@cchmc.org) or Paul Gecaine (Paul.Gecaine@cchmc.org) to participate

  23. Strategy
      • VCF files only have data where a variant was called.
      • BAM files have data at every position sequenced.
      • Ratio of ref to alternate allele is skewed.
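
Because a BAM file records reads at every sequenced position, ref and alt read counts can be tallied directly from a pileup instead of relying on called variants. The sketch below counts bases in the read-bases column of the samtools text pileup format. It is an illustration only: the slides do not show the parsing code, the function names are hypothetical, and the example string is made up.

```python
from collections import Counter


def count_pileup_bases(read_bases: str) -> Counter:
    """Tally ref matches and mismatching bases in a pileup read-bases string.

    Handles the usual markers of the samtools text pileup format:
    '.' and ',' are reference matches, letters are mismatches, '^' plus one
    character marks a read start, '$' a read end, and '+N...' / '-N...'
    describe indels that follow a base. Other symbols are skipped.
    """
    counts = Counter()
    i, n = 0, len(read_bases)
    while i < n:
        ch = read_bases[i]
        if ch == "^":
            i += 2                      # skip the marker and the mapping-quality char
        elif ch in "+-":
            i += 1
            start = i
            while i < n and read_bases[i].isdigit():
                i += 1
            i += int(read_bases[start:i])   # skip the inserted/deleted sequence
        elif ch in ".,":
            counts["ref"] += 1
            i += 1
        elif ch.upper() in "ACGTN":
            counts[ch.upper()] += 1     # merge forward- and reverse-strand calls
            i += 1
        else:
            i += 1                      # '$', '*', '#', '<', '>' carry no base call
    return counts


def alt_fraction(counts: Counter, alt: str) -> float:
    """Fraction of counted bases that support the given alternate allele."""
    total = sum(counts.values())
    return counts[alt] / total if total else 0.0


if __name__ == "__main__":
    counts = count_pileup_bases("..,,^F.A$a+2AG.-1t,")   # made-up example
    print(counts["ref"], counts["A"], round(alt_fraction(counts, "A"), 3))
```

A heterozygous germline variant would be expected near a fraction of 0.5, so a clearly lower value at adequate depth is the kind of skew this slide refers to.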

  24. Comparison with Other Programs
      • MosaicHunter
        • Tested against candidate samples
        • 56 somatic candidates
        • 3 overlapped
      • SomaticSniper
        • Normal vs. tumor
        • Failed in our application
      • Most somatic mutation detection approaches require optimization for each data set.