
Considerations for Analyzing Targeted NGS Data Exome

Considerations for Analyzing Targeted NGS Data Exome. Tim Hague , CTO. Exome Analysis. 3 sets of full exome sequences for the same individual, targeted by 3 different kits One set had data problems because reads were from 2 different sequencers


Presentation Transcript


  1. Considerations for Analyzing Targeted NGS Data: Exome. Tim Hague, CTO

  2. Exome Analysis • 3 sets of full exome sequences for the same individual, targeted by 3 different kits • One set had data problems because reads were from 2 different sequencers • Remaining 2 sets were analyzed both by the customer and by Omixon

  3. Exome Targets • Illumina TruSeq ~62 Mbp • Nimblegen SeqCap EZ Exome ~64 Mbp • ~35 Mbp overlap between targets • Exons, ORFs and putative translated regions captured • 40M and 37M read pairs respectively, 101 bp read length
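
A figure like the ~35 Mbp overlap can be reproduced from the two kits' target files. The following is a minimal sketch, assuming the targets for one chromosome have already been read into lists of half-open `(start, end)` intervals; the function names and the toy coordinates are illustrative and not taken from the presentation.

```python
def merge(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def overlap_bp(a, b):
    """Count the bases covered by both interval lists (one chromosome)."""
    a, b = merge(a), merge(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


# Toy targets standing in for two capture kits on the same chromosome.
kit_a = [(100, 200), (300, 400)]
kit_b = [(150, 350)]
shared = overlap_bp(kit_a, kit_b)
```

Summing `overlap_bp` over all chromosomes would give the total shared target size for a pair of kits.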

  4. Full Analysis Pipelines • In this case we are comparing two full NGS analysis pipelines • Including the mapping/alignment and a multi-step variant call pipeline • The Omixon pipeline for this analysis uses two variant callers • The Omixon pipeline also uses recalibration and indel realignment

  5. Finding long indels 1.

  6. Better indel resolution 1.

  7. Better indel resolution 2.

  8. Indel Handling • If indels are important to an analysis then this needs to be taken into account, from the planning stage onwards • BWA does better when indel realignment is used, in combination with paired data
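
One way to check how an aligner is handling indels is to tally the insertion and deletion operations in the CIGAR strings of its alignments. This is a sketch only: the helper names and the 10 bp threshold for a "long" indel are assumptions made for illustration, not values from the presentation.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def indel_lengths(cigar):
    """Return (operation, length) for every insertion or deletion."""
    return [(op, int(n)) for n, op in CIGAR_OP.findall(cigar) if op in "ID"]


def has_long_indel(cigar, min_len=10):
    """True if any single indel is at least min_len bases (assumed cut-off)."""
    return any(n >= min_len for _, n in indel_lengths(cigar))


example = indel_lengths("40M25D61M")
```

Comparing the distribution of these lengths between two aligners on the same reads gives a rough picture of which one is resolving longer indels.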

  9. Fewer low-quality false positives

  10. Quality and Coverage • Some of these low quality variants can be removed by filtering, after variant call • Quality and coverage cut-offs have to be parameterized properly in the alignment and variant call • Quality recalibration can also help to reduce low quality false positives
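
Post-call filtering of this kind can be sketched as a check on the QUAL column and the `DP` entry of the INFO column of each VCF record. The cut-offs below (`min_qual=30`, `min_depth=10`) are placeholders chosen for illustration; suitable values depend on the kit, the coverage and the caller.

```python
def passes(vcf_line, min_qual=30.0, min_depth=10):
    """Keep a VCF record only if QUAL and INFO/DP meet the cut-offs."""
    fields = vcf_line.rstrip("\n").split("\t")
    qual = fields[5]  # QUAL is the sixth VCF column
    if qual == "." or float(qual) < min_qual:
        return False
    # INFO is the eighth column: semicolon-separated key=value pairs and flags.
    info = dict(kv.split("=", 1) for kv in fields[7].split(";") if "=" in kv)
    depth = info.get("DP")
    return depth is not None and int(depth) >= min_depth


good = "chr1\t12345\t.\tA\tG\t50.0\tPASS\tDP=25;AF=0.5"
low_qual = "chr1\t12400\t.\tC\tT\t12.0\tPASS\tDP=25"
low_depth = "chr1\t12500\t.\tG\tA\t80.0\tPASS\tDB;DP=4"
kept = [line for line in (good, low_qual, low_depth) if passes(line)]
```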

  11. Variations next to coding areas

  12. Splicing and Promoters • Most of the exon kits also provide variant calls close to the coding regions • These should be included in the analysis if possible
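
To keep calls that sit just outside the coding regions, each variant position can be labelled by whether it falls inside an exon, within a flanking window, or elsewhere. This is a sketch under assumptions: the 20 bp flank and the function name are illustrative, and the exon coordinates are half-open intervals on a single chromosome.

```python
def classify_position(pos, exons, flank=20):
    """Label a 0-based position as 'exon', 'flank' or 'off-target'."""
    for start, end in exons:
        if start <= pos < end:
            return "exon"
    for start, end in exons:
        if start - flank <= pos < end + flank:
            return "flank"
    return "off-target"


exons = [(1000, 1100)]
labels = [classify_position(p, exons) for p in (1050, 995, 1100, 900)]
```

Variants labelled `flank` are the ones most likely to affect splice sites, so a filter that only keeps `exon` calls would discard them.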

  13. Fewer false positives in complex regions 1.

  14. Fewer false positives in complex regions 2.

  15. Fewer false positives in complex regions 3.

  16. Fewer false positives in complex regions 4. Higher coverage.

  17. Fewer false positives in complex regions 5. Lower coverage.

  18. Complex regions • Mismappings due to pseudogenes or repeats, or just complex regions? • Sometimes more coverage can actually be bad • Need to watch out for non-specific read mappings (reads mapping to multiple places)
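
Non-specific mappings can often be spotted in the alignment file itself, because aligners such as BWA give a low mapping quality to reads that fit equally well in several places. The sketch below flags such reads from the FLAG and MAPQ columns of a SAM record; the `min_mapq=20` cut-off and the function name are assumptions for illustration.

```python
def is_unreliable(sam_line, min_mapq=20):
    """Flag reads that are unmapped, secondary, or have low mapping quality."""
    fields = sam_line.split("\t")
    flag = int(fields[1])   # bitwise FLAG, second SAM column
    mapq = int(fields[4])   # MAPQ, fifth SAM column
    unmapped = bool(flag & 0x4)
    secondary = bool(flag & 0x100)
    return unmapped or secondary or mapq < min_mapq


unique = "r1\t99\tchr1\t1000\t60\t101M\t=\t1200\t301\tACGT\tFFFF"
multi = "r2\t99\tchr1\t5000\t0\t101M\t=\t5200\t301\tACGT\tFFFF"
flags = [is_unreliable(unique), is_unreliable(multi)]
```

Counting the flagged reads per target region is one way to see whether extra coverage in a region is coming from reads that do not really belong there.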

  19. Regions where both aligners are confused 1.

  20. Regions where both aligners are confused 2.

  21. Very Complex Regions • Some regions are extremely difficult to map with any technique • A different approach may be required for mapping/alignment • A different approach may be required for variant calling (local de novo, phasing etc.)

  22. Problems with sex chromosomes • There are many heterozygous calls on the X and Y chromosomes that are certainly false positives or incorrect calls. • This is true for both pipelines; the read specificity and the variant call procedure have to be improved for these chromosomes.
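
Suspect calls of this kind can be listed by checking the genotype of every variant on X and Y. The sketch below assumes a male sample, where heterozygous calls outside the pseudoautosomal regions are not expected; the presentation does not state the sex of the individual. The function names are illustrative, and the pseudoautosomal intervals have to be supplied by the caller for the reference build in use.

```python
def is_het(gt):
    """True for a diploid genotype with two different, non-missing alleles."""
    alleles = gt.replace("|", "/").split("/")
    return len(alleles) == 2 and "." not in alleles and alleles[0] != alleles[1]


def suspicious_het(chrom, pos, gt, par_regions=()):
    """Flag heterozygous calls on X or Y outside caller-supplied PAR intervals."""
    name = chrom[3:] if chrom.startswith("chr") else chrom
    if name not in ("X", "Y"):
        return False
    # par_regions: iterable of (chrom, start, end), half-open, same naming as chrom.
    if any(c == chrom and s <= pos < e for c, s, e in par_regions):
        return False
    return is_het(gt)


# Hypothetical PAR interval, used only to exercise the function.
demo_par = [("chrX", 0, 1000)]
results = [
    suspicious_het("chrX", 5000, "0/1", demo_par),
    suspicious_het("chrX", 500, "0/1", demo_par),
    suspicious_het("chrX", 5000, "1/1", demo_par),
]
```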

  23. Summary • These kinds of comparative studies can be useful in analyzing the effectiveness of exome sequencing • Different exome kits can give different results • The data analysis and variant call tools chosen for the analysis can also have a big impact • There is some potential to improve the quality of the customer's exome analysis pipeline
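
A comparison between two pipelines can start from a simple concordance count over their call sets. The sketch below keys each VCF record on chromosome, position, reference allele and alternate allele; this is an assumption that works for SNVs but can undercount shared indels, since the same indel may be written at different positions by different callers unless the calls are normalized first.

```python
def variant_key(vcf_line):
    """Identify a call by (CHROM, POS, REF, ALT)."""
    f = vcf_line.split("\t")
    return (f[0], int(f[1]), f[3], f[4])


def compare(calls_a, calls_b):
    """Count calls shared by both pipelines and calls unique to each."""
    a = set(map(variant_key, calls_a))
    b = set(map(variant_key, calls_b))
    return {"shared": len(a & b), "only_a": len(a - b), "only_b": len(b - a)}


pipeline_a = ["chr1\t100\t.\tA\tG\t50\tPASS\t.", "chr1\t200\t.\tC\tT\t40\tPASS\t."]
pipeline_b = ["chr1\t100\t.\tA\tG\t35\tPASS\t.", "chr2\t300\t.\tG\tA\t60\tPASS\t."]
summary = compare(pipeline_a, pipeline_b)
```

The calls unique to one pipeline are the ones worth inspecting by eye in a genome browser.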
