Illu_SNV_analysis_Pipeline

Illu_SNV_analysis_Pipeline. Shiyi.Z. Diagram of analysis process: http://www.broadinstitute.org/gatk/guide/best-practices. Data Pre-processing: NGSQCToolkit filters the raw data and generates a QC report. Bowtie2 maps the filtered data to the reference; samtools converts the alignments and marks duplicates.

Presentation Transcript


  1. Illu_SNV_analysis_Pipeline Shiyi.Z

  2. Diagram of analysis process http://www.broadinstitute.org/gatk/guide/best-practices

  3. Data Pre-processing
     - NGSQCToolkit filters the raw data and generates a QC report.
     - Bowtie2 maps the filtered data to the reference; samtools converts the alignments and marks duplicates.
     - GATK realigns the sequencing reads around INDELs.
     - GATK and Samtools each perform variant and INDEL calling: -o sample.gatk.raw1.vcf and -o sample.samtools.raw1.vcf
     - Consolidate and filter the variants: -o sample.concordance.raw1.vcf
       Filter: QD < 20; ReadPosRankSum < -8; FS > 10; QUAL < $MEANQUAL -o sample.concordance.flt1.vcf
     - Correct the aligned file based on the filtered variant report: -o sample.recal.bam
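The filter expressions on the slide above can be illustrated with a short Python sketch. It is not the pipeline's actual implementation, and it rests on several assumptions that the slide does not state: that each expression flags a variant for exclusion, that `$MEANQUAL` means the mean QUAL over the call set, and that a variant with a missing annotation does not trigger that expression. The function names and the sample records are hypothetical.

```python
from statistics import mean


def parse_info(info_field):
    """Turn a VCF INFO string such as 'QD=25.1;FS=3.2;DB' into a dict."""
    info = {}
    for entry in info_field.split(";"):
        if "=" in entry:
            key, value = entry.split("=", 1)
            info[key] = value
        elif entry:
            info[entry] = None  # flag-style entry with no value
    return info


def failed_filters(qual, info, mean_qual):
    """List which of the slide's four expressions a variant triggers."""

    def number(key):
        try:
            return float(info[key])
        except (KeyError, TypeError, ValueError):
            return None  # assumption: a missing annotation triggers nothing

    qd = number("QD")
    rank_sum = number("ReadPosRankSum")
    fs = number("FS")

    failed = []
    if qd is not None and qd < 20:
        failed.append("QD")
    if rank_sum is not None and rank_sum < -8:
        failed.append("ReadPosRankSum")
    if fs is not None and fs > 10:
        failed.append("FS")
    if qual < mean_qual:
        failed.append("QUAL")
    return failed


# Hypothetical records as (QUAL, INFO string); not taken from real data.
records = [
    (120.0, "QD=25.1;FS=3.2;ReadPosRankSum=-1.0;DB"),
    (30.0, "QD=5.0;FS=30.0"),
    (90.0, "QD=22.0;FS=1.0;ReadPosRankSum=-9.5"),
]
mean_qual = mean(q for q, _ in records)  # assumed meaning of $MEANQUAL
passing = [r for r in records
           if not failed_filters(r[0], parse_info(r[1]), mean_qual)]
```

Returning the names of the triggered expressions, rather than a single boolean, makes it easier to see why a given variant was dropped from the filtered set.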

  4. Variant Discovery
     - Patient: patient.concordance.filter1.vcf
     - Father: father.concordance.filter1.vcf
     - Mother: mother.concordance.filter1.vcf
     - Control: control.concordance.filter1.vcf
     - Based on the previous *.filter1.vcf variant file, correct the aligned file and generate the sample.recal.bam file.
     - Using GATK and Samtools, call variants again and generate the sample.final.vcf files.

  5. Preliminary Analysis
     - Submit the VCF file to the wANNOVAR website (http://wannovar.usc.edu/).
     - Do the annotation using ANNOVAR.
     - Variation prioritization:
       - Prioritization by ANNOVAR: annotate_variation.pl -filter --dbtype generic --genericdbfile hg18_avsift.txt --score_threshold 0.05 ex1.human humandb/
       - Using Excel: open the file in Excel 2007 (select "tab-delimited" when opening the file). Click the "DATA" tab in the menu bar, then click the large "Filter" button. Then click any of the column headings, such as 1000G_CEU or SIFT, and filter out variants by ticking the check boxes. For the SIFT score, make sure to use "less than 0.05 OR equal to (blank)" so that variants without a SIFT score are not filtered out. This should be straightforward, but it may take a little practice for users not familiar with Excel.
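The Excel rule described above ("less than 0.05 OR equal to (blank)") can also be expressed as a small Python sketch for readers who prefer a script to a spreadsheet. It assumes a tab-delimited file with a column named `SIFT`, as on the slide; the `Gene` column, the sample rows, and the function name `keep_by_sift` are hypothetical.

```python
import csv
import io


def keep_by_sift(rows, column="SIFT", threshold=0.05):
    """Keep rows whose SIFT score is below the threshold or is blank."""
    kept = []
    for row in rows:
        value = (row.get(column) or "").strip()
        # A blank score is kept, so unscored variants are not lost.
        if value == "" or float(value) < threshold:
            kept.append(row)
    return kept


# Hypothetical tab-delimited annotation export; names and scores are made up.
SAMPLE = (
    "Gene\tSIFT\n"
    "GENE_A\t0.01\n"
    "GENE_B\t0.40\n"
    "GENE_C\t\n"
    "GENE_D\t0.05\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter="\t"))
kept = keep_by_sift(rows)
```

Here `GENE_D` is dropped because the comparison is strictly "less than 0.05", matching the wording of the Excel filter.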

  6. ANNOVAR analysis pipeline
     - First: remove variations detected in the Control.
     - Second: genetic mode analysis.
     - Third: filter by these parameters: SIFT less than 0.05; PolyPhen2_HDIV greater than 0.909; PolyPhen2_HVAR greater than 0.909.
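The three steps above might be sketched in Python as follows. This is an illustration under stated assumptions, not the author's method: variants are represented as hypothetical `(chrom, pos, ref, alt)` tuples, the "genetic mode analysis" is shown only for a de novo model (present in the patient, absent in both parents), which the slide does not specify, and the three score thresholds are assumed to apply together. All function names and data are made up.

```python
def remove_control(sample_variants, control_variants):
    """Step 1: drop every variant that was also detected in the control."""
    return set(sample_variants) - set(control_variants)


def de_novo_candidates(patient, father, mother):
    """Step 2 (assumed de novo model): in the patient, in neither parent."""
    return set(patient) - set(father) - set(mother)


def passes_scores(sift, hdiv, hvar):
    """Step 3: thresholds from the slide, assumed to be combined with AND."""
    return sift < 0.05 and hdiv > 0.909 and hvar > 0.909


# Hypothetical variants keyed as (chrom, pos, ref, alt).
patient = {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"), ("chr3", 300, "G", "A")}
father = {("chr2", 200, "C", "T")}
mother = set()
control = {("chr3", 300, "G", "A")}

# Hypothetical (SIFT, PolyPhen2_HDIV, PolyPhen2_HVAR) scores per variant.
scores = {("chr1", 100, "A", "G"): (0.01, 0.95, 0.93)}

step1 = remove_control(patient, control)
step2 = de_novo_candidates(step1, father, mother)
step3 = {v for v in step2 if passes_scores(*scores[v])}
```

Other inheritance models, such as recessive ones, would need genotype information rather than simple presence or absence, so they are left out of this sketch.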
