Pindel user manual
Pindel user manual Kai Ye k.ye@lumc.nl
Preparation of Pindel input
• Alignment BAM file generated by BWA: (1) bam2pindel.pl with Adaptor.pm
• Alignment BAM file generated by other aligners: (2) sam2pindel.cpp
• Either route produces Pindel input with sample tag
• (3) FilterPindelReads.cpp produces filtered Pindel input with sample tag
• Merge Pindel input files for paired or population sequence data
(1) bam2pindel.pl
• Written by Keiran Raine at the Sanger Institute (kr2@sanger.ac.uk)
• Designed for BWA-based BAM/SAM Illumina data
• You must prepare a name-sorted BAM file
• Set the BAM_2_PINDEL_ADAPT environment variable:
    setenv BAM_2_PINDEL_ADAPT <path to Adaptor.pm>
• Arguments:
    -i|input: Input BAM file (req)
    -o|output: Output ready for pindel
    -s|sample: Sample or label (sampA,sampB...) (req)
    -pi|insert: Required if BAM file does not have PI tag in header RG record
    -r|restrict: Restrict to chromosome xx
• Example:
    ./bam2pindel_bwa.pl -i NameSorted.bam -o output_prefix -s tumour -om -pi 300
(2) sam2pindel.cpp
• Written by Kai Ye at Leiden University Medical Center (k.ye@lumc.nl)
• Designed for all BAM/SAM Illumina data
• You must first compile the C++ source code:
    g++ sam2pindel.cpp -o sam2pindel -O3
• sam2pindel requires five arguments:
    1. Input SAM file
    2. Output file for Pindel
    3. Insert size
    4. Tag
    5. Number of extra lines (not starting with @) at the beginning of the file
• If you start with a standard SAM file (Input.sam with insert size 300):
    ./sam2pindel Input.sam Output4Pindel.txt 300 tumour 0
• If you start with a BAM file:
    ./samtools view Input.bam | ./sam2pindel - Output4Pindel.txt 300 tumour 0
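The BAM route above pipes `samtools view` into `sam2pindel`. As a minimal sketch, assuming both binaries sit in the working directory as in the manual's examples, the same pipeline could be driven from Python. The helper names `sam2pindel_argv` and `convert_bam` are hypothetical and are not part of Pindel.

```python
import subprocess


def sam2pindel_argv(sam_input, output, insert_size, tag,
                    extra_lines=0, binary="./sam2pindel"):
    """Build the five-argument sam2pindel command line (hypothetical helper)."""
    if insert_size <= 0:
        raise ValueError("insert size must be positive")
    if extra_lines < 0:
        raise ValueError("extra line count cannot be negative")
    # Argument order follows the manual: input, output, insert size, tag, extra lines.
    return [binary, sam_input, output, str(insert_size), tag, str(extra_lines)]


def convert_bam(bam_path, output, insert_size, tag, samtools="./samtools"):
    """Stream a BAM file through 'samtools view' into sam2pindel reading from '-'."""
    view = subprocess.Popen([samtools, "view", bam_path], stdout=subprocess.PIPE)
    try:
        subprocess.run(sam2pindel_argv("-", output, insert_size, tag),
                       stdin=view.stdout, check=True)
    finally:
        view.stdout.close()
        view.wait()
    if view.returncode != 0:
        raise subprocess.CalledProcessError(view.returncode, view.args)


if __name__ == "__main__":
    # Mirrors the manual's SAM example; prints the command instead of running it.
    print(" ".join(sam2pindel_argv("Input.sam", "Output4Pindel.txt", 300, "tumour")))
```

Keeping the argument list in one place makes it harder to swap the insert size and the tag by accident, since sam2pindel takes all five values positionally.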
Running Pindel
Pindel takes five arguments:
1. Input: the reference genome sequences in FASTA format
2. Input: the unmapped reads in a modified FASTQ format
3. Output folder
4. Which chr/fragment
5. BreakDancer result. Format per line: ChrA LocA stringA ChrB LocB stringB others. If you don't have a BreakDancer result, provide an empty file here.
Example:
    ./pindel hg19.fa pindel_input_chr1.txt Output_Folder chr1 empty
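The invocation above can be wrapped in a small helper. This is a sketch only: `pindel_argv` is a hypothetical name, and creating a zero-byte placeholder is one way to satisfy the manual's request for an empty file when no BreakDancer result exists.

```python
from pathlib import Path


def pindel_argv(reference, reads, output_dir, chrom, breakdancer,
                binary="./pindel"):
    """Return the five-argument Pindel command line described in the manual.

    If the BreakDancer file does not exist, an empty placeholder is created,
    because the manual asks for an empty file when no result is available.
    """
    bd_path = Path(breakdancer)
    if not bd_path.exists():
        bd_path.touch()  # zero-byte placeholder
    return [binary, str(reference), str(reads), str(output_dir), chrom, str(bd_path)]


if __name__ == "__main__":
    argv = pindel_argv("hg19.fa", "pindel_input_chr1.txt",
                       "Output_Folder", "chr1", "empty")
    print(" ".join(argv))
    # To launch Pindel for real: import subprocess; subprocess.run(argv, check=True)
```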
Input format of Pindel
    @9113
    TGGGGACCGGTGGAATGCTTCCACTGGCTGGGGGGC
    + chr2 41149518 50 Tumor
The last line gives the strand, chr, 3’ coordinate and mapping quality of the mapped read, followed by the sample tag.
(Figure labels: ref, Anchor)
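As an illustration of the record above, the following sketch parses one Pindel input record. It assumes a three-line layout (read name, sequence, then a line holding strand, chromosome, 3’ coordinate, mapping quality and sample tag), which is inferred from this single example; other Pindel versions may use a different field layout. `PindelRead` and `parse_record` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class PindelRead:
    name: str
    sequence: str
    strand: str
    chrom: str
    position: int  # 3' coordinate of the mapped read
    mapq: int
    tag: str       # sample tag


def parse_record(lines):
    """Parse one three-line record (layout assumed from the manual's example)."""
    header, sequence, anchor = (line.strip() for line in lines)
    if not header.startswith("@"):
        raise ValueError("record must start with '@'")
    strand, chrom, position, mapq, tag = anchor.split()
    return PindelRead(header[1:], sequence, strand, chrom,
                      int(position), int(mapq), tag)


EXAMPLE = [
    "@9113",
    "TGGGGACCGGTGGAATGCTTCCACTGGCTGGGGGGC",
    "+ chr2 41149518 50 Tumor",
]

if __name__ == "__main__":
    print(parse_record(EXAMPLE))
```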
Output format: deletions
D 321 ChrID 0 56173880 56174202 Supports: 15 70 130.916
TAAGAATGAGTTGGCAAATAAAGAGTTTGGTGAGTTTATAGAAATATAGGggccg<311>ataggACAAGGTACAAGGAATGGCTGAAGGAGAGAGGTTG
GAGTTTATAGAAATATAGG ACAAGGTACAAGGAATG + 56173670 normal
GTGAGTTTATAGAAATATAGG ACAAGGTACAAGGAA + 56173677 normal
GAGTTTATAGAAATATAGG ACAAGGTACAAGGAATG + 56173681 normal
TGGTGAGTTTATAGAAATATAGG ACAAGGTACAAGG + 56173687 normal
GAGTTTATAGAAATATAGG ACAAGGTACAAGGAATG - 56173690 normal
GTGAGTTTATAGAAATATAGG ACAAGGTACAAGGAA - 56173695 normal
AGTTTGGTGAGTTTATAGAAATATAGG ACAAGGTACAAGGA - 56173697 normal
GTGAGTTTATAGAAATATAGG ACAAGGTACAAGGAA + 56173700 tumor
AGTTTATAGAAATATAGG ACAAGGTACAAGGAATGG + 56173710 tumor
TTTGGTGAGTTTATAGAAATATAGG ACAAGGTACAA + 56174339 tumor
TGAGTTTATAGAAATATAGG ACAAGGTACAAGGAATG + 56174356 tumor
TGAGTTTATAGAAATATAGG ACAAGGTACAAGGAAT - 56174357 tumor
GTTTATAGAAATATAGG ACAAGGTACAAGGAATGGC - 56174358 tumor
GAGTTTATAGAAATATAGG ACAAGGTACAAGGAATG - 56174365 tumor
AGTTTATAGAAATATAGG ACAAGGTACAAGGAATGG - 56174373 tumor
1 base - 1 million bases
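The supporting reads in this deletion record each end with a strand, a position and a sample label. As a rough sketch, assuming those last three whitespace-separated fields are always present, the reads can be tallied per sample and per strand and checked against the `Supports: 15` value in the header. `summarize` is a hypothetical helper, not a Pindel tool.

```python
from collections import Counter

# Supporting-read lines copied from the deletion example:
# left part, right part, strand, position, sample.
READ_LINES = """\
GAGTTTATAGAAATATAGG ACAAGGTACAAGGAATG + 56173670 normal
GTGAGTTTATAGAAATATAGG ACAAGGTACAAGGAA + 56173677 normal
GAGTTTATAGAAATATAGG ACAAGGTACAAGGAATG + 56173681 normal
TGGTGAGTTTATAGAAATATAGG ACAAGGTACAAGG + 56173687 normal
GAGTTTATAGAAATATAGG ACAAGGTACAAGGAATG - 56173690 normal
GTGAGTTTATAGAAATATAGG ACAAGGTACAAGGAA - 56173695 normal
AGTTTGGTGAGTTTATAGAAATATAGG ACAAGGTACAAGGA - 56173697 normal
GTGAGTTTATAGAAATATAGG ACAAGGTACAAGGAA + 56173700 tumor
AGTTTATAGAAATATAGG ACAAGGTACAAGGAATGG + 56173710 tumor
TTTGGTGAGTTTATAGAAATATAGG ACAAGGTACAA + 56174339 tumor
TGAGTTTATAGAAATATAGG ACAAGGTACAAGGAATG + 56174356 tumor
TGAGTTTATAGAAATATAGG ACAAGGTACAAGGAAT - 56174357 tumor
GTTTATAGAAATATAGG ACAAGGTACAAGGAATGGC - 56174358 tumor
GAGTTTATAGAAATATAGG ACAAGGTACAAGGAATG - 56174365 tumor
AGTTTATAGAAATATAGG ACAAGGTACAAGGAATGG - 56174373 tumor
""".splitlines()


def summarize(read_lines):
    """Count supporting reads per sample and per strand."""
    by_sample, by_strand = Counter(), Counter()
    for line in read_lines:
        # Only the last three fields are used; the sequence parts are ignored.
        *_, strand, _position, sample = line.split()
        by_sample[sample] += 1
        by_strand[strand] += 1
    return by_sample, by_strand


if __name__ == "__main__":
    samples, strands = summarize(READ_LINES)
    print(dict(samples), dict(strands), sum(samples.values()))
```

A per-sample tally of this kind is useful for paired data, where a call supported by both normal and tumor reads, as here, is unlikely to be tumor-specific.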
Allow mismatches to accommodate sequence errors and SNPs
D 10 ChrID 13 BP 32913041 32913052
AAATCAACTAGTGACCTTCCAGGGACAACCCGAACGTGATGAAAAGATCAaagaacctacTCTATTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGACAAAGT
GATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGACAA
CAACCCGAACGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGA
CGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGA
CGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGA
TGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGACAAAG
GTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGAC
TAGTGACCTTCCAGGGACAACCCGAACGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAA
CCTTCCAGGGACAACCCGAACGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAA
ACAACCCGAACGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGG
CGAACGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTT
CCCGAACGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATC
AACCCGAACGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAA
TGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGACA
ACCTTCCAGGGACAACCCGAACGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAA
GATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGACAA
AACCCGAACGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAA
GAACGTGATGAAAAGATCA TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTT
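To make the mismatch in this example explicit, the sketch below compares one supporting read against the reference line. It assumes, based on this example only, that the lowercase run in the reference line marks the deleted bases, that the read's left part aligns flush against the end of the left flank, and that its right part aligns flush against the start of the right flank. `split_reference` and `mismatches` are hypothetical helpers.

```python
import re

REFERENCE = ("AAATCAACTAGTGACCTTCCAGGGACAACCCGAACGTGATGAAAAGATCA"
             "aagaacctac"
             "TCTATTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGACAAAGT")
READ = ("GATGAAAAGATCA "
        "TCTGTTGGGTTTTCATACAGCTAGCGGGAAAAAAGTTAAAATTGCAAAGGAATCTTTGGACAA")


def split_reference(reference):
    """Split into (left flank, deleted bases, right flank) at the lowercase run."""
    match = re.fullmatch(r"([ACGTN]+)([acgtn]+)([ACGTN]+)", reference)
    if match is None:
        raise ValueError("expected one lowercase run between uppercase flanks")
    return match.groups()


def mismatches(reference, read):
    """List (side, offset, reference base, read base) for every differing base."""
    left_ref, _deleted, right_ref = split_reference(reference)
    left_read, right_read = read.split()
    if len(left_read) > len(left_ref) or len(right_read) > len(right_ref):
        raise ValueError("read part is longer than the displayed flank")
    found = []
    # The left part is right-aligned against the end of the left flank.
    left_window = left_ref[len(left_ref) - len(left_read):]
    for i, (ref_base, read_base) in enumerate(zip(left_window, left_read)):
        if ref_base != read_base:
            found.append(("left", i, ref_base, read_base))
    # The right part is left-aligned against the start of the right flank.
    for i, (ref_base, read_base) in enumerate(zip(right_ref, right_read)):
        if ref_base != read_base:
            found.append(("right", i, ref_base, read_base))
    return found


if __name__ == "__main__":
    print(len(split_reference(REFERENCE)[1]), mismatches(REFERENCE, READ))
```

Every read in the example carries the same G where the reference has A, which is the pattern expected from a SNP rather than from random sequencing errors.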
Inversions (figure: sample compared with ref)
Non-template sequence in deletions, inversions and tandem duplications (figure: ref compared with sample)
Non-template sequence: deletion of 4 bases with 2 bases inserted
D 4 I 2 ChrID 3 BP 156978978 156978983 Supports 12 + 0 - 12 S1 13 SUM_MS 627 NumSupSamples 1 HCC1599a 12
CATGGCTGACTTATAAATCCCTACAGATATGTGGTTACTTCTCTACTTTCCCTTTCTTTGGCTTGGGCAACTGCCACGTTGATGCACTGGAGCCATTCTTCTGCATTCTTCTCATCCTTGGCCTTAAAGACATAGGTTTTATTGTC
TTATAAATCCCTACAGATATGTGGTTACTTCTCTACTTTCCCTTTCTTTGCCTTGGGCAACTGCCAAA GATGCACT
ATGTGGTTACTTCTCTACTTTCCCTTTCTTTGGCTTGGGCAACTGCCAAA GATGCACTGGAGCCATTCTTCTGCAT
CTCTACTTTCCCTTTCTTTGGCTTGGGCAACTGCCAAA GATGCACTGGAGCCATTCTTCTGCATTCTTCTCATCCT
AGATATGTGGTTACTTCTCTACTTTCCCTTTCTTTGGCTTGGGCAACTGCCAAA GATGCACTGGAGCCATTCTTCT
TTTCCCTTTCTTTGGCTTGGGCAACTGCCAAA GATGCACTGGAGCCATTCTTCTGCATTCTTCTCATCCTTGGCCT
TTCCCTTTCTTTGGCTTGGGCAACTGCCAAA GATGCACTGGAGCCATTCTTCTGCATTCTTCTCATCCTTGGCCTT
TTACTTCTCTACTTTCCCTTTCTTTGGCTTGGGCAACTGCCAAA GATGCACTGGAGCCATTCTTCTGCATTCTTCT
CTTGGGCAACTGCCAAA GATGCACTGGAGCCATTCTTCTGCATTCTTCTCATCCTTGGCCTTAAAGACATAGGTTT
CTACAGATATGTGGTTACTTCTCTACTTTCCCTTTCTTTGGCTTGGGCAACTGCCAAA GATGCACTGGAGCCATTC
AAATCCCTACAGATATGTGGTTACTTCTCTACTTTCCCTTTCTTTGGCTTGGGCAACTGCCAAA GATGCACTGGAG
CTTGGGCAACTGCCAAA GATGCACTGGAGCCATTCTTCTGCATTCTTCTCATCCTTGGCCTTAAAGACATAGGTTT
TTCCCTTTCTTTGGCTTGGGCAACTGCCAAA GATGCACTGGAGCCATTCTTCTGCATTCTTCTCATCCTTGGCCTT
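The header line of this record packs several keyed values into one line. The sketch below reads it into a dictionary, assuming the layout seen in this one example: each key is followed by a single value, `BP` is followed by two coordinates, and `NumSupSamples` is followed by that many (sample, count) pairs. `parse_header` is a hypothetical helper, and the meaning of fields such as `S1` and `SUM_MS` is not given in this manual, so they are stored without interpretation.

```python
HEADER = ("D 4 I 2 ChrID 3 BP 156978978 156978983 Supports 12 + 0 - 12 "
          "S1 13 SUM_MS 627 NumSupSamples 1 HCC1599a 12")


def parse_header(line):
    """Parse a keyed header line into a dict (layout assumed from one example)."""
    tokens = line.split()
    fields = {}
    i = 0
    while i < len(tokens):
        key = tokens[i]
        if key == "BP":
            # Two breakpoint coordinates follow the BP key.
            fields["BP"] = (int(tokens[i + 1]), int(tokens[i + 2]))
            i += 3
        elif key == "NumSupSamples":
            # A count, then that many (sample name, read count) pairs.
            count = int(tokens[i + 1])
            rest = tokens[i + 2:i + 2 + 2 * count]
            fields["samples"] = {rest[j]: int(rest[j + 1])
                                 for j in range(0, len(rest), 2)}
            i += 2 + 2 * count
        else:
            value = tokens[i + 1]
            fields[key] = int(value) if value.isdigit() else value
            i += 2
    return fields


if __name__ == "__main__":
    header = parse_header(HEADER)
    start, end = header["BP"]
    print(header)
    print("bases between breakpoints:", end - start - 1)
```

In this example the two `BP` coordinates appear to flank the deleted bases, since the number of bases between them equals the deletion length `D`; the same relation holds for the two deletion records shown earlier.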