Genomics and Bioinformatics
2000 1995
ESTs, even if redundant
Complete genome or death!
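The end of an EST
A snapshot of a new transcriptome
[Diagram: transcripts drawn from start to end, each with an (A)200 tail; partial labels [otorrin...] and [...damonh...]]
[Diagram 1: labels mRNA, cDNA (+ strand), cDNA (- strand), 5’EST, 3’EST; markers ATG, AUG, (A)18, (T)18; sequences ATCATGACTTACGGGCGCGCGATxxxxxx and AAATTTATTATCCxxxxx]
[Diagram 2: labels mRNA, cDNA (+ strand), cDNA (- strand), 5’EST, 3’EST; markers AUG, (A)18, (T)18; sequences GGCGCGCGATATCCxxxx and AAATTTATTATCCATCTACGxxxx]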
Life after PHRED 15
Query: non-trimmed read. Sbjct: published sequence

Query:  469 TTAGGAGGATCGTTTTTAGAATCCCCTGCAACGTTACCACGGTGGATTTCACTGACTGCG 528
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1038 ttaggaggatcgtttttagaatcccctgcaacgttaccacggtggatttcactgactgcg 979

Query:  529 ACGTTCTTAACGTTGAATCCAACGTTGCTACCAgggagagcctcagtaagtgcttcatga 588
            ||||||||||||||||| || |||||||||||||||||| ||||||||||||||||||||
Sbjct:  978 acgttcttaacgttgaagcccacgttgctaccagggagaccctcagtaagtgcttcatga 919

Query:  589 tgcatttcgacagaattgacttcagtcgacaaaccttgcggagcaaaagtgacgaccata 648
            |||||||||||||| |||||||||| |||| ||||||||||| |||||||||||||||||
Sbjct:  918 tgcatttcgacagacttgacttcagccgaccaaccttgcggaccaaaagtgacgaccata 859

Query:  649 ccaggcttgatgataccagtttcaacgc 676
            ||||||||||||||||||||||||||||
Sbjct:  858 ccaggcttgatgataccagtttcaacgc 831
When PHRED meets BLAST
• pUC18 (published sequence)
Sequencing reaction:
• single pool distributed over 3 96-well plates
• 3 MegaBACE
• 3 reads each, 846 reads total
Processing:
• MegaBLAST (BLASTn, SWAT)
• Phred
• trim: a chromatogram analyzer
• trim_alt: increasing trim_cutoff from 1% up to 25%
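To make the role of trim_cutoff concrete, here is a minimal sketch of error-probability trimming in the style commonly described as a modified Mott algorithm. It is an illustration only, not the code used for the slides: the function name mott_trim and the example quality values are hypothetical, and the exact behaviour of Phred's own trimming may differ in detail.

```python
def mott_trim(quals, cutoff=0.05):
    """Return (start, end) of the retained segment, end exclusive.

    Each base contributes (cutoff - error_probability) to a running
    score; the highest-scoring contiguous segment is kept.
    Hypothetical helper, not part of Phred.
    """
    best_score = 0.0
    best = (0, 0)
    score = 0.0
    start = 0
    for i, q in enumerate(quals):
        p_err = 10 ** (-q / 10)      # Phred quality -> error probability
        score += cutoff - p_err
        if score <= 0:
            # Segment is no longer worth extending: restart after this base.
            score = 0.0
            start = i + 1
        elif score > best_score:
            best_score = score
            best = (start, i + 1)
    return best


# Made-up qualities: poor tips (Q8) around a good core (Q30).
quals = [8, 8, 30, 30, 30, 30, 8, 8]
strict = mott_trim(quals, cutoff=0.05)   # tips are trimmed away
loose = mott_trim(quals, cutoff=0.25)    # tips are kept
```

Raising the cutoff makes low-quality bases count as a gain rather than a loss, so more of the read tip survives trimming, at the price of more miscalls in the retained sequence.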
The end of an EST
PHRED 10 (10% error): only losses
[Figure label: Phred]
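The "PHRED 10 (10% error)" equivalence follows from the standard definition of a Phred quality value, Q = -10 log10(P). A small sketch of the conversion in both directions, with hypothetical function names:

```python
import math


def phred_to_error(q):
    """Error probability for a Phred quality value q."""
    return 10 ** (-q / 10)


def error_to_phred(p):
    """Phred quality value for an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


# Quality values mentioned on the slides, printed as error percentages.
for q in (8, 10, 15):
    print(f"PHRED {q}: {100 * phred_to_error(q):.1f}% error")
```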
[Figure: error occurrence as a function of trim_cutoff (x-axis: 1% to 25%; y-axis: 0% to 30%). Series: added bases, trimmed reads, % error in sequence, % error in the tip, total miscall, stepwise miscall. Annotated values: 16%, 17%, 3%.]
Virtual pUC18 protein: STOP = *
>protein_puc18
RQGFPSHDVVKRRPVPSLHACRSTLEDPRVPSSNS*SWS*LFPV*NCYPLTIPHNIRAGS
IKCKAWGA**VS*LTLIALRSLPAFQSGNLSCQLH**IGQRAGRGGLRIGRSSASSLTDS
LRSVVRLRRAVSAHSKAVIRLSTESGDNAGKNM*AKGQQKARNRKKAALLAFFHRLRPPD
EHHKNRRSSQRWRNPTGL*RYQAFPPGSSLVRSPVPTLPLTGYLSAFLPSGSVALSHSSR
CRYLSSV*VVRSKLGCVHEPPVQPDRCALSGNYRLESNPVRHDLSPLAAATGNRISRARY
VGGATEFLKWWPNYGYTRRTVFGICALLKPVTFGKRVGSS*SGKQTTAGSGGFFVCKQQI
TRRKKGSQEDPLIFSTGSDAQWNENSR*GILVMRLSKRIFT*ILLN*K*SFKSI*SIYE*
TWSDSYQCLISEAPISAICLFRSSIVA*LPVV*ITTIREGLPSGPSAAMIPRDPRSPAPD
LSAINQPAGRAERRSGPATLSASIQSINCCREARVSSSPVNSLRNVVAIATGIVVSRSSF
GMASFSSGSQRSRRVT*SPMLCKKAVSSFGPPIVVRSKLAAVLSLMVMAALHNSLTVMPS
VRCFSVTGEYSTKSF*E*CMRRPSCSCPASIRDNTAPHSRTLKVLIIGKRSSGRKLSRIL
PLLRSSSM*PTRAPN*SSASFTFTSVSG*AKTGRQNAAKKGIRATRKC*ILILFLFQYY*
SIYQGYCLMSGYIFECI*KNKQIGVPRTFPRKVPPDV*ETIIIMTLTYKNRRITRPFRLA
RFGDDGENL*HMQLPETVTACL*ADAGSRQARQGASAGVGGCRGWLNYAASEQIVLRVHH
MRCEIPHRCVRRKYRIRRHSPFRLRNCWEGRSVRASSLLRQLAKGGCAARRLSWV
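A "virtual protein" of this kind can be produced by translating the nucleotide sequence in a single reading frame and writing every stop codon as *. The sketch below assumes the standard genetic code and a fixed frame; the function name translate and the choice to write X for codons containing ambiguous bases are assumptions for illustration, not details taken from the slides.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate dna in one frame (0, 1 or 2); stops become '*'.

    Codons with non-ACGT characters are written as 'X' (an assumption).
    Trailing bases that do not fill a codon are ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(frame, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


example = translate("ATGACTTAA")  # a start codon, one codon, a stop codon
```

Translating straight through stop codons keeps the whole published sequence usable as a protein-level subject for translated searches, which is what the STOP = * convention in the title indicates.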
[Figure: BLASTx score versus trim_cutoff parameter value (%); marked values: 8 and 15.]
tBLASTn (BLASTx) scores are maximized with PHRED 8
Summarizing
• PHRED meets BLAST when errors in the tip are 16%
• Molecules carry a 3% global error
• Scores for EST versus amino acid comparisons are maximized
• Real life: crossmatch ends with X’s
Authors:
• Fabiano Peixoto (CENAPAD)
• Francisco Prosdocimi (Lab Biodiversidade)
• Maurício Mudado (Lab Biodados)
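The "ends with X’s" remark refers to reads whose vector-screened tips have been replaced by X characters. As a rough illustration of handling such reads downstream, the sketch below removes masked runs from both ends while leaving internal X's alone; the function name and the example read are hypothetical.

```python
def strip_masked_ends(seq, mask="X"):
    """Remove runs of the mask character (either case) from both ends."""
    return seq.strip(mask.upper() + mask.lower())


# Made-up masked read: X's at both tips and one internal X.
read = "XXXXACGTXACGTxxx"
clean = strip_masked_ends(read)
```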