Chapter 3: Multiple Sequence Alignment. College of Bioinformatics, Harbin Medical University. Professor Li Xia (李霞).
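v = A T G T T A T (positions 0 to 7), w = A T C G T A C (positions 0 to 7)
Path through the table: + + → + + ↓ + ↓ →
v: A T - G T T A T -
w: A T C G T - A - C
Figure 3-1. Using dynamic programming to find the longest common subsequence of two sequences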
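Columns: w = A T C G T A C. Rows: v = A T G T T A T. Each cell shows the arrow recorded for it followed by its score ("+" diagonal, "←" from the left, "↑" from above).
      j    0   1   2   3   4   5   6   7
 i    w        A   T   C   G   T   A   C
 0    v    0   0   0   0   0   0   0   0
 1    A    0  +1  ←1  ←1  ←1  ←1  +1  ←1
 2    T    0  ↑1  +2  ←2  ←2  +2  ←2  ←2
 3    G    0  ↑1  ↑2  ↑2  +3  ←3  ←3  ←3
 4    T    0  ↑1  +2  ↑2  ↑3  +4  ←4  ←4
 5    T    0  ↑1  +2  ↑2  ↑3  +4  ↑4  ↑4
 6    A    0  +1  ↑2  ↑2  ↑3  ↑4  +5  ←5
 7    T    0  ↑1  +2  ↑2  ↑3  +4  ↑5  ↑5
Figure 3-2. Filling in the dynamic programming table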
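The table-filling and traceback steps of Figures 3-1 and 3-2 can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the function names `lcs_table` and `lcs_traceback` are made up for this example, and ties between "up" and "left" are broken in favor of "up", which may differ from the arrows drawn on the slide.

```python
def lcs_table(v, w):
    """Fill the (len(v)+1) x (len(w)+1) longest-common-subsequence table."""
    s = [[0] * (len(w) + 1) for _ in range(len(v) + 1)]
    for i in range(1, len(v) + 1):
        for j in range(1, len(w) + 1):
            if v[i - 1] == w[j - 1]:
                s[i][j] = s[i - 1][j - 1] + 1             # diagonal step ("+")
            else:
                s[i][j] = max(s[i - 1][j], s[i][j - 1])   # step from above or from the left
    return s


def lcs_traceback(v, w, s):
    """Walk back from the bottom-right cell and emit one gapped alignment."""
    i, j = len(v), len(w)
    top, bottom = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and v[i - 1] == w[j - 1]:
            top.append(v[i - 1]); bottom.append(w[j - 1])  # matched column
            i -= 1; j -= 1
        elif i > 0 and (j == 0 or s[i - 1][j] >= s[i][j - 1]):
            top.append(v[i - 1]); bottom.append("-")        # gap in w
            i -= 1
        else:
            top.append("-"); bottom.append(w[j - 1])        # gap in v
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


table = lcs_table("ATGTTAT", "ATCGTAC")
aligned_v, aligned_w = lcs_traceback("ATGTTAT", "ATCGTAC", table)
print(table[-1][-1])   # length of the longest common subsequence
print(aligned_v)
print(aligned_w)
```

Because several optimal paths usually exist, the alignment printed here need not be the exact one shown in Figure 3-1, but it contains the same number of matched columns.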
Predecessors of cell (i,j,k): (i-1,j-1,k-1), (i-1,j,k-1), (i-1,j-1,k), (i-1,j,k), (i,j,k-1), (i,j-1,k-1), (i,j-1,k)
Figure 3-3. Computing one alignment cell (i,j,k) for three sequences depends on its 7 predecessors
Cube axes: ATGTTAT, ATCGTAC and ATGC
Figure 3-4. The three-dimensional score matrix δ for aligning the three sequences u = ATGTTAT, v = ATCGTAC and w = ATGC
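To make the seven-predecessor recurrence concrete, here is a small Python sketch that fills a three-dimensional table for the three sequences of Figure 3-4. It uses a deliberately simplified score as an assumption: a column earns 1 only when all three letters are identical and everything else earns 0, so the result is the length of the longest subsequence common to all three sequences. The slide's actual scoring function δ is not specified here, and the name `lcs3_length` is hypothetical.

```python
from itertools import product


def lcs3_length(u, v, w):
    """Three-sequence dynamic programming over cells (i, j, k)."""
    n, m, p = len(u), len(v), len(w)
    s = [[[0] * (p + 1) for _ in range(m + 1)] for _ in range(n + 1)]
    # The 7 predecessor offsets: every 0/1 step combination except (0, 0, 0).
    steps = [d for d in product((0, 1), repeat=3) if any(d)]
    for i in range(n + 1):
        for j in range(m + 1):
            for k in range(p + 1):
                best = 0
                for di, dj, dk in steps:
                    pi, pj, pk = i - di, j - dj, k - dk
                    if pi < 0 or pj < 0 or pk < 0:
                        continue  # predecessor lies outside the cube
                    gain = 0
                    # Only the full diagonal step aligns three letters in one column.
                    if (di, dj, dk) == (1, 1, 1) and u[pi] == v[pj] == w[pk]:
                        gain = 1
                    best = max(best, s[pi][pj][pk] + gain)
                s[i][j][k] = best
    return s[n][m][p]


print(lcs3_length("ATGTTAT", "ATCGTAC", "ATGC"))
```

The table has (n+1)(m+1)(p+1) cells and each cell inspects 7 predecessors, which is why exact dynamic programming becomes impractical as the number of sequences grows.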
Sequence   Column A   Column B   Column C
1          T          T          T
2          T          T          T
3          T          T          T
4          T          T          T
5          T          T          C
6          T          C          C
Match score 6, mismatch penalty -3:
Column A: 6(6-1)/2 = 15 matching T-T pairs, 0 mismatching T-C pairs; column score 6*15 - 3*0 = 90
Column B: 10 matching T-T pairs, 5 mismatching T-C pairs; column score 6*10 - 3*5 = 45
Column C: 6 matching T-T pairs, 9 mismatching T-C pairs; column score 6*6 - 3*9 = 9
Match score 6, mismatch penalty -1:
Column A: 15 matching T-T pairs, 0 mismatching T-C pairs; column score 6*15 - 1*0 = 90
Column B: 10 matching T-T pairs, 5 mismatching T-C pairs; column score 6*10 - 1*5 = 55
Column C: 6 matching T-T pairs, 9 mismatching T-C pairs; column score 6*6 - 1*9 = 27
Figure 3-5. SP (sum-of-pairs) scoring: the match score and mismatch penalty strongly affect a multiple sequence alignment
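The column scores of Figure 3-5 follow from summing a score over every pair of letters in a column. The sketch below does exactly that; the function name `sp_column_score` and its default parameters are chosen for this example only. One caveat: counting pairs exactly for column C (four T, two C) gives 6 T-T pairs, 1 C-C pair and 8 T-C pairs, so this sketch scores column C as 18 with a mismatch penalty of -3, whereas the figure reports 9 and appears to treat all nine non-T-T pairs as mismatches.

```python
from itertools import combinations


def sp_column_score(column, match=6, mismatch=-3):
    """Sum-of-pairs score of one alignment column (a string of letters)."""
    return sum(match if a == b else mismatch
               for a, b in combinations(column, 2))


for name, column in [("A", "TTTTTT"), ("B", "TTTTTC"), ("C", "TTTTCC")]:
    print(name,
          sp_column_score(column, mismatch=-3),
          sp_column_score(column, mismatch=-1))
```

Changing only the mismatch penalty leaves the fully conserved column A untouched but shifts the scores of columns B and C, which is the point the figure makes.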
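Case 1: sequences AAAATTTT, AAAAGGGG, TTTTGGGG
Pairwise alignments:
AAAATTTT----        AAAATTTT----        AAAA----GGGG
AAAA----GGGG        ----TTTTGGGG        ----TTTTGGGG
These combine into one multiple alignment:
AAAATTTT----
----TTTTGGGG
AAAA----GGGG
Case 2: sequences AAAATTTT, GGGGAAAA, TTTTGGGG
Pairwise alignments:
----AAAATTTT        AAAATTTT----        ----GGGGAAAA
GGGGAAAA----        ----TTTTGGGG        TTTTGGGG----
Combined multiple alignment: ?
Figure 3-6. The pairwise alignments of three sequences cannot always be combined into one multiple sequence alignment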
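Tree: leaves A (branch length 0.2) and B (branch length 0.1) join at an internal node whose branch has length 0.3; leaf C has branch length 0.5.
Weights: A = 0.2 + 0.3/2 = 0.35; B = 0.1 + 0.3/2 = 0.25; C = 0.5
Figure 3-7. How ClustalW assigns weights to sequences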
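The weighting rule in Figure 3-7 (each leaf keeps its own branch length and receives an equal share of every branch it shares with other leaves) can be sketched recursively. The tuple-based tree encoding and the name `leaf_weights` are assumptions made for this illustration; they are not ClustalW's internal data structures.

```python
def leaf_weights(node):
    """node is (branch_length, name) for a leaf or (branch_length, [children])."""
    length, payload = node
    if isinstance(payload, str):
        return {payload: length}          # a leaf starts with its own branch length
    weights = {}
    for child in payload:
        weights.update(leaf_weights(child))
    share = length / len(weights)         # split this branch equally among its leaves
    return {name: w + share for name, w in weights.items()}


# Tree of Figure 3-7; the root itself has no branch above it (length 0.0).
tree = (0.0, [
    (0.3, [(0.2, "A"), (0.1, "B")]),
    (0.5, "C"),
])
print(leaf_weights(tree))
```

Closely related sequences such as A and B share part of their history, so each gets less weight than the more isolated sequence C.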
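States: Start, S1, S2, S3, S4, End. Start enters S1 with probability 1.0.
Self-transition probabilities: p11 = x1, p22 = x2, p33 = x3, p44 = x4
Forward transition probabilities: p12 = y1, p23 = y2, p34 = y3, p40 = y4
Emission probabilities of S1: pA = z11, pT = z12, pG = z13, pC = z14
Emission probabilities of S2: pA = z21, pT = z22, pG = z23, pC = z24
Emission probabilities of S3: pA = z31, pT = z32, pG = z33, pC = z34
Emission probabilities of S4: pA = z41, pT = z42, pG = z43, pC = z44
Figure 3-9. A hidden Markov model, and the minimal common supergraph of the three protein sequences PHSFTYVMT, PGSFTYW and RFTGFW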
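For the four-state chain drawn above, the probability that the model emits a given DNA sequence can be computed with the forward algorithm. The sketch below is a minimal illustration: the parameter names `stay` (the x values), `advance` (the y values) and `emit` (the z values), and every number in the usage example, are hypothetical, since the slide gives the parameters only symbolically.

```python
def forward_probability(seq, stay, advance, emit):
    """Probability that the left-to-right HMM emits seq and then reaches End.

    stay[s]    : self-transition probability of state s (x in the figure)
    advance[s] : probability of moving to the next state, or to End from the
                 last state (y in the figure)
    emit[s]    : dict mapping a letter to its emission probability (z in the figure)
    """
    if not seq:
        return 0.0
    n = len(stay)
    f = [0.0] * n
    f[0] = 1.0 * emit[0][seq[0]]           # Start enters S1 with probability 1.0
    for symbol in seq[1:]:
        g = [0.0] * n
        for s in range(n):
            incoming = f[s] * stay[s]       # stayed in the same state
            if s > 0:
                incoming += f[s - 1] * advance[s - 1]  # arrived from the previous state
            g[s] = incoming * emit[s][symbol]
        f = g
    return f[-1] * advance[-1]              # leave the last state for End


# Hypothetical parameters, for illustration only.
uniform = {"A": 0.25, "T": 0.25, "G": 0.25, "C": 0.25}
print(forward_probability("ATGCA", [0.5] * 4, [0.5] * 4, [uniform] * 4))
```

Sequences shorter than the number of states get probability 0 in this topology, because every state must emit at least one letter before End is reached.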
Node labels: CTG, CAT, GT, CG, with x = CT and y = CG. Edge weights: 1, 1, 1, 1 and 2.
Figure 3-10. A tree alignment containing four sequences
Global alignment:
--T--CC-C-AGT--TATGT-CAGGGGACACG--A-GCATGCAGA-GAC
  |  || |  ||  | | | |||    || |  | |  | ||||   |
AATTGCCGCC-GTCGT-T-TTCAG----CA-GTTATG--T-CAGAT--C
Local alignment:
                 tccCAGTTATGTCAGgggacacgagcatgcagaga
                    ||||||||||||
aattccgccgtcgttttcagCAGTTATGTCAGatc
Figure 3-12. Global and local alignment of the same two sequences can give completely different results
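The contrast in Figure 3-12 comes from two small changes to the same recurrence: a local alignment never lets a cell drop below zero and reports the best cell anywhere in the table, while a global alignment charges for leading gaps and reports the bottom-right cell. The sketch below computes scores only (no traceback). The scoring values (match 1, mismatch -1, linear gap -1) and the name `alignment_score` are assumptions for illustration, not the parameters used to produce the figure.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-1, local=False):
    """Global (Needleman-Wunsch style) or local (Smith-Waterman style) score."""
    cols = len(b) + 1
    # Row 0: zeros for local alignment, cumulative gap penalties for global.
    prev = [0 if local else j * gap for j in range(cols)]
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0 if local else i * gap] + [0] * (cols - 1)
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, prev[j] + gap, cur[j - 1] + gap)
            if local:
                cell = max(cell, 0)       # a local alignment may restart anywhere
            cur[j] = cell
            best = max(best, cell)
        prev = cur
    return best if local else prev[-1]


s1 = "TCCCAGTTATGTCAGGGGACACGAGCATGCAGAGAC"
s2 = "AATTGCCGCCGTCGTTTTCAGCAGTTATGTCAGATC"
print("global:", alignment_score(s1, s2))
print("local: ", alignment_score(s1, s2, local=True))
```

The two input strings are the ungapped sequences read off the global alignment in the figure. Under these assumed scores the local score is at least 12, because both sequences contain the exact 12-letter block CAGTTATGTCAG.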
Panels: Global, Local, Glocal. Each panel shows the same pair of sequences:
AGTGCCCTGGAACCCTGACGGTGGGTCACAAAACTTCTGGA
AGTGACCTGGGAAGACCCTGAACCCTGGGTCACAAAACTC
Figure 3-13. The paths corresponding to local, global and glocal alignments of two sequences
Figure 3-14. A multiple sequence alignment produced by MAP2 consists of globally aligned similar blocks and unaligned divergent segments
Figure 3-15. Multiple sequence alignment and sequence conservation of the Wnt3a gene in the UCSC Genome Browser
>gi|58219048|ref|NP_001010926.1| hairy and enhancer of split 5 [Homo sapiens]
MAPSTVAVELLSPKEKNRLRKPVVEKMRRDRINSSIEQLKLLLEQEFARHQPNSKLEKADILEMAVSYLK
HSKAFVAAAGPKSLHQDYSEGYSWCLQEAVQFLTLHAASDTQMKLLYHFQRPPAAPAAPAKEPKAPGAAP
PPALSAKATAAAAAAHQPACGLWRPW
>gi|6754182|ref|NP_034549.1| hairy and enhancer of split 5 [Mus musculus]
MAPSTVAVEMLSPKEKNRLRKPVVEKMRRDRINSSIEQLKLLLEQEFARHQPNSKLEKADILEMAVSYLK
HSKAFAAAAGPKSLHQDYSEGYSWCLQEAVQFLTLHAASDTQMKLLYHFQRPPAPAAPAKEPPAPGAAPQ
PARSSAKAAAAAVSTSRQPACGLWRPW
>gi|9506775|ref|NP_062109.1| hairy and enhancer of split 2 [Rattus norvegicus]
MRLPRGVGDAAELRKSLKPLLEKRRRARINESLSQLKGLVLPLLGAETSRYSKLEKADILEMTVRFLREQ
PASVCSTEAPGSLDSYLEGYRACLARLARVLPACSVLEPAVSARLLEHLRQRTVSGGPPSLTPASASAPA
PSPPVPPPSSLGLWRPW
>gi|60593016|ref|NP_001012713.1| hairy and enhancer of split 5 [Gallus gallus]
MAPSALSLEILTPKEKNRLRKPIVEKLRRDRINSSIEQLKLLLEKEFQRHQPNSKLEKADILEMTVSYLK
YSRAFAASAKSLQQDYCEGYAWCLKEALQFLSLHSANTETQMKLICHFQRSQAMPKDSGSPSASTSTHQP
SAKQTPVKPSCNLWRPW
>gi|31074173|gb|AAP41832.1| hairy and enhancer of split 5 [Danio rerio]
MAPAYMTEYSKLSNKEKHKLRKPVVEKMRRDRINNCIEQLKSMLEKEFQQQDPNAKLEKADILEMTVVFL
KQQLRPKTPQNAQIEGYSQCWRETISFLSVGSEAVAQRLQQEAQRSAAPELTHTSEAPHQQHTHIKQEPR
AHAPLWRPW
>gi|113205884|ref|NP_001037974.1| hairy and enhancer of split 5, gene 2 [Xenopus tropicalis]
MAPSTDFLDQQKMTPKEKNKLRKPVVEKMRRDRINSSIEQLKGLLETVFHKQQPNVKLEKADILEMTVTY
LRQQTLQIKSEIPHNNDIQMDYKDGYSRCFEEVIDFLSLHQKQPETAKLISHFHSKATASSISSFPIRCS
QSKTANGTGSSSSLWRPW
1234567890123456789012345678901234567890123456789012345678901234567890
MAPST--VAV-ELLSPKEKNRLRKPVVEKMRRDRINSSIEQLKLLLEQEFARHQPN-SKLEKADILEMAV
MAPST--VAV-EMLSPKEKNRLRKPVVEKMRRDRINSSIEQLKLLLEQEFARHQPN-SKLEKADILEMAV
MR-----LPR-GVGDAAELRKSLKPLLEKRRRARINESLSQLKGLVLPLLGAETSRYSKLEKADILEMTV
MAPSA--LSL-EILTPKEKNRLRKPIVEKLRRDRINSSIEQLKLLLEKEFQRHQPN-SKLEKADILEMTV
MAPSTDFLDQ-QKMTPKEKNKLRKPVVEKMRRDRINSSIEQLKGLLETVFHKQQPN-VKLEKADILEMTV
MAPAY--MTEYSKLSNKEKHKLRKPVVEKMRRDRINNCIEQLKSMLEKEFQQQDPN-AKLEKADILEMTV

1234567890123456789012345678901234567890123456789012345678901234567890
SYLKHSKAFVAA--AGPKSLHQDYSEGYSWCLQEAVQFLTLHA--ASDTQMKLLYHFQRPPAAPAAPAKE
SYLKHSKAFAAA--AGPKSLHQDYSEGYSWCLQEAVQFLTLHA--ASDTQMKLLYHFQRPPA-PAAPAKE
RFLREQPASVCS--TEAPGSLDSYLEGYRACLARLARVLPACSVLEPAVSARLLEHLRQRTV-------S
SYLKYSRAFA----ASAKSLQQDYCEGYAWCLKEALQFLSLHS-ANTETQMKLICHFQRSQA-------M
TYLRQQTLQIKSEIPHNNDIQMDYKDGYSRCFEEVIDFLSLHQ--KQPETAKLISHFHSKAT--------
VFLKQQ--------LRPKTPQNAQIEGYSQCWRETISFLSVGS---EAVAQRLQQEAQRSAA--------

123456789012345678901234567890123456
PKAPGAAPPPALSAKATAAAAAA--HQPACGLWRPW
PPAPGAAPQPARSSAKAAAAAVSTSRQPACGLWRPW
GGPPSLTPASASAPAPSPPVPPP----SSLGLWRPW
PKDSGSPSASTSTHQPSAKQTPV---KPSCNLWRPW
--ASSISSFPIRCSQSKTANGTG----SSSSLWRPW
-PELTHTSEAPHQQHTHIKQEPR----AHAPLWRPW
MSA multiple sequence alignment of the hairy and enhancer of split 5 protein sequences from human, house mouse, rat, chicken, Xenopus and zebrafish
1234567890123456789012345678901234567890123456789012345678901234567890
--MAPSTVAVELLSPKEKNRLRKPVVEKMRRDRINSSIEQLKLLLEQEFARHQPNSKLEKADILEMAVSY
--MAPSTVAVEMLSPKEKNRLRKPVVEKMRRDRINSSIEQLKLLLEQEFARHQPNSKLEKADILEMAVSY
----MRLPRGVGDAAELRKSLKPLLEKRRRARINESLSQLKGLVLPLLGAETSRYSKLEKADILEMTVRF
--MAPSALSLEILTPKEKNRLRKPIVEKLRRDRINSSIEQLKLLLEKEFQRHQPNSKLEKADILEMTVSY
MAPSTDFLDQQKMTPKEKNKLRKPVVEKMRRDRINSSIEQLKGLLETVFHKQQPNVKLEKADILEMTVTY
-MAPAYMTEYSKLSNKEKHKLRKPVVEKMRRDRINNCIEQLKSMLEKEFQQQDPNAKLEKADILEMTVVF

1234567890123456789012345678901234567890123456789012345678901234567890
LKHSKAFVAAAGPKSLHQDYSEGYSWCLQEAVQFLTLHAASDTQMKLLYHFQRPPAAPAAPAKEPKAPGA
LKHSKAFAAAAGPKSLHQDYSEGYSWCLQEAVQFLTLHAASDTQMKLLYHFQRPPAPAAPAKEPPAPGAA
LREQPASVCSTEAPGSLDSYLEGYRACLARLARVLPACSVLEPAVSARLLEHLRQRTVSGGPPSLTPASA
LKYSRAFAASAKSLQQDYCEGYAWCLKEALQFLSLHSANTETQMKLICHFQRSQAMPKDSGSPSASTSTH
LRQQTLQIKSEIPHNNDIQMDYKDGYSRCFEEVIDFLSLHQKQPETAKLISHFHSKATASSISSFPIRCS
LKQQLRPKTPQNAQIEGYSQCWRETISFLSVGSEAVAQRLQQEAQRSAAPELTHTSEAPHQQHTHIKQEP

12345678901234567890123456789
APPPALSAKATAAAAAAHQPACGLWRPW-
PQPARSSAKAAAAAVSTSRQPACGLWRPW
SAPAPSPPVPPPSSLGLWRPW--------
QPSAKQTPVKPSCNLWRPW----------
QSKTANGTGSSSSLWRPW-----------
RAHAPLWRPW-------------------
ClustalW multiple sequence alignment of the hairy and enhancer of split 5 protein sequences from human, house mouse, rat, chicken, Xenopus and zebrafish; default parameters + Pam100
1234567890123456789012345678901234567890123456789012345678901234567890
MAPST--VAVELLSPKEKNRLRKPVVEKMRRDRINSSIEQLKLLLEQEFARHQPN-SKLEKADILEMAVS
MAPST--VAVEMLSPKEKNRLRKPVVEKMRRDRINSSIEQLKLLLEQEFARHQPN-SKLEKADILEMAVS
MRLPR---GVGDAAELRKS-L-KPLLEKRRRARINESLSQLKGLVLPLLGAETSRYSKLEKADILEMTVR
MAPSA--LSLEILTPKEKNRLRKPIVEKLRRDRINSSIEQLKLLLEKEFQRHQPN-SKLEKADILEMTVS
MAPSTDFLDQQKMTPKEKNKLRKPVVEKMRRDRINSSIEQLKGLLETVFHKQQPN-VKLEKADILEMTVT
MAPAY-MTEYSKLSNKEKHKLRKPVVEKMRRDRINNCIEQLKSMLEKEFQQQDPN-AKLEKADILEMTVV

1234567890123456789012345678901234567890123456789012345678901234567890
YLKHS--KAFVAAAGPKSLHQDYSEGYSWCLQEAVQFLTL-HAAS-DTQMKLLYHFQRPPAAPAAPAKEP
YLKHS--KAFAAAAGPKSLHQDYSEGYSWCLQEAVQFLTL-HAAS-DTQMKLLYHFQRPP-APAAPAKEP
FLREQ--PASVCSTEAPGSLDSYLEGYRACLARLARVLPACSVLEPAVSARLLEHLRQRT----VSGGPP
YLKYS--RAFAASA--KSLQQDYCEGYAWCLKEALQFLSL-HSANTETQMKLICHFQRSQ----AMPKDS
YLRQQTLQIKSEIPHNNDIQMDYKDGYSRCFEEVIDFLSL-HQKQPETAK-LISHFH-------SKATAS
FLKQQ------LRP--KTPQNAQIEGYSQCWRETISFLSV-GSEA--VAQRLQQEAQR------SAAPEL

12345678901234567890123456789012345
KAPGAAPPPALS-AKATAAAAAA-HQPACGLWRPW
PAPGAAPQPARSSAKAAAAAVSTSRQPACGLWRPW
SLTPASASAPAPSPPVPPP-------SSLGLWRPW
GSPSASTSTHQPSAKQTPV------KPSCNLWRPW
SISSFPIRCSQSKTANGTG-------SSSSLWRPW
THTSEAPHQQHTHIKQEPR-------AHAPLWRPW
ClustalW multiple sequence alignment of the hairy and enhancer of split 5 protein sequences from human, house mouse, rat, chicken, Xenopus and zebrafish; default parameters + Pam1
1234567890123456789012345678901234567890123456789012345678901234567890
M-APST--VAVELLSPKEKNRLRKPVVEKMRRDRINSSIEQLKLLLEQEFARHQPN-SKLEKADILEMAV
M-APST--VAVEMLSPKEKNRLRKPVVEKMRRDRINSSIEQLKLLLEQEFARHQPN-SKLEKADILEMAV
MRLPRG-VGDAAEL--R-KS-L-KPLLEKRRRARINESLSQLKGLVLPLLGAETSRYSKLEKADILEMTV
M-APSA--LSLEILTPKEKNRLRKPIVEKLRRDRINSSIEQLKLLLEKEFQRHQPN-SKLEKADILEMTV
M-APSTDFLDQQKMTPKEKNKLRKPVVEKMRRDRINSSIEQLKGLLETVFHKQQPN-VKLEKADILEMTV
M-APAY-MTEYSKLSNKEKHKLRKPVVEKMRRDRINNCIEQLKSMLEKEFQQQDPN-AKLEKADILEMTV

1234567890123456789012345678901234567890123456789012345678901234567890
SYLKHSKAFV-AAAGP--KSLHQDYSEGYSWCLQEAVQFL---TLHAAS-DTQMKLLYHF-QRPPAAPAA
SYLKHSKAFA-AAAGP--KSLHQDYSEGYSWCLQEAVQFL---TLHAAS-DTQMKLLYHF-QRPP-APAA
RFLREQPASVCSTEAP--GSLDS-YLEGYRACLARLARVLPACSVLEPA--VSARLLEHLRQRTVS--GG
SYLKYSRAFA-ASA----KSLQQDYCEGYAWCLKEALQFL---SLHSANTETQMKLICHF-QRSQ---AM
TYLRQQTLQI-KSEIPHNNDIQMDYKDGYSRCFEEVIDFL---SLHQKQPET-AKLISHF-HSK----AT
VFLKQQ-LRP-KTP----QNAQI---EGYSQCWRETISFL---SVGSEA--VAQRLQQEA-QRS----AA

1234567890123456789012345678901234567890
PAKEPKAPGAAPPPALS-AKATAAAAA-AHQPACGLWRPW
PAKEPPAPGAAPQPARSSAKAAAAAVSTSRQPACGLWRPW
PPSLTPASASAPAPSPP-VP-PPSS--------LGLWRPW
P-KDSGSPSASTSTHQPSAKQTPVK------PSCNLWRPW
ASSISSFPIRCSQ-SKT-ANGTGSS--------SSLWRPW
P-ELTHT-SEAPHQQHTHIKQEPRA-------HAPLWRPW
ClustalW multiple sequence alignment of the hairy and enhancer of split 5 protein sequences from human, house mouse, rat, chicken, Xenopus and zebrafish; default parameters + Blosum 1
1234567890123456789012345678901234567890123456789012345678901234567890
MAPSTV--AVELLSPKEKNRLRKPVVEKMRRDRINSSIEQLKLLLEQEF-ARHQPNSKLEKADILEMAVS
MAPSTV--AVEMLSPKEKNRLRKPVVEKMRRDRINSSIEQLKLLLEQEF-ARHQPNSKLEKADILEMAVS
MRLPRGVG-----DAAELRKSLKPLLEKRRRARINESLSQLKGLVLPLLGAETSRYSKLEKADILEMTVR
MAPSAL--SLEILTPKEKNRLRKPIVEKLRRDRINSSIEQLKLLLEKEF-QRHQPNSKLEKADILEMTVS
MAPSTDFLDQQKMTPKEKNKLRKPVVEKMRRDRINSSIEQLKGLLETVF-HKQQPNVKLEKADILEMTVT
MAPAYM-TEYSKLSNKEKHKLRKPVVEKMRRDRINNCIEQLKSMLEKEF-QQQDPNAKLEKADILEMTVV

1234567890123456789012345678901234567890123456789012345678901234567890
YLKHSKAFVAA-AGPK--SLHQDYSEGYSWCLQEAVQFLTLHAAS--DTQMKLLYHFQRPPAAPAAPAKE
YLKHSKAFAAA-AGPK--SLHQDYSEGYSWCLQEAVQFLTLHAAS--DTQMKLLYHFQRPPA-PAAPAKE
FLREQPASVCSTEAPG--SL-DSYLEGYRACLARLARVLPACSVLEPAVSARLLEHLRQRTVSG-GPPSL
YLKYSRAFAAS---AK--SLQQDYCEGYAWCLKEALQFLSLHSAN-TETQMKLICHFQRSQAMP-KDSGS
YLRQQTLQIKS-EIPHNNDIQMDYKDGYSRCFEEVIDFLSLHQKQ-P-ETAKLISHFHSKATAS-SISSF
FLKQQLR-------PK--TPQNAQIEGYSQCWRETISFLSVGSEA---VAQRLQQEAQRSAA-PELT-HT

123456789012345678901234567890123456
PKAPGAAPPPALS-AKATAA-AAAAHQPACGLWRPW
PPAPGAAPQPARSSAKAAAAAVSTSRQPACGLWRPW
TPA-------S-ASAPAP-S-PPVPPPSSLGLWRPW
PSA-------S-TSTHQPSA-KQTPVKPSCNLWRPW
PI--------R-CSQSKT-A-NGT--GSSSSLWRPW
SEA------P---HQQHTHI-KQEP-RAHAPLWRPW
T-coffee multiple sequence alignment of the hairy and enhancer of split 5 protein sequences from human, house mouse, rat, chicken, Xenopus and zebrafish; no substitution matrix specified
12345678901234567890123456789012345678901234567890123456789012345678901
MAPST-V-AVELLSPKEKNRLRKPVVEKMRRDRINSSIEQLKLLLEQEFARHQ-PNSKLEKADILEMAVSY
MAPST-V-AVEMLSPKEKNRLRKPVVEKMRRDRINSSIEQLKLLLEQEFARHQ-PNSKLEKADILEMAVSY
MRLPRGVGDA-----AELRKSLKPLLEKRRRARINESLSQLKGLVLPLLGAETSRYSKLEKADILEMTVRF
MAPSA-L-SLEILTPKEKNRLRKPIVEKLRRDRINSSIEQLKLLLEKEFQRHQ-PNSKLEKADILEMTVSY
MAPSTDFLDQQKMTPKEKNKLRKPVVEKMRRDRINSSIEQLKGLLETVFHKQQ-PNVKLEKADILEMTVTY
MAPAYM-TEYSKLSNKEKHKLRKPVVEKMRRDRINNCIEQLKSMLEKEFQQQD-PNAKLEKADILEMTVVF

12345678901234567890123456789012345678901234567890123456789012345678901
LKHSKAF-VAAAGPK--SLHQDYSEGYSWCLQEAVQFLTL-HAASD-TQM-KLLYHFQRPPAAPAAPAKEP
LKHSKAF-AAAAGPK--SLHQDYSEGYSWCLQEAVQFLTL-HAASD-TQM-KLLYHFQRP-PAPAAPAKEP
LREQPASVCSTEAPGSLD---SYLEGYRACLARLARVLPACSVLEPAVSA-RLLEHLRQRTVSGGPPS-L-
LKYSRAFAASA---K--SLQQDYCEGYAWCLKEALQFLSL-HSANT-ETQMKLICHFQRSQAMPKDSG-SP
LRQQTL-QIKSEIPHNNDIQMDYKDGYSRCFEEVIDFLSL-HQKQP-ETA-KLISHFHSKATASSISS-FP
LKQQL----RPK-----TPQNAQIEGYSQCWRETISFLSV-GSEAV-AQ--RLQQEAQRS-AAPELTH-TS

12345678901234567890123456789012345
KAPGAAPPP-ALSAKATAA-AAAAHQPACGLWRPW
PAPGAAPQPARSSAKAAAAAVSTSRQPACGLWRPW
-------TPASASAPAPSPP--VPPPSSLGLWRPW
SA------S-TSTHQPSAKQTPV--KPSCNLWRPW
I---------RCSQSKTANGT----GSSSSLWRPW
EA------P-HQQHTHIKQEP----RAHAPLWRPW
T-coffee multiple sequence alignment of the hairy and enhancer of split 5 protein sequences from human, house mouse, rat, chicken, Xenopus and zebrafish; default Blosum matrix
1234567890123456789012345678901234567890123456789012345678901234567890
MAPST--VAVELLSPKEKNRLRKPVVEKMRRDRINSSIEQLKLLLEQEFARHQPN-SKLEKADILEMAVS
MAPST--VAVEMLSPKEKNRLRKPVVEKMRRDRINSSIEQLKLLLEQEFARHQPN-SKLEKADILEMAVS
MR-----LPRGVGDAAELRKSLKPLLEKRRRARINESLSQLKGLVLPLLGAETSRYSKLEKADILEMTVR
MAPSA--LSLEILTPKEKNRLRKPIVEKLRRDRINSSIEQLKLLLEKEFQRHQPN-SKLEKADILEMTVS
MAPSTDFLDQQKMTPKEKNKLRKPVVEKMRRDRINSSIEQLKGLLETVFHKQQPN-VKLEKADILEMTVT
MAPAY-MTEYSKLSNKEKHKLRKPVVEKMRRDRINNCIEQLKSMLEKEFQQQDPN-AKLEKADILEMTVV

1234567890123456789012345678901234567890123456789012345678901234567890
YLKHSKAFVAAAGP--KSLHQDYSEGYSWCLQEAVQFLTLHAA--SDTQMKLLYHFQRPPAAPAAPAKEP
YLKHSKAFAAAAGP--KSLHQDYSEGYSWCLQEAVQFLTLHAA--SDTQMKLLYHFQRPP-APAAPAKEP
FLREQPASVCSTEA--PGSLDSYLEGYRACLARLARVLPACSVLEPAVSARLLEHL-----------RQR
YLKYSRAFAASA----KSLQQDYCEGYAWCLKEALQFLSLHSA-NTETQMKLICHFQRSQAMP----KDS
YLRQQTLQIKSEIPHNNDIQMDYKDGYSRCFEEVIDFLSLHQK-QPET-AKLISHFH----------SKA
FLKQQLRP--------KTPQNAQIEGYSQCWRETISFLSVGSE---AVAQRLQQEAQRSAA------PEL

12345678901234567890123456789012345
KAPGAAPPPALSAKATAAAA--AAHQPACGLWRPW
PAPGAAPQPARSSAKAAAAAVSTSRQPACGLWRPW
TVSGGPPSLTPASASAPAPSPPVPPPSSLGLWRPW
GSPSA------STSTHQPSAKQTPVKPSCNLWRPW
TASSISSFPIRCSQSKTANGTGS----SSSLWRPW
THTSEAPHQQHTHIKQEPRA-------HAPLWRPW
MAFFT multiple sequence alignment of the hairy and enhancer of split 5 protein sequences from human, house mouse, rat, chicken, Xenopus and zebrafish
1234567890123456789012345678901234567890123456789012345678901234567890
MAPST--VAVELLSPKEKNRLRKPVVEKMRRDRINSSIEQLKLLLEQEF-ARHQPNSKLEKADILEMAVS
MAPST--VAVEMLSPKEKNRLRKPVVEKMRRDRINSSIEQLKLLLEQEF-ARHQPNSKLEKADILEMAVS
MR-----LPRGVGDAAELRKSLKPLLEKRRRARINESLSQLKGLVLPLLGAETSRYSKLEKADILEMTVR
MAPSA--LSLEILTPKEKNRLRKPIVEKLRRDRINSSIEQLKLLLEKEF-QRHQPNSKLEKADILEMTVS
MAPSTDFLDQQKMTPKEKNKLRKPVVEKMRRDRINSSIEQLKGLLETVF-HKQQPNVKLEKADILEMTVT
MAPAY-MTEYSKLSNKEKHKLRKPVVEKMRRDRINNCIEQLKSMLEKEF-QQQDPNAKLEKADILEMTVV

1234567890123456789012345678901234567890123456789012345678901234567890
YLKHSKAFVAAAGPKS--LHQDYSEGYSWCLQEAVQFLTLHAA--SDTQMKLLYHFQRPPAAPAAPAKEP
YLKHSKAFAAAAGPKS--LHQDYSEGYSWCLQEAVQFLTLHAA--SDTQMKLLYHFQRPP-APAAPAKEP
FLREQPASVCSTEAPG--SLDSYLEGYRACLARLARVLPACSVLEPAVSARLLEHLRQRT-VSGGPPSLT
YLKYSRAFAASA--KS--LQQDYCEGYAWCLKEALQFLSLHSA-NTETQMKLICHFQRSQ----AMPKDS
YLRQQTLQIKSEIPHNNDIQMDYKDGYSRCFEEVIDFLSLHQK--QPETAKLISHFHSKA----------
FLKQQ------LRPKT--PQNAQIEGYSQCWRETISFLSVGSE--AVAQ-----RLQQEAQRSAAP--EL

12345678901234567890123456789012345
KAPGAAPPPALSAKATAAAA--AAHQPACGLWRPW
PAPGAAPQPARSSAKAAAAAVSTSRQPACGLWRPW
PASASAPAPSPP----------VPPPSSLGLWRPW
GSPSASTSTHQPSAKQ------TPVKPSCNLWRPW
TASSISSFPIRCSQSKTA----NGTGSSSSLWRPW
THTSEAPHQQHTHIK-------QEPRAHAPLWRPW
MUSCLE multiple sequence alignment of the hairy and enhancer of split 5 protein sequences from human, house mouse, rat, chicken, Xenopus and zebrafish
1234567890123456789012345678901234567890123456789012345678901234567890
MAPSTV--AVELLSPKEKNRLRKPVVEKMRRDRINSSIEQLKLLLEQEF-ARHQPNSKLEKADILEMAVS
MAPSTV--AVEMLSPKEKNRLRKPVVEKMRRDRINSSIEQLKLLLEQEF-ARHQPNSKLEKADILEMAVS
MRLPR-----GVGDAAELRKSLKPLLEKRRRARINESLSQLKGLVLPLLGAETSRYSKLEKADILEMTVR
MAPSAL--SLEILTPKEKNRLRKPIVEKLRRDRINSSIEQLKLLLEKEF-QRHQPNSKLEKADILEMTVS
MAPSTDFLDQQKMTPKEKNKLRKPVVEKMRRDRINSSIEQLKGLLETVF-HKQQPNVKLEKADILEMTVT
MAPAYM-TEYSKLSNKEKHKLRKPVVEKMRRDRINNCIEQLKSMLEKEF-QQQDPNAKLEKADILEMTVV

1234567890123456789012345678901234567890123456789012345678901234567890
YLKHSKAFVAA-AGP--KSLHQDYSEGYSWCLQEAVQFLTLHAAS--DTQMKLLYHFQRPPAAPAAPAKE
YLKHSKAFAAA-AGP--KSLHQDYSEGYSWCLQEAVQFLTLHAAS--DTQMKLLYHFQRPPAPAA-PAKE
FLREQPASVCSTEAP--GSL-DSYLEGYRACLARLARVLPACSVLEPAVSARLLEHLRQRTVSGG-PPSL
YLKYSRAFAAS---A--KSLQQDYCEGYAWCLKEALQFLSLHSAN-TETQMKLICHFQRSQAMPK-DSGS
YLRQQTLQIKS-EIPHNNDIQMDYKDGYSRCFEEVIDFLSLHQKQ--PETAKLISHFHSKATASS-ISSF
FLKQQLR-------P--KTPQNAQIEGYSQCWRETISFLSVGSEA---VAQRLQQEAQRSAA-PE-LTHT

123456789012345678901234567890123456
PKAPGAAPPPALS-AKAT-AAAAAAHQPACGLWRPW
PPAPGAAPQPARSSAKAAAAAVSTSRQPACGLWRPW
TPASASAPAP----------SPPVPPPSSLGLWRPW
PSASTSTHQP---------SAKQTPVKPSCNLWRPW
PIRCSQS-------------KTANGTGSSSSLWRPW
SEAPHQQHT-----------HIKQEPRAHAPLWRPW
ProbCons multiple sequence alignment of the hairy and enhancer of split 5 protein sequences from human, house mouse, rat, chicken, Xenopus and zebrafish
1234567890123456789012345678901234567890123456789012345678901234567890
MA-PSTV--AV--EL--L-----SPKEKNRLRKPVVEKMRRDRIN-S-SIEQLK-L---LLGAETS--RY
MA-PSTV--AV--EM--L-----SPKEKNRLRKPVVEKMRRDRIN-S-SIEQLK-L---LL--EKEFQ--
MRLPRGVGDAA--E---LR--------KS-L-KPLLEKRRRARINESLS--QLKGLVLPLL--ETVF--H
MA-PS----ALSLEI--L-----TPKEKNRLRKPIVEKLRRDRIN-S-SIEQLK-L---LL--EKEFQRH
MA-PSTD--F-------LDQQKMTPKEKNKLRKPVVEKMRRDRIN-S-SIEQLK-G---LL--EQEFARH
MA-P-----AYMTEYSKL-----SNKEKHKLRKPVVEKMRRDRIN-N-CIEQLK-S---ML--EQEFARH

1234567890123456789012        34567890123456789012345678901234567890
------SKLEKADILEMTVRFL|70 82 |S-L--H----Q-D-YSEGYSWCL-QEA-VQ-FL---T-
-QQDPNAKLEKADILEMTVVFL|70 82 |S-L--H----Q-D-YSEGYSWCL-QEA-VQ-FL---T-
KQQ-PNVKLEKADILEMTVTYL|68 81 |S-L---------DSYLEGYRACLARLARV---LPACSV
--Q-PNSKLEKADILEMTVSYL|70 80 |S-L--Q----Q-D-YCEGYAWCL-KEA-LQ-FL---S-
--Q-PNSKLEKADILEMAVSYL|72 79 |S-EIPHNNDIQMD-YKDGYSRCF-EE--VIDFL---S-
--Q-PNSKLEKADILEMAVSYL|71 71 |QQLRPKTP--QNA-QIEGYSQCW-RET--ISFL---S-

1234567890123456789012345678901234567890123456789012345         6789012345
LHA-ASD-TQMKLLYHF-QR--SGGPPSLTPASAS-AP--APSP--------PVP|143 157|PAC-GLWRPW
LHA-ASD-TQMKLLYHF-QR--S-AAP------ELTHTSEA--PHQQHTHIKQEP|142 158|PAC-GLWRPW
LEP-AVS--AR-LLEHLRQRTVS---------K---AT--A-SSI---SS--F-P|147 147|PSSLGLWRPW
LHS-ANTETQMKLICHF-QR--SQAMP-----KDSGSPS-A-STS---TH--Q-P|141 148|PSC-NLWRPW
LHQ-KQPETA-KLISHF-H---PPAAPAA-PAKEPKAPG-A-AP--------P-P|137 152|-S---LWRPW
VGSEAVA--QR-LQQEA-QR--PPA-PAA-PAKEPPAPG-A-AP--------Q-P|140 143|P----LWRPW
MAP2 multiple sequence alignment of the hairy and enhancer of split 5 protein sequences from human, house mouse, rat, chicken, Xenopus and zebrafish; major_diff=10, mismatch=-1, gap_open=1, gap_extend=1
123456789012345678901234567890123456789012345678901234567890
MA-PSTV--AV--EL--L-----SPKEKNRLRKPVVEKMRRDRIN-S-SIEQLK-L---L
MA-PSTV--AV--EM--L-----SPKEKNRLRKPVVEKMRRDRIN-S-SIEQLK-L---L
MRLPRGVGDAA--E---LR--------KS-L-KPLLEKRRRARINESLS--QLKGLVLPL
MA-PS----ALSLEI--L-----TPKEKNRLRKPIVEKLRRDRIN-S-SIEQLK-L---L
MA-PSTD--F-------LDQQKMTPKEKNKLRKPVVEKMRRDRIN-S-SIEQLK-G---L
MA-P-----AYMTEYSKL-----SNKEKHKLRKPVVEKMRRDRIN-N-CIEQLK-S---M

L--EQEFARH--Q-PNSKLEKADILEMAVSYLKHSKAFVAA-----AGPKSL--H----Q
L--EQEFARH--Q-PNSKLEKADILEMAVSYLKHSKAFAAA-----AGPKSL--H----Q
LGAETS--RY------SKLEKADILEMTVRFL---REQPASVCSTEA-PGSL--------
L--EKEFQRH--Q-PNSKLEKADILEMTVSYLKYSRAF-AA--S--A--KSL--Q----Q
L--ETVF--HKQQ-PNVKLEKADILEMTVTYL---RQQTLQ--I-----KSEIPHNNDIQ
L--EKEFQ---QQDPNAKLEKADILEMTVVFLK---QQ-L-------RPKT--PQNA--Q

-D-YSEGYSWCL-QEA-VQ-FL---T-LHA-ASD-TQMKLLYHF-QR--PPAAPAAPAKE
-D-YSEGYSWCL-QEA-VQ-FL---T-LHA-ASD-TQMKLLYHF-QR--PPA-PAAPAKE
-DSYLEGYRACLARLARV---LPACSVLEP-AVS--AR-LLEHLRQRTVSGG-P------
-D-YCEGYAWCL-KEA-LQ-FL---S-LHS-ANTETQMKLICHF-QR--SQAMP----KD
MD-YKDGYSRCF-EE--VIDFL---S-LHQ-KQPETA-KLISHF-H---S--------K-
I----EGYSQCW-RET--ISFL---S-VGSEAVA--QR-LQQEA-QR--S-AAP-----E

PKAPG-AAP--------P-PAL-S-AKATAAAAAAH--QP--ACGLWRPW
PPAPG-AAP--------Q-PAR-SSAKA-AAAAVSTSRQP--ACGLWRPW
---PSL-------T-----PA--S-ASAPAP---SPPVPPPSSLGLWRPW
SGSPS-ASTS---TH--Q-P---S-AKQTP---VK----P--SCNLWRPW
--AT--ASSI---SS--F-PIRCSQSK-TANGTGSSS-----S--LWRPW
LTHTSEA-PHQQHTHIKQEP-R---AH--A---------P-----LWRPW
MAP2 multiple sequence alignment of the hairy and enhancer of split 5 protein sequences from human, house mouse, rat, chicken, Xenopus and zebrafish; major_diff=20, mismatch=-1, gap_open=1, gap_extend=1