
Master MBI, 2nd year - Université Paris VII, Bioinformatics Module




  1. Master MBI, 2nd year - Université Paris VII. Bioinformatics Module: Structural Genomics and Functional Genomics. Genome bioinformatics, 7-10 October 2013

  2. Definition of Structural Genomics
      • Definition, at nucleotide resolution, of the structure of a genome
      • Definition of its sequence and of its annotation into structural elements identifiable by bioinformatics techniques
      • Structural genomics extends to determining the 3D structures of the proteins and RNAs encoded by a genome (using high-throughput techniques)
     Definition of Functional Genomics
      • Annotation of the various elements (genes, protein-coding regions, regulatory elements) in relation to a biological function (in particular by similarity)

  3. Course outline:
      • Growth in the number of nucleotide and protein sequences
      • Information accessible in database records
      • Database resources
      • Some examples of content (EBI, ACNUC, NCBI)
      • How to query these servers

  4. More generally:
      • 1. Access to comparative information through existing annotations (database queries)
      • 2. Extension of the comparative analysis by similarity
          • based on the nucleic acid or protein sequence
          • Fast database searches (e.g. FASTA, BLAST)
          • Multiple alignment (functional conservation or evolutionary divergence)

  5. Growth in the quantity and complexity of nucleotide sequences

  6. Out of date

  7. NGS

  8. TAXONOMIC DIVISIONS / TECHNOLOGICAL DIVISIONS

  9. Official terms for FEATURES

  10. Every sequence recorded in a sequence database is structured into fields. The quality of annotation has changed enormously since the beginning (1982). The following example is the sequence of a fragment of the human insulin receptor gene (last exon), to be compared with an equivalent present-day entry. Nevertheless, there is continuity in the annotations. Moreover, these annotations remain readable by humans and interpretable by computers.
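As a minimal sketch of what "structured into fields" means in practice, the Python snippet below groups the lines of an EMBL-style entry by their two-letter line code. The function name `parse_embl_fields` and the abbreviated sample entry are assumptions made for illustration; this is not an official parser.

```python
from collections import defaultdict

def parse_embl_fields(text):
    """Group the lines of one EMBL-style entry by their two-letter line code."""
    fields = defaultdict(list)
    for line in text.splitlines():
        if line.startswith("//"):          # end-of-entry marker
            break
        if len(line) < 2 or not line[:2].isalpha():
            continue                       # blank lines and sequence data lines
        code, value = line[:2], line[2:].strip()
        if value:
            fields[code].append(value)
    return dict(fields)

# Abbreviated copy of the entry shown on the next slide (illustrative subset).
entry = """\
ID   HSINSR24   standard; DNA; PRI; 873 BP.
AC   M32972;
DE   Human insulin receptor (hINSR) gene, exon 22.
OS   Homo sapiens (human)
OC   Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia;
OC   Theria; Eutheria; Primates; Haplorhini; Catarrhini; Hominidae.
//
"""

fields = parse_embl_fields(entry)
# The OC lines together form one semicolon-separated taxonomic lineage.
lineage = [t.strip(" .") for t in " ".join(fields["OC"]).split(";") if t.strip(" .")]
print(fields["AC"], lineage[0], lineage[-1])
```

Because each line announces its own field, the same text stays readable for a person and trivially splittable for a program.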

  11. Example of an EMBL sequence entry (1992)

      ID   HSINSR24   standard; DNA; PRI; 873 BP.
      AC   M32972;
      DT   10-JUL-1990 (Rel. 24, Created)
      DT   05-SEP-1992 (Rel. 33, Last updated, Version 4)
      DE   Human insulin receptor (hINSR) gene, exon 22.
      KW   insulin receptor.
      OS   Homo sapiens (human)
      OC   Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia;
      OC   Theria; Eutheria; Primates; Haplorhini; Catarrhini; Hominidae.
      RN   [1]
      RP   1-873
      RA   Seino S., Seino M., Bell G.I.;
      RT   "Human insulin-receptor gene";
      RL   Diabetes 39:123-128(1990).
      CC   Draft entry and computer-readable sequence for [1] kindly submitted
      CC   by G.I.Bell, 14-MAR-1990.
      FH   Key             Location/Qualifiers
      FH
      FT   CDS             join(M23100:1824..1923,M32823:174..725,M32824:114..435,
      FT                   M32825:318..466,M32826:188..332,M32827:189..403,M32828:277
      FT                   ..403,M32829:124..374,M32830:106..273,M32831:187..388,
      FT                   M32832:123..158,M32833:161..435,M32834:93..232,M32835:85..
      FT                   244,M32836:92..194,M32837:261..328,M32838:136..380,M32839:
      FT                   117..227,M32840:45..204,M32841:115..244,M32842:101..235,83
      FT                   ..437)
      FT                   /note="human insulin receptor precursor"
      FT   source          1..873
      FT                   /organism="Homo sapiens"
      FT   mat_peptide     join(M23100:1905..1923,M32823:174..725,M32824:114..435,
      FT                   M32825:318..466,M32826:188..332,M32827:189..403,M32828:277
      FT                   ..403,M32829:124..374,M32830:106..273,M32831:187..388,
      FT                   M32832:123..158,M32833:161..435,M32834:93..232,M32835:85..
      FT                   244,M32836:92..194,M32837:261..328,M32838:136..380,M32839:
      FT                   117..227,M32840:45..204,M32841:115..244,M32842:101..235,83
      FT                   ..434)
      FT                   /note="human insulin receptor"

  12. Example of an EMBL sequence entry (continued)

      FH   Key             Location/Qualifiers
      FH
      FT                   /organism="Homo sapiens"
      FT   mat_peptide     join(M23100:1905..1923,M32823:174..725,M32824:114..435,
      FT                   M32825:318..466,M32826:188..332,M32827:189..403,M32828:277
      FT                   ..403,M32829:124..374,M32830:106..273,M32831:187..388,
      FT                   M32832:123..158,M32833:161..435,M32834:93..232,M32835:85..
      FT                   244,M32836:92..194,M32837:261..328,M32838:136..380,M32839:
      FT                   117..227,M32840:45..204,M32841:115..244,M32842:101..235,83
      FT                   ..434)
      FT                   /note="human insulin receptor"
      FT   prim_transcript <1..873
      FT                   /note="hINSR mRNA and introns"
      FT   intron          <1..82
      FT                   /note="hINSR intron U"
      SQ   Sequence 873 BP; 199 A; 217 C; 234 G; 223 T; 0 other;
           gacgtgtcct tctgccccgc agcactgacc tcatgcgcat gtgctggcaa ttcaacccca       120
           agatgaggcc aaccttcctg gagattgtca acctgctcaa ggacgacctg caccccagct       180
           ttccagaggt gtcgttcttc cacagcgagg agaacaaggc tcccgagagt gaggagctgg       240
           agatggagtt tgaggacatg gagaatgtgc ccctggaccg ttcctcgcac tgtcagaggg       300
           aggaggcggg gggccgggat ggagggtcct cgctgggttt caagcggagc tacgaggaac       360
           acatccctta cacacacatg aacggaggca agaaaaacgg gcggattctg accttgcctc       420
           ...
           cccccacccg cccccagcag atggaaagaa agcacctgtt tttacaaatt cttttttttt       780
           tttttttttt tttttttttg ctggtgtctg agcttcagta taaaagacaa aacttcctgt       840
           ttgtggaaca aaatttcgaa agaaaaaacc aaa                                    873
      //

      Comments:
      • Taxonomic lineages are recorded in the entry
      • Syntax for defining the coding sequence (CDS) from sub-sequences of several entries, in order to reconstitute the mature RNA. The annotation is based on 22 independently sequenced exons (the CDS join lists 22 segments, and this entry holds exon 22, the last one)
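The CDS syntax commented on above can be sketched in a few lines of Python. The snippet splits the `join(...)` location of the CDS feature into (accession, start, end) segments, where a segment without an accession prefix refers to the current entry (M32972). The names `parse_join` and `SEGMENT` are hypothetical, and the sketch handles only this simple form of location, not the full feature table grammar.

```python
import re

# One segment: optional "ACCESSION:" prefix, then "start..end".
SEGMENT = re.compile(r"(?:(?P<acc>[A-Z]+\d+):)?(?P<start>\d+)\.\.(?P<end>\d+)")

def parse_join(location, local_accession):
    """Split a join(...) location into (accession, start, end) tuples."""
    inner = re.sub(r"\s+", "", location)   # undo the line wrapping of the flat file
    if not (inner.startswith("join(") and inner.endswith(")")):
        raise ValueError("only join(...) locations are handled in this sketch")
    segments = []
    for part in inner[len("join("):-1].split(","):
        m = SEGMENT.fullmatch(part)
        if m is None:
            raise ValueError(f"unsupported segment: {part!r}")
        segments.append((m["acc"] or local_accession, int(m["start"]), int(m["end"])))
    return segments

# CDS location copied from the entry, with its original line wrapping.
cds_location = """join(M23100:1824..1923,M32823:174..725,M32824:114..435,
M32825:318..466,M32826:188..332,M32827:189..403,M32828:277
..403,M32829:124..374,M32830:106..273,M32831:187..388,
M32832:123..158,M32833:161..435,M32834:93..232,M32835:85..
244,M32836:92..194,M32837:261..328,M32838:136..380,M32839:
117..227,M32840:45..204,M32841:115..244,M32842:101..235,83
..437)"""

segments = parse_join(cds_location, local_accession="M32972")
cds_length = sum(end - start + 1 for _, start, end in segments)
print(len(segments), cds_length, cds_length // 3 - 1)
```

The 22 segments add up to 4149 nucleotides, that is 1383 codons: 1382 amino acids plus a stop codon, which agrees with the 1382 AA length given in the Swiss-Prot entry for the insulin receptor precursor later in this deck.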

  13. Current graphical representations

  14. Another example of features in a eukaryotic genomic sequence

      LOCUS       HUMAFP      27553 bp    DNA             PRI       26-MAY-1995
      DEFINITION  Human alpha-fetoprotein gene, complete cds.
      ACCESSION   M16110
      NID         g773678
      KEYWORDS    alpha-fetoprotein.
      SOURCE      human.
        ORGANISM  Homo sapiens
                  Eukaryotae; mitochondrial eukaryotes; Metazoa/Eumycota group;
                  Metazoa; Eumetazoa; Bilateria; Coelomata; Deuterostomia;
                  Chordata; Vertebrata; Gnathostomata; Osteichthyes;
                  Sarcopterygii; Choanata; Tetrapoda; Amniota; Mammalia; Theria;
                  Eutheria; Archonta; Primates; Catarrhini; Hominidae; Homo.
      REFERENCE   1  (bases 1 to 27553)
        AUTHORS   Gibbs,P.E., Zielinski,R., Boyd,C. and Dugaiczyk,A.
        TITLE     Structure, polymorphism, and novel repeated DNA elements
                  revealed by a complete sequence of the human
                  alpha-fetoprotein gene
        JOURNAL   Biochemistry 26 (5), 1332-1343 (1987)
        MEDLINE   87185438
      ...
           source          1..27553
                           /clone_lib="charon 4a library of T. Maniatis"
                           /organism="Homo sapiens"
                           /map="4q11-22"
                           /tissue_type="liver"
                           /dev_stage="fetus"
           mat_peptide     join(2242..3269,4082..4133,5096..5228,7516..7727,
                           9214..9346,10265..10362,11911..12040,14316..14530,
                           16188..16320,16889..16986,17469..17607,19255..19478,
                           20619..20751,22090..22131)
                           /evidence=experimental
                           /citation=[1]
                           /product="alpha-fetoprotein"

  15.      source          1..27553
                           /clone_lib="charon 4a library of T. Maniatis"
                           /organism="Homo sapiens"
                           /map="4q11-22"
                           /tissue_type="liver"
                           /dev_stage="fetus"
           mat_peptide     join(2242..3269,4082..4133,5096..5228,7516..7727,
                           9214..9346,10265..10362,11911..12040,14316..14530,
                           16188..16320,16889..16986,17469..17607,19255..19478,
                           20619..20751,22090..22131)
                           /evidence=experimental
                           /citation=[1]
                           /product="alpha-fetoprotein"
           repeat_unit     2609..2701
                           /rpt_type=direct
                           /evidence=experimental
                           /rpt_family="x"
                           /citation=[1]
           repeat_unit     2835..2930
                           /rpt_type=direct
                           /evidence=experimental
                           /rpt_family="x"
                           /citation=[1]
           CAAT_signal     3072..3076
           TATA_signal     3114..3119
                           /note="putative"
                           /evidence=experimental
                           /citation=[1]
           prim_transcript 3141..22629
           mRNA            join(3141..3269,4082..4133,5096..5228,7516..7727,
                           9214..9346,10265..10362,11911..12040,14316..14530,
                           16188..16320,16889..16986,17469..17607,19255..19478,
                           20619..20751,22090..22144,22485..22629)
           exon            3141..3269
           sig_peptide     3185..3241
           CDS_pept        join(3185..3269,4082..4133,5096..5228,7516..7727,
                           9214..9346,10265..10362,11911..12040,14316..14530,
                           16188..16320,16889..16986,17469..17607,19255..19478,
                           20619..20751,22090..22134)
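The mRNA `join(...)` above already determines every intron: each intron is the gap between two consecutive exon segments. A minimal Python sketch, with the hypothetical helper names `spans` and `introns_between`, recomputes the intron coordinates that the entry lists explicitly as `intron` features on the following slides.

```python
import re

def spans(join_location):
    """Return the (start, end) pairs of a single-entry join(...) location."""
    return [(int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.(\d+)", join_location)]

def introns_between(exons):
    """Introns are the gaps between consecutive exons (1-based, inclusive)."""
    return [(prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(exons, exons[1:])]

# mRNA location copied from the HUMAFP entry.
mrna = """join(3141..3269,4082..4133,5096..5228,7516..7727,
9214..9346,10265..10362,11911..12040,14316..14530,
16188..16320,16889..16986,17469..17607,19255..19478,
20619..20751,22090..22144,22485..22629)"""

exons = spans(mrna)
introns = introns_between(exons)
for start, end in introns:
    print(f"intron {start}..{end}")
```

The 15 exon segments give 14 introns, starting with 3270..4081 and ending with 22145..22484, the same coordinates as the annotated `intron` features. The explicit features are therefore redundant with the join but easier to read.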

  16.      CDS_pept        join(3185..3269,4082..4133,5096..5228,7516..7727,
                           9214..9346,10265..10362,11911..12040,14316..14530,
                           16188..16320,16889..16986,17469..17607,19255..19478,
                           20619..20751,22090..22134)
                           /codon_start=1
                           /evidence=experimental
                           /citation=[1]
                           /product="alpha-fetoprotein"
                           /db_xref="PID:g178236"
                           /translation="MKWVESIFLIFLLNFTESRTLHRNEYGIASILDSYQCTAEISLA
                           DLATIFFAQFVQEATYKEVSKMVKDALTAIEKPTGDEQSSGCLENQLPAFLEELCHEK
                           EILEKYGHSDCCSQSEEGRHNCFLAHKKPTPASIPLFQVPEPVTSCEAYEEDRETFMN
                           KFIYEIARRHPFLYAPTILLWAARYDKIIPSCCKAENAVECFQTKAATVTKELRESSL
                           LNQHACAVMKNFGTRTFQAITVTKLSQKFTKVNFTEIQKLVLDVAHVHEHCCRGDVLD
                           CLQDGEKIMSYICSQQDTLSNKITECCKLTTLERGQCIIHAENDEKPEGLSPNLNRFL
                           GDRDFNQFSSGEKNIFLASFVHEYSRRHPQLAVSVILRVAKGYQELLEKCFQTENPLE
                           CQDKGEEELQKYIQESQALAKRSCGLFQKLGEYYLQNAFLVAYTKKAPQLTSSELMAI
                           TRKMAATAATCCQLSEDKLLACGEGAADIIIGHLCIRHEMTPVNPGVGQCCTSSYANR
                           RPCFSSLVVDETYVPPAFSDDKFIFHKDLCQAQGVALQTMKQEFLINLVKQKPQITEE
                           QLEAVIADFSGLLEKCCQGQEQEVCFAEEGQKLISKTRAALGV"
           intron          3270..4081
           exon            4082..4133
           intron          4134..5095
           exon            5096..5228
           intron          5229..7515
           repeat_unit     complement(6956..7181)
                           /rpt_type=inverted
                           /evidence=experimental
                           /rpt_family="Kpn"
                           /citation=[1]
           exon            7516..7727
           intron          7728..9213
           repeat_region   8188..8198
                           /rpt_type=flanking
                           /evidence=experimental
                           /rpt_family="Alu"
                           /citation=[1]
           repeat_region   8199..8502
                           /rpt_type=dispersed
                           /evidence=experimental
                           /rpt_family="Alu"
                           /citation=[1]
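A quick consistency check that can be run on such an annotation: the segments of the CDS join should add up to a multiple of three. The sketch below (the helper name `joined_length` is an assumption) sums the segment lengths of the `CDS_pept` location shown above.

```python
import re

def joined_length(join_location):
    """Total number of bases covered by the segments of a join(...) location."""
    return sum(int(end) - int(start) + 1
               for start, end in re.findall(r"(\d+)\.\.(\d+)", join_location))

# CDS_pept location copied from the HUMAFP entry.
cds = """join(3185..3269,4082..4133,5096..5228,7516..7727,
9214..9346,10265..10362,11911..12040,14316..14530,
16188..16320,16889..16986,17469..17607,19255..19478,
20619..20751,22090..22134)"""

cds_length = joined_length(cds)
codons, remainder = divmod(cds_length, 3)
print(cds_length, codons, remainder)
```

The total is 1830 bases, or 610 codons with no remainder: 609 amino-acid codons plus one stop codon. This is consistent with the `terminator 22132` feature of the same entry, since positions 22132..22134 are the last three bases of the CDS.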

  17.      repeat_region   8503..8513
                           /rpt_type=flanking
                           /evidence=experimental
                           /rpt_family="Alu"
                           /citation=[1]
           allele          replace(8691..8693,"tc")
                           /citation=[1]
           allele          replace(8722..8724,"at")
                           /citation=[1]
           allele          replace(8922,"g")
                           /citation=[1]
           exon            9214..9346
           intron          9347..10264
           allele          replace(9691..9695,"ac")
                           /citation=[1]
           exon            10265..10362
           intron          10363..11910
           allele          replace(10673,"g")
                           /citation=[1]
           exon            11911..12040
           intron          12041..14315
           allele          replace(12634,"g")
                           /citation=[1]
           repeat_region   13818..14120
                           /rpt_type=direct
                           /evidence=experimental
                           /rpt_family="xba"
                           /citation=[1]
           exon            14316..14530
           intron          14531..16187
           repeat_region   15092..15394
                           /rpt_type=direct
                           /evidence=experimental
                           /rpt_family="xba"
                           /citation=[1]
           exon            16188..16320
           intron          16321..16888
           exon            16889..16986
           intron          16987..17468

  18.      exon            17469..17607
           intron          17608..19254
           exon            19255..19478
           intron          19479..20618
           exon            20619..20751
           intron          20752..22089
           exon            22090..22144
           terminator      22132
           intron          22145..22484
           exon            22485..22629
           polyA_signal    22629
           repeat_region   complement(23761..23771)
                           /rpt_type=flanking
                           /evidence=experimental
                           /rpt_family="Alu"
                           /citation=[1]
           repeat_region   complement(23772..23901)
                           /rpt_type=dispersed
                           /evidence=experimental
                           /rpt_family="Alu"
                           /citation=[1]
           repeat_region   complement(23902..23912)
                           /rpt_type=flanking
                           /evidence=experimental
                           /rpt_family="Alu"
                           /citation=[1]
      ..

      Consult the relnotes.txt files distributed with each database release. They describe the continual changes in how the information in each entry is standardized, including exhaustive lists of feature names, their locations, and their qualifiers.
      ftp://ftp.ncbi.nih.gov/pub/genbank/gbrel.txt
      The DDBJ/EMBL/GenBank Feature Table Definition: http://www.ebi.ac.uk/ebi_docs/embl_db/ft/feature_table.html

  19. Example of a SwissProt sequence entry

      DT   01-JAN-1988 (REL. 06, CREATED)
      DT   01-JAN-1988 (REL. 06, LAST SEQUENCE UPDATE)
      DT   01-FEB-1996 (REL. 33, LAST ANNOTATION UPDATE)
      DE   INSULIN RECEPTOR PRECURSOR (EC 2.7.1.112) (IR).
      GN   INSR.
      OS   HOMO SAPIENS (HUMAN).
      OC   EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
      OC   EUTHERIA; PRIMATES.
      RN   [1]
      RP   SEQUENCE FROM N.A.
      RX   MEDLINE; 85176928.
      RA   EBINA Y., ELLIS L., JARNAGIN K., EDERY M., GRAF L., CLAUSER E.,
      RA   OU J.-H., MASIARZ F., KAN Y.W., GOLDFINE I.D., ROTH R.A., RUTTER W.J.;
      RL   CELL 40:747-758(1985).
      RN   [2]
      RP   SEQUENCE FROM N.A.
      RX   MEDLINE; 85137889.
      RA   ULLRICH A., BELL J.R., CHEN E.Y., HERRERA R., PETRUZZELLI L.M.,
      RA   DULL T.J., GRAY A., COUSSENS L., LIAO Y.-C., TSUBOKAWA M.,
      RA   MASON A., SEEBURG P.H., GRUNFELD C., ROSEN O.M., RAMACHANDRAN J.;
      RL   NATURE 313:756-761(1985).
      ..
      RN   [36]
      RP   VARIANT LEU-1205.
      RX   MEDLINE; 92225265.
      RA   KIM H., KADOWAKI H., SAKURA H., ODAWARA M., MOMOMURA K., TAKAHASHI Y.,
      RA   MIYAZAKI Y., OHTANI T., AKANUMA Y., YAZAKI Y.;
      RL   DIABETOLOGIA 35:261-266(1992).
      RN   [37]
      RP   VARIANT LEU-1220.
      RX   MEDLINE; 93300291.
      RA   IWANISHI M., HARUTA T., TAKATA Y., ISHIBASHI O., SASAOKA T., EGAWA K.,
      RA   IMAMURA T., NAITOU K., ITAZU T., KOBAYASHI M.;
      RL   DIABETOLOGIA 36:414-422(1993).
      RN   [38]

  20. RN   [38]
      RP   VARIANT SER-1227.
      RX   MEDLINE; 91155951.
      RA   MOLLER D.E., YOKOTA A., GINSBERG-FELLNER F., FLIER J.S.;
      RL   MOL. ENDOCRINOL. 4:1183-1191(1990).
      CC   -!- FUNCTION: THIS RECEPTOR BINDS INSULIN AND HAS A TYROSINE-PROTEIN
      CC       KINASE ACTIVITY.
      CC   -!- CATALYTIC ACTIVITY: ATP + A PROTEIN TYROSINE = ADP +
      CC       PROTEIN TYROSINE PHOSPHATE.
      CC   -!- AFTER BEING TRANSPORTED FROM THE ENDOPLASMIC RETICULUM TO THE
      CC       GOLGI APPARATUS, THE SINGLE GLYCOSYLATED PRECURSOR IS FURTHER
      CC       GLYCOSYLATED AND THEN CLEAVED, FOLLOWED BY ITS TRANSPORT TO
      CC       THE PLASMA MEMBRANE.
      CC   -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
      CC   -!- SUBUNIT: TETRAMER OF 2 ALPHA AND 2 BETA CHAINS LINKED BY DISULFIDE
      CC       BONDS. THE ALPHA CHAINS CONTRIBUTE TO THE FORMATION OF THE LIGAND-
      CC       BINDING DOMAIN, WHILE THE BETA CHAIN CARRY THE KINASE DOMAIN.
      CC   -!- ENZYME REGULATION: AUTOPHOSPHORYLATION ACTIVATES THE KINASE
      CC       ACTIVITY.
      CC   -!- SIMILARITY: BELONGS TO THE INSULIN RECEPTOR FAMILY OF TYROSINE-
      CC       PROTEIN KINASES.
      CC   -!- SIMILARITY: CONTAINS 2 FIBRONECTIN TYPE III-LIKE DOMAINS.
      CC   -!- ALTERNATIVE PRODUCTS: TWO FORMS ARE PRODUCED BY ALTERNATIVE
      CC       SPLICING. THE SECOND FORM LACKS A 12 RESIDUE PEPTIDE IN THE ALPHA
      CC       SUBUNIT.
      CC   -!- DISEASE: MUTATIONS IN INSR CAN CAUSE VARIOUS FORMS OF INSULIN
      CC       RESISTANCE AS WELL AS SOME FORMS OF DIABETES MELLITUS, NONINSULIN-
      CC       DEPENDENT (NIDDM) AND OF LEPRECHAUNISM (DONOHUE SYNDROME).
      DR   EMBL; M10051; G307070; -.
      DR   EMBL; X02160; G33973; -.
      DR   EMBL; M32972; G386830; -.
      DR   EMBL; M23100; G386830; JOINED.
      DR   EMBL; M32823; G386830; JOINED.
      ..
      DR   EMBL; M32842; G386830; JOINED.
      DR   EMBL; J03466; G463119; -.
      DR   EMBL; M27197; G186468; -.
      DR   EMBL; M27195; G186468; JOINED.
      DR   EMBL; M29929; G186472; -.
      DR   EMBL; M29930; G186474; -.
      DR   EMBL; J05043; G553515; -.
      DR   EMBL; M24555; G186481; -.
      DR   EMBL; M76592; G553512; -.
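The CC field above is itself structured: each comment block opens with `-!-`, usually followed by a `TOPIC:` label. The following Python sketch, in which the function name `parse_cc_blocks` and the choice of sample lines are illustrative assumptions, turns such lines into (topic, text) pairs and re-joins words that were hyphenated across a line break.

```python
def parse_cc_blocks(text):
    """Turn Swiss-Prot CC lines into a list of (topic, text) pairs."""
    blocks = []
    for line in text.splitlines():
        if not line.startswith("CC"):
            continue
        body = line[2:].strip()
        if body.startswith("-!-"):             # start of a new comment block
            blocks.append(body[3:].strip())
        elif blocks and body:                  # continuation of the current block
            glue = "" if blocks[-1].endswith("-") else " "
            blocks[-1] += glue + body
    result = []
    for block in blocks:
        topic, colon, rest = block.partition(":")
        # Blocks without a "TOPIC:" label are kept with topic None.
        result.append((topic.strip(), rest.strip()) if colon else (None, block))
    return result

# A few CC lines copied from the INSR_HUMAN entry.
cc_lines = """\
CC   -!- FUNCTION: THIS RECEPTOR BINDS INSULIN AND HAS A TYROSINE-PROTEIN
CC       KINASE ACTIVITY.
CC   -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
CC   -!- SIMILARITY: BELONGS TO THE INSULIN RECEPTOR FAMILY OF TYROSINE-
CC       PROTEIN KINASES.
CC   -!- SIMILARITY: CONTAINS 2 FIBRONECTIN TYPE III-LIKE DOMAINS.
"""

topics = parse_cc_blocks(cc_lines)
for topic, text in topics:
    print(topic, "=>", text)
```

A list of pairs is used rather than a dictionary because a topic such as SIMILARITY can occur more than once in the same entry.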

  21. DR   PIR; A05274; A05274.
      DR   PIR; A05275; A05275.
      DR   PIR; S03360; S03360.
      DR   PDB; 1IRK[3D_IMAGE;ENTRY;RASMOL;HSSP]; 27-FEB-95.
      DR   MIM; 147670; 11TH EDITION.
      DR   MIM; 246200; 11TH EDITION.
      DR   PROSITE; PS00107; PROTEIN_KINASE_ATP.
      DR   PROSITE; PS00109; PROTEIN_KINASE_TYR.
      DR   PROSITE; PS00239; RECEPTOR_TYR_KIN_II.
      DR   PROSITE; PS50011; PROTEIN_KINASE_DOM.
      DR   PRODOM; INSR_HUMAN;
      DR   SWISS-2DPAGE; INSR_HUMAN;
      DR   SWISS-3DIMAGE; INSR_HUMAN;
      KW   TRANSFERASE; TYROSINE-PROTEIN KINASE; RECEPTOR; TRANSMEMBRANE;
      KW   GLYCOPROTEIN; ATP-BINDING; PHOSPHORYLATION; SIGNAL; POLYMORPHISM;
      KW   DISEASE MUTATION; DIABETES; ALTERNATIVE SPLICING; REPEAT;
      KW   3D-STRUCTURE.
      FT   SIGNAL        1     27
      FT   CHAIN        28    758       ALPHA-SUBUNIT.
      FT   PROPEP      759    762       REMOVED IN MATURE FORM.
      FT   CHAIN       763   1382       BETA-SUBUNIT.
      FT   DOMAIN      763    956       EXTRACELLULAR (POTENTIAL).
      FT   TRANSMEM    957    979       POTENTIAL.
      FT   DOMAIN      980   1382       CYTOPLASMIC (POTENTIAL).
      FT   DOMAIN      182    339       CYS-RICH.
      FT   DOMAIN      618    847       FIBRONECTIN TYPE-III.
      FT   DOMAIN      848    948       FIBRONECTIN TYPE-III.
      FT   DOMAIN     1023   1298       PROTEIN KINASE.
      FT   NP_BIND    1029   1037       ATP (BY SIMILARITY).
      FT   BINDING    1057   1057       ATP.
      FT   MUTAGEN    1057   1057       K->A,M,R: ABOLISH THE KINASE ACTIVITY.
      FT   MOD_RES    1185   1185       PHOSPHORYLATION (AUTO-).
      FT   MOD_RES    1189   1189       PHOSPHORYLATION (AUTO-).
      FT   MOD_RES    1190   1190       PHOSPHORYLATION (AUTO-).
      FT   ACT_SITE    999    999       IMPORTANT FOR BIOLOGICAL ACTIVITY.
      FT   MUTAGEN     999    999       Y->F: HAS NO EFFECT ON INSULIN-STIMULATED
      FT                                AUTOPHOSPHORYLATION, BUT INHIBITS THE
      FT                                BIOLOGICAL ACTIVITY OF THE RECEPTOR.
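The DR lines are the entry's cross-references to other databases. As a rough sketch, they can be collected into a mapping from database name to primary identifiers. The names `DR_LINE` and `parse_cross_references` are hypothetical, and the regular expression only covers the layouts visible in this entry, including the bracketed suffix on the PDB line.

```python
import re
from collections import defaultdict

# Database name up to the first ';', then the primary identifier up to ';' or '['.
DR_LINE = re.compile(r"DR\s+([^;]+);\s*([^;\[]+)")

def parse_cross_references(text):
    """Map each cross-referenced database to the list of its primary identifiers."""
    xrefs = defaultdict(list)
    for line in text.splitlines():
        m = DR_LINE.match(line)
        if m:
            xrefs[m[1].strip()].append(m[2].strip())
    return dict(xrefs)

# A subset of the DR lines of the INSR_HUMAN entry.
dr_lines = """\
DR   EMBL; M10051; G307070; -.
DR   EMBL; M32972; G386830; -.
DR   PDB; 1IRK[3D_IMAGE;ENTRY;RASMOL;HSSP]; 27-FEB-95.
DR   MIM; 147670; 11TH EDITION.
DR   MIM; 246200; 11TH EDITION.
DR   PROSITE; PS00107; PROTEIN_KINASE_ATP.
"""

xrefs = parse_cross_references(dr_lines)
print(xrefs)
```

Note that M32972, the EMBL entry for exon 22 shown earlier in this deck, appears among the EMBL cross-references: the protein entry points back to the nucleotide entries it was derived from.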

  22. FT   MUTAGEN     999    999       Y->F: HAS NO EFFECT ON INSULIN-STIMULATED
      FT                                AUTOPHOSPHORYLATION, BUT INHIBITS THE
      FT                                BIOLOGICAL ACTIVITY OF THE RECEPTOR.
      FT   ACT_SITE   1159   1159       BY SIMILARITY.
      FT   DISULFID    462    495
      FT   DISULFID    551    551       INTERCHAIN.
      FT   CARBOHYD     43     43       POTENTIAL.
      FT   CARBOHYD     52     52       POTENTIAL.
      FT   CARBOHYD    105    105       POTENTIAL.
      ..
      FT   CARBOHYD    782    782       POTENTIAL.
      FT   CARBOHYD    920    920       POTENTIAL.
      FT   CARBOHYD    933    933       POTENTIAL.
      FT   VARSPLIC    745    756       MISSING (IN SHORTER VARIANT).
      FT   VARIANT      42     42       N -> K (IN RABSON-MENDENHALL SYNDROME).
      FT   VARIANT      55     55       V -> A (IN LEPRECHAUNISM VERONA-1).
      FT   VARIANT      58     58       G -> R (IN LEPRECHAUNISM HELMOND;
      FT                                INHIBITS PROCESSING AND TRANSPORT).
      FT   VARIANT     113    113       R -> P (IN LEPRECHAUNISM ATLANTA-1).
      FT   VARIANT     220    220       P -> L (IN SEVERE INS-RES).
      FT   VARIANT     236    236       H -> R (IN LEPRECHAUNISM WINNIPEG).
      FT   VARIANT     260    260       L -> P (IN LEPRECHAUNISM GELDEIMALSEN).
      FT   VARIANT     393    393       G -> R (IN LEPRECHAUNISM VERONA-1).
      FT   VARIANT     409    409       F -> V (IN SEVERE INS-RES).
      FT   VARIANT     487    487       K -> E (IN LEPRECHAUNISM ARK-1 ALLELE 1).
      FT   VARIANT     489    489       N -> S (IN INS-RES ACANTHOSIS NIGRICANS
      FT                                WITH INS-RES DIABETES MELLITUS).
      FT   VARIANT     762    762       R -> S (IN INS-RES ACANTHOSIS NIGRICANS).
      FT   VARIANT    1012   1012       V -> M (IN NIDDM).
      FT   VARIANT    1020   1020       R -> Q (IN INS-RES ACANTHOSIS NIGRICANS
      FT                                WITH INS-RES DIABETES MELLITUS).
      FT   VARIANT    1035   1035       G -> V (IN INS-RES ACANTHOSIS NIGRICANS
      FT                                WITH INS-RES DIABETES MELLITUS).
      FT   VARIANT    1075   1075       A -> D (IN INS-RES, TYPE A).
      FT   VARIANT    1161   1161       A -> T (IN INS-RES).
      FT   VARIANT    1162   1162       A -> E (IN INS-RES ACANTHOSIS NIGRICANS;
      FT                                IMPAIRS PROTEOLYTIC PROCESSING).
      FT   VARIANT    1180   1180       M -> I (IN INS-RES).
      FT   VARIANT    1191   1191       R -> Q (IN NIDDM).
      FT   VARIANT    1205   1205       P -> L (IN INS-RES, MODERATE).
      FT   VARIANT    1220   1220       W -> L (IN INS-RES, TYPE A).
      FT   VARIANT    1227   1227       W -> S (IN INS-RES, TYPE A).
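Because the feature table keeps a fixed layout (key, start, end, description), VARIANT lines like those above are straightforward to extract. The sketch below assumes the column layout used in this transcript, where the feature key starts three spaces after `FT`. The names `VARIANT` and `parse_variants` are illustrative.

```python
import re

# key, start, end, "X -> Y", then a parenthesised note ending the feature.
VARIANT = re.compile(r"VARIANT\s+(\d+)\s+(\d+)\s+([A-Z]) -> ([A-Z]) \((.+)\)\.$")

def parse_variants(text):
    """Return (position, wild-type residue, variant residue, note) tuples."""
    features = []
    for line in text.splitlines():
        if re.match(r"FT {3}\S", line):            # a feature key starts this line
            features.append(line[2:].strip())
        elif line.startswith("FT") and features:   # continuation of the description
            features[-1] += " " + line[2:].strip()
    variants = []
    for feature in features:
        m = VARIANT.match(feature)
        if m:
            variants.append((int(m[1]), m[3], m[4], m[5]))
    return variants

# Three VARIANT features copied from the INSR_HUMAN entry.
ft_lines = """\
FT   VARIANT      42     42       N -> K (IN RABSON-MENDENHALL SYNDROME).
FT   VARIANT      58     58       G -> R (IN LEPRECHAUNISM HELMOND;
FT                                INHIBITS PROCESSING AND TRANSPORT).
FT   VARIANT    1012   1012       V -> M (IN NIDDM).
"""

variants = parse_variants(ft_lines)
for variant in variants:
    print(variant)
```

Continuation lines are merged into the preceding feature before matching, so a description wrapped over two lines (position 58 here) is recovered as a single note.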

  23. FT   CONFLICT    171    171       H -> Y (IN REF. 2).
      FT   CONFLICT    448    448       T -> I (IN REF. 2).
      FT   CONFLICT    492    492       K -> Q (IN REF. 2 AND 4).
      FT   CONFLICT    601    601       D -> N (IN REF. 5).
      FT   CONFLICT    830    830       P -> E (IN REF. 5).
      SQ   SEQUENCE  1382 AA;  156280 MW;  D681AC2E CRC32;

      INSR_HUMAN  Length: 1382  Check: 7046  ..
        1 MGTGGRRGAA AAPLLVAVAA LLLGAAGHLY PGEVCPGMDI RNNLTRLHEL ENCSVIEGHL
       61 QILLMFKTRP EDFRDLSFPK LIMITDYLLL FRVYGLESLK DLFPNLTVIR GSRLFFNYAL
      121 VIFEMVHLKE LGLYNLMNIT RGSVRIEKNN ELCYLATIDW SRILDSVEDN HIVLNKDDNE
      181 ECGDICPGTA KGKTNCPATV INGQFVERCW THSHCQKVCP TICKSHGCTA EGLCCHSECL
      241 GNCSQPDDPT KCVACRNFYL DGRCVETCPP PYYHFQDWRC VNFSFCQDLH HKCKNSRRQG
      301 CHQYVIHNNK CIPECPSGYT MNSSNLLCTP CLGPCPKVCH LLEGEKTIDS VTSAQELRGC
      361 TVINGSLIIN IRGGNNLAAE LEANLGLIEE ISGYLKIRRS YALVSLSFFR KLRLIRGETL
      ..

      Comments:
      • SwissProt is a non-redundant protein database. It must be combined with TrEMBL to cover the sequences that have not yet been expert-curated (together they form UniProt).
      • There is no taxonomy in the definition. An entry can list several species when the sequence is fully conserved (human, mouse ..):

          ID   ACTA_HUMAN [DOMO]     STANDARD;      PRT;   377 AA.
          DE   ACTIN, AORTIC SMOOTH MUSCLE (ALPHA-ACTIN 2).
          OS   Homo sapiens (Human), Mus musculus (Mouse), Rattus norvegicus (Rat),
          OS   Bos taurus (Bovine), and Oryctolagus cuniculus (Rabbit).
          OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia;
          OC   Eutheria; Primates; Catarrhini; Hominidae; Homo.

      • GN field for the gene
      • CC (comments) field, highly structured, with sub-sections
      • Many links to other databases
      • Features define all the elements of the protein sequence, together with the origin of the information: potential or experimental
      • VARIANT field / CONFLICT field
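Since VARIANT and CONFLICT features are positions on the sequence printed in the same entry, one simple sanity check is to confirm that the wild-type residue named in a feature really occurs at that position. A minimal sketch, assuming the numbered block layout shown above and using only the first 120 residues (the helper name `read_numbered_sequence` is hypothetical):

```python
def read_numbered_sequence(text):
    """Concatenate residues from lines of the form '<position> BLOCK BLOCK ...'."""
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        if tokens and tokens[0].isdigit():   # leading number = position of first residue
            residues.extend(tokens[1:])
    return "".join(residues)

# First two sequence lines of the INSR_HUMAN entry.
sequence_block = """\
  1 MGTGGRRGAA AAPLLVAVAA LLLGAAGHLY PGEVCPGMDI RNNLTRLHEL ENCSVIEGHL
 61 QILLMFKTRP EDFRDLSFPK LIMITDYLLL FRVYGLESLK DLFPNLTVIR GSRLFFNYAL
"""

# (position, wild-type residue, variant residue) from VARIANT lines of the entry.
variants = [(42, "N", "K"), (55, "V", "A"), (58, "G", "R"), (113, "R", "P")]

sequence = read_numbered_sequence(sequence_block)
# Feature positions are 1-based, Python string indices are 0-based.
checks = {pos: sequence[pos - 1] == ref for pos, ref, _ in variants}
print(checks)
```

All four positions carry the expected wild-type residue, which is the kind of internal consistency a curated entry should show. A mismatch would point to an off-by-one error or to a feature defined against a different sequence version.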

  24. UniProt: P06213 (INSR_HUMAN). Reviewed, UniProtKB/Swiss-Prot. Last modified September 18, 2013. Version 194

  25. Internet resources and genome data: http://pdessen.free.fr/M2BI/links.html
