Introduction to Bioinformatics

... Taking biology to a new dimension. Introduction to Bioinformatics. Benny Shomer, November 2005. Exponential Growth Rate. Over the last two decades, nucleic acid data has accumulated at the EMBL database at an exponential rate, currently totaling ~110 Gbases from 62M entries. ~200,000 Protein Entries.

Presentation Transcript


  1. ...Taking biology to a new dimension. Introduction to Bioinformatics. Benny Shomer, November 2005

  2. Exponential Growth Rate. Over the last two decades, nucleic acid data has accumulated at the EMBL database at an exponential rate, currently totaling ~110 Gbases from 62M entries.

  3. ~200,000 Protein Entries. Currently stored in the UniProt database, with 70M amino acids. (Image label: ERK2 MAP Kinase)

  4. The whole genomes of over 1500 viruses and 775 bacteria have been completely sequenced or are in progress… (Image labels: Salmonella sp., Bacteriophage T4, Haemophilus influenzae)

  5. …as well as some 400 eukaryotic genomes, of which 135 are of parasites, fungi and other lower forms. (Image labels: Trypanosoma brucei, Plasmodium falciparum, Leishmania major, Schizosaccharomyces pombe)

  6. More than 500 organelle genomes are in the databases. (Image labels: Mitochondrion 3D CAT, Mitochondria, Chloroplast)

  7. About 80 plants are being genome/EST sequenced or genetically mapped Arabidopsis thaliana

  8. There are currently ~170 genome projects of Metazoa

  9. 3.2 Gb ~30,000 genes.

  10. "Owning a store of knowledge is no small happiness" Socrates

  11. "One shouldn't blow things out of proportion, and one should keep one's head on one's hips" Alon Mizrahi

  12. Same Size Genome ~3Gb
        • About the same number of genes (30,000)
        • Same gene contents
        • 85-90% similarity between genes (up to 98% similarity with apes)

  13. Genes & Development Vol. 14, No. 20, pp. 2551-2569, October 15, 2000

  14. The Basis for Bioinformatics

  15. From Sequence to Biology. Human Zebrafish HoxB4 local alignment (position ruler marks run from 1460 to 1650 on AF3071 and from 180 to 360 on AF0712):

        AF3071 TGGGCAATTCCCAGAAATTAATGGCTATGAGTTCTTTTTTGATCAACTCA
               :: :::::::  ::::::::::::: :::::::: : ::::::::::::
        AF0712 TGTGCAATTCAAAGAAATTAATGGCCATGAGTTCCTATTTGATCAACTCC

        AF3071 AACTATGTCGACCCCAAGTTCCCTCCATGCGAGGAATATTCACAGAGCGA
               :::::::: ::::: ::::: :: :: :::::::::::::: ::::::::
        AF0712 AACTATGTGGACCCTAAGTTTCCACCCTGCGAGGAATATTCCCAGAGCGA

        AF3071 TTACCTACCCAGCGACCACTCGCCCGGGTACTACGCCGGCGGCCAGAGGC
                :::::::::::    ::::: :: :   :::::  : ::: ::::::::
        AF0712 CTACCTACCCAGT---CACTCTCCGG---ACTACTACAGCGCCCAGAGGC

        AF3071 GAGAGAGCAGCTTCCAGCCGGAGGCGGGCTTCGGGCGGCGCGCGGCGTGC
                :::   :   :::::::  ::: ::  :: :   : :::  :::  :::
        AF0712 AAGACCCCTCGTTCCAGCATGAGTCGATCTACCACCAGCGGTCGGGCTGC

      Local, Global, Multiple…
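The slide does not say which program produced the HoxB4 alignment. As a rough illustration of how a local alignment of this kind can be computed, here is a minimal sketch of the Smith-Waterman dynamic-programming algorithm. The function name and the scoring values (match +2, mismatch -1, gap -2) are assumptions chosen for illustration and do not come from the presentation.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for one best local alignment.

    Scoring values are illustrative assumptions, not taken from the slides.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Short fragments copied from the third block of the slide's alignment.
    score, top, bottom = smith_waterman("CTACCTACCCAGCGACCACTCGCC",
                                        "CTACCTACCCAGTCACTCTCC")
    print(score)
    print(top)
    print(bottom)
```

Real alignment programs use tuned substitution matrices and affine gap penalties, so their output on these sequences may differ from what this sketch prints.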

  16. Elongation Factor 1 alpha

  17. >gi|28558768|sp|P53601|A4_MACFA Amyloid beta A4 protein precursor (APP) (ABPP) (Alzheimer's disease amyloid protein homolog) [Contains: Soluble APP-alpha (S-APP-alpha); Soluble APP-beta (S-APP-beta); C99; Beta-amyloid protein 42 (Beta-APP42); Beta-amyloid protein 40 (Beta-APP40); C83; P3(42); P3(40); Gamma-CTF(59) (Gamma-secretase C-terminal fragment 59); Gamma-CTF(57) (Gamma-secretase C-terminal fragment 57); Gamma-CTF(50) (Gamma-secretase C-terminal fragment 50); C31]

        Length = 770

        Score = 1277 bits (3305), Expect = 0.0
        Identities = 642/752 (85%), Positives = 643/752 (85%)

        Query: 19  EVPTDGNAGLLAEPQIAMFCGRLNMHMNVQNGKWDSDPSGTKTCIDTKEGILQYCQEVYP 78
                   EVPTDGNAGLLAEPQIAMFCGRLNMHMNVQNGKWDSDPSGTKTCIDTKEGILQYCQEVYP
        Sbjct: 19  EVPTDGNAGLLAEPQIAMFCGRLNMHMNVQNGKWDSDPSGTKTCIDTKEGILQYCQEVYP 78

        Query: 79  ELQITNVVEANQPVTIQNWCKRGRKQCKTHPHFVIPYRCLVGEFVSDALLVPDKCKFLHQ 138
                   ELQITNVVEANQPVTIQNWCKRGRKQCKTHPHFVIPYRCLVGEFVSDALLVPDKCKFLHQ
        Sbjct: 79  ELQITNVVEANQPVTIQNWCKRGRKQCKTHPHFVIPYRCLVGEFVSDALLVPDKCKFLHQ 138

        Query: 139 ERMDVCETHLHWHTVAKETCSEKSTNLHDYGMLLPCGIDKFRGVEFVCCPLXXXXXXXXX 198
                   ERMDVCETHLHWHTVAKETCSEKSTNLHDYGMLLPCGIDKFRGVEFVCCPL
        Sbjct: 139 ERMDVCETHLHWHTVAKETCSEKSTNLHDYGMLLPCGIDKFRGVEFVCCPLAEESDNVDS 198

        Query: 199 XXXXXXXXXXWWGGADTDYADGSXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 258
                             WWGGADTDYADGS
        Sbjct: 199 ADAEEDDSDVWWGGADTDYADGSEDKVVEVAEEEEVAEVEEEEADDDEDDEDGDEVEEEA 258

        Query: 259 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXCSEQAETGPCRAMISRWYFDVTEGKCAP 318
                                           CSEQAETGPCRAMISRWYFDVTEGKCAP
        Sbjct: 259 EEPYEEATERTTSIATTTTTTTESVEEVVREVCSEQAETGPCRAMISRWYFDVTEGKCAP 318
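The "Identities = 642/752 (85%)" line on slide 17 is a count of identical aligned positions over the alignment length. The sketch below shows one way such a figure can be derived from a pair of aligned strings. The helper names `count_identities` and `percent` are hypothetical, and truncating the percentage (rather than rounding) is an assumption, though it is consistent with the slide, where 643/752 is also shown as 85%.

```python
def count_identities(aligned_query, aligned_subject):
    """Return (identical_positions, alignment_length) for two aligned strings.

    Gap characters ('-') never count as identical. Masked residues ('X')
    only count if both strings carry 'X' at that position.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned strings must have equal length")
    identical = sum(
        1 for q, s in zip(aligned_query, aligned_subject) if q == s and q != "-"
    )
    return identical, len(aligned_query)


def percent(identical, length):
    """Integer percentage, truncated (an assumption about the report format)."""
    return 100 * identical // length


if __name__ == "__main__":
    # Third alignment block from the slide; the query ends in 9 masked residues.
    query = "ERMDVCETHLHWHTVAKETCSEKSTNLHDYGMLLPCGIDKFRGVEFVCCPL" + "X" * 9
    sbjct = "ERMDVCETHLHWHTVAKETCSEKSTNLHDYGMLLPCGIDKFRGVEFVCCPLAEESDNVDS"
    identical, length = count_identities(query, sbjct)
    print(identical, length, percent(identical, length))
    # Totals reported on the slide for the whole alignment.
    print(percent(642, 752), percent(643, 752))
```

The runs of X in the query are residues masked by a low-complexity filter, which is why they contribute nothing to the identity count even where the subject sequence is shown in full.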

  18. SNP

  19. RNA secondary structure prediction

  20. TF Analysis

  21. Gene Analysis

  22. Genome Level Annotation

  23. Genome Level Annotation Chromosome Oriented Focus Position Chromosome Slider Focus Area Overview

  24. Genome Level Annotation Focus Area Detailed View

  25. Genome Level Annotation Focus Area Basepair View

  26. Genome Level Annotation Gene Oriented

  27. Protein properties EX33 inflammation related GPCR analysis

  28. Protein properties EX33 inflammation related GPCR analysis

  29. Secondary Structure Prediction (Garnier)

        . 10 . 20 . 30 . 40 . 50
        MWNSSDANFSCYHESVLGYRYVAVSWGVVVAVTGTVGNVLTLLALAIQPK
        helix HHHHHHH
        sheet E E EEEEEEEEEEEEEEE EEEE E
        turns TT TTTTT TTTT TTT T
        coil  CC CCCC

        . 60 . 70 . 80 . 90 . 100
        LRTRFNLLIANLTLADLLYCTLLQPFSVDTYLHLHWRTGATFCRVFGLLL
        helix HHHHHHHH H
        sheet EE EEEEEE EEEEEEEE E EEEE EEEEEEEE
        turns T TT T TTTTTTT
        coil  C

        . 110 . 120 . 130 . 140 . 150
        FASNSVSILTLCLIALGRYLLIAHPKLFPQVFSAKGIVLALVSTWVVGVA
        helix HH HHHHH HHHHHH
        sheet EEEEEEE EEEE EEE EEEEEEEEEEEEEE
        turns T TT T
        coil  C C CC C

        . 160 . 170 . 180 . 190 . 200
        SFAPLWPIYILVPVVCTCSFDRIRGRPYTTILMGIYFVLGLSSVGIFYCL
        helix
        sheet EEEEEEEEEEEE EEEEEEEEEEEE EEEEEE
        turns T TTTTTTT T TT
        coil  CCCC C CC CC

      Plotstructure, PredictProtein

  30. 3D Structure analysis

  31. Pattern and Motif Analysis

        ID   GATA_ZN_FINGER_1; PATTERN.
        AC   PS00344;
        DT   NOV-1990 (CREATED); NOV-1997 (DATA UPDATE); JUL-1998 (INFO UPDATE).
        DE   GATA-type zinc finger domain.
        PA   C-x-[DN]-C-x(4,5)-[ST]-x(2)-W-[HR]-[RK]-x(3)-[GN]-x(3,4)-C-N-[AS]-C.
        NR   /RELEASE=41.18,131945;
        NR   /TOTAL=99(61); /POSITIVE=99(61); /UNKNOWN=0(0); /FALSE_POS=0(0);
        NR   /FALSE_NEG=14; /PARTIAL=0;
        CC   /TAXO-RANGE=??E??; /MAX-REPEAT=2;
        CC   /SITE=1,zinc; /SITE=4,zinc; /SITE=15,zinc; /SITE=18,zinc;
        DR   O13412, AREA_ASPNG, T; O13415, AREA_ASPOR, T; P17429, AREA_EMENI, T;

      Protein Families
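The PA line of the PROSITE entry on slide 31 can be read almost directly as a regular expression. The sketch below converts the subset of PROSITE pattern syntax used in that entry (literal residues, `x` wildcards, `[...]` alternatives, `{...}` exclusions, and `(n)` or `(n,m)` repeats) into a Python regex. The function name `prosite_to_regex` is hypothetical, anchors (`<`, `>`) are not handled, and the test sequence is synthetic, built by hand to satisfy the pattern rather than taken from a real protein.

```python
import re


def prosite_to_regex(pattern):
    """Convert a simple PROSITE pattern string into a Python regex string.

    Handles only: residues, x, [ABC], {ABC}, and (n) / (n,m) repeats.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.groups()
        if core == "x":
            core = "."                      # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"  # any residue except these
        # literal residues and [ABC] classes are already valid regex
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


if __name__ == "__main__":
    gata = "C-x-[DN]-C-x(4,5)-[ST]-x(2)-W-[HR]-[RK]-x(3)-[GN]-x(3,4)-C-N-[AS]-C."
    regex = prosite_to_regex(gata)
    print(regex)
    # Synthetic sequence constructed to fit the pattern (not a real protein).
    synthetic = "MKCADCAAAASAAWHRAAAGAAACNACLL"
    hit = re.search(regex, synthetic)
    print(hit.start() + 1, hit.group())  # 1-based start position and matched motif
```

A regex search of this kind reports only whether and where the motif occurs. The NR line of the entry (false positives, false negatives) is a reminder that a pattern hit is evidence for family membership, not proof.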

  32. Pathway Analysis

  33. Pathway Analysis

  34. Protein-Protein interaction Data Sources: Yeast Two Hybrid system Triclosan - FabI

  35. Protein-Protein interaction Data Sources: Surface Plasmon Resonance Triclosan - FabI

  36. Protein-Protein interaction Data Sources: Natural Language Processing

  37. DNA Microarray & Expression Analysis

  38. Cloning, Restriction & Mapping

  39. PCR Design

  40. Linguistics & Information systems
