Primig lab michael.primig@unibas.ch bioz.unibas.ch/primig Thomas Aust Roopa Basavaraj (visiting scientist) Michel Bellis - PowerPoint PPT Presentation

Presentation Transcript

  1. Bioinformatics I -- Databases Primig lab michael.primig@unibas.ch http://www.bioz.unibas.ch/primig Thomas Aust, Roopa Basavaraj (visiting scientist), Michel Bellis (visiting scientist), Guenda Berthold, Philippe Demougin, Leandro Hermida, Reinhold Koch, Ulrich Schlecht, Christa Wiederkehr, Roland Zuest

  2. Bioinformatics I -- Databases Microarray Data Primig lab michael.primig@unibas.ch http://www.bioz.unibas.ch/primig Thomas Aust, Roopa Basavaraj (visiting scientist), Michel Bellis (visiting scientist), Guenda Berthold, Philippe Demougin, Leandro Hermida, Reinhold Koch, Ulrich Schlecht, Christa Wiederkehr, Roland Zuest

  3. Bioinformatics I -- Databases SWISS-MODEL Protein Database Schwede lab Torsten.schwede@unibas.ch http://www.bioz.unibas.ch/schwede Jozef Aerts, Juergen Kopp, Flavio Monigatti, Franziska Roeder, Rainer Poehlmann

  4. Bioinformatics I -- Databases • What is a database? • How do you make one? • Biological Databases • Knowledgebases • Novel ideas… More info at http://www.biozentrum.unibas.ch/personal/primig/ Follow the >>>teaching<<< link.

  5. Bioinformatics I -- Databases What is a database? A database is a structured collection of data. Data INPUT >>> Information OUTPUT

  6. Bioinformatics I -- Databases What is a relational database? A relational database is a set of tables containing data belonging to defined categories Data INPUT >>> Information OUTPUT

  7. Bioinformatics I -- Databases How do you make one? A relational database management system (RDBMS) lets you construct, update, and administer a relational database. An RDBMS takes Structured Query Language (SQL) statements entered by a user and creates, updates, or provides access to the database.
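
The create/update/access cycle described on this slide can be sketched in a few lines. This is only an illustration: SQLite (bundled with Python) stands in for a server RDBMS such as MySQL or PostgreSQL, and the `gene` table and its columns are assumptions that do not come from the slides.

```python
import sqlite3

# SQLite stands in for a server RDBMS; the table and column names are illustrative.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create: define a table with typed columns.
cur.execute(
    "CREATE TABLE gene ("
    "gene_id INTEGER PRIMARY KEY, name TEXT NOT NULL, organism TEXT NOT NULL)"
)

# Insert: add rows using parameter placeholders.
cur.executemany(
    "INSERT INTO gene (name, organism) VALUES (?, ?)",
    [("SPO11", "S. cerevisiae"), ("DMC1", "S. cerevisiae")],
)

# Update: change an existing row.
cur.execute(
    "UPDATE gene SET organism = ? WHERE name = ?",
    ("Saccharomyces cerevisiae", "SPO11"),
)

# Access: query the data back out.
rows = cur.execute("SELECT name, organism FROM gene ORDER BY name").fetchall()
print(rows)
conn.close()
```

The same SQL statements would be sent to MySQL or PostgreSQL through their own client libraries; only the connection step differs.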

  8. Bioinformatics I -- Databases RDBMS Open Source: MySQL | PostgreSQL Commercial: IBM DB2 | Oracle

  9. Bioinformatics I -- Databases Accessing relational databases You also need a Graphical User Interface (GUI). PHP (recursive acronym for "PHP: Hypertext Preprocessor") is a widely used open-source general-purpose scripting language that is especially suited for Web development and can be embedded into HTML. Perl is derived mostly from the C programming language. Perl's process, file, and text manipulation facilities make it particularly well suited for tasks such as database access, graphical programming, and World Wide Web programming.

  10. Bioinformatics I -- Databases How do you make one? • Database Model: • Analyse aims (submission/curation system) • Define entities = tables (user, submission) • Define attributes (name, phone, email) • Define relationships between entities (user makes submission) • Draw diagram

  11. Bioinformatics I -- Databases [Workflow diagram of the submission/curation system, by Christa Wiederkehr. Recoverable labels include: New, Assign Submission, Curate Submission, Assign Revision, Curate Revision, Revise, Delete, Accepted, Rejected, Deleted, Publication; actors: Author, Curator, GeO.]

  12. Bioinformatics I -- Databases How do you make one? • Database Model: • Analyse aims (submission/curation system) • Define entities = tables (user, submission) • Define attributes (name, phone, email) • Define relationships between entities (user makes submission)

  13. Bioinformatics I -- Databases Database schema diagram (Christa Wiederkehr). The attribute prefixes #, * and ° are reproduced from the slide.
      Orf: #orf_id, #nomenclature_id, #orf_name
      Submitstate: #submitstate_id, *submitstate
      Term: #term_id, *name, *term_type
      Termassign: #go_acc, *submission_id, *ontology
      User: #user_id, *name, *email, *login, *password, *lab_id, *user_role_id
      Submission: #submission_id, *title, *description, °submitstate_id, °orf_id, *user_id, °curator_id
      Reference: #reference_id, *title, *authors, *journal, *pubmed, °url_pdf, *submission_id
      User_role: #user_role_id, *user_role
      Comment: #comment_id, *text, *submission_id, *user_role_id
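
As a rough sketch of how part of this schema could be declared, the example below builds three of the tables (User_role, User, Submission) in SQLite from Python. Several things here are assumptions rather than facts from the slide: the column types, the reading of # as primary key, * as mandatory and ° as optional, the omission of some columns for brevity, and the example rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request

# Simplified subset of the schema; types and constraints are assumptions.
conn.executescript("""
CREATE TABLE user_role (
    user_role_id INTEGER PRIMARY KEY,
    user_role    TEXT NOT NULL
);
CREATE TABLE user (
    user_id      INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    email        TEXT NOT NULL,
    user_role_id INTEGER NOT NULL REFERENCES user_role(user_role_id)
);
CREATE TABLE submission (
    submission_id INTEGER PRIMARY KEY,
    title         TEXT NOT NULL,
    description   TEXT NOT NULL,
    user_id       INTEGER NOT NULL REFERENCES user(user_id),
    curator_id    INTEGER REFERENCES user(user_id)  -- optional: may be NULL
);
""")

# Hypothetical example rows.
conn.execute("INSERT INTO user_role VALUES (1, 'author'), (2, 'curator')")
conn.execute("INSERT INTO user VALUES (1, 'Example Author', 'author@example.org', 1)")
conn.execute(
    "INSERT INTO submission (submission_id, title, description, user_id) "
    "VALUES (1, 'Example submission', 'Hypothetical example row', 1)"
)

# The relationship "user makes submission" becomes a join on user_id.
rows = conn.execute(
    "SELECT s.title, u.name, r.user_role "
    "FROM submission AS s "
    "JOIN user AS u ON u.user_id = s.user_id "
    "JOIN user_role AS r ON r.user_role_id = u.user_role_id"
).fetchall()
print(rows)

# A submission pointing at a non-existent user is rejected.
try:
    conn.execute(
        "INSERT INTO submission (title, description, user_id) VALUES ('x', 'y', 99)"
    )
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True
print(fk_enforced)
```

Declaring the relationships as foreign keys lets the database itself refuse inconsistent rows, instead of leaving that check to the submission interface.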

  14. Bioinformatics I -- Databases Biological Databases: DNA DNA Sequence Data • EBI: http://www.ebi.ac.uk/ • NCBI: http://www.ncbi.nlm.nih.gov/ • DDBJ: http://www.ddbj.nig.ac.jp/

  15. Bioinformatics I -- Databases Global data synchronization

  16. Bioinformatics I -- Databases EBI – EMBL Release 72 contains 18,324,246 sequence entries comprising 23,090,186,146 nucleotides. [Chart labels: Mouse, Rat, Human]

  17. Bioinformatics I -- Databases Biological Databases: DNA DNA Sequence Data Submission at http://www3.ebi.ac.uk/Services/webin/Sbm.cgi

  18. Bioinformatics I -- Databases Biological Databases: proteins Protein Structure Data Protein Databank (PDB) at http://www.rcsb.org/pdb/ Search 17,107 Peptide, Protein and Virus Structures

  19. Bioinformatics I -- Databases Biological Databases: proteins Protein Structure Data Submission at http://deposit.pdb.org/adit/

  20. Bioinformatics I -- Databases Biological Databases: compounds Small Molecules Klotho DB: Biochemical Compounds Declarative Database at http://www.biocheminfo.org/klotho/ LIGAND DB at http://www.genome.ad.jp/kegg/catalog/compounds.html

  21. Bioinformatics I -- Databases Biological Databases: RNA • Expression data - RNA • Microarray data repositories • Gene Expression Omnibus (GEO, NCBI) at http://www.ncbi.nlm.nih.gov/geo/ • ArrayExpress (EBI) at http://www.ebi.ac.uk/arrayexpress/ • MIAME: Minimum Information About a Microarray Experiment

  22. Bioinformatics I -- Databases

  23. Bioinformatics I -- Databases Biological Databases: RNA • Expression data - RNA • Expression data visualization • Stanford Expression Connection at http://genome-www4.Stanford.EDU/cgi-bin/SGD/expression/expressionConnection.pl • GermOnline at http://germonline.org • RIKEN mouse at http://read.gsc.riken.go.jp/

  24. Bioinformatics I -- Databases

  25. Bioinformatics I -- Databases Biological Databases: RNA • Expression data - RNA • Yeast Cell Cycle at http://genome-www.stanford.edu/cellcycle • Human Cell Cycle at http://genome-www.stanford.edu/Human-CellCyle/Hela • Human & Mouse tissue profiling at http://expression.gnf.org

  26. Bioinformatics I -- Databases

  27. Bioinformatics I -- Databases Biological Databases: proteins • Post-translational data: protein-protein interaction in Yeast • Biochemical studies • Cellzome • BIND • MDS Proteomics • Two-hybrid studies • Curagen’s PathCalling

  28. Bioinformatics I -- Databases Biological Databases: proteins • Post-translational data: protein-protein interaction in Yeast • Biochemical studies • Cellzome at http://yeast.cellzome.com • BIND at http://bind.mshri.on.ca • MDS Proteomics at http://www.mdsp.com • Two-hybrid studies • Curagen’s PathCalling at http://portal.curagen.com Access the data through http://germonline.bioz.unibas.ch and click on S. cerevisiae. Search for any gene, e.g. SPO11 and go to the Protein/Proteome Information section of the Locus Report page.

  29. Bioinformatics I -- Databases

  30. Bioinformatics I -- Databases Biological Databases: literature PubMed contains the abstracts of peer-reviewed publications in the field of biomedical research: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi Scientific journals are often available online (sometimes even for free)! http://www.ub.unibas.ch/vlib/vbbiol.htm

  31. Bioinformatics I -- Databases Knowledgebases: a common language The Gene Ontology project: http://www.geneontology.org The objective of GO is to provide controlled vocabularies for the description of gene products. These terms are to be used as attributes of gene products by collaborating databases, facilitating uniform queries across them. The three organizing principles of GO are molecular function, biological process and cellular component. A gene product has one or more molecular functions and is used in one or more biological processes; it may be, or may be associated with, one or more cellular components.

  32. Bioinformatics I -- Databases Knowledgebases: a common language • The Gene Ontology Evidence Codes: http://www.geneontology.org/doc/GO.evidence.html • IC inferred by curator (no evidence but reasonable) • IDA inferred from direct assay (enzyme, EMSA) • IEA inferred from electronic annotation (BLAST hit) • IEP inferred from expression pattern (RNA, protein) • IGI inferred from genetic interaction (suppressors, synthetic lethals, complementation) • IMP inferred from mutant phenotype (deletion, insertion) • IPI inferred from physical interaction (co-IP, 2-hybrid) • ISS inferred from sequence or structural similarity (homolog) • NAS non-traceable author statement (quote cannot be found) • ND no biological data available • TAS traceable author statement • NR not recorded
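
One way to use these codes in practice is as a lookup table for filtering annotations by the kind of evidence behind them. The sketch below is illustrative only: the annotation tuples are made-up placeholders, and the choice of which codes count as "experimental" (IDA, IEP, IGI, IMP, IPI) is a grouping made for this example.

```python
# Evidence codes as listed on the slide.
EVIDENCE_CODES = {
    "IC": "inferred by curator",
    "IDA": "inferred from direct assay",
    "IEA": "inferred from electronic annotation",
    "IEP": "inferred from expression pattern",
    "IGI": "inferred from genetic interaction",
    "IMP": "inferred from mutant phenotype",
    "IPI": "inferred from physical interaction",
    "ISS": "inferred from sequence or structural similarity",
    "NAS": "non-traceable author statement",
    "ND": "no biological data available",
    "TAS": "traceable author statement",
    "NR": "not recorded",
}

# Assumption for this example: codes that rest on a wet-lab experiment.
EXPERIMENTAL = {"IDA", "IEP", "IGI", "IMP", "IPI"}

# Hypothetical annotations: (gene product, GO term, evidence code).
annotations = [
    ("geneA", "example biological process", "IMP"),
    ("geneA", "example cellular component", "IEA"),
    ("geneB", "example molecular function", "TAS"),
    ("geneC", "example biological process", "IDA"),
]


def experimentally_supported(annots):
    """Keep only annotations whose evidence code is in EXPERIMENTAL."""
    return [a for a in annots if a[2] in EXPERIMENTAL]


for gene, term, code in experimentally_supported(annotations):
    print(gene, term, code, "-", EVIDENCE_CODES[code])
```

Filtering on the evidence code is what makes it possible to separate curated, experimentally backed statements from purely electronic ones such as IEA.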

  33. Bioinformatics I -- Databases Biological Databases: GO based species specific db’s • Annotation: covers knowledge from Genetics, Molecular Biology and Functional Genomics • SGD for S. cerevisiae • http://genome-www.stanford.edu/Saccharomyces/ • TAIR for A. thaliana • http://www.arabidopsis.org/ • WormBase for C. elegans • http://www.wormbase.org • FlyBase for D. melanogaster • http://flybase.bio.indiana.edu/ • Mouse Genome Database for M. musculus • http://www.informatics.jax.org

  34. Bioinformatics I -- Databases Knowledgebases: Swissprot >>> Uniprot Release 40.31 of 25-Oct-2002 of SWISS-PROT contains 116776 sequence entries, comprising 42881496 amino acids abstracted from 100002 references.
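
The release figures above allow a quick back-of-the-envelope calculation of the average entry size. This is a small sketch using only the three numbers quoted on the slide; the variable names are illustrative.

```python
# Figures quoted for SWISS-PROT release 40.31 (25-Oct-2002).
entries = 116_776
amino_acids = 42_881_496
references = 100_002

# Average protein length per entry, and references per entry.
avg_length = amino_acids / entries
refs_per_entry = references / entries

print(f"average length: {avg_length:.1f} amino acids per entry")
print(f"references per entry: {refs_per_entry:.2f}")
```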

  35. Bioinformatics I -- Databases Knowledgebases: Swissprot >>> Uniprot • KEY FEATURES • Minimal redundancy: data from different sources are merged; if conflicts exist between various sequencing reports, they are indicated in the feature table of the corresponding entry. • Annotation: • Function(s) of the protein • Post-translational modification(s). For example carbohydrates, phosphorylation, acetylation, GPI-anchor, etc. • Domains and sites. For example calcium binding regions, ATP-binding sites, zinc fingers, homeobox, kringle, etc. • Secondary structure • Quaternary structure. For example homodimer, heterotrimer, etc. • Similarities to other proteins • Disease(s) associated with deficiency(ies) in the protein • Sequence conflicts, variants, etc. • Integration • Swissprot currently links to about 60 external databases (list at http://www.expasy.org/cgi-bin/lists?dbxref.txt)

  36. Bioinformatics I -- Databases Knowledgebases: Swissprot >>> Uniprot In SWISS-PROT, information is given in the comment lines (CC), in the feature table (FT) and in the keyword lines (KW). Most comments are classified by `topics'; this approach permits the easy retrieval of specific categories of data from the database.
      ID   SP11_YEAST STANDARD; PRT; 398 AA.
      AC   P23179;
      CC   -!- FUNCTION: REQUIRED FOR MEIOTIC RECOMBINATION. MEDIATES DNA
      CC       CLEAVAGE THAT FORMS THE DOUBLE-STRAND BREAKS (DSB) THAT INITIATE
      CC       MEIOTIC RECOMBINATION.
      CC   -!- SUBCELLULAR LOCATION: Nuclear.
      CC   -!- DEVELOPMENTAL STAGE: MEIOSIS-SPECIFIC.
      CC   -!- SIMILARITY: BELONGS TO THE TOP6A FAMILY.
      FT   ACT_SITE 135 135 DNA CLEAVAGE (PROBABLE).
      FT   MUTAGEN 135 135 Y->F: LOSS OF ACTIVITY.
      KW   Hydrolase; DNA-binding; Sporulation; Meiosis; Nuclear protein.
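
Because every line starts with a two-letter code and comments are grouped by topic, retrieving a specific category of data can be sketched with a few lines of Python. This is a minimal illustration, not a full SWISS-PROT parser: the entry text is the excerpt from the slide with its line breaks restored by assumption, and `parse_entry` and `comment_topics` are hypothetical helper names.

```python
from collections import defaultdict

# Excerpt from the slide; the line layout is reconstructed and may differ from the real file.
ENTRY = """\
ID   SP11_YEAST STANDARD; PRT; 398 AA.
AC   P23179;
CC   -!- FUNCTION: REQUIRED FOR MEIOTIC RECOMBINATION. MEDIATES DNA
CC       CLEAVAGE THAT FORMS THE DOUBLE-STRAND BREAKS (DSB) THAT INITIATE
CC       MEIOTIC RECOMBINATION.
CC   -!- SUBCELLULAR LOCATION: Nuclear.
CC   -!- DEVELOPMENTAL STAGE: MEIOSIS-SPECIFIC.
CC   -!- SIMILARITY: BELONGS TO THE TOP6A FAMILY.
FT   ACT_SITE 135 135 DNA CLEAVAGE (PROBABLE).
FT   MUTAGEN 135 135 Y->F: LOSS OF ACTIVITY.
KW   Hydrolase; DNA-binding; Sporulation; Meiosis; Nuclear protein.
"""


def parse_entry(text):
    """Group the lines of one entry by their two-letter line code."""
    fields = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        code, _, rest = line.partition(" ")
        fields[code].append(rest.strip())
    return fields


def comment_topics(cc_lines):
    """Split CC lines into {topic: text}; a new topic starts with '-!-'."""
    topics = {}
    current = None
    for line in cc_lines:
        if line.startswith("-!-"):
            topic, _, body = line[3:].partition(":")
            current = topic.strip()
            topics[current] = body.strip()
        elif current is not None:
            topics[current] += " " + line  # continuation of the previous topic
    return topics


fields = parse_entry(ENTRY)
topics = comment_topics(fields["CC"])
keywords = fields["KW"][0].rstrip(".").split("; ")

print(topics["SUBCELLULAR LOCATION"])
print(keywords)
```

For real work, an established parser (for example the SwissProt module in Biopython) is a better choice than hand-rolled string handling; the sketch only shows why the topic-based layout makes retrieval easy.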

  37. Bioinformatics I -- Databases Novel ideas… A database that contains large-scale automatic structure predictions: the SWISS-MODEL repository. Models from the SWISS-MODEL server and non-curated external sources will be available.

  38. Bioinformatics I -- Databases Novel ideas… The SWISS-MODEL server at http://www.expasy.org/swissmod/ is an automated modelling system that serves all scientists as a tool to study the putative 3D structure of a protein using Comparative Modelling.

  39. Bioinformatics I -- Databases Novel ideas… The GermOnline server at http://germonline.bioz.unibas.ch (http://germonline.org) is a platform for online submission/curation that enables scientists who work in the field of meiosis and gametogenesis to create, update and curate a knowledgebase that uses a controlled vocabulary (GO) and free text to describe the roles of genes in sexual reproduction.

  40. Bioinformatics I -- Databases Major DB info • EBI: http://www.ebi.ac.uk/Databases • Nucleic Acids Res. 2002: http://nar.oupjournals.org/content/vol30/issue1/ • GermOnline: http://germonline.unibas.ch • Primig lab: http://www.bioz.unibas.ch/personal/primig/ (follow the teaching link, check out literature & info, download ppt presentation db’s) • Life Sciences Training Facility: http://www.bioz.unibas.ch/corelab (you will find more links on bioinformatics)

  41. Bioinformatics I -- Projects We would like to collaborate with you on our ongoing GermOnline project. You will be asked to use online sources (species-specific and general knowledgebases, PubMed) to collect information about the genomes of S. pombe, A. thaliana, C. elegans, D. melanogaster, M. musculus and H. sapiens. This information should be presented in a concise paragraph like the one written by Peter Philippsen for the genome of S. cerevisiae (click on S. cerevisiae and follow the more link in the Genome Information section). You should include two complete references. Furthermore, we ask you to search for knowledge about a list of conserved genes important for meiosis and gametogenesis. You are asked to identify the homologs and orthologs and provide curated information about the yeast genes DMC1, MLH3, MRE11, MSH4, MSH5 and SPO11. Your search should include literature, knowledgebases and protein structures. More info at http://www.biozentrum.unibas.ch/personal/primig/teaching/bioinfo_I_literature.html The information you provide will be integrated into GermOnline by Ulrich Schlecht. You will be credited for your contribution. The results you produce will be recorded and (if everything works out) they count for the exam. We look forward to getting your feedback.