
Bioinformatics 2.0/3.0

Bioinformatics 2.0/3.0. Kei Cheung Yale Center for Medical Informatics. Outline. Introduction Web 2.0 Web 3.0 Semantic Web Topic Map Merging Web 2.0 and Web 3.0. Introduction.


Presentation Transcript


  1. Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics

  2. Outline • Introduction • Web 2.0 • Web 3.0 • Semantic Web • Topic Map • Merging Web 2.0 and Web 3.0

  3. Introduction • The Human Genome Project (HGP) has transformed genome sciences from being experimental to being increasingly computational • HGP has intensified the growth of bioinformatics • The Web has become a popular medium for accessing information over the Internet • Numerous bioinformatics databases and tools are Web accessible • These databases and tools as well as the Web have become indispensable for modern-day genomic research • Web 1.0 -> Web 2.0 -> Web 3.0

  4. Web 1.0 • It is read-only • It is about a single person, organization, … • It is document centric • It is based on HTML • It is for humans to read

  5. Web 2.0

  6. Web 2.0 • Social networking (wiki, blog, tagging, bookmarking, rating, etc.) • Multimedia content (photo, audio, video, etc.) • Interactive, responsive, and dynamic web interface (Facebook, Flickr, YouTube, etc.) • Mashup (assembly tools and visualization tools)

  7. Folksonomy (Social Tagging) • Folksonomy is the practice and method of collaboratively creating and managing tags to annotate and categorize content • In contrast to traditional subject indexing, metadata is generated not only by experts but also by creators and consumers of the content • Freely chosen keywords are used instead of a controlled vocabulary

  8. Tag Cloud • A tag cloud (or weighted list in visual design) is a visual depiction of user-generated tags used typically to describe the content of web sites.
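
A tag cloud's weighting can be illustrated with a minimal sketch. It assumes a simple linear scaling from tag frequency to font size; the function name, the size limits, and the sample tags are all hypothetical and not taken from the slides.

```python
from collections import Counter


def tag_cloud_sizes(tags, min_size=10, max_size=32):
    """Map each tag to a font size scaled linearly by how often it was used."""
    counts = Counter(tags)
    lo, hi = min(counts.values()), max(counts.values())
    sizes = {}
    for tag, n in counts.items():
        if hi == lo:
            # Every tag is equally frequent, so there is nothing to scale.
            sizes[tag] = min_size
        else:
            sizes[tag] = round(min_size + (n - lo) * (max_size - min_size) / (hi - lo))
    return sizes


# Hypothetical user-generated tags attached to a set of bookmarks.
tags = ["gene", "protein", "gene", "pathway", "gene", "protein"]
sizes = tag_cloud_sizes(tags)
print(sizes)
```

Real tag clouds often use logarithmic rather than linear scaling so that a few very popular tags do not dwarf the rest.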

  9. Web 2.0 (cont’d) • It is decentralized • It is a community/collaborator model instead of an authority/consumer model • It is fun • It can be put to serious use for sharing and integrating scientific datasets and algorithms

  10. Bioinformatics Applications of Web 2.0

  11. Wiki Proteins

  12. Nature Precedings (pre-publication research and preliminary findings)

  13. Scientific Podcasts

  14. Multimedia (cont’d)

  15. Journal of Visualized Experiments

  16. myExperiment

  17. Mashup (1): Assembly Tools • Dapper (scrape web content and convert it into machine readable format) • Yahoo! Pipes (fetch, filter, and integrate data)
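
The fetch, filter, and integrate steps of an assembly-style mashup can be sketched as a chain of generators. This is not how Yahoo! Pipes or Dapper are implemented; the feeds are in-memory lists standing in for remote sources, and every name below is hypothetical.

```python
def fetch(feed):
    """Stand-in for retrieving a remote feed: yields items from an in-memory list."""
    yield from feed


def keep(items, predicate):
    """Filter step: pass through only the items that satisfy the predicate."""
    return (item for item in items if predicate(item))


def integrate(*streams):
    """Integration step: merge several streams, dropping items whose id was already seen."""
    seen = set()
    for stream in streams:
        for item in stream:
            if item["id"] not in seen:
                seen.add(item["id"])
                yield item


# Hypothetical feeds; a real pipe would pull these from web sources.
feed_a = [
    {"id": "a1", "title": "H5N1 outbreak report"},
    {"id": "a2", "title": "Conference notice"},
]
feed_b = [
    {"id": "b1", "title": "H5N1 sequence released"},
    {"id": "a1", "title": "H5N1 outbreak report"},
]


def mentions_h5n1(item):
    return "H5N1" in item["title"]


merged = list(
    integrate(
        keep(fetch(feed_a), mentions_h5n1),
        keep(fetch(feed_b), mentions_h5n1),
    )
)
print([item["id"] for item in merged])
```

The merge is purely syntactic: it relies on the two feeds happening to share an id scheme.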

  18. Yahoo! Pipes Demo

  19. Yahoo! Pipes Use Case

  20. GeoCommons: Mashup of Maps

  21. Mashup (2): Visualization Tools • E.g., Google Earth

  22. Geo-Mashup: Google Earth (tracking H5N1 virus over time)

  23. Bioinformatics Mashups • Mashup of biological entities of the same type • Protein network mashup • Sequence annotation mashup • Mashup of biological entities of different types

  24. Mashup of pathway data and gene expression data • Calvin cycle pathway associated with gene expression

  25. Challenges to Data Mashup • Lack of annotation • Lack of links • Lack of link semantics • Lack of data semantics • Lack of standards or use of standards

  26. Lack of Semantic Annotation • Kei Tsi Daniel Cheng (this is not me!!) • Kei Cheung (16 years ago) • Kei Cheung (6 months ago)

  27. Lack of Links • collaborators

  28. Lack of Link Semantics • (?) prototyped

  29. Lack of Data Semantics • <html> <body> … <table> <tr> <td>Alcohol Dehydrogenase 1B (class I), beta polypeptide</td> <td>ADH1B</td> </tr> … </table> … </body> </html>

  30. Lack of Standards (Use of Standards) • Different naming rules (based on phenotype, sequence, function, organisms, etc.) • Armadillo (fruit flies) vs. β-catenin (mice) • PSM1 (human) = PSM2 (yeast); PSM1 (yeast) = PSM2 (human) • Sonic Hedgehog • ID proliferation • Different ID schemes: 1OF1 (PDB ID) and P06478 (SwissProt ID) correspond to Herpes Thymidine Kinase • Lexical variation: GO1234, GO:1234, GO-1234 • Synonyms vs. homonyms • Dopamine receptor D2: DRD2, DRD-2, D2 • PSA: prostate specific antigen, puromycin-sensitive aminopeptidase, psoriatic arthritis, pig serum albumin
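
The lexical-variation and synonym problems on this slide can be made concrete with a small sketch. It assumes that the short identifiers on the slide abbreviate zero-padded, seven-digit GO accessions, and it uses a hand-written synonym table; the function names and the table are hypothetical.

```python
import re

# Accept "GO1234", "GO:1234", "GO-1234" and similar spellings.
GO_ID = re.compile(r"^GO[:\-_ ]?(\d+)$", re.IGNORECASE)


def normalize_go_id(raw, width=7):
    """Rewrite a GO-style identifier into one canonical spelling (assumed: GO: plus 7 digits)."""
    match = GO_ID.match(raw.strip())
    if match is None:
        raise ValueError(f"not a GO-style identifier: {raw!r}")
    return "GO:" + match.group(1).zfill(width)


# Hypothetical synonym table for the dopamine receptor D2 example.
SYNONYMS = {"DRD2": "DRD2", "DRD-2": "DRD2", "D2": "DRD2"}


def canonical_symbol(symbol):
    """Map a known synonym to its canonical symbol; leave unknown symbols unchanged."""
    return SYNONYMS.get(symbol.upper(), symbol)


variants = ["GO1234", "GO:1234", "GO-1234"]
print({normalize_go_id(v) for v in variants})
print(canonical_symbol("drd-2"))
```

A lookup table handles synonyms but not homonyms: "PSA" cannot be resolved without context, because the same string names several unrelated entities.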

  31. Web 3.0

  32. Web 3.0 • It refers to a third generation of Internet-based services that emphasize machine-facilitated understanding of information in order to provide a more productive and intuitive user experience. • Semantic Web • Topic Map

  33. Semantic Web • "The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation." -- Tim Berners-Lee, James Hendler, Ora Lassila, The Semantic Web, Scientific American, May 2001 • It provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries • It is based on the Resource Description Framework (RDF) • URIs for naming/identifying web objects • Graph structure (directed labeled graph) for connecting web objects

  34. Resource Description Framework (RDF) • It is a standard data model (directed labeled graph) for representing information (metadata) about resources in the World Wide Web • In general, it can be used to represent information about “things” or “resources” that can be identified (using URIs) on the Web • It is intended to provide a simple way to make statements (descriptions) about Web resources

  35. RDF Statement • An RDF statement consists of: • Subject: resource identified by a URI • Predicate: property (as defined in a namespace identified by a URI) • Object: property value (literal) or a resource • A resource can be described by multiple statements.
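
The subject/predicate/object structure can be sketched as a plain data type. The URIs below reuse the gene example that appears on the following slide; the `Triple` type and the `describe` helper are hypothetical and not part of any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # URI of the resource being described
    predicate: str  # URI of the property
    obj: str        # a literal value or the URI of another resource


GENE = "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene&cmd=Retrieve&list_uids=125"

# Two statements about the same resource.
statements = [
    Triple(GENE, "http://en.wikipedia.org/wiki/Name",
           "Alcohol Dehydrogenase 1B (class I), beta polypeptide"),
    Triple(GENE, "http://en.wikipedia.org/wiki/Synonym", "ADH1B"),
]


def describe(resource, triples):
    """Collect every predicate/object pair stated about one resource."""
    return {t.predicate: t.obj for t in triples if t.subject == resource}


print(describe(GENE, statements))
```

Because each statement stands alone, descriptions of the same resource from different sources can be merged by simply pooling their triples.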

  36. Graphical & XML Representation http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene&cmd=Retrieve&list_uids=125 http://en.wikipedia.org/wiki/Synonym http://en.wikipedia.org/wiki/Name "ADH1B" "Alcohol Dehydrogenase 1B (class I), beta polypeptide" <?xml version="1.0"?> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:en="http://en.wikipedia.org/wiki/"> <rdf:Description rdf:about="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene&amp;cmd=Retrieve&amp;list_uids=125"> <en:Name>Alcohol Dehydrogenase 1B (class I), beta polypeptide</en:Name> <en:Synonym>ADH1B</en:Synonym> </rdf:Description> </rdf:RDF>
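
To show how the XML form maps back onto the graph, here is a minimal sketch that reads a well-formed version of the slide's RDF/XML with Python's standard XML parser and lists the triples. It handles only `rdf:Description` elements with literal-valued child properties, so it is not a general RDF/XML parser; a real application would use a dedicated RDF library. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Well-formed version of the slide's example ("&" escaped, rdf:about qualified).
RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:en="http://en.wikipedia.org/wiki/">
  <rdf:Description rdf:about="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene&amp;cmd=Retrieve&amp;list_uids=125">
    <en:Name>Alcohol Dehydrogenase 1B (class I), beta polypeptide</en:Name>
    <en:Synonym>ADH1B</en:Synonym>
  </rdf:Description>
</rdf:RDF>"""


def triples_from_rdf_xml(text):
    """Extract (subject, predicate, literal) triples from simple RDF/XML."""
    root = ET.fromstring(text)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree spells qualified names as "{namespace}local".
            namespace, local = prop.tag[1:].split("}")
            triples.append((subject, namespace + local, prop.text))
    return triples


for triple in triples_from_rdf_xml(RDF_XML):
    print(triple)
```

Each child element of `rdf:Description` becomes one edge in the graph: the element's qualified name expands to the predicate URI, and its text is the literal object.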

  37. RDF Schema (RDFS) • RDF Schema terms: • Class • Property • type • subClassOf • range • domain • Example: <DNASequence, type, Class> <Promoter, subClassOf, DNASequence> <Protein, type, Class> <TranscriptionFactor, subClassOf, Protein> <Bind, type, Property> <Bind, domain, TranscriptionFactor> <Bind, range, Promoter>
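
What the schema terms buy can be shown with a small inference sketch. It applies three simplified RDFS-style rules (domain, range, and subClassOf propagation of types) to the slide's example triples plus one instance statement; the individuals `TF_X` and `promoter_Y` and the function name are hypothetical, and the rule set is far from the complete RDFS entailment rules.

```python
schema = {
    ("DNASequence", "type", "Class"),
    ("Promoter", "subClassOf", "DNASequence"),
    ("Protein", "type", "Class"),
    ("TranscriptionFactor", "subClassOf", "Protein"),
    ("Bind", "type", "Property"),
    ("Bind", "domain", "TranscriptionFactor"),
    ("Bind", "range", "Promoter"),
}

# Hypothetical instance data: one binding event between two individuals.
data = {("TF_X", "Bind", "promoter_Y")}


def rdfs_closure(triples):
    """Repeat the three rules until no new triples appear."""
    facts = set(triples)
    while True:
        # Simplification: at most one domain and one range per property.
        domains = {s: o for s, p, o in facts if p == "domain"}
        ranges = {s: o for s, p, o in facts if p == "range"}
        supers = {(s, o) for s, p, o in facts if p == "subClassOf"}
        new = set()
        for s, p, o in facts:
            if p in domains:   # the subject of p belongs to p's domain
                new.add((s, "type", domains[p]))
            if p in ranges:    # the object of p belongs to p's range
                new.add((o, "type", ranges[p]))
            if p == "type":    # members of a subclass are members of the superclass
                for sub, sup in supers:
                    if o == sub:
                        new.add((s, "type", sup))
        if new <= facts:
            return facts
        facts |= new


closure = rdfs_closure(schema | data)
print(sorted(t for t in closure if t[0] in {"TF_X", "promoter_Y"}))
```

From a single `Bind` statement the schema lets a program conclude that `TF_X` is a Protein and `promoter_Y` is a DNASequence, although neither fact was stated.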

  38. Ontologies • In both computer science and information science, an ontology is a representation of a set of concepts within a domain and the relationships between those concepts. • It is a shared conceptualization of a domain • Ontologies are commonly encoded using ontology languages.

  39. Web Ontology Language (OWL) • Latest standard in ontology languages from the W3C • Built on top of RDF • OWL semantically extends RDF while it is syntactically the same as RDF • Three species of OWL • OWL-Lite • OWL-DL • OWL-Full

  40. OWL > RDF/RDFS • Cardinality restrictions: (e.g., a gene may have more than one transcription factor binding site) • Disjointness of classes: (e.g., mRNA may be classified either as introns or exons) • Other OWL constructs • uniqueness: (e.g., a GO term can have only one GO identifier) • unionOf: (e.g., a gene may be the unionOf introns and exons) • sameAs: specifying a synonymous relationship between classes (e.g., “Cerebellar Purkinje Cell” sameAs “Purkinje Neuron”).
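
Two of these constraints, disjointness and uniqueness, can be illustrated as simple consistency checks. This is a closed-world simplification written in plain Python, not an OWL reasoner; the class names, individuals, identifiers, and function names are placeholders.

```python
from collections import defaultdict


def disjointness_violations(type_assertions, disjoint_pairs):
    """Return individuals asserted to belong to two classes declared disjoint."""
    classes_of = defaultdict(set)
    for individual, cls in type_assertions:
        classes_of[individual].add(cls)
    return sorted(
        individual
        for individual, classes in classes_of.items()
        if any(a in classes and b in classes for a, b in disjoint_pairs)
    )


def uniqueness_violations(property_values):
    """Return subjects carrying more than one distinct value for a single-valued property."""
    values_of = defaultdict(set)
    for subject, value in property_values:
        values_of[subject].add(value)
    return sorted(s for s, values in values_of.items() if len(values) > 1)


# Placeholder assertions: seg1 is typed as both Intron and Exon.
type_assertions = [("seg1", "Intron"), ("seg1", "Exon"), ("seg2", "Exon")]
disjoint_pairs = [("Intron", "Exon")]

# Placeholder identifiers: term_b has been given two different IDs.
go_ids = [
    ("term_a", "GO:0000001"),
    ("term_a", "GO:0000001"),
    ("term_b", "GO:0000002"),
    ("term_b", "GO:0000003"),
]

print(disjointness_violations(type_assertions, disjoint_pairs))
print(uniqueness_violations(go_ids))
```

RDF and RDFS alone have no way to express either rule, so neither inconsistency could be detected from the data model itself.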

  41. Topic Map • A topic map (an ISO standard) is used to represent information using topics (concepts), associations, and occurrences • It is used to organize information in a way that can be optimized for navigation. (Diagram labels: association, occurrence)
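
The three building blocks can be sketched as a small in-memory data structure. This is an illustration of the concepts only, not an implementation of the ISO standard or of XTM; the class names, the neuroscience topics, and the example.org URL are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    occurrences: list = field(default_factory=list)  # links to information resources


@dataclass
class Association:
    kind: str       # the type of relationship
    members: tuple  # names of the topics it connects


class TopicMap:
    def __init__(self):
        self.topics = {}
        self.associations = []

    def add_topic(self, name, *occurrences):
        """Create the topic if needed and attach any occurrence links."""
        topic = self.topics.setdefault(name, Topic(name))
        topic.occurrences.extend(occurrences)
        return topic

    def associate(self, kind, *names):
        """Relate two or more topics, creating them if they do not exist yet."""
        for name in names:
            self.add_topic(name)
        self.associations.append(Association(kind, names))

    def neighbors(self, name):
        """Topics reachable from `name` through any association (for navigation)."""
        return sorted(
            {m for a in self.associations if name in a.members
             for m in a.members if m != name}
        )


tm = TopicMap()
tm.add_topic("Purkinje cell", "http://example.org/purkinje-cell")  # hypothetical occurrence
tm.associate("located in", "Purkinje cell", "Cerebellum")
tm.associate("releases", "Purkinje cell", "GABA")
print(tm.neighbors("Purkinje cell"))
```

Occurrences point outward to documents, while associations link topics to each other; navigation follows the associations.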

  42. Neuroscience Topic Map

  43. Topic Map Encoding/Querying • XML Topic Map (XTM) • Topic Map Query Language (TMQL)

  44. Visual Topic Maps • A Visual Topic Map can be defined as a topic map including visual topics. A visual topic is defined by a topic name which refers to a visual content.

  45. NCBI Site Map

  46. Mosaic of Chinese Characters in Stories about the Meaning of Ideograms

  47. Visualization of the del.icio.us Tags in an Interactive Graph

  48. Combining Semantic Web and Topic Map • Topic Map: visualization • Semantic Web: machine reasoning • Knowledge organization & representation (mapping between XTM and RDF/OWL)

  49. Web 2.0 Meets Web 3.0 • Folksonomy meets ontology • Tags can evolve into standard heavy-weight ontologies, while light-weight ontologies can be applied to tagging • Human readability meets machine readability • Visual network vs. semantic network • Social network meets semantic network • FOAF, semantic wiki • Syntactic mashup meets semantic mashup • Dapper and Yahoo! Pipes may become ontologically aware
