Building and Using Knowledge Bases - PowerPoint PPT Presentation

Presentation Transcript

  1. Building and Using Knowledge Bases Steffen Staab Saqib Mir – European Bioinformatics Institute Ermelinda d'Oro, Massimo Ruffolo – Univ. Calabria, Italy & WeST Team

  2. Institut WeST – Web Science & Technologies Semantic Web Web Retrieval Social Web Multimedia Web Software Web GESIS

  3. PhD thesis trauma 17 years ago „Nach dem Auspacken der LPS 105 präsentiert sich dem Betrachter ein stabiles Laufwerk, das genauso geringe Außenmaße besitzt wie die Maxtor.“ „Having unwrapped the LPS 105 – reveals itself to the onlooker – a stable disk drive, which has similarly small volume as the Maxtor.“

  4. General motivation is not information extraction, but it is solving tasks! General Motivation

  5. General objective: Extracting to LOD useAsExample hasLivedIn • Crucial to know: Ontologies nowadays reflect this structure • Ontologies are • Modular (vs one to rule them all) • Distributed (vs defined in one place) • Connected (vs isolated templates) • Extensible (vs claimed to be finished) • Lightweight (vs computationally intractable) • Popular ones are used more often (vs people disagreeing) • Ontologies – LEGO style

  6. Most famous applications • Steve Macbeth (Microsoft): - discussion wrt Schema.org - “about 7% of pages we crawl have mark-up” • http://www.w3.org/2012/06/06-schema-minutes.html • LOD Cloud • Google Knowledge Graph • Bing gets its own knowledge graph http://searchengineland.com/bing-britannica-partnership-123930

  7. Example ontology-based application 1: Analysis of Urban parameters

  8. General objective: Analysing LOD useAsExample hasLivedIn

  9. http://lisa.west.uni-koblenz.de/lisa-demo/ Family's analysis of Koblenz LOD + Open Street Map data

  10. http://lisa.west.uni-koblenz.de/lisa-demo/ Entrepreneur's analysis of Koblenz LOD + Open Street Map data 1. Prize German Linked Open Gov Data Competition 2012

  11. Example ontology-based application: Faceted Multimedia exploration

  12. Making Web 2.0 More Accessible Links Location low- to midlevel features Persons xxxxxxxxx Knowledge Tags [Schenk et al; JoWS 2009] GeoNames

  13. Choosing between Koblenz – and Koblenz Video at: http://vimeo.com/2057249

  14. Contextual Information

  15. Tag-based refinement

  16. A tag view of „Koblenz“ & „Castle“

  17. Semantic Identity – Festung Ehrenbreitstein

  18. Persons – Celebrities, FOAFers & Flickr Users Billion Triples Challenge 1. Prize 2008 [Schenk et al; JoWS 2009]

  19. Now on to information extraction: Observations on Information Extraction

  20. Challenges & Opportunities for IE Not all web pages are created equal

  21. Challenges & Opportunities for IE Some challenges are the same, e.g. finding type instances

  22. Challenges & Opportunities for IE Some challenges are the same, e.g. finding relation instances

  23. Challenges & Opportunities for IE Some contain concepts and their descriptions, some don't. No types here, few relation types

  24. Challenges & Opportunities for IE Knowing that they are instances and of which type Positional indication Textual indication

  25. Challenges & Opportunities for IE To some extent positional and layout indications work across languages and sites

  26. Challenges & Opportunities for IE owl:sameAs We should not only think about Web pages, but about Web sites

  27. Challenges & Opportunities for IE We should not only think about Web pages, but about Web sites owl:sameAs

  28. Comparing related work to our objectives Related work objectives • IE on Web pages • Acquiring instances and relationship instances • IE based on linear text Our objectives • IE on Web sites • Acquiring items • Classifying items in • Instances • Concepts • Relation instances • Relationships • IE also based on spatial position There is overlap and of course there are exceptions in related work

  29. Outline The Bio-Case The Social Media-Case • Motivation • State-of-the-Art • Core idea of SXPath • Implementation • Evaluation [Oro et al; VLDB 2010]

  30. Presentation-oriented documents

  31. Presentation-oriented documents • HTML DOM structure is site specific • Spatial arrangements are rarely explicit • Spatial layout is hidden in complex nesting of layout elements • Intricate DOM tree structures are conceptually difficult to query for the user (or a tool!)
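
The point that DOM structure is site specific can be illustrated with a minimal sketch. The two HTML fragments, their class names, and the path below are all made up for illustration (they are not from the slides): both pages would render a product name with a price to its right, yet a structural path written for one returns nothing on the other.

```python
import xml.etree.ElementTree as ET

# Two hypothetical pages with a similar visual layout (name on the left,
# price on the right) but different element nesting.
SITE_A = """
<html><body>
  <table><tr><td class="name">Disk drive</td><td class="price">199</td></tr></table>
</body></html>
"""

SITE_B = """
<html><body>
  <div class="row">
    <div><span class="name">Disk drive</span></div>
    <div><div><span class="price">199</span></div></div>
  </div>
</body></html>
"""

# A structural path written against site A's markup only.
PATH_FOR_SITE_A = ".//table/tr/td[@class='price']"


def extract_prices(html, path):
    """Return the text of every node matched by a structural path."""
    root = ET.fromstring(html)
    return [node.text for node in root.findall(path)]


prices_a = extract_prices(SITE_A, PATH_FOR_SITE_A)  # matches on site A
prices_b = extract_prices(SITE_B, PATH_FOR_SITE_A)  # no match on site B
print(prices_a, prices_b)
```

The sketch uses the limited XPath subset of Python's standard library; a full XPath 1.0 engine would hit the same problem, since the path encodes the nesting rather than the layout.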

  32. Related Work Web Query languages • XPath 1.0 and XQuery 1.0 • Established • Too difficult to use for scraping from intricate DOM structures Visual languages • Spatial Graph Grammars [Kong et al.] are quite complex in terms of both usability and efficiency • Algebras for creating and querying multimedia interactive presentations (e.g. ppt) [Subrahmanian et al.] Web wrapper induction exploiting visual interface [Gottlob et al.] [Sahuguet et al.] • generate XPath location paths of DOM nodes • can benefit from using Spatial XPath

  33. Outline The Bio-Case The Social Media-Case • Motivation • State-of-the-Art • Core idea of SXPath • Implementation • Evaluation

  34. Representing Spatial Relations between DOM Nodes b e

  35. Idea: Use Spatial Relations among DOM Nodes

  36. Spatial DOM (SDOM)

  37. SXPath System Architecture

  38. Querying for Relations Among Nodes Rectangular Cardinal Relations (RCR) r1 E:NE r2 Spatial models allow for expressing disjunctive relations among regions Topological Relations
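
A relation such as `r1 E:NE r2` can be read as "r1 occupies the east and north-east tiles around r2". The following is a rough sketch of how such a tile set could be computed for two rendered bounding boxes. It assumes screen coordinates (y grows downward, so north is the smaller y) and boxes with positive width and height; the `Box` type and the function names are illustrative, not taken from the SXPath paper.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in screen coordinates (y grows downward)."""
    left: float
    top: float
    right: float
    bottom: float


# Tile names indexed by (vertical band, horizontal band).
# "B" is the tile covered by the reference box itself.
TILES = {
    ("north", "west"): "NW", ("north", "mid"): "N", ("north", "east"): "NE",
    ("mid", "west"): "W", ("mid", "mid"): "B", ("mid", "east"): "E",
    ("south", "west"): "SW", ("south", "mid"): "S", ("south", "east"): "SE",
}


def bands(lo, hi, ref_lo, ref_hi, names):
    """Return which of three bands the open interval (lo, hi) overlaps."""
    before, middle, after = names
    hit = []
    if lo < ref_lo:
        hit.append(before)
    if lo < ref_hi and hi > ref_lo:
        hit.append(middle)
    if hi > ref_hi:
        hit.append(after)
    return hit


def cardinal_tiles(primary, reference):
    """Set of tiles around `reference` that `primary` overlaps."""
    vertical = bands(primary.top, primary.bottom,
                     reference.top, reference.bottom,
                     ("north", "mid", "south"))
    horizontal = bands(primary.left, primary.right,
                       reference.left, reference.right,
                       ("west", "mid", "east"))
    return {TILES[(v, h)] for v in vertical for h in horizontal}


r2 = Box(left=100, top=100, right=200, bottom=200)
r1 = Box(left=220, top=50, right=260, bottom=150)
relation = cardinal_tiles(r1, r2)
print(sorted(relation))  # ['E', 'NE']
```

Because both regions are rectangles, the tile set is always the product of a run of vertical bands and a run of horizontal bands, which is why two one-dimensional interval checks suffice here.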

  39. XPath Example

  40. SXPath Example

  41. From XPath 1.0 towards Spatial Querying with SXPath SXPath features • adopts intuitive path notation: • axis::nodetest [pred]* • adds to XPath • spatial axes • spatial position functions • natural semantics for spatial querying
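
To make the `axis::nodetest [pred]*` notation concrete, here is a minimal sketch of how one step with a spatial axis might be evaluated over nodes that carry rendered bounding boxes. The axis names `E` and `S`, the `Node` class, and the `step` function are hypothetical and only illustrate the idea; they are not the actual SXPath syntax or implementation, and positional predicates are left out.

```python
from dataclasses import dataclass


@dataclass
class Node:
    tag: str
    text: str
    box: tuple  # (left, top, right, bottom) in screen coordinates


def east_of(ctx, cand):
    """Candidate starts right of the context and shares its vertical span."""
    return (cand.box[0] >= ctx.box[2]
            and cand.box[1] < ctx.box[3] and cand.box[3] > ctx.box[1])


def south_of(ctx, cand):
    """Candidate starts below the context and shares its horizontal span."""
    return (cand.box[1] >= ctx.box[3]
            and cand.box[0] < ctx.box[2] and cand.box[2] > ctx.box[0])


# Hypothetical spatial axes, keyed by an illustrative axis name.
AXES = {"E": east_of, "S": south_of}


def step(context, nodes, axis, nodetest, predicates=()):
    """Evaluate one axis::nodetest[pred]* step from a single context node."""
    selected = [n for n in nodes
                if n is not context and AXES[axis](context, n)]
    selected = [n for n in selected if nodetest in ("*", n.tag)]
    for pred in predicates:  # each predicate filters the previous result
        selected = [n for n in selected if pred(n)]
    return selected


price_label = Node("th", "Price:", (10, 10, 60, 30))
nodes = [
    price_label,
    Node("td", "199", (70, 10, 110, 30)),
    Node("th", "Weight:", (10, 40, 60, 60)),
    Node("td", "2kg", (70, 40, 110, 60)),
]

# Roughly "E::td" from the price label: the cell rendered to its right.
right_cells = [n.text for n in step(price_label, nodes, "E", "td")]
# Roughly "S::*[ends with ':']": labels rendered below the price label.
labels_below = [n.text for n in step(price_label, nodes, "S", "*",
                                     [lambda n: n.text.endswith(":")])]
print(right_cells, labels_below)
```

The query depends only on where nodes are rendered, so it would stay the same whether the page uses a table or nested `div` elements.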

  42. SXPath System Architecture

  43. Complexity Results • Formal model defined in the paper [Oro et al; VLDB 2010]

  44. Outline The Bio-Case The Social Media-Case • Motivation • State-of-the-Art • Core idea of SXPath • Implementation • Evaluation

  45. SXPath System

  46. Summative User Study

  47. Summative User Study

  48. Summative User Study

  49. Outline The Bio-Case • Motivation • The (Biochemical) Deep Web • Contributions • Page-level wrapper induction • Site-wide wrapper generation • Error Correction by Mutual Reinforcement • Conclusions and Future Directions The Social Media Case • Motivation • State-of-the-Art • Core idea of SXPath • SXPath Language • Spatial Data Model • Syntax & Semantics • Complexity • Implementation • Evaluation