Università Politecnica delle Marche

DBin: an all round Semantic Web platform for user communities Giovanni Tummarello, Ph. D. SEMEDIA Semantic Web and Multimedia. Universita' Politecnica delle Marche, Ancona, Italy. http://semedia.deit.univpm.it.

Presentation Transcript


  1. DBin: an all round Semantic Web platform for user communities. Giovanni Tummarello, Ph.D., SEMEDIA (Semantic Web and Multimedia), Universita' Politecnica delle Marche, Ancona, Italy, http://semedia.deit.univpm.it. TOWARD A SCALABLE MULTIMEDIA METADATA INFRASTRUCTURE USING DISTRIBUTED COMPUTING AND SEMANTIC WEB TECHNOLOGIES. Patrizia Asirelli (1), Maria Grazia Di Bono (1), Massimo Martinelli (1), Ovidio Salvetti (1), Oreste Signore (1), Michele Catasta (2), Christian Morbidoni (2), Francesco Piazza (2), Giovanni Tummarello (2). (1) Institute of Information Science and Technologies (ISTI), Italian National Research Council (CNR), via Moruzzi 1, 56124 Pisa, Italy: Patrizia.Asirelli@isti.cnr.it, Maria.Grazia.DiBono@isti.cnr.it, Massimo.Martinelli@isti.cnr.it, Ovidio.Salvetti@isti.cnr.it, Oreste.Signore@isti.cnr.it. (2) SeMedia, Universita' Politecnica delle Marche, Ancona, Italy, http://semedia.deit.univpm.it: mcatasta@acm.org, c.morbidoni@deit.univpm.it, f.piazza@univpm.it, g.tummarello@gmail.com

  2. “Accessing” the Semantic Web. The direct approach: • “The Semantic Web consists of many RDF graphs nameable by URIs.” Carroll, Bizer, Hayes, Stickler, ISWC 2004, WWW 2005, etc. • Perfectly supported by SPARQL • Need an ink cartridge? Easy: ask HP.com • Need a data cable? Easy: ask Nokia.com

  3. The Semantic Web. What if I were interested in “data cables that work for my Nokia 1234” (no matter who produces them)? Or “everything about Peroni beer” (reviews, comments, places where I can buy it with prices, pictures of its glass, its brewery)? Inverse approach: • The Semantic Web consists of many concepts (URIs) which are annotated at a global scale

  4. Scalability issues. Direct: accessing Nokia.com/data.rdf • I know exactly whom to ask • Network traffic = the size of the document • Computational complexity: negligible. Inverse: “something about” my Nokia1234 • Many parties will have something to say • Finding them, distributing the query and collecting the answers all generate traffic • The query answering burden is imposed on them • How to join local data?

  5. A P2P / Personal Semantic Space approach. File sharing, P2P “philosophy”: • Downloads a lot; too much is not a problem • Shares what it has downloaded • Uncommitted, no guarantees, join and leave at will. But for the Semantic Web: • Exchanges, downloads and serves “RDF” rather than “files” • Searches by user interests rather than file names (“Sicilian cuisine”, “Scottish pubs”) • Remembers (almost) everything → grows a local triplestore

  6. SW P2P à la Napster, scenario and possibilities (2). Storing a lot of metadata locally: why? 1) Why not? Disk space is very cheap (and it's just metadata). 2) It is a key enabler of global scalability → “use the Semantic Web” without direct network traffic or external computational burden. 3) Maximally fast and interactive (high speed local queries). 4) It puts your local CPUs to work → much more powerful than what a server can give you for free; allows sophisticated information processing (reasoning, filters) → it's your computer and “your” data → personalized algorithms for rating and trust → relate it to your local resources (SW desktop integration)

  7. One size (P2P model) doesn’t fit all. Several SW P2P approaches have been proposed: • Centralized + crawlers/feeds • Distributed queries (Edutella et al.) • Distributed RDF storage (RDFPeers). These address different scenarios from the one studied here; see the RDFGrowth paper.

  8. “RDFGrowth” - Design essentials. In this scenario we don’t expect others to: • Execute arbitrary external graph queries • Perform an active “information hunt” for us: no query replication, no query forwarding or routing and, in general, no operations that induce a non-constant burden • Provide a service other than in a purely “best effort” fashion: no uptime guarantees, no service guarantees

  9. RDFGrowth Groups. Groups are based on a shared definition of “interesting URIs”, expressed as a local semantic query. Example, Beer&Breweries group: Select x where {x} <rdf:type> {<beer:Beer>}; Select x where {x} <rdf:type> {<beer:Brewery>}. Those who join execute the query and share information about the resulting URIs.
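
The group definition on this slide can be illustrated with a minimal Python sketch. It assumes triples are plain `(subject, predicate, object)` tuples of strings; the function name `interesting_uris`, the `ex:` resources and the in-memory list are hypothetical and stand in for a real triplestore query, not for DBin's own code.

```python
RDF_TYPE = "rdf:type"

def interesting_uris(triples, group_classes):
    """Return the subjects typed with any of the classes that define the group."""
    return {s for (s, p, o) in triples if p == RDF_TYPE and o in group_classes}

# Hypothetical local store of a peer joining the Beer&Breweries group.
local_store = [
    ("ex:Peroni", "rdf:type", "beer:Beer"),
    ("ex:PeroniBrewery", "rdf:type", "beer:Brewery"),
    ("ex:Nokia1234", "rdf:type", "ex:Phone"),
]

# The two class URIs come from the slide's example queries.
shared = interesting_uris(local_store, {"beer:Beer", "beer:Brewery"})
```

Only the URIs in `shared` would be advertised to the group; the phone stays out because it matches neither query.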

  10. Information “Surrounding” a URI: RDF “Neighbours”. MSG(statement) (approx. def.): the “blank node closure” of the statement. RDFN (def.): the RDFN of a resource is the graph composed of all the MSGs involving the resource itself. It is similar to the Concise Bounded Resource Description (CBRD) given in [URIQA], but it differs mainly in its use of “involves”. RDFN(URI) is the only remote query allowed in RDFGrowth.
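
As a rough illustration of the two definitions above, the following Python sketch computes an MSG as a blank node closure and an RDFN as the union of the MSGs that involve a URI. Several things are assumptions made for the sketch: triples are string tuples, blank nodes are strings starting with `_:`, and “involves” is read as “appears as subject or object of some statement in the MSG”. The example graph is invented.

```python
def is_blank(node):
    # Convention assumed for this sketch: blank nodes start with "_:".
    return node.startswith("_:")

def msg(statement, graph):
    """Blank node closure of `statement` inside `graph` (a set of triples)."""
    closure = {statement}
    frontier = [statement]
    while frontier:
        subject, _, obj = frontier.pop()
        for node in (subject, obj):
            if not is_blank(node):
                continue
            # Pull in every statement that mentions this blank node.
            for triple in graph:
                if triple not in closure and node in (triple[0], triple[2]):
                    closure.add(triple)
                    frontier.append(triple)
    return frozenset(closure)

def rdfn(uri, graph):
    """Union of all MSGs in which `uri` appears as subject or object (assumed reading)."""
    neighbourhood = set()
    for triple in graph:
        m = msg(triple, graph)
        if any(uri in (s, o) for (s, _, o) in m):
            neighbourhood |= m
    return neighbourhood

graph = {
    ("ex:Peroni", "rdf:type", "beer:Beer"),
    ("ex:Peroni", "ex:review", "_:r1"),
    ("_:r1", "ex:rating", "4"),
    ("_:r1", "ex:author", "ex:Giovanni"),
    ("ex:Moretti", "rdf:type", "beer:Beer"),
}
review = ("ex:Peroni", "ex:review", "_:r1")
review_msg = msg(review, graph)        # the review statement plus both _:r1 statements
peroni_rdfn = rdfn("ex:Peroni", graph)
```

Recomputing the closure for every triple is quadratic and is kept only for brevity; a real store would index statements by blank node.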

  11. Locating “News”: RDFN Hash Set. RHS(URI) = Hashes(canonicalize(RDFN(URI))) • Concise values exposed to the network to represent the knowledge a peer has about a URI • Peers looking for information about a URI use the published RHS to select who to talk to (i.e. the most “interesting” peer)
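
A minimal Python sketch of the RHS idea follows. It makes several assumptions that are not in the slides: one SHA-1 hash per MSG, a naive `canonicalize` that simply sorts the triples as text (a stand-in, not a real RDF canonical serialization, and only adequate when blank node labels already agree), and “most interesting” read as “publishes the most hashes we do not hold yet”. Peer names and data are invented.

```python
import hashlib

def canonicalize(msg):
    # Stand-in canonical form: triples rendered as text and sorted.
    return "\n".join(sorted(" ".join(triple) for triple in msg))

def rhs(msgs):
    """Hash set for a URI: one digest per MSG in its RDFN."""
    return {hashlib.sha1(canonicalize(m).encode("utf-8")).hexdigest() for m in msgs}

def most_interesting_peer(my_rhs, published):
    """Pick the peer whose published RHS contains the most hashes unknown to us."""
    if not published:
        return None
    best = max(published, key=lambda peer: len(published[peer] - my_rhs))
    return best if published[best] - my_rhs else None

m1 = frozenset({("ex:Peroni", "rdf:type", "beer:Beer")})
m2 = frozenset({("ex:Peroni", "ex:brewedBy", "ex:PeroniBrewery")})
m3 = frozenset({("ex:Peroni", "ex:abv", "4.7")})

mine = rhs([m1])
published = {"peerA": rhs([m1]), "peerB": rhs([m1, m2, m3])}
choice = most_interesting_peer(mine, published)
```

Because only hashes travel over the network, a peer can tell that another peer knows something new about a URI without either side shipping any RDF first.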

  12. Simulations, no KEL delay

  13. With KEL publishing delay

  14. Using “epidemic news propagation”

  15. DBin: everything else around RDFGrowth

  16. A lot of pragmatic decisions. A complete Semantic Web application today means… [Stack diagram layers: Deliverable integration platform; Domain application/GUI; Trust policy tools; Data flow pipeline; RDF signing methodologies; Ontology import policies; RDF P2P transport layer; URL data handling (up/down); URI minting; RDF storage]

  17. URL Data Handling and URI Minting [Stack diagram layers: URL data handling (up/down); URI minting]

  18. URL Data Handling and URI Minting. The P2P infrastructure will deal with RDF, but: • People want to access pictures, mp3s and files, not just see URIs, so URL resolving/downloading is needed • Automated uploading is also needed!

  19. RDF Storage [Stack diagram layers: URL data handling (up/down); URI minting; RDF storage]

  20. RDF/S Storage • Many choices! • We chose Sesame (SeRQL was schema aware long ago) • Thanks, Sesame guys! New features are being added (see trust filtering, pipelining)

  21. RDF P2P Transport Layer [Stack diagram layers: RDF P2P transport layer; URL data handling (up/down); URI minting; RDF storage]

  22. Ontology Import Policies [Stack diagram layers: Ontology import policies; RDF P2P transport layer; URL data handling (up/down); URI minting; RDF storage]

  23. Ontology importing needs care • Ontologies have an active role (see Sesame forward inference) • Policies to control import and export are needed • Our approach: DBin will “suggest” the import of ontologies when they are discovered, but the process is never fully automated and can be reversed

  24. RDF Signing Methodologies [Stack diagram layers: RDF signing methodologies; Ontology import policies; RDF P2P transport layer; URL data handling (up/down); URI minting; RDF storage]

  25. Authorship@model level: RDFTrustToolkit. Being certain about who said what. We want: • Small granularities, as information will flow bit by bit • Signatures INSIDE the RDF, so they flow along with the data and are kept within the triplestore. Tools: RDF canonical serialization (J. Carroll); MSG theory → reify a single triple, sign it all!

  26. The MSG theory comes in handy. From the RDF blank node semantics → an MSG is also the minimum unit that can be sent across a P2P network so that, once merged, the original graph will be restored. From the MSG definition: if s and t are distinct statements and t belongs to MSG(s), then MSG(t) = MSG(s). Each statement belongs to one and only one MSG. A graph can be uniquely decomposed into MSGs → the signature can be attached to a single, arbitrary triple in an MSG!
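
The decomposition property stated on this slide can be checked with a small Python sketch. It assumes string-tuple triples with blank nodes prefixed by `_:`, and it groups statements by connected components over shared blank nodes, which is one possible way to obtain the MSGs rather than the toolkit's actual algorithm. The graph is invented.

```python
def is_blank(node):
    # Convention assumed for this sketch: blank nodes start with "_:".
    return node.startswith("_:")

def decompose(graph):
    """Split `graph` into MSGs: statements that share a blank node end up together."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            node = parent[node]
        return node

    # Link the two blank nodes of any statement that has both.
    for s, _, o in graph:
        blanks = [n for n in (s, o) if is_blank(n)]
        for b in blanks:
            find(b)
        if len(blanks) == 2:
            parent[find(blanks[0])] = find(blanks[1])

    groups = {}
    for triple in graph:
        blanks = [n for n in (triple[0], triple[2]) if is_blank(n)]
        # A statement without blank nodes is an MSG on its own.
        key = find(blanks[0]) if blanks else triple
        groups.setdefault(key, set()).add(triple)
    return [frozenset(g) for g in groups.values()]

graph = {
    ("ex:Peroni", "rdf:type", "beer:Beer"),
    ("ex:Peroni", "ex:review", "_:r1"),
    ("_:r1", "ex:rating", "4"),
    ("_:r1", "ex:author", "ex:Giovanni"),
    ("ex:Moretti", "rdf:type", "beer:Beer"),
}
msgs = decompose(graph)
```

Each statement lands in exactly one MSG and the union of the MSGs gives back the graph, which is why a signature attached to any one triple can cover the whole MSG.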

  27. So RDFTrustToolkit: • Given a URI, lists the MSGs around it • Given an MSG, lists and verifies existing signatures • Can remove existing signatures or add new ones

  28. Signing a Minimum Selfcontained Graph (MSG) [Figure: RDF graph of a signed MSG. Labels: mbz:artistid=15290, IdKtR...j4c=, dbin:Base64sigvalue, mus:is_part_of, http://public../69..bd.pem, rdf:subject, dbin:X509Certificate, mus:plays, rdf:type, rdf:object, rdf:predicate, mus:Song, rdf:statement, mus:file, mus:Band, MD5:123123] A larger MSG lowers the percentage overhead. In DBin, the signing overhead is approximately 25%.

  29. Example (RDFTrustToolkit run)
      Original MSG:
      <rdf:RDF xmlns:dbin="http://dbin.org#"
               xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
        <rdf:Description rdf:about="http://dbin.org/Home/Panaioli">
          <dbin:student>Panaioli Fabio</dbin:student>
        </rdf:Description>
      </rdf:RDF>
      Signed MSG:
      <rdf:RDF xmlns:dbin="http://dbin.org#"
               xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
        <rdf:Description rdf:about="http://dbin.org/Home/Panaioli">
          <dbin:student>Panaioli Fabio</dbin:student>
        </rdf:Description>
        <rdf:Description rdf:nodeID="A0">
          <rdf:predicate rdf:resource="http://dbin.org#student"/>
          <dbin:PGPCertificate>http://public.dbin.org/cont/238785872.asc</dbin:PGPCertificate>
          <dbin:Base64SigValue>MCwCFOPX….A7xIaUgBzhkjcB5w==</dbin:Base64SigValue>
          <rdf:subject rdf:resource="http://dbin.org/Home/Panaioli"/>
          <rdf:object>Panaioli Fabio</rdf:object>
          <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement"/>
        </rdf:Description>
      </rdf:RDF>
      Canonical representation: [http://dbin.org/Home/Panaioli http://dbin.org#student Panaioli Fabio]
      Signature, Base64 encoding: MCwCFOPX….A7xIaUgBzhkjcB5w==
      A triple is reified, and both the signature and a URI to a public key certificate are attached.
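
The reify-and-sign step of this example can be sketched in Python. This is not the RDFTrustToolkit implementation: the sketch swaps the PGP signature for an HMAC-SHA256 from the standard library so that it runs without external dependencies, the canonical form only mimics the bracketed representation shown on the slide, and `sign_msg`, `verify`, the `_:sig` node and the demo key are all hypothetical.

```python
import base64
import hashlib
import hmac

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DBIN = "http://dbin.org#"

def canonical(msg):
    # Mimics the slide's bracketed form; not a full canonical serialization.
    return "".join("[%s %s %s]" % triple for triple in sorted(msg))

def signature(msg, key):
    # HMAC stands in for a real public key signature (assumption of this sketch).
    digest = hmac.new(key, canonical(msg).encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")

def sign_msg(msg, key, cert_url, node="_:sig"):
    """Return the MSG plus one reified triple that carries signature and certificate."""
    s, p, o = sorted(msg)[0]  # any single triple of the MSG can carry the signature
    return set(msg) | {
        (node, RDF + "type", RDF + "Statement"),
        (node, RDF + "subject", s),
        (node, RDF + "predicate", p),
        (node, RDF + "object", o),
        (node, DBIN + "PGPCertificate", cert_url),
        (node, DBIN + "Base64SigValue", signature(msg, key)),
    }

def verify(signed, key, node="_:sig"):
    """Recompute the signature over the non-signature triples and compare."""
    stored = next(o for (s, p, o) in signed
                  if s == node and p == DBIN + "Base64SigValue")
    original = {t for t in signed if t[0] != node}
    return hmac.compare_digest(stored, signature(original, key))

msg = {("http://dbin.org/Home/Panaioli", DBIN + "student", "Panaioli Fabio")}
key = b"demo-key"  # hypothetical secret, for illustration only
signed = sign_msg(msg, key, "http://public.dbin.org/cont/238785872.asc")
```

With a real asymmetric scheme, `verify` would fetch the certificate referenced by `dbin:PGPCertificate` instead of sharing a secret key; the reification pattern stays the same.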

  30. Trust Policy Tools [Stack diagram layers: Trust policy tools; RDF signing methodologies; Ontology import policies; RDF P2P transport layer; URL data handling (up/down); URI minting; RDF storage]

  31. From authorship to trust. Given the DS infrastructure, and given that it’s a local, personal DB repository → many solutions! Examples: • I trust Giovanni and Christian (only) • I trust whoever Giovanni and Christian trust • Etc.
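
The two example policies can be expressed as a small Python sketch. The representation is assumed: each MSG arrives paired with the name of its verified signer, and `trust_statements` maps a person to the people they declare trust in. Whether the second policy also keeps Giovanni and Christian themselves is not stated on the slide; the sketch keeps them. All names beyond Giovanni and Christian are illustrative.

```python
def trusted_signers(directly_trusted, trust_statements, hops=0):
    """Signers accepted when following `hops` levels of delegated trust."""
    accepted = set(directly_trusted)
    frontier = set(directly_trusted)
    for _ in range(hops):
        # Everyone trusted by the current frontier, minus those already accepted.
        frontier = {b for a in frontier for b in trust_statements.get(a, ())} - accepted
        accepted |= frontier
    return accepted

def filter_by_trust(signed_msgs, accepted):
    """Keep only the MSGs whose signer is accepted by the policy."""
    return [m for (m, signer) in signed_msgs if signer in accepted]

trust_statements = {
    "Giovanni": {"Michele"},
    "Christian": {"Francesco"},
    "Michele": {"Mallory"},
}
incoming = [
    ("msg-1", "Giovanni"),
    ("msg-2", "Francesco"),
    ("msg-3", "Mallory"),
]

policy_1 = trusted_signers({"Giovanni", "Christian"}, trust_statements, hops=0)
policy_2 = trusted_signers({"Giovanni", "Christian"}, trust_statements, hops=1)
visible_1 = filter_by_trust(incoming, policy_1)
visible_2 = filter_by_trust(incoming, policy_2)
```

Since the repository is local and personal, switching policy only changes which MSGs are visible; nothing has to be negotiated with other peers.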

  32. Data Flow Pipeline [Stack diagram layers: Trust policy tools; Data flow pipeline; RDF signing methodologies; Ontology import policies; RDF P2P transport layer; URL data handling (up/down); URI minting; RDF storage]

  33. Metadata Pipeline. This P2P scenario requires a pipeline of RDF processing. At low pipeline levels: raw growth, monotonicity, no inference. At higher levels: inference, trusted growth, information revision, filtering, etc. [Pipeline diagram labels: Non-monotonic filters (revocation); Raw RDF Repository; RDFTrust filtering; RDFS inference enabled Repository; Even smarter Repository; RDFGrowth P2P; OWL, domain rules, etc.; User selected policies; Approved Schema Repository]
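
A toy version of such a pipeline, written in Python under stated assumptions, may help to see why the lower levels are monotonic and the higher ones are not. The stage functions, the signer names and the data are invented; inference is limited to `rdfs:subClassOf` (type propagation and transitivity) rather than full RDFS, and none of this reflects DBin's actual Sesame-based implementation.

```python
def trust_filter(raw, trusted, revoked):
    """Non-monotonic stage: keep triples from trusted signers that are not revoked."""
    return {t for (t, signer) in raw if signer in trusted and t not in revoked}

def subclass_inference(triples):
    """Propagate rdf:type along rdfs:subClassOf and close subClassOf transitively."""
    inferred = set(triples)
    while True:
        new = set()
        for (c, p, d) in inferred:
            if p != "rdfs:subClassOf":
                continue
            for (x, p2, y) in inferred:
                if y == c and p2 in ("rdf:type", "rdfs:subClassOf"):
                    new.add((x, p2, d))
        if new <= inferred:
            return inferred
        inferred |= new

# Raw repository: append-only, no inference, everything received is kept.
raw_repository = [
    (("ex:Peroni", "rdf:type", "beer:Lager"), "Giovanni"),
    (("beer:Lager", "rdfs:subClassOf", "beer:Beer"), "Christian"),
    (("ex:Peroni", "ex:rating", "0"), "Mallory"),
]
trusted = {"Giovanni", "Christian"}

view = subclass_inference(trust_filter(raw_repository, trusted, revoked=set()))

# Revoking one statement shrinks the higher-level view; the raw store is untouched.
revoked = {("beer:Lager", "rdfs:subClassOf", "beer:Beer")}
view_after_revocation = subclass_inference(trust_filter(raw_repository, trusted, revoked))
```

Recomputing the view from the raw store after each policy change is the simplest design; it trades efficiency for the ability to undo any filtering decision.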

  34. Domain Application GUI [Stack diagram layers: Domain application/GUI; Trust policy tools; Data flow pipeline; RDF signing methodologies; Ontology import policies; RDF P2P transport layer; URL data handling (up/down); URI minting; RDF storage]

  35. User interface. Anything but a detail! • As we’re trying to “deliver” a sw tool for regular people, if it is unusable → failure • More complex than simple “semantic web browsing”: • Editing must be taken into consideration • Filtering, revocations, ontologies and P2P must be harmonized at user level by an appropriate facade

  36. DBin “domain applications”: brainlets • A single, downloadable, domain specific application that runs on top of DBin • Brainlet creation does NOT require programming knowledge, just XML editing → communities can be started by domain experts rather than SW hackers!

  37. DBin “domain applications”: brainlets • A single, downloadable package containing: • The setup information for the RDFGrowth transport layer • The ontologies to be used for annotations in the domain (e.g. the beer ontology) • A general GUI layout: which components to visualize (e.g. a message board, an ontology browser, a “detail” view) and how they are cascaded in terms of selection/reaction • Templates for domain specific “annotations”, e.g. a “movie review template” • Templates for readily available, “pre cooked” domain queries, i.e. structurally complex domain queries with only a few simple free parameters • A suggested trust model and information filtering rules for the domain, e.g. public keys of well known “founding members” or authorities, preset “browsing levels” • Support material, customized icons, help files, etc. • A basic RDF knowledge package

  38. DBin eats this.. (+ annotation ontologies)
      <Brainlet name="Beer" author="Onofrio Panzarino" version="1.0">
        <Ontology file="brainlet/beer.owl"/>
        <GUED name="Beer">
          <Topic name="Beers" uri="http://www.purl.org/net/ontology/beer#Beer">
            <Child query="SELECT X FROM {X} <rdfs:subClassOf> {$parent} WHERE X != $parent" recursive="true">
              <Child subjectBy="rdf:type" icon="/icons/beer.gif"/>
            </Child>
            <Child subjectBy="rdf:type" icon="/icons/beer.gif"/>
          </Topic>
          <Topic name="Ingredients" uri="http://www.purl.org/net/ontology/beer#Ingredient">
            <Child query="SELECT X FROM {X} <rdfs:subClassOf> {$parent} WHERE X != $parent" recursive="true">
              <Child subjectBy="rdf:type"/>
            </Child>
            <Child subjectBy="rdf:type"/>
          </Topic>
        </GUED>
        <View id="Focus" />
        <View id="GUEDNavigator" title="BeerNavigator" icon="icons/nav.gif" selecterFor="main" />
        <View id="Comments" title="Comments" listenTo="main" selecterFor="comments" />
        <View id="Comment" title="Details" listenTo="comments" />
        <View id="Gallery" listenTo="main" />
      </Brainlet>

  39. The user sees this

  40. DBin. Based on the Eclipse RCP, so: • Looks nice • Multiplatform • Completely plug-in based • Lots of possible plugins • Open source. Demo time!

  41. Conclusions • It's RDF for the masses! • DBin is an early tool to explore this scenario; we don’t claim it's fit for the real task yet, notably because of: • Performance issues • Real world hardening • Usability testing in real communities • There are many alternatives to each of the blocks • A lot of cool ideas are within a plugin's reach (Semantic desktop integration, maps, WS integration, etc.) Hurray!

  42. SEMEDIA Semantic Web and Multimedia. Thanks for your attention. Get DBin at http://www.dbin.org
