
DBin: an all round Semantic Web platform for user communities

Giovanni Tummarello, Ph.D.

SEMEDIA

Semantic Web and Multimedia

Università Politecnica delle Marche, Ancona, Italy

http://semedia.deit.univpm.it

TOWARD A SCALABLE MULTIMEDIA METADATA INFRASTRUCTURE USING DISTRIBUTED COMPUTING AND SEMANTIC WEB TECHNOLOGIES

Patrizia Asirelli (1), Maria Grazia Di Bono (1), Massimo Martinelli (1), Ovidio Salvetti (1), Oreste Signore (1)

(1) Institute of Information Science and Technologies (ISTI), Italian National Research Council (CNR), via Moruzzi 1, 56124 Pisa, Italy


Michele Catasta (2), Christian Morbidoni (2), Francesco Piazza (2), Giovanni Tummarello (2)

(2) SeMedia, Università Politecnica delle Marche, Ancona, Italy

http://semedia.deit.univpm.it


“Accessing” the Semantic Web

The direct approach:

  • “The Semantic Web consists of many RDF graphs nameable by URIs.”
    (Carroll, Bizer, Hayes, Stickler; ISWC 2004, WWW 2005, etc.)

  • Perfectly supported by SPARQL
      • Need an ink cartridge? Easy: ask HP.com
      • Need a data cable? Easy: ask Nokia.com

The Semantic Web

What if I were interested in “data cables that work with my Nokia 1234” (no matter who makes them)?

Or in “everything about Peroni beer” (reviews, comments, places where I can buy it and at what price, pictures of its glass, its brewery)?

Inverse approach:

  • The Semantic Web consists of many concepts (URIs) which are annotated on a global scale

Scalability issues

Direct: accessing Nokia.com/data.rdf

  • I know exactly whom to ask
  • Network traffic = the size of the document
  • Computational complexity: negligible

Inverse: “something about” my Nokia 1234

  • Many parties will have something to say
  • They must be found, the query distributed, and the answers collected (traffic)
  • The query-answering burden is imposed on them
  • How to join the answers with local data?

A P2P / Personal Semantic Space approach

File sharing, P2P “philosophy”:

  • Downloads a lot; downloading too much is not a problem
  • Shares what it has downloaded
  • Uncommitted: no guarantees, peers join and leave at will

But for the Semantic Web:

  • Exchanges, downloads and serves “RDF” rather than “files”
  • Searches by user interests rather than file names (“Sicilian cuisine”, “Scottish pubs”)
  • Remembers (almost) everything → grows a local triplestore

SW P2P a la Napster, scenario and possibilities (2)

Storing a lot of metadata locally... why?

1) Why not? Disk space is very cheap (and it's just metadata)!

2) Key enabler of global scalability!

  → “Use the Semantic Web” without direct network traffic or external computational burden

3) Maximally fast and interactive (high-speed local queries)

4) Puts your local CPUs to work!

  → Much more powerful than what a server can give you for free; allows sophisticated information processing (reasoning, filters)

  → It's your computer, “your” data

  → Personalized algorithms for rating and trust

  → Relate it to your local resources (SW desktop integration)

One size (P2P model) doesn’t fit all

Several SW P2P approaches have been proposed:

  • Centralized + Crawlers/feeds
  • Distributed queries (Edutella et al.)
  • Distributed RDF storage (RDFPeers)

These address different scenarios, not the one studied here; see the RDFGrowth paper.

“RDFGrowth” - Design essentials

In this scenario we don’t expect others to:

  • Execute arbitrary external graph queries
  • Perform an active “information hunt” for us
    • No replicating queries, no query forwarding or routing. In general, no operations that induce a non-constant burden
  • Provide a service other than on a purely “best effort” basis
    • No uptime guarantees, no service guarantees

RDFGrowth Groups

Based on a shared definition of “interesting URIs” via a local semantic query:

Example Beer&Breweries Group:

Select x where {x} <rdf:type> {<beer:Beer>}

Select x where {x} <rdf:type> {<beer:Brewery>}

Those who join will execute the query and share information about the resulting URIs
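
As a rough illustration of how a group definition could select the URIs a peer shares, here is a minimal Python sketch. It assumes triples are plain (subject, predicate, object) string tuples held in memory; the function name `group_uris` and the sample data (`ex:peroni`, `beer:brewedBy`, and so on) are hypothetical and not taken from DBin.

```python
# Triples are (subject, predicate, object) tuples of plain strings.
# All prefixed names in the sample data are illustrative assumptions.
RDF_TYPE = "rdf:type"

def group_uris(triples, group_classes):
    """Return the URIs whose rdf:type is one of the group's classes."""
    return sorted({s for (s, p, o) in triples
                   if p == RDF_TYPE and o in group_classes})

local_store = [
    ("ex:peroni", RDF_TYPE, "beer:Beer"),
    ("ex:peroni", "beer:brewedBy", "ex:peroniBrewery"),
    ("ex:peroniBrewery", RDF_TYPE, "beer:Brewery"),
    ("ex:nokia1234", RDF_TYPE, "ex:Phone"),
]

# The Beer&Breweries group: everything typed as a beer or a brewery.
BEER_GROUP = {"beer:Beer", "beer:Brewery"}
print(group_uris(local_store, BEER_GROUP))
```

A peer joining the group would run the equivalent query on its own store and then exchange information only about the URIs returned.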

Information “Surrounding” a URI: RDF “Neighbours”

MSG(statement) (approximate definition): the “blank node closure” of the statement.

RDFN (definition): the RDFN of a resource is the graph composed of all the MSGs involving the resource itself.

Similar to the Concise Bounded Resource Description (CBRD) given in [URIQA], but it differs mainly in its use of “involves”.

RDFN(Uri) is the only remote query allowed in RDFGrowth
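
The two definitions can be sketched in a few lines of Python. This is an illustration under stated assumptions, not DBin's implementation: triples are string tuples, blank nodes are recognised by a `_:` prefix, and “involving” is read as “the resource appears in any position of some statement of the MSG”. The sample graph is invented.

```python
def is_blank(node):
    """Assumption: blank nodes are written with a '_:' prefix."""
    return node.startswith("_:")

def msg(statement, graph):
    """Blank node closure of `statement` within `graph`."""
    closure = {statement}
    frontier = [statement]
    while frontier:
        s, _, o = frontier.pop()
        blanks = {n for n in (s, o) if is_blank(n)}
        if not blanks:
            continue
        # Pull in every statement that shares one of these blank nodes.
        for t in graph:
            if t not in closure and (t[0] in blanks or t[2] in blanks):
                closure.add(t)
                frontier.append(t)
    return frozenset(closure)

def rdfn(uri, graph):
    """The MSGs involving `uri`; their union is the RDFN graph."""
    return {msg(t, graph) for t in graph if uri in t}

graph = {
    ("ex:peroni", "rdf:type", "beer:Beer"),
    ("ex:peroni", "ex:review", "_:r1"),
    ("_:r1", "ex:rating", "4"),
    ("_:r1", "ex:author", "ex:giovanni"),
    ("ex:moretti", "rdf:type", "beer:Beer"),
}

for m in sorted(rdfn("ex:peroni", graph), key=len):
    print(len(m), "statement(s)")
```

The review attached through the blank node `_:r1` travels as one unit of three statements, while the ground `rdf:type` statement forms an MSG on its own.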

Locating “News”: RDFN Hash Set

RHS(URI) = Hashes(canonicalize(RDFN(URI)))

  • Concise values exposed to the network to represent the knowledge a peer has about a URI
  • Peers looking for information about a URI use the published RHS to select who to talk to (i.e. the most “interesting” peer)
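
A minimal Python sketch of the idea follows. Several things here are assumptions made for illustration only: canonicalization is reduced to sorting the statements (adequate only when blank node labels are stable; a real system needs a proper canonical serialization), SHA-1 is an arbitrary choice of hash, and “most interesting” is taken to mean “exposes the most hashes we do not already hold”.

```python
import hashlib

def canonicalize(msg):
    """Naive stand-in for canonical serialization: sorted statements."""
    return "\n".join(" ".join(t) for t in sorted(msg))

def rhs(rdfn):
    """RDFN Hash Set: one digest per MSG in the RDFN."""
    return {hashlib.sha1(canonicalize(m).encode("utf-8")).hexdigest()
            for m in rdfn}

def most_interesting_peer(my_rhs, published):
    """Pick the peer whose published RHS holds the most unknown hashes."""
    if not published:
        return None
    news = {peer: len(hashes - my_rhs) for peer, hashes in published.items()}
    best = max(sorted(news), key=news.get)  # sorted() makes ties deterministic
    return best if news[best] > 0 else None

# Three MSGs about the same (hypothetical) URI.
a = frozenset({("ex:peroni", "rdf:type", "beer:Beer")})
b = frozenset({("ex:peroni", "ex:madeIn", "ex:Italy")})
c = frozenset({("ex:peroni", "ex:rating", "4")})

mine = rhs({a})
published = {"peer1": rhs({a}), "peer2": rhs({a, b, c}), "peer3": rhs({a, b})}
print(most_interesting_peer(mine, published))  # peer2
```

Only the hashes cross the network at this stage; the actual MSGs are requested afterwards from the selected peer.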
A complete Semantic Web application today means…

  → A lot of pragmatic decisions

The building blocks (stack diagram, top to bottom):

  • Deliverable integration platform
  • Domain application/GUI
  • Trust policy tools
  • Data flow pipeline
  • RDF signing methodologies
  • Ontology import policies
  • RDF P2P transport layer
  • URL data handling (up/down), URI minting
  • RDF storage

URL Data Handling and URI Minting


The P2P infrastructure deals with RDF, but:

  • People want to access pictures, MP3s and files, not just see URIs → URL resolving/downloading is needed
  • Automated uploading is also needed!

RDF/S Storage
  • Many choices!
  • We chose Sesame (SeRQL was schema-aware long ago)
  • Thanks, Sesame guys! New features are being added (see trust filtering, pipelining)

RDF P2P Transport Layer

Ontology Import Policies

Ontology importing needs care
  • Ontologies play an active role (see Sesame forward inference)
  • Policies to control import and export are needed
  • Our approach: DBin will “suggest” the import of ontologies when they are discovered, but the process is never fully automated and can be reversed

RDF Signing Methodologies

Authorship@model level: RDFTrustToolkit

Being certain about who said what

We want:

  • Small granularity, as information will flow bit by bit.
  • Signatures INSIDE the RDF, so they flow along with the data and are kept within the triplestore.

Tools:

  • RDF canonical serialization (J. Carroll)
  • MSG theory → reify a single triple, sign it all!

MSG theory comes in handy

From the RDF blank node semantics:

  → An MSG is also the minimum unit that can be sent across a P2P network so that, once merged, the original graph is restored.

From the MSG definition:

  • If s and t are distinct statements and t belongs to MSG(s), then MSG(t) = MSG(s).
  • Each statement belongs to one and only one MSG.
  • A graph can be uniquely decomposed into MSGs.

the signature can be attached to a single, arbitrary triple in a MSG!
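
The decomposition property can be checked on a toy graph. The sketch below is an assumption-laden illustration rather than the RDFTrustToolkit algorithm: triples are string tuples, blank nodes carry a `_:` prefix, and statements are grouped whenever they share a blank node. The property `mus:by` is invented for the example.

```python
def decompose(graph):
    """Partition `graph` into MSGs (statements linked by shared blank nodes)."""
    groups = []  # list of (blank_nodes, statements) pairs, pairwise disjoint
    for t in sorted(graph):
        blanks = {n for n in (t[0], t[2]) if n.startswith("_:")}
        stmts = {t}
        rest = []
        for g_blanks, g_stmts in groups:
            if blanks & g_blanks:
                # The new statement bridges this group: merge it in.
                blanks |= g_blanks
                stmts |= g_stmts
            else:
                rest.append((g_blanks, g_stmts))
        groups = rest + [(blanks, stmts)]
    return {frozenset(s) for _, s in groups}

graph = {
    ("ex:song", "mus:is_part_of", "_:album"),
    ("_:album", "mus:by", "ex:band"),
    ("ex:band", "rdf:type", "mus:Band"),
}

msgs = decompose(graph)
print(sorted(len(m) for m in msgs))            # [1, 2]
print(set().union(*msgs) == graph)             # merging restores the graph
print(sum(len(m) for m in msgs) == len(graph)) # no statement is in two MSGs
```

Because every statement lands in exactly one MSG, marking any one triple of an MSG is enough to identify the whole unit that a signature covers.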

So RDFTrustToolkit...
  • Given a URI, lists the MSGs around it
  • Given an MSG, lists and verifies existing signatures
  • Can remove existing signatures or add new ones

Signing a Minimum Self-contained Graph (MSG)

[Figure: an example music MSG (labels include mbz:artistid=15290, mus:Band, mus:Song, mus:plays, mus:is_part_of, mus:file, MD5:123123) in which one statement is reified (rdf:type rdf:Statement, rdf:subject, rdf:predicate, rdf:object) and annotated with a signature (dbin:Base64sigvalue, IdKtR...j4c=) and a certificate (dbin:X509Certificate, http://public../69..bd.pem).]

A larger MSG lowers the percentage overhead. In DBin, the signing overhead is approximately 25%.

Example (RDFTrustToolkit run)

Original MSG

<rdf:RDF
    xmlns:dbin="http://dbin.org#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://dbin.org/Home/Panaioli">
    <dbin:student>Panaioli Fabio</dbin:student>
  </rdf:Description>
</rdf:RDF>

Signed MSG

<rdf:RDF
    xmlns:dbin="http://dbin.org#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://dbin.org/Home/Panaioli">
    <dbin:student>Panaioli Fabio</dbin:student>
  </rdf:Description>
  <rdf:Description rdf:nodeID="A0">
    <rdf:predicate rdf:resource="http://dbin.org#student"/>
    <dbin:PGPCertificate>http://public.dbin.org/cont/238785872.asc</dbin:PGPCertificate>
    <dbin:Base64SigValue>MCwCFOPX….A7xIaUgBzhkjcB5w==</dbin:Base64SigValue>
    <rdf:subject rdf:resource="http://dbin.org/Home/Panaioli"/>
    <rdf:object>Panaioli Fabio</rdf:object>
    <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement"/>
  </rdf:Description>
</rdf:RDF>

Canonical representation

[http://dbin.org/Home/Panaioli http://dbin.org#student Panaioli Fabio]

Signature, Base64 encoding

MCwCFOPX….A7xIaUgBzhkjcB5w==

A triple is reified and both the signature and a URI to a public key certificate are attached
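
The reify-and-sign step can be mimicked with the Python standard library. This is only a sketch under loud assumptions: an HMAC over a shared key stands in for the public-key (PGP/X.509) signature that RDFTrustToolkit attaches, canonical serialization is reduced to sorting the statements, and the blank node label `_:sig` and the helper names are hypothetical.

```python
import base64
import hashlib
import hmac

SIG_NODE = "_:sig"  # hypothetical label for the reification node

def canonical(msg):
    """Naive stand-in for canonical serialization: sorted statements."""
    return "\n".join(" ".join(t) for t in sorted(msg)).encode("utf-8")

def _signature(msg, key):
    # ASSUMPTION: HMAC-SHA256 replaces a real public-key signature here.
    digest = hmac.new(key, canonical(msg), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")

def sign_msg(msg, key, cert_url):
    """Reify one arbitrary triple and hang signature and certificate on it."""
    s, p, o = min(msg)  # any triple will do; min() keeps it deterministic
    return set(msg) | {
        (SIG_NODE, "rdf:type", "rdf:Statement"),
        (SIG_NODE, "rdf:subject", s),
        (SIG_NODE, "rdf:predicate", p),
        (SIG_NODE, "rdf:object", o),
        (SIG_NODE, "dbin:PGPCertificate", cert_url),
        (SIG_NODE, "dbin:Base64SigValue", _signature(msg, key)),
    }

def verify_msg(signed, key):
    """Strip the signature statements and recompute the signature."""
    payload = {t for t in signed if t[0] != SIG_NODE}
    sigs = [o for (s, p, o) in signed
            if s == SIG_NODE and p == "dbin:Base64SigValue"]
    return len(sigs) == 1 and hmac.compare_digest(sigs[0], _signature(payload, key))

msg = {("http://dbin.org/Home/Panaioli", "http://dbin.org#student", "Panaioli Fabio")}
signed = sign_msg(msg, b"demo-key", "http://public.dbin.org/cont/238785872.asc")
print(len(signed), verify_msg(signed, b"demo-key"))
```

The signature travels inside the graph as ordinary statements, so it can be stored in the triplestore and exchanged together with the MSG it covers.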

Trust Policy Tools

From authorship to trust

Given the digital signature (DS) infrastructure,

and given that it’s a local, personal DB repository

  → many solutions!

Examples:

  • I trust Giovanni and Christian (only)
  • I trust who Giovanni and Christian trust.
  • Etc etc..
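
The two example policies can be expressed as a small filter over signed statements. The sketch below is illustrative only: the data model (statement paired with a signer name), the `trusts` mapping and the signer "mallory" are assumptions, and "I trust who they trust" is modelled as delegation over a limited number of hops.

```python
def trusted_signers(roots, trusts, hops=0):
    """Signers accepted when trust is delegated for up to `hops` steps."""
    accepted = set(roots)
    frontier = set(roots)
    for _ in range(hops):
        frontier = {t for s in frontier for t in trusts.get(s, ())} - accepted
        accepted |= frontier
    return accepted

def filter_by_trust(signed_statements, accepted):
    """Keep only the statements whose signer is accepted."""
    return [stmt for stmt, signer in signed_statements if signer in accepted]

# Who declares trust in whom (hypothetical data).
trusts = {
    "giovanni": {"michele"},
    "christian": {"francesco"},
    "michele": {"mallory"},
}
roots = {"giovanni", "christian"}

statements = [
    (("ex:peroni", "ex:rating", "4"), "giovanni"),
    (("ex:peroni", "ex:rating", "5"), "francesco"),
    (("ex:peroni", "ex:rating", "1"), "mallory"),
]

only_roots = trusted_signers(roots, trusts)          # policy 1
one_hop = trusted_signers(roots, trusts, hops=1)     # policy 2
print(len(filter_by_trust(statements, only_roots)))  # 1
print(len(filter_by_trust(statements, one_hop)))     # 2
```

Since the repository is local and personal, each user can pick a different policy without affecting anyone else.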
Data Flow Pipeline

Metadata Pipeline

This P2P scenario requires a pipeline of RDF processing.

  • At the low pipeline levels: raw growth, monotonicity, no inference
  • At the higher levels: inference, trusted growth, information revision, filtering, etc.
  • Non-monotonic filters (revocation)

[Figure: pipeline diagram with the stages Raw RDF Repository, RDFTrust filtering, RDFS inference enabled Repository and Even smarter Repository; other labels: RDFGrowth P2P; OWL, Domain rules, etc.; User selected policies; Approved Schema Repository.]
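
A toy version of such a pipeline, in Python, may help show why revocation makes the upper levels non-monotonic. Everything here is an assumption for illustration: the raw level is a list of (statement, signer) pairs, the trust stage is a simple signer whitelist with a revocation set, and inference is limited to propagating rdf:type along rdfs:subClassOf.

```python
def trust_filter(raw, accepted, revoked=frozenset()):
    """Keep statements from accepted signers that have not been revoked."""
    return {stmt for stmt, signer in raw
            if signer in accepted and stmt not in revoked}

def infer_types(triples):
    """Propagate rdf:type along rdfs:subClassOf until nothing new appears."""
    inferred = set(triples)
    while True:
        subclass = {(s, o) for (s, p, o) in inferred if p == "rdfs:subClassOf"}
        new = {(x, "rdf:type", sup)
               for (x, p, cls) in inferred if p == "rdf:type"
               for (sub, sup) in subclass if sub == cls} - inferred
        if not new:
            return inferred
        inferred |= new

def pipeline(raw, accepted, revoked=frozenset()):
    """Raw repository -> trust filtering -> inference-enabled repository."""
    return infer_types(trust_filter(raw, accepted, revoked))

raw = [  # the raw level only ever grows
    (("beer:Lager", "rdfs:subClassOf", "beer:Beer"), "giovanni"),
    (("ex:peroni", "rdf:type", "beer:Lager"), "christian"),
    (("ex:peroni", "ex:rating", "1"), "mallory"),
]
accepted = {"giovanni", "christian"}
derived = ("ex:peroni", "rdf:type", "beer:Beer")

print(derived in pipeline(raw, accepted))
print(derived in pipeline(raw, accepted,
                          revoked={("ex:peroni", "rdf:type", "beer:Lager")}))
```

Revoking a statement removes the conclusions that depended on it, which is why inference sits above the filters and the raw store is left untouched.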

Domain Application GUI

User interface

Anything but a detail!

  • As we’re trying to “deliver” a SW tool for regular people, if it is unusable → failure
  • More complex than simple “Semantic Web browsing”:
    • Editing must be taken into consideration
    • Filtering, revocations, ontologies and P2P must be harmonized at user level by an appropriate facade

DBin “domain applications”: brainlets
  • A single, downloadable, domain-specific application that runs on top of DBin
  • Brainlet creation does NOT require programming knowledge, just XML editing.
      → Communities can be started by domain experts rather than SW hackers!

DBin “domain applications”: brainlets
A single, downloadable package containing:

  • The setup information for the RDFGrowth transport layer
  • The ontologies to be used for annotations in the domain (e.g. the beer ontology)
  • A general GUI layout: which components to visualize (e.g. a message board, an ontology browser, a “detail” view) and how they're cascaded in terms of selection/reaction
  • Templates for domain-specific “annotations”, e.g. a “movie review template”
  • Templates for readily available, “pre-cooked” domain queries: structurally complex domain queries with only a few simple free parameters
  • A suggested trust model and information filtering rules for the domain, e.g. public keys of well-known “founding members” or authorities, preset “browsing levels”
  • Support material: customized icons, help files, etc.
  • A basic RDF knowledge package

DBin eats this.. (+ annotation ontologies)

<Brainlet name="Beer" author="Onofrio Panzarino" version="1.0">
  <Ontology file="brainlet/beer.owl"/>
  <GUED name="Beer">
    <Topic name="Beers" uri="http://www.purl.org/net/ontology/beer#Beer">
      <!-- subclasses of the topic class, expanded recursively -->
      <Child
          query="SELECT X FROM {X} &lt;rdfs:subClassOf&gt; {$parent} WHERE X != $parent"
          recursive="true">
        <Child subjectBy="rdf:type" icon="/icons/beer.gif"/>
      </Child>
      <Child subjectBy="rdf:type" icon="/icons/beer.gif"/>
    </Topic>
    <Topic name="Ingredients" uri="http://www.purl.org/net/ontology/beer#Ingredient">
      <Child
          query="SELECT X FROM {X} &lt;rdfs:subClassOf&gt; {$parent} WHERE X != $parent"
          recursive="true">
        <Child subjectBy="rdf:type"/>
      </Child>
      <Child subjectBy="rdf:type"/>
    </Topic>
  </GUED>
  <!-- views and how they react to each other's selections -->
  <View id="Focus" />
  <View id="GUEDNavigator" title="BeerNavigator" icon="icons/nav.gif" selecterFor="main" />
  <View id="Comments" title="Comments" listenTo="main" selecterFor="comments" />
  <View id="Comment" title="Details" listenTo="comments" />
  <View id="Gallery" listenTo="main" />
</Brainlet>

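To see how the view declarations cascade, a descriptor like the one above can be read with Python's standard XML parser. This sketch uses a shortened copy of the descriptor and assumes, from the attribute names alone, that `selecterFor` names a selection channel a view publishes to and `listenTo` names a channel it reacts to; DBin's actual semantics may differ, and `view_wiring` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the Beer brainlet descriptor (GUED queries omitted).
BRAINLET = """\
<Brainlet name="Beer" author="Onofrio Panzarino" version="1.0">
  <Ontology file="brainlet/beer.owl"/>
  <View id="Focus"/>
  <View id="GUEDNavigator" title="BeerNavigator" icon="icons/nav.gif" selecterFor="main"/>
  <View id="Comments" title="Comments" listenTo="main" selecterFor="comments"/>
  <View id="Comment" title="Details" listenTo="comments"/>
  <View id="Gallery" listenTo="main"/>
</Brainlet>
"""

def view_wiring(xml_text):
    """Map each channel to (views that select on it, views that listen to it)."""
    root = ET.fromstring(xml_text)
    wiring = {}
    for view in root.iter("View"):
        for attr, slot in (("selecterFor", 0), ("listenTo", 1)):
            channel = view.get(attr)
            if channel:
                wiring.setdefault(channel, ([], []))[slot].append(view.get("id"))
    return root.get("name"), wiring

name, wiring = view_wiring(BRAINLET)
print(name)
for channel, (selectors, listeners) in wiring.items():
    print(channel, selectors, "->", listeners)
```

Read this way, selecting a beer in the navigator drives the comment list and the gallery, and selecting a comment drives the detail view.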
DBin

Based on the Eclipse RCP, so:

  • Looks nice
  • Multiplatform
  • Completely plug-in based
  • Lots of possible plugins
  • Open source

Demo time!

Conclusions
  • It's RDF for the masses!
  • DBin is an early tool to explore this scenario; we don’t claim it's fit for the real task yet, notably:
      • Performance issues
      • Real-world hardening
      • Usability testing in real communities
  • There are many alternatives to each of the blocks
  • A lot of cool ideas are within a plugin's reach (Semantic desktop integration, maps, WS integration, etc.)

Hurray! 


Thanks for your attention

Get DBin at http://www.dbin.org
