DBin: an all round Semantic Web platform for user communities
Giovanni Tummarello, Ph. D

SEMEDIA

Semantic Web and Multimedia

Universita' Politecnica delle Marche, Ancona, Italy

http://semedia.deit.univpm.it

TOWARD A SCALABLE MULTIMEDIA METADATA INFRASTRUCTURE USING DISTRIBUTED COMPUTING AND SEMANTIC WEB TECHNOLOGIES

Patrizia Asirelli (1), Maria Grazia Di Bono (1), Massimo Martinelli (1), Ovidio Salvetti (1), Oreste Signore (1)

(1) Institute of Information Science and Technologies (ISTI), Italian National Research Council (CNR), via Moruzzi 1, 56124 Pisa, Italy

Michele Catasta (2), Christian Morbidoni (2), Francesco Piazza (2), Giovanni Tummarello (2)

(2) SeMedia, Universita' Politecnica delle Marche, Ancona, Italy

http://semedia.deit.univpm.it


“Accessing” the Semantic Web

The direct approach:

  • “The Semantic Web consists of many RDF graphs nameable by URIs.”

    Carroll, Bizer, Hayes, Stickler

    ISWC 2004, www2005, etc.

  • Perfectly supported by SPARQL

    • Need an ink cartridge? So easy to ask HP.com

    • Need a data cable? So easy to ask Nokia.com


The Semantic Web

What if I were interested in “data cables that work for my Nokia 1234” (no matter who produces them)?

Or “everything about Peroni beer” (reviews, comments, places where I can buy it with prices, pictures of its glass, its brewery)?

Inverse approach:

  • The Semantic Web consists of many concepts (URIs) which are annotated at global scale


Scalability issues

Direct: Accessing Nokia.com/data.rdf

  • I know exactly whom to ask

  • Network traffic = the size of the document

  • Computational complexity: negligible

Inverse: “Something about” my Nokia1234

  • Many parties will have something to say

    • Find them / distribute the query / collection traffic

    • Impose the query-answering burden on them

    • How to join local data?


A P2P / Personal Semantic Space approach

File sharing, P2P “philosophy”:

  • Downloads a lot; too much is not a problem

  • Shares what it has downloaded

  • Uncommitted, no guarantees, join and leave at will

But for the Semantic Web:

  • Exchanges, downloads and serves “RDF” rather than “files”

  • Searches by user interests rather than file names (“Sicilian cuisine”, “Scottish pubs”)

  • Remembers (almost) everything, growing a local triplestore


SW P2P à la Napster, scenario and possibilities (2)

Storing a lot of metadata locally... why?

1) Why not? Disk space is very cheap (and it's just metadata)

2) Key enabler of global scalability!

  • “Use the Semantic Web” without direct network traffic or external computational burden

3) Maximally fast and interactive (high-speed local queries)

4) Puts your local CPUs to work!

  • Much more powerful than what a server can give you for free; allows sophisticated information processing (reasoning, filters)

  • It's your computer, “your” data

  • Personalized algorithms for rating and trust

  • Relate it to your local resources (SW desktop integration)


One size (P2P model) doesn’t fit all

Several SW P2P approaches have been proposed:

  • Centralized + Crawlers/feeds

  • Distributed queries (Edutella et al.)

  • Distributed RDF storage (RDFPeers)

These address different scenarios, not the one studied here; see the RDFGrowth paper.


“RDFGrowth” - Design essentials

In this scenario we don’t expect others to:

  • Execute external arbitrary graph queries

  • Perform an active “information hunt” for us

    • No replicating queries, no query forwarding or routing. In general, no operations that induce a non-constant burden

  • Provide a service other than in a purely “best effort” fashion

    • No uptime guarantees, no service guarantees


RDFGrowth Groups

Based on a shared definition of “interesting URIs” via a local semantic query:

Example: Beer & Breweries group

Select x where {x} <rdf:type> {<beer:Beer>}

Select x where {x} <rdf:type> {<beer:Brewery>}

Those who join will execute the query and share information about the resulting URIs
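A minimal sketch of the group idea in Python, assuming a toy in-memory triple store rather than Sesame. The `ex:` URIs, the sample data and the helper name `interesting_uris` are hypothetical and only illustrate how a shared type query could select the URIs a peer exchanges information about.

```python
# Toy local triple store: a set of (subject, predicate, object) tuples.
# All data below is made up for illustration.
RDF_TYPE = "rdf:type"

local_store = {
    ("ex:Peroni", RDF_TYPE, "beer:Beer"),
    ("ex:Guinness", RDF_TYPE, "beer:Beer"),
    ("ex:PeroniBrewery", RDF_TYPE, "beer:Brewery"),
    ("ex:Nokia1234", RDF_TYPE, "ex:Phone"),
}

# The group definition: classes whose instances count as "interesting".
beer_group_classes = ["beer:Beer", "beer:Brewery"]


def interesting_uris(store, classes):
    """Return every subject that is typed with one of the group's classes."""
    wanted = set(classes)
    return {s for (s, p, o) in store if p == RDF_TYPE and o in wanted}


group_uris = interesting_uris(local_store, beer_group_classes)
print(sorted(group_uris))
```

Each peer runs the same definition against its own store, so the resulting URI sets can differ from peer to peer; the phone is left out because its class is not part of the group.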


Information “Surrounding” a URI: RDF “Neighbours”

MSG(statement) (approximate definition): the Minimum Self-contained Graph of a statement, i.e. the “blank node closure” of the statement.

RDFN (definition): the RDFN of a resource is the graph composed of all the MSGs involving the resource itself.

Similar to the Concise Bounded Resource Description (CBRD) given in [URIQA], but it differs mainly in the use of “involves”

RDFN(URI) is the only remote query allowed in RDFGrowth
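The two definitions above can be sketched in Python as follows. This is an illustration under stated assumptions, not the DBin implementation: triples are plain tuples, blank nodes are strings starting with `_:`, and a statement is taken to “involve” a resource when the resource is its subject or object. The sample graph is made up.

```python
def is_blank(term):
    """Assumption: blank nodes are written as strings starting with '_:'."""
    return term.startswith("_:")


def msg(statement, graph):
    """Blank node closure of `statement` within `graph`."""
    closure = {statement}
    pending = [statement]
    while pending:
        s, _, o = pending.pop()
        for node in (s, o):
            if not is_blank(node):
                continue
            # Pull in every statement that shares this blank node.
            for t in graph:
                if t not in closure and node in (t[0], t[2]):
                    closure.add(t)
                    pending.append(t)
    return frozenset(closure)


def rdfn(uri, graph):
    """Union of the MSGs of all statements that involve `uri`."""
    result = set()
    for t in graph:
        if uri in (t[0], t[2]):
            result |= msg(t, graph)
    return result


graph = {
    ("ex:Peroni", "rdf:type", "beer:Beer"),
    ("ex:Peroni", "ex:review", "_:r1"),
    ("_:r1", "ex:rating", "4"),
    ("_:r1", "ex:author", "ex:Giovanni"),
    ("ex:Guinness", "rdf:type", "beer:Beer"),
}

peroni_neighbourhood = rdfn("ex:Peroni", graph)
print(len(peroni_neighbourhood))
```

The review's rating and author are pulled in through the blank node `_:r1`, while the unrelated statement about `ex:Guinness` stays out.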


Locating “News”: RDFN Hash Set

RHS(URI)=Hashes(canonicalize(RDFN(URI)))

  • Concise values exposed to the network to represent the knowledge a peer has about a URI

  • Peers looking for information about a URI use the published RHS to select whom to talk to (i.e. the most “interesting” peer)
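A rough Python sketch of how an RHS could be computed and used to pick a peer. Several details are assumptions made for illustration: one hash per MSG, SHA-1 as the hash function, sorting of triple lines as the canonical form (which only works for graphs without blank nodes), and “most interesting” meaning the peer that publishes the most hashes we do not yet hold.

```python
import hashlib


def canonicalize(msg_triples):
    """Naive canonical form: sorted, space-joined triple lines.
    Real canonicalization must also deal with blank node labels."""
    return "\n".join(sorted(" ".join(t) for t in msg_triples))


def rhs(msgs):
    """Hash set for a list of MSGs (each MSG is a collection of triples)."""
    return {
        hashlib.sha1(canonicalize(m).encode("utf-8")).hexdigest()
        for m in msgs
    }


def most_interesting_peer(local_rhs, published):
    """Pick the peer whose published RHS holds the most unknown hashes.
    `published` maps a peer name to that peer's RHS; returns None if no
    peer has anything new."""
    best_peer, best_gain = None, 0
    for peer, remote_rhs in published.items():
        gain = len(remote_rhs - local_rhs)
        if gain > best_gain:
            best_peer, best_gain = peer, gain
    return best_peer


# Hypothetical MSGs about one URI.
m1 = [("ex:Peroni", "rdf:type", "beer:Beer")]
m2 = [("ex:Peroni", "ex:brewedIn", "ex:Rome")]
m3 = [("ex:Peroni", "ex:rating", "4")]

local = rhs([m1])
published = {"peerA": rhs([m1, m2]), "peerB": rhs([m1, m2, m3])}
print(most_interesting_peer(local, published))
```

Only the hashes travel over the network in this sketch, so a peer can decide whom to contact without seeing any of the remote triples.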






A lot of pragmatic decisions

A complete Semantic Web application today means…

Deliverable integration platform

Domain application/GUI

Trust policies tools

Data flow pipeline

RDF signing methodologies

Ontology Import Policies

RDF P2P transport layer

URL Data handling (Up/Down), URI Minting

RDF Storage


URL Data Handling and URI Minting


RDF Storage


RDF/S Storage

  • Many choices!

  • We chose Sesame (SeRQL was schema aware long ago)

  • Thanks, Sesame guys! New features are being added (see trust filtering, pipelining)


RDF P2P Transport Layer


Ontology Import Policies


Ontology importing needs care

  • Ontologies have an active role (see Sesame forward inference)

  • Policies to control import and export are needed

  • Our approach: DBin will “suggest” the import of ontologies when they are discovered, but the process is never fully automated and can be reversed


RDF Signing Methodologies


The MSG theory comes in handy

From RDF blank node semantics:

  • An MSG is also the minimum unit that can be sent across a P2P network so that, once merged, the original graph is restored.

From the MSG definition:

If s and t are distinct statements and t belongs to MSG(s), then MSG(t) = MSG(s).

Each statement belongs to one and only one MSG.

A graph can be uniquely decomposed into MSGs.

Therefore the signature can be attached to a single, arbitrary triple in an MSG!
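The decomposition property can be illustrated with a small Python sketch. It assumes the same toy representation as before (tuples for triples, `_:`-prefixed strings for blank nodes); the function names and the sample graph are hypothetical.

```python
def is_blank(term):
    """Assumption: blank nodes are written as strings starting with '_:'."""
    return term.startswith("_:")


def msg(statement, graph):
    """Blank node closure of `statement` within `graph`."""
    closure = {statement}
    pending = [statement]
    while pending:
        s, _, o = pending.pop()
        for node in (s, o):
            if is_blank(node):
                for t in graph:
                    if t not in closure and node in (t[0], t[2]):
                        closure.add(t)
                        pending.append(t)
    return frozenset(closure)


def decompose(graph):
    """Split a graph into its MSGs; every statement lands in exactly one."""
    remaining = set(graph)
    parts = []
    while remaining:
        seed = next(iter(remaining))
        part = msg(seed, graph)
        parts.append(part)
        remaining -= part
    return parts


graph = {
    ("ex:Peroni", "rdf:type", "beer:Beer"),
    ("ex:Peroni", "ex:review", "_:r1"),
    ("_:r1", "ex:rating", "4"),
    ("ex:Guinness", "ex:review", "_:r2"),
    ("_:r2", "ex:rating", "5"),
}

parts = decompose(graph)
print(len(parts))
```

Because any statement of an MSG regenerates the same MSG, it does not matter which triple is chosen as the seed, which is also why a signature can hang off any single triple.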


So RDFTrustToolkit...

  • Given a URI, lists the MSGs around it

  • Given an MSG, lists and verifies existing signatures

  • Can remove existing signatures or add new ones


Signing a Minimum Self-contained Graph (MSG)

[Figure: an example MSG in which one triple is reified and signed. Labels recoverable from the diagram: mbz:artistid=15290, mus:Band, mus:Song, mus:file, mus:plays, mus:is_part_of, MD5:123123, rdf:type, rdf:statement, rdf:subject, rdf:predicate, rdf:object, dbin:Base64sigvalue (IdKtR...j4c=), dbin:X509Certificate (http://public../69..bd.pem)]

Larger MSGs lower the percentage overhead. In DBin, the signing overhead is approximately 25%.


Example (RDFTrustToolkit run)

Original MSG

<rdf:RDF
    xmlns:dbin="http://dbin.org#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://dbin.org/Home/Panaioli">
    <dbin:student>Panaioli Fabio</dbin:student>
  </rdf:Description>
</rdf:RDF>

Signed MSG

<rdf:RDF
    xmlns:dbin="http://dbin.org#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://dbin.org/Home/Panaioli">
    <dbin:student>Panaioli Fabio</dbin:student>
  </rdf:Description>
  <rdf:Description rdf:nodeID="A0">
    <rdf:predicate rdf:resource="http://dbin.org#student"/>
    <dbin:PGPCertificate>http://public.dbin.org/cont/238785872.asc</dbin:PGPCertificate>
    <dbin:Base64SigValue>MCwCFOPX….A7xIaUgBzhkjcB5w==</dbin:Base64SigValue>
    <rdf:subject rdf:resource="http://dbin.org/Home/Panaioli"/>
    <rdf:object>Panaioli Fabio</rdf:object>
    <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement"/>
  </rdf:Description>
</rdf:RDF>

Canonical Representation

[http://dbin.org/Home/Panaioli http://dbin.org#student Panaioli Fabio]

Signature, Base64 encoding

MCwCFOPX….A7xIaUgBzhkjcB5w==

A triple is reified and both the signature and a URI to a public key certificate are attached
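The reify-and-attach step can be sketched in Python as below. This is not how RDFTrustToolkit works internally: to stay within the standard library, an HMAC over a naive canonical form stands in for the real PGP or X.509 public-key signature, and the key, the function names and the blank node label `_:A0` are assumptions for illustration. Only the property and certificate URIs are taken from the example above.

```python
import base64
import hashlib
import hmac

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DBIN = "http://dbin.org#"


def canonical(msg_triples):
    """Naive canonical form: sorted, space-joined triple lines."""
    return "\n".join(sorted(" ".join(t) for t in msg_triples))


def _signature(msg_triples, secret_key):
    # HMAC-SHA256 is only a stand-in for a public-key signature.
    data = canonical(msg_triples).encode("utf-8")
    digest = hmac.new(secret_key, data, hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def sign_msg(msg_triples, secret_key, cert_url, node="_:A0"):
    """Reify one triple of the MSG and hang signature + certificate on it."""
    s, p, o = sorted(msg_triples)[0]  # any triple of the MSG would do
    signature_triples = {
        (node, RDF + "type", RDF + "Statement"),
        (node, RDF + "subject", s),
        (node, RDF + "predicate", p),
        (node, RDF + "object", o),
        (node, DBIN + "Base64SigValue", _signature(msg_triples, secret_key)),
        (node, DBIN + "PGPCertificate", cert_url),
    }
    return set(msg_triples) | signature_triples


def verify(signed, secret_key, node="_:A0"):
    """Recompute the signature over the triples that are not signature data."""
    payload = {t for t in signed if t[0] != node}
    stored = next(o for (s, p, o) in signed
                  if s == node and p == DBIN + "Base64SigValue")
    return hmac.compare_digest(stored, _signature(payload, secret_key))


original = {("http://dbin.org/Home/Panaioli", DBIN + "student", "Panaioli Fabio")}
key = b"demo-key-not-a-real-certificate"
signed = sign_msg(original, key, "http://public.dbin.org/cont/238785872.asc")
print(verify(signed, key))
```

With a real public-key scheme, `verify` would fetch the certificate named in the graph instead of sharing a secret key, but the reification structure would stay the same.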


Trust Policy Tools


From authorship to trust

Given the DS infrastructure, and given that it’s a local, personal DB repository, many solutions are possible!

Examples:

  • I trust Giovanni and Christian (only)

  • I trust whoever Giovanni and Christian trust

  • Etc.
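The two example policies can be sketched in Python like this. The policy names, the trust graph and the sample statements are hypothetical; the sketch also assumes that the second policy keeps Giovanni and Christian themselves in the trusted set.

```python
def trusted_signers(policy, my_trusted, trust_graph):
    """Return the set of signers accepted under the given policy."""
    if policy == "direct":
        # "I trust Giovanni and Christian (only)"
        return set(my_trusted)
    if policy == "one_hop":
        # "I trust whoever Giovanni and Christian trust"
        result = set(my_trusted)
        for person in my_trusted:
            result |= trust_graph.get(person, set())
        return result
    raise ValueError("unknown policy: " + policy)


def filter_by_trust(signed_statements, trusted):
    """Keep only the statements whose verified signer is trusted."""
    return [stmt for (stmt, signer) in signed_statements if signer in trusted]


my_trusted = {"Giovanni", "Christian"}
trust_graph = {"Giovanni": {"Michele"}, "Christian": {"Francesco"}}

statements = [
    ("Peroni is a lager", "Giovanni"),
    ("Peroni is brewed in Rome", "Michele"),
    ("Buy cheap data cables here", "Unknown"),
]

direct = filter_by_trust(
    statements, trusted_signers("direct", my_trusted, trust_graph))
one_hop = filter_by_trust(
    statements, trusted_signers("one_hop", my_trusted, trust_graph))
print(direct, one_hop)
```

Because the repository is local, switching policy only changes which statements are visible; nothing has to be renegotiated with other peers.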


Data Flow Pipeline


Metadata Pipeline

This P2P scenario requires a pipeline of RDF processing.

At low pipeline levels: raw growth, monotonicity, no inference

At higher levels: inference, trusted growth, information revision, filtering, etc.

Non-monotonic filters (revocation)

[Figure: pipeline diagram. Labels recoverable from the diagram: RDFGrowth P2P; Raw RDF Repository; RDFTrust filtering; User selected policies; Approved Schema Repository; RDFS inference enabled Repository; OWL, Domain rules, etc.; Even smarter Repository]
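A compact Python sketch of such a pipeline, under stated assumptions: the raw repository is a list of (triple, signer) pairs, the trust stage keeps triples from trusted signers, revocation removes triples again (the non-monotonic step), and the inference stage applies only the RDFS subclass rule. Stage names and data are hypothetical, and DBin's actual stages are built on Sesame.

```python
def trust_filter(raw, trusted):
    """Raw repository -> triples signed by a trusted party."""
    return {triple for (triple, signer) in raw if signer in trusted}


def revoke(triples, revoked):
    """Non-monotonic step: drop triples that have been revoked."""
    return triples - revoked


def rdfs_type_inference(triples):
    """Propagate rdf:type along rdfs:subClassOf until nothing new appears."""
    inferred = set(triples)
    while True:
        subclass = {(s, o) for (s, p, o) in inferred if p == "rdfs:subClassOf"}
        new = {
            (x, "rdf:type", parent)
            for (x, p, cls) in inferred if p == "rdf:type"
            for (child, parent) in subclass if child == cls
        }
        if new <= inferred:
            return inferred
        inferred |= new


raw_repository = [
    (("ex:Peroni", "rdf:type", "beer:Lager"), "Giovanni"),
    (("beer:Lager", "rdfs:subClassOf", "beer:Beer"), "Christian"),
    (("ex:Peroni", "ex:rating", "1"), "Spammer"),
]

trusted_triples = trust_filter(raw_repository, {"Giovanni", "Christian"})
kept = revoke(trusted_triples, revoked=set())
smart_repository = rdfs_type_inference(kept)
print(sorted(smart_repository))
```

Keeping the raw level free of inference means that a later revocation or policy change only requires re-running the upper stages.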


Domain Application GUI


User interface

Anything but a detail!

  • As we’re trying to “deliver” a SW tool for regular people, an unusable tool means failure

  • More complex than simple “semantic web browsing”:

    • Editing must be taken into consideration

    • Filtering, revocations, ontologies and P2P must be harmonized at the user level by an appropriate facade


DBin “domain applications”: brainlets

  • A single, downloadable domain-specific application that runs on top of DBin

  • Brainlet creation does NOT require programming knowledge, just XML editing

    • Communities can be started by domain experts rather than SW hackers!


DBin “domain applications”: brainlets

  • A single, downloadable package containing:

    • The setup information for the RDFGrowth transport layer

    • The ontologies to be used for annotations in the domain (e.g. the beer ontology)

    • A general GUI layout: which components to visualize (e.g. a message board, an ontology browser, a “detail” view) and how they're cascaded in terms of selection/reaction

    • Templates for domain-specific “annotations”, e.g. a “movie review template”

    • Templates for readily available, “pre-cooked” domain queries: structurally complex domain queries with only a few simple free parameters

    • A suggested trust model and information filtering rules for the domain, e.g. public keys of well-known “founding members” or authorities, preset “browsing levels”

    • Support material: customized icons, help files, etc.

    • A basic RDF knowledge package


DBin eats this... (+ annotation ontologies)

<Brainlet name="Beer" author="Onofrio Panzarino" version="1.0">
  <Ontology file="brainlet/beer.owl"/>
  <GUED name="Beer">
    <Topic name="Beers" uri="http://www.purl.org/net/ontology/beer#Beer">
      <Child
          query="SELECT X FROM {X} &lt;rdfs:subClassOf&gt; {$parent} WHERE X != $parent"
          recursive="true">
        <Child subjectBy="rdf:type" icon="/icons/beer.gif"/>
      </Child>
      <Child subjectBy="rdf:type" icon="/icons/beer.gif"/>
    </Topic>
    <Topic name="Ingredients" uri="http://www.purl.org/net/ontology/beer#Ingredient">
      <Child
          query="SELECT X FROM {X} &lt;rdfs:subClassOf&gt; {$parent} WHERE X != $parent"
          recursive="true">
        <Child subjectBy="rdf:type"/>
      </Child>
      <Child subjectBy="rdf:type"/>
    </Topic>
  </GUED>
  <View id="Focus" />
  <View id="GUEDNavigator" title="BeerNavigator" icon="icons/nav.gif" selecterFor="main" />
  <View id="Comments" title="Comments" listenTo="main" selecterFor="comments" />
  <View id="Comment" title="Details" listenTo="comments" />
  <View id="Gallery" listenTo="main" />
</Brainlet>
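To show how little is needed to read such a descriptor, here is a small Python sketch that parses an abridged copy of the Beer brainlet with the standard library. Treating `selecterFor` and `listenTo` as named selection channels that connect views is an interpretation made for this example, not documented DBin behaviour, and the `wiring` structure is hypothetical.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the descriptor; nested Child elements are omitted.
BRAINLET_XML = """\
<Brainlet name="Beer" author="Onofrio Panzarino" version="1.0">
  <Ontology file="brainlet/beer.owl"/>
  <GUED name="Beer">
    <Topic name="Beers" uri="http://www.purl.org/net/ontology/beer#Beer"/>
    <Topic name="Ingredients" uri="http://www.purl.org/net/ontology/beer#Ingredient"/>
  </GUED>
  <View id="Focus" />
  <View id="GUEDNavigator" title="BeerNavigator" icon="icons/nav.gif" selecterFor="main" />
  <View id="Comments" title="Comments" listenTo="main" selecterFor="comments" />
  <View id="Comment" title="Details" listenTo="comments" />
  <View id="Gallery" listenTo="main" />
</Brainlet>
"""

root = ET.fromstring(BRAINLET_XML)

ontologies = [o.get("file") for o in root.findall("Ontology")]
topics = {t.get("name"): t.get("uri") for t in root.iter("Topic")}


def channel(wiring, name):
    """Fetch (or create) the record for one selection channel."""
    return wiring.setdefault(name, {"selector": None, "listeners": []})


# For each channel: which view drives the selection, which views react to it.
wiring = {}
for view in root.findall("View"):
    if view.get("selecterFor"):
        channel(wiring, view.get("selecterFor"))["selector"] = view.get("id")
    if view.get("listenTo"):
        channel(wiring, view.get("listenTo"))["listeners"].append(view.get("id"))

print(root.get("name"), ontologies, sorted(topics))
print(wiring)
```

Read this way, selecting a beer in the navigator updates the comment list and the gallery, and selecting a comment updates the detail view.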



DBin

Based on the Eclipse RCP, so:

Looks nice

Multiplatform

Completely plug-in based

Lots of possible plugins

Open source

Demo time!


Conclusions

  • It’s RDF for the masses!

  • DBin is an early tool to explore this scenario; we don’t claim it’s fit for the real task yet, notably:

    • Performance issues

    • Real world hardening

    • Usability testing in real communities

  • There are many alternatives to each of the blocks

  • A lot of cool ideas are within a plugin's reach (Semantic desktop integration, maps, WS integration, etc.)

  • Hurray! 


    Thanks for your attention

    Get DBin at http://www.dbin.org

