Search Engines for Semantic Web Knowledge. Tim Finin University of Maryland, Baltimore County Joint work with Li Ding, Anupam Joshi, Yun Peng, Pranam Kolari, Pavan Reddivari, Sandor Dornbush, Rong Pan, Akshay Java, Joel Sachs, Scott Cost and Vishal Doshi.


Search Engines for Semantic Web Knowledge

Tim Finin

University of Maryland, Baltimore County

Joint work with Li Ding, Anupam Joshi, Yun Peng, Pranam Kolari, Pavan Reddivari, Sandor Dornbush, Rong Pan, Akshay Java, Joel Sachs, Scott Cost and Vishal Doshi

 http://creativecommons.org/licenses/by-nc-sa/2.0/ This work was partially supported by DARPA contract F30602-97-1-0215, NSF grants CCR007080 and IIS9875433 and grants from IBM, Fujitsu and HP.



This talk

  • Motivation

  • Semantic web 101

  • Swoogle Semantic Web search engine

  • Use cases and applications

  • State of the Semantic Web

  • Conclusions



Google has made us smarter


But what about our agents?

[Figure: software agents, with message labels "tell" and "register"]

Agents still have a very minimal understanding of text and images.



XML helps

“XML is Lisp's bastard nephew, with uglier syntax and no semantics. Yet XML is poised to enable the creation of a Web of data that dwarfs anything since the Library at Alexandria.”

-- Philip Wadler, Et tu, XML? The downfall of the relational empire, VLDB, Rome, September 2001.



Semantic Web adds semantics

“The Semantic Web will globalize KR, just as the WWW globalized hypertext”

-- Tim Berners-Lee


<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:uni="http://ebiquity.umbc.edu/ontologies/uni/">
  <uni:Student>
    <foaf:name>Li Ding</foaf:name>
    <foaf:mbox rdf:resource="mailto:[email protected]"/>
  </uni:Student>
</rdf:RDF>

[Figure: RDF graph for the example. A node with rdf:type uni:Student has a foaf:name arc to the literal "Li Ding".]

Semantic Web 101

  • RDF/XML

  • rdf:RDF tag

  • namespaces  ontologies

  • Semantic graph, URIs as nodes & links

  • triples
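To make the step from RDF/XML to triples concrete, here is a minimal sketch using only Python's standard library. It handles just the simple "typed node" pattern used in the example document, not the full RDF/XML grammar, and the mailbox address in the sample is a placeholder assumed for illustration.

```python
import xml.etree.ElementTree as ET
from itertools import count

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Sample modelled on the slide; the mailbox below is a made-up placeholder.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:uni="http://ebiquity.umbc.edu/ontologies/uni/">
  <uni:Student>
    <foaf:name>Li Ding</foaf:name>
    <foaf:mbox rdf:resource="mailto:someone@example.org"/>
  </uni:Student>
</rdf:RDF>
"""


def expand(tag):
    """Turn ElementTree's '{namespace}local' form into a full URI."""
    return tag[1:].replace("}", "", 1) if tag.startswith("{") else tag


def triples_from_rdfxml(text):
    """Extract (subject, predicate, object) triples from simple RDF/XML."""
    root = ET.fromstring(text.encode("utf-8"))
    blank_ids = count(1)
    triples = []
    for node in root:  # each child of rdf:RDF describes one resource
        # Use rdf:about if present, otherwise invent a blank node label.
        subject = node.get(f"{{{RDF_NS}}}about") or f"_:b{next(blank_ids)}"
        node_type = expand(node.tag)
        if node_type != RDF_NS + "Description":
            # A typed node such as <uni:Student> implies an rdf:type triple.
            triples.append((subject, RDF_NS + "type", node_type))
        for prop in node:
            obj = prop.get(f"{{{RDF_NS}}}resource")
            if obj is None:
                # No rdf:resource, so the object is a literal (kept in quotes).
                obj = '"' + (prop.text or "").strip() + '"'
            triples.append((subject, expand(prop.tag), obj))
    return triples


if __name__ == "__main__":
    for triple in triples_from_rdfxml(SAMPLE):
        print(triple)
```

Each namespace prefix expands to a full URI, so the nodes and links of the semantic graph are URIs, and the one XML element becomes three triples about a single blank node.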


But what about our agents?

[Figure: software agents and Swoogle, with message labels "tell" and "register"]

A Google for knowledge on the Semantic Web is needed by software agents and programs



  • http://swoogle.umbc.edu/

  • Running since summer 2004

  • 1.4M RDF documents, 250M RDF triples, 10K ontologies


Swoogle Architecture

[Figure: Swoogle architecture diagram, with arrows showing information flow. Labeled components:]

  • Discovery: SwoogleBot, Bounded Web Crawler, Google Crawler, candidate URLs

  • Analysis: SWD classifier, Ranking

  • Index: IR Indexer, SWD Indexer

  • Search Services: Web Server (html, for humans), Web Service (rdf/xml, for machines)

  • Data: the Web, document cache, Semantic Web metadata


A Hybrid Harvesting Framework

[Figure: hybrid harvesting framework diagram. Labeled components:]

  • Seed sources: manual submission, inductive learner, Swoogle sample dataset

  • Seeds M: meta crawling (Google API call)

  • Seeds H: bounded HTML crawling

  • Seeds R: RDF crawling

  • The crawlers crawl the Web



Applications and use cases

  • Supporting Semantic Web developers

    • Ontology designers, vocabulary discovery, who’s using my ontologies or data?, use analysis, errors, statistics, etc.

  • Searching specialized collections

    • Spire: aggregating observations and data from biologists

    • InferenceWeb: searching over and enhancing proofs

    • SemNews: Text Meaning of news stories

  • Supporting SW tools

    • Triple shop: finding data for SPARQL queries



80 ontologies were found that had these three terms

By default, ontologies are ordered by their ‘popularity’, but they can also be ordered by recency or size.

Let’s look at this one



Basic Metadata

hasDateDiscovered:  2005-01-17

hasDatePing:  2006-03-21

hasPingState:  PingModified

type:  SemanticWebDocument

isEmbedded:  false

hasGrammar:  RDFXML

hasParseState:  ParseSuccess

hasDateLastmodified:  2005-04-29

hasDateCache:  2006-03-21

hasEncoding:  ISO-8859-1

hasLength:  18K

hasCntTriple:  311.00

hasOntoRatio:  0.98

hasCntSwt:  94.00

hasCntSwtDef:  72.00

hasCntInstance:  8.00



These are the namespaces this ontology uses. Clicking on one shows all of the documents using the namespace.

All of this is available in RDF form for the agents among us.



Here’s what the agent sees. Note the swoogle and wob (web of belief) ontologies.



We can also search for terms (classes, properties) like terms for “person”.



10K terms associated with “person”! Ordered by use.

Let’s look at foaf:Person’s metadata



UMBC Triple Shop

  • http://sparql.cs.umbc.edu/

  • Online SPARQL RDF query processing based on HP’s Jena and Joseki, with several interesting features

  • Selectable level of inference over model

  • Automatically finds Semantic Web documents (SWDs) for given queries using the Swoogle backend database

    • Provide dataset creation wizard

    • Dataset can be stored on our server or downloaded

    • Tag, share and search over saved datasets


Web-scale semantic web data access

[Figure: message exchange between an agent and a data access service that indexes RDF data from the Web. Steps shown:]

  • Agent: ask (“person”)

  • Service: search vocabulary (search URIrefs in SW vocabulary), then inform (“foaf:Person”)

  • Agent: compose query, then ask (“?x rdf:type foaf:Person”)

  • Service: search URLs in SWD index, then inform (doc URLs)

  • Agent: fetch docs, populate RDF database, query local RDF database
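The exchange above can be sketched as a small in-memory simulation. This is an illustration only: the class, method names, URLs, and sample data are all hypothetical, and none of it reflects Swoogle's actual API.

```python
from collections import defaultdict

# Hypothetical stand-in for the Web: document URL -> triples it contains.
WEB = {
    "http://example.org/a.rdf": [
        ("ex:alice", "rdf:type", "foaf:Person"),
        ("ex:alice", "foaf:name", "Alice"),
    ],
    "http://example.org/b.rdf": [("ex:bob", "rdf:type", "foaf:Person")],
    "http://example.org/c.rdf": [("ex:doc1", "rdf:type", "foaf:Document")],
}


class DataAccessService:
    """Toy index over RDF documents (assumed design, not Swoogle's)."""

    def __init__(self, web):
        self.term_index = defaultdict(set)  # keyword -> vocabulary terms
        self.doc_index = defaultdict(set)   # vocabulary term -> document URLs
        for url, triples in web.items():
            self.index(url, triples)

    def index(self, url, triples):
        """Index RDF data: record which vocabulary terms a document uses."""
        for _subject, predicate, obj in triples:
            terms = [predicate, obj] if predicate == "rdf:type" else [predicate]
            for term in terms:
                keyword = term.split(":", 1)[-1].lower()
                self.term_index[keyword].add(term)
                self.doc_index[term].add(url)

    def search_vocabulary(self, keyword):
        """ask("person") -> inform(matching terms)."""
        return sorted(self.term_index.get(keyword.lower(), ()))

    def search_documents(self, term):
        """ask("?x rdf:type <term>") -> inform(doc URLs)."""
        return sorted(self.doc_index.get(term, ()))


def find_instances(service, web, keyword):
    """Agent side: find a term, find documents, fetch them, query locally."""
    terms = service.search_vocabulary(keyword)
    if not terms:
        return []
    term = terms[0]                                    # compose query
    urls = service.search_documents(term)
    local_db = [t for url in urls for t in web[url]]   # fetch docs, populate
    return sorted(s for s, p, o in local_db if p == "rdf:type" and o == term)


if __name__ == "__main__":
    service = DataAccessService(WEB)
    print(find_instances(service, WEB, "person"))
```

The point of the split is that the service only answers "which terms?" and "which documents?"; the agent does the actual querying against its own local store.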



Who knows Anupam Joshi?

Show me their names, email addresses and pictures



The UMBC ebiquity site publishes lots of RDF data, including FOAF profiles



No FROM clause!

Constraints on where the data comes from


PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT DISTINCT ?p2name ?p2mbox ?p2pix
WHERE {
  ?p1 foaf:name "Anupam Joshi" .
  ?p1 foaf:mbox ?p1mbox .
  ?p2 foaf:knows ?p3 .
  ?p3 foaf:mbox ?p1mbox .
  ?p2 foaf:name ?p2name .
  ?p2 foaf:mbox ?p2mbox .
  OPTIONAL { ?p2 foaf:depiction ?p2pix } .
}
ORDER BY ?p2name
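To show how the query above joins people through a shared mailbox, here is a minimal pattern-matching sketch in plain Python. It is not how Jena or any SPARQL engine is implemented, and all of the data (names, mailboxes, picture URL) is invented for illustration.

```python
# Hypothetical data merged from several FOAF files; every value is made up.
DATA = [
    ("_:a", "foaf:name", "Anupam Joshi"),
    ("_:a", "foaf:mbox", "mailto:joshi@example.org"),
    ("_:alice", "foaf:name", "Alice Example"),
    ("_:alice", "foaf:mbox", "mailto:alice@example.org"),
    ("_:alice", "foaf:depiction", "http://example.org/alice.jpg"),
    ("_:alice", "foaf:knows", "_:x1"),
    ("_:x1", "foaf:mbox", "mailto:joshi@example.org"),
    ("_:bob", "foaf:name", "Bob Example"),
    ("_:bob", "foaf:mbox", "mailto:bob@example.org"),
    ("_:bob", "foaf:knows", "_:x2"),
    ("_:x2", "foaf:mbox", "mailto:joshi@example.org"),
    ("_:carol", "foaf:name", "Carol Example"),
    ("_:carol", "foaf:mbox", "mailto:carol@example.org"),
    ("_:carol", "foaf:knows", "_:bob"),
]

REQUIRED = [
    ("?p1", "foaf:name", "Anupam Joshi"),
    ("?p1", "foaf:mbox", "?p1mbox"),
    ("?p2", "foaf:knows", "?p3"),
    ("?p3", "foaf:mbox", "?p1mbox"),
    ("?p2", "foaf:name", "?p2name"),
    ("?p2", "foaf:mbox", "?p2mbox"),
]
OPTIONAL = [("?p2", "foaf:depiction", "?p2pix")]


def match_pattern(pattern, triple, binding):
    """Extend binding so pattern matches triple; return None on a mismatch."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def match_all(patterns, triples, binding=None):
    """Yield every binding that satisfies all patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    for triple in triples:
        extended = match_pattern(patterns[0], triple, binding)
        if extended is not None:
            yield from match_all(patterns[1:], triples, extended)


def who_knows(triples):
    rows = set()  # a set gives DISTINCT
    for binding in match_all(REQUIRED, triples):
        # OPTIONAL: keep the row even when no depiction is found.
        extended = list(match_all(OPTIONAL, triples, binding)) or [binding]
        for b in extended:
            rows.add((b["?p2name"], b["?p2mbox"], b.get("?p2pix")))
    return sorted(rows, key=lambda row: row[0])  # ORDER BY ?p2name


if __name__ == "__main__":
    for row in who_knows(DATA):
        print(row)
```

Note that ?p3 and ?p1 need not be the same node: they are linked only by having the same foaf:mbox value, which is what lets descriptions from different files be joined.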



Swoogle found 292 RDF data files that appear relevant to answering our query



Let’s save the dataset before we use it



And tag it so we and others can find it more easily.



Here we are using it to get an answer to “Who knows Anupam Joshi”



He has many friends!



Will it Scale? How?

Here’s a rough estimate of the data in RDF documents on the semantic web based on Swoogle’s crawling

We think Swoogle’s centralized approach can be made to work for the next few years if not longer.



How much reasoning?

  • SwoogleN (N<=3) does limited reasoning

    • It’s expensive

    • It’s not clear how much should be done

  • More reasoning would benefit many use cases

    • e.g., type hierarchy

  • Recognizing specialized metadata

    • E.g., that ontology A maps some terms from B to C
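As one illustration of the type-hierarchy reasoning mentioned above, the following sketch infers extra rdf:type triples by following rdfs:subClassOf links. It is a minimal example, not Swoogle's implementation; the uni:Student subclass link and the ex:liding resource are assumptions made for the example.

```python
# Hypothetical input: two subclass links and one typed instance.
DATA = [
    ("uni:Student", "rdfs:subClassOf", "foaf:Person"),  # assumed for illustration
    ("foaf:Person", "rdfs:subClassOf", "foaf:Agent"),
    ("ex:liding", "rdf:type", "uni:Student"),
]


def superclasses(cls, sub_class_of):
    """Return all direct and inherited superclasses of cls."""
    seen, stack = set(), [cls]
    while stack:
        for parent in sub_class_of.get(stack.pop(), ()):
            if parent not in seen:  # also guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen


def infer_types(triples):
    """Add (x rdf:type C2) whenever x has type C1 and C1 is a subclass of C2."""
    sub_class_of = {}
    for s, p, o in triples:
        if p == "rdfs:subClassOf":
            sub_class_of.setdefault(s, set()).add(o)
    inferred = set(triples)
    for s, p, o in triples:
        if p == "rdf:type":
            for parent in superclasses(o, sub_class_of):
                inferred.add((s, "rdf:type", parent))
    return inferred


if __name__ == "__main__":
    for triple in sorted(infer_types(DATA) - set(DATA)):
        print(triple)
```

With this kind of inference, a search for instances of foaf:Person would also return resources that are only declared as uni:Student. The cost is that every type assertion can expand into several triples, which is part of why the amount of reasoning is a trade-off.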



Conclusion

  • The web will contain the world’s knowledge in forms accessible to people and computers

    • We need better ways to discover, index, search and reason over SW knowledge

  • SW search engines address different tasks than html search engines

    • So they require different techniques and APIs

  • Swoogle-like systems can help create consensus ontologies and foster best practices

    • Swoogle is for Semantic Web 1.0

    • Semantic Web 2.0 will make different demands



For more information

http://ebiquity.umbc.edu/

Annotated in OWL

