
The Semantic Web in use: Analyzing FOAF Documents



Li Ding, Lina Zhou, Tim Finin and Anupam Joshi

University of Maryland, Baltimore County

DARPA contract F30602-00-0591 and NSF awards ITR-IIS-0326460 and ITR-IIS-0325464 provided partial research support for this work.



Outline

  • Motivation

  • Introduction

    • The six popular ontologies

    • FOAF vocabulary

    • Why FOAF

  • Building FOAF Document collection

    • FOAF Document Identification

    • FOAF Document Discovery

    • Popular Properties of foaf:Person

  • Applications

    • Personal Information Fusion

    • Social Network Analysis


The Semantic Web

  • The semantic web vision is that information and services are described using shared ontologies in KR-like markup languages, making them accessible to machines (programs).

  • How do we get there?

    • What kind of ontologies? IEEE SUO? Cyc?

    • What kind of languages? RDF? OWL? RuleML?

  • It’s reasonable to start with the simple and move toward the complex

    • From Dublin Core to CYC

    • From RDF to OWL and beyond

  • Significant semantic web content exists today

    • Using simple vocabularies (e.g., FOAF) and RDF/RDFS


The Semantic Web

  • The more important word in “Semantic Web” is the latter

  • The KR aspects of the SW were taken off the shelf, the result of 25 years of research done in the AI community

  • Remember hypertext? It was a nice research backwater going back to the 50’s (recall Memex and Xanadu)

    • Hypertext was forever changed by the Web

    • So maybe the web will forever change KR

  • TBL: “The Semantic Web will globalize KR, just as the WWW globalized hypertext”


Web of what?

  • What features does the web bring to the table?

  • “Anyone can say anything about anything”

  • The meaning of RDF terms will be (partly) determined socially

  • It’s a web of documents, services, agents and people


What kind of Ontologies?

[Figure: the ontology spectrum, after Deborah L. McGuinness (Stanford)]

  • Levels, from informal to formal: catalog/ID; terms/glossary; thesauri (“narrower term” relation); informal is-a; formal is-a; formal instance; frames (properties); value restriction; disjointness, inverse, part of…; general logical constraints

  • Groupings marked on the spectrum: vocabularies, taxonomies, simple ontologies, expressive ontologies

  • Examples placed along the spectrum: CYC, DB Schema, UMLS, RDF, RDFS, DAML, Wordnet, OO, OWL, IEEE SUO

  • The figure also marks a “space of interest”


The Semantic Web Today

  • There are several simple RDF vocabularies that are widely used today

    • Dublin Core

    • RSS

    • FOAF

  • It’s instructive to study how these are being used today

  • And to track how their usage changes


The Six Most Popular Ontologies

RDF

DC

RSS

MCVB

FOAF

RDFS

The statistics were generated by Swoogle (http://swoogle.umbc.edu)


A use case: FOAF

  • FOAF (Friend of a Friend) is a simple ontology to describe people and their social networks.

    • See the foaf project page: http://www.foaf-project.org/

  • We recently crawled the web and discovered over 1,500,000 valid RDF FOAF files.

    • Most of these are from several blogging systems that encode basic user info in FOAF

    • See http://apple.cs.umbc.edu/semdis/wob/foaf/

<foaf:Person>

<foaf:name>Tim Finin</foaf:name>

<foaf:mbox_sha1sum>2410…37262c252e</foaf:mbox_sha1sum>

<foaf:homepage rdf:resource="http://umbc.edu/~finin/" />

<foaf:img rdf:resource="http://umbc.edu/~finin/images/passport.gif" />

</foaf:Person>
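The foaf:mbox_sha1sum property in the snippet above identifies a mailbox without publishing the address itself. A minimal Python sketch of how such a value can be computed, assuming (as the FOAF vocabulary describes) that it is the SHA-1 hex digest of the full mailto: URI; the function name and the address are hypothetical placeholders:

```python
import hashlib

def mbox_sha1sum(email: str) -> str:
    """Return the SHA-1 hex digest of the mailto: URI for an address."""
    # Accept either a bare address or a full mailto: URI.
    uri = email if email.startswith("mailto:") else "mailto:" + email
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()

# Hypothetical address, used only for illustration.
print(mbox_sha1sum("alice@example.org"))
```

Two documents that carry the same digest can be matched to one mailbox even though neither reveals the address.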


FOAF vocabulary

http://xmlns.com/foaf/0.1/


FOAF: why RDF? Extensibility!

  • FOAF vocabulary provides 50+ basic terms for making simple claims about people

  • FOAF files can use other RDF terms too: RSS, MusicBrainz, Dublin Core, Wordnet, Creative Commons, blood types, starsigns, …

  • RDF guarantees freedom of independent extension

    • OWL provides fancier data-merging facilities 

  • Result: Freedom to say what you like, using any RDF markup you want, and have RDF crawlers merge your FOAF documents with others’ and know when you’re talking about the same entities.

After Dan Brickley


No free lunch!

Consequence:

  • We must plan for lies, mischief, mistakes, stale data, slander

  • Dataset is out of control, distributed, dynamic

  • Importance of knowing who-said-what

    • Anyone can describe anyone

    • We must record data provenance

    • Modeling and reasoning about trust is critical

  • Legal, privacy and etiquette issues emerge

  • Welcome to the real world

After Dan Brickley


FOAF example using XML

<rdf:RDF

xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"

xmlns:foaf="http://xmlns.com/foaf/0.1/">

<foaf:Person>

<foaf:name>Tim Finin</foaf:name>

<foaf:mbox rdf:resource="mailto:[email protected]"/>

</foaf:Person>

</rdf:RDF>


FOAF example using XML

<foaf:Person>

<foaf:name>Tim Finin</foaf:name>

<foaf:mbox rdf:resource="mailto:[email protected]"/>

<foaf:nick>Tim</foaf:nick>

<foaf:homepage rdf:resource="http://umbc.edu/~finin/"/>

<foaf:img rdf:resource= "http://umbc.edu/~finin/passport.gif"/>

</foaf:Person>


FOAF example using XML

<foaf:Person>

<foaf:name>Tim Finin</foaf:name>

<foaf:knows>

<foaf:Person>

<foaf:name>Anupam Joshi</foaf:name>

<rdfs:seeAlso rdf:resource="http://umbc.edu/~joshi/joshi.foaf"/>

</foaf:Person>

</foaf:knows>

</foaf:Person>


FOAF isn’t the only one

  • Other ontologies are used to publish social information

  • Swoogle finds more than 360 RDFS or OWL classes with the local name “person.”


Lots of FOAF tools


Why FOAF

  • Information Creators

    • Community membership management

    • Unique Person Identification (privacy preserved)

    • Indicating Authorship

  • Information Consumers

    • Provenance tracking

    • Social networking

      • Expose community information to newcomers

      • Match interests

    • Trust building block


Studying how FOAF is being used

  • What counts as a FOAF document?

  • How can we find FOAF documents?


Identify a FOAF document

  • D is a generic FOAF document when conditions 1, 2 and 3 are met

  • D is a strict FOAF document when conditions 1, 2, 3 and 4 are met

(1) D is an RDF document.

(2) D uses the FOAF namespace.

(3) The RDF graph serialized by D contains the sub-graph shown in the figure.

(4) D defines one and only one foaf:Person instance.

[Figure: sub-graph in which a node X has rdf:type foaf:Person and is connected to a node Z by a FOAF property foaf:Y]
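The four conditions on this slide can be sketched as a check over parsed triples. This is an illustration only, under two assumptions that are not from the talk: successfully parsed triples stand in for condition 1, and the sub-graph pattern is read as "X is the subject of some foaf:Y". The function name `classify_foaf` is hypothetical.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def classify_foaf(triples):
    """Classify a parsed RDF document as 'strict', 'generic' or None.

    `triples` is an iterable of (subject, predicate, object) strings.
    """
    triples = list(triples)
    # Condition 2: some term in the document comes from the FOAF namespace.
    uses_foaf = any(term.startswith(FOAF) for t in triples for term in t)
    # Resources typed as foaf:Person.
    persons = {s for s, p, o in triples
               if p == RDF_TYPE and o == FOAF + "Person"}
    # Condition 3 (assumed direction): a person X is the subject of a foaf:Y.
    has_pattern = any(s in persons and p.startswith(FOAF)
                      for s, p, o in triples)
    if not (triples and uses_foaf and has_pattern):
        return None
    # Condition 4: exactly one Person instance.
    return "strict" if len(persons) == 1 else "generic"

doc = [
    ("_:a", RDF_TYPE, FOAF + "Person"),
    ("_:a", FOAF + "name", "Example Person"),  # made-up data
]
print(classify_foaf(doc))
```

The function returns the most specific label, so a document reported as strict also satisfies the generic definition.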


Different FOAF collections

  • DS-Swoogle

    • FOAF documents selected from Swoogle’s database of ~340K semantic web documents

    • Swoogle selects at most 1000 documents from any site

  • DS-FOAF

    • A custom crawler found 1.5M FOAF documents, most of them from a few large blog sites (e.g., LiveJournal)

  • DS-FOAF-Small

    • Subset of ~7K non-blog FOAF documents from ~1K sites defining ~37K people


FOAF document discovery

  • Bootstrap: using a web search engine (found 10,000 documents)

  • Discovery: following rdfs:seeAlso links (found 1.5M documents)

Top 7 FOAF websites
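The bootstrap-then-discover approach on this slide amounts to a graph traversal over rdfs:seeAlso links. A minimal sketch, assuming a breadth-first traversal and an in-memory link table in place of real fetching and parsing; `discover`, `see_also` and the example.org URLs are hypothetical, and the talk does not say which traversal order the crawler actually used:

```python
from collections import deque

def discover(seeds, see_also):
    """Breadth-first discovery of documents reachable via rdfs:seeAlso.

    `seeds` are the bootstrap URLs; `see_also` maps a URL to the URLs its
    rdfs:seeAlso statements point to (a stand-in for fetch and parse).
    """
    found = set(seeds)
    queue = deque(seeds)
    while queue:
        url = queue.popleft()
        for target in see_also.get(url, ()):
            if target not in found:   # visit each document only once
                found.add(target)
                queue.append(target)
    return found

# Hypothetical link structure for illustration.
links = {
    "http://example.org/a.rdf": ["http://example.org/b.rdf"],
    "http://example.org/b.rdf": ["http://example.org/a.rdf",
                                 "http://example.org/c.rdf"],
}
print(sorted(discover(["http://example.org/a.rdf"], links)))
```

The `found` set keeps the crawl from looping when documents point back at each other, which is common when friends link to one another.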


From DS-Swoogle

  • 17 SWDs add to the definition of foaf:Person

    • e.g., defining superclasses, disjointness, etc.

  • 162 properties are defined for foaf:Person

    • e.g., properties whose domain is foaf:Person

  • 74 properties defined as relations between people

    • e.g., properties with both domain and range of foaf:Person

  • 582 properties used

    • e.g., used to assert something of a foaf:Person instance


Popular properties of foaf:Person

Top 10 popular properties (per document)

*DS-FOAF-SMALL is a new dataset from October 2004, based on 7,276 evenly sampled documents.


Popular properties of foaf:Person

Top 10 popular properties (per instance)

*DS-FOAF-SMALL is a new dataset from October 2004, based on 7,276 evenly sampled documents.
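The two rankings differ only in the unit being counted. A rough sketch of the distinction, assuming a property counts once for every document (or every person instance) that uses it at least once; this counting rule, the function name and the sample data are assumptions, not taken from the talk:

```python
from collections import Counter

def property_popularity(docs):
    """Count property usage per document and per instance.

    `docs` maps a document URL to a list of person instances; each
    instance is the list of property names asserted for that person.
    """
    per_doc, per_instance = Counter(), Counter()
    for instances in docs.values():
        used_in_doc = set()
        for props in instances:
            unique = set(props)
            per_instance.update(unique)   # once per instance
            used_in_doc |= unique
        per_doc.update(used_in_doc)       # once per document
    return per_doc, per_instance

# Made-up sample: two documents, three person instances.
sample = {
    "d1": [["foaf:name", "foaf:knows"], ["foaf:name"]],
    "d2": [["foaf:name", "foaf:mbox_sha1sum"]],
}
per_doc, per_instance = property_popularity(sample)
print(per_doc.most_common(), per_instance.most_common())
```

A document that lists many friends inflates the per-instance count of a property without changing its per-document count, which is why the two rankings can disagree.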


Extracting social networks

Three steps

  • Discovering FOAF instances

  • Merging instances representing the same person

  • Linking people via foaf:knows and other FOAF-based relations

    • e.g., quaffing:drankBeerWith

  • Integrating other SNA data

    • e.g., co-author relationships mined from CiteSeer


Merging instances

  • Named instances

  • Inverse functional properties

  • Set of nearly inverse functional properties

  • OWL constraints

  • rdfs:seeAlso
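Merging on inverse functional properties can be sketched with a union-find structure: two instances that share a value for such a property are treated as the same person. The sketch below is an illustration, not the authors' implementation; it assumes foaf:mbox_sha1sum and foaf:homepage as the identifying properties, and the function name and sample data are hypothetical.

```python
IFPS = ("foaf:mbox_sha1sum", "foaf:homepage")  # assumed identifying properties

def fuse(instances):
    """Group instance ids that share a value for any property in IFPS.

    `instances` maps an instance id to a dict of property -> value.
    Returns a list of sets, one per fused person.
    """
    parent = {i: i for i in instances}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    first_seen = {}                          # (property, value) -> instance id
    for inst, props in instances.items():
        for prop in IFPS:
            value = props.get(prop)
            if value is None:
                continue
            key = (prop, value)
            if key in first_seen:
                parent[find(inst)] = find(first_seen[key])   # same person
            else:
                first_seen[key] = inst

    groups = {}
    for inst in instances:
        groups.setdefault(find(inst), set()).add(inst)
    return list(groups.values())

# Made-up instances: a, b and c are linked through shared values.
people = {
    "a": {"foaf:name": "P1", "foaf:mbox_sha1sum": "x1"},
    "b": {"foaf:homepage": "http://example.org/~p1", "foaf:mbox_sha1sum": "x1"},
    "c": {"foaf:homepage": "http://example.org/~p1"},
    "d": {"foaf:name": "P2"},
}
print(sorted(sorted(group) for group in fuse(people)))
```

Because fusion is transitive, a single wrong value is enough to merge unrelated people, so a real pipeline would also need the provenance tracking the talk calls for.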


Collecting Personal Information

http://www-2.cs.cmu.edu/People/fgandon/foaf.rdf

http://www.cs.umbc.edu/~dingli1/foaf.rdf


Caution: Collision? Mistake!

http://www.ilrt.bris.ac.uk/people/cmdjb/webwho.xrdf

http://www.mindswap.org/~katz/2002/11/jordan.foaf


SNA1: Instances of foaf:Person/doc

  • Zipf’s distribution

  • Sloppy tail: a few FOAF documents contain thousands of instances

Cumulative distribution
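A cumulative distribution of this kind can be computed directly from the per-document instance counts. The sketch below assumes the plot shows, for each count k, the fraction of documents with at least k instances; that reading of "cumulative", the function name and the sample counts are assumptions.

```python
from collections import Counter

def cumulative_distribution(values):
    """Return [(k, fraction of items with value >= k)] for each distinct k."""
    counts = Counter(values)
    total = len(values)
    result = []
    remaining = total                 # items with value >= current k
    for k in sorted(counts):
        result.append((k, remaining / total))
        remaining -= counts[k]
    return result

# Made-up counts of foaf:Person instances per document.
print(cumulative_distribution([1, 1, 1, 2, 2, 5]))
```

Plotted on log-log axes, a Zipf-like distribution appears as a roughly straight line, with any irregular tail visible at the large-k end.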


SNA2: Instances of foaf:Person/group

A group refers to a fused person

  • Zipf’s distribution

  • Sloppy tail: some instances are wrongly fused due to incorrect FOAF documents

Cumulative distribution


Degree analysis

  • For social networks, the in-degree and out-degree of a person are of interest

  • They can be used to identify hubs and authorities or to compute other interesting properties or rankings

  • Analysis of most large social networks reveals that in-degree and out-degree follow a power law or Zipf distribution

  • We found that to be the case for social networks induced by FOAF documents.
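In-degree and out-degree follow directly from the directed foaf:knows links. A minimal sketch, assuming each link is given as a (source, target) pair and that repeated statements of the same link count once; the names and sample edges are hypothetical:

```python
from collections import Counter

def degrees(knows_edges):
    """Return (in_degree, out_degree) Counters for directed foaf:knows edges.

    Each edge is a (source, target) pair meaning "source foaf:knows target".
    """
    in_deg, out_deg = Counter(), Counter()
    for source, target in set(knows_edges):   # ignore duplicate statements
        out_deg[source] += 1
        in_deg[target] += 1
    return in_deg, out_deg

# Made-up edges; ("a", "b") is stated twice on purpose.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "b")]
in_deg, out_deg = degrees(edges)
print(dict(in_deg), dict(out_deg))
```

A high out-degree marks someone who publishes many friends, while a high in-degree marks someone many others claim to know.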


SNA3: In-degree of group

  • Zipf’s Distribution

  • Sharp tail: few FOAF documents have large in-degrees

Cumulative distribution


SNA4: Out-degree of group

  • Zipf’s distribution

  • Sloppy tail: few person directory documents

Cumulative distribution


SNA5: Patterns of FOAF Network

  • Four types of group

    • Isolated

    • Only in: only one inlink (97%)

    • Only out

    • Both (intermediate)

  • Basic Patterns:

    • Singleton: (isolated)

    • Star: (only out) an active person publishes friends

    • Clique: a small group
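The four group types follow from whether a group has incoming links, outgoing links, both or neither. A small sketch under that reading; the function name and the star-shaped sample are hypothetical:

```python
def classify_groups(nodes, edges):
    """Label each node 'isolated', 'only in', 'only out' or 'both'."""
    has_in = {target for _, target in edges}
    has_out = {source for source, _ in edges}
    labels = {}
    for node in nodes:
        if node in has_in and node in has_out:
            labels[node] = "both"
        elif node in has_in:
            labels[node] = "only in"
        elif node in has_out:
            labels[node] = "only out"
        else:
            labels[node] = "isolated"
    return labels

# Made-up star: one active person lists three friends; one singleton.
nodes = ["hub", "f1", "f2", "f3", "loner"]
edges = [("hub", "f1"), ("hub", "f2"), ("hub", "f3")]
print(classify_groups(nodes, edges))
```

In the star pattern the publishing person is "only out" and each listed friend is "only in", which matches the slide's description of an active person publishing friends.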


SNA6: Size of components

  • Zipf’s distribution

  • Sloppy head: singleton

  • Sloppy tail: blog websites (e.g. www.livejournal.com)

Cumulative distribution
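Component sizes can be obtained by ignoring link direction and collecting everything reachable from each unvisited node. A sketch under the assumption that "component" means a weakly connected component; the function name and sample graph are hypothetical:

```python
from collections import defaultdict, deque

def component_sizes(nodes, edges):
    """Return sizes of weakly connected components, largest first."""
    neighbours = defaultdict(set)
    for a, b in edges:              # treat links as undirected
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, sizes = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:                # breadth-first sweep of one component
            node = queue.popleft()
            size += 1
            for other in neighbours[node]:
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
        sizes.append(size)
    return sorted(sizes, reverse=True)

# Made-up graph: a chain of three, a pair, and a singleton.
print(component_sizes(["a", "b", "c", "d", "e", "f"],
                      [("a", "b"), ("b", "c"), ("d", "e")]))
```

Singletons show up as components of size 1 at the head of the distribution, and a large blog site shows up as one very large component in the tail.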


SNA7: Growth of FOAF network

The data suggests that there is a natural evolution for a social network

(1) disjointed star-like, connected components

(2) link together to form trees and forests,

(3) eventually forming a scale-free network




The Map of FOAF network

[Figure: map of the FOAF network, June 2004, with regions labeled blog.livedoor.jp, www.ecademy.com, www.livejournal.com and non-blog]


Conclusions

  • The semantic web is evolving

  • There is a growing volume of RDF content

  • FOAF is one of the early successes.

  • FOAF data is being used

  • FOAF data is relatively easy to collect and analyze

  • FOAF data is a good source for social network information


Questions?

Demo: http://apple.cs.umbc.edu/semdis

Swoogle: http://swoogle.umbc.edu/

ebiquity group: http://ebiquity.umbc.edu

