Publishing to the Semantic Web


  1. Publishing to the Semantic Web • Dr Owen Conlan • Dr Alexander O’Connor

  2. The Long Road for Semantic Web

  3. The Linked Data Movement • Driven by Tim Berners-Lee • “Linked Data uses a small slice of the technologies that make up the Semantic Web” • Treat schemas as vocabularies • Reuse existing schemas

  4. Linking Open Data Project • Community project with W3C support started in early 2007 • Idea: take existing (open) data sets and make them available on the Web in RDF • Interlink them with other data sets

  5. A Pretty (Scary) Diagram

  6. DBpedia • Transforming Wikipedia into a knowledge base • Structure from • Infoboxes • HTML (titles) • Categories • Links • Other languages, redirects, disambiguations, etc. • Uses: as a controlled vocabulary, as an ontology • Check out: http://dbpedia.org/page/Trinity_College,_Dublin (or Google “Trinity College Dublin dbpedia”)

  7. Linking Data • Publish structured data in RDF on the web using URIs and shared vocabularies rather than the traditional Semantic Web focus on ontologies and inference • Lowers barriers to entry • Fosters widespread adoption • Mature tools, techniques, patterns

  8. Linked Data Principles • Formulated by Tim Berners-Lee (2006): • Use URIs as names for things • Use HTTP URIs so that people/apps can look up these names • When someone/an app looks up a URI, provide useful information • Include links to other URIs so that they can discover more things • This is not an unambiguous specification, just a set of principles.

  9. http://what.is.http.org/ask?question • The Hypertext Transfer Protocol (HTTP) is an application-level protocol for distributed, collaborative, hypermedia information systems. It is a generic, stateless, protocol which can be used for many tasks beyond its use for hypertext, such as name servers and distributed object management systems, through extension of its request methods, error codes and headers. A feature of HTTP is the typing and negotiation of data representation, allowing systems to be built independently of the data being transferred. [RFC2616]

  10. URIs • A Uniform Resource Identifier (URI) is a compact sequence of characters that identifies an abstract or physical resource [RFC3986] • Syntax: URI = scheme ":" hier-part ["?" query] ["#" fragment] • Example • Note: a scheme is not the same as a protocol
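
As an illustrative sketch, Python's standard urllib.parse.urlsplit can break a URI into the components named in the syntax on this slide. The sample URI is the one RFC 3986 itself uses, not one from the slides; note that urlsplit goes one step further than the grammar above and splits the hier-part into an authority (netloc) and a path.

```python
from urllib.parse import urlsplit

# Sample URI from RFC 3986 (not from the slides). The scheme "foo" is not a
# network protocol, which illustrates that a scheme is not the same as a protocol.
uri = "foo://example.com:8042/over/there?name=ferret#nose"

parts = urlsplit(uri)
components = {
    "scheme": parts.scheme,
    "authority": parts.netloc,  # first half of the hier-part
    "path": parts.path,         # second half of the hier-part
    "query": parts.query,
    "fragment": parts.fragment,
}

for name, value in components.items():
    print(f"{name:9} = {value}")
```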

  11. Identifying Linked Data Resources • Linked data needs dereferenceable URIs (ones we can use HTTP to retrieve a description of that resource) • But we cannot serialise people or things over the internet (yet?) => we publish RDF documents on the web that describe them • A real-world object != a document about that object • e.g. creation-date for you != creation-date for your web-page

  12. Identifying Linked Data Resources • URI that identifies a real-world object != URI that identifies a document about that object • Can make statements about the object and can make statements about the document describing it • How do we link these 2 URIs together?

  13. URI styles for Linked Data • 303 Redirect (e.g. http://example.uk/people/dave-smith) • Used for large, dynamic data sets • Flexible because redirection can be separately configured for each resource • e.g. can store data in multiple files or DB. Can change this at deployment/run-time. • Typically used for resource descriptions in large data-sets

  14. URI styles for Linked Data • Fragment (e.g. http://example.uk/people#dave-smith) • Used for small, static data sets • Reduced number of HTTP round-trips => reduced latency • A single HTTP request retrieves the entire document • May transmit unnecessary data across the web • Used for RDFa (defined via RDFa “about=” attribute) • Typically used for vocabulary definitions

  15. 303 Redirect Approach • Create URIs for the concept/thing and for the documents • e.g. http://biglynx.co.uk/people/dave-smith (URI identifying the person Dave Smith) • http://biglynx.co.uk/people/dave-smith.rdf (URI for RDF/XML document describing Dave Smith) • http://biglynx.co.uk/people/dave-smith.html (URI for HTML document describing Dave Smith) • Use HTTP redirects/content negotiation to access the desired resource description for the specific user agent • Client sends an HTTP GET request on a URI identifying an object • Server recognizes the URI and answers with an HTTP 303 carrying the URI of a description of the object • Client sends an HTTP GET request on the new URI • Server sends the document from the new URI
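
The four-step exchange on this slide can be sketched without any network access. The following is a minimal in-memory simulation, not a real server: the DESCRIPTIONS table, handle_get and dereference are hypothetical names, and content negotiation is simplified to an exact match on a single media type (real Accept headers carry lists and q-values).

```python
# Hypothetical stand-in for the server's configuration: each thing URI maps to
# the documents that describe it, keyed by media type.
DESCRIPTIONS = {
    "http://biglynx.co.uk/people/dave-smith": {
        "application/rdf+xml": "http://biglynx.co.uk/people/dave-smith.rdf",
        "text/html": "http://biglynx.co.uk/people/dave-smith.html",
    },
}


def handle_get(uri, accept):
    """Server side: return (status, headers) for a GET on uri."""
    if uri in DESCRIPTIONS:
        # The URI names a real-world thing: answer 303 and point at a document
        # about it, chosen by the requested media type (HTML as the fallback).
        variants = DESCRIPTIONS[uri]
        location = variants.get(accept, variants["text/html"])
        return 303, {"Location": location}
    for variants in DESCRIPTIONS.values():
        for media_type, document_uri in variants.items():
            if uri == document_uri:
                # The URI names a document: serve it directly.
                return 200, {"Content-Type": media_type}
    return 404, {}


def dereference(uri, accept):
    """Client side: GET the URI and follow a single 303 redirect if one comes back."""
    status, headers = handle_get(uri, accept)
    if status == 303:
        uri = headers["Location"]
        status, headers = handle_get(uri, accept)
    return status, uri, headers.get("Content-Type")


print(dereference("http://biglynx.co.uk/people/dave-smith", "application/rdf+xml"))
```

A Semantic Web client asking for RDF/XML ends up at the .rdf document, while a browser asking for text/html ends up at the .html document; the person's own URI never returns a 200.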

  16. Huh? • The picture below shows how dereferencing an HTTP URI identifying a non-information resource plays together with content negotiation: • Simples…

  17. Fragment Approach • Assign a URI to the RDF document defining the concepts • e.g. http://biglynx.co.uk/vocab/sme (document URI) • Assign fragment identifiers to concepts within the document • e.g. http://biglynx.co.uk/vocab/sme#SmallMediumEnterprise • http://biglynx.co.uk/vocab/sme#Team • Use HTTP requests to get the description • Client truncates the fragment URI so that it refers to just the document • Client sends an HTTP GET to request the document • Server sends back the full document • Linked data application now inspects the triples to find the fragment
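
The client-side steps of the fragment approach can be sketched as follows. This is a minimal illustration under stated assumptions: http_get is a hypothetical stand-in for a real HTTP request, and the triples in DOCUMENTS are invented sample content for the vocabulary document, written with prefixed names for brevity.

```python
from urllib.parse import urldefrag

# Invented sample content: the whole vocabulary lives in one document.
DOCUMENTS = {
    "http://biglynx.co.uk/vocab/sme": [
        ("http://biglynx.co.uk/vocab/sme#SmallMediumEnterprise", "rdf:type", "rdfs:Class"),
        ("http://biglynx.co.uk/vocab/sme#Team", "rdf:type", "rdfs:Class"),
        ("http://biglynx.co.uk/vocab/sme#Team", "rdfs:label", "Team"),
    ],
}


def http_get(document_uri):
    """Hypothetical stand-in for an HTTP GET; the server always returns the full document."""
    return DOCUMENTS[document_uri]


def describe(fragment_uri):
    """Return the triples whose subject is the fragment URI."""
    # Step 1: the client strips the fragment; it is never sent to the server.
    document_uri, _fragment = urldefrag(fragment_uri)
    # Steps 2 and 3: one GET retrieves every triple in the document.
    triples = http_get(document_uri)
    # Step 4: the application filters for the resource it asked about.
    return [triple for triple in triples if triple[0] == fragment_uri]


for triple in describe("http://biglynx.co.uk/vocab/sme#Team"):
    print(triple)
```

The trade-off from the earlier slide is visible here: one round-trip suffices, but the triples about SmallMediumEnterprise are transferred even though only Team was requested.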

  18. Now we can refer to stuff • Class <!-- http://www.pizza.com/ontologies/pizza.owl#ThinAndCrispyBase --> <owl:Class rdf:about="&pizza;ThinAndCrispyBase"> <rdfs:subClassOf rdf:resource="&pizza;PizzaBase"/> </owl:Class> • Property <!-- http://www.pizza.com/ontologies/pizza.owl#hasIngredient --> <owl:ObjectProperty rdf:about="&pizza;hasIngredient"> <rdf:type rdf:resource="&owl;TransitiveProperty"/> <owl:inverseOf rdf:resource="&pizza;isIngredientOf"/> </owl:ObjectProperty>
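
To see that the &pizza; shorthand on this slide really stands for a full fragment URI, the class definition can be wrapped in a complete document and parsed with Python's standard library. This is a sketch: the DOCTYPE entity declaration and the rdf:RDF wrapper are assumptions added to make the snippet self-contained (the slide shows only the inner elements), with the entity value taken from the URI in the slide's comment.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# Assumed wrapper: the entity declaration and namespace bindings are not on the slide.
document = f"""<!DOCTYPE rdf:RDF [
  <!ENTITY pizza "http://www.pizza.com/ontologies/pizza.owl#">
]>
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:about="&pizza;ThinAndCrispyBase">
    <rdfs:subClassOf rdf:resource="&pizza;PizzaBase"/>
  </owl:Class>
</rdf:RDF>"""

root = ET.fromstring(document)

# The XML parser expands the entity, so attributes hold the full URIs.
owl_class = root.find(f"{{{OWL}}}Class")
class_uri = owl_class.get(f"{{{RDF}}}about")
parent_uri = owl_class.find(f"{{{RDFS}}}subClassOf").get(f"{{{RDF}}}resource")

print(class_uri, "is a subclass of", parent_uri)
```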

  19. 5 Steps to Publishing Linked Data

  20. Understand the Principles • Understand your Data • Choose URIs for Things in your Data • Set up Your Infrastructure • Link to other Data Sets

  21. Step 1: Understanding the Principles • Use URIs as names for things • Anything, not just documents • You are not your homepage • Information resources (can be transmitted electronically) and non-information resources (cannot be transmitted electronically, e.g. a person!) • Use HTTP URIs • Globally unique names, distributed ownership • Allows people to look up those names

  22. Step 1: Understanding the Principles (cont.) • Provide useful information in RDF when someone looks up a URI • We can include RDF triple statements! • Include RDF links to other URIs • To enable discovery of related information e.g. via “follow your nose” browsing • Relationship Links – to add context • Identity Links – for URI aliases in other sources • Vocabulary Links – to enable self-description
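
The three kinds of RDF link can be made concrete with a few triples. The sketch below is illustrative only: the link targets are invented sample data, and classifying a link purely by its predicate (owl:sameAs as identity, rdf:type as vocabulary, everything else as relationship) is a deliberate simplification.

```python
from urllib.parse import urlsplit

DAVE = "http://biglynx.co.uk/people/dave-smith"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF_BASED_NEAR = "http://xmlns.com/foaf/0.1/based_near"

# Invented sample data: one link of each kind, as (subject, predicate, object).
triples = [
    (DAVE, FOAF_BASED_NEAR, "http://dbpedia.org/resource/Dublin"),  # relationship link
    (DAVE, OWL_SAME_AS, "http://example.org/staff/dsmith"),         # identity link
    (DAVE, RDF_TYPE, "http://xmlns.com/foaf/0.1/Person"),           # vocabulary link
]


def link_kind(predicate):
    """Simplified classification of a link by its predicate."""
    if predicate == OWL_SAME_AS:
        return "identity"
    if predicate == RDF_TYPE:
        return "vocabulary"
    return "relationship"


def external_links(triples, home_host):
    """Object URIs on other hosts: what a 'follow your nose' client could look up next."""
    return [
        obj
        for _subject, _predicate, obj in triples
        if obj.startswith("http") and urlsplit(obj).netloc != home_host
    ]


for _subject, predicate, obj in triples:
    print(f"{link_kind(predicate):12} -> {obj}")
print(external_links(triples, "biglynx.co.uk"))
```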

  23. Step 2: Understand your Data • What are the key things in your data? • People • Places • Events • Books • Films • Musicians • … • This is why domain expertise is critically important

  24. Step 2: Understand your Data (cont.) • What vocabularies can be used to describe these? • Principles: • Reuse, don’t reinvent • Mix liberally • Examples: • foaf -- Friend-of-a-Friend ontology • geonames -- GeoNames ontology • skos -- Simple Knowledge Organization System • ckan.net

  25. Step 2: Common Vocabularies • bibo -- Bibliographic ontology • cc -- Creative Commons ontology • damltime -- Time Zone ontology • doap -- Description of a Project ontology • event -- Event ontology • foaf -- Friend-of-a-Friend ontology • frbr -- Functional Requirements for Bibliographic Records • geo -- Geo wgs84 ontology • geonames -- GeoNames ontology • mo -- Music Ontology • opencyc -- OpenCyc knowledge base • owl -- Web Ontology Language • pim_contact -- PIM (personal information management) Contacts ontology • po -- Programmes Ontology (BBC) • rss -- RDF Site Summary (RSS 1.0) ontology • sioc -- Semantically-Interlinked Online Communities ontology • sioc_types -- SIOC extension • skos -- Simple Knowledge Organization System • umbel -- Upper Mapping and Binding Exchange Layer ontology • wordnet -- WordNet lexical ontology • yandex_foaf -- FOAF (Friend-of-a-Friend) Yandex extension ontology

  26. Step 3: Choosing URIs • Use HTTP URIs • Keep out of other people’s namespaces • Create your own URIs and include alias information • Abstract away from implementation details: • http://dbpedia.org/resource/Berlin • Is better than this: • http://www4.wiwisss.fu-berlin.de:2020/demos/dbpedia/cgi-bin/resource.php?id=/Berlin • Use natural keys within URIs: • Need to ensure the uniqueness of URIs • Useful to base them on some existing primary key • Whenever possible, use a key that is meaningful within the domain of the data set, e.g. use the ISBN as part of the URI of a book
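
The natural-key advice can be sketched as a small URI-minting function. Everything here is illustrative: BASE is a hypothetical namespace that the publisher is assumed to control, book_uri is an invented helper, and the check only tests the shape of the ISBN (it does not verify the check digit).

```python
BASE = "http://example.org/books/"  # hypothetical namespace owned by the publisher


def book_uri(isbn):
    """Mint a URI for a book from its ISBN, normalised so that one book gets one URI."""
    key = isbn.replace("-", "").replace(" ", "").upper()
    body, last = key[:-1], key[-1:]
    well_formed = (
        len(key) in (10, 13)
        and body.isdigit()
        # Only an ISBN-10 may end in the check character "X".
        and (last.isdigit() or (len(key) == 10 and last == "X"))
    )
    if not well_formed:
        raise ValueError(f"not an ISBN-10 or ISBN-13: {isbn!r}")
    return BASE + key


# The same sample ISBN written two ways yields the same URI.
print(book_uri("978-3-16-148410-0"))
print(book_uri("9783161484100"))
```

Normalising before minting matters for uniqueness: without it, the hyphenated and unhyphenated forms of one ISBN would name two different resources.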

  27. Step 3: Choosing URIs (cont.) • Common patterns for URIs: • http://dbpedia.org/resource/Berlin (Thing) • http://dbpedia.org/data/Berlin (RDF) • http://dbpedia.org/page/Berlin (HTML) • Or use the file name extension: • http://biglynx.co.uk/people/dave-smith (Thing) • http://biglynx.co.uk/people/dave-smith.html (HTML) • http://biglynx.co.uk/people/dave-smith.rdf (RDF)
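
Both patterns are mechanical enough to express as functions. In this sketch the function names path_style and extension_style are invented for illustration; the URI layouts themselves are the ones listed on the slide.

```python
def path_style(base, name):
    """DBpedia-style pattern: the path segment says what kind of URI this is."""
    return {
        "thing": f"{base}/resource/{name}",
        "rdf": f"{base}/data/{name}",
        "html": f"{base}/page/{name}",
    }


def extension_style(thing_uri):
    """Extension pattern: document URIs are the thing URI plus a file extension."""
    return {
        "thing": thing_uri,
        "rdf": thing_uri + ".rdf",
        "html": thing_uri + ".html",
    }


print(path_style("http://dbpedia.org", "Berlin"))
print(extension_style("http://biglynx.co.uk/people/dave-smith"))
```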

  28. Step 4: Set up Your Infrastructure • Describe the Data-set! • e.g. dataset name, authorship, updates, licensing terms, crawler support, SPARQL endpoint location, ... • Vocabulary of Interlinked Datasets (VoID) • A little later… • Pick a Publication Pattern • Is your input data: queryable, structured or text? • What is the data volume? • Is it static or dynamic? • Test it

  29. Step 4: Set up Your Infrastructure (cont.)

  30. Step 5: Linking • Popular predicates for linking • owl:sameAs • foaf:depiction • foaf:homepage • foaf:topic • foaf:based_near • foaf:maker / foaf:made • foaf:page • foaf:primaryTopic • rdfs:seeAlso
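
The prefixed names on this slide are shorthand for full predicate URIs. As a minimal sketch, the following expands a prefixed name using the standard owl, foaf and rdfs namespace URIs and writes a link as one N-Triples line; the helper names expand and link, and the target URI in the example, are invented for illustration.

```python
# Standard namespace URIs for the prefixes used on the slide.
PREFIXES = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(prefixed_name):
    """Turn a prefixed name such as 'owl:sameAs' into a full URI."""
    prefix, local_name = prefixed_name.split(":", 1)
    return PREFIXES[prefix] + local_name


def link(subject_uri, predicate, object_uri):
    """Serialise a URI-to-URI link as a single N-Triples statement."""
    return f"<{subject_uri}> <{expand(predicate)}> <{object_uri}> ."


# Invented example: declare that two URIs identify the same person.
print(link(
    "http://biglynx.co.uk/people/dave-smith",
    "owl:sameAs",
    "http://example.org/staff/dsmith",
))
```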

  31. Step 5: Linking (cont.) • VoID (from "Vocabulary of Interlinked Datasets") is an RDF-based schema to describe linked datasets • A dataset is a collection of data, published and maintained by a single provider, available as RDF, and accessible, for example, through dereferenceable HTTP URIs or a SPARQL endpoint • http://semanticweb.org/wiki/VoiD
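
A VoID description is itself just a handful of triples about the dataset. The sketch below assembles one in Turtle using terms from the VoID and Dublin Core vocabularies (void:Dataset, void:sparqlEndpoint, void:uriSpace, dcterms:title, dcterms:license). The function name, the dataset URI and all the values passed in are hypothetical, and the title is inserted without escaping, so it must not contain double quotes.

```python
def void_description(dataset_uri, title, license_uri, sparql_endpoint, uri_space):
    """Build a small VoID description of a dataset as a Turtle string."""
    return "\n".join([
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "",
        f"<{dataset_uri}> a void:Dataset ;",
        f'    dcterms:title "{title}" ;',           # assumes no quotes in the title
        f"    dcterms:license <{license_uri}> ;",
        f"    void:sparqlEndpoint <{sparql_endpoint}> ;",
        f'    void:uriSpace "{uri_space}" .',       # common prefix of the dataset's URIs
    ])


# Hypothetical dataset details, for illustration only.
turtle = void_description(
    dataset_uri="http://biglynx.co.uk/datasets/people",
    title="Big Lynx People",
    license_uri="http://creativecommons.org/licenses/by/3.0/",
    sparql_endpoint="http://biglynx.co.uk/sparql",
    uri_space="http://biglynx.co.uk/people/",
)
print(turtle)
```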

  32. Understand the Principles • Understand your Data • Choose URIs for Things in your Data • Set up Your Infrastructure • Link to other Data Sets

  33. Thank you! Owen.Conlan@scss.tcd.ie

  34. References • http://linkeddata.org • Debugging Semantic Web sites with cURL, http://dowhatimean.net/2007/02/debugging-semantic-web-sites-with-curl • Linked Data Tutorial, http://www.slideshare.net/mediasemanticweb/linked-data-michael-hausenblas-2009-03-05 • Linked Data Applications, M. Hausenblas, DERI Technical Report, 2009 • Linked Data: Evolving the Web into a Global Data Space, Tom Heath and Christian Bizer, http://linkeddatabook.com
