
An Introduction to RDF and the Semantic Web



Presentation Transcript


  1. An Introduction to RDF and the Semantic Web • Dr. Randy Kaplan

  2. Resource Description Framework • RDF • Least understood standard to come from the W3C • May be the most powerful, and the most important, if the web is to achieve its potential

  3. Resource Description Framework • Why RDF? • With HTML and XML we can swap our documents easily • No meaning is attached to them - they are just data • RDF addresses the problem of meaning in the data on the web

  4. What We Need To Know • When we exchange data we need to know things like: • Who wrote the data • When the data was written • When the data was last updated • These pieces of information are not the data itself but data about the data, or meta-data

  5. XML • Promised to deliver us from the unstructured data that makes up the Internet • XML brings structure to the data • Because HTML combined the appearance of the document with its content, the content was extremely hard to extract • XML separated content from presentation

  6. XML • XML specifically dealt with the data of the content <music genre="classical"> <title>Eine Kleine Nacht Muzik</title> <composer>Mozart</composer> <key>E Flat</key> <tempo>2/4</tempo> </music>

  7. XML • We could convey some of the same information with different data <document type="classical music"> <name>Eine Kleine Nacht Muzik</name> <author>Mozart</author> </document>

  8. XML • What if we wanted to find all pieces of music composed by Mozart? • We would have to find all documents where the <composer> element had a value of ‘Mozart’. • We would also have to find all documents where the <author> element had a value of ‘Mozart’.
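The search described on this slide can be sketched in Python with the standard library's xml.etree.ElementTree. The two documents are the samples from the preceding XML slides; the names find_works_by and CREATOR_TAGS are illustrative assumptions, not part of any standard. The point of the sketch is that the search only finds both pieces if we already know every element name that can mean "creator".

```python
import xml.etree.ElementTree as ET

# The two sample documents from the slides.
DOC_A = """<music genre="classical">
  <title>Eine Kleine Nacht Muzik</title>
  <composer>Mozart</composer>
  <key>E Flat</key>
  <tempo>2/4</tempo>
</music>"""

DOC_B = """<document type="classical music">
  <name>Eine Kleine Nacht Muzik</name>
  <author>Mozart</author>
</document>"""

# Hypothetical list: every element name we know of that can mean "creator".
CREATOR_TAGS = ["composer", "author"]


def find_works_by(person, documents, creator_tags=CREATOR_TAGS):
    """Return the documents in which any known creator element equals person."""
    matches = []
    for xml_text in documents:
        root = ET.fromstring(xml_text)
        for tag in creator_tags:
            # iter(tag) visits the root and all descendants with that tag name.
            if any(el.text == person for el in root.iter(tag)):
                matches.append(xml_text)
                break
    return matches


found = find_works_by("Mozart", [DOC_A, DOC_B])
# Searching for <composer> alone silently misses the second document.
only_composer = find_works_by("Mozart", [DOC_A, DOC_B], creator_tags=["composer"])
```

Every new vocabulary (written by, created by, and so on) would mean another entry in CREATOR_TAGS, which is the maintenance burden the next slides describe.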

  9. XML • If another element were used to denote the creator of the music, that term would have to be searched for as well • To find all compositions written by Mozart without first identifying every element that designates the creator of the music, the same term would have to be used everywhere to identify the creator

  10. XML • This problem could also be solved by indicating that the term composer in one document means the same as written by in another and created by in a third • This would be quite an undertaking, though, as it involves identifying all words and phrases in all languages that have this meaning

  11. Missing • Our ability to know that one or more terms mean the same thing is the thing that is missing from the Internet • If we can build this layer into the Internet, it will take the information to a fundamentally different level

  12. Dublin Core • 1995 • Conference in Dublin, Ohio • Discussed issues of semantics • Agreed to a core set of themes common to all documents • Set of properties became known as the Dublin Core (DC) initiative

  13. Dublin Core • Three of the core properties • DC.Title • DC.Creator • DC.Subject • The Dublin Core originally defined 15 core properties in all

  14. Dublin Core • The Dublin Core can be applied to XML <music genre="classical"> <title>Eine Kleine Nacht Muzik</title> <Creator>Mozart</Creator> <key>E Flat</key> <tempo>2/4</tempo> </music> <document type="classical music"> <name>Eine Kleine Nacht Muzik</name> <Creator>Mozart</Creator> </document>

  15. Dublin Core • Even though we have now used the same element to identify the entity responsible for creating the work, we don't know if the meaning of "Creator" is the same in both of these instances • The only way to be sure is to use a very precise mechanism to identify the element being used

  16. Dublin Core • The Dublin Core can be applied to XML <music genre="classical"> <title>Eine Kleine Nacht Muzik</title> <dc:Creator xmlns:dc="http://purl.org/dc/elements/1.1/">Mozart</dc:Creator> <key>E Flat</key> <tempo>2/4</tempo> </music> <document type="classical music"> <name>Eine Kleine Nacht Muzik</name> <dc:Creator xmlns:dc="http://purl.org/dc/elements/1.1/">Mozart</dc:Creator> </document> • Now we can see that these elements refer to exactly the same concept
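A minimal sketch of why the namespace URI, rather than the element's spelling or prefix, is what makes two elements "the same concept". It uses Python's standard xml.etree.ElementTree, which exposes qualified names in the form {namespace-uri}LocalName. The sample documents and the function name dc_creators are assumptions for illustration; the capitalised Creator follows the slides, whereas the published Dublin Core element set spells the property in lowercase (creator).

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Prefix "dc" bound to the Dublin Core namespace.
DOC_A = """<music genre="classical">
  <title>Eine Kleine Nacht Muzik</title>
  <dc:Creator xmlns:dc="http://purl.org/dc/elements/1.1/">Mozart</dc:Creator>
</music>"""

# A different prefix bound to the same namespace URI: same qualified name.
DOC_B = """<document type="classical music"
    xmlns:meta="http://purl.org/dc/elements/1.1/">
  <name>Eine Kleine Nacht Muzik</name>
  <meta:Creator>Mozart</meta:Creator>
</document>"""

# A bare <Creator> in no namespace: we cannot tell what it means.
DOC_C = """<document><Creator>Mozart</Creator></document>"""


def dc_creators(xml_text):
    """Return the text of every Creator element in the Dublin Core namespace."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter("{%s}Creator" % DC_NS)]


creators_a = dc_creators(DOC_A)
creators_b = dc_creators(DOC_B)
creators_c = dc_creators(DOC_C)
```

The first two documents match even though their prefixes differ, while the unqualified element in the third does not.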

  17. CD Database • Suppose you keep a small database of CDs on your computer • There is a table in the database as below

  18. Another CD Database • There is a second database kept by another person who has a CD collection • A table in the database is shown below

  19. Comparing Databases • Exchanging Information • If we wanted to share information there would be a problem, since the column names are different • The same solution we used in the XML can be used in the database - the unique identifier

  20. Another CD Database • There is a second database kept by another person who has a CD collection • A table in the database is shown below

  21. Another CD Database • There is a second database kept by another person who has a CD collection • A table in the database is shown below

  22. URIs • Uniform Resource Identifiers (URIs) give us a way to ensure that a column of data means the same thing in different databases, so long as the column is labeled with the same URI
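As a rough sketch of this idea, the Python below relabels the columns of two tables with URIs so that the rows can be combined. The tables on the slides are not reproduced in this transcript, so the rows, the local column names, and the choice of Dublin Core URIs for them are all hypothetical.

```python
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical rows standing in for the two CD tables.
my_cds = [{"Album": "Porgy and Bess", "Artist": "Miles Davis"}]
their_cds = [{"Title": "Requiem", "Composer": "Mozart"}]

# Each owner declares what its local column names mean (assumed mappings).
my_columns = {"Album": DC + "title", "Artist": DC + "creator"}
their_columns = {"Title": DC + "title", "Composer": DC + "creator"}


def relabel(rows, column_uris):
    """Replace each local column name with the URI that identifies it."""
    return [
        {column_uris[column]: value for column, value in row.items()}
        for row in rows
    ]


# Once both tables use the same URIs, their rows can simply be combined.
merged = relabel(my_cds, my_columns) + relabel(their_cds, their_columns)
creators = [row[DC + "creator"] for row in merged]
```

The local names never have to agree; only the URIs do.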

  23. Other Problems • Unfortunately when we look at the databases we notice some other problems

  24. Other Problems • Problem 1 • Albums which may be the same have different names • Problem 2 • Different names are used to denote the same composers

  25. Taxonomies • These problems can be solved through the use of taxonomy • A taxonomy is a - • Controlled vocabulary of words • Usually about a constrained topic • Unique identifiers are key to developing taxonomies

  26. Taxonomies • If we were to devise a controlled classification list so we could tell which CDs belong to which genre, then we would avoid problems like having one CD labeled as classical and another CD labeled as classic

  27. Taxonomies • CD Taxonomy • Jazz • Classical • Soul • Pop • Hip Hop • Folk
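One way to picture a controlled vocabulary is as a fixed list that rejects anything outside it and gives each term a unique identifier. The sketch below uses the genre list from this slide; the base URI and the function name genre_uri are hypothetical.

```python
# Hypothetical base URI for the genre taxonomy.
GENRE_BASE = "http://taxonomies.org/Genres/"

# The controlled vocabulary from the slide.
GENRES = ["Jazz", "Classical", "Soul", "Pop", "Hip Hop", "Folk"]


def genre_uri(label):
    """Return the identifier for a genre, rejecting terms outside the list."""
    if label not in GENRES:
        raise ValueError("not in the CD taxonomy: %r" % label)
    # Drop spaces so the label can be used as the last part of a URI.
    return GENRE_BASE + label.replace(" ", "")


classical = genre_uri("Classical")
hip_hop = genre_uri("Hip Hop")
```

A label such as "classic" raises ValueError instead of silently creating a second genre.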

  28. Taxonomies • We are not limited to taxonomies of music • We could have a taxonomy for the type of performance, e.g., play, movie, live performance, etc.

  29. Moving the Problem • We really didn’t solve the problem we described earlier • We only moved the problem up a level • We now have the problem with having more than one taxonomy for the same thing

  30. Moving the Problem • Consider • http://taxonomies.org/Plays/PorgyAndBess • http://taxonomies.org/Albums/PorgyAndBess • We do not know whether the PorgyAndBess in the first reference is the same as the PorgyAndBess in the second reference

  31. We Need An Authority Figure • Let us imagine that there is some authority that keeps track of all CDs that are released • This is similar to books and their ISBNs, which are unique • We will call the fictitious authority MuzicBiz.org • MuzicBiz.org maintains a central database of CDs that have been released

  32. Tables Now ...

  33. Unique Identifiers • Since we are guaranteed that these identifiers ALWAYS refer to the same CD any table row having a specific key will ALWAYS refer to the same CD - there is NO reason to doubt this • Data validity is enforced

  34. Meta-Data • Meta-Data • Data that describes data • Creator, Type, Date are all kinds of meta-data • So far the meta-data we have described consists of two values - an attribute name and an attribute value

  35. Meta-Data • To be precise we need to add one more piece of information to complete any meta-data we might have • Knowing that Creator has the value Mozart is not enough; we also need to identify what Mozart is the creator of - the so-called DOCUMENT

  36. Triples • The combination of Source, Attribute name, and Value makes what is called in the RDF-biz a TRIPLE and that constitutes a fundamental element in RDF
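A triple can be modelled very simply. The sketch below uses a Python NamedTuple with the three parts named as on this slide; the class name Triple and the example URI for the source are hypothetical.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One piece of meta-data: what is described, which attribute, what value."""
    source: str      # the thing being described, identified by a URI
    attribute: str   # the attribute name, also identified by a URI
    value: str       # the attribute value


# Hypothetical example: some CD has Mozart as its creator.
t = Triple(
    source="http://example.org/cds/1",
    attribute="http://purl.org/dc/elements/1.1/creator",
    value="Mozart",
)

# Triples are plain values, so a set removes repeated statements.
statements = {t, Triple(t.source, t.attribute, t.value)}
```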

  37. Transporting Triples • We will assume the following - • Meta-data can be expressed as a set of triples • Key to sharing meta-data is the URI • Now given that we accept this representation, the next challenge is to decide how we will share this information (transport)

  38. Sharing Meta-Data and Data • The database contains the information as organized in the table above • We need to transform this data into the accepted form, i.e., triples
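The transformation from table rows to triples might look like the following sketch. The table referred to on the slide is not part of this transcript, so the row, the column names, and the column-to-URI mapping are assumptions for illustration.

```python
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical table row; "id" holds the URI that identifies the CD.
rows = [
    {"id": "http://example.org/cds/1", "Title": "Requiem", "Composer": "Mozart"},
]

# Assumed mapping from local column names to property URIs.
column_uris = {"Title": DC + "title", "Composer": DC + "creator"}


def rows_to_triples(rows, key_column, column_uris):
    """Emit one (source, attribute, value) triple per mapped, non-empty cell."""
    triples = []
    for row in rows:
        source = row[key_column]
        for column, attribute in column_uris.items():
            value = row.get(column)
            if value is not None:
                triples.append((source, attribute, value))
    return triples


triples = rows_to_triples(rows, "id", column_uris)
```

Each row yields one triple per described column, all sharing the row's identifier as their source.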

  39. Sharing Data and Meta-Data

  40. Sharing Data and Meta-Data • We have adequately represented the meta-data and it is “ready” for transport via XML • But this table only represents the meta-data and does not relate to any data described by it

  41. Sharing Data and Meta-Data • We need a way to identify the document that the meta-data describes • For this purpose we add a name/value pair that names the URL of the document

  42. Sharing Data and Meta-Data <document type="News Item" url="http://www.ePolitix.com/Articles/0000005a4787.htm" xmlns:dc="http://purl.org/dc/elements/1.1/"> <dc:Title>I will stand says Portillo</dc:Title> <dc:Creator>Craig Hoy</dc:Creator> <dc:Subject>Tory leadership contest</dc:Subject> </document>
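Read back into triples, this XML gives one statement per Dublin Core element, with the url attribute supplying the document being described. The sketch below uses Python's standard xml.etree.ElementTree; the function name document_triples and the decision to build each property URI by joining the namespace and the local name are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<document type="News Item"
    url="http://www.ePolitix.com/Articles/0000005a4787.htm"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:Title>I will stand says Portillo</dc:Title>
  <dc:Creator>Craig Hoy</dc:Creator>
  <dc:Subject>Tory leadership contest</dc:Subject>
</document>"""


def document_triples(xml_text):
    """Turn each namespaced child element into a (document, property, value) triple."""
    root = ET.fromstring(xml_text)
    subject = root.get("url")
    triples = []
    for child in root:
        # ElementTree writes qualified names as "{namespace-uri}LocalName".
        if child.tag.startswith("{"):
            namespace, local_name = child.tag[1:].split("}", 1)
            triples.append((subject, namespace + local_name, child.text))
    return triples


triples = document_triples(DOC)
```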

  43. RDF: Model and Syntax • RDF Model • In this case the model we are speaking of is the set of triples • The definition of RDF is representation independent • This means that XML is only one way of writing RDF

  44. RDF Terminology • In RDF terminology a STATEMENT is used to describe a triple • This term arises from using a triple to make a statement about a document

  45. RDF Terminology • Triples • Resources and Properties • In the RDF specification the name part of the name/value pair is regarded as a PROPERTY • The subject of the meta data is regarded as a RESOURCE

  46. RDF Terminology • Triples • A triple is the combination of the three parts - a resource with a property and a value

  47. RDF Terminology • A triple can express a relationship between resources • http://MuzicBiz.org/Albums/7655432 --Track--> http://MuzicBiz.org/Tracks/1667653

  48. RDF Terminology • http://MuzicBiz.org/Albums/7655432 --Track--> http://MuzicBiz.org/Tracks/1667653 • In the terminology for this model, the album is the SUBJECT of our statement and the track is the OBJECT • The two resources are joined by a PREDICATE • The predicate specifies the nature of the relationship between the two resources
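Because the object of one statement can be the subject of another, triples link into a graph that can be followed from resource to resource. The sketch below uses the album and track URIs from the slide; the URI chosen for the Track predicate, the track's title, and the function name objects are hypothetical.

```python
ALBUM = "http://MuzicBiz.org/Albums/7655432"
TRACK = "http://MuzicBiz.org/Tracks/1667653"

# Hypothetical URI for the "Track" predicate; the slide shows only its label.
HAS_TRACK = "http://MuzicBiz.org/Properties/Track"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"

# (subject, predicate, object) statements.
triples = [
    (ALBUM, HAS_TRACK, TRACK),                # object is another resource
    (TRACK, DC_TITLE, "Example track title"),  # object is a plain value (made up)
]


def objects(triples, subject, predicate):
    """Return the objects of every statement with this subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


# Follow the graph: album -> its tracks -> their titles.
tracks = objects(triples, ALBUM, HAS_TRACK)
titles = [title for track in tracks for title in objects(triples, track, DC_TITLE)]
```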

  49. RDF Terminology • Notation • When writing about RDF it is useful to be able to show statements or sets of triples for discussion

  50. Notation • English • English is simplest • Craig Hoy is the author of http://www.ePolitix.com/Articles/0000005a4787.htm
