
XML Design (A Gentle Transition from XML to RDF)

XML Design (A Gentle Transition from XML to RDF). Roger L. Costello David B. Jacobs The MITRE Corporation (The creation of this tutorial was sponsored by DARPA). Acknowledgments.


Presentation Transcript


  1. XML Design (A Gentle Transition from XML to RDF) Roger L. Costello, David B. Jacobs, The MITRE Corporation (The creation of this tutorial was sponsored by DARPA)

  2. Acknowledgments • We are very grateful to the Defense Advanced Research Projects Agency (DARPA) for funding the creation of this tutorial. We are especially grateful to Murray Burke (DARPA) and John Flynn (BBN) for making it all happen. • Special thanks to Frank Manola for answering our many questions.

  3. What is the Purpose of RDF? • The purpose of RDF (Resource Description Framework) is to give a standard way of specifying data "about" something. • Here's an example of an XML document that specifies data about China's Yangtze river: <?xml version="1.0"?> <River id="Yangtze" xmlns="http://www.geodesy.org/river"> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </River> "Here is data about the Yangtze River. It has a length of 6300 kilometers. Its startingLocation is western China's Qinghai-Tibet Plateau. Its endingLocation is the East China Sea."

  4. Modify the following XML document so that it is also a valid RDF document: <?xml version="1.0"?> <River id="Yangtze" xmlns="http://www.geodesy.org/river"> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </River> XML Yangtze.xml "convert to" <?xml version="1.0"?> <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </River> RDF Yangtze.rdf XML --> RDF

  5. The RDF Format <?xml version="1.0"?> <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </River> 1. RDF provides an ID attribute for identifying the resource being described. 2. The ID attribute is in the RDF namespace. 3. Add the "fragment identifier symbol" (#) to the namespace.

  6. The RDF Format (cont.) <?xml version="1.0"?> <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </River> 1. River: identifies the type (class) of the resource being described. 2. rdf:ID="Yangtze": identifies the resource being described. This resource is an instance of River. 3. length, startingLocation, endingLocation: these are properties, or attributes, of the type (class). 4. The element content: values of the properties.

  7. Namespace Convention (Best Practice) Question: Why was "#" placed onto the end of the namespace? E.g., xmlns="http://www.geodesy.org/river#" Answer: RDF is very concerned about uniquely identifying things - uniquely identifying the type (class) and uniquely identifying the properties. If we concatenate the namespace with the type then we get a unique identifier for the type, e.g., http://www.geodesy.org/river#River If we concatenate the namespace with a property then we get a unique identifier for the property, e.g., http://www.geodesy.org/river#length http://www.geodesy.org/river#startingLocation http://www.geodesy.org/river#endingLocation Thus, the "#" symbol is simply a mechanism for separating the namespace from the type name and the property name.
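The concatenation described on this slide can be checked with a few lines of Python. This is a minimal sketch using only the standard library's xml.etree.ElementTree, which is a plain XML parser and not an RDF parser; the helper name expand is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>"""


def expand(tag):
    """Turn ElementTree's '{namespace}local' tag into namespace + local.

    Hypothetical helper: ElementTree reports a qualified name as
    '{namespace}local'; RDF simply concatenates the two parts.
    """
    namespace, local = tag[1:].split("}")
    return namespace + local


root = ET.fromstring(DOC)
type_uri = expand(root.tag)                            # identifier of the type (class)
property_uris = [expand(child.tag) for child in root]  # identifiers of the properties

print(type_uri)
for uri in property_uris:
    print(uri)
```

Because the namespace ends in "#", the concatenation yields http://www.geodesy.org/river#River for the type and the three property URIs listed on the slide. Without the "#" the result would be the ambiguous http://www.geodesy.org/riverRiver.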

  8. The RDF Format <?xml version="1.0"?> <Class rdf:ID="Resource" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="uri"> <property>value</property> <property>value</property> ... </Class>

  9. Advantage of using the RDF Format • You may ask: "Why should I bother designing my XML to be in the RDF format?" • Answer: there are numerous benefits: • Interoperability: the RDF format, if widely used, will help to make XML more interoperable: • Tools can instantly characterize the structure: "this element is a type (class), and here are its properties". • RDF promotes the use of standardized vocabularies ... standardized types (classes) and standardized properties. • The RDF format gives you a structured approach to designing your XML documents. The RDF format is a regular, recurring pattern. • It enables you to quickly identify weaknesses and inconsistencies of non-RDF-compliant XML designs. It helps you to better understand your data! • You reap the benefits of both worlds: • You can use standard XML editors and validators to create, edit, and validate your XML. • You can use the RDF tools to apply inferencing to the data. • It positions your data for the Semantic Web (network effect)!

  10. Disadvantage of using the RDF Format • Constrained: the RDF format constrains you on how you design your XML (i.e., you can't design your XML in any arbitrary fashion). • RDF uses namespaces to uniquely identify types (classes), properties, and resources. Thus, you must have a solid understanding of namespaces. • Another XML vocabulary to learn: to use the RDF format you must learn the RDF vocabulary.

  11. Uniquely Identify the Resource • Earlier we said that RDF is very concerned about uniquely identifying the type (class) and the properties. RDF is also very concerned about uniquely identifying the resource, e.g., This is the resource being described. We want to uniquely identify this resource. <?xml version="1.0"?> <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </River>

  12. rdf:ID • The value of rdf:ID is a "relative URI". • The "complete URI" is obtained by concatenating the URL of the XML document with "#" and then the value of rdf:ID, e.g., <?xml version="1.0"?> <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </River> Yangtze.rdf Suppose that this RDF/XML document is located at this URL: http://www.china.org/geography/rivers. Thus, the complete URI for this resource is: http://www.china.org/geography/rivers#Yangtze

  13. xml:base • On the previous slide we showed how the URL of the document provided the base URI. • Depending on the location of the document is brittle: it will break if the document is moved, or is copied to another location. • A more robust solution is to specify the base URI in the document, e.g., <?xml version="1.0"?> <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xml:base="http://www.china.org/geography/rivers"> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </River> Resource URI = concatenation(xml:base, '#', rdf:ID) = concatenation(http://www.china.org/geography/rivers, '#', "Yangtze") = http://www.china.org/geography/rivers#Yangtze
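The rule "Resource URI = concatenation(xml:base, '#', rdf:ID)" can be sketched in Python as follows. This is an illustration only: the function name resource_uri and its document_url fallback parameter are assumptions, and a real RDF parser performs full URI resolution rather than simple string concatenation.

```python
import xml.etree.ElementTree as ET

RDF_ID = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}ID"
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

WITH_BASE = """<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xml:base="http://www.china.org/geography/rivers">
  <length>6300 kilometers</length>
</River>"""

WITHOUT_BASE = """<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
</River>"""


def resource_uri(xml_text, document_url=None):
    """Return base + '#' + rdf:ID for the top-level resource.

    xml:base wins when present; otherwise fall back on the URL the
    document was retrieved from (the brittle case from the slide).
    """
    root = ET.fromstring(xml_text)
    base = root.get(XML_BASE, document_url)
    if base is None:
        raise ValueError("no xml:base and no document URL to fall back on")
    return base + "#" + root.get(RDF_ID)


uri_from_base = resource_uri(WITH_BASE)
uri_from_location = resource_uri(
    WITHOUT_BASE, document_url="http://www.china.org/geography/rivers")
print(uri_from_base)
print(uri_from_location)
```

Both calls give http://www.china.org/geography/rivers#Yangtze, but only the first survives the document being moved or copied, since the second depends on the caller supplying the original location.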

  14. rdf:about • Instead of identifying a resource with a relative URI (which then requires a base URI to be prepended), we can give the complete identity of a resource. However, we use rdf:about, rather than rdf:ID, e.g., <?xml version="1.0"?> <River rdf:about="http://www.china.org/geography/rivers#Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </River>

  15. Triple -> resource/property/value • http://www.china.org/geography/rivers#Yangtze (resource) has a http://www.geodesy.org/river#length (property) of 6300 kilometers (value) • http://www.china.org/geography/rivers#Yangtze (resource) has a http://www.geodesy.org/river#startingLocation (property) of western China's ... (value) • http://www.china.org/geography/rivers#Yangtze (resource) has a http://www.geodesy.org/river#endingLocation (property) of East China Sea (value)
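The three statements on this slide can be derived mechanically from the rdf:about document shown earlier. The sketch below uses Python's standard xml.etree.ElementTree and handles only this flat shape (one resource, literal-valued properties); the names expand and triples are hypothetical helpers, not part of any RDF library.

```python
import xml.etree.ElementTree as ET

RDF_ABOUT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about"

DOC = """<?xml version="1.0"?>
<River rdf:about="http://www.china.org/geography/rivers#Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>"""


def expand(tag):
    """'{namespace}local' -> namespace + local."""
    namespace, local = tag[1:].split("}")
    return namespace + local


def triples(xml_text):
    """Return one (resource, property, value) tuple per child element."""
    root = ET.fromstring(xml_text)
    resource = root.get(RDF_ABOUT)
    return [(resource, expand(child.tag), child.text) for child in root]


result = triples(DOC)
for resource, prop, value in result:
    print(resource, "has a", prop, "of", value)
```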

  16. The RDF Format = triples! • The fundamental design pattern of RDF is to structure your XML data as resource/property/value triples! <?xml version="1.0"?> <Resource-A> <property-A> <Resource-B> <property-B> <Resource-C> <property-C> Value-C </property-C> </Resource-C> </property-B> </Resource-B> </property-A> </Resource-A> Notice that the RDF design pattern is an alternating sequence of resource-property. This pattern is known as "striping". value of property-A value of property-B The value of a property can be a literal (e.g., length has a value of 6300 kilometers). Also, the value of a property can be a resource, as shown above (e.g., property-A has a value of Resource-B, property-B has a value of Resource-C). We will see examples of properties having a resource value in a little bit.

  17. Naming Convention • The convention is to use a capital letter to start a type (class) name, and use a lowercase letter to start a property name. • This helps the eye quickly discern the striping pattern. <?xml version="1.0"?> <River rdf:about="http://www.china.org/geography/rivers#Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </River> uppercase lowercase

  18. RDF Model (graph) Legend: Ellipse indicates "Resource" Rectangle indicates "literal string value"

  19. rdf:Description + rdf:type • There is still another way of representing the XML. This way makes it very clear that you are describing something, and it makes it very clear what the type (class) is of the thing you are describing: <?xml version="1.0"?> <rdf:Description rdf:about="http://www.china.org/geography/rivers#Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> <rdf:type rdf:resource="http://www.geodesy.org/river#River"/> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </rdf:Description> This is read as: "This is a Description about the resource http://www.china.org/geography/rivers#Yangtze. This resource is an instance of the River type (class). The http://www.china.org/geography/rivers#Yangtze resource has a length of 6300 kilometers, a startingLocation of western China's Qinghai-Tibet Plateau, and an endingLocation of the East China Sea." Note: this form of describing a resource is called the "long form". The form we have seen previously is an abbreviation of this long form. An RDF Parser interprets the abbreviated form as if it were this long form.

  20. Alternative • Alternatively we can use rdf:ID rather than rdf:about, as shown here: <?xml version="1.0"?> <rdf:Description rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xml:base="http://www.china.org/geography/rivers"> <rdf:type rdf:resource="http://www.geodesy.org/river#River"/> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </rdf:Description>

  21. <?xml version="1.0"?> <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xml:base="http://www.china.org/geography/rivers"> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </River> <?xml version="1.0"?> <River rdf:about="http://www.china.org/geography/rivers#Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </River> <?xml version="1.0"?> <rdf:Description rdf:about="http://www.china.org/geography/rivers#Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> <rdf:type rdf:resource="http://www.geodesy.org/river#River"/> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </rdf:Description> Equivalent Representations! Note: In the RDF literature the examples are typically shown in this form.
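One way to convince yourself that the three representations are equivalent is to reduce each to a set of triples and compare the sets. The following is a rough sketch, not an RDF parser: it assumes a single top-level resource, resolves rdf:ID by plain concatenation with xml:base, and uses made-up helper names (rdf, expand, to_triples).

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"
YANGTZE = "http://www.china.org/geography/rivers#Yangtze"

NS = ('xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
      'xmlns="http://www.geodesy.org/river#"')
PROPERTIES = """
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
"""
WITH_ID = ('<River rdf:ID="Yangtze" ' + NS +
           ' xml:base="http://www.china.org/geography/rivers">' +
           PROPERTIES + '</River>')
WITH_ABOUT = ('<River rdf:about="' + YANGTZE + '" ' + NS + '>' +
              PROPERTIES + '</River>')
LONG_FORM = ('<rdf:Description rdf:about="' + YANGTZE + '" ' + NS + '>'
             '<rdf:type rdf:resource="http://www.geodesy.org/river#River"/>' +
             PROPERTIES + '</rdf:Description>')


def rdf(name):
    """ElementTree-style qualified name in the RDF namespace."""
    return "{%s}%s" % (RDF_NS, name)


def expand(tag):
    """'{namespace}local' -> namespace + local."""
    namespace, local = tag[1:].split("}")
    return namespace + local


def to_triples(xml_text):
    """Reduce any of the three forms to a set of (subject, property, value)."""
    root = ET.fromstring(xml_text)
    subject = root.get(rdf("about"))
    if subject is None:                      # rdf:ID form: base + '#' + ID
        subject = root.get(XML_BASE) + "#" + root.get(rdf("ID"))
    result = set()
    if root.tag != rdf("Description"):       # abbreviated form: tag is the type
        result.add((subject, RDF_NS + "type", expand(root.tag)))
    for child in root:
        # A property either points at a resource or carries a literal.
        value = child.get(rdf("resource"), child.text)
        result.add((subject, expand(child.tag), value))
    return result


forms = [to_triples(doc) for doc in (WITH_ID, WITH_ABOUT, LONG_FORM)]
print(forms[0] == forms[1] == forms[2])
```

Each form reduces to the same four triples: one rdf:type statement plus the three property statements.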

  22. http://www.w3.org/1999/02/22-rdf-syntax-ns# ID about type resource Description RDF Namespace

  23. Terminology • As you read the RDF literature you may see the following terminology: • Subject: this term refers to the item that is playing the role of the resource. • Predicate: this term refers to the item that is playing the role of the property. • Object: this term refers to the item that is playing the role of the value. Equivalent! Subject/Predicate/Object = Resource/property/Value

  24. Do Lab1 RDF Parser • There is a nice RDF parser at the W3 Web site: http://www.w3.org/RDF/Validator/ This RDF parser will tell you if your XML is in the proper RDF format.

  25. Example #2 Modify the following XML document so that it is RDF-compliant: <?xml version="1.0"?> <River id="Yangtze" xmlns="http://www.geodesy.org/river"> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> <Dam id="ThreeGorges" xmlns="http://www.geodesy.org/dam"> <name>The Three Gorges Dam</name> <width>1.5 miles</width> <height>610 feet</height> <cost>$30 billion</cost> </Dam> </River> Yangtze2.xml

  26. Note the two types (classes) Dam River Instance: Yangtze Properties: length startingLocation endingLocation Instance: ThreeGorges Properties: name width height cost

  27. Dam - out of place <?xml version="1.0"?> <River id="Yangtze" xmlns="http://www.geodesy.org/river"> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> <Dam id="ThreeGorges" xmlns="http://www.geodesy.org/dam"> <name>The Three Gorges Dam</name> <width>1.5 miles</width> <height>610 feet</height> <cost>$30 billion</cost> </Dam> </River> Dam Types (classes) contain properties . Here we see the River type containing the properties - length, startingLocation, and endingLocation. It also shows River containing a type - Dam. Thus, there is a Resource that contains another Resource. This is inconsistent with RDF design pattern. (We are seeing one of the benefits of using the RDF format - to identify inconsistencies in an XML design.)

  28. Property value must be a Literal or a Resource <length>6300 kilometers</length> property Value is a Literal <obstacle> <Dam id="ThreeGorges" xmlns="http://www.geodesy.org/dam"> <name>The Three Gorges Dam</name> <width>1.5 miles</width> <height>610 feet</height> <cost>$30 billion</cost> </Dam> </obstacle> property Value is a Resource

  29. Modified XML (to make it consistent) <?xml version="1.0"?> <River id="Yangtze" xmlns="http://www.geodesy.org/river"> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> <obstacle> <Dam id="ThreeGorges" xmlns="http://www.geodesy.org/dam"> <name>The Three Gorges Dam</name> <width>1.5 miles</width> <height>610 feet</height> <cost>$30 billion</cost> </Dam> </obstacle> </River> Yangtze2,v2.xml "The Yangtze River has an obstacle that is the ThreeGorges Dam. The Dam has a name - The Three Gorges Dam. It has a width of 1.5 miles, a height of 610 feet, and a cost of $30 billion."

  30. RDF Format <?xml version="1.0"?> <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xml:base="http://www.china.org/geography/rivers"> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> <obstacle> <Dam rdf:ID="ThreeGorges" xmlns="http://www.geodesy.org/dam#"> <name>The Three Gorges Dam</name> <width>1.5 miles</width> <height>610 feet</height> <cost>$30 billion</cost> </Dam> </obstacle> </River> Changed id to rdf:ID Added the '#' symbol As always, the other representations using rdf:about and rdf:Description are available.
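The alternating resource/property "striping" in this document lends itself to a short recursive walk. The sketch below is an assumption-laden illustration in standard-library Python: it expects every resource to carry rdf:ID, takes xml:base from the root element only, and the function name walk is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_ID = "{%s}ID" % RDF_NS
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

DOC = """<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xml:base="http://www.china.org/geography/rivers">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
  <obstacle>
    <Dam rdf:ID="ThreeGorges" xmlns="http://www.geodesy.org/dam#">
      <name>The Three Gorges Dam</name>
      <width>1.5 miles</width>
      <height>610 feet</height>
      <cost>$30 billion</cost>
    </Dam>
  </obstacle>
</River>"""


def expand(tag):
    """'{namespace}local' -> namespace + local."""
    namespace, local = tag[1:].split("}")
    return namespace + local


def walk(resource, base, out):
    """Append triples for one resource element and return its URI.

    Stripe 1 is a resource element, stripe 2 its property elements,
    stripe 3 (if present) a nested resource that is the property's value.
    """
    subject = base + "#" + resource.get(RDF_ID)
    out.append((subject, RDF_NS + "type", expand(resource.tag)))
    for prop in resource:
        if len(prop):                         # property value is a resource
            value = walk(prop[0], base, out)
        else:                                 # property value is a literal
            value = prop.text
        out.append((subject, expand(prop.tag), value))
    return subject


root = ET.fromstring(DOC)
triples = []
walk(root, root.get(XML_BASE), triples)
for triple in triples:
    print(triple)
```

The obstacle property ends up with the Dam's URI as its value, which is the "property value is a Resource" case from slide 28.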

  31. RDF Model (graph)

  32. Alternatively, suppose that someone has already created a document containing information about the Three Gorges Dam: <?xml version="1.0"?> <Dam rdf:ID="ThreeGorges" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/dam#" xml:base="http://www.china.org/geography/rivers"> <name>The Three Gorges Dam</name> <width>1.5 miles</width> <height>610 feet</height> <cost>$30 billion</cost> </Dam> Three-Gorges-Dam.rdf Then we can simply reference the Three Gorges Dam resource using rdf:resource, as shown here: <?xml version="1.0"?> <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xml:base="http://www.china.org/geography/rivers"> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> <obstacle rdf:resource="http://www.china.org/geography/rivers#ThreeGorges"/> </River> Yangtze.rdf

  33. Note: reference is to a resource, not to a file. Why was this the reference: <obstacle rdf:resource="http://www.china.org/geography/rivers#ThreeGorges"/> and not this: <obstacle rdf:resource="http://www.china.org/geography/rivers/Three-Gorges-Dam.rdf"/> That is, why wasn't the reference to a "file"? Answer: 1. What if the file moved? Then the reference would break. 2. By using an identifier of the Three Gorges Dam, and leaving the particular file unspecified, an "aggregator tool" will be able to collect information from all the files that talk about the Three Gorges Dam resource (see next slide). Do Lab2

  34. Anyone, Anywhere, Anytime Can Talk About a Resource • In all of our examples we have provided a unique identifier to resources, e.g., http://www.china.org/geography/rivers#Yangtze • Consequently, if another RDF document identifies the same resource then the data that it specifies gives additional data about that resource. • An aggregator tool will be able to collect all data about a resource and present a consolidated set of data for the resource. That's powerful!

  35. rdf:ID versus rdf:about • When should rdf:ID be used? When should rdf:about be used? • When you want to introduce a resource, and provide an initial set of information about a resource use rdf:ID • When you want to extend the information about a resource use rdf:about • The RDF philosophy is akin to the Web philosophy. That is, anyone, anywhere, anytime can provide information about a resource.

  36. <?xml version="1.0"?> <River rdf:about="http://www.china.org/geography/rivers#Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </River> http://www.china.org/geography/rivers/yangtze.rdf <?xml version="1.0"?> <River rdf:about="http://www.china.org/geography/rivers#Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> <name>Dri Chu - Female Yak River</name> <name>Tongtian He, Travelling-Through-the-Heavens River</name> <name>Jinsha Jiang, River of Golden Sand</name> </River> http://www.encyclopedia.org/yangtze-alternate-names.rdf Aggregator tool collects data about the Yangtze <?xml version="1.0"?> <River rdf:about="http://www.china.org/geography/rivers#Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#"> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> <name>Dri Chu - Female Yak River</name> <name>Tongtian He, Travelling-Through-the-Heavens River</name> <name>Jinsha Jiang, River of Golden Sand</name> </River> A distributed network of data! Aggregated Data!
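The aggregation shown on this slide amounts to grouping property/value pairs by resource identifier. Here is a minimal sketch of such an "aggregator tool" in standard-library Python; the function name aggregate is hypothetical, the two documents are passed in as strings rather than fetched from their URLs, and only flat rdf:about documents are handled.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

RDF_ABOUT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about"
YANGTZE = "http://www.china.org/geography/rivers#Yangtze"
HEAD = ('<River rdf:about="' + YANGTZE + '" '
        'xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
        'xmlns="http://www.geodesy.org/river#">')

# Stand-in for http://www.china.org/geography/rivers/yangtze.rdf
GEOGRAPHY = HEAD + """
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>"""

# Stand-in for http://www.encyclopedia.org/yangtze-alternate-names.rdf
NAMES = HEAD + """
  <name>Dri Chu - Female Yak River</name>
  <name>Tongtian He, Travelling-Through-the-Heavens River</name>
  <name>Jinsha Jiang, River of Golden Sand</name>
</River>"""


def aggregate(documents):
    """Group (property, value) pairs from all documents by resource URI."""
    merged = defaultdict(list)
    for xml_text in documents:
        root = ET.fromstring(xml_text)
        subject = root.get(RDF_ABOUT)
        for child in root:
            local = child.tag.split("}")[1]   # drop the '{namespace}' part
            merged[subject].append((local, child.text))
    return dict(merged)


merged = aggregate([GEOGRAPHY, NAMES])
for prop, value in merged[YANGTZE]:
    print(prop, "=", value)
```

Because both documents use the same rdf:about value, all six property values land under a single key, mirroring the "Aggregated Data!" document on the slide.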

  37. Another Example of Aggregation <?xml version="1.0"?> <Dam rdf:ID="ThreeGorges" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/dam#" xml:base="http://www.china.org/geography/rivers"> <name>The Three Gorges Dam</name> <width>1.5 miles</width> <height>610 feet</height> <cost>$30 billion</cost> </Dam> http://www.encyclopedia.org/three-gorges-dam.rdf <?xml version="1.0"?> <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xml:base="http://www.china.org/geography/rivers"> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> <obstacle rdf:resource="http://www.china.org/geography/rivers#ThreeGorges"/> </River> http://www.china.org/geography/rivers/yangtze.rdf Aggregate! <?xml version="1.0"?> <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xml:base="http://www.china.org/geography/rivers"> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> <obstacle> <Dam rdf:ID="ThreeGorges" xmlns="http://www.geodesy.org/dam#"> <name>The Three Gorges Dam</name> <width>1.5 miles</width> <height>610 feet</height> <cost>$30 billion</cost> </Dam> </obstacle> </River> Note that the reference to the ThreeGorges Dam resource has been replaced by whatever information the aggregator could find on this resource!

  38. Example #3 Notice that in this XML document there is no unique identifier: <?xml version="1.0"?> <River xmlns="http://www.geodesy.org/river#"> <name>Yangtze</name> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </River> XML Yangtze3.xml <?xml version="1.0"?> <River xmlns="http://www.geodesy.org/river#"> <name>Yangtze</name> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </River> The RDF is identical to the XML! RDF Yangtze3.rdf

  39. Interpreting the RDF <?xml version="1.0"?> <River xmlns="http://www.geodesy.org/river#"> <name>Yangtze</name> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </River> Yangtze3.rdf This is read as: "This is an instance of the River type (class). The River has a name of Yangtze, a length of 6300 kilometers, a startingLocation of western China's Qinghai-Tibet Plateau, and an endingLocation of the East China Sea." In this document the resource is anonymous - it has no identifier.

  40. Disadvantage of anonymous resources <?xml version="1.0"?> <River xmlns="http://www.geodesy.org/river#"> <name>Yangtze</name> <length>6300 kilometers</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </River> http://www.china.org/geography/rivers/yangtze.rdf <?xml version="1.0"?> <River xmlns="http://www.geodesy.org/river#"> <name>Yangtze</name> <name>Dri Chu - Female Yak River</name> <name>Tongtian He, Travelling-Through-the-Heavens River</name> <name>Jinsha Jiang, River of Golden Sand</name> </River> http://www.encyclopedia.org/yangtze-alternate-names.rdf Aggregate An aggregator tool will not be able to determine if these documents are talking about the same resource.
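The aggregation problem with anonymous resources can be seen in a small Python sketch. It assumes, as RDF parsers typically do, that each anonymous resource receives a freshly generated label; the "_:bN" labels and the names subject_of and aggregate are illustrative choices, not a standard API.

```python
import itertools
import xml.etree.ElementTree as ET
from collections import defaultdict

RDF_ABOUT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about"
HEAD = '<River xmlns="http://www.geodesy.org/river#">'

GEOGRAPHY = HEAD + """
  <name>Yangtze</name>
  <length>6300 kilometers</length>
</River>"""

NAMES = HEAD + """
  <name>Yangtze</name>
  <name>Dri Chu - Female Yak River</name>
</River>"""

_labels = itertools.count(1)


def subject_of(root):
    """Use rdf:about when present, else invent a fresh blank-node label."""
    about = root.get(RDF_ABOUT)
    if about is not None:
        return about
    return "_:b%d" % next(_labels)


def aggregate(documents):
    """Group (property, value) pairs by subject across documents."""
    merged = defaultdict(list)
    for xml_text in documents:
        root = ET.fromstring(xml_text)
        subject = subject_of(root)
        for child in root:
            merged[subject].append((child.tag.split("}")[1], child.text))
    return dict(merged)


merged = aggregate([GEOGRAPHY, NAMES])
print(sorted(merged))   # two distinct subjects, nothing was merged
```

Even though both documents say name = Yangtze, the tool has no identifier to match on, so it keeps two separate resources. Guessing that they are the same river from a shared name would be a heuristic, not something the RDF itself states.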

  41. Example #4 <?xml version="1.0"?> <River id="Yangtze" xmlns="http://www.geodesy.org/river" xmlns:uom="http://www.measurements.org/units-of-measure"> <length uom:units="kilometers">6300</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </River> XML Yangtze4.xml <?xml version="1.0"?> <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xmlns:uom="http://www.measurements.org/units-of-measure#"> <length> <rdf:Description> <rdf:value>6300</rdf:value> <uom:units>kilometers</uom:units> </rdf:Description> </length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </River> RDF Yangtze4.rdf

  42. <?xml version="1.0"?> <River id="Yangtze" xmlns="http://www.geodesy.org/river" xmlns:uom="http://www.measurements.org/units-of-measure#"> <length uom:units="kilometers">6300</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </River> Yangtze4.xml RDF does not allow attributes on the properties (except for special RDF attributes such as rdf:resource). So we need to make the uom:units attribute a child element. Your first instinct might be to modify length to have two child elements: <?xml version="1.0"?> <River id="Yangtze" xmlns="http://www.geodesy.org/river" xmlns:uom="http://www.measurements.org/units-of-measure#"> <length> <value>6300</value> <uom:units>kilometers</uom:units> </length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </River> However, now the length property has two values. RDF supports only binary relations, i.e., a single value for a property.

  43. rdf:value 6300 length kilometers length has two values - 6300 and kilometers. RDF provides a special property, rdf:value, to be used for specifying the "primary" value. In this example, 6300 is the primary value, and kilometers is a value which provides additional information about the primary value.

  44. RDF Format <?xml version="1.0"?> <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xmlns:uom="http://www.measurements.org/units-of-measure#"> <length> <rdf:Description> <rdf:value>6300</rdf:value> <uom:units>kilometers</uom:units> </rdf:Description> </length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </River> Yangtze4.rdf The rdf:Description element here is an anonymous resource. Read this as: "The Yangtze River has a length whose value is a resource which has a value of 6300 and whose units is kilometers."

  45. <rdf:Description> <rdf:value>6300</rdf:value> <uom:units>kilometers</uom:units> </rdf:Description> This is an anonymous resource. Its purpose is solely to provide a context for the two properties. Other RDF documents will have no need to amplify this resource. So, in this case, there is no reason for giving the resource an identifier. In this case it makes good sense to use an anonymous resource. Advantage of anonymous resources

  46. RDF Model (graph) Legend: An anonymous resource (also called a "blank node"). That is, a resource with no identifier. (Note: RDF Parsers will typically generate a unique identifier for anonymous resources, to distinguish one anonymous resource from another.)

  47. rdf:parseType="Resource" If the value of a property is comprised of several values then one option is to create an anonymous resource, as we saw. RDF provides a shorthand, so that you don't need to create an rdf:Description element, by using rdf:parseType="Resource", as shown here: <?xml version="1.0"?> <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xmlns:uom="http://www.measurements.org/units-of-measure#"> <length rdf:parseType="Resource"> <rdf:value>6300</rdf:value> <uom:units>kilometers</uom:units> </length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> </River> Yangtze4,v2.rdf The meaning of this is identical to that shown on the previous slide.

  48. Do Lab3 Equivalent! <length> <rdf:Description> <rdf:value>6300</rdf:value> <uom:units>kilometers</uom:units> </rdf:Description> </length> <length rdf:parseType="Resource"> <rdf:value>6300</rdf:value> <uom:units>kilometers</uom:units> </length>
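The equivalence of the two forms can be illustrated by reading the anonymous resource's properties from each and comparing them. This is a sketch under simplifying assumptions: it looks at a single property element in isolation (so the namespace declarations are repeated on length), and the function name anonymous_value is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PARSE_TYPE = "{%s}parseType" % RDF_NS

NS = ('xmlns="http://www.geodesy.org/river#" '
      'xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
      'xmlns:uom="http://www.measurements.org/units-of-measure#"')

LONG = ('<length ' + NS + '>'
        '<rdf:Description>'
        '<rdf:value>6300</rdf:value>'
        '<uom:units>kilometers</uom:units>'
        '</rdf:Description>'
        '</length>')

SHORT = ('<length ' + NS + ' rdf:parseType="Resource">'
         '<rdf:value>6300</rdf:value>'
         '<uom:units>kilometers</uom:units>'
         '</length>')


def expand(tag):
    """'{namespace}local' -> namespace + local."""
    namespace, local = tag[1:].split("}")
    return namespace + local


def anonymous_value(property_xml):
    """Return the (property, value) pairs of the property's anonymous resource."""
    prop = ET.fromstring(property_xml)
    if prop.get(PARSE_TYPE) == "Resource":
        node = prop        # shorthand: children describe the implied resource
    else:
        node = prop[0]     # long form: the explicit rdf:Description child
    return {(expand(child.tag), child.text) for child in node}


long_pairs = anonymous_value(LONG)
short_pairs = anonymous_value(SHORT)
print(long_pairs == short_pairs)
```

In both cases the value of length is a resource with an rdf:value of 6300 and uom:units of kilometers; rdf:parseType="Resource" only saves writing the rdf:Description wrapper.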

  49. Summary Modify the following XML document so that it is also a valid RDF document: <?xml version="1.0"?> <River id="Yangtze" xmlns="http://www.geodesy.org/river" xmlns:uom="http://www.measurements.org/units-of-measure#"> <length uom:units="kilometers">6300</length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> <Dam id="ThreeGorges" xmlns="http://www.geodesy.org/dam"> <name>The Three Gorges Dam</name> <width>1.5 miles</width> <height>610 feet</height> <cost>$30 billion</cost> </Dam> </River> Yangtze.xml See next slide -->

  50. RDF Format! <?xml version="1.0"?> <River rdf:ID="Yangtze" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://www.geodesy.org/river#" xmlns:uom="http://www.measurements.org/units-of-measure#" xml:base="http://www.china.org/geography/rivers"> <length rdf:parseType="Resource"> <rdf:value>6300</rdf:value> <uom:units>kilometers</uom:units> </length> <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation> <endingLocation>East China Sea</endingLocation> <obstacle> <Dam rdf:ID="ThreeGorges" xmlns="http://www.geodesy.org/dam#"> <name>The Three Gorges Dam</name> <width>1.5 miles</width> <height>610 feet</height> <cost>$30 billion</cost> </Dam> </obstacle> </River> With relatively few changes the XML document is now usable by both XML tools and RDF tools! Yangtze.rdf
