XML & XML Query

XML & XML Query. Ling Wang, Luping Ding. Introduction: the Web opens new challenges in information technology and database frameworks, because data sources on the Web do NOT typically conform to any well-known structure.

Presentation Transcript


  1. XML & XML Query • Ling Wang, Luping Ding

  2. Introduction • The Web opens new challenges in: • - information technology • - database frameworks. • Why? • - Data sources on the Web do NOT typically conform to any well-known structure. • - Traditional database technology is not adequate for dealing with rich data, • e.g. audio, video, nested data structures …

  3. Features of Web Data • Web data has characteristics that are called semistructured: • Object-like • a collection of complex objects from CODM. • Schema-less • does not typically conform to any traditional type structure. • Self-describing • the meaning of the data is carried along with the data itself. • So, we need new database technologies to support these Web-based applications.

  4. What is XML? • XML ---- Extensible Markup Language • - A markup language for documents containing structured information. • - A universal format for structured documents and data on the Web. • - An HTML-like language. • The XML specification defines a standard way to add markup to documents. • Note: structured information, markup language

  5. What is XML ---- example • An XML example for customer information:

      <customer-details id="AcPharm39156">
        <name>Acme Pharmaceuticals Co.</name>
        <address country="US">
          <street>7301 Smokey Boulevard</street>
          <city>Smallville</city>
          <state>Indiana</state>
          <postal>94571</postal>
        </address>
      </customer-details>
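
As a small illustration of how an application might read the customer record above, the following sketch parses it with Python's standard `xml.etree.ElementTree` module. The variable names are chosen here for illustration and are not part of the original slides.

```python
import xml.etree.ElementTree as ET

# The customer record from the slide, embedded as a string.
CUSTOMER_XML = """
<customer-details id="AcPharm39156">
  <name>Acme Pharmaceuticals Co.</name>
  <address country="US">
    <street>7301 Smokey Boulevard</street>
    <city>Smallville</city>
    <state>Indiana</state>
    <postal>94571</postal>
  </address>
</customer-details>
""".strip()

root = ET.fromstring(CUSTOMER_XML)

# Attributes are read with .get(); element content with .findtext().
customer_id = root.get("id")
customer_name = root.findtext("name")
country = root.find("address").get("country")
city = root.findtext("address/city")

print(root.tag, customer_id, customer_name, country, city)
```

The data describes itself: the tag and attribute names travel with the values, so the reader needs no separate schema to find the city or the customer id.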

  6. XML vs. HTML?

  7. Overview of XML • Mechanisms for specifying document structure: • ---- a set of rules for structuring an XML document. • DTD ---- Document Type Definition language • (part of the XML standard) • XML Schema ---- a more recent specification • Query languages for XML: • XPath, XSLT, XQuery

  8. Basic concept in XML ---- elements & attributes • XML element • Any properly nested piece of text of the form • <sometag>…</sometag>. • e.g.: <street>7301 Smokey Boulevard</street> (name: street; content: 7301 Smokey Boulevard) • XML attributes • also a tool for data representation. • e.g.: <customer-details id="AcPharm39156"> </customer-details> (attribute name: id; attribute value: AcPharm39156)

  9. Basic concept in XML ---- namespace • Namespaces • - Why? • Element names in XML are not fixed, so names chosen by different authors can conflict. • - How? • Different authors use different namespace identifiers for different domains. • The general structure is "namespace:local-name". • Namespace ---- a URI (uniform resource identifier): a URL (uniform resource locator) or a URN (uniform resource name). • Local name ---- same form as regular XML tags, • with no ":" in it.

  10. Basic concept in XML ---- namespace • An example of namespaces (the xmlns attribute without a prefix declares the default namespace):

      <item xmlns="http://www.acmeinc.com/jp#supplies"
            xmlns:toy="http://www.acmeinc.com/jp#toys">
        <name>African Coffee Table</name>
        <feature>
          <toy:item>
            <toy:name>cyberpet</toy:name>
          </toy:item>
        </feature>
      </item>
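
To make the effect of the two namespace declarations concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`, which expands every tag to the form `{namespace-URI}local-name`. The prefixes `s` and `t` in the lookup map are local to this script and are an assumption for illustration; only the URIs have to match the document.

```python
import xml.etree.ElementTree as ET

ITEM_XML = """
<item xmlns="http://www.acmeinc.com/jp#supplies"
      xmlns:toy="http://www.acmeinc.com/jp#toys">
  <name>African Coffee Table</name>
  <feature>
    <toy:item>
      <toy:name>cyberpet</toy:name>
    </toy:item>
  </feature>
</item>
""".strip()

SUPPLIES = "http://www.acmeinc.com/jp#supplies"
TOYS = "http://www.acmeinc.com/jp#toys"

root = ET.fromstring(ITEM_XML)

# The outer <item> is in the default (supplies) namespace.
outer_tag = root.tag

# Prefix-to-URI map used only for the path lookups below.
ns = {"s": SUPPLIES, "t": TOYS}
table_name = root.findtext("s:name", namespaces=ns)
toy_name = root.findtext("s:feature/t:item/t:name", namespaces=ns)

# The two <item> elements share a local name but differ in namespace.
inner_tag = root.find("s:feature/t:item", ns).tag

print(outer_tag, inner_tag, table_name, toy_name)
```

The two `item` elements no longer conflict, because their expanded names differ in the URI part.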

  11. DTD ---- Document Type Definitions • Why DTD? • - An XML file can carry a description of its own format with it. • - Independent groups of people can agree on a common DTD for interchanging data. • - An application can verify data received from the outside world. • - It can also verify its own data. • How? • - The DTD is included in your XML source file: • <!DOCTYPE root-element [element-declarations]> • - The DTD is external to your XML source file: • <!DOCTYPE root-element SYSTEM "filename">

  12. DTD ---- example • Example XML document with a DTD:

      <?xml version="1.0"?>
      <!DOCTYPE note [
        <!ELEMENT note (to,from,heading,body)>
        <!ELEMENT to (#PCDATA)>
        <!ELEMENT from (#PCDATA)>
        <!ELEMENT heading (#PCDATA)>
        <!ELEMENT body (#PCDATA)>
      ]>
      <note>
        <to>Tove</to>
        <from>Jani</from>
        <heading>Reminder</heading>
        <body>Don't forget me this weekend</body>
      </note>

  13. DTD ---- example • XML document with an external DTD:

      <?xml version="1.0"?>
      <!DOCTYPE note SYSTEM "note.dtd">
      <note>
        <to>Tove</to>
        <from>Jani</from>
        <heading>Reminder</heading>
        <body>Don't forget me this weekend!</body>
      </note>

      "note.dtd" containing the DTD:

      <!ELEMENT note (to,from,heading,body)>
      <!ELEMENT to (#PCDATA)>
      <!ELEMENT from (#PCDATA)>
      <!ELEMENT heading (#PCDATA)>
      <!ELEMENT body (#PCDATA)>

  14. DTD ---- Inadequacy • Inadequacy of DTD: • - Not designed with namespaces in mind. • - Uses a syntax quite different from that of XML documents. • - A very limited set of basic types. • - Provides only limited means for expressing data consistency constraints: • No keys. • Referential integrity is weak: • attributes can be of type ID, IDREF, or IDREFS, • but there is nothing comparable for elements.

  15. DTD ---- Inadequacy • Inadequacy of DTD: • - No way of enforcing referential integrity for elements. • - Alternatives must be used to state that the order of elements is immaterial, which becomes terrible as the number of attributes (child elements) grows. • - Element definitions are global to the entire document.

  16. XML Schema • XML Schemas • An attempt to solve those problems with DTDs: • - Powerful data typing • - Range checking • - Namespace-aware validation based on namespace URIs rather than on prefixes • - Extensibility and scalability

  17. XML Schema ---- example • Here is a simple example of an XML Schema:

      <?xml version="1.0"?>
      <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
        <xsd:element name="SONG" type="SongType"/>
        <xsd:complexType name="SongType">
          <xsd:sequence>
            <xsd:element name="TITLE" type="xsd:string"/>
            <xsd:element name="COMPOSER" type="xsd:string"/>
            <xsd:element name="PRODUCER" type="xsd:string"/>
            <xsd:element name="PUBLISHER" type="xsd:string"/>
            <xsd:element name="LENGTH" type="xsd:string"/>
            <xsd:element name="YEAR" type="xsd:string"/>
            <xsd:element name="ARTIST" type="xsd:string"/>
            <xsd:element name="PRICE" type="xsd:string"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:schema>

  18. XML Schema ---- example • The root element ---- "schema". • Schema namespace ---- http://www.w3.org/2001/XMLSchema, conventionally bound to the prefix xsd or xs. • Elements ---- xsd:element, • divided into simple types and complex types. • A simple type element can contain only text; it cannot have attributes or child elements. • Syntax: <xs:element name="name" type="type"/> • Example: <xs:element name="to" type="xs:string"/>

  19. XML Schema ---- example • A complex type defines a new type that can have attributes and child elements, which makes it very flexible. • Syntax:

      <xs:element name="name">
        <xs:complexType>
          ... element content ...
        </xs:complexType>
      </xs:element>

      Example:

      <xs:element name="note">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="to" type="xs:string"/>
            <xs:element name="from" type="xs:string"/>
            <xs:element name="heading" type="xs:string"/>
            <xs:element name="body" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>

  20. XML Schema ---- features • Simple Types • - 44 built-in simple types in the W3C XML Schema language. • - Divided into seven groups: • Numeric types • Time types • XML types • String types • The boolean type • The URI reference type • The binary types

  21. XML Schema ---- features • Deriving Simple Types • Not limited to the 44 built-in simple types. • Create new data types by deriving from the existing types: • restrict a type to a subset of its normal values. • e.g.: a schema that derives a Str255 data type from xsd:string:

      <xsd:simpleType name="Str255">
        <xsd:restriction base="xsd:string">
          <xsd:minLength value="1"/>
          <xsd:maxLength value="255"/>
        </xsd:restriction>
      </xsd:simpleType>
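
The Str255 restriction above amounts to a length check on the string value. The sketch below states that check in Python; it is not a schema validator, and the function name `is_str255` is hypothetical. It assumes that length is counted in characters, which is also what Python's `len` counts for `str`.

```python
# Facets copied from the Str255 definition on the slide.
MIN_LENGTH = 1
MAX_LENGTH = 255


def is_str255(value: str) -> bool:
    """Return True if value satisfies the minLength/maxLength facets."""
    return MIN_LENGTH <= len(value) <= MAX_LENGTH


print(is_str255("Hot Cop"))   # a short, non-empty title
print(is_str255(""))          # empty string violates minLength
print(is_str255("x" * 256))   # too long, violates maxLength
```

A validating processor performs this kind of check automatically for every element or attribute declared with the derived type.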

  22. XML Schema ---- features • Create enumerated types • Example:

      <xsd:simpleType name="PublisherType">
        <xsd:restriction base="xsd:string">
          <xsd:enumeration value="Warner-Elektra-Atlantic"/>
          <xsd:enumeration value="Universal Music Group"/>
          <xsd:enumeration value="Sony Music Entertainment,Inc."/>
          <xsd:enumeration value="Capitol Records, Inc."/>
          <xsd:enumeration value="BMG Music"/>
        </xsd:restriction>
      </xsd:simpleType>

  23. XML Schema ---- features • Create new types by joining existing types through a union. • Example:

      <xsd:simpleType name="MoneyOrDecimal">
        <xsd:union>
          <xsd:simpleType>
            <xsd:restriction base="xsd:decimal">
            </xsd:restriction>
          </xsd:simpleType>
          <xsd:simpleType>
            <xsd:restriction base="xsd:string">
              <xsd:pattern value="\p{Sc}\p{Nd}+(\.\p{Nd}\p{Nd})?"/>
            </xsd:restriction>
          </xsd:simpleType>
        </xsd:union>
      </xsd:simpleType>

  24. XML Schema ---- features • Namespaces • - http://www.w3.org/2001/XMLSchema • the namespace that identifies the names of tags and attributes used in a schema. • These names are understood by all schema-aware XML processors. • - http://www.w3.org/2001/XMLSchema-instance • a small number of special names used in instance documents, not in schemas. • - target namespace • the set of names defined by a particular schema document, • i.e. the user-defined names that are to be used in the instance documents.

  25. XML Schema ---- features • Grouping • - Does order really matter? • - How? • xsd:all group ---- each element in the group must occur at most once, but the order is not important. • xsd:choice group ---- exactly one element from the group should appear. • xsd:sequence group ---- each element in the group appears exactly once (by default), in the specified order.

  26. XML Schema ---- features • Example of an xsd:all group:

      <xsd:complexType name="PersonType">
        <xsd:sequence>
          <xsd:element name="NAME">
            <xsd:complexType>
              <xsd:all>
                <xsd:element name="GIVEN" type="xsd:string" minOccurs="1" maxOccurs="1"/>
                <xsd:element name="FAMILY" type="xsd:string" minOccurs="1" maxOccurs="1"/>
              </xsd:all>
            </xsd:complexType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>

  27. XML Schema ---- features • Example of an xsd:choice group:

      <xsd:complexType name="SongType">
        <xsd:sequence>
          <xsd:element name="TITLE" type="xsd:string"/>
          <xsd:choice>
            <xsd:element name="COMPOSER" type="PersonType"/>
            <xsd:element name="PRODUCER" type="PersonType"/>
          </xsd:choice>
          <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/>
          <xsd:element name="LENGTH" type="xsd:string"/>
          <xsd:element name="YEAR" type="xsd:string"/>
          <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/>
          <xsd:element name="PRICE" type="xsd:string" minOccurs="0"/>
        </xsd:sequence>
      </xsd:complexType>
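
The difference between the three grouping constructs can be sketched as three small checks over a list of child tag names. This is a simplified illustration, not a validator: it assumes the default occurrence of exactly once for every member and ignores minOccurs/maxOccurs. The function names are hypothetical.

```python
def matches_sequence(children, members):
    """xsd:sequence: every member exactly once, in the declared order."""
    return list(children) == list(members)


def matches_all(children, members):
    """xsd:all: every member exactly once, in any order."""
    return sorted(children) == sorted(members)


def matches_choice(children, members):
    """xsd:choice: exactly one of the members appears."""
    return len(children) == 1 and children[0] in members


name_members = ["GIVEN", "FAMILY"]
credit_members = ["COMPOSER", "PRODUCER"]

# FAMILY before GIVEN is fine for xsd:all but not for xsd:sequence.
print(matches_all(["FAMILY", "GIVEN"], name_members))
print(matches_sequence(["FAMILY", "GIVEN"], name_members))

# A song may credit a composer or a producer, but not both.
print(matches_choice(["COMPOSER"], credit_members))
print(matches_choice(["COMPOSER", "PRODUCER"], credit_members))
```

With a DTD, the order-insensitive case would have to be spelled out as an alternative over every permutation, which is what xsd:all avoids.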

  28. XML Schema ---- features • Schemas address limitations of DTDs: • a strange, non-XML syntax • namespace incompatibility • lack of data typing • limited extensibility and scalability. • XML Schemas • - Powerful data typing • - Range checking • - Namespace-aware validation based on namespace URIs rather than on prefixes • - Extensibility and scalability

  29. XML Constraints ---- DTD • DTD • No keys; its referential integrity is weak. • Attributes: ID, IDREF, IDREFS. • ID ---- unique value • IDREF ---- a valid ID declared in the same document • IDREFS ---- a space-separated list of valid IDs • But these are all based on the string type. • Elements: no corresponding mechanism.
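
The ID/IDREFS rules above can be illustrated with a hand-written check. The document below is invented for this sketch (it does not come from the slides): the `code` attribute is assumed to be declared as ID and `takes` as IDREFS. Python's `xml.etree.ElementTree` does not validate against DTDs, so the two rules are coded explicitly.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: course/@code plays the role of an ID attribute,
# student/@takes plays the role of an IDREFS attribute.
DOC = """
<school>
  <course code="CS101"/>
  <course code="CS202"/>
  <student name="Ann" takes="CS101 CS202"/>
  <student name="Bob" takes="CS999"/>
</school>
""".strip()

root = ET.fromstring(DOC)

# Rule 1 (ID): every value must be unique within the document.
ids = [c.get("code") for c in root.iter("course")]
duplicate_ids = sorted({i for i in ids if ids.count(i) > 1})

# Rule 2 (IDREFS): every space-separated token must match some ID.
dangling_refs = sorted(
    ref
    for s in root.iter("student")
    for ref in s.get("takes").split()
    if ref not in ids
)

print("duplicate IDs:", duplicate_ids)
print("dangling references:", dangling_refs)
```

Note what the check cannot express: nothing says that `takes` must point at a course rather than at any other element carrying an ID, which is one reason the slides call this form of referential integrity weak.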

  30. XML Constraints ---- Schema • XML keys: • Similar to SQL keys, but more complicated: • - complex structures • - a key might be composed of a sequence of values • - located at different depths inside an element. • Two ways: • - tag unique ---- UNIQUE constraint • - tag key ---- PRIMARY KEY, not null • e.g.:

      <key name="PrimaryKeyForClass">
        <selector xpath="Classes/Class"/>
        <field xpath="CrsCode"/>
        <field xpath="Semester"/>
      </key>

  31. XML Constraints ---- Schema • Foreign keys: • e.g.:

      <complexType>
        …
        <keyref name="NoBogusTranscripts" refer="adm:PrimaryKeyForClass">
          <selector xpath="Students/Student/CrsTaken"/>
          <field xpath="@CrsCode"/>
          <field xpath="@Semester"/>
        </keyref>
        …
      </complexType>

      Powerful?
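
What a key and a keyref like these ask a validator to verify can be sketched by hand. The instance document below is hypothetical, built only to match the selector and field paths on the slides (a `Report` root is assumed); the standard `xml.etree.ElementTree` module does not perform schema validation, so both constraints are checked explicitly.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance data shaped to fit the selector/field paths.
DOC = """
<Report>
  <Classes>
    <Class><CrsCode>CS308</CrsCode><Semester>F1997</Semester></Class>
    <Class><CrsCode>CS308</CrsCode><Semester>S1998</Semester></Class>
  </Classes>
  <Students>
    <Student><CrsTaken CrsCode="CS308" Semester="F1997"/></Student>
    <Student><CrsTaken CrsCode="MAT123" Semester="F1997"/></Student>
  </Students>
</Report>
""".strip()

root = ET.fromstring(DOC)

# key: selector Classes/Class, fields CrsCode and Semester (child elements).
class_keys = [
    (c.findtext("CrsCode"), c.findtext("Semester"))
    for c in root.findall("Classes/Class")
]
# A key must be present (not null) and unique as a whole tuple.
key_is_valid = (
    all(None not in k for k in class_keys)
    and len(set(class_keys)) == len(class_keys)
)

# keyref: selector Students/Student/CrsTaken, fields @CrsCode and @Semester.
taken = [
    (t.get("CrsCode"), t.get("Semester"))
    for t in root.findall("Students/Student/CrsTaken")
]
bogus_transcripts = [t for t in taken if t not in set(class_keys)]

print("key valid:", key_is_valid)
print("bogus transcripts:", bogus_transcripts)
```

The composite key is the pair of values, so the course code CS308 may repeat as long as the semester differs.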

  32. Question • Is the XML data model relational or object-relational? • Is XML a database?

  33. References • [1] Chapter 17, XML and Web Data • [2] Chapter 24, XML Bible (2nd edition): Schemas, http://www.ibiblio.org/xml/books/bible2/index.html#toc • [3] http://www.w3schools.com, http://www.w3.org/, http://www.xml.com/

  34. Part II • XML Query Language • Counterpart of SQL in the XML world

  35. XML Query Language • Desired Characteristics for XML Query Language - also Requirements • Good candidate: XQuery Language • Use Cases for XQuery Language

  36. Desired Characteristics • XML Output • Declarative - what has to be done? • Query Operation • No Schema Required • Preserve Order and Association • Mutually Embedding with XML • Support for New Datatypes • Suitable for Metadata • Ability to add update capabilities in future versions

  37. Details • XML Output • define derived databases (virtual views) • provide transparency to applications (why?) • The XML Query Language MUST be declarative - like SQL • it specifies what has to be done • it MUST NOT enforce a particular evaluation strategy

  38. Details (cont.) • Query Operation • Projection, selection, join, and restructuring should all be possible in a single XML Query (why?) • for optimization reasons

  39. Query Operations

  40. Example - Sample Data

      <bib>
        <book year="1999" isbn="1-55860-622-X">
          <title>Data on the Web</title>
          <author>Abiteboul</author>
          <author>Buneman</author>
          <author>Suciu</author>
        </book>
        <book year="2001" isbn="1-XXXXX-YYY-Z">
          <title>XML Query</title>
          <author>Fernandez</author>
          <author>Suciu</author>
        </book>
      </bib>

  41. Example - XML Schema

      <xs:group name="Bib">
        <xs:sequence>
          <xs:element name="bib">
            <xs:complexType>
              <xs:group ref="Book"
                        minOccurs="0" maxOccurs="unbounded"/>
            </xs:complexType>
          </xs:element>
        </xs:sequence>
      </xs:group>

  42. Example - XML Schema (Cont.)

      <xs:group name="Book">
        <xs:sequence>
          <xs:element name="book">
            <xs:complexType>
              <xs:sequence>
                <xs:element name="title" type="xs:string"/>
                <xs:element name="author" type="xs:string" maxOccurs="unbounded"/>
              </xs:sequence>
              <xs:attribute name="year" type="xs:integer"/>
              <xs:attribute name="isbn" type="xs:string"/>
            </xs:complexType>
          </xs:element>
        </xs:sequence>
      </xs:group>

  43. Variable Binding

      LET $bib0 :=
        <bib>
          <book year="1999" isbn="1-55860-622-X">
            <title>Data on the Web</title>
            <author>Abiteboul</author>
            <author>Buneman</author>
            <author>Suciu</author>
          </book>
          <book year="2001" isbn="1-XXXXX-YYY-Z">
            <title>XML Query</title>
            <author>Fernandez</author>
            <author>Suciu</author>
          </book>
        </bib>

  44. Projection

      $bib0/book/author
      ==> <author>Abiteboul</author>,
          <author>Buneman</author>,
          <author>Suciu</author>,
          <author>Fernandez</author>,
          <author>Suciu</author>

      Note: the document order of the author elements is preserved.
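
For readers without an XQuery processor at hand, the projection can be mimicked with a path lookup in Python's standard `xml.etree.ElementTree`. This is only an analogy to the XQuery expression, assuming the sample data from the earlier slide; `findall` returns matches in document order, so duplicates and ordering come out as in the slide.

```python
import xml.etree.ElementTree as ET

bib0 = ET.fromstring("""
<bib>
  <book year="1999" isbn="1-55860-622-X">
    <title>Data on the Web</title>
    <author>Abiteboul</author>
    <author>Buneman</author>
    <author>Suciu</author>
  </book>
  <book year="2001" isbn="1-XXXXX-YYY-Z">
    <title>XML Query</title>
    <author>Fernandez</author>
    <author>Suciu</author>
  </book>
</bib>
""".strip())

# Counterpart of $bib0/book/author: all author children of all books.
authors = [a.text for a in bib0.findall("book/author")]

print(authors)
```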

  45. Selection

      FOR $b IN $bib0/book
      WHERE $b/@year/data() <= 2000
      RETURN $b
      ==> <book year="1999" isbn="1-55860-622-X">
            <title>Data on the Web</title>
            <author>Abiteboul</author>
            <author>Buneman</author>
            <author>Suciu</author>
          </book>
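
The same selection can be sketched in Python with `xml.etree.ElementTree` as a filter over the book elements. This is an analogy rather than XQuery itself: the attribute value arrives as a string, so the sketch converts it with `int` before comparing, which is the role `data()` and the schema type play in the query.

```python
import xml.etree.ElementTree as ET

bib0 = ET.fromstring("""
<bib>
  <book year="1999" isbn="1-55860-622-X">
    <title>Data on the Web</title>
    <author>Abiteboul</author>
    <author>Buneman</author>
    <author>Suciu</author>
  </book>
  <book year="2001" isbn="1-XXXXX-YYY-Z">
    <title>XML Query</title>
    <author>Fernandez</author>
    <author>Suciu</author>
  </book>
</bib>
""".strip())

# FOR $b IN $bib0/book WHERE year <= 2000 RETURN $b
selected = [b for b in bib0.findall("book") if int(b.get("year")) <= 2000]
selected_titles = [b.findtext("title") for b in selected]

print(selected_titles)
```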

  46. Join - Sample Data

      LET $review0 :=
        <reviews>
          <book>
            <title>XML Query</title>
            <review>A darn fine book.</review>
          </book>,
          <book>
            <title>Data on the Web</title>
            <review>This is great!</review>
          </book>
        </reviews> : Reviews

  47. Join

      FOR $b IN $bib0/book, $r IN $review0/book
      WHERE $b/title/data() = $r/title/data()
      RETURN <book>{ $b/title, $b/author, $r/review }</book>
      ==> <book>
            <title>Data on the Web</title>
            <author>Abiteboul</author>
            <author>Buneman</author>
            <author>Suciu</author>
            <review>This is great!</review>
          </book>,
          <book>
            <title>XML Query</title>
            <author>Fernandez</author>
            <author>Suciu</author>
            <review>A darn fine book.</review>
          </book>
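
The join can be mimicked in Python as a nested loop over the two documents with an equality test on the title. This is a sketch of the semantics, not of how an XQuery engine would evaluate it; results are collected as plain tuples (title, authors, review) rather than as new `<book>` elements.

```python
import xml.etree.ElementTree as ET

bib0 = ET.fromstring("""
<bib>
  <book year="1999" isbn="1-55860-622-X">
    <title>Data on the Web</title>
    <author>Abiteboul</author>
    <author>Buneman</author>
    <author>Suciu</author>
  </book>
  <book year="2001" isbn="1-XXXXX-YYY-Z">
    <title>XML Query</title>
    <author>Fernandez</author>
    <author>Suciu</author>
  </book>
</bib>
""".strip())

review0 = ET.fromstring("""
<reviews>
  <book><title>XML Query</title><review>A darn fine book.</review></book>
  <book><title>Data on the Web</title><review>This is great!</review></book>
</reviews>
""".strip())

# FOR $b IN $bib0/book, $r IN $review0/book WHERE titles are equal
joined = []
for b in bib0.findall("book"):
    for r in review0.findall("book"):
        if b.findtext("title") == r.findtext("title"):
            authors = [a.text for a in b.findall("author")]
            joined.append((b.findtext("title"), authors, r.findtext("review")))

for row in joined:
    print(row)
```

The outer loop runs over `$bib0`, so the results follow the order of the bibliography, not the order of the reviews.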

  48. Restructuring

      FOR $a IN distinct-values($bib0/book/author/data()) RETURN
        <biblio>
          <author>{ $a }</author>
          { FOR $b IN $bib0/book, $a2 IN $b/author/data()
            WHERE $a = $a2 RETURN
              $b/title
          }
        </biblio>

  49. Restructuring (Cont.)

      ==> <biblio>
            <author>Abiteboul</author>
            <title>Data on the Web</title>
          </biblio>,
          <biblio>
            <author>Buneman</author>
            <title>Data on the Web</title>
          </biblio>,
          <biblio>
            <author>Suciu</author>
            <title>Data on the Web</title>
            <title>XML Query</title>
          </biblio>,
          <biblio>
            <author>Fernandez</author>
            <title>XML Query</title>
          </biblio>
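
The restructuring query inverts the hierarchy from books-with-authors to authors-with-titles. A rough Python counterpart, assuming the same sample data, is sketched below; `dict.fromkeys` is used here to drop duplicate authors while keeping first-appearance order, which is an implementation choice of this sketch.

```python
import xml.etree.ElementTree as ET

bib0 = ET.fromstring("""
<bib>
  <book year="1999" isbn="1-55860-622-X">
    <title>Data on the Web</title>
    <author>Abiteboul</author>
    <author>Buneman</author>
    <author>Suciu</author>
  </book>
  <book year="2001" isbn="1-XXXXX-YYY-Z">
    <title>XML Query</title>
    <author>Fernandez</author>
    <author>Suciu</author>
  </book>
</bib>
""".strip())

# Distinct author names, in order of first appearance.
all_authors = [a.text for a in bib0.findall("book/author")]
distinct_authors = list(dict.fromkeys(all_authors))

# For each author, collect the titles of the books they contributed to.
biblio = [
    (
        author,
        [
            b.findtext("title")
            for b in bib0.findall("book")
            if author in [a.text for a in b.findall("author")]
        ],
    )
    for author in distinct_authors
]

for entry in biblio:
    print(entry)
```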

  50. Details (cont.) • No Schema Required • XML Query should be usable on XML data when there is no schema (DTD or XML Schema) known in advance. But it should be able to exploit the schema if the schema is available. • Preserve Order and Association • XML Query should preserve order and association of elements in XML data (why?)
