This guide provides a comprehensive overview of XML and its associated technologies, including DTD, XPath, and XSLT. Learn about the self-describing nature of XML, how DTDs can standardize XML schemas, methods to navigate XML documents using XPath, and techniques to transform XML with XSLT. The content is geared toward students and professionals looking to enhance their understanding of these essential tools for managing hierarchical data. Find practical examples and insights into XML's structure and functionality.
CS 433: Xml, DTD, XPath, & Xslt
Extensible Markup and Beyond
September 26, 2001
Jeff Derstadt
Administration • Due: Friday Sept. 28th • Relational table creation and summary • See course web site for more details • Logging into Egret • Questions?
Overview • Xml • A self-describing, hierarchical data model • DTD • Standardizing schemas for Xml • XPath • How to navigate and query Xml documents • Xslt • How to transform one Xml document into another Xml document
Xml – An Example
<class name='CS 433'>
  <location building='Olin' room='255'/>
  <professor>Johannes Gehrke</professor>
  <ta>Jeff</ta>
  <student_list>
    <student id='999-991'>John Smith</student>
    <student id='999-992'>Jane Doe</student>
  </student_list>
</class>
Xml – Extensible Markup Language • Language • A way of communicating information • Markup • Notes or meta-data that describe your data or language • Extensible • Limitless ability to define new languages or data sets
Xml – What’s The Point?
• You can include your data and a description of what the data represents
• This is useful for defining your own language or protocol
• Example: Chemical Markup Language
<molecule>
  <weight>234.5</weight>
  <Spectra>…</Spectra>
  <Figures>…</Figures>
</molecule>
Xml – Structure
• Xml looks like HTML
• Xml is a hierarchy of user-defined tags called elements with attributes and data
• Data is described by elements, elements are described by attributes
<student id='999-991'>John Smith</student>
Parts of this element: open tag <student id='999-991'>, element name student, attribute id, attribute value 999-991, data John Smith, closing tag </student>
Xml – Elements
<student id='999-991'>John Smith</student>
• Xml is case and space sensitive
• Element opening and closing tag names must be identical
• Opening tags: "<" + element name + ">"
• Closing tags: "</" + element name + ">"
• Empty elements have no data and no closing tag:
• They begin with a "<" and end with a "/>"
<location/>
Xml – Attributes
<student id='999-991'>John Smith</student>
• Attributes provide additional information for element tags.
• There can be zero or more attributes in every element; each one has the form: attribute_name='attribute_value'
• By convention there is no space between the name and the "='", although Xml permits white space around the "="
• Attribute values must be surrounded by " or ' characters
• Multiple attributes are separated by white space (one or more spaces or tabs).
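A minimal Python sketch of the parts of an element, assuming the standard-library xml.etree.ElementTree parser (the slides do not name a parser). It reads the element name, the attributes, and the data from the student element used above.

```python
import xml.etree.ElementTree as ET

# The example element from the slide, with straight quotes.
student = ET.fromstring("<student id='999-991'>John Smith</student>")

print(student.tag)     # element name
print(student.attrib)  # attributes as a dictionary of name -> value
print(student.text)    # data between the opening and closing tags
```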
Xml - Data
<student id='999-991'>John Smith</student>
• Xml data is any information between an opening and closing tag
• Xml data must not contain a literal '<' or '&' character; write &lt; and &amp; instead ('>' is usually escaped as &gt; as well)
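One way to put reserved characters into Xml data is to escape them first. The sketch below assumes Python's standard-library xml.sax.saxutils.escape helper; the <note> element name and the sample text are made up for illustration.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

raw = "1 < 2 & 3 > 2"          # hypothetical data containing reserved characters
escaped = escape(raw)          # replaces &, < and > with entity references

# The escaped text can now sit safely between an opening and closing tag.
note = ET.fromstring("<note>" + escaped + "</note>")

print(escaped)
print(note.text)               # the parser turns the entities back into characters
```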
Xml – Nesting & Hierarchy
• Xml tags can be nested in a tree hierarchy
• Xml documents can have only one root tag
• Between an opening and closing tag you can insert:
  1. Data
  2. More Elements
  3. A combination of data and elements
<root>
  <tag1>
    Some Text
    <tag2>More</tag2>
  </tag1>
</root>
Xml – Storage
• Storage is done just like an n-ary tree (DOM)
<root>
  <tag1>
    Some Text
    <tag2>More</tag2>
  </tag1>
</root>
Node (Type: Element_Node, Name: Element, Value: root)
  Node (Type: Element_Node, Name: Element, Value: tag1)
    Node (Type: Text_Node, Name: Text, Value: Some Text)
    Node (Type: Element_Node, Name: Element, Value: tag2)
      Node (Type: Text_Node, Name: Text, Value: More)
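The tree storage can be sketched with Python's standard-library xml.dom.minidom, which is one DOM implementation among many and is an assumption here. The describe helper is hypothetical, and the white space between tags is removed so that no extra white-space-only text nodes appear.

```python
from xml.dom import minidom

doc = minidom.parseString("<root><tag1>Some Text<tag2>More</tag2></tag1></root>")

def describe(node, depth=0):
    """Return one indented line per element or text node, depth first."""
    lines = []
    if node.nodeType == node.ELEMENT_NODE:
        lines.append("  " * depth + "Element_Node: " + node.tagName)
    elif node.nodeType == node.TEXT_NODE:
        lines.append("  " * depth + "Text_Node: " + node.data)
    for child in node.childNodes:
        lines.extend(describe(child, depth + 1))
    return lines

tree_lines = describe(doc.documentElement)
print("\n".join(tree_lines))
```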
Xml vs. Relational Model
<Table>
  <Computer Id='101'>
    <Speed>800Mhz</Speed>
    <RAM>256MB</RAM>
    <HD>40GB</HD>
  </Computer>
  <Computer Id='102'>
    <Speed>933Mhz</Speed>
    <RAM>512MB</RAM>
    <HD>40GB</HD>
  </Computer>
</Table>
Computer Table
Id   Speed   RAM    HD
101  800Mhz  256MB  40GB
102  933Mhz  512MB  40GB
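A rough Python sketch of the correspondence between the two models, assuming the standard-library xml.etree.ElementTree parser: each Computer element becomes one row, with the Id attribute and the child elements as columns. The rows variable is illustrative only.

```python
import xml.etree.ElementTree as ET

xml_text = """
<Table>
  <Computer Id='101'><Speed>800Mhz</Speed><RAM>256MB</RAM><HD>40GB</HD></Computer>
  <Computer Id='102'><Speed>933Mhz</Speed><RAM>512MB</RAM><HD>40GB</HD></Computer>
</Table>
"""

table = ET.fromstring(xml_text)
rows = []
for computer in table.findall("Computer"):
    row = {"Id": computer.get("Id")}   # the attribute becomes the key column
    for field in computer:             # each child element becomes a column
        row[field.tag] = field.text
    rows.append(row)

for row in rows:
    print(row)
```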
DTD – Document Type Definition • A DTD is a schema for Xml data • Xml protocols and languages can be standardized with DTD files • A DTD says what elements and attributes are required or optional • Defines the formal structure of the language
DTD – An Example
<?xml version='1.0'?>
<!ELEMENT Basket (Cherry+, (Apple | Orange)*) >
<!ELEMENT Cherry EMPTY>
<!ATTLIST Cherry flavor CDATA #REQUIRED>
<!ELEMENT Apple EMPTY>
<!ATTLIST Apple color CDATA #REQUIRED>
<!ELEMENT Orange EMPTY>
<!ATTLIST Orange location CDATA 'Florida'>
--------------------------------------------------------------------------------
Valid:
<Basket>
  <Cherry flavor='good'/>
  <Apple color='red'/>
  <Apple color='green'/>
</Basket>
Invalid (Apple is missing its required color attribute and appears before Cherry):
<Basket>
  <Apple/>
  <Cherry flavor='good'/>
  <Orange/>
</Basket>
DTD - !ELEMENT
<!ELEMENT Basket (Cherry+, (Apple | Orange)*) >
Name: Basket; Children: (Cherry+, (Apple | Orange)*)
• !ELEMENT declares an element name and what child elements it should have
• Wildcards:
  • * Zero or more
  • + One or more
• Connectors: "," means "followed by" and "|" means "or"
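A content model reads much like a regular expression over the child element names. The sketch below is a hand translation of the Basket model into a Python regex, not a DTD validator; BASKET_MODEL and children_match are hypothetical names, and attributes are not checked.

```python
import re
import xml.etree.ElementTree as ET

# Hand translation of (Cherry+, (Apple | Orange)*).
# Each child name is followed by one space so that names cannot run together.
BASKET_MODEL = re.compile(r"(Cherry )+((Apple|Orange) )*")

def children_match(basket_xml):
    """Return True if the children of the root element follow the Basket model."""
    basket = ET.fromstring(basket_xml)
    names = "".join(child.tag + " " for child in basket)
    return BASKET_MODEL.fullmatch(names) is not None

print(children_match("<Basket><Cherry flavor='good'/><Apple color='red'/></Basket>"))
print(children_match("<Basket><Apple/><Cherry flavor='good'/><Orange/></Basket>"))
```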
DTD - !ATTLIST
<!ATTLIST Cherry flavor CDATA #REQUIRED>
Element: Cherry; Attribute: flavor; Type: CDATA; Flag: #REQUIRED
<!ATTLIST Orange location CDATA #REQUIRED color CDATA 'orange'>
• !ATTLIST defines a list of attributes for an element
• Attributes can be of different types, can be required or not required, and they can have default values.
DTD – Well-Formed and Valid
<?xml version='1.0'?>
<!ELEMENT Basket (Cherry+)>
<!ELEMENT Cherry EMPTY>
<!ATTLIST Cherry flavor CDATA #REQUIRED>
--------------------------------------------------------------------------------
Not Well-Formed (mismatched tag names, unquoted attribute value, unclosed Cherry tag)
<basket>
  <Cherry flavor=good>
</Basket>
Well-Formed but Invalid (Job and Location are not declared in the DTD)
<Job>
  <Location>Home</Location>
</Job>
Well-Formed and Valid
<Basket>
  <Cherry flavor='good'/>
</Basket>
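Well-formedness can be checked by any Xml parser. A minimal sketch, assuming Python's standard-library xml.etree.ElementTree; is_well_formed is a hypothetical helper. This parser does not validate against a DTD, so checking validity would need a validating parser, which is not shown here.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the text parses as Xml, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<basket> <Cherry flavor=good> </Basket>"))
print(is_well_formed("<Job> <Location>Home</Location> </Job>"))
print(is_well_formed("<Basket> <Cherry flavor='good'/> </Basket>"))
```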
XPath – Navigating Xml
• When Xml is stored in a tree, XPath allows you to navigate to different nodes:
<Class>
  <Student>Jeff</Student>
  <Student>Pat</Student>
</Class>
Class
  Student
    Text: Jeff
  Student
    Text: Pat
XPath – Navigating Xml
• An Xml document is similar to a file system, but a path can select more than one node:
//Class/Student
<Class>
  <Student>Jeff</Student>
  <Student>Pat</Student>
</Class>
Class
  Student
    Text: Jeff
  Student
    Text: Pat
Here //Class/Student selects both Student elements.
XPath – Navigating Xml • An XPath expression looks just like a file path • Elements are accessed as /<element>/ • Attributes are accessed as @attribute • Everything that satisfies the path is selected • You can add constraints in brackets [ ] to further refine your selection
XPath – Navigating Xml
<class name='CS 433'>
  <location building='Olin' room='255'/>
  <professor>Johannes Gehrke</professor>
  <ta>Dan Kifer </ta>
  <student_list>
    <student id='999-991'>John Smith</student>
    <student id='999-992'>Jane Doe</student>
  </student_list>
</class>
//class[@name='CS 433']/student_list/student/@id
Starting element: //class; Attribute constraint: [@name='CS 433']; Element path: /student_list/student; Selection: /@id
Selection Result: The attribute nodes containing 999-991 and 999-992
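A sketch of the same selection in Python, assuming the standard-library xml.etree.ElementTree module. It supports only a subset of XPath and cannot select attribute nodes with /@id, so the constraint on the root element and the final attribute step are done in plain Python.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<class name='CS 433'>
  <location building='Olin' room='255'/>
  <professor>Johannes Gehrke</professor>
  <ta>Dan Kifer</ta>
  <student_list>
    <student id='999-991'>John Smith</student>
    <student id='999-992'>Jane Doe</student>
  </student_list>
</class>
""")

ids = []
# Attribute constraint on the starting element: [@name='CS 433']
if doc.tag == "class" and doc.get("name") == "CS 433":
    # Element path student_list/student, then read the id attribute of each match
    ids = [student.get("id") for student in doc.findall("student_list/student")]

print(ids)
```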
XPath - Context • Context – your current focus in an Xml document • Use: /<root>/… when you want to start from the beginning of the Xml document (//<root>/… also works, because // matches <root> at any depth)
XPath - Context
XPath: List/Student (selects both students when the context node is Class)
Class
  Prof
    Text: Gehrke
  Location
    Attr: Olin
  List
    Student
      Text: Jeff
    Student
      Text: Pat
XPath - Context
XPath: Student (selects both students when the context node is List)
Class
  Prof
    Text: Gehrke
  Location
    Attr: Olin
  List
    Student
      Text: Jeff
    Student
      Text: Pat
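The effect of the context node can be sketched in Python with the standard-library xml.etree.ElementTree module, where a relative path is evaluated from the element that findall is called on. The attribute name building is an assumption; the diagram only shows the value Olin.

```python
import xml.etree.ElementTree as ET

class_node = ET.fromstring(
    "<Class>"
    "<Prof>Gehrke</Prof>"
    "<Location building='Olin'/>"   # attribute name is assumed
    "<List><Student>Jeff</Student><Student>Pat</Student></List>"
    "</Class>"
)
list_node = class_node.find("List")

# Same students, reached by different relative paths from different contexts.
from_class = [s.text for s in class_node.findall("List/Student")]
from_list = [s.text for s in list_node.findall("Student")]

# The short path finds nothing from Class, because Student is not a child of Class.
wrong_context = class_node.findall("Student")

print(from_class, from_list, wrong_context)
```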
XPath – Examples
<Basket>
  <Cherry flavor='sweet'/>
  <Cherry flavor='bitter'/>
  <Cherry/>
  <Apple color='red'/>
  <Apple color='red'/>
  <Apple color='green'/>
  …
</Basket>
Select all of the red apples:
//Basket/Apple[@color='red']
XPath – Examples
<Basket>
  <Cherry flavor='sweet'/>
  <Cherry flavor='bitter'/>
  <Cherry/>
  <Apple color='red'/>
  <Apple color='red'/>
  <Apple color='green'/>
  …
</Basket>
Select the cherries that have some flavor:
//Basket/Cherry[@flavor]
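Both bracket constraints can be tried in Python, assuming the standard-library xml.etree.ElementTree module, whose XPath subset includes [@attr] and [@attr='value']. The elided "…" part of the basket is left out of this sketch, and the paths are relative to the Basket element.

```python
import xml.etree.ElementTree as ET

basket = ET.fromstring("""
<Basket>
  <Cherry flavor='sweet'/>
  <Cherry flavor='bitter'/>
  <Cherry/>
  <Apple color='red'/>
  <Apple color='red'/>
  <Apple color='green'/>
</Basket>
""")

red_apples = basket.findall("Apple[@color='red']")   # attribute equals a value
flavored = basket.findall("Cherry[@flavor]")         # attribute is present

print(len(red_apples))
print([cherry.get("flavor") for cherry in flavored])
```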
XPath – Examples
<orchard>
  <tree>
    <apple color='red'/>
    <apple color='red'/>
  </tree>
  <basket>
    <apple color='green'/>
    <orange/>
  </basket>
</orchard>
Select all the apples in the orchard:
//orchard/descendant::apple
(descendant is an axis, written with "::"; the abbreviated form is //orchard//apple)
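A descendant search can be sketched in Python with the standard-library xml.etree.ElementTree module, which accepts the abbreviated ".//" form but not the descendant:: axis syntax.

```python
import xml.etree.ElementTree as ET

orchard = ET.fromstring("""
<orchard>
  <tree>
    <apple color='red'/>
    <apple color='red'/>
  </tree>
  <basket>
    <apple color='green'/>
    <orange/>
  </basket>
</orchard>
""")

# ".//apple" selects apple elements at any depth below the orchard element.
apples = orchard.findall(".//apple")
colors = [apple.get("color") for apple in apples]
print(colors)
```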
Xslt – Transforming Xml
Amazon.com order form:
<single_book_order>
  <title>Databases</title>
  <qty>1</qty>
</single_book_order>
Supplier’s order form:
<form7957>
  <purchase item='book' property='title' value='Databases' quantity='1'/>
</form7957>
Xslt - Extensible Stylesheet Language Transformations • Xslt is a language for transforming or converting one Xml format into another Xml format. • Benefits: • No need to parse or interpret many different Xml formats – they can all be transformed to a single format to facilitate interpretation • Language looks like Xml! (remember, Xml defines languages!)
Xslt – A First Look
Input:
<single_book_order>
  <title>Databases</title>
  <qty>1</qty>
</single_book_order>
Output:
<form7957>
  <purchase item='book' property='title' value='Databases' quantity='1'/>
</form7957>
Stylesheet:
<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>
  <xsl:template match='single_book_order'>
    <form7957><purchase item='book' property='title' value='{title}' quantity='{qty}'/></form7957>
  </xsl:template>
</xsl:stylesheet>
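To make the mapping concrete, here is a hand-written Python equivalent of what this stylesheet does. It is not an Xslt engine (Python's standard library does not include one); the transform function is a hypothetical name, and {title} and {qty} are replaced by reading the text of the matching child elements.

```python
import xml.etree.ElementTree as ET

def transform(order_xml):
    """Build the supplier's form from a single_book_order document."""
    order = ET.fromstring(order_xml)
    form = ET.Element("form7957")
    ET.SubElement(form, "purchase", {
        "item": "book",
        "property": "title",
        "value": order.findtext("title"),    # plays the role of {title}
        "quantity": order.findtext("qty"),   # plays the role of {qty}
    })
    return form

form = transform("<single_book_order><title>Databases</title><qty>1</qty></single_book_order>")
print(ET.tostring(form, encoding="unicode"))
```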
Xslt – Header
• Xslt stylesheets MUST include this outer structure:
<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>
  …
</xsl:stylesheet>
Xslt – Templates • Xslt stylesheets are a collection of templates • Templates are like functions • The body of a template is the output of a transformation
Xslt - Templates
• You define a template with the <xsl:template match=''> instruction
• You call a template with the <xsl:apply-templates select=''> instruction
  1. All elements or attributes that satisfy the select attribute expression are selected.
  2. For each element or attribute that is selected:
     i. A matching template is found in the stylesheet.
     ii. The body of the template is executed.
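The select, match, execute steps can be sketched as a small dispatch table in Python. This is only an analogy under simplifying assumptions: templates match on the element name alone, and TEMPLATES, template, and apply_templates are hypothetical names. A real Xslt processor also has match patterns, priorities, and built-in rules that are not modeled here.

```python
import xml.etree.ElementTree as ET

TEMPLATES = {}   # element name -> function that produces output

def template(match):
    """Register a function as the template for one element name."""
    def register(func):
        TEMPLATES[match] = func
        return func
    return register

def apply_templates(context, select):
    """Select nodes, find the matching template for each, and run its body."""
    output = []
    for node in context.findall(select):     # step 1: select
        rule = TEMPLATES.get(node.tag)       # step 2.i: find a matching template
        if rule is not None:
            output.append(rule(node))        # step 2.ii: execute its body
    return output

@template("student")
def student_rule(node):
    return "Student: " + node.text

student_list = ET.fromstring(
    "<student_list><student>John Smith</student><student>Jane Doe</student></student_list>"
)
result = apply_templates(student_list, "student")
print(result)
```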
Xslt – choose Instruction • <xsl:choose> instruction is similar to a C++ or Java switch statement • <xsl:when test=''> instruction is similar to the case statement • <xsl:otherwise> instruction is similar to the default statement
Xslt – choose Example
Original Xml:
<customer>
  <order id='5'>
    <item><title>Database Management Systems</title></item>
  </order>
</customer>
Xslt Stylesheet:
<xsl:template match='customer'>      <!-- FUNCTION -->
  <xsl:choose>                        <!-- SWITCH -->
    <xsl:when test='order/@id'>       <!-- CASE -->
      <single_book_order>
        <title><xsl:value-of select='order/item/title'/></title>
      </single_book_order>
    </xsl:when>
    <xsl:otherwise>                   <!-- DEFAULT -->
      <single_book_order><fail/></single_book_order>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
Output Xml:
<single_book_order><title>Database Management Systems</title></single_book_order>
Xslt – choose Example 2
Original Xml:
<customer>
  <order>
    <item><title>Database Management Systems</title></item>
  </order>
</customer>
Xslt Stylesheet:
<xsl:template match='customer'>      <!-- FUNCTION -->
  <xsl:choose>                        <!-- SWITCH -->
    <xsl:when test='order/@id'>       <!-- CASE -->
      <single_book_order>
        <title><xsl:value-of select='order/item/title'/></title>
      </single_book_order>
    </xsl:when>
    <xsl:otherwise>                   <!-- DEFAULT -->
      <single_book_order><fail/></single_book_order>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
Output Xml:
<single_book_order><fail/></single_book_order>
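The two choose examples can be mirrored by an if/else in Python. This is a hand-written equivalent rather than an Xslt run; transform_customer is a hypothetical name, and the sketch assumes at most one order element per customer, as in both examples.

```python
import xml.etree.ElementTree as ET

def transform_customer(customer_xml):
    """Mimic the customer template: a title if the order has an id, else <fail/>."""
    customer = ET.fromstring(customer_xml)
    out = ET.Element("single_book_order")
    order = customer.find("order")
    if order is not None and "id" in order.attrib:     # xsl:when test='order/@id'
        ET.SubElement(out, "title").text = customer.findtext("order/item/title")
    else:                                              # xsl:otherwise
        ET.SubElement(out, "fail")
    return out

with_id = transform_customer(
    "<customer><order id='5'><item><title>Database Management Systems</title></item></order></customer>"
)
without_id = transform_customer(
    "<customer><order><item><title>Database Management Systems</title></item></order></customer>"
)
print(ET.tostring(with_id, encoding="unicode"))
print(ET.tostring(without_id, encoding="unicode"))
```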
Xslt – for-each Instruction • <xsl:for-each select='item'> instruction is similar to a foreach iterator or a for loop • The select attribute selects a set of elements from an Xml document
Xslt – if Instruction
• <xsl:if test=''> instruction is similar to an if statement in Java or C++
• The test attribute is the if condition:
  • True
    • statement is true
    • test returns an element or attribute.
  • False
    • statement is false
    • test returns nothing
• There is no 'else', so use the <xsl:choose> instruction in this situation.
Xslt – for-each and if Example
Original Xml:
<basket>
  <apple color='red' condition='yummy'/>
  <apple color='green' condition='wormy'/>
  <apple color='red' condition='crisp'/>
</basket>
Xslt Stylesheet:
<xsl:template match='basket'>                    <!-- FUNCTION -->
  <condition_report>
    <xsl:for-each select='apple'>                <!-- FOR LOOP -->
      <xsl:if test="contains(@color, 'red')">    <!-- IF -->
        <condition><xsl:value-of select='@condition'/></condition>
      </xsl:if>
    </xsl:for-each>
  </condition_report>
</xsl:template>
Output Xml:
<condition_report>
  <condition>yummy</condition>
  <condition>crisp</condition>
</condition_report>
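The same loop and condition, written by hand in Python for comparison. This is an analogy rather than an Xslt run; the substring test with "in" stands in for contains(@color, 'red').

```python
import xml.etree.ElementTree as ET

basket = ET.fromstring(
    "<basket>"
    "<apple color='red' condition='yummy'/>"
    "<apple color='green' condition='wormy'/>"
    "<apple color='red' condition='crisp'/>"
    "</basket>"
)

report = ET.Element("condition_report")
for apple in basket.findall("apple"):            # xsl:for-each select='apple'
    if "red" in apple.get("color", ""):          # xsl:if test="contains(@color, 'red')"
        ET.SubElement(report, "condition").text = apple.get("condition")

conditions = [condition.text for condition in report]
print(conditions)
```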
Xslt – Other Information • W3C is standardizing XPath and Xslt: http://www.w3.org/TR/xslt.html http://www.w3.org/TR/xpath.html • Lots of books. Here’s a suggestion: D. Martin et al. Professional Xml. Wrox Press, 2000.
What’s Next? • XSchema • DTDs, but written in XML • Will replace DTDs • XQuery • Fully declarative XML query language • Will be able to do anything you can do with XPath and XSLT, plus a LOT more
Xml in Commercial Databases • Many Xml parsers and XSLT engines are available • Microsoft, IBM, and Oracle (among others) are adding native Xml support • Native Xml databases
URL Tutorials http://msdn.microsoft.com/xml/tutorial/default.asp http://www.ils.unc.edu/~kempa/inls259/xml/ http://www.geocities.com/SiliconValley/Peaks/5957/10minxml.html