
XML, XSD, and XSL

Bryan Hogan, IBM


Presentation Transcript


  1. XML, XSD, and XSL • Bryan Hogan, IBM

  2. Terminology
     • World Wide Web Consortium (W3C): develops interoperable technologies (specifications, guidelines, software, and tools) to lead the Web to its full potential as a forum for information, commerce, communication, and collective understanding (www.w3.org).
     • Document Object Model (DOM): a W3C standard API that describes mechanisms for software developers and Web script authors to access and manipulate parsed XML (and HTML) content. The DOM is both platform-neutral and language-neutral.
     • Document Type Definition (DTD): a specification of the elements and attributes that are permitted in an XML document.
     • XML Schema: a specification of the elements and attributes that are permitted in an XML document, along with the datatypes for these artifacts.

  3. Terminology (continued)
     • Well-formed XML document: an XML document that conforms to basic XML rules.
     • Valid XML document: a well-formed XML document that conforms to the rules specified in a DTD.
     • Simple API for XML (SAX): a standard interface for event-based XML parsing.
     • Extensible Stylesheet Language Transformations (XSLT): a language for transforming XML documents via the application of stylesheets.

  4. What is XML? • eXtensible Markup Language • Markup language for describing data • Simpler than its predecessor, SGML (Standard Generalized Markup Language) • More versatile than HTML (HyperText Markup Language) • Self-describing: XML documents can describe other XML documents (e.g., XML Schema) • "An open standard for defining and sharing data across diverse network topologies" (Mark Weitzel, IBM)

  5. Why use XML? • XML data representation is human-readable, application-neutral, and language-neutral enabling universal interchange of data • XML documents provide an intuitive mechanism for initializing structured data within an application. • XML standard is open; therefore, costs are nominal

  6. HTML and XML side by side
     HTML:
       <table border cellspacing=0 cellpadding=5>
         <tr> <th>Team name</th> <th>Score</th> </tr>
         <tr> <td>Clemson</td> <td>15</td> </tr>
         <tr> <td>NCSU</td> <td>17</td> </tr>
       </table>
     XML:
       <football_game>
         <home> <school>NCSU</school> <score>17</score> </home>
         <visitor> <school>Clemson</school> <score>15</score> </visitor>
       </football_game>

  7. XML document syntax
     • Element start/end tags
         <tag1></tag1> or <tag1/>
         <tag1></TAG1>      <!-- Syntax error: XML is case sensitive -->
     • Attributes
         <tag1 attribute1="testValue" />
         <tag1 enabled />   <!-- Syntax error: allowed in HTML, not in XML -->
     • Comments
         <!-- This is an XML comment -->
     • Entity references
         <tag1 attr1="&Entity1;">
     • Processing instructions (the XML declaration shown here uses the same <? ... ?> syntax)
         <?xml version="1.0"?>

  8. XML document syntax
     • Character data sections (CDATA)
         <![CDATA[<tag1>test</tag1>]]>
     • Document type declarations
         <!DOCTYPE XmlMappingSpec SYSTEM "abtxmap.dtd" >
         <!DOCTYPE XmlMappingSpec SYSTEM "abtxmap.dtd" [
           <!ENTITY entity1 "testValue1" >
           <!ENTITY entity2 "testValue2" >
         ]>
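
The syntax rules on the two slides above can be checked with any XML parser. The following is a minimal sketch in Python (not part of the original presentation) that uses the standard library's non-validating parser; the helper name `is_well_formed` and the sample strings are assumptions made for illustration.

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as well-formed XML (no DTD validation)."""
    try:
        xml.dom.minidom.parseString(text)
        return True
    except ExpatError:
        return False


# Sample snippets modeled on the slide examples.
samples = {
    "matched tags": "<tag1></tag1>",
    "empty-element tag": "<tag1/>",
    "case mismatch": "<tag1></TAG1>",
    "attribute without value": "<tag1 enabled />",
    "CDATA section": "<tag1><![CDATA[<tag1>test</tag1>]]></tag1>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

The case-mismatch and valueless-attribute samples are rejected, while the markup inside the CDATA section is treated as plain character data.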

  9. DTD syntax and terminology
     • Element type declarations
         <!ELEMENT Street (#PCDATA) >
       Usage:
         <Street>29 Oak Street</Street>
     • Attribute list declarations
         <!ATTLIST name firstName CDATA #REQUIRED >
         <!ATTLIST car maker (Ford | GM | BMW) #REQUIRED >
       Usage:
         <name firstName="George" />
         <car maker="Ford" />
         <!-- Not valid. A validating parser will flag an error. -->
         <car maker="Mercury" />

  10. DTD syntax and terminology
      • Entity declarations
          <!ENTITY IBM "International Business Machines" >
          <!ENTITY testDoc SYSTEM "http://mywebsite/testDoc.xml" >
        Usage:
          <Company>&IBM;</Company>
          <!-- Inline the contents of the testDoc ENTITY -->
          <root>&testDoc;</root>
      • Parameter entity declarations
          <!ENTITY % code_format "CDATA">
      • Notation declarations
          <!NOTATION Find_Help SYSTEM "Help System" >
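
As an illustration of the internal entity declared above, here is a small Python sketch (not from the original slides). It assumes the entity is declared in the document's internal DTD subset, which the standard library's expat-based parser reads even though it does not validate.

```python
import xml.etree.ElementTree as ET

# The IBM entity is declared in the internal subset of the DOCTYPE.
document = """<?xml version="1.0"?>
<!DOCTYPE Company [
  <!ENTITY IBM "International Business Machines">
]>
<Company>&IBM;</Company>"""

company = ET.fromstring(document)

# The parser replaces &IBM; with the entity's replacement text.
print(company.text)
```

External entities such as `testDoc` are a different matter: whether a parser fetches them depends on its configuration, and this sketch does not attempt it.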

  11. DTD ELEMENT examples
      • An element with multiple required subelements.
          <!ELEMENT main (sub1, sub2, sub3) >
      • A subelement (sub2) that occurs once or not at all.
          <!ELEMENT main (sub1, sub2?) >
      • A subelement (sub2) that occurs one or more times.
          <!ELEMENT main (sub1, sub2+) >
      • A subelement (sub2) that occurs zero or more times.
          <!ELEMENT main (sub1, sub2*) >
      • An element that contains one of multiple elements.
          <!ELEMENT main (choice1 | choice2 | choice3) >

  12. DTD ATTLIST examples
      • #REQUIRED default indicates that an attribute must be specified in the XML document instance.
          <!ATTLIST main attr1 CDATA #REQUIRED >
      • #IMPLIED default indicates that an attribute is not required by the XML document instance.
          <!ATTLIST main attr1 CDATA #IMPLIED >
      • #FIXED default indicates that an attribute has a fixed value, and no other values are acceptable. Since the attribute value is fixed, it does NOT need to be specified in an instance document.
          <!ATTLIST main attr1 CDATA #FIXED "FixedValue" >
      • Default value supplied. The default value will be used only if no value is supplied by the XML document instance.
          <!ATTLIST main attr1 CDATA "DefaultValue" >
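
The effect of the four attribute defaults can be observed with a short Python sketch (an illustration, not part of the original deck). It assumes the declarations sit in the internal DTD subset and that the parser applies declared defaults, as the standard library's expat-based parser does by default. The attribute names `required`, `optional`, `fixed`, and `defaulted` are hypothetical stand-ins for the slide's `attr1`.

```python
import xml.etree.ElementTree as ET

# One attribute per kind of default; only 'required' is given in the instance.
document = """<!DOCTYPE main [
  <!ELEMENT main EMPTY>
  <!ATTLIST main
      required  CDATA #REQUIRED
      optional  CDATA #IMPLIED
      fixed     CDATA #FIXED "FixedValue"
      defaulted CDATA "DefaultValue">
]>
<main required="given"/>"""

main = ET.fromstring(document)

# The #FIXED and defaulted values are filled in by the parser;
# the #IMPLIED attribute is simply absent.
for name in ("required", "optional", "fixed", "defaulted"):
    print(name, "=", main.get(name))
```

Because this parser is non-validating, omitting the `required` attribute would not raise an error here; a validating parser is needed to enforce #REQUIRED.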

  13. Parsing XML (DOM) • DOM parsers read XML into a tree structure of nodes. Node types are shown below: • Document • DocumentFragment • DocumentType • EntityReference • Element • Attr • ProcessingInstruction • Comment • Text • CDATASection • Entity • Notation

  14. DOM Element API • getAttribute, setAttribute, removeAttribute, getAttributeNode, setAttributeNode, removeAttributeNode, hasAttribute • getAttributeNS, setAttributeNS, removeAttributeNS, getAttributeNodeNS, removeAttributeNodeNS, hasAttributeNS • getElementsByTagName, getElementsByTagNameNS
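
To give a feel for the DOM node tree and the Element API listed above, here is a minimal sketch using Python's `xml.dom.minidom` (an assumption made for illustration; the presentation itself uses the VAST Smalltalk parser). The `season` attribute is hypothetical.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# The football_game document from the HTML/XML comparison slide.
document = parseString(
    "<football_game>"
    "<home><school>NCSU</school><score>17</score></home>"
    "<visitor><school>Clemson</school><score>15</score></visitor>"
    "</football_game>"
)

# The Document node owns a single root Element node.
root = document.documentElement
print(document.nodeType == Node.DOCUMENT_NODE, root.nodeType == Node.ELEMENT_NODE)

# getElementsByTagName returns matching descendants in document order;
# each school element holds one Text node child.
schools = [el.firstChild.data for el in root.getElementsByTagName("school")]
print(schools)

# Attribute access through the Element API ('season' is a made-up attribute).
root.setAttribute("season", "2003")
season = root.getAttribute("season")
root.removeAttribute("season")
print(season, root.hasAttribute("season"))
```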

  15. Parsing XML (SAX) • SAX parsers generate parsing events that are processed by handlers in an application program. Parsers allow users to plug in custom implementations of the SAX interfaces. The SAX 2.0 interfaces are: • Attributes • ContentHandler • DTDHandler • EntityResolver • ErrorHandler • Locator • XMLFilter • XMLReader

  16. SAX ContentHandler interface • characters • endDocument • endElement • endPrefixMapping • ignorableWhitespace • processingInstruction • setDocumentLocator • skippedEntity • startDocument • startElement • startPrefixMapping
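
The event-driven style of SAX can be sketched with Python's `xml.sax` module, whose `ContentHandler` class uses the same method names as the interface listed above. The `EventLogger` class is a hypothetical handler written for this illustration; it overrides only a few of the callbacks.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Records the parsing events it receives, in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append("startDocument")

    def startElement(self, name, attrs):
        self.events.append(f"startElement {name}")

    def characters(self, content):
        # Skip whitespace-only text to keep the log short.
        if content.strip():
            self.events.append(f"characters {content.strip()}")

    def endElement(self, name):
        self.events.append(f"endElement {name}")

    def endDocument(self):
        self.events.append("endDocument")


handler = EventLogger()
xml.sax.parseString(
    b"<home><school>NCSU</school><score>17</score></home>", handler
)

for event in handler.events:
    print(event)
```

Unlike a DOM parser, nothing is kept in memory beyond what the handler chooses to store, which is why SAX suits large documents.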

  17. VAST XML parser
      • XML 1.0 specification: http://www.w3.org/TR/1998/REC-xml-19980210
      • DOM level-2 core interfaces: http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113
      • SAX 2.0: http://www.saxproject.org/

  18. Wedding planner DTD
      <!-- 3/10/2001 WildAndWackyWeddings.com retains information and performs billing
           for wedding planners. All wedding planners must provide records in the
           format specified by this DTD. -->
      <!ELEMENT WeddingPlanner (Address,PhoneNumber,Weddings)>
      <!ATTLIST WeddingPlanner Name NMTOKEN #REQUIRED id ID #REQUIRED>
      <!ELEMENT WeddingPlanners (WeddingPlanner*) >
      <!ENTITY % AddressMembers 'Street,City,State,Zip' >
      <!ELEMENT Address (%AddressMembers;)>
      <!ELEMENT PhoneNumber (#PCDATA) >
      <!ELEMENT Street (#PCDATA) >
      <!ELEMENT City (#PCDATA) >
      <!ELEMENT State (#PCDATA) >
      <!ELEMENT Zip (#PCDATA) >
      <!ELEMENT Weddings (Wedding)* >
      <!ELEMENT Wedding (Bride,Groom,Date,Time,CeremonyLocation,ReceptionLocation,Caterer,NumberOfGuests,TotalFee,BillingAddress)>
      <!ATTLIST Wedding id ID #REQUIRED >
      <!ELEMENT Bride (#PCDATA) >
      <!ELEMENT Groom (#PCDATA) >
      <!ELEMENT BillingAddress (%AddressMembers;) >
      <!ELEMENT CeremonyLocation (FacilityName,Address) >
      <!ELEMENT ReceptionLocation (FacilityName,Address) >
      <!ELEMENT Date (#PCDATA) >
      <!ELEMENT Time (#PCDATA) >
      <!ELEMENT Caterer (#PCDATA) >
      <!ELEMENT NumberOfGuests (#PCDATA) >
      <!ELEMENT TotalFee (#PCDATA) >
      <!ELEMENT FacilityName (#PCDATA) >

  19. Valid XML document
      <?xml version="1.0"?>
      <!DOCTYPE WeddingPlanner SYSTEM "wedding.dtd" >
      <WeddingPlanner Name="J-Lo" id="Planner_1" >
        <Address>
          <Street>29 Oak St.</Street>
          <City>Raleigh</City>
          <State>NC</State>
          <Zip>99999</Zip>
        </Address>
        <PhoneNumber>555-4343</PhoneNumber>
        <Weddings>
          <Wedding id="Ghezzo.G">
            <!-- Detailed wedding information removed ... -->
          </Wedding>
        </Weddings>
      </WeddingPlanner>

  20. VAST XML parser example
      "Validating parser used to read well-formed XML and verify that the contents conform to the DTD referenced in the XML."
      | domDocument domElement |
      domDocument := AbtXmlDOMParser newValidatingParser parseURI: 'd:\ncsu_2003\wedding1.xml'.
      domElement := domDocument getElementById: 'Planner_1'.

      "Non-validating parser used to read well-formed XML data."
      | domDocument domElements |
      domDocument := AbtXmlDOMParser newNonValidatingParser parseURI: 'd:\ncsu_2003\wedding1.xml'.
      domElements := domDocument getElementsByTagName: 'Address'.

  21. XML namespaces An XML namespace is a collection of names, identified by a URI reference, which are used in XML documents as element types and attribute names. In an XML instance document, items are namespace qualified using a namespace prefix. The reserved word xmlns is used to associate an arbitrary namespace prefix with the actual namespace. Items in the document instance are prefixed to identify the namespace containing their definition.

  22. XML namespace example
      <?xml version="1.0"?>
      <vastwsd:deployment targetNamespace="urn:SstWSInsurancePolicyInterface"
          xmlns:vastwsd="urn:VASTWebServiceDeployment600"
          xmlns:vast="Smalltalk"
          xmlns:swsipi="urn:Test">
        <services>
          <service name="SstWSInsurancePolicyInterface" namespace="urn:Test">
            <serviceInterfaceClass>SstWSService</serviceInterfaceClass>
            <provider type="swsipi:TestProvider">
              <vast:provider className="TestClass" creationMethod="new"/>
            </provider>
          </service>
        </services>
      </vastwsd:deployment>
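
How a parser resolves the prefixes in the example above can be seen with a short Python sketch using `xml.etree.ElementTree` (an illustration, not part of the original slides). The document is a trimmed copy of the slide's example, and the prefix `v` in the lookup is an arbitrary choice: only the namespace URI matters, not the prefix used in the source document.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the deployment document from the slide.
document = """<vastwsd:deployment
    xmlns:vastwsd="urn:VASTWebServiceDeployment600"
    xmlns:vast="Smalltalk">
  <services>
    <service name="SstWSInsurancePolicyInterface">
      <vast:provider className="TestClass" creationMethod="new"/>
    </service>
  </services>
</vastwsd:deployment>"""

root = ET.fromstring(document)

# ElementTree reports qualified names as {namespace-URI}local-name.
print(root.tag)  # {urn:VASTWebServiceDeployment600}deployment

# Unprefixed elements are in no namespace here, since no default
# namespace (xmlns="...") is declared.
print(root.find("services").tag)  # services

# Look up a namespaced element with a locally chosen prefix.
provider = root.find(".//v:provider", {"v": "Smalltalk"})
print(provider.get("className"))  # TestClass
```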

  23. XML schema • XML schema improves on XML DTD • XML schemas are coded in XML • XML schema includes type information allowing object models to be represented • The W3C XML Schema Primer is a great resource for basic information about XML schema. (http://www.w3.org/TR/xmlschema-0)

  24. <schema> • The top-level element of an XML schema document. The schema element typically includes the namespace associations required by the schema.
      <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
          xmlns:tns="http://schemas.xmlsoap.org/soap/envelope/"
          targetNamespace="http://schemas.xmlsoap.org/soap/envelope/" >

  25. <element> • An "instance" of a schema type. An element in XML schema is much like a variable declaration in typed languages like Java.
      <xsd:element name="field1" type="xsd:QName" />
      <xsd:element name="field2" type="xsd:string" minOccurs="0" />
      <xsd:element name="field3" type="xsd:string" maxOccurs="unbounded" />
      <xsd:element name="field4" type="xsd:QName" nillable="true" />
      <xsd:element name="field5" type="xsd:string" default="TestValue" />
      <xsd:element ref="tns:field1" />   <!-- a reference to a global element takes no name of its own -->
      <xsd:element name="field7" type="xsd:string" form="qualified" />

  26. <attribute> • Used to represent simple values associated with an XML element. Items represented as attributes cannot contain other attributes or elements. Attributes are optional unless declared with use="required"; minOccurs does not apply to attributes.
      <xsd:attribute name="field1" type="xsd:QName" />
      <xsd:attribute name="field2" type="xsd:string" use="optional" />
      <myElement field1="tns:FooBar" field2="testString" />

  27. <simpleType> • Used to describe the content of XML elements that contain simple data, but no subelements or attributes. Below is a list of simpleTypes that are defined in the base XML schema (http://www.w3.org/2001/XMLSchema). Some of the base types are derived from other types. string, normalizedString, token, byte, unsignedByte, base64Binary, hexBinary, integer, positiveInteger, negativeInteger, nonNegativeInteger, nonPositiveInteger, int, unsignedInt, long, unsignedLong, short, unsignedShort, decimal, float, double, boolean, dateTime, duration, date, gMonth, gYear, gDay, gMonthDay, Name, QName, NCName, anyURI, language, ID, IDREF, IDREFS, ENTITY, ENTITIES, NOTATION, NMTOKEN, NMTOKENS

  28. <restriction> • Used to define a new schema type by supplying constraints (restrictions) for an existing schema type.
      <xsd:simpleType name="customInteger" >
        <xsd:restriction base="xsd:integer" >
          <xsd:minInclusive value="100" />
          <xsd:maxInclusive value="1000" />
        </xsd:restriction>
      </xsd:simpleType>
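
To make the effect of the two facets concrete, the following Python sketch hand-codes the check that a schema validator would perform for `customInteger`. It is an illustration under stated assumptions, not a schema processor: the function name `is_custom_integer` is hypothetical, and the regular expression approximates the lexical form of `xsd:integer` (optional sign followed by decimal digits, with surrounding whitespace collapsed).

```python
import re

# Approximate lexical form of xsd:integer: optional sign, then digits.
_INTEGER = re.compile(r"[+-]?[0-9]+")


def is_custom_integer(lexical: str) -> bool:
    """Check a string against the customInteger restriction (100..1000 inclusive)."""
    text = lexical.strip(" \t\r\n")
    if not _INTEGER.fullmatch(text):
        return False
    value = int(text)
    # minInclusive and maxInclusive both admit the boundary values.
    return 100 <= value <= 1000


for candidate in ("100", "1000", "99", "1001", "12.5"):
    print(candidate, is_custom_integer(candidate))
```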

  29. <extension> • Used to derive a new schema type by extending the properties of an existing type (analogous to subclass).

  30. <extension> example
      <complexType name="Address">
        <sequence>
          <element name="name" type="string"/>
          <element name="street" type="string"/>
          <element name="city" type="string"/>
        </sequence>
      </complexType>
      <complexType name="USAddress">
        <complexContent>
          <extension base="ipo:Address">
            <sequence>
              <element name="state" type="ipo:USState"/>
              <element name="zip" type="positiveInteger"/>
            </sequence>
          </extension>
        </complexContent>
      </complexType>

  31. <enumeration> • Used to provide a list of valid values for a restricted simple type.
      <xsd:simpleType name="USState">
        <xsd:restriction base="xsd:string">
          <xsd:enumeration value="AK"/>
          <xsd:enumeration value="AL"/>
          <xsd:enumeration value="AR"/>
          <!-- and so on ... -->
        </xsd:restriction>
      </xsd:simpleType>

  32. <complexType> • Used to describe XML elements that can contain attributes and subelements.
      <xsd:complexType name="deploymentType">
        <xsd:sequence>
          <xsd:element minOccurs="0" name="container" type="tns:containerType" />
          <xsd:element minOccurs="0" name="services" type="tns:servicesType" />
        </xsd:sequence>
        <xsd:attribute name="targetNamespace" type="xsd:anyURI" />
      </xsd:complexType>

  33. <sequence> • Used to specify a group of elements that must appear in an instance document in the same order that they are defined in the schema type definition.
      <xsd:sequence>
        <xsd:element name="field1" type="xsd:string" />
        <xsd:element name="field2" type="xsd:string" />
      </xsd:sequence>

  34. <choice> • Used to specify that one element or element group out of potentially many will be included in a document instance.
      <xsd:choice>
        <xsd:element name="field1" type="xsd:string" />
        <xsd:element name="field2" type="xsd:string" />
      </xsd:choice>

  35. <all> • Used to specify a group of elements that may appear in any order in an instance document. Each element appears at most once, and it must appear unless it is declared with minOccurs="0".
      <xsd:all>
        <xsd:element name="field1" type="xsd:string" />
        <xsd:element name="field2" type="xsd:string" />
      </xsd:all>
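
The difference between the three content models above can be sketched in Python by comparing the list of child element names found in an instance against the names declared in the schema. This is a simplified, hand-written illustration rather than a schema validator: the three function names are hypothetical, and the sketch assumes every declared element has the default occurrence of exactly once and that declared names are unique.

```python
def matches_sequence(children, declared):
    """sequence: every declared element, in the declared order."""
    return list(children) == list(declared)


def matches_choice(children, declared):
    """choice: exactly one of the declared elements."""
    return len(children) == 1 and children[0] in declared


def matches_all(children, declared):
    """all: every declared element exactly once, in any order."""
    return sorted(children) == sorted(declared)


declared = ["field1", "field2"]
reordered = ["field2", "field1"]

print(matches_sequence(reordered, declared))   # order matters
print(matches_all(reordered, declared))        # order does not matter
print(matches_choice(["field2"], declared))    # one alternative chosen
```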

  36. <import> • Used to specify a namespace that is referenced by one or more declarations in the schema being defined. An import may specify the schemaLocation from which the namespace definitions can be retrieved.
      <xsd:import namespace="urn:MyOtherNamespace" schemaLocation="http://www.myserver.com/otherns.xsd" />

  37. <include> • Used to pull in definitions from an external resource. The definitions must be in the same namespace as the schema where the <include> is specified.
      <xsd:include schemaLocation="moredefinitions.xsd" />

  38. Other schema tags • annotation, appinfo, attributeGroup, complexContent, documentation, field, group, key, keyref, length, list, maxInclusive, maxLength, minInclusive, minLength, pattern, redefine, selector, simpleContent, union, unique

  39. Special attributes (xsi:nil) • Used in an XML instance document to indicate that the value of an element is nil.
      Given the following schema definition:
        <element name="myValue" type="xsd:string" nillable="true" />
      <!-- The presumed value of the 'myValue' element below is the empty string -->
        <myValue></myValue>
      <!-- The presumed value of the 'myValue' element below is nil -->
        <myValue xsi:nil="true"></myValue>
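
An application reading such a document has to tell the two cases apart itself when it is not using a schema-aware parser. Below is a minimal Python sketch of that check; the helper `read_value` and the wrapping `root` element are assumptions added for illustration, while the namespace URI is the standard XML Schema instance namespace.

```python
import xml.etree.ElementTree as ET

# Qualified name of the xsi:nil attribute in ElementTree's {uri}local notation.
XSI_NIL = "{http://www.w3.org/2001/XMLSchema-instance}nil"


def read_value(element):
    """Return None for a nil element, otherwise its text (empty string if none)."""
    if element.get(XSI_NIL) in ("true", "1"):
        return None
    return element.text or ""


doc = ET.fromstring(
    '<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
    "<myValue></myValue>"
    '<myValue xsi:nil="true"></myValue>'
    "</root>"
)

values = [read_value(el) for el in doc.findall("myValue")]
print(values)  # ['', None]
```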

  40. Special attributes (xsi:type) • Used to enable usage of a derived type where the base type is expected.
      <!-- Schema definition for billTo specifies type 'Address' -->
      <billTo xsi:type="ipo:USAddress">
        <name>Robert Smith</name>
        <street>8 Oak Avenue</street>
        <city>Old Town</city>
        <state>PA</state>
        <zip>95819</zip>
      </billTo>

  41. Wedding planner schema
      <!-- Schema for wedding planner app. This schema is directly derived from the wedding.dtd file -->
      <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
          xmlns:tns="urn:WeddingPlanner"
          targetNamespace="urn:WeddingPlanner" >
        <xsd:complexType name="Wedding">
          <xsd:sequence>
            <xsd:element name="Bride" type="xsd:string" />
            <xsd:element name="Groom" type="xsd:string" />
            <xsd:element name="Date" type="xsd:string" />
            <xsd:element name="Time" type="xsd:string" />
            <xsd:element name="CeremonyLocation" type="tns:Location" />
            <xsd:element name="ReceptionLocation" type="tns:Location" />
            <xsd:element name="Caterer" type="xsd:string" />
            <xsd:element name="NumberOfGuests" type="xsd:int" />
            <xsd:element name="TotalFee" type="xsd:decimal" />
            <xsd:element name="BillingAddress" type="tns:Address"/>
          </xsd:sequence>
          <xsd:attribute name="Name" type="xsd:string" />
        </xsd:complexType>
        <!-- Remainder of schema not included in order to save space -->
      </xsd:schema>

  42. XSL (Extensible Stylesheet Language) • Enables separation of data content and format • Enables standardized style of presentation • Customizable based upon individual preferences • XSL stylesheets are declarative. Each instruction tells the processor "what" to perform, in contrast to imperative languages, which tell the processor "how" to perform it.

  43. XML Transformations • Great For Interoperability Problems • Transforms Data From A Source Data Format To A Target Format • Source Is XML, Target Is Some Kind Of Text Format • Target Can Be XML • XSLT Is Used For Transformations • Can Exploit Coalescing Around A Standard

  44. XML Transformations • XML + XSLT = HTML • XML + XSLT = XHTML • XML + XSLT = Text • XML + XSLT = XML • XML + XSLT = SVG (Picture) • XML + XSLT = Whatever (Non-binary)

  45. XML Transformations

  46. XSL element names • xsl:stylesheet • xsl:template • xsl:apply-templates • xsl:comment • xsl:processing-instruction (xsl:pi in early working drafts) • xsl:element • xsl:attribute • xsl:value-of • xsl:for-each • xsl:if • xsl:choose • xsl:when • xsl:otherwise • xsl:copy

  47. XSL example
      <!-- Only part of this XSL stylesheet is shown here due to space constraints -->
      <xsl:template match="football_game">
        <html>
          <head><title>Game results</title></head>
          <body bgcolor="#ffffff" text="#000000">
            <table width="100%" border="1" cellspacing="0" cellpadding="4">
              <tr>
                <th align="left">Team name</th>
                <th align="left">Team score</th>
              </tr>
              <xsl:apply-templates select="home"/>
              <xsl:apply-templates select="visitor"/>
            </table>
          </body>
        </html>
      </xsl:template>
      <!-- Paths are relative to the matched home element -->
      <xsl:template match="home">
        <tr>
          <td><b><xsl:value-of select="school"/></b></td>
          <td><b><xsl:value-of select="score"/></b></td>
        </tr>
      </xsl:template>
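
For contrast with the declarative stylesheet above, here is an imperative sketch in Python that produces a similar HTML table from the football_game document. It is not XSLT and does not use an XSLT processor (Python's standard library does not include one); the function name `game_to_html` is hypothetical, and the output is simplified (no head element, bold markup, or styling attributes beyond the border).

```python
import xml.etree.ElementTree as ET


def game_to_html(game_xml: str) -> str:
    """Build an HTML results table step by step from a football_game document."""
    game = ET.fromstring(game_xml)

    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    table = ET.SubElement(body, "table", border="1")

    # Header row, written out explicitly.
    header = ET.SubElement(table, "tr")
    ET.SubElement(header, "th").text = "Team name"
    ET.SubElement(header, "th").text = "Team score"

    # One row per team: the loop spells out "how" where the stylesheet
    # only states "what" each matched element turns into.
    for side in ("home", "visitor"):
        team = game.find(side)
        row = ET.SubElement(table, "tr")
        ET.SubElement(row, "td").text = team.findtext("school")
        ET.SubElement(row, "td").text = team.findtext("score")

    return ET.tostring(html, encoding="unicode")


sample = (
    "<football_game>"
    "<home><school>NCSU</school><score>17</score></home>"
    "<visitor><school>Clemson</school><score>15</score></visitor>"
    "</football_game>"
)
result = game_to_html(sample)
print(result)
```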

  48. Open source downloads • Xalan XSLT processor (for Java or C++): http://xml.apache.org/ • Xerces XML validating parser (for Java or C++): http://xml.apache.org/
