G52IWS: Extensible Markup Language (XML)

Learn about XML, its syntax, standards, and document structure. Understand elements, tags, attributes, and entities used in XML. Explore how DTDs and XML Schema define constraints in XML documents.

kgilmore

Presentation Transcript


  1. G52IWS: Extensible Markup Language (XML) Chris Greenhalgh

  2. Contents • What is XML • XML standards • XML Syntax • DTDs • XML Schema See “Developing Java Web Services” chapter 8, first part and G51WPS notes on XML; see W3C standards

  3. What is XML • Text-based language for structured data encoding • Tree-structured • Common abstract syntax • any XML document can be read by a common parser • DTDs or XML-Schema define particular application-specific constraints • E.g. new tags, allowed structures & datatypes

  4. XML standards • Created in 1996 • Derived from SGML markup language • Managed by the W3C (www.w3.org) XML group(s) since 1998 • http://www.w3.org/XML/Core/#Publications inc: • Extensible Markup Language (XML) 1.0 (Fourth Edition) • http://www.w3.org/XML/Schema#dev inc: • XML Schema Part 0: Primer • http://www.w3.org/XML/Query/#specs inc: • XML Path Language (XPath) 2.0 • …

  5. XML Example (no DTD)
     <?xml version="1.0" ?>
     <Friends>
       <Person>
         <Name>Jane Doe</Name>
         <Age>21</Age>
         <Body>
           <Weight Unit="lbs">126</Weight>
           <Height Unit="inches">62</Height>
         </Body>
         <Trust trusted="yes"/>
       </Person>
       <Person>
         <Name>John Doe</Name>
         <Age>26</Age>
         <Trust trusted="no"/>
       </Person>
     </Friends>
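
Because every XML document shares the same abstract syntax, a generic parser can read the Friends example without knowing its vocabulary in advance. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module (the slides do not prescribe a parser, and `summarize` is a hypothetical helper named here for illustration):

```python
import xml.etree.ElementTree as ET

FRIENDS_XML = """<?xml version="1.0" ?>
<Friends>
  <Person>
    <Name>Jane Doe</Name>
    <Age>21</Age>
    <Body>
      <Weight Unit="lbs">126</Weight>
      <Height Unit="inches">62</Height>
    </Body>
    <Trust trusted="yes"/>
  </Person>
  <Person>
    <Name>John Doe</Name>
    <Age>26</Age>
    <Trust trusted="no"/>
  </Person>
</Friends>"""


def summarize(xml_text):
    """Parse the document and pull a few fields out of each Person element."""
    root = ET.fromstring(xml_text)
    people = []
    for person in root.findall("Person"):
        people.append({
            "name": person.findtext("Name"),
            "age": int(person.findtext("Age")),
            # The Trust element is empty; its information is in an attribute.
            "trusted": person.find("Trust").get("trusted") == "yes",
            # Body is optional in this example, so this may be None.
            "has_body": person.find("Body") is not None,
        })
    return root.tag, people


root_tag, people = summarize(FRIENDS_XML)
print(root_tag, people)
```

The same parser would accept any other well-formed document; only the element names passed to `findall` and `findtext` are specific to this example.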

  6. XML document structure • Prolog • Document type declaration • Optional • Includes element declarations • Root element • With nested elements • With optional attributes • With optional text content (incl. CDATA sections) • Interleaved with optional comments and processing instructions

  7. XML Syntax Contents • Prolog • Root • Processing instructions • Comments • Names • Tags • Elements • Content and CDATA sections • Attributes • Entities • Namespaces

  8. Prolog • An XML document starts with a prolog, e.g. <?xml version="1.0" ?> or <?xml version="1.0" encoding="ISO-8859-1" ?> (the XML declaration is optional in XML 1.0, but recommended) • Known start allows multi-byte and byte-order encodings to be identified • Allows specific encoding to be specified • Defaults to Unicode (UTF-8, or UTF-16 when a byte-order mark is present)
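
The effect of the encoding declaration can be seen by parsing raw bytes. This is a small sketch, assuming Python's xml.etree.ElementTree; the name "Zoé" is an illustrative value chosen because "é" is one byte in ISO-8859-1 and two bytes in UTF-8:

```python
import xml.etree.ElementTree as ET

# The same character data, serialised in two different encodings.
latin1_doc = '<?xml version="1.0" encoding="ISO-8859-1" ?><Name>Zo\u00e9</Name>'.encode("iso-8859-1")
utf8_doc = '<?xml version="1.0" ?><Name>Zo\u00e9</Name>'.encode("utf-8")

# Both parse to the same text: the first uses the declared encoding,
# the second falls back to the UTF-8 default.
name_latin1 = ET.fromstring(latin1_doc).text
name_utf8 = ET.fromstring(utf8_doc).text

# ISO-8859-1 bytes WITHOUT a matching declaration are read as UTF-8,
# and the lone 0xE9 byte is not valid UTF-8, so parsing fails.
undeclared = '<?xml version="1.0" ?><Name>Zo\u00e9</Name>'.encode("iso-8859-1")
try:
    ET.fromstring(undeclared)
    undeclared_parsed = True
except ET.ParseError:
    undeclared_parsed = False

print(name_latin1, name_utf8, undeclared_parsed)
```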

  9. Root • Every XML document has exactly one “top”-level or root element, e.g. <?xml version="1.0" ?> <Friends> … </Friends> • But not e.g. <?xml version="1.0" ?> <Friends> … </Friends> <Friends> … </Friends>
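
A parser enforces the single-root rule as part of well-formedness checking. A minimal sketch, assuming Python's xml.etree.ElementTree (`is_well_formed` is a hypothetical helper, not part of any library):

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as a well-formed XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


one_root = '<?xml version="1.0" ?><Friends></Friends>'
two_roots = '<?xml version="1.0" ?><Friends></Friends><Friends></Friends>'

one_root_ok = is_well_formed(one_root)
two_roots_ok = is_well_formed(two_roots)
print(one_root_ok, two_roots_ok)
```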

  10. Processing instructions • Provide information for XML processing application(s) • Are of the form: <?target instructions?> • The XML declaration in the document prolog uses the same form: <?xml version="1.0" ?>

  11. Comments • Used for documentation • Are of the form: <!-- some comment --> • E.g.: <?xml version="1.0" ?> <!-- my friends --> <Friends> <!-- my first friend --> <Person> … </Person> </Friends>
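
Comments and processing instructions are part of the document but not part of the element tree, so a DOM-style parser exposes them as separate node types. The sketch below assumes Python's xml.dom.minidom; the xml-stylesheet instruction and its file name are illustrative additions, and `describe` is a hypothetical helper:

```python
from xml.dom import minidom

DOC = (
    '<?xml version="1.0" ?>'
    '<?xml-stylesheet href="friends.css" type="text/css"?>'
    '<!-- my friends -->'
    '<Friends><!-- my first friend --><Person/></Friends>'
)


def describe(node):
    """Reduce a DOM node to a small tuple naming its kind and content."""
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE:
        return ("pi", node.target, node.data)
    if node.nodeType == node.COMMENT_NODE:
        return ("comment", node.data.strip())
    if node.nodeType == node.ELEMENT_NODE:
        return ("element", node.tagName)
    return ("other", node.nodeName)


dom = minidom.parseString(DOC)
top_level = [describe(n) for n in dom.childNodes]
inside_root = [describe(n) for n in dom.documentElement.childNodes]
print(top_level)
print(inside_root)
```

In this parser the `<?xml version="1.0" ?>` declaration does not show up as a node at all, while the stylesheet instruction does.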

  12. Names • No blank spaces • Must start with alphabetical letter (e.g. A-Z or a-z) or underscore (_) • Can be followed by letters, digits (0-9), underscores (_), hyphens (-), periods (.) and colons (:) • Colons are normally reserved for use with namespaces • Case-sensitive • E.g. “product” is different from “Product”
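
The naming rules on this slide can be written as a regular expression. This is a sketch of the simplified ASCII-only rule given here, not the full Name production from the XML specification (which also admits many non-ASCII letters); `is_simple_xml_name` is a hypothetical helper:

```python
import re

# First character: letter or underscore.
# Remaining characters: letters, digits, underscore, hyphen, period, colon.
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.:-]*")


def is_simple_xml_name(name):
    """Check a name against the simplified rule from the slide."""
    return NAME_RE.fullmatch(name) is not None


for candidate in ["Person", "_id", "first-name", "n2:Name", "2nd", "first name", ""]:
    print(repr(candidate), is_simple_xml_name(candidate))

# Names are case-sensitive, so these are two different names.
print("product" == "Product")
```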

  13. Tags • Main building block of XML • Start tag: <tagname optional-attributes> • End tag: </tagname> • Empty-element tag: <tagname optional-attributes/>

  14. XML Example
     <?xml version="1.0" ?>
     <Friends>
       <Person>
         <Name>Jane Doe</Name>
         <Age>21</Age>
         <Body>
           <Weight Unit="lbs">126</Weight>
           <Height Unit="inches">62</Height>
         </Body>
         <Trust trusted="yes"/>
       </Person>
       <Person>
         <Name>John Doe</Name>
         <Age>26</Age>
         <Trust trusted="no"/>
       </Person>
     </Friends>
     Callouts: • Start tag without attributes (e.g. <Friends>) • Start tag with attributes (e.g. <Weight Unit="lbs">) • Empty-element tag (e.g. <Trust trusted="yes"/>) • End tag (e.g. </Friends>)

  15. Elements • Basic building block of XML • Have form: • Start tag … matching end tag or • Empty-element tag • Never overlap • Unlike SGML • E.g. can’t have “<a>…<b>…</a>…</b>” • But can be nested • I.e. a tree, starting from the root element • E.g. can have “<a>…<b>…</b>…</a>” • Can contain textual content
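
The nesting rule is what makes a document a tree. A short sketch, assuming Python's xml.etree.ElementTree (`outline` is a hypothetical helper), walks a nested document and shows that the overlapping form is rejected by the parser:

```python
import xml.etree.ElementTree as ET


def outline(element, depth=0):
    """Return (depth, tag) pairs for the element tree, in document order."""
    rows = [(depth, element.tag)]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows


# Properly nested: <b> opens and closes inside <a>.
nested = ET.fromstring("<a>x<b>y</b>z</a>")
nested_outline = outline(nested)
print(nested_outline)

# Textual content is attached to elements: "x" and "y" as text,
# "z" as the tail following the <b> element.
texts = (nested.text, nested[0].text, nested[0].tail)

# Overlapping: <a> closes while <b> is still open, which is not well-formed.
try:
    ET.fromstring("<a>x<b>y</a>z</b>")
    overlap_error = None
except ET.ParseError as exc:
    overlap_error = str(exc)
print(overlap_error)
```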

  16. Content and CDATA sections • Within elements • between start and end tags • Plain text • Whitespace optionally significant • No literal ‘<’ or ‘&’ • Use the entity references “&lt;” and “&amp;” instead • CDATA “escape” section can include any text unescaped except “]]>” e.g. <![CDATA[<hello>&asoa,osd>as<]]>
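
Entity references and CDATA sections are two spellings of the same character data; once parsed, an application sees only the decoded text. A minimal sketch, assuming Python's xml.etree.ElementTree and a made-up `note` element:

```python
import xml.etree.ElementTree as ET

# Special characters written as entity references.
escaped = ET.fromstring("<note>1 &lt; 2 &amp; 3 &gt; 2</note>")

# The CDATA example from the slide: nothing inside needs escaping.
cdata = ET.fromstring("<note><![CDATA[<hello>&asoa,osd>as<]]></note>")

print(escaped.text)
print(cdata.text)

# Writing the element back out escapes the special characters again;
# this serialiser does not emit a CDATA section.
round_trip = ET.tostring(cdata, encoding="unicode")
print(round_trip)
```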

  17. Attributes • Set of key-value pairs associated with each element • Defined in the start tag or empty-element tag • never in the end tag • Optional • Each key must be unique within that element • E.g. attribute key is “Unit” and value is “lbs”: <Weight Unit="lbs">126</Weight>
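
Attributes map naturally onto a dictionary, and the uniqueness rule is checked by the parser. A small sketch, assuming Python's xml.etree.ElementTree; the `Missing` key is illustrative:

```python
import xml.etree.ElementTree as ET

weight = ET.fromstring('<Weight Unit="lbs">126</Weight>')

attributes = dict(weight.attrib)   # all key-value pairs of the element
unit = weight.get("Unit")          # a single attribute value
missing = weight.get("Missing")    # attributes are optional: None if absent
print(attributes, unit, missing)

# Repeating a key within one element is not well-formed.
try:
    ET.fromstring('<Weight Unit="lbs" Unit="kg">126</Weight>')
    duplicate_accepted = True
except ET.ParseError:
    duplicate_accepted = False
print(duplicate_accepted)
```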

  18. Entities • Short-cuts/references to text • Of the form: &entityname; • E.g. &lt; is <, &gt; is >, &amp; is &, &quot; is ", &apos; is ' • More can be defined in the (optional) DTD
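
The five predefined entities, and one defined in a DTD, can be exercised as follows. This is a sketch assuming Python's xml.sax.saxutils and xml.etree.ElementTree; the `jd` entity and the sample strings are made up for illustration:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

# escape() always handles &, < and >; the quote characters are opt-in.
QUOTES = {'"': "&quot;", "'": "&apos;"}
raw = 'if a < b & c > d: say "it\'s ok"'
escaped = escape(raw, QUOTES)
restored = unescape(escaped, {"&quot;": '"', "&apos;": "'"})
print(escaped)

# A parser expands the predefined entities without any DTD.
predefined = ET.fromstring("<t>&lt;&gt;&amp;&quot;&apos;</t>").text

# An additional entity declared in an internal DTD subset.
CUSTOM = (
    '<?xml version="1.0" ?>'
    '<!DOCTYPE Name [<!ENTITY jd "Jane Doe">]>'
    '<Name>&jd;</Name>'
)
custom_text = ET.fromstring(CUSTOM).text
print(predefined, custom_text)
```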

  19. Namespaces • Are contexts within which names are defined • Prevent confusion between coincidental uses of the same names (for elements or attributes) • Namespace is a URI • Never actually resolved to a document • Default namespace introduced by attribute xmlns="namespaceuri" • Applies to that element and all unqualified element names nested within it (NOT attribute names) • Namespace prefix introduced by attribute xmlns:prefix="namespaceuri" • Used explicitly as “prefix:name” • No namespace is the same as the empty URI “” • This is the top-level default namespace and default namespace for all attributes at any level

  20. Namespace example
     <?xml version="1.0" ?>
     <Friends xmlns="http://woo.foo/">
       <Person xmlns:n2="http://wee.fee/">
         <n2:Name>Jane Doe</n2:Name>
         <Age xmlns="http://wee.fee/">21</Age>
         <Weight Unit="lbs">126</Weight>
         <Height n2:Unit="inches">62</Height>
       </Person>
     </Friends>
     Expanded names • Friends: “http://woo.foo/”, “Friends” (default NS “http://woo.foo/”) • Person: “http://woo.foo/”, “Person” • n2:Name: “http://wee.fee/”, “Name” • Age: “http://wee.fee/”, “Age” (default NS “http://wee.fee/”) • Weight: “http://woo.foo/”, “Weight” • Unit (att.): “”, “Unit” • n2:Unit (att.): “http://wee.fee/”, “Unit”
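
The expanded names can be checked mechanically. The sketch below assumes Python's xml.etree.ElementTree, which writes an expanded name as `{namespace-uri}localname`; the prefixes `w` and `e` used in the lookup are arbitrary local choices, not part of the document:

```python
import xml.etree.ElementTree as ET

NS_XML = """<?xml version="1.0" ?>
<Friends xmlns="http://woo.foo/">
  <Person xmlns:n2="http://wee.fee/">
    <n2:Name>Jane Doe</n2:Name>
    <Age xmlns="http://wee.fee/">21</Age>
    <Weight Unit="lbs">126</Weight>
    <Height n2:Unit="inches">62</Height>
  </Person>
</Friends>"""

root = ET.fromstring(NS_XML)

# Expanded element names, in document order.
element_names = [el.tag for el in root.iter()]
for name in element_names:
    print(name)

# Unprefixed attributes are in no namespace; prefixed ones are qualified.
weight_attrs = dict(root.find(".//{http://woo.foo/}Weight").attrib)
height_attrs = dict(root.find(".//{http://woo.foo/}Height").attrib)
print(weight_attrs, height_attrs)

# Lookups match on the URI, not on the prefix used in the document.
prefixes = {"w": "http://woo.foo/", "e": "http://wee.fee/"}
name_text = root.find("w:Person/e:Name", prefixes).text
```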

  21. Document Type Definitions • Use regular expressions to specify valid document structure • Element nesting, required and optional attributes, default values • May be included after prolog in document • Or may be referenced from an external name or URL • Relatively limited expressiveness, especially for attribute and text values See G51WPS notes

  22. XML Schema • More modern alternative to DTDs for specifying valid XML document structure and content • See http://www.w3.org/XML/Schema#dev • XML Schema Part 0: Primer • XML Schema Part 1: Structures • XML Schema Part 2: Datatypes

  23. XML Schema • An XML Schema definition is an XML document conforming to the XML Schema schema • Allows definition of • simple types • Without nested elements • Including built-in types such as xsd:decimal, xsd:string • complex types • with nested elements and optional attributes • Elements (which may be simple or complex) • Attributes (which all have simple types)

  24. XML Schema example 1
     <?xml version="1.0"?>
     <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                 targetNamespace="http://woo.foo/">
       <xsd:element name="comment" type="xsd:string"/>
     </xsd:schema>
     Defines one element “http://woo.foo/”, “comment” of simple type xsd:string, e.g.
     <?xml version="1.0"?>
     <comment xmlns="http://woo.foo/">this is a comment</comment>

  25. XML Schema example 2
     <?xml version="1.0"?>
     <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                 xmlns="http://woo.foo/"
                 targetNamespace="http://woo.foo/">
       <xsd:simpleType name="Chocolate">
         <xsd:restriction base="xsd:string">
           <xsd:enumeration value="dark"/>
           <xsd:enumeration value="milk"/>
           <xsd:enumeration value="white"/>
         </xsd:restriction>
       </xsd:simpleType>
       <xsd:element name="chocolate" type="Chocolate"/>
     </xsd:schema>
     Defines one element “http://woo.foo/”, “chocolate” of new simple type “http://woo.foo/”, “Chocolate”, which must be “dark”, “milk” or “white”
     <?xml version="1.0"?>
     <chocolate xmlns="http://woo.foo/">dark</chocolate>
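
Since a schema is itself an XML document, it can be read with an ordinary parser. The sketch below is NOT a schema validator: Python's standard library does not include one, so it hand-checks only the enumeration restriction of this one example. It assumes xml.etree.ElementTree, a schema whose root element is `xsd:schema`, and two hypothetical helpers, `allowed_values` and `check_chocolate`:

```python
import xml.etree.ElementTree as ET

XSD = {"xsd": "http://www.w3.org/2001/XMLSchema"}

SCHEMA = """<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns="http://woo.foo/"
            targetNamespace="http://woo.foo/">
  <xsd:simpleType name="Chocolate">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="dark"/>
      <xsd:enumeration value="milk"/>
      <xsd:enumeration value="white"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:element name="chocolate" type="Chocolate"/>
</xsd:schema>"""


def allowed_values(schema_text, type_name):
    """List the enumeration values of a named top-level simple type."""
    schema = ET.fromstring(schema_text)
    path = f"xsd:simpleType[@name='{type_name}']/xsd:restriction/xsd:enumeration"
    return [e.get("value") for e in schema.findall(path, XSD)]


def check_chocolate(instance_text, schema_text=SCHEMA):
    """Hand-rolled check: right expanded name and an allowed value."""
    target_ns = ET.fromstring(schema_text).get("targetNamespace")
    element = ET.fromstring(instance_text)
    return (element.tag == f"{{{target_ns}}}chocolate"
            and element.text in allowed_values(schema_text, "Chocolate"))


print(allowed_values(SCHEMA, "Chocolate"))
print(check_chocolate('<chocolate xmlns="http://woo.foo/">dark</chocolate>'))
print(check_chocolate('<chocolate xmlns="http://woo.foo/">orange</chocolate>'))
```

A validating parser performs this kind of check, and many others, for every declaration in the schema.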

  26. XML Schema example 3
     <?xml version="1.0"?>
     <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                 xmlns="http://woo.foo/"
                 targetNamespace="http://woo.foo/"
                 elementFormDefault="qualified">
       <xsd:complexType name="ThreePiece">
         <xsd:sequence>
           <xsd:element name="lead" type="xsd:string" minOccurs="1" maxOccurs="1"/>
           <xsd:element name="bass" type="xsd:string" minOccurs="1" maxOccurs="1"/>
           <xsd:element name="drums" type="xsd:string" minOccurs="1" maxOccurs="1"/>
         </xsd:sequence>
       </xsd:complexType>
       <xsd:element name="band" type="ThreePiece"/>
     </xsd:schema>
     Defines one element “http://woo.foo/”, “band” of new complex type “http://woo.foo/”, “ThreePiece”, with three mandatory child elements
     <?xml version="1.0"?>
     <band xmlns="http://woo.foo/">
       <lead>Bill</lead>
       <bass>Bob</bass>
       <drums>Ben</drums>
     </band>

  27. XML Schema example 4
     <?xml version="1.0"?>
     <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                 xmlns="http://woo.foo/"
                 targetNamespace="http://woo.foo/">
       <xsd:complexType name="WeightType">
         <xsd:simpleContent>
           <xsd:extension base="xsd:double">
             <xsd:attribute name="Units" type="xsd:string"/>
           </xsd:extension>
         </xsd:simpleContent>
       </xsd:complexType>
       <xsd:element name="weight" type="WeightType"/>
     </xsd:schema>
     Defines one element “http://woo.foo/”, “weight” of new complex type “http://woo.foo/”, “WeightType”, with xsd:double content and an optional “Units” attribute
     <?xml version="1.0"?>
     <weight xmlns="http://woo.foo/" Units="kg">57.5</weight>

  28. XML Schema built-in data types • string • base64Binary – Base64 encoded binary • boolean – true or false • decimal – arbitrary-precision decimal numbers (integer is derived from it) • double – 64 bit floating point • float – 32 bit floating point • anyURI – URI • duration – duration • dateTime – date & time • … And various restrictions, e.g. minimum & maximum values, lengths
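
In an application, each built-in type usually ends up as a native value. The following is a rough sketch of such a mapping using only Python's standard library; the `PARSERS` table and `parse_builtin` are hypothetical names, the choice of Python types is an assumption, and the conversions only approximate the lexical rules in XML Schema Part 2 (for example, Python's Decimal accepts exponent notation that xsd:decimal does not):

```python
import base64
from datetime import datetime
from decimal import Decimal


def parse_boolean(text):
    """xsd:boolean has exactly four lexical forms."""
    if text in ("true", "1"):
        return True
    if text in ("false", "0"):
        return False
    raise ValueError(f"not an xsd:boolean: {text!r}")


# Built-in type name -> function turning element text into a Python value.
PARSERS = {
    "string": str,
    "boolean": parse_boolean,
    "decimal": Decimal,
    "double": float,
    "base64Binary": base64.b64decode,
    "dateTime": datetime.fromisoformat,
}


def parse_builtin(type_name, text):
    """Convert text to a Python value for a few built-in schema types."""
    if type_name != "string":
        text = text.strip()   # whitespace is significant for strings only
    return PARSERS[type_name](text)


print(parse_builtin("boolean", "true"))
print(parse_builtin("decimal", "12.50"))
print(parse_builtin("double", "1.5e3"))
print(parse_builtin("base64Binary", "aGVsbG8="))
print(parse_builtin("dateTime", "2007-10-26T08:36:28"))
```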

  29. Complex type building blocks • Element combinations: • Sequence – in order given, specifiable count • All – in any order, 0 or 1 of each • Choice – one of • Additional constructions • Reusable groups of elements • Reusable groups of attributes • Substitution groups • Alternative elements which may appear in a particular place

  30. Summary • XML • Common abstract syntax • Hierarchical element tree, plus content and attributes • XML Schema • Specifies XML elements and allowed structure and content for XML document(s) • Checked by “validating” parsers • Used to formally specify WSDL, SOAP, etc. • Can be used to generate schema-specific APIs • E.g. Java Architecture for XML Binding (JAXB) • Typically more readable code than DOM or SAX
