
XML_1


phiala


  1. XML_1 Ch. 7 Fall 2010 Comp Sci 346

  2. Bibliography • W3C Recommendations http://www.w3.org/TR/REC-xml/ • XML online tutorials • http://www.w3schools.com/xml/default.asp • Java API for XML Processing (JAXP): https://jaxp.dev.java.net/ • http://www.xml-training-guide.com/ • Examples from textbook • Examples from “Internet & World Wide Web How to Program” Third Edition by Deitel, Deitel, and Goldberg Prentice Hall Comp Sci 346

  3. Extensible Markup Language (XML) What is XML? • A meta-markup language • A technology for creating markup languages Why should you learn XML? • It allows you to invent your own tags • XML documents can be easily parsed • XML is portable • The Web is becoming XML-based rather than HTML-based Comp Sci 346

  4. XHTML vs XML • XHTML: based on tag pairs; purpose: a markup language that displays the data; focus: how the data looks; does not care about the meaning of the contents; predefined tags • XML: based on tag pairs; purpose: a meta-markup language for defining markup languages; a markup language defined with XML describes the data; focus: the meaning (the "what") of the data; a method of data exchange; content must be well structured; no predefined tags

  5. How to display XML data? • XHTML: Use CSS (Cascading Style Sheets) • XSL (Extensible Stylesheet Language)

  6. Historical Development • SGML (Standard Generalized Markup Language) is the ancestor of both HTML and XML • Markup languages defined with XML: XHTML, MathML, MusicXML, CML, RSS, XBRL, . . .

  7. XML Syntax • Same as XHTML (an application of XML) • First line: <?xml version = "1.0"?> • Tree of elements: one root element • Element: opening tag, content, closing tag • Opening tag format: <tag_name> • Closing tag format: </tag_name> • Opening tag may contain attributes • Attribute values must be quoted Comp Sci 346

  8. XML Documents • Contain marked up data • Do not contain any formatting information • XML parser passes data on to an application (e.g. browser) • Stylesheet may be applied to render the document Comp Sci 346

  9. XML Documents • Two major parts • Prolog • XML declaration • Processing instructions • Comments • For example: <?xml version = "1.0" encoding = "utf-8"?> <!-- Sample XML document --> • Body

  10. XML Document • XML is hierarchical • E.g. • Book • Title • Chapter • Paragraph
      <Book>
        <Title> </Title>
        <Chapter>
          <paragraph> </paragraph>
        </Chapter>
      </Book>
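The hierarchy on this slide can be made visible by parsing the document and walking the resulting tree. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the helper name `outline` is an illustrative choice, not something from the course material.

```python
import xml.etree.ElementTree as ET

def outline(element, depth=0):
    """Return the element names of a tree, indented by nesting depth."""
    lines = ["  " * depth + element.tag]
    for child in element:  # iterating an element yields its direct children
        lines.extend(outline(child, depth + 1))
    return lines

# The Book document from the slide, with whitespace removed.
book = ET.fromstring(
    "<Book><Title></Title><Chapter><paragraph></paragraph></Chapter></Book>"
)
tree_lines = outline(book)
print("\n".join(tree_lines))
```

Each level of indentation in the output corresponds to one level of element nesting, which is the sense in which XML is hierarchical.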

  11. XML Document • SystemMessage as an example
      <SystemMessage>
        <MessageTitle>System Down for Maintenance</MessageTitle>
        <MessageBody>Going down for maintenance soon!</MessageBody>
        <MessageAuthor>
          <MessageAuthorName>Joe SystemGod</MessageAuthorName>
          <MessageAuthorEmail>systemgod@someserver.com</MessageAuthorEmail>
        </MessageAuthor>
        <MessageDate>Oct. 19, 2010</MessageDate>
      </SystemMessage>

  12. Rules • XML is case-sensitive • All XML tags must be properly closed • XML tags must be properly nested • No overlapping tags are allowed Comp Sci 346
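These rules can be checked mechanically: a conforming parser refuses any document that breaks them. The sketch below assumes Python's standard `xml.etree.ElementTree` parser; the function name `is_well_formed` and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

samples = {
    "properly nested": "<Book><Title>XML</Title></Book>",
    "case mismatch": "<Book><Title>XML</title></Book>",   # Title vs title
    "unclosed tag": "<Book><Title>XML</Book>",            # Title never closed
    "overlapping tags": "<b><i>text</b></i>",             # b closes before i
}

results = {label: is_well_formed(doc) for label, doc in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'rejected'}")
```

Only the first sample is accepted; the other three each violate one of the rules on the slide.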

  13. View 1article.xml with Browsers • Microsoft Internet Explorer • Netscape • Mozilla • Firefox • Opera • What did you discover? Comp Sci 346


  16. A more complicated xml document • XML for a Business Letter • See 2letter.xml Comp Sci 346

  17. 2letter.xml
      <?xml version = "1.0"?>
      <!-- Fig. 20.3: 2letter.xml -->
      <!-- Business letter formatted with XML -->
      <!DOCTYPE letter SYSTEM "letter.dtd">
      <letter>
        <contact type = "from"> </contact>
        <contact type = "to"> </contact>
        <salutation>Dear Sir:</salutation>
        <paragraph>It is our privilege to inform you about our new database managed with XML. This new system allows you to reduce the load of your inventory list server by having the client machine perform the work of sorting and filtering the data.</paragraph>
        <closing>Sincerely</closing>
        <signature>Mr. Doe</signature>
      </letter>

  18. DTD • A DTD (Document Type Definition) has been added to the letter • An XML document does not need a DTD to be well-formed • Validating XML parsers use the DTD to check that an XML document has the proper structure • Use a validator • www.w3.org/XML/Schema.html for suggestions • Microsoft downloadable XML validator http://support.microsoft.com/kb/307379


  21. The contact element
      <contact type = "from">
        <name>John Doe</name>
        <address1>123 Main St.</address1>
        <address2></address2>
        <city>Anytown</city>
        <state>Anystate</state>
        <zip>12345</zip>
        <phone>555-1234</phone>
        <flag gender = "M"/>
      </contact>

  22. Characters XML documents may contain • Carriage returns • Line feeds • Unicode characters Angle brackets < > delimit markup text Character data is the text between start and end tags Comp Sci 346

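  23. Reserved Characters • These characters have special meaning in markup: & < > ' " • & and < may never appear literally in character data; > must be escaped when it follows ]] ; ' and " must be escaped inside attribute values delimited by the same quote character • Entity references (&amp; &lt; &gt; &apos; &quot;) are used when these characters are needed as data <blah> &quot;Hello &amp; goodbye.&quot; </blah>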
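Escaping by hand is error-prone, so in practice a library usually does it. The sketch below assumes Python's standard `xml.sax.saxutils.escape`, which replaces `&`, `<` and `>` by default; the extra mapping for the double quote is passed explicitly only to reproduce the slide's example. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = '"Hello & goodbye." 1 < 2'

# escape() handles & < > itself; the dict adds &quot; for double quotes.
escaped = escape(raw, {'"': "&quot;"})
document = f"<blah>{escaped}</blah>"

# Parsing turns the entity references back into the original characters.
parsed_text = ET.fromstring(document).text

print(escaped)
print(parsed_text)
```

The round trip shows that entity references change only how the characters are written in the file, not the data the application receives.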

  24. Unicode Characters • Numeric character references are used for Unicode characters not found on the keyboard • Example: &#1583; denotes the Arabic letter dal (U+062F)
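A parser resolves a reference such as `&#1583;` to the single character with that code point. The following sketch, assuming Python's standard `xml.etree.ElementTree` and `unicodedata` modules, parses the slide's example and looks the character up; the element name `letter` is made up for the illustration.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Decimal form, as on the slide, and the equivalent hexadecimal form.
decimal_form = ET.fromstring("<letter>&#1583;</letter>").text
hex_form = ET.fromstring("<letter>&#x62F;</letter>").text

code_point = ord(decimal_form)
name = unicodedata.name(decimal_form)
print(hex(code_point), name)
```

Both spellings of the reference yield the same one-character string.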

  25. DOCTYPE • XML documents may contain a <!DOCTYPE> declaration <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> • Specifies the root element (html) • Gives the location of the document type definition (DTD)

  26. CDATA Sections • Sections of an XML document whose contents the parser does not interpret as markup; the text is passed through as character data • May contain special characters such as < and & • Typical use: embedded JavaScript code <![CDATA[ XML parser <<< ignores >>> all of this stuff. Note no space between first [ and CDATA; CDATA and second [ ]]>
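The effect of a CDATA section is easiest to see by parsing the same text with and without it. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the `script` element and the JavaScript-like snippet are invented for the example.

```python
import xml.etree.ElementTree as ET

code = "if (a < b && b > c) { run(); }"

# Inside CDATA, < and & are ordinary characters.
with_cdata = f"<script><![CDATA[{code}]]></script>"
cdata_text = ET.fromstring(with_cdata).text

# Without CDATA, the same text is not well-formed XML.
try:
    ET.fromstring(f"<script>{code}</script>")
    plain_accepted = True
except ET.ParseError:
    plain_accepted = False

print(cdata_text)
print("accepted without CDATA:", plain_accepted)
```

The application receives the CDATA contents unchanged, while the unwrapped version is rejected because `<` and `&` are read as the start of markup.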

  27. Namespaces • Document authors may invent their own elements • Tag names may be reused • Naming collisions must be avoided • How? Comp Sci 346

  28. XML Namespace • XML allows document authors to create custom elements, so naming collisions can occur • Without namespaces these two elements conflict: • <subject>Math</subject> • <subject>xhtml</subject> • An XML namespace is a collection of element and attribute names • A namespace prefix differentiates the names: • <School:subject>Math</School:subject> • <web_programming:subject>xhtml</web_programming:subject> • The prefix may be any name except the reserved prefix xml • A uniform resource identifier (URI) uniquely identifies the namespace; it is just a string of text for differentiating names • In the example that follows, directory is the root element and contains the other elements

  29. Specify Namespace with URI • Example: namespace.xml
      <?xml version = "1.0"?>
      <!-- Fig. 20.4 : namespace.xml -->
      <!-- Demonstrating Namespaces -->
      <directory xmlns:text = "urn:deitel:textInfo"
                 xmlns:image = "urn:deitel:imageInfo">
        <text:file filename = "book.xml">
          <text:description>A book list</text:description>
        </text:file>
        <image:file filename = "funny.jpg">
          <image:description>A funny picture</image:description>
          <image:size width = "200" height = "100"/>
        </image:file>
      </directory>
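To a namespace-aware parser, the prefix is only shorthand; the element's real name is the pair of namespace URI and local name. The sketch below parses namespace.xml with Python's standard `xml.etree.ElementTree`, which writes that pair as `{uri}local`. The short prefixes `t` and `i` in the lookup table are arbitrary choices made here to show that the prefix itself does not matter.

```python
import xml.etree.ElementTree as ET

NAMESPACE_XML = """<directory xmlns:text="urn:deitel:textInfo"
           xmlns:image="urn:deitel:imageInfo">
  <text:file filename="book.xml">
    <text:description>A book list</text:description>
  </text:file>
  <image:file filename="funny.jpg">
    <image:description>A funny picture</image:description>
    <image:size width="200" height="100"/>
  </image:file>
</directory>"""

root = ET.fromstring(NAMESPACE_XML)

# The two <file> elements no longer collide: each carries its namespace URI.
tags = [child.tag for child in root]
print(tags)

# Queries bind their own prefixes to the same URIs.
ns = {"t": "urn:deitel:textInfo", "i": "urn:deitel:imageInfo"}
text_filename = root.find("t:file", ns).get("filename")
image_description = root.find("i:file/i:description", ns).text
print(text_filename, "|", image_description)
```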

  30. Or use URL • <text:directory xmlns:text = "http://www.deitel.com/xml-text" xmlns:image = "http://www.deitel.com/xmlns-image">

  31. Default namespace • Example: defaultnamespace.xml
      <directory xmlns = "urn:deitel:textInfo"
                 xmlns:image = "urn:deitel:imageInfo">
        <file filename = "book.xml">
          <description>A book list</description>
        </file>
        <image:file filename = "funny.jpg">
          <image:description>A funny picture</image:description>
          <image:size width = "200" height = "100"/>
        </image:file>
      </directory>
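With a default namespace, unprefixed elements belong to that namespace, while unprefixed attributes do not. The following sketch, again assuming Python's standard `xml.etree.ElementTree`, parses defaultnamespace.xml and inspects the resulting names.

```python
import xml.etree.ElementTree as ET

DEFAULT_NS_XML = """<directory xmlns="urn:deitel:textInfo"
           xmlns:image="urn:deitel:imageInfo">
  <file filename="book.xml">
    <description>A book list</description>
  </file>
  <image:file filename="funny.jpg">
    <image:description>A funny picture</image:description>
    <image:size width="200" height="100"/>
  </image:file>
</directory>"""

root = ET.fromstring(DEFAULT_NS_XML)

root_tag = root.tag                            # the root is in the default namespace
child_tags = [child.tag for child in root]     # unprefixed <file> is too
attribute_names = list(root[0].attrib)         # attributes stay unqualified

print(root_tag)
print(child_tags)
print(attribute_names)
```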

  32. How to specify structure of document • Two methods for defining an XML document's structure • DTD • Schema • "Valid" XML doc: conforms to DTD or schema • Note: a doc may be well-formed but invalid Comp Sci 346

  33. DTD • Document Type Definition • Uses Extended Backus-Naur Form (EBNF) grammar to define structure • DTDs exist for XHTML strict and transitional • Used by validation services Comp Sci 346

  34. Document Type Definitions • Enables XML parser to verify whether XML document is valid • Allow independent user groups to check structure and exchange data in standardized format • Expresses set of rules for structure using EBNF grammar • ELEMENT type declaration • Defines rules • ATTLIST attribute-list declaration • Defines an attribute Comp Sci 346

  35. DTD for 2letter.xml
      <!-- Fig. 20.4: letter.dtd -->
      <!-- DTD document for letter.xml -->
      <!ELEMENT letter ( contact+, salutation, paragraph+, closing, signature )>
      <!ELEMENT contact ( name, address1, address2, city, state, zip, phone, flag )>
      <!ATTLIST contact type CDATA #IMPLIED>
      <!ELEMENT name ( #PCDATA )>
      <!ELEMENT address1 ( #PCDATA )>
      <!ELEMENT address2 ( #PCDATA )>
      <!ELEMENT city ( #PCDATA )>
      <!ELEMENT state ( #PCDATA )>
      <!ELEMENT zip ( #PCDATA )>
      <!ELEMENT phone ( #PCDATA )>
      <!ELEMENT flag EMPTY>
      <!ATTLIST flag gender (M | F) "M">
      <!ELEMENT salutation ( #PCDATA )>
      <!ELEMENT closing ( #PCDATA )>
      <!ELEMENT paragraph ( #PCDATA )>
      <!ELEMENT signature ( #PCDATA )>

  36. Indicators • +: one or more elements • *: optional element that can occur any number of times • ?: optional element that can occur at most once • No indicator: exactly once Comp Sci 346
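The occurrence indicators behave like the quantifiers of the same name in regular expressions. As an analogy only (this is not how a validating parser is implemented), the sketch below rewrites the `letter` content model from letter.dtd as a Python regular expression over the sequence of child element names. The second model, `( title?, item* )`, is hypothetical and included just to exercise `?` and `*`; the function name `matches` is likewise an illustrative choice.

```python
import re

# ( contact+, salutation, paragraph+, closing, signature ) from letter.dtd
LETTER_MODEL = re.compile(
    r"(contact )+(salutation )(paragraph )+(closing )(signature )"
)

# Hypothetical model, not from the slides: ( title?, item* )
LIST_MODEL = re.compile(r"(title )?(item )*")

def matches(model, child_names):
    """Check a list of child element names against a content-model regex."""
    sequence = "".join(name + " " for name in child_names)
    return model.fullmatch(sequence) is not None

valid_letter = ["contact", "contact", "salutation",
                "paragraph", "closing", "signature"]
no_contact = ["salutation", "paragraph", "closing", "signature"]

print(matches(LETTER_MODEL, valid_letter))   # + satisfied by two contacts
print(matches(LETTER_MODEL, no_contact))     # + requires at least one
print(matches(LIST_MODEL, []))               # ? and * both allow zero
print(matches(LIST_MODEL, ["title", "title"]))  # ? allows at most one
```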

  37. ATTLIST – Attribute-List Declaration • Declares an attribute, its type, and its default <!ATTLIST contact type CDATA #IMPLIED> • #IMPLIED: the attribute is optional; if it is missing, the application may use an arbitrary value or ignore it <!ATTLIST contact type CDATA #REQUIRED> • #REQUIRED: the attribute must be present <!ATTLIST address zip CDATA #FIXED "54901"> • #FIXED: the attribute, if present, must have the given fixed value

  38. XHTML11.DTD Comp Sci 346

  39. Why Schema? • Many developers find DTDs inflexible • DTDs are not XML documents, so they cannot be manipulated as XML • A DTD defines structure, not content types • <quantity>5</quantity> • 5 is treated as PCDATA (Parsed Character Data) • The parser verifies that 5 is PCDATA, not that it is numeric • Even <quantity>string</quantity> is acceptable • XML Schema allows specifying that quantity must be numeric

  40. Schema • New and improved method describing XML doc structure • Uses XML syntax • Schema is an XML document • Schemas may be modified by software • Allows more detailed specification of element content • Tutorial: www.w3schools.com/schema/default.asp Comp Sci 346

  41. W3C XML Schema Documents • Properties • Specify XML document structure • Do not use EBNF grammar • Use XML syntax • Can be manipulated like other XML documents • Require validating parsers • W3C XML schemas www.w3.org/XML/Schema • An XML document is “schema valid” when it conforms to a schema document • Schemas use the .xsd extension

  42. W3C XML Schema Documents • Root element schema, e.g. book.xsd • Contains elements that define the XML document structure • targetNamespace • Namespace of the XML vocabulary the schema defines • Same as the xmlns defined in book.xml • The XML document is connected to the schema via this targetNamespace • element tag • Defines an element to be included in the XML document structure • name and type attributes • Specify the element’s name and data type respectively • Built-in simple types • string, date, int, double, time, etc.

  43. W3C XML Schema Documents • Two categories of data types • Simple types • Cannot contain attributes or child elements • Complex types • May contain attributes and child elements • complexType • Define complex type • Simple content • Cannot have child elements • Complex content • May have child elements Comp Sci 346

  44. Is the schema correctly specified? • To validate book.xsd, use XSV (XML Schema Validator) open source: www.w3.org/2001/03/webdata/xsv • Or Free Trials http://www.stylusstudio.com/xml_parsers.html Comp Sci 346


  48. Online XSD Schema Validator • Schema\book.xml is based on Schema specification • Schema\book.xsd, an XML Schema document, defines the structure for book.xml • Schemas use .xsd extension • Does book.xml conform to the schema book.xsd? • Cut and paste book.xml and book.xsd into www.xmlforasp.net/Schemavalidator.aspx and validate • Or SAX (exercise) http://msdn2.microsoft.com/en-us/library/ms756991.aspx Comp Sci 346

  49. Book.xml
      <?xml version = "1.0"?>
      <!-- Fig. 20.7 : book.xml -->
      <!-- Book list marked up as XML -->
      <deitel:books xmlns:deitel = "http://www.deitel.com/booklist">
        <book> <title>XML How to Program</title> </book>
        <book> <title>C How to Program</title> </book>
        <book> <title>Java How to Program</title> </book>
        <book> <title>C++ How to Program</title> </book>
        <book> <title>Perl How to Program</title> </book>
      </deitel:books>

  50. Book.xsd
      <?xml version = "1.0"?>
      <!-- Fig. 20.8 : book.xsd -->
      <!-- Simple W3C XML Schema document -->
      <schema xmlns = "http://www.w3.org/2001/XMLSchema"
              xmlns:deitel = "http://www.deitel.com/booklist"
              targetNamespace = "http://www.deitel.com/booklist">
        <element name = "books" type = "deitel:BooksType"/>
        <complexType name = "BooksType">
          <sequence>
            <element name = "book" type = "deitel:SingleBookType"
                     minOccurs = "1" maxOccurs = "unbounded"/>
          </sequence>
        </complexType>
        <complexType name = "SingleBookType">
          <sequence>
            <element name = "title" type = "string"/>
          </sequence>
        </complexType>
      </schema>
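Because a schema is itself an XML document, ordinary XML tools can read and manipulate it. The sketch below does not validate anything (Python's standard library has no XML Schema validator); it only parses book.xsd with `xml.etree.ElementTree` and lists what the schema declares. The constant name `XSD` and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"  # namespace of the schema vocabulary

BOOK_XSD = """<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:deitel="http://www.deitel.com/booklist"
        targetNamespace="http://www.deitel.com/booklist">
  <element name="books" type="deitel:BooksType"/>
  <complexType name="BooksType">
    <sequence>
      <element name="book" type="deitel:SingleBookType"
               minOccurs="1" maxOccurs="unbounded"/>
    </sequence>
  </complexType>
  <complexType name="SingleBookType">
    <sequence>
      <element name="title" type="string"/>
    </sequence>
  </complexType>
</schema>"""

schema = ET.fromstring(BOOK_XSD)

target_namespace = schema.get("targetNamespace")
# iter() visits matching elements in document order.
declared_elements = [(e.get("name"), e.get("type"))
                     for e in schema.iter(XSD + "element")]
complex_types = [c.get("name") for c in schema.iter(XSD + "complexType")]

print(target_namespace)
print(declared_elements)
print(complex_types)
```

The same approach would let software generate or modify a schema, which is not possible with a DTD's non-XML syntax.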
