
What is XML?


randi


Presentation Transcript


  1. What is XML? • XML = Extensible Markup Language • Subset of the Standard Generalized Markup Language (SGML) • Standard for describing the structure of different types of electronic documents • Specified by the World Wide Web Consortium (W3C) • http://www.w3.org/XML/ • Initial draft presented in November 1996 at an SGML conference

  2. What is XML? • Very similar to HTML in appearance • Standard for data interchange • Used for structured hierarchical information • XML is not just for web documents! • Can be used for information storage, information transfer • XML is not a programming language!

  3. A Simple XML Example
     <?xml version="1.0"?>
     <HockeyPlayer name="Wayne Gretzky" position="centre" number="11" shoots="right">
       <SeasonRecord year="1999">
         <goals>12</goals>
         <assists>17</assists>
         <points>29</points>
       </SeasonRecord>
       <SeasonRecord year="2000">
         <goals>13</goals>
         <assists>18</assists>
         <points>31</points>
       </SeasonRecord>
     </HockeyPlayer>
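
To show how an application might read the document above, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The names HOCKEY_XML and season_points are made up for this illustration and are not part of the original slides.

```python
import xml.etree.ElementTree as ET

# The hockey player document from the slide, held in a string.
HOCKEY_XML = """<?xml version="1.0"?>
<HockeyPlayer name="Wayne Gretzky" position="centre" number="11" shoots="right">
  <SeasonRecord year="1999">
    <goals>12</goals>
    <assists>17</assists>
    <points>29</points>
  </SeasonRecord>
  <SeasonRecord year="2000">
    <goals>13</goals>
    <assists>18</assists>
    <points>31</points>
  </SeasonRecord>
</HockeyPlayer>"""


def season_points(xml_text):
    """Map each season's year attribute to its <points> value (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return {
        record.get("year"): int(record.findtext("points"))
        for record in root.findall("SeasonRecord")
    }


player = ET.fromstring(HOCKEY_XML)
print(player.get("name"), season_points(HOCKEY_XML))
```

Attributes are read with get(), while nested elements are reached with find(), findall() or findtext().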

  4. XML Syntax • Elements • Example: <name>Kane</name> • The element above has an opening tag and a closing tag • Attributes • Each start tag may carry one or more attributes • Example: <name type="given">Kane</name> • In this case, the attribute is type, and its value is given
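
The element and attribute from this slide can also be built programmatically. The following is a small sketch with Python's standard xml.etree.ElementTree; the variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

# Create the element <name> with one attribute, type="given".
name = ET.Element("name", {"type": "given"})

# The character data between the opening and closing tags.
name.text = "Kane"

# Serialize back to markup as a str rather than bytes.
serialized = ET.tostring(name, encoding="unicode")
print(serialized)
```

Reading works the same way in reverse: name.tag gives the element name, name.get("type") the attribute value, and name.text the content.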

  5. XML Syntax (2) • XML Prolog • Begins with the XML declaration, which, when present, must be the very first thing in the document • Example: <?xml version="1.0" standalone="yes"?> • Prolog rules • Must come before the root element • Can contain version information, a DTD declaration, comments and/or processing instructions • Processing Instructions • Instructions passed on to the application (for example, a browser) • General form: <?target instruction?> • Example: <?this is a processing instruction?>

  6. XML Syntax (3) • Entities
     • Internal Entities – used for shortcuts and macros within the XML document
         <!ENTITY entityname "replacement text">
         <!ENTITY CS214 'Computer Science 214'>
     • External Entities – used to incorporate contents from external documents
         <!ENTITY chap1 SYSTEM "chap1.xml">
         <!ENTITY mypicture SYSTEM "pic01.gif" NDATA GIF>
     • Usage: in the XML document, every occurrence of &CS214; is replaced by the XML parser with: Computer Science 214
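
Internal entity replacement can be observed with a parser. The sketch below assumes Python's standard xml.etree.ElementTree, whose underlying parser expands internal entities declared in the document's own DOCTYPE. The <course> wrapper element and the surrounding sentence are invented for the example.

```python
import xml.etree.ElementTree as ET

# A document that declares the CS214 entity in its internal DTD subset
# and then references it in element content.
DOC = """<?xml version="1.0"?>
<!DOCTYPE course [
  <!ENTITY CS214 "Computer Science 214">
]>
<course>Welcome to &CS214;!</course>"""

course = ET.fromstring(DOC)

# By the time the application sees the text, the reference has been replaced.
expanded = course.text
print(expanded)
```

External entities such as chap1 are a different matter: this parser does not fetch them by default, so the sketch covers only the internal case.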

  7. XML Syntax (4) • Case sensitivity • The XML declaration is case-sensitive and must start with a lower-case xml <?xml … ?> • Element names are also case-sensitive, unlike in HTML <TABLE></table> … acceptable in HTML, but not in XML <TABLE></TABLE> … is XML-acceptable <taBlE></taBlE> … is also acceptable, as long as the case of the start and end tags matches exactly. • The same applies to attribute names <PICTURE width="700px"> and <PICTURE WIDTH="700px"> define two different attributes
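
The case rules above can be checked with any XML parser. This is a minimal sketch using Python's standard xml.etree.ElementTree; the helper name is_well_formed and the second attribute value "800px" are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Start and end tags that differ only in case do not match in XML.
mismatched = is_well_formed("<TABLE></table>")
matched_upper = is_well_formed("<TABLE></TABLE>")
matched_mixed = is_well_formed("<taBlE></taBlE>")

# width and WIDTH are two distinct attributes, so both may appear at once.
picture = ET.fromstring('<PICTURE width="700px" WIDTH="800px"/>')
print(mismatched, matched_upper, matched_mixed, sorted(picture.attrib))
```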

  8. XML Syntax (5) • Whitespace • In HTML, a run of whitespace is collapsed to a single space when displayed • In XML, all whitespace is passed from the parser to the application as-is • Example: <B>Hello,     Moe!</B> • HTML: Hello, Moe! • XML: Hello,     Moe!
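
The difference can be sketched in a few lines of Python. The example assumes an element whose text contains a run of five spaces; the collapsing step only imitates what an HTML renderer does and is not an HTML parser.

```python
import xml.etree.ElementTree as ET

# Five spaces between the two words.
elem = ET.fromstring("<B>Hello,     Moe!</B>")

# An XML parser hands the character data to the application unchanged.
xml_text = elem.text

# Imitation of HTML rendering: collapse every whitespace run to one space.
html_like = " ".join(xml_text.split())

print(repr(xml_text))
print(repr(html_like))
```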

  9. XML Syntax (6) • Comments • Written as <!-- comment --> (as in HTML) • Empty Elements • Elements that do not have an explicit end tag must be closed with a terminating / (slash) • In HTML: <IMG SRC="somepic.jpg"> • In XML: <IMG SRC="somepic.jpg" /> • Thus, all elements must be terminated
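
As a rough illustration of the empty-element rule, the sketch below serializes the same element in both permitted forms with Python's standard xml.etree.ElementTree and confirms that the HTML-style unterminated tag is rejected. The helper name parses is hypothetical.

```python
import xml.etree.ElementTree as ET

img = ET.Element("IMG", {"SRC": "somepic.jpg"})

# Self-closing form (the serializer's default for elements without content).
short_form = ET.tostring(img, encoding="unicode")

# Equivalent form with an explicit end tag.
long_form = ET.tostring(img, encoding="unicode", short_empty_elements=False)


def parses(xml_text):
    """Return True if the text is accepted by the XML parser (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The HTML habit of leaving the tag open is not well-formed XML.
html_style_ok = parses('<IMG SRC="somepic.jpg">')
print(short_form, long_form, html_style_ok)
```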

  10. Good XML Style • Attributes vs. Elements • Confusion exists between using an element or an attribute to represent data • Some guidelines • Visibility • Shoes: Manufacturer’s Code [attrib]; Shoe size [element] • Consumer/Provider • Database: Student ID [attrib]; Student address [element] • Container vs. Contents • Sports: Team Name [attrib]; Team Players [element]

  11. Good XML Style (2) • Well-Formed vs. Valid XML Documents • Well-Formed: adheres to the XML syntax rules • Valid: well-formed and, in addition, conforms to the definitions provided in its DTD • XML validators exist on the Internet • http://www.garshol.priv.no/download/xmltools/cat_ix.html • JAXP can validate the syntax of XML files • (Java API for XML Processing)

  12. Document-centric vs. data-centric XML • XML documents can be written in one of two forms: • Data-centric: elements are highly structured with a fairly regular pattern • Example: database records • Document-centric: document is designed for human reading • Example of document-centric: books or HTML documents

  13. Data-centric XML
      <student given_name="Kane" last_name="Joe">
        <course>CPSC547</course>
        <course>CPSC533</course>
        <course>CPSC559</course>
      </student>
      <student given_name="Gabriel" last_name="Grey">
        <course>CPSC547</course>
        <course>CPSC571</course>
      </student>
      <student given_name="Patrick" last_name="Holt">
        <course>CPSC547</course>
      </student>
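
Because data-centric XML follows a regular pattern, it maps directly onto records. The sketch below uses Python's standard xml.etree.ElementTree and assumes the student elements sit inside a <students> root, which the slide does not show; load_students is likewise a made-up name.

```python
import xml.etree.ElementTree as ET

# Two of the student records, wrapped in an assumed <students> root element
# (an XML document needs exactly one root).
STUDENTS_XML = """<students>
  <student given_name="Gabriel" last_name="Grey">
    <course>CPSC547</course>
    <course>CPSC571</course>
  </student>
  <student given_name="Patrick" last_name="Holt">
    <course>CPSC547</course>
  </student>
</students>"""


def load_students(xml_text):
    """Turn each <student> element into a plain dictionary (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return [
        {
            "given_name": student.get("given_name"),
            "last_name": student.get("last_name"),
            "courses": [course.text for course in student.findall("course")],
        }
        for student in root.findall("student")
    ]


records = load_students(STUDENTS_XML)
print(records)
```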

  14. Document-centric XML
      <Product>
        <Name>Hamster Wrench</Name>
        <Developer>Full Fabrication Labs, Inc.</Developer>
        <Summary>Like a monkey wrench, but not as big.</Summary>
        <Description>
          <Para>The hamster wrench, which comes in <i>both right- and left-handed versions (skyhook optional)</i>, is made of the <b>finest stainless steel</b>. The Readi-grip rubberized handle quickly adapts to your hands, even in the greasiest situations. Adjustment is possible through a variety of custom dials.</Para>
        </Description>
      </Product>
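
Document-centric XML mixes running text with inline markup, so an application has to stitch the pieces back together. The following sketch, using Python's standard xml.etree.ElementTree on a shortened copy of the <Para> element, shows where the parser puts each fragment. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# First sentence of the <Para> element from the slide.
PARA_XML = (
    "<Para>The hamster wrench, which comes in <i>both right- and "
    "left-handed versions (skyhook optional)</i>, is made of the "
    "<b>finest stainless steel</b>.</Para>"
)

para = ET.fromstring(PARA_XML)

# Text before the first child element belongs to the parent.
lead_text = para.text

# Text following a child's end tag is stored as that child's "tail".
italic = para.find("i")
after_italic = italic.tail

# itertext() walks all fragments in document order; joining them
# recovers the sentence as a human would read it.
plain = "".join(para.itertext())
print(plain)
```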

  15. General Benefits • Human readability of data • Text-based, non-binary format • DTD allows easy exchange (Openness) • Easily translatable into other formats • Hierarchical • Good for displaying hierarchical data • Allows customized display • Facilitates processing of data

  16. Applications Possible areas for use of XML include: • Business to Business (B2B) • eCommerce • Electronic Data Interchange (EDI) • Web Services • ebXML • UN/CEFACT, OASIS • Content Management • Data representation

  17. DTD (1) • First method designed for specifying a valid doc structure • Set of rules governing the tags in an XML document • Used for validation of an XML document • Prevents the creation of an invalid XML structure • Tells both validating and nonvalidating parsers where text is expected • Tells author which elements are allowed in a document

  18. DTD (2) • DTD can be specified... • In the prolog as part of the XML document (inline) <!DOCTYPE root-element [element-declarations]> • Or as an external reference <!DOCTYPE root-element SYSTEM "filename"> • If doc is well-formed and doesn’t require validation, a DTD is optional • Building blocks of XML • Elements, tags, attributes, entities, PCDATA, CDATA

  19. DTD (3) • DTD - Elements • Element declaration <!ELEMENT element_name content_spec> • Element_name • The tag name • Content_spec • Type of content the element contains • Example • <!ELEMENT department (course+)> • <!ELEMENT course (title, desc?, prereq*, coreq*)> • <!ELEMENT title (#PCDATA)>

  20. DTD (4) • Specifying valid child elements • Enclosed within parentheses (content model) • Specify a sequence of children <!ELEMENT course (title, desc) > • Specify a choice of children <!ELEMENT course (title | desc) > • Specify the occurrences of children <!ELEMENT course (title+, desc?) > • Occurrence indicators: + (one or more), * (zero or more), ? (zero or one)
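
A content model is essentially a regular expression over the names of an element's children, and that idea can be sketched in Python. This is not a DTD validator: it assumes element-only content models (no #PCDATA, EMPTY or ANY), and the function names content_model_to_regex and children_match are invented for the illustration.

```python
import re
import xml.etree.ElementTree as ET


def content_model_to_regex(model):
    """Translate a model such as '(title, desc?)' into a compiled regex.

    The regex is matched against the child tag names written one after
    another, each followed by a single space.
    """
    compact = re.sub(r"\s+", "", model)
    # Wrap every element name in a group that also consumes its trailing space.
    with_names = re.sub(
        r"[A-Za-z_][\w.-]*",
        lambda m: "(?:" + re.escape(m.group(0)) + " )",
        compact,
    )
    # A DTD comma means "followed by", which is plain concatenation in a regex.
    # The symbols |, +, * and ? already mean the same thing in both notations.
    return re.compile(with_names.replace(",", ""))


def children_match(element, model):
    """Check an element's direct children against a content model."""
    child_names = "".join(child.tag + " " for child in element)
    return content_model_to_regex(model).fullmatch(child_names) is not None


good = ET.fromstring("<course><title/><prereq/><prereq/></course>")
wrong_order = ET.fromstring("<course><desc/><title/></course>")

print(children_match(good, "(title, desc?, prereq*, coreq*)"))
print(children_match(wrong_order, "(title, desc?, prereq*, coreq*)"))
```

A real validating parser performs this kind of check for every element in the document, along with attribute and entity checks that the sketch ignores.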

  21. DTD (5) • DTD - Attributes • Attribute declaration <!ATTLIST element_name attr_name attr_type default> • Element_name • The element to which the attribute belongs • Attr_name • The name of the attribute • Attr_Type • Types: string (CDATA), tokenized types, enumerated types • Default • Keywords: #REQUIRED, #IMPLIED, #FIXED • Example <!ATTLIST course number ID #REQUIRED>

  22. DTD (6) • An external DTD example
      … department.dtd …
        <!ELEMENT department (course)+>
        <!ATTLIST department name CDATA #REQUIRED>
        <!ELEMENT course (title, description, prerequisite?)>
        <!ATTLIST course number ID #REQUIRED>
        <!ELEMENT title (#PCDATA)>
        <!ELEMENT description (#PCDATA)>
        <!ELEMENT prerequisite (#PCDATA)>
      … XML File …
        <?xml version="1.0"?>
        <!DOCTYPE department SYSTEM "department.dtd">
        <department name="Computer Science">
          <course number="CS214">
            <title>XML</title>
            <description>New web technology.</description>
          </course>
        </department>

  23. DTD (7) • Validates structure but that's all! • Limitations • Can't restrict the contents of an element (no data types for text) • Can't specify complex relationships; for example, this is not a legal content model: <!ELEMENT item (#PCDATA | (#PCDATA, item+))> • Can't declare the same element twice: <!ELEMENT item (#PCDATA)> <!ELEMENT item (#PCDATA, item+)> • DTD syntax is different from XML syntax • DTD offers no hierarchy structure

  24. Linking XML Documents (1) • XML linking much more powerful than HTML anchors • XLink – XML Linking Specification • Based on IDs given to document elements • Two-way links • Links to multiple documents • Expanding links • Indirect links

  25. Linking XML Documents (2) • XPointer – XML Extended Pointer Specification • Addresses internal structure with IDs or parent/child node relationships • Example: examples.xml#ID(conclusion).child(2,*).child(2,#element,'p')

  26. Searching XML Documents (1) • XPath • XML Path Language • Hierarchical addressing scheme • Operates on the abstract, logical structure of the document, rather than its surface syntax • Example: /Project[Status="Done"] selects all Project elements with a child Status element whose value is "Done"
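
A predicate like the one above can be tried out with Python's standard xml.etree.ElementTree, which supports a limited subset of XPath. The <Projects> wrapper, the <Name> elements and their values are assumptions for this sketch, and because the module evaluates paths relative to an element, the expression is written as ./Project[...] rather than with a leading slash.

```python
import xml.etree.ElementTree as ET

# Sample data invented for the illustration.
PROJECTS_XML = """<Projects>
  <Project><Name>Alpha</Name><Status>Done</Status></Project>
  <Project><Name>Beta</Name><Status>Open</Status></Project>
  <Project><Name>Gamma</Name><Status>Done</Status></Project>
</Projects>"""

root = ET.fromstring(PROJECTS_XML)

# Select every Project child whose Status child has the text "Done".
done = root.findall("./Project[Status='Done']")
done_names = [project.findtext("Name") for project in done]
print(done_names)
```

Full XPath offers axes, functions and numeric comparisons that go beyond what this module supports, so a dedicated XPath engine would be needed for anything more elaborate.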

  27. Searching XML Documents (2) • XQuery • XML Query Language • Builds on hierarchical XPath addressing • Provides complex search capability • Comparison operations, conditional expressions • Definition of literals, variables, function calls • A W3C Recommendation since January 2007 (XQuery 1.0) • Best supported by Native XML Databases
