
XML Study-Session: Part II


  1. XML Study-Session: Part II Validating XML Documents

  2. Objectives: By completing this study-session, you should be able to: • Validate XML documents against a DTD. • Understand basic DTD syntax. • Create simple DTDs of your own.

  3. What is a DTD? Document Type Definition: • Standard originally developed for SGML. • Provides a description of the XML document’s structure, and serves as a grammar to specify what tags and attributes are valid in an XML document and in what context they are valid. • E.g. The following is an example DTD statement: <!ELEMENT person (name, e-mail*)>
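
As a rough illustration of the statement above, the sketch below pairs it with declarations for name and e-mail and a document that would conform. The two child declarations and the sample data are assumptions made for this example; the slide does not declare them.

```xml
<?xml version="1.0"?>
<!DOCTYPE person [
  <!-- person: exactly one name, followed by zero or more e-mail elements -->
  <!ELEMENT person (name, e-mail*)>
  <!-- Assumed for illustration: both children hold text only -->
  <!ELEMENT name   (#PCDATA)>
  <!ELEMENT e-mail (#PCDATA)>
]>
<person>
  <name>Jane Doe</name>
  <e-mail>jane@example.com</e-mail>
  <e-mail>jdoe@example.org</e-mail>
</person>
```

Removing both e-mail elements would still leave the document valid; removing name would not.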

  4. Why use a DTD? A DTD lets an application construct XML that conforms to an agreed specification, and lets a validating parser check that it does. Also: • Self documentation • Portability • Provides defaults for attributes • Entity declaration

  5. Using a DTD in an XML document An XML document may do any of the following: • Refer to a DTD, using its URI. • Include a DTD inline as part of the XML document. • Omit a DTD altogether. Without a DTD, an XML document can be checked for well-formedness, but not for validity. The DTD used by the XML document may be internal, external, or a combination of the two. An external DTD is stored as a plain-text file, conventionally with a .dtd extension.
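
The difference between well-formedness and validity can be sketched with a small document that passes the first check but fails the second. The note, to, and body names are hypothetical and chosen only for this example.

```xml
<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to, body)>
  <!ELEMENT to   (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<!-- Well-formed: every tag is closed and properly nested.
     Not valid: the DTD requires <to> before <body>, and <to> is missing. -->
<note>
  <body>Meeting at noon.</body>
</note>
```

A non-validating parser would accept this document; a validating parser should report the missing to element.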

  6. Example: Using a DTD inline
     <?xml version='1.0' encoding='UTF-8'?>
     <!DOCTYPE Book [
       <!ELEMENT Book (Title, Author+, Summary*, Note?)>
       <!ATTLIST Book ISBN CDATA #REQUIRED
                      section (fiction|nonfiction) 'fiction'>
       <!ELEMENT Title (#PCDATA)>
       <!ELEMENT Author (#PCDATA)>
       <!ELEMENT Summary (#PCDATA)>
       <!ELEMENT Note (#PCDATA)>
       <!ENTITY Description 'A great American novel.'>
     ]>
     <Book ISBN='1234'>
       <Title> To Kill a Mockingbird </Title>
       <Author> Harper Lee </Author>
       <Summary> &Description; </Summary>
     </Book>

  7. Doctype declaration The Document Type (Doctype) declaration is used to indicate the DTD used for the document. Syntax may be in any of the following forms, where URL and identifier are quoted strings: • <!DOCTYPE rootname [DTD]> • <!DOCTYPE rootname SYSTEM "URL"> • <!DOCTYPE rootname SYSTEM "URL" [DTD]> • <!DOCTYPE rootname PUBLIC "identifier" "URL"> • <!DOCTYPE rootname PUBLIC "identifier" "URL" [DTD]>

  8. Example: External DTD The following is an example of an XML document that uses an external DTD:
     <?xml version='1.0' standalone='no'?>
     <!DOCTYPE Book SYSTEM 'booklist.dtd'>
     <Book ISBN='4576'>
       <Title> Moby Dick </Title>
       <Author> Herman Melville </Author>
     </Book>
     Because 'booklist.dtd' is a relative reference, it is resolved against the location of the XML document, so in this example the DTD must be located in the same directory as the XML document.

  9. Example: Using DTDs with URLs The following is an example of an XML document that references an external DTD with a URL:
     <?xml version='1.0' standalone='no'?>
     <!DOCTYPE Book SYSTEM 'http://www.somewebsite.com/booklist.dtd'>
     <Book ISBN='4576'>
       <Title> Moby Dick </Title>
       <Author> Herman Melville </Author>
     </Book>

  10. Specifying Elements • In the DTD, this is done with the notation: <!ELEMENT elemName elemDefinitionOrType> where elemName is the actual element name, and elemDefinitionOrType indicates whether the content of the element is pure data or a compound type of data and other elements.

  11. Some Element Types • The element type keyword ANY allows the element to contain textual data, nested elements (each of which must itself be declared), or any legal XML combination of the two. • The content model (#PCDATA) indicates parsed character data: ordinary text in which the parser still recognizes markup and expands entity references. • The element type keyword EMPTY indicates that the element is always empty.
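
The three element types above can be seen together in one small document. This is only a sketch; the scratch, caption, and linebreak names are hypothetical.

```xml
<?xml version="1.0"?>
<!DOCTYPE scratch [
  <!ELEMENT scratch   ANY>        <!-- text and any declared elements -->
  <!ELEMENT caption   (#PCDATA)>  <!-- text only -->
  <!ELEMENT linebreak EMPTY>      <!-- no content at all -->
]>
<scratch>Loose text, then <caption>a caption</caption><linebreak/> and more text.</scratch>
```

Putting any text or child element inside linebreak, or a child element inside caption, would make the document invalid.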

  12. Nesting elements • To define the allowed nestings within a DTD, the following notation is used: <!ELEMENT elemName (nestedElem, nestedElem, …)> where the order of elements is enforced as a validity constraint within an XML document. • By default, an element must appear exactly once when specified without any modifiers in the DTD.

  13. Recurrence Operators: Recurrence operators can be used to indicate how many times an element must appear in an XML document: • (no operator): exactly once • ?: zero or one time • *: zero or more times • +: one or more times
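
The ?, *, and + suffixes already seen in the Book example of slide 6 can be sketched in a fresh content model as follows. The playlist element and its children are hypothetical names used only for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE playlist [
  <!-- title:   no operator, exactly once
       track+:  one or more
       comment*: zero or more
       owner?:  zero or one -->
  <!ELEMENT playlist (title, track+, comment*, owner?)>
  <!ELEMENT title   (#PCDATA)>
  <!ELEMENT track   (#PCDATA)>
  <!ELEMENT comment (#PCDATA)>
  <!ELEMENT owner   (#PCDATA)>
]>
<playlist>
  <title>Road Trip</title>
  <track>Song A</track>
  <track>Song B</track>
</playlist>
```

The document is valid with no comment and no owner, but it would become invalid if every track were removed or if a second title were added.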

  14. Grouping elements • Often, recurrence occurs for a block or group of elements rather than with a single element. • To signify a group, enclose a set of elements within parentheses. Nested parentheses are acceptable. • In this way, a recurrence operator can then be applied to the group. • E.g. <!ELEMENT groupingExample ((group1Elem1, group1Elem2)+, (group2Elem1, group2Elem2)?)+>
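
A smaller grouping, with a document that satisfies it, may make the idea easier to follow. The contacts, name, phone, and note names are assumptions made for this sketch.

```xml
<?xml version="1.0"?>
<!DOCTYPE contacts [
  <!-- One or more (name, phone) pairs, then an optional note -->
  <!ELEMENT contacts ((name, phone)+, note?)>
  <!ELEMENT name  (#PCDATA)>
  <!ELEMENT phone (#PCDATA)>
  <!ELEMENT note  (#PCDATA)>
]>
<contacts>
  <name>Ada</name>
  <phone>555-0100</phone>
  <name>Grace</name>
  <phone>555-0101</phone>
  <note>Updated this week.</note>
</contacts>
```

Because + applies to the whole group, a name without a following phone would make the document invalid.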

  15. Either Or • In the DTD, an “OR” operator is signified by using |. This allows one thing or the other to occur, and can be used in conjunction with groupings. • E.g. <!ELEMENT aggregateElement (#PCDATA|Element1|Element2)*>
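
The mixed-content form shown above can be sketched with hypothetical para, em, and code elements. In a mixed-content model, #PCDATA must come first and the group must end with *.

```xml
<?xml version="1.0"?>
<!DOCTYPE para [
  <!-- Text freely interleaved with em and code, in any order and number -->
  <!ELEMENT para (#PCDATA | em | code)*>
  <!ELEMENT em   (#PCDATA)>
  <!ELEMENT code (#PCDATA)>
]>
<para>Use <code>#PCDATA</code> <em>first</em> in a mixed-content model.</para>
```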

  16. Defining Attributes • Attribute definitions are in the following form: <!ATTLIST enclosingElement attributeName attributeType attributeModifier …> • The attributeType keyword CDATA allows an attribute to take on any string value, and may represent a comment or additional information about an element. • Another attribute type is an enumeration, where any of the specified values may be used, but any other value for the attribute results in an invalid document. • E.g. <!ATTLIST elementName attributeName (value1|value2) attributeModifier …>

  17. Attribute Modifiers • We can indicate in the attribute definition whether the attribute is required within an element. • The three modifier keywords are: #IMPLIED, #REQUIRED, and #FIXED. • An implied attribute may be given a value, or left unspecified. • A required attribute must be given a value. • A fixed attribute has a specified value that can never change; if the attribute appears in the document, it must carry exactly that value. The notation for this is: <!ATTLIST elementName attributeName attributeType #FIXED "fixedValue">
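
The modifiers can be compared side by side in a single attribute list. This sketch reuses the Book element from slide 6; the edition and format attributes are assumptions added only for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE Book [
  <!ELEMENT Book (#PCDATA)>
  <!ATTLIST Book
    ISBN     CDATA                 #REQUIRED
    edition  CDATA                 #IMPLIED
    format   CDATA                 #FIXED "print"
    section  (fiction|nonfiction)  "fiction">
]>
<!-- Only ISBN has to be written out. A parser that reads the DTD
     supplies format="print" and section="fiction"; edition stays absent. -->
<Book ISBN="1234">To Kill a Mockingbird</Book>
```

Writing format="ebook" or section="poetry" on this element would make the document invalid.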

  18. Parameter Entities in DTDs • Parameter entities are entities that can only be used in the DTD. • A simple internal parameter entity has the format: <!ENTITY % name "definition"> • E.g.
     <?xml version='1.0' standalone='yes'?>
     <!DOCTYPE Book [
       <!ENTITY % sum "<!ELEMENT Summary (#PCDATA)>">
       <!ELEMENT Book (Title, Author+, Summary*, Note?)>
       <!ELEMENT Title (#PCDATA)>
       <!ELEMENT Author (#PCDATA)>
       %sum;
     ]>
     …

  19. Parameter Entities in DTDs (contd.) • External parameter entities can be declared using the following: <!ENTITY % name SYSTEM "URI"> or <!ENTITY % name PUBLIC "identifier" "URI"> • E.g. The following ‘orders.dtd’ file could be created:
     <!ENTITY % record "(Name, Date, Orders)">
     <!ELEMENT Store (Customer|Buyer|Supplier)*>
     <!ELEMENT Customer %record;>
     <!ELEMENT Buyer %record;>
     <!ELEMENT Supplier %record;>
     <!ELEMENT Name (#PCDATA)>
     <!ELEMENT Date (#PCDATA)>
     <!ELEMENT Orders (Product|Price)>
     <!ELEMENT Product (#PCDATA)>
     <!ELEMENT Price (#PCDATA)>
     <!ENTITY % XHTML1-t.dtd PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
       "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
     %XHTML1-t.dtd;

  20. Using INCLUDE and IGNORE • We can customize our DTDs using the INCLUDE and IGNORE statements, which have the following syntax: <![INCLUDE[ DTD sections ]]> <![IGNORE[ DTD sections ]]> • E.g. In the ‘orders.dtd’ file, add the following lines:
     <!ENTITY % includer "INCLUDE">
     …(same as before)…
     <![%includer;[
       <!ELEMENT Product_ID (#PCDATA)>
       <!ELEMENT Ship_Date (#PCDATA)>
       <!ELEMENT Tax (#PCDATA)>
     ]]>
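
One way a document might switch such a section off is sketched below. It assumes orders.dtd declares the includer parameter entity and uses it as %includer; to open the conditional section, which is only permitted in an external DTD. The empty Store element is just a placeholder.

```xml
<?xml version="1.0" standalone="no"?>
<!DOCTYPE Store SYSTEM "orders.dtd" [
  <!-- The internal subset is read before the external DTD, and the first
       declaration of an entity is the one that counts, so this value
       replaces the "INCLUDE" assigned inside orders.dtd. -->
  <!ENTITY % includer "IGNORE">
]>
<Store/>
```

With this override in place, the Product_ID, Ship_Date, and Tax declarations are skipped for this document only; orders.dtd itself is left unchanged.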

  21. Example: Using the XHTML 1.1 DTD • The XHTML 1.1 DTD is a DTD driver which includes various XHTML 1.1 modules (i.e. DTD sections) using parameter entities. • E.g.
     <!-- Tables Module -->
     <!ENTITY % xhtml-table.module "INCLUDE">
     <![%xhtml-table.module;[
     <!ENTITY % xhtml-table.mod PUBLIC "-//W3C//ELEMENTS XHTML 1.1 Tables 1.0//EN" "xhtml11-table-1.mod">
     %xhtml-table.mod;]]>
     • The above allows us to customize the XHTML 1.1 DTD to include/exclude support for tables.

  22. Next session: Parsing XML Documents • Parsing techniques • Writing your own XML applications
