
Defining XML The Document Type Definition


ivanbritt


Presentation Transcript


  1. Defining XML: The Document Type Definition

  2. Document Type Definition • text syntax for defining • elements of XML • attributes (and possibly default values) • structure • <?xml … standalone = "no" … ?> • implies that an external definition exists and may be required to properly understand the content

  3. Why do we need DTDs? • Define classes of xml documents • For particular applications • Agreement on data and structure • Validate xml data • DTD is used to check structure • Document an xml class • DTD provides complete information about an xml class

  4. linking an XML file to a DTD • a document type declaration is added to the xml: <!DOCTYPE message SYSTEM "myDTD.dtd"> • (Diagram: the XML file message.xml links to the DTD myDTD.dtd through its DOCTYPE declaration)

  5. What Is a DTD? • Defines a type of xml document • What elements are allowed? • What attributes do they have? • How can they be structured? • DTD is in text format • Usually external to the xml data • Linked by a document type declaration • May be included in the xml data file

  6. Element type declarations <!ELEMENT myElement (#PCDATA)> • myElement is the name of the element being defined • (#PCDATA) is the "element definition": the content that the element can have • #PCDATA = parsed character data

  7. Example <!ELEMENT message (#PCDATA)> • One line of text, stored in messageML.dtd • Example of a message document conforming to this DTD: <?xml version = "1.0" ?> <!DOCTYPE message SYSTEM "messageML.dtd"> <message> Welcome to XML! </message>
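A minimal sketch of how a program can see the link between the message document and its DTD, using Python's standard `xml.dom.minidom`. As far as I know, minidom is non-validating and only records what the DOCTYPE line says; the variable names are illustrative, not part of the lecture.

```python
from xml.dom import minidom

# The message document from the slide, written with straight quotes.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE message SYSTEM "messageML.dtd">
<message>Welcome to XML!</message>"""

# Assumption: minidom does not validate and does not need messageML.dtd
# to exist; it only stores the information given in the DOCTYPE.
doc = minidom.parseString(DOCUMENT)
doctype = doc.doctype

print(doctype.name)                 # root element named in the DOCTYPE
print(doctype.systemId)             # location of the external DTD
print(doc.documentElement.tagName)  # actual root element of the document
```

The DOCTYPE name and the actual root element should agree; a validating parser would reject the document if they did not.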

  8. Internal DTD Example <?xml version = "1.0" ?> <!DOCTYPE message [ <!ELEMENT message (#PCDATA)> ]> <message> Welcome to XML! </message>

  9. Defining structure • Element declarations define the content of elements • Content can be text or other elements • Content defines structure • How are the elements nested? • How many elements can be included? • What order do elements come in?

  10. Defining structure • <!ELEMENT classroom (teacher, student)> a classroom contains exactly one teacher followed by exactly one student • <!ELEMENT dessert (iceCream | pastry)> a dessert contains either one iceCream or one pastry, but not both • <!ELEMENT album (track+)> an album contains one or more tracks

  11. occurrence indicators • Plus sign (+): element will appear 1 to many times, e.g. <!ELEMENT album (track+)> • Asterisk (*): element will appear 0 to many times, e.g. <!ELEMENT library (book*)> • Question mark (?): element will appear 0 to 1 times, e.g. <!ELEMENT seat (person?)>

  12. A Simple Document Type Definition

  13. DTD Example 1 <!ELEMENT class (number, (instructor | assistant+), (credit | nocredit) )> • a class must contain a number, followed by either an instructor or one or more assistants, followed by either a credit or a nocredit • <class> <number>CM4003</number> <instructor>John McCall</instructor> <credit>15</credit> </class>

  14. DTD Example 2 <!ELEMENT donutBox (jam?, lemon*, ((cream | sugar)+ | iced))> • a donutBox contains 0 or 1 jam, followed by 0 to many lemon, followed by either one or more cream or sugar elements (in any mix) or exactly one iced • <donutBox> <jam>raspberry</jam> <lemon>sour</lemon> <lemon>half-sour</lemon> <iced>chocolate</iced> </donutBox> • <donutBox> <iced>pink</iced> </donutBox>
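Content models behave much like regular expressions over the sequence of child elements. The sketch below hand-translates the donutBox model into a Python regular expression and checks the order of child tags. It is an illustration of the idea, not a DTD validator: the regex, the function name, and the `bad_box` document are assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET

# Hand translation of (jam?, lemon*, ((cream | sugar)+ | iced)).
# Each child tag is written as "name " so that names cannot run together.
DONUT_BOX = re.compile(r"(jam )?(lemon )*((cream |sugar )+|iced )")

def matches_model(xml_text):
    """Return True if the root's child elements fit the donutBox model."""
    root = ET.fromstring(xml_text)
    sequence = "".join(child.tag + " " for child in root)
    return DONUT_BOX.fullmatch(sequence) is not None

full_box = ("<donutBox><jam>raspberry</jam><lemon>sour</lemon>"
            "<lemon>half-sour</lemon><iced>chocolate</iced></donutBox>")
small_box = "<donutBox><iced>pink</iced></donutBox>"
# Hypothetical counter-example: lemon before jam, and no final choice.
bad_box = "<donutBox><lemon>sour</lemon><jam>raspberry</jam></donutBox>"

print(matches_model(full_box), matches_model(small_box), matches_model(bad_box))
```

The comma becomes concatenation, the vertical bar becomes alternation, and the occurrence indicators ?, * and + keep their meaning.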

  15. DTD Example 3 <!ELEMENT farm (farmer+, (dog* | cat?), pig*, (goat | cow)?, (chicken+ | duck*) )> <farm> <farmer>Farmer Maggot</farmer> <cat>Tiddles</cat> <duck>Donald</duck> </farm>

  16. DTD Example 4: mixed content (narrative XML) <!ELEMENT paragraph (#PCDATA|name|profession|date|irony)*> • A <paragraph> element may contain any combination of <name>, <profession>, <date> or <irony> elements interspersed with parsed character data. • <paragraph>Today’s date is <date month="October" day="1"/> and <name>John McCall</name>, a <profession>lecturer</profession> is delivering a <irony>scintillating</irony> XML lecture.</paragraph>
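To see what "interspersed with parsed character data" means in practice, here is a small sketch using Python's standard `xml.etree.ElementTree`. ElementTree stores the text before the first child in `.text` and the text following each child in that child's `.tail`; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The paragraph from the slide, written with straight quotes.
PARAGRAPH = ('<paragraph>Today\'s date is <date month="October" day="1"/> and '
             '<name>John McCall</name>, a <profession>lecturer</profession> '
             'is delivering a <irony>scintillating</irony> XML lecture.</paragraph>')

paragraph = ET.fromstring(PARAGRAPH)

# Child elements in document order.
child_tags = [child.tag for child in paragraph]

# All character data, with the markup stripped out.
plain_text = "".join(paragraph.itertext())

print(child_tags)
print(repr(paragraph.text))               # text before the first child
print(repr(paragraph.find("date").tail))  # text between <date/> and <name>
print(plain_text)
```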

  17. Defining attributes • attributes assigned to elements using the <!ATTLIST …> instruction • ATTLIST defines • Which element the attribute belongs to • The name of the attribute • The values the attribute can take • Possible default values • Whether the attribute MUST be present or not

  18. Attribute values • In HTML all attributes are text • DTDs support 10 attribute types • Most common are: • CDATA (literal text) • ID (unique identifier) • NMTOKEN (a single name token, containing no whitespace) • Enumeration (of all possible values)

  19. Conditions on attributes • #REQUIRED • the attribute must be given a value in the XML • #IMPLIED • the attribute may be omitted from the XML • #FIXED • the value of the attribute is fixed and defined in the DTD • literal • a default value is supplied literally in the DTD

  20. Example attribute declarations <!ELEMENT pig (#PCDATA)> <!ATTLIST pig weight CDATA #REQUIRED> <!ATTLIST pig id_code ID #REQUIRED> <!ATTLIST pig name NMTOKEN #IMPLIED> <!ATTLIST pig sex (M | F) "F"> <!ATTLIST pig canFly CDATA #FIXED "no"> <pig weight = "1000kg" id_code = "pig017"> Porky </pig>
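The effect of #REQUIRED, #IMPLIED, #FIXED and literal defaults can be sketched in a few lines of Python. The rule table and the `apply_rules` function below are hypothetical, written only to mirror the pig declarations; the sketch does not check ID uniqueness or NMTOKEN syntax, which a real validating parser would.

```python
# Hypothetical rule table mirroring the pig ATTLIST declarations.
REQUIRED, IMPLIED, FIXED, DEFAULT = "required", "implied", "fixed", "default"

PIG_RULES = {
    "weight":  {"kind": REQUIRED},
    "id_code": {"kind": REQUIRED},
    "name":    {"kind": IMPLIED},
    "sex":     {"kind": DEFAULT, "value": "F", "allowed": {"M", "F"}},
    "canFly":  {"kind": FIXED, "value": "no"},
}

def apply_rules(attrs, rules=PIG_RULES):
    """Return (effective attributes, list of problems) for one element."""
    effective, problems = dict(attrs), []
    for name in attrs:
        if name not in rules:
            problems.append(f"{name}: not declared")
    for name, rule in rules.items():
        if name not in attrs:
            if rule["kind"] == REQUIRED:
                problems.append(f"{name}: required but missing")
            elif rule["kind"] in (FIXED, DEFAULT):
                effective[name] = rule["value"]  # the DTD supplies the value
            continue
        if rule["kind"] == FIXED and attrs[name] != rule["value"]:
            problems.append(f"{name}: must be {rule['value']!r}")
        if "allowed" in rule and attrs[name] not in rule["allowed"]:
            problems.append(f"{name}: not one of {sorted(rule['allowed'])}")
    return effective, problems

# The <pig> element from the slide: only the two required attributes are given.
effective, problems = apply_rules({"weight": "1000kg", "id_code": "pig017"})
print(effective)
print(problems)
```

Porky gets sex "F" and canFly "no" without stating either, because the DTD supplies the default and the fixed value.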

  21. entities • used to represent text that would cause parsing problems • &lt; represents < • &amp; represents & • &gt; represents > • &quot; represents " • &apos; represents '
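A short sketch of the five predefined entities at work, using Python's standard `xml.sax.saxutils` to escape text and `xml.etree.ElementTree` to parse it back. The sample string is made up for the illustration.

```python
from xml.sax.saxutils import escape, unescape
import xml.etree.ElementTree as ET

# Text that would upset a parser if placed in XML unescaped.
raw = 'if a < b && name == "pig"'

# escape() handles &, < and > by default; quotes are added via the mapping.
escaped = escape(raw, {'"': "&quot;", "'": "&apos;"})
print(escaped)

# A parser turns the entity references back into the original characters.
element = ET.fromstring("<code>" + escaped + "</code>")
print(element.text)

# unescape() reverses the substitution without involving a parser.
restored = unescape(escaped, {"&quot;": '"', "&apos;": "'"})
```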

  22. defining entities • <!ENTITY label "replacementText"> • <!ENTITY super "supercalifragilisticexpialidocious"> • now &super; is replaced in the XML (or in attribute values) by supercalifragilisticexpialidocious
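A minimal sketch of entity replacement, assuming the entity is declared in an internal DTD subset so that Python's standard `xml.etree.ElementTree` (a non-validating parser) can expand it. The message text is invented for the example.

```python
import xml.etree.ElementTree as ET

# The entity is declared in the internal DTD subset and used in the content.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE message [
  <!ELEMENT message (#PCDATA)>
  <!ENTITY super "supercalifragilisticexpialidocious">
]>
<message>That lecture was &super;!</message>"""

message = ET.fromstring(DOCUMENT)

# The parser has already substituted the replacement text for &super;.
print(message.text)
```

Note that the replacement text must be quoted in the declaration.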

  23. CDATA or PCDATA? • PCDATA • Parsed Character DATA • will be parsed for entities • CDATA • Character DATA • Will NOT be parsed • CDATA sections are sometimes included in xml to include “literal” sections of code

  24. Writing a CDATA section <![CDATA[ Hi! I’m a CDATA section! I can include anything that would normally upset the parser: <?<<< &&&;; ><></> hahahahahahaha!!! The only thing I have to avoid is a double closing square bracket followed by a greater-than sign, which means the CDATA has ended. ]]>
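A small sketch showing that a parser hands CDATA content over untouched, using Python's standard `xml.etree.ElementTree`. The `<note>` element is an assumption made for the example.

```python
import xml.etree.ElementTree as ET

# Markup-like characters inside the CDATA section are not interpreted.
DOCUMENT = "<note><![CDATA[ <?<<< &&&;; ><></> stays literal ]]></note>"

note = ET.fromstring(DOCUMENT)
print(repr(note.text))

# When written back out, the same text is escaped with entities instead.
serialized = ET.tostring(note)
print(serialized)
```

Once parsed, the application cannot tell whether the text came from a CDATA section or from escaped character data.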

  25. Validation of xml • Validation means checking that an xml document conforms to its DTD • Adds security to automatic processing • Allows free machine-machine exchange of xml • Applied before manipulating xml • See XSLT, SAX, DOM later

  26. Well-formed vs valid • Well-formed xml • The data obeys the xml syntax rules • Valid xml • The data is well-formed xml • The data has a DTD • The data conforms to the DTD • xml data may be well-formed but invalid

  27. xml parser types • validating parser • checks XML is well-formed • conforms to XML specification • checks XML is valid (has and matches a DTD) • non-validating parser • only checks XML is well-formed • may pass invalid XML
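The behaviour of a non-validating parser can be sketched with Python's standard `xml.etree.ElementTree`, which checks well-formedness only. The two sample documents and the `is_well_formed` helper are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Well-formed but invalid: the DTD allows only text inside <message>.
INVALID = """<!DOCTYPE message [ <!ELEMENT message (#PCDATA)> ]>
<message><shout>Welcome to XML!</shout></message>"""

# Not well-formed: the closing tag is missing.
BROKEN = "<message>Welcome to XML!"

def is_well_formed(xml_text):
    """Return True if a non-validating parser accepts the document."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True

print(is_well_formed(INVALID))  # invalid XML passes a non-validating parser
print(is_well_formed(BROKEN))   # ill-formed XML is rejected by any parser
```

Catching the invalid document would need a validating parser, which the Python standard library does not provide.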

  28. Labs • Now split into two sessions • Thursday C26 11.00-13.00 • Friday C18 11.00-13.00 • Choose one as convenient • Assessed Lab will be in a separately arranged session on afternoon of Friday 30th November
