Defining XML: The Document Type Definition

PowerPoint Slideshow about 'Defining XML' - ivanbritt


Document Type Definition

  • a text syntax for defining

    • the elements of an XML document

    • their attributes (and possibly default values)

    • the document's structure

  • <?xml … standalone = "no" … ?> implies that an external definition exists and may be required to properly understand the content

Why do we need DTDs?

  • Define classes of XML documents

    • For particular applications

    • Agreement on data and structure

  • Validate XML data

    • DTD is used to check structure

  • Document an XML class

    • DTD provides complete information about an XML class

Linking an XML file to a DTD

  • a document type declaration is added to the XML

    <!DOCTYPE message SYSTEM "myDTD.dtd">







What Is a DTD?

  • Defines a type of XML document

    • What elements are allowed?

    • What attributes do they have?

    • How can they be structured?

  • DTD is in text format

  • Usually external to the XML data

    • Linked by a document type declaration

  • May be included in the XML data file

Element type declarations

<!ELEMENT myElement (#PCDATA)>

  • <!ELEMENT opens the element type declaration

  • myElement is the name of the element being defined

  • (#PCDATA) is the content that the element can have

#PCDATA = parsed character data

<!ELEMENT message (#PCDATA)>

One line of text, stored in messageML.dtd

Example of a message document conforming to this DTD

<?xml version = "1.0" ?>

<!DOCTYPE message SYSTEM "messageML.dtd">

<message>

Welcome to XML!

</message>

Internal DTD Example

<?xml version = "1.0" ?>

<!DOCTYPE message [

<!ELEMENT message (#PCDATA)>

]>

<message>

Welcome to XML!

</message>

Defining structure

  • Element declarations define the content of elements

  • Content can be text or other elements

  • Content defines structure

    • How are the elements nested?

    • How many elements can be included?

    • What order do elements come in?

Defining structure

<!ELEMENT classroom (teacher, student)>

a classroom contains exactly one teacher followed by exactly one student

<!ELEMENT dessert (iceCream | pastry)>

a dessert contains either one iceCream or one pastry, but not both

<!ELEMENT album (track+)>

an album contains one or more tracks
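
As a rough sketch of how the first declaration constrains a document, the file below embeds the classroom declaration in an internal DTD. The declarations for teacher and student, and the names used as text, are assumptions added only so that the example is complete.

```xml
<?xml version="1.0"?>
<!DOCTYPE classroom [
  <!ELEMENT classroom (teacher, student)>
  <!-- assumed for illustration: both children hold plain text -->
  <!ELEMENT teacher (#PCDATA)>
  <!ELEMENT student (#PCDATA)>
]>
<!-- valid: exactly one teacher followed by exactly one student -->
<classroom>
  <teacher>Ms Example</teacher>
  <student>A. Student</student>
</classroom>
```

Adding a second student, or leaving out the teacher, would keep the file well-formed but a validating parser should reject it.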

Occurrence indicators

Plus sign (+)

Element will appear 1 to many times

<!ELEMENT album (track+)>

Asterisk (*)

Element will appear 0 to many times

<!ELEMENT library (book*)>

Question mark (?)

Element will appear 0 to 1 times

<!ELEMENT seat (person?)>
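
A small sketch of the asterisk in use, assuming (for illustration only) that book holds plain text and using made-up titles:

```xml
<?xml version="1.0"?>
<!DOCTYPE library [
  <!ELEMENT library (book*)>
  <!-- assumed for illustration: a book holds plain text -->
  <!ELEMENT book (#PCDATA)>
]>
<!-- book* allows any number of books, including none at all -->
<library>
  <book>First title</book>
  <book>Second title</book>
</library>
```

Because the asterisk permits zero occurrences, an empty <library/> element would also conform to this DTD. Changing the declaration to (book+) would make the empty library invalid.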

DTD Example 1

<!ELEMENT class (number, (instructor | assistant+), (credit | nocredit))>

a class must contain a number, followed by either an instructor or one or more assistants, followed by either a credit or a nocredit



<instructor>John McCall</instructor>
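
One possible complete document for this declaration is sketched below. The declarations for the child elements and the value 101 are assumptions made for illustration; only the class declaration and the instructor line come from the slide.

```xml
<?xml version="1.0"?>
<!DOCTYPE class [
  <!ELEMENT class (number, (instructor | assistant+), (credit | nocredit))>
  <!-- assumed child declarations -->
  <!ELEMENT number (#PCDATA)>
  <!ELEMENT instructor (#PCDATA)>
  <!ELEMENT assistant (#PCDATA)>
  <!ELEMENT credit EMPTY>
  <!ELEMENT nocredit EMPTY>
]>
<class>
  <number>101</number>
  <instructor>John McCall</instructor>
  <credit/>
</class>
```

Replacing the instructor with two assistant elements would also be valid. Including both an instructor and an assistant would not.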



DTD Example 2

<!ELEMENT donutBox (jam?, lemon*, ((cream | sugar)+ | iced))>

a donutBox contains 0 or 1 jam, followed by 0 to many lemon, followed by either one to many cream or sugar (in any mix) or exactly one iced
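
A sketch of one conforming donutBox follows. Declaring every donut as EMPTY is an assumption made for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE donutBox [
  <!ELEMENT donutBox (jam?, lemon*, ((cream | sugar)+ | iced))>
  <!-- assumed for illustration: donuts carry no content -->
  <!ELEMENT jam EMPTY>
  <!ELEMENT lemon EMPTY>
  <!ELEMENT cream EMPTY>
  <!ELEMENT sugar EMPTY>
  <!ELEMENT iced EMPTY>
]>
<!-- one jam, two lemon, then a mix of cream and sugar -->
<donutBox>
  <jam/>
  <lemon/>
  <lemon/>
  <cream/>
  <sugar/>
  <cream/>
</donutBox>
```

A box holding only a single iced element is the shortest valid document. A box that mixes iced with cream or sugar is invalid, as is one where lemon appears before jam.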










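DTD Example 3

<!ELEMENT farm (farmer+, (dog* | cat?), pig*, (goat | cow)?, (chicken+ | duck*))>

a farm contains one or more farmers, followed by either 0 to many dogs or 0 or 1 cat, followed by 0 to many pigs, followed by 0 or 1 of either a goat or a cow, followed by either one to many chickens or 0 to many ducks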



<farmer>Farmer Maggot</farmer>
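
The farm declaration can be exercised with a document such as the sketch below. The animal names and the plain-text declarations for the children are assumptions made for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE farm [
  <!ELEMENT farm (farmer+, (dog* | cat?), pig*, (goat | cow)?, (chicken+ | duck*))>
  <!-- assumed for illustration: every child holds plain text -->
  <!ELEMENT farmer (#PCDATA)>
  <!ELEMENT dog (#PCDATA)>
  <!ELEMENT cat (#PCDATA)>
  <!ELEMENT pig (#PCDATA)>
  <!ELEMENT goat (#PCDATA)>
  <!ELEMENT cow (#PCDATA)>
  <!ELEMENT chicken (#PCDATA)>
  <!ELEMENT duck (#PCDATA)>
]>
<farm>
  <farmer>Farmer Maggot</farmer>
  <dog>Rex</dog>
  <dog>Fido</dog>
  <pig>Snowy</pig>
  <cow>Daisy</cow>
  <chicken>Henrietta</chicken>
  <chicken>Clara</chicken>
</farm>
```

Since every group after farmer+ can match nothing, a farm containing only a single farmer is also valid. A farm with both a dog and a cat, or both a goat and a cow, is not.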




DTD Example 4

mixed content (narrative XML)

<!ELEMENT paragraph (#PCDATA|name|profession|date|irony)*>

A <paragraph> element may contain any combination of <name>, <profession>, <date> or <irony> elements interspersed with parsed character data.

<paragraph> Today’s date is <date month="October" day="1"/> and

<name>John McCall</name>, a <profession>lecturer</profession> is delivering a <irony>scintillating</irony> XML lecture.</paragraph>
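
For the paragraph above to validate, the child elements and the attributes of date also need declarations. The sketch below supplies one plausible set; these extra declarations are assumptions, not part of the original slide.

```xml
<?xml version="1.0"?>
<!DOCTYPE paragraph [
  <!ELEMENT paragraph (#PCDATA | name | profession | date | irony)*>
  <!-- assumed declarations for the children -->
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT profession (#PCDATA)>
  <!ELEMENT irony (#PCDATA)>
  <!ELEMENT date EMPTY>
  <!ATTLIST date month CDATA #REQUIRED
                 day   CDATA #REQUIRED>
]>
<paragraph>Today's date is <date month="October" day="1"/> and
<name>John McCall</name>, a <profession>lecturer</profession> is delivering a
<irony>scintillating</irony> XML lecture.</paragraph>
```

In a mixed content model #PCDATA must come first in the list, the separator must be the vertical bar, and the group must end with an asterisk, so the order and number of child elements cannot be restricted.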

Defining attributes

  • attributes are assigned to elements using the <!ATTLIST …> declaration

  • ATTLIST defines

    • Which element the attribute belongs to

    • The name of the attribute

    • The values the attribute can take

    • Possible default values

    • Whether the attribute MUST be present or not

Attribute values

  • In HTML all attributes are text

  • DTDs support 10 attribute types

  • Most common are:

    • CDATA (literal text)

    • ID (unique identifier)

    • NMTOKEN (a single name token containing no whitespace)

    • Enumeration (of all possible values)
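
The four common types can be seen side by side in a sketch like the one below. The book element, its attribute names and the values are hypothetical and chosen only to illustrate each type.

```xml
<?xml version="1.0"?>
<!DOCTYPE book [
  <!ELEMENT book (#PCDATA)>
  <!ATTLIST book
    title  CDATA                  #REQUIRED
    isbn   ID                     #REQUIRED
    shelf  NMTOKEN                #IMPLIED
    format (hardback | paperback) "paperback">
]>
<!-- title: any text; isbn: unique within the document and must be a name;
     shelf: one token without spaces; format: one of the listed values -->
<book title="An Example Title" isbn="isbn-0000" shelf="A12" format="hardback">Some text</book>
```

An ID value has to start with a letter, underscore or colon, which is why the hypothetical isbn value carries a textual prefix rather than being a bare number.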

Conditions on attributes

  • #REQUIRED

    • the attribute must be given a value in the XML

  • #IMPLIED

    • the attribute may be omitted from the XML

  • #FIXED

    • the value of the attribute is fixed and defined in the DTD

  • literal

    • a default value is supplied literally in the DTD

Example attribute declarations

<!ATTLIST pig id_code ID #REQUIRED>

<!ATTLIST pig sex (M | F) "F">

<!ATTLIST pig canFly CDATA #FIXED "no">

<pig weight = "1000kg" id_code = "pig017">
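
Put together as one file, the pig example might look like the sketch below. The EMPTY content model and the declaration of weight as CDATA #IMPLIED are assumptions, since the slide does not show how the element or that attribute is declared.

```xml
<?xml version="1.0"?>
<!DOCTYPE pig [
  <!-- assumed: pig has no content -->
  <!ELEMENT pig EMPTY>
  <!ATTLIST pig id_code ID      #REQUIRED>
  <!-- assumed: weight is optional free text -->
  <!ATTLIST pig weight  CDATA   #IMPLIED>
  <!ATTLIST pig sex     (M | F) "F">
  <!ATTLIST pig canFly  CDATA   #FIXED "no">
]>
<!-- sex and canFly are omitted, so the defaults F and no apply -->
<pig weight="1000kg" id_code="pig017"/>
```

Writing canFly="yes" in the document would be a validity error because the value is fixed, while leaving out id_code would be an error because it is required.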



Entities

  • used to represent characters that would otherwise cause parsing problems

  • &lt; represents <

  • &amp; represents &

  • &gt; represents >

  • &quot; represents "

  • &apos; represents '
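
A short sketch of the five predefined entities in use. The element and attribute names are hypothetical.

```xml
<?xml version="1.0"?>
<note>
  <!-- the parser delivers: Use <message> for text, and write AT&T ... -->
  <text>Use &lt;message&gt; for text, and write AT&amp;T with an escaped ampersand.</text>
  <!-- &quot; lets a double quote appear inside a double-quoted attribute value -->
  <quote said="She said &quot;hello&quot;">It&apos;s escaped.</quote>
</note>
```

These five entities need no declaration in a DTD. In practice &lt; and &amp; are the ones that are always required in text, because a raw < or & would be read as the start of markup.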

Defining entities

  • <!ENTITY label "replacementText">

  • <!ENTITY super "supercalifragilisticexpialidocious">

  • now &super; is replaced in the XML (or in attribute values) by supercalifragilisticexpialidocious
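
The same mechanism is sketched below with a different, hypothetical entity named dtd, declared in an internal DTD and used both in element content and in an attribute value. The mood attribute is also made up for the example.

```xml
<?xml version="1.0"?>
<!DOCTYPE message [
  <!ELEMENT message (#PCDATA)>
  <!ATTLIST message mood CDATA #IMPLIED>
  <!-- the replacement text must be quoted -->
  <!ENTITY dtd "Document Type Definition">
]>
<message mood="keen on the &dtd;">Every class of documents deserves a &dtd;.</message>
```

A parser that reads the internal DTD would be expected to hand the application the expanded text in both places, so the application never sees the entity reference itself.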

CDATA or PCDATA

  • PCDATA

    • Parsed Character DATA

    • will be parsed for entities and markup

  • CDATA

    • Character DATA

    • will NOT be parsed

    • CDATA sections are sometimes included in XML to include "literal" sections of code

Writing a CDATA section

<![CDATA[

Hi! I’m a CDATA section!

I can include anything that would normally upset the parser:

<?<<< &&&;; ><></> hahahahahahaha!!!

The only thing I have to avoid is a double closing square bracket followed by a greater-than sign, which means the CDATA has ended.

]]>
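
To show why CDATA sections are handy for code, the sketch below stores the same line of program text twice, once with entity escapes and once inside a CDATA section. The element names are hypothetical.

```xml
<?xml version="1.0"?>
<example>
  <!-- without CDATA every < and & has to be escaped -->
  <escaped>if (a &lt; b &amp;&amp; b &gt; c) { swap(); }</escaped>
  <!-- with CDATA the code can be written literally -->
  <literal><![CDATA[if (a < b && b > c) { swap(); }]]></literal>
</example>
```

Both elements carry exactly the same character data once parsed. The CDATA form is simply easier to read and write.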


Validation of XML

  • Validation means checking that an XML document conforms to its DTD

  • Adds security to automatic processing

  • Allows free machine-machine exchange of XML

  • Applied before manipulating XML

    • See XSLT, SAX, DOM later

Well-formed vs valid

  • Well-formed XML

    • The data obeys the XML syntax rules

  • Valid XML

    • The data is well-formed XML

    • The data has a DTD

    • The data conforms to the DTD

  • XML data may be well-formed but invalid
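
The difference can be illustrated with a document that is well-formed but not valid. The sketch reuses the classroom declaration from earlier in the lecture; the child declarations and names are assumptions.

```xml
<?xml version="1.0"?>
<!DOCTYPE classroom [
  <!ELEMENT classroom (teacher, student)>
  <!ELEMENT teacher (#PCDATA)>
  <!ELEMENT student (#PCDATA)>
]>
<!-- well-formed: every tag is closed and properly nested.
     NOT valid: the DTD requires the teacher to come before the student. -->
<classroom>
  <student>A. Student</student>
  <teacher>Ms Example</teacher>
</classroom>
```

A non-validating parser would be expected to accept this file without complaint, while a validating parser should report that the content of classroom does not match its declaration.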

XML parser types

  • validating parser

    • checks XML is well-formed

      • conforms to XML specification

    • checks XML is valid (has and matches a DTD)

  • non-validating parser

    • only checks XML is well-formed

    • may pass invalid XML

  • Now split into two sessions

    • Thursday C26 11.00-13.00

    • Friday C18 11.00-13.00

  • Choose one as convenient

  • Assessed Lab will be in a separately arranged session on afternoon of Friday 30th November