Defining XML The Document Type Definition


PowerPoint Slideshow about 'Defining XML' - ivanbritt


Presentation Transcript
Document Type Definition
  • text syntax for defining
    • elements of XML
    • attributes (and possibly default values)
    • structure
  • <?xml … standalone="no" … ?>
  • implies that an external definition exists and may be required to properly understand the content
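
As a minimal sketch of the declaration above, the document below sets standalone to "no" because it relies on an external DTD. The file name note.dtd and the element name note are made up for illustration and do not come from the slides.

```xml
<?xml version="1.0" standalone="no"?>
<!-- note.dtd is a hypothetical external DTD; it could, for example,
     supply default attribute values that a parser needs to see -->
<!DOCTYPE note SYSTEM "note.dtd">
<note>Remember the lab session.</note>
```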
Why do we need DTDs?
  • Define classes of xml documents
    • For particular applications
    • Agreement on data and structure
  • Validate xml data
    • DTD is used to check structure
  • Document an xml class
    • DTD provides complete information about an xml class
linking an XML file to a DTD
  • a document type declaration is added to the xml

<!DOCTYPE message SYSTEM "myDTD.dtd">

[Diagram: the XML file message.xml is linked to the DTD file myDTD.dtd by the DOCTYPE declaration]

What Is a DTD?
  • Defines a type of xml document
    • What elements are allowed?
    • What attributes do they have?
    • How can they be structured?
  • DTD is in text format
  • Usually external to the xml data
    • Linked by a document type declaration
  • May be included in the xml data file
Element type declarations

<!ELEMENT myElement (#PCDATA)>

  • <!ELEMENT : the keyword that opens an element type declaration
  • myElement : name of the element being defined
  • (#PCDATA) : content that the element can have
  • #PCDATA = parsed character data


Example

<!ELEMENT message ( #PCDATA )>

One line of text, stored in messageML.dtd

Example of a message document conforming to this DTD

<?xml version = "1.0" ?>

<!DOCTYPE message SYSTEM "messageML.dtd">

<message>

Welcome to XML!

</message>

Internal DTD Example

<?xml version = "1.0" ?>

<!DOCTYPE message [

<!ELEMENT message (#PCDATA)>

]>

<message>

Welcome to XML!

</message>


Defining structure

  • Element declarations define the content of elements
  • Content can be text or other elements
  • Content defines structure
    • How are the elements nested?
    • How many elements can be included?
    • What order do elements come in?
Defining structure

<!ELEMENT classroom (teacher, student)>

a classroom contains exactly one teacher followed by exactly one student

<!ELEMENT dessert (iceCream | pastry)>

a dessert contains either one iceCream or one pastry, but not both

<!ELEMENT album (track+)>

an album contains one or more tracks
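
To make the three content models above concrete, here is a sketch of one fragment that satisfies each. These are three separate fragments rather than one document, the text inside the elements is invented, and the child elements are assumed to be declared as (#PCDATA).

```xml
<!-- (teacher, student): exactly one teacher, then exactly one student -->
<classroom>
  <teacher>Ms Smith</teacher>
  <student>Alex</student>
</classroom>

<!-- (iceCream | pastry): one of the two, never both -->
<dessert>
  <pastry>eclair</pastry>
</dessert>

<!-- (track+): at least one track -->
<album>
  <track>Intro</track>
  <track>Outro</track>
</album>
```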

occurrence indicators

Plus sign (+)

Element will appear 1 to many times

<!ELEMENT album (track+)>

Asterisk (*)

Element will appear 0 to many times

<!ELEMENT library (book*)>

Question mark (?)

Element will appear 0 to 1 times

<!ELEMENT seat (person?)>
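
The three indicators can also be combined in one DTD. The sketch below reuses the library declaration from this slide; the book, title, author and edition declarations are assumptions added only to show +, * and ? working together.

```xml
<?xml version="1.0"?>
<!DOCTYPE library [
  <!-- library comes from the slide; everything below it is hypothetical -->
  <!ELEMENT library (book*)>
  <!ELEMENT book (title, author+, edition?)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT edition (#PCDATA)>
]>
<library>
  <book>
    <title>First Book</title>
    <author>A. Writer</author>
    <author>B. Writer</author>
  </book>
  <book>
    <title>Second Book</title>
    <author>C. Writer</author>
    <edition>2</edition>
  </book>
</library>
```

Because of the asterisk, an empty <library/> would also be valid against this DTD.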

DTD Example 1

<!ELEMENT class (number, (instructor | assistant+), (credit | nocredit))>

a class must contain a number followed by either an instructor or one or more assistants followed by either a credit or a nocredit

<class>

<number>CM4003</number>

<instructor>John McCall</instructor>

<credit>15</credit>

</class>
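
The same declaration also accepts the other branch of each choice. The fragment below is a sketch with invented names; it assumes assistant is declared as (#PCDATA) and that nocredit may be empty.

```xml
<!-- One or more assistants instead of an instructor,
     and nocredit instead of credit -->
<class>
  <number>CM4003</number>
  <assistant>First Assistant</assistant>
  <assistant>Second Assistant</assistant>
  <nocredit/>
</class>
```

A class containing both an instructor and an assistant would not be valid, because the choice group allows only one branch.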

DTD Example 2

<!ELEMENT donutBox (jam?, lemon*, ((cream | sugar)+ | iced))>

a donutBox contains 0 or 1 jam, followed by 0 to many lemon, followed by either one to many cream or sugar elements (in any mix) or exactly one iced

<donutBox>

<jam>raspberry</jam>

<lemon>sour</lemon>

<lemon>half-sour</lemon>

<iced>chocolate</iced>

</donutBox>

<donutBox>

<iced>pink</iced>

</donutBox>

DTD Example 3

<!ELEMENT farm (farmer+,

(dog* | cat?), pig*,

(goat | cow)?, (chicken+ | duck*)

)>

<farm>

<farmer>Farmer Maggot</farmer>

<cat>Tiddles</cat>

<duck>Donald</duck>

</farm>
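
A few more fragments may help in reading this content model. They are a sketch with invented animal names, and they assume each child element is declared as (#PCDATA) with no required attributes.

```xml
<!-- Valid: every group except farmer+ may be empty -->
<farm>
  <farmer>Farmer Giles</farmer>
</farm>

<!-- Valid: two farmers, two dogs, one cow, two chickens -->
<farm>
  <farmer>Farmer Giles</farmer>
  <farmer>Farmer Brown</farmer>
  <dog>Rex</dog>
  <dog>Fido</dog>
  <cow>Daisy</cow>
  <chicken>Henrietta</chicken>
  <chicken>Clara</chicken>
</farm>

<!-- Not valid: (dog* | cat?) allows dogs or a cat, not both -->
<farm>
  <farmer>Farmer Giles</farmer>
  <dog>Rex</dog>
  <cat>Tiddles</cat>
</farm>
```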

DTD Example 4

mixed content (narrative XML)

<!ELEMENT paragraph (#PCDATA|name|profession|date|irony)*>

A <paragraph> element may contain any combination of <name>, <profession>, <date> or <irony> elements interspersed with parsed character data.

<paragraph> Today’s date is <date month="October" day="1"/> and

<name>John McCall</name>, a <profession>lecturer</profession> is delivering a <irony>scintillating</irony> XML lecture.</paragraph>
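
For the paragraph above to validate, its child elements need declarations too. The slide shows only the paragraph declaration, so the rest of this sketch is an assumption about how they might be declared.

```xml
<!ELEMENT paragraph (#PCDATA|name|profession|date|irony)*>
<!-- The declarations below are assumptions; the slide shows only paragraph -->
<!ELEMENT name (#PCDATA)>
<!ELEMENT profession (#PCDATA)>
<!ELEMENT irony (#PCDATA)>
<!-- date is written as an empty tag with two attributes in the example -->
<!ELEMENT date EMPTY>
<!ATTLIST date month CDATA #REQUIRED
               day   CDATA #REQUIRED>
```

In a mixed content model, #PCDATA has to come first in the group and the group has to end with an asterisk, so the DTD cannot control the order or number of the child elements.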

Defining attributes
  • attributes are assigned to elements using the <!ATTLIST …> declaration
  • ATTLIST defines
    • Which element the attribute belongs to
    • The name of the attribute
    • The values the attribute can take
    • Possible default values
    • Whether the attribute MUST be present or not
Attribute values
  • In HTML all attributes are text
  • DTDs support 10 attribute types
  • Most common are:
    • CDATA (literal text)
    • ID (unique identifier)
    • NMTOKEN (“no whitespace”)
    • Enumeration (of all possible values)
Conditions on attributes
  • #REQUIRED
    • the attribute must be given a value in the XML
  • #IMPLIED
    • the attribute may be omitted from the XML
  • #FIXED
    • the value of the attribute is fixed and defined in the DTD
  • literal
    • a default value is supplied literally in the DTD
Example attribute declarations

<!ELEMENT pig (#PCDATA)>

<!ATTLIST pig weight CDATA #REQUIRED>

<!ATTLIST pig id_code ID #REQUIRED>

<!ATTLIST pig name NMTOKEN #IMPLIED>

<!ATTLIST pig sex (M | F) "F">

<!ATTLIST pig canFly CDATA #FIXED "no">

<pig weight = "1000kg"

id_code = "pig017">

Porky

</pig>
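
The separate declarations can also be written as a single ATTLIST. The sketch below does that inside an internal DTD and adds a second, invented pig to show how the default and fixed values behave; the CDATA type for canFly is an assumption, since an attribute definition needs a type before #FIXED.

```xml
<?xml version="1.0"?>
<!DOCTYPE pig [
  <!ELEMENT pig (#PCDATA)>
  <!-- One ATTLIST may define several attributes of the same element -->
  <!ATTLIST pig
    weight  CDATA   #REQUIRED
    id_code ID      #REQUIRED
    name    NMTOKEN #IMPLIED
    sex     (M | F) "F"
    canFly  CDATA   #FIXED "no">
]>
<!-- name is omitted (allowed by #IMPLIED); the default for sex is overridden;
     canFly is left out, so the parser supplies the fixed value "no" -->
<pig weight="80kg" id_code="pig018" sex="M">Babe</pig>
```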

entities
  • used to represent text that would cause parsing problems
  • &lt; represents <
  • &amp; represents &
  • &gt; represents >
  • &quot; represents "
  • &apos; represents '
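
As a short sketch, the fragment below uses all five predefined entities. The element name rule and its note attribute are made up for illustration.

```xml
<!-- Reads as: if a < b && b > c then say "it's sorted" -->
<rule note="uses &quot;double&quot; and &apos;single&apos; quotes">
  if a &lt; b &amp;&amp; b &gt; c then say "it's sorted"
</rule>
```

Inside element content only < and & must be escaped; the quote entities matter mainly inside attribute values that are delimited by the same kind of quote.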
defining entities
  • <!ENTITY label "replacementText">
  • <!ENTITY super "supercallifragilisticexpialidocious">
  • now &super; is replaced in the XML content (or in attribute values) by supercallifragilisticexpialidocious
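
A minimal sketch of a complete document using this entity is shown below. It declares the entity in an internal DTD and reuses the message element from the earlier slides; note that the replacement text has to be quoted in the declaration.

```xml
<?xml version="1.0"?>
<!DOCTYPE message [
  <!ELEMENT message (#PCDATA)>
  <!ENTITY super "supercallifragilisticexpialidocious">
]>
<!-- A parser that reads the DTD expands &super; to the replacement text -->
<message>That lecture was &super;!</message>
```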
CDATA or PCDATA?
  • PCDATA
    • Parsed Character DATA
    • will be parsed for entities
  • CDATA
    • Character DATA
    • Will NOT be parsed
    • CDATA sections are sometimes included in xml to include “literal” sections of code
Writing a CDATA section

<![CDATA[

Hi! I’m a CDATA section!

I can include anything that would normally upset the parser:

<?<<< &&&;; ><></> hahahahahahaha!!!

The only thing I have to avoid is two closing square brackets followed by a greater-than sign, which means the CDATA section has ended.

]]>
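
To compare the two approaches, the sketch below writes the same text once with entities and once in a CDATA section, which opens with <![CDATA[ and closes with ]]>. The element name code and the snippet inside it are invented for the example.

```xml
<!-- Without CDATA, every < and & has to be escaped -->
<code>if (a &lt; b &amp;&amp; b &lt; c) { swap(a, c); }</code>

<!-- With CDATA, the same text can be written literally -->
<code><![CDATA[if (a < b && b < c) { swap(a, c); }]]></code>
```

Both elements carry the same character data once parsed.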

Validation of xml
  • Validation means checking that an xml document conforms to its DTD
  • Adds security to automatic processing
  • Allows free machine-machine exchange of xml
  • Applied before manipulating xml
    • See XSLT, SAX, DOM later
Well-formed vs valid
  • Well-formed xml
    • The data obeys the xml syntax rules
  • Valid xml
    • The data is well-formed xml
    • The data has a DTD
    • The data conforms to the DTD
  • xml data may be well-formed but invalid
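
The last point can be illustrated with a small sketch that reuses the classroom declaration from the earlier slide; the teacher and student declarations and the names are assumptions.

```xml
<?xml version="1.0"?>
<!DOCTYPE classroom [
  <!ELEMENT classroom (teacher, student)>
  <!ELEMENT teacher (#PCDATA)>
  <!ELEMENT student (#PCDATA)>
]>
<!-- Well-formed: tags are properly nested and closed.
     Not valid: the DTD requires teacher before student. -->
<classroom>
  <student>Alex</student>
  <teacher>Ms Smith</teacher>
</classroom>
```

A non-validating parser would accept this document, while a validating parser would report the wrong element order.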
xml parser types
  • validating parser
    • checks XML is well-formed
      • conforms to XML specification
    • checks XML is valid (has and matches a DTD)
  • non-validating parser
    • only checks XML is well-formed
    • may pass invalid XML
Labs
  • Now split into two sessions
    • Thursday C26 11.00-13.00
    • Friday C18 11.00-13.00
  • Choose one as convenient
  • Assessed Lab will be in a separately arranged session on afternoon of Friday 30th November