
QUALITY CONTROL WITH SCHEMAS


martha

Presentation Transcript


  1. QUALITY CONTROL WITH SCHEMAS CSC1310 Fall 2009

  2. BASIC CONCEPTS • A schema is a pass-or-fail test for a document. • A schema is a minimum set of requirements that a document must meet, used to prevent anomalous processing or to formalize an application. • Validation is the act of testing a document against a schema. • A schema can check: • Structure: the use and placement of markup elements and attributes. • Data typing: patterns of character data. • Integrity: the status of links between nodes and resources. • Business rules: spelling checks, checksum results, and so on.

  3. DOCUMENT TYPE DEFINITIONS (DTDs) • The DTD is the oldest and most widely supported schema language. • A DTD declares the set of allowed elements (the vocabulary). • A DTD defines a content model for each element (the grammar). • A DTD declares the set of allowed attributes for each element: name, data type, default value, and behavior (for example, required or optional).

  4. DOCUMENT PROLOG FOR DTD • All external parsed entities (including the DTD) should begin with a text declaration. • A text declaration looks like an XML declaration, except that it must not include the standalone property: <?xml version="1.0" encoding="character set"?> • The encoding declared in a DTD does not automatically carry over to the XML documents that use the DTD. • External parsed entities (including the DTD) must not contain a document type declaration.

  5. DECLARATIONS • A DTD is a set of rules (declarations). • Each declaration adds a new element, set of attributes, entity, or notation. • If an entity is declared more than once, the first declaration takes precedence and the others are ignored. • An element's content can be declared as: • EMPTY: no content (special tags like <br>). • ANY: any content. • (#PCDATA): parsed character data. (CDATA is an attribute type, not an element content model.) • With children: a parent-child relationship (including the order of the children).
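
The precedence rule for repeated entity declarations can be observed with Python's standard-library parser. This is only a minimal sketch: the entity name `greeting` and the element `doc` are made up for illustration, and the parser used here reads the internal subset without validating.

```python
import xml.etree.ElementTree as ET

# Two declarations of the same (hypothetical) entity in an internal subset.
DOC = """<!DOCTYPE doc [
  <!ENTITY greeting "first">
  <!ENTITY greeting "second">
]>
<doc>&greeting;</doc>"""

root = ET.fromstring(DOC)
resolved = root.text  # the entity is bound to its first declaration
print(resolved)
```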

  6. USE OF CHILDREN • Child elements can be declared in a DTD in the following ways: • One occurrence only (no suffix) • Minimum of one occurrence (+) • Zero or more occurrences (*) • Zero or one occurrence (?) • Either/or ( | )
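
The occurrence indicators behave much like regular-expression operators, which suggests a rough way to experiment with them. The sketch below is an illustration only, not how a real validator is built: the helper names `model_to_regex` and `children_match` are hypothetical, and only element-only content models are handled (no #PCDATA, EMPTY, or ANY).

```python
import re

def model_to_regex(model):
    """Translate an element-only DTD content model into a regex over child names."""
    parts = []
    for token in re.findall(r"[A-Za-z_][\w.\-]*|[(),|?*+]", model):
        if token == "(":
            parts.append("(?:")          # grouping only, no capture
        elif token == ",":
            continue                     # a sequence is plain concatenation
        elif token in ")|?*+":
            parts.append(token)          # same meaning in DTDs and regexes
        else:
            # an element name, followed by the space used as a separator
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(parts))

def children_match(model, children):
    """Return True if the list of child element names satisfies the model."""
    text = "".join(name + " " for name in children)
    return model_to_regex(model).fullmatch(text) is not None

NAME_MODEL = "(first, last, (junior | senior)?)"
print(children_match(NAME_MODEL, ["first", "last"]))
print(children_match(NAME_MODEL, ["first", "last", "junior", "senior"]))
```

Each child name is written with a trailing space so that one name can never match as a prefix of another.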

  7. ATTRIBUTES • There are four default-value options: • "value": a default value for the attribute, surrounded by quotes (" "). • #IMPLIED: the attribute is optional and has no default. • #FIXED "value": the attribute always has the given fixed value. • #REQUIRED: the attribute must be given whenever the element is used.
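
The four options can be inspected with the expat bindings in Python's standard library, whose `AttlistDeclHandler` is called once per declared attribute. This is a sketch under assumptions: the attributes `nick` and `species` are invented for the example, and expat only reports the declarations, it does not validate the document.

```python
import xml.parsers.expat

DOC = """<!DOCTYPE person [
  <!ELEMENT person EMPTY>
  <!ATTLIST person
      pid      ID                   #REQUIRED
      nick     CDATA                #IMPLIED
      species  CDATA                #FIXED "human"
      employed (fulltime|parttime)  "fulltime">
]>
<person pid="p1"/>"""

declared = {}

def on_attlist(elname, attname, att_type, default, required):
    # default is None for #IMPLIED and #REQUIRED;
    # required is true for #REQUIRED and #FIXED.
    declared[attname] = (default, bool(required))

parser = xml.parsers.expat.ParserCreate()
parser.AttlistDeclHandler = on_attlist
parser.Parse(DOC, True)

for name, (default, required) in declared.items():
    print(name, default, required)
```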

  8. TYPES OF ATTRIBUTE • CDATA: the value is character data. • (en1|en2|...): the value must be one of an enumerated list. • ID: the value is a unique ID. • IDREF: the value is the ID of another element. • IDREFS: the value is a whitespace-separated list of IDs of other elements. • NMTOKEN: the value is a name token (made up of XML name characters). • NMTOKENS: the value is a whitespace-separated list of name tokens. • ENTITY: the value is the name of an entity. • ENTITIES: the value is a list of entity names. • NOTATION: the value is the name of a notation. • xml: : the attribute is predefined by XML (for example, xml:lang or xml:space).
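
The ID and IDREF types express an integrity rule: every ID is unique, and every IDREF points at an existing ID. Python's standard-library parser does not validate, so the sketch below checks the rule by hand. The element `staff`, the attribute `boss`, and the helper `check_ids` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

DOC = """<staff>
  <person pid="p1"/>
  <person pid="p2" boss="p1"/>
  <person pid="p3" boss="p9"/>
</staff>"""

def check_ids(root, id_attr, idref_attr):
    """Return (duplicate IDs, IDREF values that match no ID)."""
    seen, duplicates = set(), []
    for elem in root.iter():
        value = elem.get(id_attr)
        if value is None:
            continue
        if value in seen:
            duplicates.append(value)
        seen.add(value)
    dangling = [elem.get(idref_attr) for elem in root.iter()
                if elem.get(idref_attr) is not None
                and elem.get(idref_attr) not in seen]
    return duplicates, dangling

duplicates, dangling = check_ids(ET.fromstring(DOC), "pid", "boss")
print(duplicates, dangling)
```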

  9. EXAMPLE

  10. EXAMPLE

  11. EXAMPLE <!ELEMENT date (year, month, day)> <!ELEMENT year (#PCDATA)> <!ELEMENT month (#PCDATA)> <!ELEMENT day (#PCDATA)>
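
To see how a parser reads these declarations, they can be placed in an internal subset and passed to the expat bindings in Python's standard library, which report each content model through `ElementDeclHandler`. A minimal sketch: the sample `date` instance is made up, and `(#PCDATA)` is written in parentheses as the DTD grammar requires.

```python
import xml.parsers.expat
from xml.parsers.expat import model

DOC = """<!DOCTYPE date [
  <!ELEMENT date (year, month, day)>
  <!ELEMENT year (#PCDATA)>
  <!ELEMENT month (#PCDATA)>
  <!ELEMENT day (#PCDATA)>
]>
<date><year>2009</year><month>9</month><day>1</day></date>"""

models = {}

def on_element_decl(name, content):
    # content is a tuple: (type, quantifier, name, children)
    models[name] = content

parser = xml.parsers.expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.Parse(DOC, True)

kind, quantifier, _, children = models["date"]
child_names = [child[2] for child in children]
print(kind == model.XML_CTYPE_SEQ, child_names)
```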

  12. EXAMPLE <!ELEMENT address (street, city, country, zip)> <!ELEMENT street (#PCDATA | unit)*> <!ELEMENT city (#PCDATA)> <!ELEMENT country (#PCDATA)> <!ELEMENT zip (#PCDATA)> <!ELEMENT unit (#PCDATA)>

  13. EXAMPLE <!ELEMENT person (name, age, gender)> <!ELEMENT name (first, last, (junior | senior)?)> <!ELEMENT age (#PCDATA)> <!ELEMENT gender (#PCDATA)> <!ELEMENT first (#PCDATA)> <!ELEMENT last (#PCDATA)> <!ELEMENT junior EMPTY> <!ELEMENT senior EMPTY> <!ATTLIST person pid ID #REQUIRED employed (fulltime|parttime) #IMPLIED>

  14. TIPS FOR DESIGNING DTD • Organize declarations into groups by purpose: blocks, hierarchical elements, parts of tables, lists, etc. • Use whitespace: it makes the DTD more understandable and easier to navigate. • Use comments: at the top of each DTD file, state the purpose, version number, and contact information; when customizing, note the original, its authors, and your changes; label each section and subsection of the DTD. • Track versions. • Use parameter entities: they hold recurring parts of declarations so that you can edit them in one place.

  15. PARAMETER ENTITIES • In the external DTD subset, parameter entities can be used in: • Element-type declarations, to hold element groups. • Attribute-list declarations, to hold attribute definitions. • In the internal subset, they can hold only complete declarations. <!ENTITY % common.atts "id ID #IMPLIED class CDATA #IMPLIED"> <!ATTLIST foo %common.atts;> <!ATTLIST bar %common.atts; extra CDATA #FIXED "blah">
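
The effect of a parameter entity can be pictured as text substitution. The sketch below imitates that substitution with regular expressions purely as an illustration: `expand_parameter_entities` is a hypothetical helper, it handles only quoted internal entities, and a real DTD processor applies further rules (for example, declaration order and padding of the replacement text).

```python
import re

DTD = """<!ENTITY % common.atts "id ID #IMPLIED class CDATA #IMPLIED">
<!ATTLIST foo %common.atts;>
<!ATTLIST bar %common.atts; extra CDATA #FIXED "blah">"""

def expand_parameter_entities(dtd_text):
    """Replace %name; references with the text of the matching declaration."""
    entities = {}

    def remember(match):
        # keep the first declaration of each name, drop it from the output
        entities.setdefault(match.group(1), match.group(2))
        return ""

    body = re.sub(r'<!ENTITY\s+%\s+([\w.\-]+)\s+"([^"]*)"\s*>', remember, dtd_text)
    return re.sub(r"%([\w.\-]+);", lambda m: entities[m.group(1)], body)

expanded = expand_parameter_entities(DTD).strip().splitlines()
for line in expanded:
    print(line)
```

Editing the single `common.atts` declaration changes the attribute lists of both `foo` and `bar`.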

  16. IMPORTING MODULES • The extension .mod means that a file contains declarations but should not be used as a DTD on its own. • An external parameter entity imports all the text of a file. <!ELEMENT catalog (title, metadata, front, entries+)> <!ENTITY % basic.stuff SYSTEM "basics.mod"> %basic.stuff; <!ENTITY % front.matter SYSTEM "front.mod"> %front.matter; <!ENTITY % metadata SYSTEM "metadata.dtd"> %metadata;

  17. CONDITIONAL SECTIONS • A conditional section is a special form of markup in a DTD that marks a region for inclusion or exclusion. • Conditional sections can be used only in the external subset. <![INCLUDE [ DTD text ]]> <![IGNORE [ DTD text ]]> <![INCLUDE [ <!ELEMENT blah (#PCDATA)> ]]>

  18. OVERRIDING ELEMENT • In DTD: <!ENTITY % default.polyhedron "INCLUDE"> <![%default.polyhedron;[ <!ELEMENT polyhedron (side+, angle+)> ]]> • In XML: <!DOCTYPE picture SYSTEM "shapes.dtd" [ <!ENTITY % default.polyhedron "IGNORE"> <!ELEMENT polyhedron (side, side, side+, angle, angle, angle+)> ]>

  19. LIMITATIONS OF DTD • A DTD describes how elements are arranged in a document, but says little about the content of the document. • A DTD is not flexible about the order of children. • The namespace is locked down: every element in a document must have a corresponding declaration in the DTD. • Newer schema languages offer another validation system: • A schema contains rules that must all be satisfied for a document to be considered valid. • These languages are not built into the XML specification. • Examples: W3C XML Schema, RELAX NG, Schematron.

  20. NAMESPACES • Namespaces are used to group elements and attributes. • Declaration syntax: xmlns:namespace_prefix="namespace_identifier" • An xmlns attribute without a prefix declares the implicit (default) namespace. <part-catalog xmlns:nw="http://www.nutware.com" xmlns="http://www.bobco.com"> <nw:entry nw:number="1327"> <nw:description>hexnut</nw:description> </nw:entry> <part id="555"> <name>type 4</name> </part> </part-catalog>
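
A namespace-aware parser resolves each prefix to its identifier. The following sketch uses Python's standard-library `xml.etree.ElementTree`, which writes resolved names as `{identifier}local-name`. The document is the part catalog from this slide, with the markup typos corrected.

```python
import xml.etree.ElementTree as ET

DOC = """<part-catalog xmlns:nw="http://www.nutware.com"
              xmlns="http://www.bobco.com">
  <nw:entry nw:number="1327">
    <nw:description>hexnut</nw:description>
  </nw:entry>
  <part id="555"><name>type 4</name></part>
</part-catalog>"""

root = ET.fromstring(DOC)

# Prefixed names resolve to the nutware namespace.
entry = root.find("{http://www.nutware.com}entry")
number = entry.get("{http://www.nutware.com}number")

# Unprefixed elements fall into the default (implicit) namespace.
part = root.find("{http://www.bobco.com}part")
part_id = part.get("id")  # unprefixed attributes belong to no namespace

print(root.tag, number, part_id)
```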

  21. W3C SCHEMA (2001) • W3C Schemas are XML documents themselves. • In DTD: <!ELEMENT country (#PCDATA)> • In W3C Schema: <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="country" type="xs:string"/> </xs:schema>

  22. WIDELY USED TYPES • xs:string: any text • xs:token: text with whitespace normalized (no leading, trailing, or repeated spaces) • xs:decimal: any decimal number • xs:integer: any integer number • xs:float: floating-point number • xs:ID, xs:IDREF: the same as ID and IDREF in DTD • xs:boolean: "true"/"false" ("1"/"0") • xs:time: time in the format HH:MM:SS, with an optional time zone • xs:date: date in the format CCYY-MM-DD • xs:dateTime: date/time combination in the format CCYY-MM-DDTHH:MM:SS, with an optional time zone • xs:QName: namespace-qualified name

  23. COMPLEX ELEMENT IN SCHEMA <xs:element name="date"> <xs:complexType> <xs:all> <xs:element ref="year"/> <xs:element ref="month"/> <xs:element ref="day"/> </xs:all> </xs:complexType> </xs:element> <xs:element name="year" type="xs:integer"/> <xs:element name="month" type="xs:integer"/> <xs:element name="day" type="xs:integer"/>
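
Unlike the DTD content model (year, month, day), an xs:all group accepts its children in any order, each exactly once by default. The sketch below contrasts the two rules by hand in Python; `matches_all` and `matches_sequence` are hypothetical helpers, not a schema validator.

```python
import xml.etree.ElementTree as ET

REQUIRED = ["year", "month", "day"]

def matches_all(elem, names):
    """xs:all with default occurrence: every name exactly once, any order."""
    return sorted(child.tag for child in elem) == sorted(names)

def matches_sequence(elem, names):
    """A sequence, like the DTD model (year, month, day): fixed order."""
    return [child.tag for child in elem] == list(names)

shuffled = ET.fromstring(
    "<date><day>1</day><month>9</month><year>2009</year></date>")

print(matches_all(shuffled, REQUIRED), matches_sequence(shuffled, REQUIRED))
```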

  24. FACETS • A facet is a way to control the range of a data type. <xs:simpleType name="monthNum"> <xs:restriction base="xs:integer"> <xs:minInclusive value="1"/> <xs:maxInclusive value="12"/> </xs:restriction> </xs:simpleType> <xs:element name="month" type="monthNum"/> • Facets can create fixed values, constrain the length of strings, match patterns, and set allowed values.
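
The monthNum restriction amounts to two checks: the text must be an integer, and the integer must lie between 1 and 12 inclusive. A hand-written Python approximation (the function name `is_month_num` is hypothetical, and the lexical check is a simplification of xs:integer):

```python
import re

def is_month_num(text):
    """Approximate the monthNum type: an integer from 1 to 12 inclusive."""
    text = text.strip()
    if re.fullmatch(r"[+-]?[0-9]+", text) is None:
        return False            # not an integer at all
    return 1 <= int(text) <= 12  # minInclusive / maxInclusive

for sample in ["1", "12", "0", "13", "1.5", "Dec"]:
    print(sample, is_month_num(sample))
```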

  25. FACETS EXAMPLE • List of allowed values: <xs:simpleType name="genderType"> <xs:restriction base="xs:token"> <xs:enumeration value="female"/> <xs:enumeration value="male"/> </xs:restriction> </xs:simpleType> • Pattern: <xs:simpleType name="pcode"> <xs:restriction base="xs:token"> <xs:pattern value="[0-9]{3}[A-Z]{3}"/> </xs:restriction> </xs:simpleType>
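
Both facets translate directly into Python. A schema pattern must match the whole value, which `re.fullmatch` mirrors. The sketch ignores the whitespace normalization that xs:token applies first, and the function names are hypothetical.

```python
import re

GENDER_VALUES = {"female", "male"}        # xs:enumeration values
PCODE = re.compile(r"[0-9]{3}[A-Z]{3}")   # xs:pattern value

def is_gender(text):
    return text in GENDER_VALUES

def is_pcode(text):
    # fullmatch: the pattern has to cover the entire value
    return PCODE.fullmatch(text) is not None

print(is_gender("female"), is_pcode("123ABC"), is_pcode("123ABCD"))
```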

  26. SCHEMA EXAMPLE <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="census-record"> <xs:complexType> <xs:sequence> <xs:element ref="date"/> <xs:element ref="address"/> <xs:element ref="person" maxOccurs="unbounded"/> </xs:sequence> <xs:attribute ref="taker"/> </xs:complexType> </xs:element>

  27. SCHEMA EXAMPLE <xs:attribute name="taker"> <xs:simpleType> <xs:restriction base="xs:integer"> <xs:minInclusive value="1"/> <xs:maxInclusive value="9999"/> </xs:restriction> </xs:simpleType> </xs:attribute>

  28. SCHEMA EXAMPLE <xs:element name="date" type="xs:date"/> <xs:element name="address"> <xs:complexType> <xs:all> <xs:element ref="street"/> <xs:element ref="city"/> <xs:element ref="country"/> <xs:element ref="zip"/> </xs:all> </xs:complexType> </xs:element> <xs:element name="street" type="xs:string"/> <xs:element name="city" type="xs:string"/> <xs:element name="country" type="xs:string"/>

  29. SCHEMA EXAMPLE <xs:element name="zip"> <xs:simpleType> <xs:restriction base="xs:token"> <xs:pattern value="[0-9]{3}[A-Z]{3}"/> </xs:restriction> </xs:simpleType> </xs:element>

  30. SCHEMA EXAMPLE <xs:element name="person"> <xs:complexType> <xs:all> <xs:element ref="name"/> <xs:element ref="age"/> <xs:element ref="gender"/> </xs:all> <xs:attribute ref="employed"/> <xs:attribute ref="pid"/> </xs:complexType> </xs:element>

  31. SCHEMA EXAMPLE <xs:attribute name="employed"> <xs:simpleType> <xs:restriction base="xs:token"> <xs:enumeration value="fulltime"/> <xs:enumeration value="parttime"/> <xs:enumeration value="none"/> </xs:restriction> </xs:simpleType> </xs:attribute> <xs:attribute name="pid"> <xs:simpleType> <xs:restriction base="xs:integer"> <xs:minInclusive value="1"/> <xs:maxInclusive value="999999"/> </xs:restriction> </xs:simpleType> </xs:attribute>

  32. SCHEMA EXAMPLE <xs:element name="age"> <xs:simpleType> <xs:restriction base="xs:integer"> <xs:minInclusive value="0"/> <xs:maxInclusive value="150"/> </xs:restriction> </xs:simpleType> </xs:element> <xs:element name="gender"> <xs:simpleType> <xs:restriction base="xs:token"> <xs:enumeration value="female"/> <xs:enumeration value="male"/> </xs:restriction> </xs:simpleType> </xs:element>

  33. SCHEMA EXAMPLE <xs:element name="name"> <xs:complexType> <xs:sequence> <xs:element ref="first"/> <xs:element ref="last"/> <xs:choice minOccurs="0"> <xs:element ref="junior"/> <xs:element ref="senior"/> </xs:choice> </xs:sequence> </xs:complexType> </xs:element>

  34. SCHEMA EXAMPLE <xs:element name="junior" type="emptyElem"/> <xs:element name="senior" type="emptyElem"/> <xs:complexType name="emptyElem"/> </xs:schema>
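
For orientation, a document intended to satisfy the census-record schema might look like the sample below. All of its values are made up. Python's standard library has no W3C Schema validator, so the sketch repeats a few of the schema's rules by hand in a hypothetical `check_record` function. It is only a partial check, not a substitute for validation.

```python
import xml.etree.ElementTree as ET

# Sample instance with invented data.
RECORD = """<census-record taker="3163">
  <date>2009-09-01</date>
  <address>
    <street>12 Elm St</street><city>Springfield</city>
    <country>USA</country><zip>123ABC</zip>
  </address>
  <person pid="1" employed="fulltime">
    <name><first>Ada</first><last>Smith</last><junior/></name>
    <age>41</age>
    <gender>female</gender>
  </person>
</census-record>"""

def check_record(root):
    """Re-check a few rules from the schema; returns a list of problems."""
    problems = []
    taker = root.get("taker")
    if taker is not None and not 1 <= int(taker) <= 9999:
        problems.append("taker out of range")
    people = root.findall("person")
    if not people:
        problems.append("at least one person is required")
    for person in people:
        age = int(person.findtext("age"))
        if not 0 <= age <= 150:
            problems.append(f"age {age} out of range")
        if person.get("employed") not in (None, "fulltime", "parttime", "none"):
            problems.append("unknown employed value")
    return problems

problems = check_record(ET.fromstring(RECORD))
bad = check_record(
    ET.fromstring(RECORD.replace("<age>41</age>", "<age>200</age>")))
print(problems, bad)
```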
