XML Schema – Part 2

PowerPoint Slideshow about ' XML Schema – Part 2' - dustin-wise


XML Schema – Part 2
  • More on Schema Types & Derivation
  • Abstract types & type substitution
  • Uniqueness & Keys
  • Additional schema mechanisms
    • include & import
    • open content
  • Comparison with DTD & other tech.
Content Types
  • The type hierarchy first branches into two groups: simple types and complex types.
  • Complex types are divided into two groups: those with simple content and those with complex content.
  • While both forms of complex type allow attributes, only those with complex content allow child elements; those with simple content only allow character content.
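
As a minimal sketch of the three cases above, one declaration of each kind might look like the following. The type, element, and attribute names (YearType, PriceType, BookType, currency, id) are illustrative assumptions, not taken from the slides.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Simple type: character content only, no attributes, no child elements -->
  <xsd:simpleType name="YearType">
    <xsd:restriction base="xsd:gYear"/>
  </xsd:simpleType>

  <!-- Complex type with simple content: character content plus attributes -->
  <xsd:complexType name="PriceType">
    <xsd:simpleContent>
      <xsd:extension base="xsd:decimal">
        <xsd:attribute name="currency" type="xsd:string"/>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>

  <!-- Complex type with complex content: child elements plus attributes -->
  <xsd:complexType name="BookType">
    <xsd:sequence>
      <xsd:element name="Title" type="xsd:string"/>
      <xsd:element name="Price" type="PriceType"/>
    </xsd:sequence>
    <xsd:attribute name="id" type="xsd:ID"/>
  </xsd:complexType>

</xsd:schema>
```

BookType uses the shorthand form: with no explicit complexContent child, it is treated as complex content restricting anyType.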
Content Type
  • As we saw there are four kinds of derivation: restriction, extension, list, and union.
  • All types derive, directly or indirectly, from the root type. The root type is anyType.
  • The default syntax for complex types is complex content that restricts anyType.
anyType
  • The anyType is the base type for all types which do not specify a value for the base attribute. It is also the default type of any element that does not specify a type.
    • Example: [listing not preserved; its annotation read "This is the definition of the anyType."]
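
A sketch of that definition, following the description of anyType in the XML Schema 1.0 Structures recommendation (the xsd prefix is assumed to be bound to the XML Schema namespace):

```xml
<xsd:complexType name="anyType" mixed="true">
  <xsd:sequence>
    <!-- Any number of elements from any namespace, validated only if a declaration is found -->
    <xsd:any minOccurs="0" maxOccurs="unbounded" processContents="lax"/>
  </xsd:sequence>
  <!-- Any attributes as well -->
  <xsd:anyAttribute processContents="lax"/>
</xsd:complexType>
```

In other words, anyType permits mixed text, arbitrary child elements, and arbitrary attributes; every other type narrows this.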

Complex Type
Empty Element

Your first inclination might be to associate the empty element with a simple type. But that won't work, since simple types allow data content. So it must be a complex type. Then ask yourself the next question: will it allow element children? No. So we need a complexType with simpleContent, right?

Wrong. Complex types with simple content also allow data content, and we want an empty element. That leaves us with a complexType with complexContent, which ensures that there will not be any data content in the element. But we don't want child elements either, and a complex type with complex content allows child elements. The key is that it doesn't require them. Simply leave the content model out of the type definition:

Empty Element

DTD: [listing not preserved]

Schema: [listing not preserved except for the fragment type="xsd:anyURI" use="required"/>]

Instance doc (snippet): [listing not preserved]
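
A hedged sketch of what the three listings plausibly showed. The element name image and attribute name href are assumptions; only the attribute's type (xsd:anyURI) and use="required" survive in the transcript.

```xml
<!-- DTD equivalent (assumed), shown here as a comment:
     <!ELEMENT image EMPTY>
     <!ATTLIST image href CDATA #REQUIRED>
-->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- The complexType has no content model, so neither text nor child elements are allowed -->
  <xsd:element name="image">
    <xsd:complexType>
      <xsd:attribute name="href" type="xsd:anyURI" use="required"/>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
<!-- A matching instance snippet: <image href="http://www.example.org/picture.gif"/> -->
```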

Type Substitutability
  • As we saw earlier, substitutionGroup gives us "element substitution", i.e., the ability to substitute one element for another. Now we will see how to achieve "type substitution", i.e., the ability to substitute an element's content model with that of another type.
  • Here’s the principle of type substitutability: A base type can be substituted by any derived type.
    • Example. Suppose that BookType is derived from PublicationType. If we declare an element, Publication, to be of type PublicationType (the base type) then in the instance document Publication's content can be either a PublicationType or a BookType.
[Schema listing only partially preserved. Surviving attributes of the schema element:]

targetNamespace="http://www.books.org"
xmlns="http://www.books.org"
elementFormDefault="unqualified"

Annotations:
  • PublicationType is the base type.
  • BookType extends PublicationType.
  • Publication is of type PublicationType (the base type).
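
A sketch of a schema consistent with those annotations. The schema attributes, the type names, and the Publication element come from the slide; the child elements (Title, Author, Date, ISBN, Publisher), their types, and the BookStore wrapper are assumptions inferred from the instance data.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.books.org"
            xmlns="http://www.books.org"
            elementFormDefault="unqualified">

  <!-- PublicationType is the base type -->
  <xsd:complexType name="PublicationType">
    <xsd:sequence>
      <xsd:element name="Title" type="xsd:string"/>
      <xsd:element name="Author" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="Date" type="xsd:gYear"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- BookType extends PublicationType with two more elements -->
  <xsd:complexType name="BookType">
    <xsd:complexContent>
      <xsd:extension base="PublicationType">
        <xsd:sequence>
          <xsd:element name="ISBN" type="xsd:string"/>
          <xsd:element name="Publisher" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

  <!-- Publication is of type PublicationType (the base type) -->
  <xsd:element name="Publication" type="PublicationType"/>

  <xsd:element name="BookStore">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Publication" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```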

BookStore.xml [element markup not preserved; surviving attributes and text values:]

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.books.org BookStore.xsd"

  • Staying Young Forever; Karin Granstrom Jordan, M.D.; 1999. This Publication’s content model is PublicationType.
  • Illusions The Adventures of a Reluctant Messiah; Richard Bach; 1977; 0-440-34319-4; Dell Publishing Co. This Publication’s content model is BookType.
  • The First and Last Freedom; J. Krishnamurti; 1954; 0-06-064831-7; Harper & Row. This Publication’s content model is BookType.
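
In an instance document, type substitution is signalled with the xsi:type attribute. The sketch below assumes a schema in which BookStore and Publication are global elements in http://www.books.org, BookType extends PublicationType with ISBN and Publisher, and local elements are unqualified (hence the unprefixed children). The element names are assumptions; the text values are those that survive from the slide.

```xml
<?xml version="1.0"?>
<bk:BookStore xmlns:bk="http://www.books.org"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xsi:schemaLocation="http://www.books.org BookStore.xsd">

  <!-- No xsi:type: the content model is the declared type, PublicationType -->
  <bk:Publication>
    <Title>Staying Young Forever</Title>
    <Author>Karin Granstrom Jordan, M.D.</Author>
    <Date>1999</Date>
  </bk:Publication>

  <!-- xsi:type selects the derived type, so ISBN and Publisher are allowed -->
  <bk:Publication xsi:type="bk:BookType">
    <Title>Illusions The Adventures of a Reluctant Messiah</Title>
    <Author>Richard Bach</Author>
    <Date>1977</Date>
    <ISBN>0-440-34319-4</ISBN>
    <Publisher>Dell Publishing Co.</Publisher>
  </bk:Publication>

</bk:BookStore>
```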

Abstract Elements
  • The head element of a substitution group must be global.
  • Members of the substitution group must have the same type as the head, or a type derived from it.
  • Members are used in place of the head element.
  • A generic head element that shouldn't be used directly, but only in one of its derived forms, can be declared abstract.
  • This is analogous to abstract classes in object-oriented languages.
  • The slide's example (listing not preserved) defines name-elt as an abstract element that should be replaced either by name or surname everywhere it is referenced.
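
A minimal sketch of such a declaration. The names name-elt, name, and surname come from the slide; the xsd:string type and the person element that references the head are assumptions added for illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Abstract head: may be referenced in content models but may not appear in instances -->
  <xsd:element name="name-elt" type="xsd:string" abstract="true"/>

  <!-- Members of the substitution group; with no type given, they take the head's type -->
  <xsd:element name="name" substitutionGroup="name-elt"/>
  <xsd:element name="surname" substitutionGroup="name-elt"/>

  <!-- Wherever name-elt is referenced, an instance must supply name or surname instead -->
  <xsd:element name="person">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="name-elt" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```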
Abstract complexType
  • You can declare a complexType to be abstract.
    • Example. [Listing not preserved; per the annotations that follow, it declares PublicationType as abstract.]
  • An abstract complexType is a template/placeholder type:
    • If an element is declared to be of a type that is abstract, then in an XML instance document the content model of that element may not be that of the abstract type.
      • Example. An element declared to be of type PublicationType may not have that type’s content model.
    • However, complexTypes that are derived from the abstract type may substitute for the abstract type.
Annotations on the schema listing [listing not preserved]:
  • Note that PublicationType is declared abstract.
  • BookType derives from PublicationType. By default abstract="false". Thus, this type can substitute for the PublicationType.
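
A sketch of a schema matching those annotations. The type names PublicationType, BookType, and SingleAuthorPublication appear in the slides; the child elements, elementFormDefault="qualified", and the choice to derive SingleAuthorPublication by restricting Author to one occurrence are assumptions.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.books.org"
            xmlns="http://www.books.org"
            elementFormDefault="qualified">

  <!-- abstract="true": no element may use this content model directly -->
  <xsd:complexType name="PublicationType" abstract="true">
    <xsd:sequence>
      <xsd:element name="Title" type="xsd:string"/>
      <xsd:element name="Author" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="Date" type="xsd:gYear"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- abstract defaults to "false", so BookType can stand in for PublicationType -->
  <xsd:complexType name="BookType">
    <xsd:complexContent>
      <xsd:extension base="PublicationType">
        <xsd:sequence>
          <xsd:element name="ISBN" type="xsd:string"/>
          <xsd:element name="Publisher" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

  <!-- Derived by restriction: same content model, limited to a single Author -->
  <xsd:complexType name="SingleAuthorPublication">
    <xsd:complexContent>
      <xsd:restriction base="PublicationType">
        <xsd:sequence>
          <xsd:element name="Title" type="xsd:string"/>
          <xsd:element name="Author" type="xsd:string"/>
          <xsd:element name="Date" type="xsd:gYear"/>
        </xsd:sequence>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>

  <xsd:element name="Publication" type="PublicationType"/>

</xsd:schema>
```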

BookStore.xml [element markup not preserved; surviving attributes and text values:]

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.books.org BookStore.xsd"

  • My Life and Times; Paul McCartney; 1998; 94303-12021-43892; McMillin Publishing.
  • FooManchu; Don Keyote; 1951.

The content model of each Publication element must be from a type that derives from PublicationType. In the schema there are two such types: BookType and SingleAuthorPublication.
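
A sketch of what such an instance might look like, assuming a qualified schema in which Publication is declared with the abstract type PublicationType. The element names are assumptions; the values are the ones that survive from the slide.

```xml
<?xml version="1.0"?>
<BookStore xmlns="http://www.books.org"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.books.org BookStore.xsd">

  <!-- Content model taken from BookType -->
  <Publication xsi:type="BookType">
    <Title>My Life and Times</Title>
    <Author>Paul McCartney</Author>
    <Date>1998</Date>
    <ISBN>94303-12021-43892</ISBN>
    <Publisher>McMillin Publishing</Publisher>
  </Publication>

  <!-- Content model taken from SingleAuthorPublication -->
  <Publication xsi:type="SingleAuthorPublication">
    <Title>FooManchu</Title>
    <Author>Don Keyote</Author>
    <Date>1951</Date>
  </Publication>

  <!-- A Publication without xsi:type would be invalid here, because its declared type is abstract -->
</BookStore>
```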

Review of Abstract Elements and Abstract complexTypes
  • If you declare an element to be abstract:
    • use element substitution for the abstract element (as provided by substitutionGroup).
  • If you declare a complexType to be abstract:
    • use type substitution for the abstract type (as provided by type derivation).
Uniqueness & Keys
  • DTDs provide the ID attribute datatype for uniqueness (i.e., an ID value must be unique throughout the entire document, and the XML parser enforces this).
  • XML Schema has much enhanced uniqueness capabilities:
    • enables you to define element content to be unique.
    • enables you to define non-ID attributes to be unique.
    • enables you to define a combination of element content and attributes to be unique.
    • enables you to distinguish between unique and key.
    • enables you to declare the range of the document over which something is unique.
unique vs key
  • Key: an element or attribute (or combination thereof) which is defined to be a key must:
    • always be present (minOccurs must be greater than zero)
    • be non-nillable (i.e., nillable="false")
    • be unique
  • Key implies unique, but unique does not imply key
[Schema listing only partially preserved. Surviving attributes of the schema element:]

targetNamespace="http://www.books.org"
xmlns="http://www.books.org"
xmlns:bk="http://www.books.org"
elementFormDefault="qualified"

...

"Within BookStore we define a key, called PK. Select each Book, and within each Book the ISBN element is a key."

In other words, within BookStore each Book must have an ISBN and it must be unique.
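
A sketch of a key declaration along these lines. The key name PK, the BookStore, Book, and ISBN elements, and the schema attributes come from the slides; the remaining child elements and their types are assumptions.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.books.org"
            xmlns="http://www.books.org"
            xmlns:bk="http://www.books.org"
            elementFormDefault="qualified">

  <xsd:element name="BookStore">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Book" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="Title" type="xsd:string"/>
              <xsd:element name="Author" type="xsd:string"/>
              <xsd:element name="Date" type="xsd:string"/>
              <xsd:element name="ISBN" type="xsd:string"/>
              <xsd:element name="Publisher" type="xsd:string"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
    <!-- The key comes after the content model, at the end of the BookStore declaration -->
    <xsd:key name="PK">
      <!-- selector: the set of elements the key applies to -->
      <xsd:selector xpath="bk:Book"/>
      <!-- field: the value that must be present and unique -->
      <xsd:field xpath="bk:ISBN"/>
    </xsd:key>
  </xsd:element>

</xsd:schema>
```

The bk prefix is needed because the XPath expressions in selector and field do not pick up the default namespace; unprefixed names there would match elements in no namespace.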

BookStore.xml [element markup not preserved; surviving attributes and text values:]

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.books.org BookStore.xsd"

  • My Life and Times; Paul McCartney; 1998; 1-56592-235-2; McMillin Publishing.
  • Illusions The Adventures of a Reluctant Messiah; Richard Bach; 1977; 0-440-34319-4; Dell Publishing Co.
  • The First and Last Freedom; J. Krishnamurti; 1954; 0-06-064831-7; Harper & Row.

A schema-validator will verify that each Book has an ISBN element and that the values are all unique.

Notes about key
  • It must be nested within an element declaration.
  • It must come at the end of the element declaration (after the content model and attribute declarations).
  • Use the selector element as a child of key to select the set of elements to which the key applies.
  • Use the field element as a child of key to identify the element or attribute that is to be the key.
    • There can be multiple field elements.
unique
  • The unique element is used exactly as the key element is. It has a selector and one or more field elements, just as key has.
  • The only difference is that the schema validator will simply validate that, whenever present, the values are unique.
[Schema listing only partially preserved. Surviving attributes of the schema element:]

targetNamespace="http://www.books.org"
xmlns="http://www.books.org"
xmlns:bk="http://www.books.org"
elementFormDefault="qualified"

Annotations:
  • Note: ISBN is optional.
  • Require every ISBN be unique.
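
A sketch of the corresponding unique constraint. The constraint name UNIQ and the Title child are assumptions; the optional ISBN and the schema attributes follow the slide.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.books.org"
            xmlns="http://www.books.org"
            xmlns:bk="http://www.books.org"
            elementFormDefault="qualified">

  <xsd:element name="BookStore">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Book" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="Title" type="xsd:string"/>
              <!-- Note: ISBN is optional -->
              <xsd:element name="ISBN" type="xsd:string" minOccurs="0"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
    <!-- Require every ISBN that is present to be unique -->
    <xsd:unique name="UNIQ">
      <xsd:selector xpath="bk:Book"/>
      <xsd:field xpath="bk:ISBN"/>
    </xsd:unique>
  </xsd:element>

</xsd:schema>
```

A Book with no ISBN is valid under unique; the same document would be rejected if the constraint were a key.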

Referencing a key
  • Recall that when an attribute is declared to be of type IDREF, its value must reference an ID attribute, and an XML parser will verify that the IDREF value corresponds to a legitimate ID value.
  • Similarly, you can define a keyref which asserts, "the value of this element must match a value of the key that this keyref refers to".
Instance document [element markup not preserved; surviving attributes and text values:]

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.library.org AuthorSigningAtLibrary.xsd"

  • Book (in the BookStore): Illusions The Adventures of a Reluctant Messiah; Richard Bach; 1977; 0-440-34319-4; Dell Publishing Co.
  • ...
  • GuestAuthor: Richard Bach; Illusions The Adventures of a Reluctant Messiah; 0-440-34319-4.

Annotations:
  • The Book's ISBN is a key element. Suppose that we define a key for ISBN (i.e., each Book must have an ISBN and it must be unique).
  • The GuestAuthor's ISBN is a keyref element. We would like to ensure that the ISBN for the GuestAuthor matches one of the ISBNs in the BookStore.

AuthorSigningAtLibrary.xsd [listing not preserved]

Annotations:
  • The key: this tells the schema-validator to validate that every Book (in BookStore) has an ISBN, and that ISBN must be unique.
  • The keyref: this tells the schema-validator that the ISBN of the Book that the Author is signing must refer to one of the ISBN elements in the collection defined by the PK key.
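
A sketch of how a key and a keyref could be paired for this case. The namespace http://www.library.org, the key name PK, and the BookStore, Book, GuestAuthor, and ISBN names come from the slides; the Library root, the BookForSigning and Name elements, the simplified BookType, the lib prefix, and the keyref name isbnRef are assumptions.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.library.org"
            xmlns="http://www.library.org"
            xmlns:lib="http://www.library.org"
            elementFormDefault="qualified">

  <xsd:complexType name="BookType">
    <xsd:sequence>
      <xsd:element name="Title" type="xsd:string"/>
      <xsd:element name="ISBN" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:element name="Library">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="BookStore">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="Book" type="BookType" maxOccurs="unbounded"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
        <xsd:element name="GuestAuthor">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="Name" type="xsd:string"/>
              <xsd:element name="BookForSigning" type="BookType"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>

    <!-- Every Book in BookStore must have an ISBN, and it must be unique -->
    <xsd:key name="PK">
      <xsd:selector xpath="lib:BookStore/lib:Book"/>
      <xsd:field xpath="lib:ISBN"/>
    </xsd:key>

    <!-- The ISBN of the book being signed must match one of the PK values -->
    <xsd:keyref name="isbnRef" refer="lib:PK">
      <xsd:selector xpath="lib:GuestAuthor/lib:BookForSigning"/>
      <xsd:field xpath="lib:ISBN"/>
    </xsd:keyref>
  </xsd:element>

</xsd:schema>
```

Both constraints sit on Library, the common ancestor, so that the keyref is evaluated in a scope where the PK key is visible.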

Specifying scope of uniqueness in XML Schemas
  • The key/keyref/unique elements may be placed anywhere in your schema (that is, at the bottom of any element declaration)
  • Where you place them determines the scope of the uniqueness
  • Example. We may desire to have uniqueness in a localized region of instance documents. Thus, we would use key/keyref/unique within the element for that region.
Additional schema mechanisms
  • include & import
  • open content
include & import
  • xsd:include
    • similar to a copy and paste
    • overriding the definitions of the included schema isn’t allowed.
Assembling a Schema from Multiple Schema Documents
  • The include element allows you to access components in other schemas
    • All the schemas you include must have the same target namespace as your schema (i.e., the schema that is doing the include), or no target namespace at all
    • The net effect of include is as though you had typed all the definitions directly into the containing schema

[Diagram: Library.xsd includes LibraryBook.xsd and LibraryEmployee.xsd.]
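
A sketch of what Library.xsd might contain. The three file names come from the slide; the target namespace and the global elements Books and Employees (assumed to be declared in the two included documents) are hypothetical.

```xml
<!-- Library.xsd -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.library.org"
            xmlns="http://www.library.org"
            elementFormDefault="qualified">

  <!-- Both included documents must share this target namespace (or have none) -->
  <xsd:include schemaLocation="LibraryBook.xsd"/>
  <xsd:include schemaLocation="LibraryEmployee.xsd"/>

  <!-- Included components are referenced as if they had been declared in this file -->
  <xsd:element name="Library">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Books"/>
        <xsd:element ref="Employees"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```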

Assembling a Schema from Multiple Schema Documents with Different Namespaces
  • The import element allows you to access elements and types in a different namespace

[Diagram: C.xsd imports A.xsd (Namespace A) and B.xsd (Namespace B). Surviving fragments of its two import elements:]

schemaLocation="A.xsd"/>

schemaLocation="B.xsd"/>

Camera Schema

[Diagram: Camera.xsd imports Nikon.xsd, Olympus.xsd, and Pentax.xsd.]

[Camera.xsd listing only partially preserved. Surviving attributes of the schema element:]

targetNamespace="http://www.camera.org"
xmlns:nikon="http://www.nikon.com"
xmlns:olympus="http://www.olympus.com"
xmlns:pentax="http://www.pentax.com"
elementFormDefault="qualified"

[Surviving fragments of the three import elements:]

schemaLocation="Nikon.xsd"/>
schemaLocation="Olympus.xsd"/>
schemaLocation="Pentax.xsd"/>

Annotations:
  • These import elements give us access to the components in these other schemas.
  • Here I am using the body_type that is defined in the Nikon namespace.
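
A sketch of how Camera.xsd might fit together. The schema attributes, file names, and nikon:body_type come from the slide; the namespace attributes on the import elements are assumed to match the declared prefixes, and the camera and body element names are hypothetical.

```xml
<!-- Camera.xsd -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.camera.org"
            xmlns:nikon="http://www.nikon.com"
            xmlns:olympus="http://www.olympus.com"
            xmlns:pentax="http://www.pentax.com"
            elementFormDefault="qualified">

  <!-- Each import names the foreign namespace and hints at where its schema lives -->
  <xsd:import namespace="http://www.nikon.com" schemaLocation="Nikon.xsd"/>
  <xsd:import namespace="http://www.olympus.com" schemaLocation="Olympus.xsd"/>
  <xsd:import namespace="http://www.pentax.com" schemaLocation="Pentax.xsd"/>

  <xsd:element name="camera">
    <xsd:complexType>
      <xsd:sequence>
        <!-- body_type is defined in the Nikon namespace, so it is referenced with that prefix -->
        <xsd:element name="body" type="nikon:body_type"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

Unlike include, imported components keep their own namespace, which is why the type reference carries the nikon prefix.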

Extensible Instance Documents
  • The any element enables instance document authors to create instance documents containing elements above and beyond what was specified by the schema. The instance documents are said to be extensible. Contrast this with the previous schemas, where the content of all our elements was always fixed and static.
  • We are empowering the instance document author with the ability to define what data makes sense to him/her!
Open Content
  • Definition: an open content schema is one that allows instance documents to contain additional elements beyond what is declared in the schema. This is achieved by using the any and anyAttribute elements in the schema.
  • Sprinkling any and anyAttribute elements liberally throughout your schema makes the schema easier to evolve.
anyAttribute
  • The anyAttribute element enables the instance document author to extend his/her document with attributes not specified by the schema.

[Listing not preserved.] Now an instance document author can add any number of attributes onto the element (as well as extend the element content).
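
A minimal sketch of an open content declaration. The Book element, its children, and the namespace are assumptions chosen to match the earlier bookstore examples; the slide's own element name did not survive.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.books.org"
            xmlns="http://www.books.org"
            elementFormDefault="qualified">

  <xsd:element name="Book">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Title" type="xsd:string"/>
        <xsd:element name="Author" type="xsd:string"/>
        <!-- Zero or more extra elements from any other namespace -->
        <xsd:any namespace="##other" processContents="lax"
                 minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
      <!-- Any extra attributes, validated only if a declaration can be found -->
      <xsd:anyAttribute processContents="lax"/>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

Restricting the wildcard to ##other is a design choice made here to keep the content model unambiguous; namespace="##any" is also possible when the wildcard cannot be confused with a declared element.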

DTD vs Schema
  • Enhanced datatypes
    • 44+ built-in datatypes versus 10 in DTDs
    • Can create your own datatypes
  • Written in the same syntax as instance documents
  • Better maintenance and readability
    • Object-oriented - can extend or restrict a type
    • Modular - include & import
  • Extensibility
    • An open content schema can be written by using the any and anyAttribute elements.
DTD vs Schema
  • Can specify element content as being unique (keys on content) and uniqueness within a region
  • Can define elements with nil content
  • Can define substitutable elements
  • Can express sets, i.e., can define the child elements to occur in any order
Not “All Powerful”
  • XML Schema is very powerful.
  • However, it is not "all powerful". There are many constraints that it cannot express. Here are some examples:
    • Ensure that the value of the aircraft element is greater than the value of the obstacle element.
    • Ensure that:
      • if the value of the attribute mode is "air", then the value of the element is either airplane or hot-air balloon;
      • if mode="water", then it is either boat or hovercraft;
      • if mode="ground", then it is either car or bicycle.
    • Ensure that the value of one element is equal to the value of another, where these elements are in separate documents!
  • To check all our constraints we will need to supplement XML Schema with another tool.
Two Approaches to Extending XML Schemas
  • XSLT/XPath
    • The first approach is to supplement the XSD document with a stylesheet
  • Schematron
    • The second approach is to embed the additional constraints within appinfo elements in the XSD document. Then, a tool (Schematron) will extract and process those constraints.
Schematron
  • Schematron differs in basic concept from other schema languages in that it is not based on grammars but on finding tree patterns in the parsed document. This approach allows many kinds of structures to be represented which are inconvenient and difficult in grammar-based schema languages.
  • XSLT/XPath based
  • W3C Schemas are conservative: everything not permitted is forbidden.
  • Schematron is liberal: everything not forbidden is permitted.
  • No data typing; validation only
  • Handles unordered structures very well
  • Handles descendant constraints very well
  • Almost self-documenting
Element - conditional definition

For instance, the following Schematron schema states that if the element E has the attribute one, then it must have the second attribute two as well:

[Listing not preserved except for its assertion message: E cannot have attribute 'one' alone.]
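
A minimal sketch of such a rule, written in ISO Schematron syntax. The namespace URI is the ISO Schematron one and is an assumption about dialect (the original slide may have used the older Schematron 1.5 namespace); only the assertion message survives from the slide.

```xml
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern>
    <!-- The rule fires for every E element that carries the attribute one -->
    <rule context="E[@one]">
      <!-- The assertion fails, and the message is reported, when attribute two is missing -->
      <assert test="@two">E cannot have attribute 'one' alone.</assert>
    </rule>
  </pattern>
</schema>
```

Because the rule's context only selects E elements that have the attribute one, an E with neither attribute, or with only two, passes untouched.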

Schema Validators
  • Command Line Only
    • XSV by Henry Thompson
      • ftp://ftp.cogsci.ed.ac.uk/pub/XSV/XSV12.EXE
  • Has a Programmatic API
    • xerces by Apache
      • http://www.apache.org/xerces-j/index.html
    • IBM Schema Quality Checker (Note: this tool is only used to check your schema. It cannot be used to validate an instance document against a schema.)
      • http://www.alphaworks.ibm.com/tech/xmlsqc
    • MSXML4.0
      • http://www.microsoft.com
  • GUI Oriented
    • XML Spy
      • http://www.xmlspy.com
    • Turbo XML
      • http://www.extensibility.com
Sources
  • http://www.w3.org/XML/Schema
  • http://www.xfront.com/
  • http://www.xml.com/