
Week5 – Schema


cody-byers



  1. Week5 – Schema • Why Schema? Schemas vs. DTDs • Introduction – W3C vs. Microsoft XDR • Schema, How To? • Element Types – Simple vs. Complex • Attributes • Restrictions/Facets • Data Types • An Example – How to design a schema from scratch?

  2. Why Schema? Schema vs. DTD • DTDs • Not flexible enough to meet certain programming needs • Cannot be manipulated (searched, transformed, etc.) with XML tools • Not XML documents (written in EBNF) • Schemas • Alternative to DTDs • XML documents (using XML syntax) • Two major models • W3C XML Schema • Early development stage (at time of this writing) • Microsoft XML-Data Reduced (XDR)

  3. Why Schema? Schema vs. DTD (cont.) • DTDs • Use EBNF grammar: <!DOCTYPE quantity [ <!ELEMENT quantity (#PCDATA)> ]> • Cannot ensure proper element content: <quantity>hello</quantity> is valid • Schema (in XDR format) • Use XML syntax: <ElementType name="quantity" content="textOnly" model="closed" dt:type="int"/> • Can ensure proper element content, because different data types are supported: <quantity>hello</quantity> is invalid

  4. Why Schema? • There are a number of reasons why XML Schema is better than DTD. • One of the greatest strengths of XML Schemas is the support for data types. • With the support for data types: • It is easier to describe permissible document content • It is easier to validate the correctness of data • It is easier to work with data from a database • It is easier to define data facets (restrictions on data) • It is easier to define data patterns (data formats) • It is easier to convert data between different data types

  5. Why Schema? (cont.) • XML Schemas use XML Syntax • Another great strength of XML Schemas is that they are written in XML. • Because XML Schemas are written in XML: • You don't have to learn another language • You can use your XML editor to edit your Schema files • You can use your XML parser to parse your Schema files • You can manipulate your Schema with the XML DOM (API) • You can transform your Schema with XSLT • XML Schemas Secure Data Communication • Senders and receivers must agree on the data format, that is, share the same "expectations" about the content. • With XML Schemas, the sender can describe the data in a way that the receiver will understand. • Data Confusion: A date like "03-11-2004" will be interpreted as 3 November in some countries and as 11 March in others, but an XML element with a data type like this: • <date type="date">2004-03-11</date> • ensures a mutual understanding of the content, because the XML data type date requires the format YYYY-MM-DD.

  6. Why Schema? (cont.) • XML Schemas are Extensible • XML Schemas are extensible, just like XML, because they are written in XML. • With an extensible Schema definition you can: (manipulation) • Reuse your Schema in other Schemas • Create your own data types derived from standard types • Reference multiple schemas from the same document • Well-Formed is not Enough • A well-formed XML document is a document that conforms to the XML syntax rules: • should begin with the XML declaration (optional in XML 1.0, but recommended) • must have one unique root element • all start tags must match end-tags • XML tags are case sensitive • all elements must be closed • all elements must be properly nested • all attribute values must be quoted • XML entities must be used for special characters • Even if documents are Well-Formed they can still contain errors, and those errors can have serious consequences.

  7. Introduction • An XML Schema is an XML-based document and an alternative to a DTD. • An XML schema describes and defines the structure of an XML document. • A schema is used to validate XML documents • W3C: The XML Schema language is also referred to as XML Schema Definition (XSD). • What You Should Already Know • Before you study the XML Schema Language, you should have a basic understanding of XML and XML Namespaces. It will also help to have some basic understanding of DTD.

  8. Introduction (cont.) • What is an XML Schema? • The purpose of an XML Schema is to define the legal building blocks of an XML document, just like a DTD. • An XML Schema: • defines elements that can appear in a document • defines attributes that can appear in a document • defines which elements are child elements • defines the order of child elements • defines the number of child elements • defines whether an element is empty or can include text • defines data types for elements and attributes • defines default and fixed values for elements and attributes

  9. Introduction (cont.) • XML Schemas are the Successors of DTDs • XML Schemas will be used in most Web applications as a replacement for DTDs. Here are some reasons: • XML Schemas are extensible to future additions • XML Schemas are richer and more useful than DTDs • XML Schemas are written in XML • XML Schemas support data types • XML Schemas support namespaces • XML Schema is a W3C Recommendation • XML Schema was originally proposed by Microsoft, but became an official W3C recommendation in May 2001. Microsoft (M) keeps its own XML-Data Reduced (XDR) Schema (as we saw in the textbook) • The specification is now stable and has been reviewed by the W3C Membership. • For a full overview of W3C Activities and Status: http://www.w3schools.com/w3c/default.asp

  10. Schema – How to? • XML documents can have a reference to a DTD or an XML Schema • An Example presented in XML, DTD, and Schema

  11. Schema – How to? (cont.) • An XML Schema – W3C XML Schema
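The schema listing from this slide did not survive in the transcript. As a rough sketch of what a W3C schema for a short note document could look like: the element names (note, to, from, heading, body) are assumptions chosen for illustration, and the namespace is the "http://www.w3schools.com" namespace that the later referencing slide mentions.

```xml
<?xml version="1.0"?>
<!-- Illustrative sketch only: element names are assumed, not taken from the slide -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3schools.com"
           xmlns="http://www.w3schools.com"
           elementFormDefault="qualified">
  <!-- "note" is a global element: an immediate child of xs:schema -->
  <xs:element name="note">
    <xs:complexType>
      <!-- the children must appear in exactly this order -->
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```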

  12. Schema – How to? (cont.) • XML Schema – Microsoft XDR Schema • Side-by-side listings: XML + DTD, XML + Schema (W3C), XML + Schema (Microsoft)

  13. Schema – How to? (cont.) • Referencing a Schema in an XML Document – W3C • xmlns: specifies the default namespace declaration. • This declaration tells the schema-validator that all the elements used in this XML document are declared in the "http://www.w3schools.com" namespace. • xmlns:xsi: specifies the XML Schema Instance namespace • xsi:schemaLocation: specifies where the XML schema file is
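The three attributes described above might appear on a root element as follows. This is a sketch that assumes a schema file named note.xsd (a hypothetical file name) which declares a note element in the "http://www.w3schools.com" namespace; the element names and text content are illustrative.

```xml
<?xml version="1.0"?>
<!-- xmlns: default namespace for every element in this document -->
<!-- xmlns:xsi: the XML Schema Instance namespace -->
<!-- xsi:schemaLocation: a pair "namespace location-of-schema-file" -->
<note xmlns="http://www.w3schools.com"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3schools.com note.xsd">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
```

Note that xsi:schemaLocation takes two values separated by white space: first the namespace, then the location of the schema to use for that namespace.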

  14. Schema – How to? (cont.) • Referencing a Schema in an XML Document – Microsoft XDR • xmlns: specifies where the XML schema is
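For comparison, a sketch of the XDR style of referencing, as used by Microsoft's MSXML parser: the schema location is written into the namespace declaration itself with an x-schema: prefix. The file name note-schema.xml and the element names are assumptions for illustration.

```xml
<?xml version="1.0"?>
<!-- The x-schema: prefix tells MSXML to load the named XDR schema file -->
<note xmlns="x-schema:note-schema.xml">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
```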

  15. Elements • root element: <schema> • W (as in W3C XML Schema): • M (as in Microsoft XDR Schema): • Attribute: xmlns (XML namespace)
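The two root-element listings are missing from the transcript. A sketch of how each is commonly written follows; the xs prefix is a convention rather than a requirement, and the XDR form is given in a comment so that the block stays a single well-formed document.

```xml
<?xml version="1.0"?>
<!-- W (W3C XML Schema): root element is "schema" in the XML Schema namespace -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- M (Microsoft XDR): root element is "Schema" with a capital S, e.g.
       <Schema xmlns="urn:schemas-microsoft-com:xml-data"
               xmlns:dt="urn:schemas-microsoft-com:datatypes"> ... </Schema> -->

  <!-- element and type declarations go here -->
</xs:schema>
```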

  16. W3C or Microsoft XDR? • The W3C version is the standard XML Schema recommendation; we'll use it throughout the class. http://www.w3.org • The Microsoft XDR specification can be found at: http://msdn.microsoft.com/library/en-us/xmlsdk/html/xmconxdr_whatis.asp

  17. Elements (cont.) • XML Schemas define the elements of your XML files • What are Global Elements? • Global elements are elements that are immediate children of the "schema" element! • Local elements are elements nested within other elements • Element Types: • Simple Types • Contain only text with data types • Complex Types • An XML element that contains other elements and/or attributes. • There are four kinds of complex elements: • Empty elements • Elements that contain only other elements • Elements that contain only text • Elements that contain both other elements and text

  18. Elements – Simple Elements • A simple element is an XML element that can contain only text. It cannot contain any other elements or attributes • The text can be of many different types. It can be one of the types that are included in the XML Schema definition (boolean, string, date, etc.), or it can be a custom type that you define yourself. • Common XML Schema Data Types: XML Schema has a lot of built-in data types. Here is a list of the most common types: • xs:string • xs:decimal • xs:integer • xs:boolean • xs:date • xs:time

  19. Elements – Simple Elements (cont.)
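The content of this slide is missing. A minimal sketch of simple element declarations using the built-in types listed on the previous slide; the element names are assumptions for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Each simple element has a name and a built-in data type -->
  <xs:element name="lastname" type="xs:string"/>
  <xs:element name="age" type="xs:integer"/>
  <xs:element name="dateborn" type="xs:date"/>
  <!-- Matching instance elements would look like:
       <lastname>Smith</lastname>
       <age>34</age>
       <dateborn>1968-03-27</dateborn> -->
</xs:schema>
```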

  20. Elements – Simple Elements (cont.) • Declare Default and Fixed Values for Simple Elements • Simple elements can have a default value OR a fixed value set. • A default value is automatically assigned to the element when no other value is specified. In the following example the default value is "red": • A fixed value is also automatically assigned to the element. You cannot specify another value. In the following example the fixed value is "red":
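The two examples the slide refers to are missing. A sketch of both, using the value "red" from the slide; the element names color and stopcolor are hypothetical, and two different names are used only because one schema cannot declare the same global element twice.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- default: an empty <color/> is treated as if it contained "red" -->
  <xs:element name="color" type="xs:string" default="red"/>
  <!-- fixed: the content must be "red" (or be empty, which also yields "red") -->
  <xs:element name="stopcolor" type="xs:string" fixed="red"/>
</xs:schema>
```

A single declaration cannot carry both default and fixed, which is why the slide says "OR".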

  21. Attributes • All attributes are declared as simple types. • Only complex elements can have attributes • What is an Attribute? • Simple elements cannot have attributes. • If an element has attributes, it is considered to be of complex type. • The attribute itself is always declared as a simple type. • This means that an element with attributes always has a complex type definition. • Define an Attribute

  22. Attributes (cont.) • XML Schema has a lot of built-in data types (same as in Common XML Schema Data Types, slides #19). • A simple attribute definition
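The attribute definition itself is missing from the transcript. A sketch, assuming an instance element such as <lastname lang="EN">Smith</lastname> (names chosen for illustration): because the element carries an attribute it needs a complex type, while the attribute is declared with a simple built-in type.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="lastname">
    <xs:complexType>
      <!-- text-only content (xs:string) extended with one attribute -->
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <!-- the attribute definition: a name plus a simple type -->
          <xs:attribute name="lang" type="xs:string"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```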

  23. Attributes (cont.) • Declare Default and Fixed Values for Attributes • Attributes can have a default value OR a fixed value specified. • A default value is automatically assigned to the attribute when no other value is specified. In the following example the default value is "EN": • A fixed value is also automatically assigned to the attribute. You cannot specify another value. In the following example the fixed value is "EN"
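A sketch of the two missing examples, using the value "EN" from the slide. The element name book and the attribute names lang and origlang are hypothetical; two attribute names are used so that both cases fit into one valid schema.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <!-- default: if lang is omitted, it is treated as lang="EN" -->
      <xs:attribute name="lang" type="xs:string" default="EN"/>
      <!-- fixed: if origlang is present, its value must be "EN" -->
      <xs:attribute name="origlang" type="xs:string" fixed="EN"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```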

  24. Attributes (cont.) • Creating Optional and Required Attributes • All attributes are optional by default. To explicitly specify that the attribute is optional, use the "use" attribute • To make an attribute required
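Both cases can be sketched with the use attribute. The element and attribute names below are assumptions for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <!-- optional (this is also what you get if "use" is left out) -->
      <xs:attribute name="lang" type="xs:string" use="optional"/>
      <!-- required: <book> without an isbn attribute does not validate -->
      <xs:attribute name="isbn" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```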

  25. Restrictions on Content • When an XML element or attribute has a type defined, it puts a restriction on the element's or attribute's content. • If an XML element is of type "xs:date" and contains a string like "Hello Mother", the element will not validate. • With XML Schemas, you can also add your own restrictions to your XML elements and attributes. These restrictions are called facets.  • Restrictions/Facets • Restrictions are used to control acceptable values for XML elements or attributes. • Restrictions on XML elements are called facets

  26. Restrictions on Content (cont.) • Restrictions on Values • This example defines an element called "age" with a restriction. The value of age cannot be lower than 0 or greater than 100
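The "age" example described above can be sketched as follows, assuming the base type is xs:integer (the slide does not state the base type).

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="age">
    <xs:simpleType>
      <!-- start from a built-in type, then narrow it with facets -->
      <xs:restriction base="xs:integer">
        <xs:minInclusive value="0"/>
        <xs:maxInclusive value="100"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```

With this declaration <age>42</age> validates, while <age>-1</age> and <age>101</age> do not.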

  27. Restrictions on Content (cont.) • Restrictions on a Set of Values • To limit the content of an XML element to a set of acceptable values, we would use the enumeration constraint. • This example defines an element called "car" • The "car" element is a simple type with a restriction. The acceptable values are: Audi, Golf, BMW.
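A sketch of the "car" element with the three values named on the slide, assuming a base type of xs:string.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="car">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <!-- one enumeration facet per acceptable value -->
        <xs:enumeration value="Audi"/>
        <xs:enumeration value="Golf"/>
        <xs:enumeration value="BMW"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```

Enumeration values are compared exactly, so <car>audi</car> would not validate against this sketch.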

  28. Restrictions on Content (cont.) • Restrictions on a Series of Values • To limit the content of an XML element to a series of numbers or letters, we would use the pattern constraint. • The example defines two elements called "letter" and "word": • The "letter" element is a simple type with a restriction. The only acceptable value is ONE of the LOWERCASE letters from a to z • The "word" element is a simple type with a restriction. The only acceptable value is TWO of the UPPERCASE letters from A to Z
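The two elements described above can be sketched like this, assuming a base type of xs:string. The pattern value is a regular expression that the whole element content must match.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- exactly one lowercase letter from a to z -->
  <xs:element name="letter">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="[a-z]"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
  <!-- exactly two uppercase letters from A to Z -->
  <xs:element name="word">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="[A-Z][A-Z]"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```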

  29. Restrictions on Content (cont.) • Several other examples: • [a-zA-Z]: The only acceptable value is ONE of the LOWERCASE OR UPPERCASE letters • [xyz]: The only acceptable value is ONE of the following letters: x, y, OR z • [0-9][0-9]: The only acceptable value is TWO digits in a sequence, and each digit must be in a range from 0 to 9 • ([a-z])*: The acceptable value is zero or more occurrences of lowercase letters from a to z (the + quantifier and the pipe character | for alternatives can be applied in the same way) • [a-zA-Z0-9]{8}: There must be exactly eight characters in a row, and each character must be a lowercase letter, an uppercase letter, or a digit

  30. Restrictions on Content (cont.) • Restrictions on White Space Characters • To specify how white space characters should be handled, use the whiteSpace constraint. • This example defines an element called "address" • If the whiteSpace constraint is set to "preserve", the XML processor WILL NOT remove any white space characters • Alternatively, the whiteSpace constraint can be set to "replace": the XML processor WILL REPLACE all white space characters (line feeds, tabs, and carriage returns) with spaces • The whiteSpace constraint can also be set to "collapse": white space characters are replaced with spaces as for "replace", leading and trailing spaces are removed, and multiple spaces are reduced to a single space
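A sketch of the "address" element with the whiteSpace facet, assuming a base type of xs:string. Changing the value to "replace" or "collapse" selects the other two behaviours.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="address">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <!-- keep line feeds, tabs and spaces exactly as written -->
        <xs:whiteSpace value="preserve"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```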

  31. Restrictions on Content (cont.) • Restrictions on Length • To limit the length of a value in an element, we would use the length, maxLength, and minLength constraints.
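A sketch of the length facets. The element names and the numeric limits are assumptions for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- length: the value must be exactly 8 characters long -->
  <xs:element name="pin">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:length value="8"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
  <!-- minLength and maxLength: between 5 and 8 characters -->
  <xs:element name="password">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:minLength value="5"/>
        <xs:maxLength value="8"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```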

  32. Restrictions on Data Types

  33. XML Data Types (cont.)

  34. Elements – Complex (cont.) • What is a Complex Element? • A complex element is an XML element that contains other elements and/or attributes. • There are four kinds of complex elements: • Empty elements • Elements that contain only other elements • Elements that contain only text • Elements that contain both other elements and text • Note: Each of these elements may contain attributes as well!

  35. Elements – Complex (cont.) • 4 types of complex XML elements
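The four instance examples are missing from the transcript. A sketch of one element of each kind; all names and values are assumptions, and the examples wrapper exists only to make the block a single well-formed document.

```xml
<?xml version="1.0"?>
<examples>
  <!-- 1. empty element (it may still carry attributes) -->
  <product pid="1345"/>
  <!-- 2. element that contains only other elements -->
  <employee>
    <firstname>John</firstname>
    <lastname>Smith</lastname>
  </employee>
  <!-- 3. element that contains only text (plus an attribute) -->
  <food type="dessert">Ice cream</food>
  <!-- 4. element that contains both text and other elements -->
  <description>It happened on <date>1999-03-03</date>, early in the morning.</description>
</examples>
```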

  36. Elements – Complex (cont.) • Define a complex element
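A complex element can be defined in two ways, sketched below with assumed names: with an anonymous type nested inside the element declaration, or with a named type that several elements can share.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Way 1: anonymous complex type, usable only by "employee" -->
  <xs:element name="employee">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="firstname" type="xs:string"/>
        <xs:element name="lastname" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- Way 2: named complex type, reused by "student" and "member" -->
  <xs:complexType name="personinfo">
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="student" type="personinfo"/>
  <xs:element name="member" type="personinfo"/>
</xs:schema>
```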

  37. Elements – Complex (cont.) • Type I: Define Complex Types for Empty Elements • An empty complex element can contain attributes; but it cannot have any content between the opening and closing tags.
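A sketch of an empty complex element, assuming an instance such as <product prodid="1345"/> (names are illustrative). The complex type declares an attribute but no child elements and no text content.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="product">
    <!-- no content model at all: only the attribute is allowed -->
    <xs:complexType>
      <xs:attribute name="prodid" type="xs:positiveInteger"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```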

  38. Elements – Complex (cont.) • Type II: Define Complex Types with Elements Only • An "elements-only" complex type describes an element that contains only other elements

  39. Elements – Complex (cont.) • Type III: Define Complex Text-Only Elements • A complex text element can contain both attributes and text • This type contains only simple content (text and attributes) • Add a simpleContent element around the content • You must define an extension OR a restriction within the simpleContent element

  40. Elements – Complex (cont.) • Examples
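The slide's examples are missing. A sketch of a text-only complex element, assuming an instance such as <shoesize country="france">35</shoesize> (names are illustrative): simpleContent wraps an extension that starts from xs:integer and adds one attribute.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="shoesize">
    <xs:complexType>
      <xs:simpleContent>
        <!-- extension: keep the integer text content, add an attribute -->
        <xs:extension base="xs:integer">
          <xs:attribute name="country" type="xs:string"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```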

  41. Elements – Complex (cont.) • Type IV: Define Complex Types with Mixed Content • A mixed complex type element can contain attributes, elements, and text

  42. Elements – Complex (cont.) • Example: Mixed content + reusable element definition
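The example listing is missing. A sketch with assumed names: the named type lettertype sets mixed="true", so text may appear between the child elements, and because the type is named it can be reused by any number of element declarations.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- reusable definition: the element refers to a named type -->
  <xs:element name="letter" type="lettertype"/>

  <xs:complexType name="lettertype" mixed="true">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="orderid" type="xs:positiveInteger"/>
      <xs:element name="shipdate" type="xs:date"/>
    </xs:sequence>
  </xs:complexType>
  <!-- A matching instance:
       <letter>Dear Mr. <name>John Smith</name>. Your order
       <orderid>1032</orderid> will be shipped on
       <shipdate>2001-07-13</shipdate>.</letter> -->
</xs:schema>
```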

  43. Elements – Complex (cont.) • Complex Type Indicators • Indicators control HOW elements are to be used in documents • 7 indicators: • Order indicators: Define how elements should occur • all • choice • sequence • Occurrence indicators: Define how often an element can occur • maxOccurs • minOccurs • Group indicators: Define related sets of elements or attributes • group name (element group) • attributeGroup name (attribute group)

  44. Elements – Complex (cont.) Order Indicators Order indicators are used to define how elements should occur.
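The three order indicators can be sketched side by side. All element names are assumptions for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- all: both children must appear, in any order -->
  <xs:element name="person">
    <xs:complexType>
      <xs:all>
        <xs:element name="firstname" type="xs:string"/>
        <xs:element name="lastname" type="xs:string"/>
      </xs:all>
    </xs:complexType>
  </xs:element>
  <!-- choice: exactly one of the children may appear -->
  <xs:element name="contact">
    <xs:complexType>
      <xs:choice>
        <xs:element name="email" type="xs:string"/>
        <xs:element name="phone" type="xs:string"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>
  <!-- sequence: children must appear in the declared order -->
  <xs:element name="name">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="first" type="xs:string"/>
        <xs:element name="last" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```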

  45. Elements – Complex (cont.) Occurrence Indicators Occurrence indicators are used to define how often an element can occur. Note: For all "Order" and "Group" indicators (any, all, choice, sequence, group name, and group reference) the default value for maxOccurs and minOccurs is 1
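A sketch of the occurrence indicators on child elements, with assumed names and limits.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <!-- no indicators given: minOccurs and maxOccurs both default to 1 -->
        <xs:element name="full_name" type="xs:string"/>
        <!-- may be absent, and may repeat up to 10 times -->
        <xs:element name="child_name" type="xs:string"
                    minOccurs="0" maxOccurs="10"/>
        <!-- maxOccurs="unbounded" removes the upper limit -->
        <xs:element name="nickname" type="xs:string"
                    minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```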

  46. Elements – Complex (cont.) • A working example (which type of complex element?) XML

  47. Elements – Complex (cont.) Group Indicators Group indicators are used to define related sets of elements or attributes

  48. Elements – Complex (cont.) • The example of defining an element group and using it in an element definition
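The example listing is missing. A sketch with assumed names: the group is declared once at the top level and then pulled into a complex type with ref.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- element group: must contain an all, choice, or sequence -->
  <xs:group name="persongroup">
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
      <xs:element name="birthday" type="xs:date"/>
    </xs:sequence>
  </xs:group>

  <!-- using the group inside an element definition -->
  <xs:element name="person" type="personinfo"/>
  <xs:complexType name="personinfo">
    <xs:sequence>
      <xs:group ref="persongroup"/>
      <xs:element name="country" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```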

  49. Elements – Complex (cont.) • The example of defining an attribute group and using it in an element definition
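The example listing is missing. A sketch with assumed names, following the same define-once, reference-by-name idea as element groups.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- attribute group: a named set of attribute declarations -->
  <xs:attributeGroup name="personattrgroup">
    <xs:attribute name="firstname" type="xs:string"/>
    <xs:attribute name="lastname" type="xs:string"/>
    <xs:attribute name="birthday" type="xs:date"/>
  </xs:attributeGroup>

  <!-- using the attribute group inside an element definition -->
  <xs:element name="person">
    <xs:complexType>
      <xs:attributeGroup ref="personattrgroup"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```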

  50. Elements – The <any> Element (cont.) • The <any> element enables us to extend the XML document with elements not specified by the schema • A declaration for the "person" element is presented. • By using the <any> element we can extend the content of "person", after its declared elements, with any element:
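The declaration is missing from the transcript. A sketch of what it might look like, with assumed child element names: xs:any is placed after the declared children and is made optional.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="firstname" type="xs:string"/>
        <xs:element name="lastname" type="xs:string"/>
        <!-- one extra element of any name may follow lastname;
             with the default strict processing it must still be
             declared in some schema the validator can find -->
        <xs:any minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Setting processContents="lax" or "skip" on xs:any relaxes how the extra element is checked.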
