Multi-Schema Project: Zero, One, or Many Namespaces?

Multi-Schema Project:Zero, One, or Many Namespaces? XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Managing Multiple Schemas - Same or Different targetNamespaces? . . . Schema-n.xsd Schema-1.xsd Schema-2.xsd … or no targetNamespace?

Multi-Schema Project:Zero, One, or Many Namespaces? • Issue: In a typical project multiple schemas will be created. What targetNamespace should each schema have? There are several designs to consider: • Heterogeneous Namespaces Design: each schema has a different targetNamespace • Homogeneous Namespace Design: all schemas have the same targetNamespace • No Namespace Design (Chameleon Design): the "integrating schema" has a targetNamespace, and the "supporting schemas" have no targetNamespace

Heterogeneous Namespaces Design <xsd:schema … targetNamespace="B"> <xsd:schema … targetNamespace="A"> B.xsd A.xsd <xsd:schema … targetNamespace="C"> <xsd:import namespace="A" schemaLocation="A.xsd"/> <xsd:import namespace="B" schemaLocation="B.xsd"/> … </xsd:schema> C.xsd

Heterogeneous Namespaces Design • Each schema has a different targetNamespace. • The integrating schema reuses the components in the supporting schemas by <import>ing them. • Example. C.xsd reuses the components in A.xsd and B.xsd • C.xsd is the integrating schema • A.xsd and B.xsd are the supporting schemas

Homogeneous Namespace Design <xsd:schema … targetNamespace="Library"> <xsd:schema … targetNamespace="Library"> LibraryBookCatalogue.xsd LibraryEmployees.xsd <xsd:schema … targetNamespace="Library"> <xsd:include schemaLocation="LibraryBookCatalogue.xsd"/> <xsd:include schemaLocation="LibraryEmployees.xsd"/> … </xsd:schema> Library.xsd

Homogeneous Namespace Design • Each schema has the same targetNamespace. • The integrating schema reuses the components in the supporting schemas by <include>ing them.

No Namespace Design (Chameleon Design) <xsd:schema …> <xsd:schema …> Q.xsd R.xsd <xsd:schema … targetNamespace="Z"> <xsd:include schemaLocation="Q.xsd"/> <xsd:include schemaLocation="R.xsd"/> … </xsd:schema> Z.xsd

No Namespace Design (Chameleon Design) • The integrating schema has a targetNamespace. • The supporting schemas have no targetNamespace. • The integrating schema reuses the components in the no-namespace supporting schemas by <include>ing them • When an integrating schema <include>s components from a schema with no targetNamespace, those no-namespace components "take on" the namespace of the integrating schema. This is called the Chameleon Effect. From now on the no-namespace components will be referred to as Chameleon components.

Homogeneous & Chameleon Designs - Reuse and Customize • With both of these designs the integrating schema can access the components in the supporting schemas by either of these mechanisms: • <include>: This element gives the integrating schema access to the components. The effect is as though the component declarations/definitions had been typed directly into the integrating schema. If the components come from a schema with no namespace they take on the namespace of the integrating schema. • <redefine>: This element gives access to the components, just like the <include> element does. In addition, it enables you to customize zero or more of the components (e.g., extend/restrict a complexType).

Example showing <redefine> <xsd:schema …> <xsd:complexType name="Book"> <xsd:sequence> <xsd:element name="Title" type="xsd:string"/> <xsd:element name="Author" type="xsd:string"/> </xsd:sequence> </xsd:complexType> </xsd:schema> Book.xsd <xsd:schema … targetNamespace="Library"> <xsd:redefine schemaLocation="Book.xsd"> <xsd:complexType name="Book"> <xsd:complexContent> <xsd:extension base="Book"> <xsd:sequence> <xsd:element name="ISBN" type="xsd:string"/> </xsd:sequence> </xsd:extension> </xsd:complexContent> </xsd:complexType> </xsd:redefine> … </xsd:schema> Library.xsd

Advantages/DisadvantagesHeterogeneous Namespace Design • Advantages: • Avoids Name Collisions: occasionally, in a multi-schema project the same component name may have been used by several schemas. With this design, since the components are all in different namespaces there will not be a name collision when an integrating schema <import>s them. • There is a visual (namespace) indication in instance documents of the origin/lineage of each component. • Disadvantages: • No <redefine> capability: you cannot redefine components that are in a different namespace.

Advantages/DisadvantagesHomogeneous Namespace Design • Advantages: • Has <redefine> capability: an integrating schema can customize the components in the supporting schemas by using the <redefine> element • Disadvantages: • Potential Name Collisions: there may be multiple components with the same name in the schemas. When they are <include>d in the integrating schema there will be a name collision. • There is no visual (namespace) indication in instance documents of the origin/lineage of each component (i.e., which schema declared it).

Advantages/DisadvantagesNo Namespace Design (Chameleon Design) • Before discussing the advantages/disadvantages of this design: • Let’s look at using proxy schemas as a mechanism for supporting Chameleon schemas.

Using Proxy Schemas in the Chameleon Design <xsd:schema …> <xsd:schema …> Q.xsd R.xsd <xsd:schema … targetNamespace="Z2"> <xsd:include schemaLocation="R.xsd"/> <xsd:schema … targetNamespace="Z1"> <xsd:include schemaLocation="Q.xsd"/> Q-Proxy.xsd R-Proxy.xsd <xsd:schema … targetNamespace="Z"> <xsd:import namespace="Z1" schemaLocation="Q-Proxy.xsd"/> <xsd:import namespace="Z2" schemaLocation="R-Proxy.xsd"/> … </xsd:schema> Z.xsd

Proxy Schema • Incorporates the components in a no-namespace schema using either <include> or <redefine> • Gives context (i.e., a namespace) to the no-namespace components. [Recall that when Chameleon components are used in a schema with a namespace they take-on that namespace.] • Customizes the Chameleon components using <redefine>

Using Proxy Schemas for Name Collision Avoidance • Create a proxy schema for each no-namespace schema. • The integrating schema <import>s the proxy schemas • Chameleon components with the same name are now embedded within different namespaces in the proxy schemas. Thus, when the integrating schema accesses the components in the proxy schemas there is no name collision.

Advantages/DisadvantagesNo Namespace Design (Chameleon Design) • Advantages: • Avoids Name Collisions: with the use of proxy schemas name collisions are avoided • The Chameleon components are not hardcoded to one namespace, one semantics. • Customizable Chameleon components: • Structure Customization: Customize the structure using <redefine> • Namespace/Semantics Customization: Customize the namespace and semantics using proxy schemas • Disadvantages: • Proxy schema overhead: to avoid name collision we saw that we could use proxy schemas. This is good, but it's also extra work. • There is no visual (namespace) indication in instance documents of the origin/lineage of Chameleon components (i.e., what schema defined it)

Creating Tools to Process Chameleon Components • Chameleon components blend in with the schemas that they reside within. That is, they adopt the targetNamespace of the schema that <include>s them. • So how do you write tools for components that can assume so many different faces (namespaces)? • One solution is that when you create Chameleon components assign them a global unique id (a GUID). Each component in a schema can have an id attribute. The id attribute is local to the schema, and has no representation in instance documents. The id attribute could be used by a tool to "locate" a Chameleon component, regardless of what "face" (namespace) it currently wears. That is, the tool can open up an instance document using DOM, and the DOM API will provide the tool access to the id value for all components in the instance document. (DOM 3 will provide this capability.)

Chameleon Components <xsd:schema …> <xsd:schema … targetNamespace="Z2"> <xsd:include schemaLocation=”Q.xsd"/> <xsd:schema … targetNamespace="Z1"> <xsd:include schemaLocation="Q.xsd"/> Chameleon components take-on the namespace of the <include>ing schema

Chameleon Component Tools Tool ? <xsd:schema …> <xsd:schema … targetNamespace="Z2"> <xsd:include schemaLocation=”Q.xsd"/> <xsd:schema … targetNamespace="Z1"> <xsd:include schemaLocation="Q.xsd"/> How does a tool identify components that can assume many faces? Certainly not by namespaces.

Identifier for each Component(Schema id Attribute) <xsd:element name="Lat_Lon" id="http://www.geospacial.org" … </xsd:element> Each component (element, complexType, simpleType, attribute) in a schema can have an associated id attribute. This can be used to uniquely identify each Chameleon component, regardless of its namespace.

Chameleon Component Tools Tool <xsd:schema …> id="www.geospacial.org" <xsd:schema … targetNamespace="Z2"> <xsd:include schemaLocation=”Q.xsd"/> <xsd:schema … targetNamespace="Z1"> <xsd:include schemaLocation="Q.xsd"/> id="www.geospacial.org" id="www.geospacial.org" A tool can locate the Chameleon component by using the id attribute.

Traceability: Schema id Versus Namespaces • The targetNamespace places all the components in a schema within one category. Example: zoom and f-stop are in the Olympus namespace. • Namespaces allow you to “visually categorize” components in instance documents. Example: olympus:zoom, and olympus:f-stop. • With the schema id attribute we can categorize components to a finer degree. Example: for the zoom element id="camera:olympus:zoom", for the f-stop element id="camera:olympus:f-stop" • Note that the schema id attribute is strictly for “programmatic categorization”. There is no visual representation in instance documents.

Guidelines • Use the Chameleon Design for schemas which contain components that have no inherent semantics by themselves and it is only in the context of an <include>ing schema that they take on semantics. Also, use the Chameleon Design when you don’t want to hardcode a namespace to a schema, rather you want <include>ing schemas to be able to provide their own application-specific namespace to the schema • Example. A repository of components - such as a schema which defines an array type, or vector, linked list, etc - should be declared with no targetNamespace (i.e., Chameleon). • As a rule of thumb, if your schema just contains type definitions (no element declarations) then that schema is probably a good candidate for being a Chameleon schema • Use the Homogeneous design • when all of your schemas are conceptually related • when there is no need to visually identify in instance documents the origin/lineage of each element/attribute. In this design all components come from the same namespace, so you loose the ability to identify that "element A comes from schema X". Oftentimes that's okay - you don't want to differentiate the elements/attributes. This design approach is well suited for those situations. • Use the Heterogeneous design • when there are multiple elements with the same name. (Avoid name collision) • when there is a need to visually identify in instance documents the origin/lineage of each element/attribute. In this design the components come from different namespaces, so you have the ability to identify that "element A comes from schema X". • Consider identifying components using the schema id attribute. This will enable a finer degree of traceability than is possible using namespaces.

Best Practice • The schema id attribute is not just for identifying Chameleon components! • However, it is especially useful for the Homogeneous and Chameleon designs. With both of these designs the <include> results in loss of origin of the included components. With the schema id attribute we can identify the origin of each component • The combination of namespaces plus the schema id attribute is a powerful tandem for visually and programmatically identifying components. • We recommend as standard practice all components have a schema id attribute. Do Labs 1,2,3

Multi-Schema Project: Zero, One, or Many Namespaces?

Multi-Schema Project: Zero, One, or Many Namespaces?

Presentation Transcript

Using Schema to Make Inferences

Lecture 07 Project

Module 2: Creating Schemas

The XML Standard

XML Schema

Schema

Multi-column Substring Matching for Database Schema Translation

IST346 :

Extensible Markup Language II

Module 3 XML Schema

Introduction to Computing Using Python

OCT

Namespaces

CS4221 - Database Design Course Project Comparison on XML Schema L anguages

Introduction to XML

Introduction to Computing Using Python

Chapter 3: XML Namespaces

XML Schema

Scoping and Namespaces

What is XML Schema ?

Introduction to XML Schema

XML Namespaces