Lecture 14: Metadata and Markup

Prof. Ray Larson & Prof. Marc Davis, UC Berkeley SIMS. Tuesday and Thursday 10:30 am - 12:00 pm, Fall 2003. http://www.sims.berkeley.edu/academics/courses/is202/f03/ SIMS 202: Information Organization and Retrieval.


  1. Prof. Ray Larson & Prof. Marc Davis UC Berkeley SIMS Tuesday and Thursday 10:30 am - 12:00 pm Fall 2003 http://www.sims.berkeley.edu/academics/courses/is202/f03/ Lecture 14: Metadata and Markup SIMS 202: Information Organization and Retrieval

  2. Lecture Overview • Review • XML and Document Engineering • Metadata And Markup • XML As A Metadata Lingua Franca • METS • SGML vs. XML DTD Construction • XML Schemas • XML For Protocols And Metadata Languages • Readings/Discussion

  5. XML as a common syntax • XML (and SGML) provide a way of expressing the structure of documents that can be verified and validated by document processing systems • “Documents” can be metadata structures • Such as the description of a particular photograph in our Phone project • XML thus provides a way of representing metadata descriptions as well as the content that they describe

  6. XML as a common syntax • All XML documents follow some simple rules that make them interchangeable and usable across different systems • All data and markup is in Unicode • Every element is marked by a start tag and an end tag (or a single empty-element tag) • All markup is case-sensitive • XML DTDs and/or Schemas define the valid structure (and sometimes content) of the documents
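The rules on this slide can be tried out with any XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the `photo` and `caption` element names and the helper `is_well_formed` are made up for illustration and are not taken from the course's Phone project.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Every start tag has a matching end tag with identical case.
good = "<photo><caption>Campanile at dusk</caption></photo>"
# Markup is case-sensitive, so <Caption> is not closed by </caption>.
bad_case = "<photo><Caption>Campanile at dusk</caption></photo>"
# The caption element is never closed.
unclosed = "<photo><caption>Campanile at dusk</photo>"

for sample in (good, bad_case, unclosed):
    print(is_well_formed(sample), sample)
```

Note that this only checks well-formedness. Checking a document against a DTD or Schema (validity) needs a validating parser, which the Python standard library does not include.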

  7. Example – METS • METS – the Metadata Encoding and Transmission Standard is a new Schema intended to provide: • “a standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library, expressed using the XML schema language of the World Wide Web Consortium” • METS can be used to “wrap” complex sets of data (the actual data, with rules for encoding binary forms), the metadata describing the parts of that data, and the sequence and conditions under which the data can or should be presented or displayed

  8. Lecture Overview • Review • XML and Document Engineering • Metadata And Markup • XML As A Metadata Lingua Franca • METS • SGML vs. XML DTD Construction • XML Schemas • XML For Protocols And Metadata Languages • Readings/Discussion

  9. SGML/XML Structure • An SGML document consists of three parts: • The SGML Declaration • The Document Type Definition (DTD) • The Document Instance • An XML document REQUIRES only the document instance, but for effective processing a DTD is very important • XML Schema (later) provides an alternative to DTDs for XML applications

  10. Document Type Definitions • The DTD describes the structural elements and "shorthand" markup for a particular document type and defines: • Names of "legal" elements • How many times elements can appear • The order of elements in a document • Whether markup can be omitted (SGML only) • Contents of elements (i.e., nested structures) • Attributes associated with elements • Names of "entities" • Short-hand conventions for element tags (SGML only)

  11. DTD Components • The major components of a DTD are: • Entity Declarations • Element Declarations • Attribute Declarations

  12. Document Type Definitions • Entity Declarations are a "macro" definition facility for both the DTD and the document instance • General Internal Entity Definitions: <!ENTITY name "substitute string"> referenced by &name; • General External Entity Definitions: <!ENTITY name SYSTEM "file path"> referenced by &name; • Parameter Entity Definitions (used only inside DTDs): <!ENTITY % name "substitute string"> or <!ENTITY % name SYSTEM "file path"> referenced by %name; or, in SGML, %name (XML requires the semicolon)
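To see the "macro" behaviour of a general internal entity, here is a small sketch using Python's standard `xml.etree.ElementTree`, whose underlying parser expands internal entities declared in the document's internal DTD subset. The entity name `sims` and the `memo`/`from` elements are chosen only for this illustration.

```python
import xml.etree.ElementTree as ET

# A document whose internal DTD subset declares one general internal entity.
doc = """<!DOCTYPE memo [
  <!ENTITY sims "School of Information Management and Systems">
]>
<memo><from>&sims;</from></memo>"""

root = ET.fromstring(doc)

# The parser replaces &sims; with the substitute string before we see the text.
expanded = root.find("from").text
print(expanded)
```

External entities (the SYSTEM form) are a different matter: whether they are fetched depends on the parser and its configuration, so the sketch sticks to the internal form.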

  13. Document Type Definitions • SGML Element Declarations define the structural elements of a document and their associated markup: <!ELEMENT name - - content_model or declared_content +(include_list) -(exclude_list) > • The two omitted tag minimization indicators after the name state whether the start-tag and the end-tag, respectively, can be omitted (O) or are required (-) in the markup • These indicators are part of SGML element declarations but can NOT be used in XML

  14. Document Type Definitions • Content model provides a nested structural description of the elements that make up this element, e.g.: <!ELEMENT memo - - ((to & from), body, close?)> <!ELEMENT body - O (p)* > <!ELEMENT p - O (#PCDATA | q)*> <!ELEMENT q - - (#PCDATA)>... • ANY (in SGML) may be used to indicate a content model of any elements in the DTD, in any order

  15. Document Type Definitions • A similar content model in XML: <?xml version="1.0"?> <!DOCTYPE memo [ <!ELEMENT memo ((to | from)+, body, close?)> <!ELEMENT body (p)* > <!ELEMENT p (#PCDATA | q)* > <!ELEMENT q (#PCDATA)> … ]> • Note the XML declaration at the start of the prolog • Note that & on the previous slide is not legal XML, so (to | from)+ stands in for (to & from) here; it is looser, since it also accepts, for example, two to elements and no from
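A content model is essentially a regular expression over the sequence of child element names. As a rough sketch (not how a real validating parser is implemented), the XML model for `memo` above can be hand-translated into a Python regular expression and applied to the child tags. The helper name `matches_memo_model` is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hand translation of: <!ELEMENT memo ((to | from)+, body, close?)>
# Each child name is followed by a comma so that names cannot run together.
MEMO_MODEL = re.compile(r"(?:(?:to|from),)+body,(?:close,)?")


def matches_memo_model(xml_text: str) -> bool:
    """Check only the top-level children of memo against the content model."""
    root = ET.fromstring(xml_text)
    child_names = "".join(child.tag + "," for child in root)
    return root.tag == "memo" and MEMO_MODEL.fullmatch(child_names) is not None


print(matches_memo_model("<memo><to/><from/><body/></memo>"))
print(matches_memo_model("<memo><body/><to/></memo>"))
```

The second call shows why order matters: `body` before `to` does not fit the sequence, even though both element names are legal.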

  16. Document Type Definitions • Declared content can be: PCDATA, CDATA, RCDATA, EMPTY • Inclusion and exclusion lists can be used to indicate elements that can occur or are forbidden to occur in any sub-elements of the content model (NOT in XML), e.g.: <!ELEMENT memo - - ((to & from), body, close?) +(fn)> • Says that element fn can appear anyplace in the memo

  17. Document Type Definitions • Attribute Declarations define attributes associated with (potentially) each element of a document and provide the acceptable values for those attributes

  18. Attributes Example • <!ATTLIST associate_element attribute_name declared_value default_value > • <!ATTLIST memo status (PUBLIC | CONFIDENTIAL) PUBLIC> • In markup of a document: <memo status="CONFIDENTIAL"> • Also, because of the default set, <memo> would be the same as <memo status="PUBLIC"> • There are a variety of special defaults and data types that can be given in attribute definitions
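The effect of the `status` declaration can be imitated by hand. The sketch below does not read the ATTLIST; it hard-codes the allowed values and the default in two constants (`ALLOWED_STATUS`, `DEFAULT_STATUS`, both names invented here) to show what a DTD-aware processor would report for the attribute.

```python
import xml.etree.ElementTree as ET

# Mirrors: <!ATTLIST memo status (PUBLIC | CONFIDENTIAL) PUBLIC>
ALLOWED_STATUS = {"PUBLIC", "CONFIDENTIAL"}
DEFAULT_STATUS = "PUBLIC"


def memo_status(xml_text: str) -> str:
    """Return the effective status of a memo, applying the declared default."""
    memo = ET.fromstring(xml_text)
    status = memo.get("status", DEFAULT_STATUS)
    if status not in ALLOWED_STATUS:
        raise ValueError(f"status {status!r} is not one of {sorted(ALLOWED_STATUS)}")
    return status


print(memo_status("<memo/>"))
print(memo_status('<memo status="CONFIDENTIAL"/>'))
```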

  19. Sample SGML DTD <!doctype ELIB-TEXTS [ <!-- This is a DTD for bibliographic records extracted from the elib/rfc1357 simple bibliographic format. --> <!ELEMENT ELIB-TEXTS o o (ELIB-BIB*)> <!-- We allow most elements to occur any number of times in any order --> <!-- this is because there is little consistency in the actual usage. --> <!ELEMENT ELIB-BIB - - (BIB-VERSION, ID, ENTRY?, DATE?, TITLE*, ORGANIZATION*, (SERIES | TYPE | REVISION | REVISION-DATE | AUTHOR-PERSONAL | AUTHOR-INSTITUTIONAL | AUTHOR-CONTRIBUTING-PERSONAL | AUTHOR-CONTRIBUTING-PERSONAL | AUTHOR-CONTRIBUTING-INSTITUTIONAL | CONTACT AUTHOR | PROJECT | PAGES | BIOREGION | CERES-BIOREGION | TEXTSOUP | LOCATION | ULTIMATE-CLIENT | URL | KEYWORDS | NOTES | ABSTRACT)*, (TEXT-REF | PAGED-REF)* )> <!-- We won't make any assumptions about content... all PCDATA --> <!ELEMENT ID - o (#PCDATA)> <!ELEMENT ABSTRACT - o (#PCDATA)> <!ELEMENT AUTHOR-CONTRIBUTING-INSTITUTIONAL - o (#PCDATA)> <!ELEMENT AUTHOR-CONTRIBUTING-PERSONAL - o (#PCDATA)> <!ELEMENT AUTHOR-PERSONAL-CONTRIBUTING - o (#PCDATA)> … etc… ]>

  20. XML Version <!DOCTYPE ELIB-TEXTS [ <!-- This is a DTD for bibliographic records extracted from the elib/rfc1357 simple bibliographic format. --> <!ELEMENT ELIB-TEXTS (ELIB-BIB*)> <!-- We allow most elements to occur any number of times in any order --> <!-- this is because there is little consistency in the actual usage. --> <!ELEMENT ELIB-BIB (BIB-VERSION, ID, ENTRY?, DATE?, TITLE*, ORGANIZATION*, (SERIES | TYPE | REVISION | REVISION-DATE | AUTHOR-PERSONAL | AUTHOR-INSTITUTIONAL | AUTHOR-CONTRIBUTING-PERSONAL | AUTHOR-CONTRIBUTING-PERSONAL | AUTHOR-CONTRIBUTING-INSTITUTIONAL | CONTACT AUTHOR | PROJECT | PAGES | BIOREGION | CERES-BIOREGION | TEXTSOUP | LOCATION | ULTIMATE-CLIENT | URL | KEYWORDS | NOTES | ABSTRACT)*, (TEXT-REF | PAGED-REF)* )> <!-- We won't make any assumptions about content... all PCDATA --> <!ELEMENT ID (#PCDATA)> <!ELEMENT ABSTRACT (#PCDATA)> <!ELEMENT AUTHOR-CONTRIBUTING-INSTITUTIONAL (#PCDATA)> <!ELEMENT AUTHOR-CONTRIBUTING-PERSONAL (#PCDATA)> <!ELEMENT AUTHOR-PERSONAL-CONTRIBUTING (#PCDATA)> … etc… ]>

  21. Document Using That DTD <ELIB-BIB> <BIB-VERSION>ELIB-v1.0 </BIB-VERSION> <ID>6</ID> <ENTRY>February 13 1995</ENTRY> <DATE>March 1, 1993</DATE> <TITLE>Water Conditions in California Report 2</TITLE> <ORGANIZATION>California Department of Water Resources</ORGANIZATION> <SERIES>120-93</SERIES> <TYPE>bulletin</TYPE> <AUTHOR-INSTITUTIONAL>California Department of Water Resources </AUTHOR-INSTITUTIONAL> <PAGES>17</PAGES> <TEXT-REF>/elib/data/disk/disk5/documents/6/HYPEROCR/hyperocr.html </TEXT-REF> <PAGED-REF>/elib/data/disk/disk5/documents/6/OCR-ASCII-NOZONE </PAGED-REF> </ELIB-BIB>

  22. Dublin Core • Review… • Simple metadata for describing internet resources • For “Document-Like Objects” • 15 Elements

  23. Dublin Core Elements • Title • Creator • Subject • Description • Publisher • Other Contributors • Date • Resource Type • Format • Resource Identifier • Source • Language • Relation • Coverage • Rights Management

  24. DC XML DTD Implementation • There have been various versions • This one is the one recommended (required) by the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) • Uses XML Namespaces • Available at http://dublincore.org/documents/2001/09/20/dcmes-xml/

  25. DC Element and Attribute Definitions <!-- The elements from DCMES 1.1 --> <!-- The name given to the resource. --> <!ELEMENT dc:title (#PCDATA)> <!ATTLIST dc:title xml:lang CDATA #IMPLIED> <!-- An entity primarily responsible for making the content of the resource. --> <!ELEMENT dc:creator (#PCDATA)> <!ATTLIST dc:creator xml:lang CDATA #IMPLIED> <!-- The topic of the content of the resource. --> <!ELEMENT dc:subject (#PCDATA)> <!ATTLIST dc:subject xml:lang CDATA #IMPLIED> <!-- An account of the content of the resource. --> <!ELEMENT dc:description (#PCDATA)> <!ATTLIST dc:description xml:lang CDATA #IMPLIED> <!-- The entity responsible for making the resource available. --> <!ELEMENT dc:publisher (#PCDATA)> <!ATTLIST dc:publisher xml:lang CDATA #IMPLIED> <!-- An entity responsible for making contributions to the content of the resource. --> <!ELEMENT dc:contributor (#PCDATA)> <!ATTLIST dc:contributor xml:lang CDATA #IMPLIED> <!-- A date associated with an event in the life cycle of the resource. --> <!ELEMENT dc:date (#PCDATA)> <!ATTLIST dc:date xml:lang CDATA #IMPLIED>

  26. DC Element Definitions (cont.) <!-- The nature or genre of the content of the resource. --> <!ELEMENT dc:type (#PCDATA)> <!ATTLIST dc:type xml:lang CDATA #IMPLIED> <!-- The physical or digital manifestation of the resource. --> <!ELEMENT dc:format (#PCDATA)> <!ATTLIST dc:format xml:lang CDATA #IMPLIED> <!-- An unambiguous reference to the resource within a given context. --> <!ELEMENT dc:identifier (#PCDATA)> <!ATTLIST dc:identifier xml:lang CDATA #IMPLIED> <!ATTLIST dc:identifier rdf:resource CDATA #IMPLIED> <!-- A Reference to a resource from which the present resource is derived. --> <!ELEMENT dc:source (#PCDATA)> <!ATTLIST dc:source xml:lang CDATA #IMPLIED> <!ATTLIST dc:source rdf:resource CDATA #IMPLIED> <!-- A language of the intellectual content of the resource. --> <!ELEMENT dc:language (#PCDATA)> <!ATTLIST dc:language xml:lang CDATA #IMPLIED> <!-- A reference to a related resource. --> <!ELEMENT dc:relation (#PCDATA)> <!ATTLIST dc:relation xml:lang CDATA #IMPLIED> <!ATTLIST dc:relation rdf:resource CDATA #IMPLIED> <!-- The extent or scope of the content of the resource. --> <!ELEMENT dc:coverage (#PCDATA)> <!ATTLIST dc:coverage xml:lang CDATA #IMPLIED> <!-- Information about rights held in and over the resource. --> <!ELEMENT dc:rights (#PCDATA)> <!ATTLIST dc:rights xml:lang CDATA #IMPLIED>
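As an illustration of how the `dc:` element names are used in practice, the sketch below builds a small Dublin Core description with Python's standard `xml.etree.ElementTree`. The values are borrowed from the ELIB record on slide 21. The wrapper element `metadata` and the helper `dc_record` are assumptions made for this example (the DTD above does not define them); the namespace URI is the standard one for the Dublin Core element set, version 1.1.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dc_record(fields):
    """Build a record from (element name, text) pairs, e.g. ("title", "...")."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


record = dc_record([
    ("title", "Water Conditions in California Report 2"),
    ("publisher", "California Department of Water Resources"),
    ("date", "March 1, 1993"),
    ("type", "bulletin"),
])
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Because every element is #PCDATA in the DTD, the record is flat: one element per statement, repeated as often as needed.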

  27. A More Complex SGML DTD <!DOCTYPE USMARC [ <!-- USMARC DTD. UCB-SLIS v.0.08 --> <!-- By Jerome P. McDonough, April 1, 1994 --> <!ELEMENT USMARC - - (Leader, Directry, VarFlds)> <!ATTLIST USMARC Material (BK|AM|CF|MP|MU|VM|SE) "BK" id CDATA #IMPLIED> <!-- Author's Note: the id attribute for the USMARC element is intended to hold a unique record number for each MARC record in the local database. That is to say, it is intended ONLY as an aid in maintaining the local database of MARC records --> <!ELEMENT Leader - O (LRL, RecStat, RecType, BibLevel, UCP, IndCount, SFCount, BaseAddr, EncLevel, DscCatFm, LinkRec, EntryMap)> <!ELEMENT Directry - O (#PCDATA)> <!ELEMENT VarFlds - O (VarCFlds, VarDFlds)> <!-- Component parts of Leader --> <!-- Logical Record Length --> <!ELEMENT LRL - O (#PCDATA)> …etc…

  28. More Complex DTD (cont.) <!-- Variable Data Fields --> <!ELEMENT VarDFlds - O (NumbCode, MainEnty?, Titles, EdImprnt?, PhysDesc?, Series?, Notes?, SubjAccs?, AddEnty?, LinkEnty?, SAddEnty?, HoldAltG?, Fld9XX?)> <!-- Component Parts of Variable Data Fields --> <!-- Numbers & Codes --> <!ELEMENT NumbCode - O (Fld010?, Fld011?, Fld015?, Fld017*, Fld018?, Fld019*, Fld020*, Fld022*, Fld023*, Fld024*, Fld025*, Fld027*, Fld028*, Fld029*, Fld030*, Fld032*, Fld033*, Fld034*, Fld035*, Fld036?, Fld037*, Fld039*, Fld040?, Fld041?, Fld042?, Fld043?, Fld044?, Fld045?, Fld046?, Fld047?, Fld048*, Fld050*, Fld051*, Fld052*, Fld055*, Fld060*, Fld061*, Fld066?, Fld069*, Fld070*, Fld071*, Fld072*, Fld074*, Fld080?, Fld082*, Fld084*, Fld086*, Fld088*, Fld090*, Fld096*)> <!-- Main Entries --> <!ELEMENT MainEnty - O (Fld100?, Fld110?, Fld111?, Fld130?)> <!-- Titles --> <!ELEMENT Titles - O (Fld210?, Fld211*, Fld212*, Fld214*, Fld222*, Fld240?, Fld242*, Fld243?, Fld245, Fld246*, Fld247*)> <!-- Edition, Imprint, etc. --> <!ELEMENT EdImprnt - O (Fld250?, Fld254?, Fld255*, Fld256?, Fld257?, Fld260?, Fld261?, Fld262?, Fld263?, Fld265?)> <!-- Physical Description, etc. --> <!ELEMENT PhysDesc - O (Fld300*, Fld305*, Fld306?, Fld310?, Fld315?, Fld321*, Fld340*, Fld350?, Fld351*, Fld355*, Fld357*, Fld362*)> …etc…

  29. Complex DTD (cont.) <!-- Title Statement --> <!ELEMENT Fld245 - O (Six?, (a|b|c|f|g|h|k|n|p|s)+)> <!ATTLIST Fld245 AddEnty (No|Yes|Blank) #IMPLIED NFChars (0|1|2|3|4|5|6|7|8|9|Blnk) #IMPLIED> …etc… <!-- Subfield Element Declarations --> <!ELEMENT a - O (#PCDATA)> <!ELEMENT b - O (#PCDATA)> <!ELEMENT c - O (#PCDATA)> <!ELEMENT d - O (#PCDATA)> <!ELEMENT e - O (#PCDATA)>

  30. Document Markup • All document markup is derived from the DTD for the particular document type • In SGML the DTD should be referenced in the document using the DOCTYPE declaration: <!DOCTYPE name SYSTEM "file_path" > or <!DOCTYPE name SYSTEM "file_path" [doctype_declaration_subset]> or <!DOCTYPE name [doctype_declaration_subset]> • The doctype_declaration_subset can be any combination of element, entity, and attribute declarations

  31. HTML • HTML was not originally "real" SGML; the DTD was invented after the language • It is often more concerned with the form of the output on the screen than with the structural contents of the HTML docs • Relies on the application (such as Netscape) to implement interesting actions like hypertext linking • XHTML is now a W3C "recommendation" that applies XML conventions to HTML, and provides a growing set of capabilities within an XML framework (our phones use XHTML)

  32. Lecture Overview • Review • XML and Document Engineering • Metadata And Markup • XML As A Metadata Lingua Franca • METS • SGML vs. XML DTD Construction • XML Schemas • XML For Protocols And Metadata Languages • Readings/Discussion

  33. What are XML Schemas? • An XML vocabulary for expressing your data's structure AND content types, and even the business rules involved in processing the data • Written in XML themselves • Support namespaces for combining multiple schemas in the same documents • The slides in this section are based on an XML tutorial by Roger L. Costello

  34. Example <location> <latitude>32.904237</latitude> <longitude>73.620290</longitude> <uncertainty units="meters">2</uncertainty> </location> • Is this data valid? To be valid, it must meet these constraints (data business rules): 1. The location must consist of a latitude, followed by a longitude, followed by an indication of the uncertainty of the lat/lon measurements. 2. The latitude must be a decimal with a value between -90 and +90. 3. The longitude must be a decimal with a value between -180 and +180. 4. For both latitude and longitude, the number of digits to the right of the decimal point must be exactly six. 5. The value of uncertainty must be a non-negative integer. 6. The uncertainty units must be either meters or feet. • We can express all these data constraints using XML Schemas
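To make the six constraints concrete, here is a hand-written check in Python. It is only a sketch of what an XML Schema validator would enforce for this one document type; it is not an XML Schema and does not use a schema processor. The function names `check_coordinate` and `is_valid_location` are invented for the example.

```python
import re
import xml.etree.ElementTree as ET
from decimal import Decimal

# Constraint 4: a decimal with exactly six digits after the decimal point.
SIX_FRACTION_DIGITS = re.compile(r"[+-]?[0-9]+\.[0-9]{6}")
# Constraint 5: a non-negative integer.
NON_NEGATIVE_INTEGER = re.compile(r"[0-9]+")


def check_coordinate(text: str, limit: int) -> bool:
    """Constraints 2-4: six fraction digits and a value within +/- limit."""
    if SIX_FRACTION_DIGITS.fullmatch(text) is None:
        return False
    return -limit <= Decimal(text) <= limit


def is_valid_location(xml_text: str) -> bool:
    loc = ET.fromstring(xml_text)
    # Constraint 1: latitude, then longitude, then uncertainty.
    if loc.tag != "location":
        return False
    if [child.tag for child in loc] != ["latitude", "longitude", "uncertainty"]:
        return False
    lat, lon, unc = loc
    return (
        check_coordinate(lat.text or "", 90)
        and check_coordinate(lon.text or "", 180)
        and NON_NEGATIVE_INTEGER.fullmatch(unc.text or "") is not None
        and unc.get("units") in ("meters", "feet")  # constraint 6
    )


sample = (
    "<location><latitude>32.904237</latitude>"
    "<longitude>73.620290</longitude>"
    '<uncertainty units="meters">2</uncertainty></location>'
)
print(is_valid_location(sample))
```

The point of XML Schemas is that rules like these are declared once in the schema instead of being re-coded in every application that reads the data.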

  35. Validating your data • [Diagram: the <location> instance from the previous slide and an XML Schema are both fed to an XML Schema validator, which reports "Data is ok!"] • The XML Schema tells the validator to: check that the latitude is between -90 and +90; check that the longitude is between -180 and +180; check that the fraction digits is 6 for lat and lon; ...

  36. Purpose of XML Schemas • Specify: • the structure of instance documents • "this element contains these elements, which contains these other elements, etc" • the datatype of each element/attribute • "this element shall hold an integer with the range 0 to 12,000" (DTDs don't do too well with specifying datatypes like this)

  37. Why Schemas? Motivation for XML Schemas • People are dissatisfied with DTDs • It's a different syntax • You write your XML (instance) document using one syntax and the DTD using another syntax, which is inconsistent • Limited datatype capability • DTDs support a very limited capability for specifying datatypes. You can't, for example, express "I want the <elevation> element to hold an integer with a range of 0 to 12,000" • Desire a set of datatypes compatible with those found in databases • DTDs support 10 datatypes; XML Schemas support 44+ datatypes

  38. Highlights of XML Schemas • XML Schemas are a tremendous advancement over DTDs: • Enhanced datatypes • 44+ versus 10 • Can create your own datatypes • Example: "This is a new type based on the string type and elements of this type must follow this pattern: ddd-dddd, where 'd' represents a digit". • Written in the same syntax as instance documents • less syntax to remember • Object-oriented'ish • Can extend or restrict a type (derive new type definitions on the basis of old ones) • Can express sets, i.e., can define the child elements to occur in any order
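The idea of deriving a new datatype by restricting an existing one can be sketched outside XML Schema as well. The Python below imitates two restrictions mentioned in these slides: a string type that must follow the pattern ddd-dddd, and an integer type limited to the range 0 to 12,000. The function names are made up for this illustration, and the code is an analogy, not XML Schema syntax.

```python
import re

# "ddd-dddd, where 'd' represents a digit": a restriction of the string type.
DDD_DDDD = re.compile(r"[0-9]{3}-[0-9]{4}")


def is_ddd_dddd(value: str) -> bool:
    """True if the whole string follows the ddd-dddd pattern."""
    return DDD_DDDD.fullmatch(value) is not None


def is_elevation(value: str) -> bool:
    """True if the text is an integer in the range 0 to 12,000 inclusive."""
    if re.fullmatch(r"[0-9]+", value) is None:
        return False
    return 0 <= int(value) <= 12000


print(is_ddd_dddd("642-1464"), is_ddd_dddd("6421464"))
print(is_elevation("5280"), is_elevation("12001"))
```

A DTD can only say that these elements hold #PCDATA; it has no way to state either restriction.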

  39. Highlights of XML Schemas • Can specify element content as being unique (keys on content) and uniqueness within a region • Can define multiple elements with the same name but different content • Can define elements with nil content • Can define substitutable elements - e.g., the "Book" element is substitutable for the "Publication" element.

  40. BookStore.dtd <!ELEMENT BookStore (Book)+> <!ELEMENT Book (Title, Author, Date, ISBN, Publisher)> <!ELEMENT Title (#PCDATA)> <!ELEMENT Author (#PCDATA)> <!ELEMENT Date (#PCDATA)> <!ELEMENT ISBN (#PCDATA)> <!ELEMENT Publisher (#PCDATA)>

  41. [Diagram] This is the vocabulary that DTDs provide to define your new vocabulary • DTD vocabulary: ELEMENT, ATTLIST, #PCDATA, ID, CDATA, NMTOKEN, ENTITY • Your new vocabulary: BookStore, Book, Title, Author, Date, ISBN, Publisher

  42. [Diagram: two namespaces] • http://www.w3.org/2001/XMLSchema: complexType, element, sequence, schema, boolean, string, integer. This is the vocabulary that XML Schemas provide to define your new vocabulary • http://www.books.org (targetNamespace): BookStore, Book, Title, Author, Date, ISBN, Publisher • One difference between XML Schemas and DTDs is that the XML Schema vocabulary is associated with a name (namespace). Likewise, the new vocabulary that you define must be associated with a name (namespace). With DTDs neither set of vocabulary is associated with a name (namespace) [DTDs pre-dated namespaces].

  43. BookStore.xsd (xsd = XML Schema Definition) <?xml version="1.0"?> <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified"> <xsd:element name="BookStore"> <xsd:complexType> <xsd:sequence> <xsd:element ref="Book" minOccurs="1" maxOccurs="unbounded"/> </xsd:sequence> </xsd:complexType> </xsd:element> <xsd:element name="Book"> <xsd:complexType> <xsd:sequence> <xsd:element ref="Title" minOccurs="1" maxOccurs="1"/> <xsd:element ref="Author" minOccurs="1" maxOccurs="1"/> <xsd:element ref="Date" minOccurs="1" maxOccurs="1"/> <xsd:element ref="ISBN" minOccurs="1" maxOccurs="1"/> <xsd:element ref="Publisher" minOccurs="1" maxOccurs="1"/> </xsd:sequence> </xsd:complexType> </xsd:element> <xsd:element name="Title" type="xsd:string"/> <xsd:element name="Author" type="xsd:string"/> <xsd:element name="Date" type="xsd:string"/> <xsd:element name="ISBN" type="xsd:string"/> <xsd:element name="Publisher" type="xsd:string"/> </xsd:schema>

  44. The BookStore.xsd schema from slide 43 expresses the same structure as BookStore.dtd: <!ELEMENT BookStore (Book)+> <!ELEMENT Book (Title, Author, Date, ISBN, Publisher)> <!ELEMENT Title (#PCDATA)> <!ELEMENT Author (#PCDATA)> <!ELEMENT Date (#PCDATA)> <!ELEMENT ISBN (#PCDATA)> <!ELEMENT Publisher (#PCDATA)>

  45. All XML Schemas have "schema" as the root element. In BookStore.xsd (slide 43) this is the element that follows the XML declaration: <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org" elementFormDefault="qualified">

  46. In BookStore.xsd (slide 43), the elements and datatypes that are used to construct schemas (schema, element, complexType, sequence, string) come from the http://…/XMLSchema namespace, which the schema element binds to the xsd prefix: xmlns:xsd="http://www.w3.org/2001/XMLSchema"

  47. XMLSchema Namespace • [Diagram] http://www.w3.org/2001/XMLSchema contains: complexType, element, sequence, schema, boolean, string, integer

  48. In BookStore.xsd (slide 43), targetNamespace="http://www.books.org" says that the elements defined by this schema (BookStore, Book, Title, Author, Date, ISBN, Publisher) are to go in this namespace

  49. Book Namespace (targetNamespace) • [Diagram] http://www.books.org (targetNamespace) contains: BookStore, Book, Title, Author, Date, ISBN, Publisher

  50. In BookStore.xsd (slide 43), xmlns="http://www.books.org" makes http://www.books.org the default namespace, which is the targetNamespace! • <xsd:element ref="Book" minOccurs="1" maxOccurs="unbounded"/> is referencing a Book element declaration. The Book in what namespace? Since there is no namespace qualifier, it is referencing the Book element in the default namespace, which is the targetNamespace! Thus, this is a reference to the Book element declaration in this schema.
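The namespace reasoning on this slide can be replayed in code. The sketch below parses an abbreviated copy of BookStore.xsd (only the BookStore declaration and the Title declaration are kept), collects the prefix-to-namespace bindings with `xml.etree.ElementTree.iterparse`, and resolves the values `Book` and `xsd:string` the way the slide describes. The helpers `namespace_map` and `resolve_qname` are hypothetical names written for this example.

```python
import io
import xml.etree.ElementTree as ET

# Abbreviated BookStore.xsd: same root attributes, fewer declarations.
SCHEMA_TEXT = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.books.org"
    xmlns="http://www.books.org"
    elementFormDefault="qualified">
  <xsd:element name="BookStore">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Book" minOccurs="1" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="Title" type="xsd:string"/>
</xsd:schema>"""


def namespace_map(xml_text: str) -> dict:
    """Collect prefix -> namespace URI bindings; the default prefix is ''."""
    nsmap = {}
    source = io.BytesIO(xml_text.encode("utf-8"))
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        nsmap.setdefault(prefix, uri)
    return nsmap


def resolve_qname(qname: str, nsmap: dict) -> tuple:
    """Split 'prefix:local' (or 'local') and look the prefix up in nsmap."""
    prefix, _, local = qname.rpartition(":")
    return nsmap.get(prefix), local


nsmap = namespace_map(SCHEMA_TEXT)
root = ET.fromstring(SCHEMA_TEXT)

# ref="Book" has no prefix, so it resolves through the default namespace.
print(resolve_qname("Book", nsmap))
# type="xsd:string" resolves through the xsd prefix.
print(resolve_qname("xsd:string", nsmap))
print(root.get("targetNamespace"))
```

Because the default namespace and the targetNamespace are the same URI, the unprefixed `Book` lands in the namespace where this schema's own declarations live.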
