Lifecycle Metadata for Digital Objects

September 27, 2004

Implementing Metadata in XML

What constitutes the XML environment?

  • XML editor (note that it can’t do anything automatic until you load a DTD or schema or have entered a number of elements)

  • XML parser/validator

  • Display program (e.g. browser)

  • DTD or schema to define elements

  • Style sheet for display of elements

  • XSLT engine to convert to other formats (e.g. database)

Tools for home use

  • In class we will be using XMetaL Author, but it’s not free (there is a trial download if you are a registered Corel user).

  • One free XML authoring environment is Amaya from the W3C.

  • Another is XML Cooktop.

  • You can also validate individual XML files using online validation services.

Review of “orders” of data

  • First-order: language (segmentation)

  • Second-order: encoding

  • Third-order: meaning

  • Fourth-order: function

  • Fifth-order: groups of third- and/or fourth-order data

  • Note that each order is “meta” with respect to the one below and “data” with respect to the one above (cf. Gödel)

  • Hence you mark up the order you wish to objectivize and access (examples: TEI, EAD)

XML does nothing

  • XML structures information

  • XML stores information

  • XML sends information

  • XML is not procedural

Two fancy wrappers (what orders are involved?)

  • The XML document as metadata repository

    • XML document contains all the metadata

    • Objects themselves are in separate files pointed to by the document (XLinks)

  • The XML document as the whole enchilada

    • Object is marked up in XML too

    • Metadata is added as additional elements to the original object
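
The two wrappers can be sketched with Python's standard xml.etree.ElementTree module. This is only an illustration: the element names (record, title, object, poem, line) and the file path are hypothetical, not taken from the course materials.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"  # the standard XLink namespace URI

# Wrapper 1: the XML document holds only metadata; the object itself
# lives in a separate file that the document points to (hypothetical path).
repository = ET.fromstring(
    '<record xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<title>Harbor at dawn</title>'
    '<object xlink:type="simple" xlink:href="images/harbor.tif"/>'
    '</record>'
)
pointer = repository.find("object").get(f"{{{XLINK}}}href")

# Wrapper 2: the (text) object is itself marked up in XML, and the
# metadata sits alongside it as additional elements.
enchilada = ET.fromstring(
    '<record>'
    '<title>Two-line poem</title>'
    '<poem><line>First line</line><line>Second line</line></poem>'
    '</record>'
)
lines = [line.text for line in enchilada.iter("line")]

print(pointer)
print(lines)
```

In the first case a program has to follow the pointer to reach the object; in the second, the object's content is available from the same parse as its metadata.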

Why not mark up the object (i.e., place markup within the object)?

  • If the object is not a text!

  • If the object is a text, but the text is too complex to mark up in XML (hierarchical model doesn’t suit everything; “overlap” problem)

Why mark up the object itself?

  • If the object is a text

  • If the text is well-formed as a hierarchical structure (problem of overlaps not solved in XML)

  • Advantage is that the object carries its own metadata

Best of both worlds

  • XML metadata tags

  • (Text) object marked up in XML

  • Original (text) object pointed to in separate file for preservation

XML Syntax rules for well-formed XML

  • An element containing text or elements must have start and end tags

  • An empty element’s tag must have a slash (/) before the end bracket

  • All attribute values must be in quotes

  • Elements may not overlap

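  • Isolated markup characters (< and &) may not appear in parsed content; they must be escaped as &lt; and &amp;

  • Element names may use only certain characters (no spaces, and they may not begin with a digit), and case is significant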
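
One way to try these rules out is to hand small samples to a parser and see which ones it rejects. The sketch below uses Python's standard xml.etree.ElementTree; the helper name is_well_formed and the sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples, one per rule on the slide.
samples = {
    "start and end tags": "<note><to>Ann</to></note>",
    "missing end tag": "<note><to>Ann</note>",
    "empty element with slash": "<note><br/></note>",
    "unquoted attribute value": "<note lang=en/>",
    "overlapping elements": "<b><i>text</b></i>",
    "isolated ampersand": "<note>salt & pepper</note>",
    "case mismatch in tags": "<Note>text</note>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'rejected'}")
```

Only the first and third samples are accepted. Note that this checks well-formedness only, not validity against a DTD or schema.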

Structure of the XML Document

  • Document prologue

    • XML declaration

    • Document type declaration

      • Names the root element

      • Points to an external DTD (the external subset)

      • Lists locally defined declarations (the internal subset)

  • Document itself

    • Bracketed by root element

    • Contains elements, attributes, entities

    • Nested structure

XML Declaration

  • Gives version of XML

    • <?xml version="1.0"?>

  • Defines character encoding

    • <?xml version="1.0" encoding="UTF-8"?>

  • Indicates whether the document relies on external markup declarations

    • <?xml version="1.0" encoding="UTF-8" standalone="no"?>
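
To see what a parser makes of the declaration, the following sketch uses the Expat bindings in Python's standard library (xml.parsers.expat) and its XmlDeclHandler callback. The dictionary name declaration and the one-element document are illustrative only.

```python
from xml.parsers import expat

declaration = {}


def on_xml_decl(version, encoding, standalone):
    # Expat reports standalone as 1 for "yes", 0 for "no",
    # and -1 when the standalone clause is omitted.
    declaration["version"] = version
    declaration["encoding"] = encoding
    declaration["standalone"] = standalone


parser = expat.ParserCreate()
parser.XmlDeclHandler = on_xml_decl
parser.Parse(
    b'<?xml version="1.0" encoding="UTF-8" standalone="no"?><example/>',
    True,
)

print(declaration)
```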

Document type declaration

  • Names the root element first

    • <!DOCTYPE example>

  • Then points to any external source for the definition of document structure

    • <!DOCTYPE example SYSTEM "c:\My Documents\classes\metadata\example.dtd"…>

  • Then adds any local declarations (the internal subset); entity and attribute declarations made there take precedence over the external DTD
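
The three parts of the document type declaration can be observed with Expat's StartDoctypeDeclHandler callback from Python's standard library. This is a sketch under assumptions: the relative system identifier example.dtd stands in for a real path, and Expat as configured here only records the identifier; it does not fetch or read the external file.

```python
from xml.parsers import expat

document = b"""<!DOCTYPE example SYSTEM "example.dtd" [
  <!ENTITY title "Temporary crazy title">
]>
<example/>"""

doctype = {}


def on_doctype(name, system_id, public_id, has_internal_subset):
    doctype["root"] = name                      # the root element name
    doctype["system_id"] = system_id            # pointer to the external DTD
    doctype["public_id"] = public_id            # None for a SYSTEM DTD
    doctype["internal_subset"] = bool(has_internal_subset)


parser = expat.ParserCreate()
parser.StartDoctypeDeclHandler = on_doctype
parser.Parse(document, True)

print(doctype)
```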

Function of the DTD

  • DTD stands for Document Type Definition; it is not itself written in XML syntax

  • Defines the language in which you will be talking about objects and against which the XML markup may be validated: it is the grammar of the XML document that refers to it

  • Comparable to declaring data types in a programming language; allows you to define your own types (a private, or SYSTEM, DTD)

  • Or you can use a preexisting DTD (a PUBLIC DTD, example: EAD)

Element declarations in the DTD

  • Occur in the external DTD or in the document’s internal subset (each element type may be declared only once)

    • <!ELEMENT name content-model>

  • Content-models:

    • (#PCDATA) for character data

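    • (element, element, element…) for child elements, where , means sequence and | means choice, and each item may be modified by ? (optional), + (one or more), or * (zero or more)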
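
As an illustration of content models, the sketch below puts a small, made-up DTD (a memo with to, subject, para, list, and item elements) in the internal subset and lets Expat, via Python's standard library, report each element declaration it reads.

```python
from xml.parsers import expat

# Hypothetical DTD: one or more <to>, an optional <subject>,
# then any number of <para> or <list> elements in any order.
document = b"""<!DOCTYPE memo [
  <!ELEMENT memo (to+, subject?, (para | list)*)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT subject (#PCDATA)>
  <!ELEMENT para (#PCDATA)>
  <!ELEMENT list (item+)>
  <!ELEMENT item (#PCDATA)>
]>
<memo><to>Ann</to><subject>Budget</subject><para>Draft attached.</para></memo>"""

declared = []


def on_element_decl(name, model):
    # model is Expat's parsed form of the content model;
    # only the declared element name is kept here.
    declared.append(name)


parser = expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.Parse(document, True)

print(declared)
```

Expat reads the declarations but is a non-validating parser, so it would also accept a memo that broke this content model. Checking the document against the DTD requires a validating parser.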

Attribute declarations in the DTD

  • All attributes for one element declared in an attribute list

  • Gives each attribute’s name, data type, and default behavior (#REQUIRED, #IMPLIED, #FIXED, or a default value)

    • <!ATTLIST elementname

      attname1 atttype1 attdesc1

      attname2 atttype2 attdesc2>
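
A concrete attribute list might look like the one in the sketch below, which uses Expat's AttlistDeclHandler callback from Python's standard library to list what was declared. The image element and its src, format, and caption attributes are invented for the example.

```python
from xml.parsers import expat

document = b"""<!DOCTYPE image [
  <!ELEMENT image EMPTY>
  <!ATTLIST image
    src     CDATA         #REQUIRED
    format  (gif | jpeg)  "gif"
    caption CDATA         #IMPLIED>
]>
<image src="logo.gif"/>"""

attributes = []


def on_attlist_decl(elname, attname, att_type, default, required):
    # Called once per declared attribute, not once per ATTLIST.
    attributes.append((elname, attname, att_type, default, bool(required)))


parser = expat.ParserCreate()
parser.AttlistDeclHandler = on_attlist_decl
parser.Parse(document, True)

for entry in attributes:
    print(entry)
```

Each tuple holds the element name, attribute name, declared type, default value (None when there is none), and whether the attribute is required.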


Entity declarations in the DTD

  • General entities are like variables: each declaration assigns a name to a value. Examples:

  • quoted text <!ENTITY title "Temporary crazy title">

  • text from an external source

  • other data from an external source <!ENTITY logo SYSTEM "images/logo.gif" NDATA gif>

Elements in the XML document

  • Container elements (element tags bracket data)

    • <name attribute="value">chardata</name>

  • Empty elements (no data is contained; begin and end element tags are collapsed to one)

    • <name attribute="value" />

Attributes in the XML document

  • Used to provide more details about an element

  • <elementname attname="value">
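
The difference between container elements, empty elements, and attributes shows up clearly once a document is parsed. The following sketch uses Python's standard xml.etree.ElementTree; the record, title, and image names and the file names are hypothetical.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<record>'
    '<title lang="en">Harbor at dawn</title>'  # container element with an attribute
    '<image src="harbor.tif"/>'                # empty element, collapsed tag
    '<image src="harbor-thumb.gif"></image>'   # also empty, written with two tags
    '</record>'
)

title = root.find("title")
images = root.findall("image")

print(title.text, title.get("lang"))
for image in images:
    # An empty element has no text and no children; its details are in attributes.
    print(image.get("src"), image.text, len(image))
```

Both spellings of the empty element come out the same after parsing: no text, no children, and one src attribute.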

Entities in the XML document

  • The “entity” behaves like a “variable”

  • Within the document, the entity name is used preceded by an ampersand and followed by a semicolon:

    • <greeting> Dear &name;, </greeting>

  • When the document is parsed, the entity’s declared value is substituted for the reference
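
Entity substitution can be watched in action with Python's standard xml.etree.ElementTree, which expands general entities declared in the internal subset. In this sketch the entity value "Professor Smith" is made up, and the second parse shows what happens when the entity has not been declared.

```python
import xml.etree.ElementTree as ET

declared = """<!DOCTYPE greeting [
  <!ENTITY name "Professor Smith">
]>
<greeting>Dear &name;,</greeting>"""

# The parser replaces &name; with the declared value.
greeting = ET.fromstring(declared)
print(greeting.text)

# Without a declaration, the same reference makes the document unparseable.
try:
    ET.fromstring("<greeting>Dear &name;,</greeting>")
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False
print(undeclared_ok)
```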

Tools for working with XML

  • Authoring, display

    • Amaya (free W3C browser/authoring software)

    • XML Cooktop (free XML authoring software)

  • Display

    • Internet Explorer

    • Netscape 6

    • Mozilla

  • Database

    • Apache Xindice

XML Cooktop editor screenshot

Amaya screenshot

How does all this relate to databases?

  • By defining a “language” for markup in XML, you create categories

  • Even freely-occurring objects can thus be found and grouped (e.g., TEI grammatical markup)

  • Compare to accepted method of placing text in a relational table in order to process it

  • Especially useful for regularly-occurring metadata

  • This is why the structure of a markup scheme is so important: you get what you pay for
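
The point about categories can be made concrete with a short query. In the sketch below, a sentence carries simple grammatical markup loosely modelled on TEI; the element names s and w and the type values are assumptions for illustration, not a faithful TEI encoding. Python's standard xml.etree.ElementTree then finds and groups words by category, much as a database query would.

```python
import xml.etree.ElementTree as ET

sentence = ET.fromstring(
    '<s>'
    '<w type="noun">Metadata</w> <w type="verb">describes</w> '
    '<w type="adj">digital</w> <w type="noun">objects</w>'
    '</s>'
)

# "Select" every word marked as a noun.
nouns = [w.text for w in sentence.findall("w[@type='noun']")]

# "Group by" the category carried in the type attribute.
by_type = {}
for w in sentence.iter("w"):
    by_type.setdefault(w.get("type"), []).append(w.text)

print(nouns)
print(by_type)
```

Only what the markup scheme names can be retrieved this way: a category that was never marked up cannot be queried.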
