Tamino – a DBMS Designed for XML

Tamino – a DBMS Designed for XML. Dr. Harald Schoning Presenter: Wenhui Li University of Ottawa Instructed by: Dr. Mengchi Liu Carleton University. Abstract. Who?- Software AG What?- XML database management system When? 1999 the first time unveiled 2004 June Tamino XML Server 4.2


Tamino – a DBMS Designed for XML

Dr. Harald Schoning

Presenter: Wenhui Li

University of Ottawa

Instructed by:

Dr. Mengchi Liu

Carleton University

Abstract
  • Who? Software AG
  • What? An XML database management system
  • When?
    • 1999: first unveiled
    • June 2004: Tamino XML Server 4.2
  • Why?
    • management and transfer of structured and unstructured data
    • completely designed for XML
Industry Background
  • XML is becoming the prevailing format for data processing on the Internet.
  • Early goals of Tamino
    • Easy data exchange
  • Evolution trend
    • Storing, managing, publishing and exchanging XML documents
    • Business modeling
Industry Background (cont’): XML support in databases
  • Oracle XML Developer’s Kit
  • SQL Server 2000
  • DB2 XML Extender
Limitations of XML support via traditional RDBMS or ORDB
  • XML is not as rigidly structured as data in an RDB, ORDB or OODB
  • Storing and querying XML in these DB systems is possible but impractical
Two Modeling approaches
  • Data-centric documents
    • Regular structure
    • Order does not matter
    • No mixed content
  • Document-centric documents
    • less regular structure
    • significance of the order
    • mixed content
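
The mixed-content criterion above can be checked mechanically. The following is a minimal sketch, not anything Tamino provides: the function name has_mixed_content and the two sample documents are made up for illustration, and it uses only Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

def has_mixed_content(root):
    """Return True if any element holds both non-blank text and child elements."""
    for elem in root.iter():
        children = list(elem)
        if not children:
            continue
        # Text before the first child, plus the text following each child (its tail).
        pieces = [elem.text] + [child.tail for child in children]
        if any(p and p.strip() for p in pieces):
            return True
    return False

# Hypothetical sample documents, one of each kind.
data_doc = ET.fromstring("<Order><Item>17</Item><Item>16</Item></Order>")
text_doc = ET.fromstring(
    "<text>X<italic>M</italic>L is <stressed>very</stressed> important</text>"
)

print(has_mixed_content(data_doc))  # False: data-centric style
print(has_mixed_content(text_doc))  # True: document-centric style
```

Mixed content is only one of the three criteria on the slide; regularity of structure and significance of order would need separate checks.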
Why not use a relational DB?
  • XML documents can have schema information (a DTD), but they are not required to.
  • Classical database handling, which assumes objects of a predefined type, cannot be applied to XML
Why not use XML itself?
  • XML is just a markup language; it does not contain processing facilities of its own
  • Querying a set of XML documents is outside the scope of the XML recommendation

Hence Tamino!

What does Tamino do?
  • What’s Tamino (the 1st slide)
  • Store XML documents, HTML files and GIF images, etc.
  • Retrieve them in a set-oriented manner, with sophisticated query facilities
The schema of XML documents
  • XML supports schema information, but in a different way from classical databases
  • DTDs have a couple of deficiencies (e.g. no data types)
  • A W3C working group is developing an XML schema description language
  • However, the DTD is the only standard schema language at present
XML schema vs. RDB and OODB schema
  • In an RDB or OODB, the schema is created before the instances can be stored
  • Instances must conform to the declared schema
  • In an XML database, each instance can declare a schema of its own.
  • For XML documents, grouping objects of homogeneous structure into (pre-defined) tables or classes doesn’t work
Query and Index of XML schema
  • Queries operate on sets
  • Indexes are defined on the basis of a common schema
  • For the purpose of querying, arbitrary objects could be grouped into sets
  • Index definition, however, requires at least a common subset in the structure
Schema handling in Tamino
  • Grouping documents by open content model + user-directed document grouping
  • Documents are grouped into collections
  • Within a collection, several document types can be declared
  • For each document type, a common schema is defined (open content model)
  • Tamino assigns each document to one of the document types
Type Assignment
  • Assignment is based on the root element type
  • A document must match the schema of its assigned document type, but may have additional elements/attributes
  • Within one document type, documents might differ considerably
  • If there is no appropriate document type, the document is stored without any schema checking
Document accepted by Tamino

<City Inhabitants="138000">
  <Name>Darmstadt</Name>
  <Addition>The city of art nouveau</Addition>
  <Monument Height="39m">
    <Name>Langer Ludwig</Name>
    <Location>
      <Name>Luisenplatz</Name>
      <MapIndex>M5</MapIndex>
    </Location>
  </Monument>
</City>
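
Type assignment with an open content model can be sketched as follows. This is an illustration under stated assumptions, not Tamino's implementation: the DOCUMENT_TYPES registry, the assign_type function and the rule that a City requires Name and Monument children are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: root element name -> child elements the schema requires.
DOCUMENT_TYPES = {"City": {"Name", "Monument"}}

def assign_type(xml_text):
    """Return (document type, status) for one incoming document."""
    root = ET.fromstring(xml_text)
    required = DOCUMENT_TYPES.get(root.tag)  # assignment by root element type
    if required is None:
        return None, "stored without schema checking"
    present = {child.tag for child in root}
    missing = required - present
    if missing:
        return root.tag, "rejected: missing " + ", ".join(sorted(missing))
    # Open content model: extra children such as <Addition> are tolerated.
    return root.tag, "accepted"

city = (
    '<City Inhabitants="138000"><Name>Darmstadt</Name>'
    '<Addition>The city of art nouveau</Addition>'
    '<Monument Height="39m"><Name>Langer Ludwig</Name></Monument></City>'
)
print(assign_type(city))       # ('City', 'accepted')
print(assign_type("<Note/>"))  # (None, 'stored without schema checking')
```

The unmodeled Addition element does not cause rejection, which is the point of the open content model.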

When should an element/attribute be modeled?
  • an index will be defined on this element/attribute
  • the element/attribute is to be mapped to an external data source or to a server extension
  • dedicated access rights will be defined on the element/attribute
  • the presence / multiplicity of the element is to be enforced
  • one of the above conditions holds for a child of the element
Indexing of Tamino
  • value-based indexes
    • well known from traditional database systems
    • used to accelerate searches
    • the indexed data object must be addressed exactly, because names need not be unique within a DTD
Example of value-based index
  • value-based indexes
    • data-centric view

<!ELEMENT City (Name, Inhabitants, Monument+)>

<!ELEMENT Monument (Name, Description)>

<!ELEMENT Inhabitants (#PCDATA)>

<!ELEMENT Name (#PCDATA)>

<!ELEMENT Description (#PCDATA)>
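
In this DTD, Name occurs under both City and Monument, so a value-based index has to be defined on a path rather than on a bare element name. A minimal sketch, assuming an in-memory dictionary as the index; build_value_index and the sample data are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def build_value_index(docs, path):
    """Map each text value found at `path` (relative to the root) to document ids."""
    index = defaultdict(set)
    for doc_id, xml_text in docs.items():
        root = ET.fromstring(xml_text)
        for elem in root.findall(path):
            index[(elem.text or "").strip()].add(doc_id)
    return index

# Hypothetical documents conforming to the DTD above.
docs = {
    1: "<City><Name>Darmstadt</Name><Inhabitants>138000</Inhabitants>"
       "<Monument><Name>Langer Ludwig</Name><Description>sample</Description>"
       "</Monument></City>",
    2: "<City><Name>Sampletown</Name><Inhabitants>1000</Inhabitants>"
       "<Monument><Name>Sample Tower</Name><Description>sample</Description>"
       "</Monument></City>",
}

city_names = build_value_index(docs, "Name")               # City/Name only
monument_names = build_value_index(docs, "Monument/Name")  # City/Monument/Name only

print(city_names["Darmstadt"])          # {1}
print("Langer Ludwig" in city_names)    # False
```

The two indexes stay separate even though both cover elements called Name.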

Indexing of Tamino (cont’)
  • text indexing
    • document-centric view
    • limit the scope to a specific part of the document
    • the scope might span element content
Example of text index
  • text indexing
    • document-centric view

<statement>
  <author>
    <firstname>Harald</firstname>
    <lastname>Schoning</lastname>
  </author>
  <text>
    X<italic>M</italic>L and X<italic>S</italic>L
    are <stressed>very</stressed> important
  </text>
</statement>
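
A text index whose scope is limited to one part of the document, yet spans nested markup, might look like this. It is a sketch only: build_text_index and the dictionary-based index are assumptions, and word splitting is a plain regular expression.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

def build_text_index(docs, scope):
    """Index the words inside every <scope> element, ignoring nested markup."""
    index = defaultdict(set)
    for doc_id, xml_text in docs.items():
        root = ET.fromstring(xml_text)
        for elem in root.iter(scope):
            # itertext() joins text across child elements, so
            # X<italic>M</italic>L is indexed as the single word "xml".
            text = "".join(elem.itertext())
            for word in re.findall(r"\w+", text.lower()):
                index[word].add(doc_id)
    return index

docs = {
    1: "<statement><author><firstname>Harald</firstname>"
       "<lastname>Schoning</lastname></author>"
       "<text>X<italic>M</italic>L and X<italic>S</italic>L "
       "are <stressed>very</stressed> important</text></statement>",
}

index = build_text_index(docs, "text")
print(index["xml"])       # {1}
print("harald" in index)  # False: the author part is outside the scope
```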

Indexing of Tamino (cont’)
  • structural index
    • useful if multiplicity permits the omission of elements
    • or if no DTD is known
  • Example
    • in a database of all European cities
    • search for all those cities which have an element called “beach”
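
The beach example can be sketched with a structural index that records which element names occur in which documents. The function build_structural_index and the two tiny city documents are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def build_structural_index(docs):
    """Map each element name to the ids of the documents in which it occurs."""
    index = defaultdict(set)
    for doc_id, xml_text in docs.items():
        for elem in ET.fromstring(xml_text).iter():
            index[elem.tag].add(doc_id)
    return index

# Made-up documents: only city "A" has a beach element.
cities = {
    "A": "<City><Name>Seaside</Name><beach/></City>",
    "B": "<City><Name>Inland</Name></City>",
}

structure = build_structural_index(cities)
print(structure["beach"])  # {'A'}
```

The query is answered from the index alone, without scanning documents that lack the element.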
Querying XML documents
  • Currently, there is no standardized query language for XML
  • XPath allows positioning within a single document
  • XPath fits the retrieval needs of data-centric environments well
  • Document-centric environments need a more content-based retrieval facility
  • Tamino also supports full-text search
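
XPath positioning within a single document can be tried with Python's standard library, which implements only a subset of XPath. This is not Tamino's query interface; the document is the City example from the slides.

```python
import xml.etree.ElementTree as ET

city = ET.fromstring(
    '<City Inhabitants="138000"><Name>Darmstadt</Name>'
    '<Monument Height="39m"><Name>Langer Ludwig</Name>'
    '<Location><Name>Luisenplatz</Name><MapIndex>M5</MapIndex></Location>'
    '</Monument></City>'
)

# Names of monuments that carry a Height attribute.
monument_names = [e.text for e in city.findall("./Monument[@Height]/Name")]

# Every Name element at any depth, in document order.
all_names = [e.text for e in city.findall(".//Name")]

print(monument_names)  # ['Langer Ludwig']
print(all_names)       # ['Darmstadt', 'Langer Ludwig', 'Luisenplatz']
```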
Expectations for the XML processor
  • W3C: the XML recommendation specifies how an XML processor handles entities, comments and processing instructions.
  • Users expect Tamino to leave comments intact, evaluate no processing instructions, and leave entity references unresolved.
  • Users also expect the output of a Tamino query to match the specification of an XML processor.
Why not leave entities unresolved?
  • A query result may be a set of (parts of) matching documents
  • The result DTD must then include all the different entity declarations of the original documents
  • The definition of an entity might differ from document to document
  • So entities that share a name but differ in definition must be renamed, and the entity references changed accordingly.
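
The renaming step can be sketched as follows. This is a hypothetical illustration, not Tamino's algorithm: merge_entities, the numeric suffix scheme and the sample entity co are all made up.

```python
import re

def merge_entities(fragments):
    """fragments: list of (entity declarations, text) pairs, one per source document.
    Returns (merged declarations, texts with entity references rewritten)."""
    merged = {}  # final entity name -> definition
    texts = []
    for decls, text in fragments:
        rename = {}
        for name, definition in decls.items():
            new_name, suffix = name, 1
            # Same name but a different definition: pick a fresh name.
            while new_name in merged and merged[new_name] != definition:
                suffix += 1
                new_name = f"{name}_{suffix}"
            merged[new_name] = definition
            rename[name] = new_name
        texts.append(re.sub(
            r"&(\w+);",
            lambda m: "&" + rename.get(m.group(1), m.group(1)) + ";",
            text,
        ))
    return merged, texts

fragments = [
    ({"co": "Software AG"}, "<p>&co; builds it</p>"),
    ({"co": "Carleton University"}, "<p>&co; teaches it</p>"),
]
merged, texts = merge_entities(fragments)
print(merged)  # {'co': 'Software AG', 'co_2': 'Carleton University'}
print(texts)   # ['<p>&co; builds it</p>', '<p>&co_2; teaches it</p>']
```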
Problems of external entities
  • These entities can change without the database system knowing about it
  • Thus, the values of external entities must not be included in indexes
  • Example:

<!ENTITY mysubject SYSTEM "http://www.softwareag.com/hottopic.xml">
...
<ticker>Today's hot topic: &mysubject;</ticker>

  • Checking the current contents of the external entity at query time would lead to unacceptable response times.
Relational Databases and XML
  • Major (object-)relational database systems include some form of XML support
  • The simplest form is to generate XML documents from existing relational data.
  • But real database handling of XML requires that XML data can be stored and retrieved
  • Two approaches
XML support approach (1)
  • The XML document is mapped to relational tables and their columns
  • Markup is ignored on storage and reconstructed on retrieval
  • Advantage of this approach:
    • the contents of an XML document can be handled with traditional SQL
XML support approach (1) cont’
  • Shortcomings:
    • The sequence information is lost

The order as stored:

<Order CustomerId="567" Date="12-12-2000">
  <Item ProductID="17" Quantity="2"/>
  <Item ProductID="16" Quantity="9"/>
  <Item ProductID="19" Quantity="8"/>
</Order>

The order as retrieved:

<Order CustomerId="567" Date="12-12-2000">
  <Item ProductID="16" Quantity="9"/>
  <Item ProductID="17" Quantity="2"/>
  <Item ProductID="19" Quantity="8"/>
</Order>
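
The loss of sequence can be reproduced with an in-memory SQLite table. This is a sketch under assumptions: the table layout is made up, and the position column is an addition that does not appear in the slides; it is included only to contrast retrieval with and without stored sequence information.

```python
import sqlite3
import xml.etree.ElementTree as ET

order_xml = (
    '<Order CustomerId="567" Date="12-12-2000">'
    '<Item ProductID="17" Quantity="2"/>'
    '<Item ProductID="16" Quantity="9"/>'
    '<Item ProductID="19" Quantity="8"/>'
    '</Order>'
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item (product_id INTEGER PRIMARY KEY, quantity INTEGER, position INTEGER)"
)
# Shred the document: one row per Item, markup discarded.
for pos, item in enumerate(ET.fromstring(order_xml)):
    conn.execute(
        "INSERT INTO item VALUES (?, ?, ?)",
        (int(item.get("ProductID")), int(item.get("Quantity")), pos),
    )

# Ordering by key, as a plain column mapping would: document order is gone.
by_key = [row[0] for row in conn.execute("SELECT product_id FROM item ORDER BY product_id")]
# Ordering by the extra position column restores document order.
by_position = [row[0] for row in conn.execute("SELECT product_id FROM item ORDER BY position")]

print(by_key)       # [16, 17, 19]
print(by_position)  # [17, 16, 19]
```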

XML support approach (1) cont’
  • For data-centric documents the sequence might not matter, but it does for document-centric ones
  • This approach loses all comments and processing instructions
  • Mixed content cannot be stored easily in this model
XML support approach (2)
  • Leaves the XML document intact and stores it in a large text field (“BLOB”)
  • Or even outside the database
  • Text search is possible
  • A text-based condition can be limited to a certain part of the document
XML support approach (2) cont’
  • Limitations:
    • No structure-aware combinations are possible
    • Value-based search is not supported on these text fields
      • IBM solution: side tables
      • But direct manipulation of side tables destroys the consistency of the database
    • Security can be defined at document level only, not on elements or attributes
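
The side-table idea and its consistency problem can be sketched with SQLite. This does not use IBM's actual product API: the tables doc and city_side and the store function are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doc (id INTEGER PRIMARY KEY, xml TEXT)")            # intact document
conn.execute("CREATE TABLE city_side (doc_id INTEGER, inhabitants INTEGER)")  # side table

def store(doc_id, xml_text):
    """Store the document as text and copy one value into the side table."""
    conn.execute("INSERT INTO doc VALUES (?, ?)", (doc_id, xml_text))
    inhabitants = int(ET.fromstring(xml_text).get("Inhabitants"))
    conn.execute("INSERT INTO city_side VALUES (?, ?)", (doc_id, inhabitants))

store(1, '<City Inhabitants="138000"><Name>Darmstadt</Name></City>')

# Value-based search goes through the side table, not the text field.
big_cities = [r[0] for r in conn.execute(
    "SELECT doc_id FROM city_side WHERE inhabitants > 100000")]

# Direct manipulation of the side table: the stored document is not updated.
conn.execute("UPDATE city_side SET inhabitants = 5 WHERE doc_id = 1")
stored_xml = conn.execute("SELECT xml FROM doc WHERE id = 1").fetchone()[0]
in_document = int(ET.fromstring(stored_xml).get("Inhabitants"))
in_side_table = conn.execute(
    "SELECT inhabitants FROM city_side WHERE doc_id = 1").fetchone()[0]

print(big_cities)                    # [1]
print(in_document == in_side_table)  # False: the two copies now disagree
```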
Summary
  • Tamino was designed specifically for XML
  • Schema handling for XML differs from schema handling in relational databases
  • External entities cause conceptual problems
  • Value-based indexes are useful for XML, as are text indexes and structural indexes
  • Comments and processing instructions should be preserved when documents are stored
  • The result of a query against an XML database should be XML
Q&A

Thanks!