XML and Databases

Ronald Bourret
http://www.rpbourret.com


Overview

  • Is XML a Database?

  • Why Use XML with Databases?

  • Data vs. Documents

  • Storing and Retrieving Data

  • Storing and Retrieving Documents



Is XML a database?

  • This is really two questions

    • Is an XML document a database?

    • Are XML and its surrounding technologies a database management system (DBMS)?


Is an XML document a database?

  • Yes, it is a collection of data

  • Pros

    • Self-describing

    • Portable (Unicode)

    • Can store directed graphs

  • Cons

    • Slow access

    • Verbose


Are XML and surrounding technologies a DBMS?

  • Yes, they have:

    • Data storage (XML documents)

    • Schemas (DTDs, XML Schemas, RELAX, etc.)

    • Query languages (XPath, XQuery, XQL, etc.)

    • APIs (SAX, DOM)


Are XML and surrounding technologies a DBMS? (cont.)

  • No, they don’t have:

    • Separation of logical and physical data

    • Efficient storage

    • Indexes

    • Transactions

    • Multi-user access

    • Security

    • ...


Using XML as a database

  • Good for small, single-user databases

    • .ini files

    • Simple address book

    • List of browser bookmarks

    • Catalog of MP3s stolen with the help of Napster

  • Almost useless for large or multi-user databases
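As a rough illustration of the small, single-user case, the sketch below treats an XML string as a tiny address book and queries it with Python's standard `xml.etree.ElementTree`. The `AddressBook`, `Entry`, `Name`, and `City` element names and the sample entries are invented for this example. Note that every query scans the whole document, which is one reason this approach does not scale.

```python
import xml.etree.ElementTree as ET

# Hypothetical address book; element names and entries are invented for this sketch.
ADDRESS_BOOK = """
<AddressBook>
  <Entry><Name>Ana</Name><City>Chicago</City></Entry>
  <Entry><Name>Raj</Name><City>Boston</City></Entry>
  <Entry><Name>Li</Name><City>Chicago</City></Entry>
</AddressBook>
"""

def names_in_city(xml_text, city):
    """Return the names of all entries in the given city."""
    root = ET.fromstring(xml_text)
    # There is no index: each query walks every Entry in the document.
    return [
        entry.findtext("Name")
        for entry in root.findall("Entry")
        if entry.findtext("City") == city
    ]

chicago_names = names_in_city(ADDRESS_BOOK, "Chicago")
```

There are also no transactions or locking here; two programs rewriting the same file at once could simply overwrite each other's changes.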



Why use XML with databases?

  • Expose legacy data as XML

  • Transfer data between databases

  • Integrate data from a variety of sources

  • Store semi-structured data

  • Queue e-commerce messages

  • Manage and query large document collections



Data vs. documents

  • Are you storing documents or the data in them?

    <Address>
      <Street>123 Main St.</Street>
      <City>Chicago</City>
      <State>IL</State>
      <PostCode>60609</PostCode>
      <Country>USA</Country>
    </Address>

    Yellow = Data
    White + Yellow = Document

  • Helps determine the system you need

  • Look at your XML documents to decide


Data-centric documents

  • Use XML primarily as a data transport

  • Designed for machine consumption

  • Sales orders, scientific data, dynamic Web pages

  • Characteristics

    • Regular structure

    • Fine-grained data

    • Little or no mixed content

    • Sibling order not significant


Example: Sales order

<Order>
  <Number>1234</Number>
  <Customer>Gallagher Industries</Customer>
  <Date>29.10.00</Date>
  <Item Number="1">
    <Part>A-10</Part>
    <Quantity>12</Quantity>
    <Price>10.95</Price>
  </Item>
  <Item Number="2">
    <Part>B-43</Part>
    <Quantity>600</Quantity>
    <Price>3.99</Price>
  </Item>
</Order>


Example: Dynamic Web page

<html>

<head>

<title>Flight Schedule: SFO to FRA</title>

</head>

<body>

<p>Daily flights from SFO to FRA</p>

<table>

<tr><th>Airline</th><th>Num</th><th>Depart</th><th>Arrive</th></tr>

<tr><td>Air France</td><td>527</td><td>12:00</td><td>10:33</td></tr>

<tr><td>Lufthansa</td><td>459</td><td>13:55</td><td>10:05</td></tr>

<tr><td>American</td><td>385</td><td>14:17</td><td>11:48</td></tr>

<tr><td>Delta</td><td>99</td><td>15:30</td><td>14:02</td></tr>

</table>

</body>

</html>


Document-centric documents

  • Designed for human consumption

  • Use XML to provide structure, metadata

  • Books, presentations, email, static Web pages

  • Characteristics

    • Irregular or semi-regular structure

    • Large-grained data

    • Lots of mixed content

    • Sibling order significant


Example: Product description

<Product>

<Para><Name>XML-DBMS</Name> is <Summary>middleware for transferring data between XML documents and relational databases</Summary>. It is written by <Developer>Ronald Bourret</Developer>.</Para>

<Para>XML-DBMS uses an object-relational mapping in which complex element types are viewed as classes and simple element types, PCDATA, and attributes, as well as references to complex types, are viewed as properties.</Para>

<Para>You can:

<List>

<Item><Link URL="Readme.htm">Read more about XML-DBMS</Link></Item>

<Item><Link URL="jxmldbms.zip">Download Java version</Link></Item>

<Item><Link URL="pxmldbms.zip">Download PERL version</Link></Item>

</List>

</Para>

</Product>
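One rough way to "look at your XML documents" is to measure how much mixed content they contain, since data-centric documents have little or none and document-centric documents have a lot. The sketch below is only a heuristic under that assumption; the function name `mixed_content_ratio` and the two shortened sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET

def mixed_content_ratio(xml_text):
    """Fraction of parent elements that also hold non-whitespace text."""
    root = ET.fromstring(xml_text)
    parents = 0
    mixed = 0
    for elem in root.iter():
        children = list(elem)
        if not children:
            continue  # leaf elements cannot have mixed content
        parents += 1
        # Text directly inside elem is elem.text plus the tail of each child.
        pieces = [elem.text or ""] + [child.tail or "" for child in children]
        if any(piece.strip() for piece in pieces):
            mixed += 1
    return mixed / parents if parents else 0.0

# Shortened, hypothetical versions of the two example styles.
order = "<Order><Number>1234</Number><Item><Part>A-10</Part></Item></Order>"
para = ("<Product><Para><Name>XML-DBMS</Name> is "
        "<Summary>middleware</Summary>.</Para></Product>")

order_ratio = mixed_content_ratio(order)
para_ratio = mixed_content_ratio(para)
```

A ratio near zero suggests a data-centric document; a high ratio suggests a document-centric one. Real collections often fall in between, so this is a hint rather than a classifier.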


Storing data and documents

  • Store data in traditional database

    • Use a native XML database under certain conditions

  • Store documents in native XML database

    • Use a traditional database under certain conditions

  • Boundary between data and documents not always clear in practice


Storing and Retrieving Data


Goals and non-goals

  • Goals

    • Preserve data and hierarchical order

    • Optionally preserve sibling order

    • One- or two-way data transfer

  • Non-goals

    • Preserve physical structure (entity use, encodings, ...)

    • Preserve DTD, comments, processing instructions...

    • Preserve document identity
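A small sketch of what these goals and non-goals look like in practice, using Python's standard `xml.etree.ElementTree` as a stand-in for data transfer software. The sample document is invented. With the default parser, the data and element hierarchy survive, while the comment and the CDATA section (physical structure) do not.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document with a comment and a CDATA section.
SOURCE = "<Order><!-- rush --><Part><![CDATA[A&B]]></Part></Order>"

root = ET.fromstring(SOURCE)        # the default parser discards comments
part_text = root.find("Part").text  # CDATA content arrives as ordinary text

# Serializing again keeps data and hierarchy but not the original markup choices.
round_trip = ET.tostring(root, encoding="unicode")
```

The round-tripped text carries the same data as the original but is not the same document, which is acceptable when only the data matters.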


Data transfer software

  • May be middleware or integrated into DBMS

  • If integrated, DBMS is said to be XML-enabled


Mapping data in XML documents to databases

  • Most common mapping strategies

    • Template-driven

    • Model-driven

  • No mapping needed for native XML databases


Template-driven mappings

  • Commands embedded in template

  • Extremely flexible

    • Retrieve data with SQL or other query language

    • Place values almost anywhere in document

    • Parameterize subsequent SQL statements

    • Programming constructs such as if-then-else and for

  • Transfer from database to XML only


Example: Template

<?xml version="1.0"?>

<FlightInfo>

<Intro>The following flights have available seats:</Intro>

<SelectStmt>SELECT Airline, FltNumber, Depart, Arrive

FROM Flights</SelectStmt>

<Conclude>We hope one of these meets your needs.</Conclude>

</FlightInfo>


Example: Output

<?xml version="1.0"?>

<FlightInfo>

<Intro>The following flights have available seats:</Intro>

<Flights>

<Row>

<Airline>ACME</Airline>

<FltNumber>123</FltNumber>

<Depart>Dec 12, 1998 13:43</Depart>

<Arrive>Dec 13, 1998 01:21</Arrive>

</Row>

...

</Flights>

<Conclude>We hope one of these meets your needs.</Conclude>

</FlightInfo>
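The template and output above can be approximated with a few lines of Python. This is a minimal sketch, not how any particular product works: it assumes an in-memory SQLite database, handles only `SelectStmt` elements that are direct children of the root, and uses a shortened query with two columns. The function name `expand_template` and the `rowset_tag` parameter are invented; the `Flights` and `Row` wrapper names mirror the example output.

```python
import sqlite3
import xml.etree.ElementTree as ET

TEMPLATE = """<FlightInfo>
<Intro>The following flights have available seats:</Intro>
<SelectStmt>SELECT Airline, FltNumber FROM Flights</SelectStmt>
<Conclude>We hope one of these meets your needs.</Conclude>
</FlightInfo>"""

def expand_template(template, conn, rowset_tag="Flights"):
    """Replace each top-level SelectStmt with the query's rows as XML."""
    root = ET.fromstring(template)
    for index, child in enumerate(list(root)):
        if child.tag != "SelectStmt":
            continue
        cursor = conn.execute(child.text)
        columns = [description[0] for description in cursor.description]
        rowset = ET.Element(rowset_tag)
        for record in cursor:
            row = ET.SubElement(rowset, "Row")
            for name, value in zip(columns, record):
                # Column names become element names; values become text.
                ET.SubElement(row, name).text = str(value)
        rowset.tail = child.tail  # keep the surrounding whitespace
        root[index] = rowset      # swap the command for its result
    return root

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Flights (Airline TEXT, FltNumber INTEGER)")
conn.execute("INSERT INTO Flights VALUES ('ACME', 123)")

result = expand_template(TEMPLATE, conn)
output_xml = ET.tostring(result, encoding="unicode")
```

The static `Intro` and `Conclude` elements pass through untouched, which is what lets a template place query results almost anywhere in a document. The transfer runs in one direction only, from database to XML.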


Model-driven mappings

  • Two mappings are common

    • Table-based

    • Object-relational

  • Data transferred according to model

  • Two-way data transfer

  • Simpler than templates, but less flexible

  • Often used with XSLT


Table-based mapping

  • Map document with “table” structure to RDBMS

<database>

<table1>

<row>

<column1>value 1</column1>

<column2>value 2</column2>

...

</row>

...

</table1>

<table2>

...

</table2>

...

</database>

Table1: Column1, Column2, ...

Table2: Column1, Column2, ...


Pros and cons

  • Pros

    • Easy to understand

    • Code is simple and fast

    • Useful for serializing databases

  • Cons

    • Only works on a small subset of XML documents
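A table-based mapping is simple enough to sketch in a few lines. The example below assumes the document follows the database/table/row/column shape shown earlier, that the target tables already exist, and that an in-memory SQLite database stands in for the RDBMS. The `Flights` table and its rows are hypothetical sample data, and `load_tables` is an invented name.

```python
import sqlite3
import xml.etree.ElementTree as ET

DOC = """<database>
  <Flights>
    <row><Airline>ACME</Airline><FltNumber>123</FltNumber></row>
    <row><Airline>Zenith</Airline><FltNumber>456</FltNumber></row>
  </Flights>
</database>"""

def load_tables(xml_text, conn):
    """Insert each <row> into the table named by its parent element."""
    root = ET.fromstring(xml_text)
    for table in root:
        for row in table.findall("row"):
            columns = [column.tag for column in row]
            values = [column.text for column in row]
            placeholders = ", ".join("?" for _ in columns)
            # Identifiers come from the document, so they are quoted here;
            # a real tool would check them against a known schema instead.
            column_list = ", ".join(f'"{name}"' for name in columns)
            conn.execute(
                f'INSERT INTO "{table.tag}" ({column_list}) VALUES ({placeholders})',
                values,
            )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Flights (Airline TEXT, FltNumber INTEGER)")
load_tables(DOC, conn)
rows = conn.execute(
    "SELECT Airline, FltNumber FROM Flights ORDER BY FltNumber"
).fetchall()
```

The code is short because the document structure already mirrors the tables. Any document that does not have this shape cannot be loaded this way, which is the limitation noted above.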


Object-relational mapping

  • Map XML document to objects...

Order
    Customer
    Item
        Part

<Order SONumber="12345">

<Customer CustNumber="543">

...

</Customer>

<OrderDate>150999</OrderDate>

<Item LineNumber="1">

<Part Name="Cherries">

...

</Part>

<Qty Unit="ton">2</Qty>

</Item>

</Order>


Object-relational mapping (cont.)

  • ... and objects to tables

Orders: Number, Customer, ...

Items: OrderNumber, ItemNumber, Part, ...

Customers: ...

Parts: ...

Order
    Customer
    Item
        Part


Objects are data-specific...

  • Different for each DTD (schema)

  • Model the content (data) of the document

Order
    Customer
    Item
        Part

<Order SONumber="12345">

<Customer CustNumber="543">

...

</Customer>

<OrderDate>150999</OrderDate>

<Item LineNumber="1">

<Part Name="Cherries">

...

</Part>

<Qty Unit="ton">2</Qty>

</Item>

</Order>


... not the DOM

  • Same for all XML documents

  • Model the structure of the document

Element (Order)
    Attr (SONumber)
    Element (Customer) ...
    Element (OrderDate) ...
    Element (Item) ...

<Order SONumber="12345">

<Customer CustNumber="543">

...

</Customer>

<OrderDate>150999</OrderDate>

<Item LineNumber="1">

<Part Name="Cherries">

...

</Part>

<Qty Unit="ton">2</Qty>

</Item>

</Order>


Pros and cons

  • Pros

    • Can handle any XML document

    • Maps well to existing data structures

  • Cons

    • Very inefficient for mixed content
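The two steps of an object-relational mapping (XML to objects, then objects to tables) might be sketched as follows. Everything here is an assumption made for illustration: the `Order` and `Item` dataclasses, the `Quantity` column, the second line item, and the use of an in-memory SQLite database. The `Orders` and `Items` table names and their leading columns follow the slides.

```python
import sqlite3
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

@dataclass
class Item:
    line_number: int
    part: str
    quantity: int

@dataclass
class Order:
    number: int
    customer: int
    items: list = field(default_factory=list)

ORDER_XML = """<Order SONumber="12345">
  <Customer CustNumber="543"/>
  <Item LineNumber="1"><Part Name="Cherries"/><Qty Unit="ton">2</Qty></Item>
  <Item LineNumber="2"><Part Name="Plums"/><Qty Unit="ton">5</Qty></Item>
</Order>"""

def xml_to_order(xml_text):
    """Step 1: complex element types become objects, simple ones properties."""
    root = ET.fromstring(xml_text)
    order = Order(
        number=int(root.get("SONumber")),
        customer=int(root.find("Customer").get("CustNumber")),
    )
    for node in root.findall("Item"):
        order.items.append(Item(
            line_number=int(node.get("LineNumber")),
            part=node.find("Part").get("Name"),
            quantity=int(node.findtext("Qty")),
        ))
    return order

def save_order(order, conn):
    """Step 2: each class maps to a table, linked by the order number."""
    conn.execute("INSERT INTO Orders VALUES (?, ?)", (order.number, order.customer))
    conn.executemany(
        "INSERT INTO Items VALUES (?, ?, ?, ?)",
        [(order.number, i.line_number, i.part, i.quantity) for i in order.items],
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (Number INTEGER, Customer INTEGER)")
conn.execute("CREATE TABLE Items (OrderNumber INTEGER, ItemNumber INTEGER, "
             "Part TEXT, Quantity INTEGER)")
order = xml_to_order(ORDER_XML)
save_order(order, conn)
saved_items = conn.execute(
    "SELECT ItemNumber, Part, Quantity FROM Items "
    "WHERE OrderNumber = ? ORDER BY ItemNumber", (order.number,)
).fetchall()
```

The classes are specific to this order format, unlike the DOM, which would use the same generic node types for any document. The `Unit` attribute is ignored here to keep the sketch short.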


Data transfer issues

  • Data types

    • All XML data is stored as strings

    • Conversion problems due to many formats

  • Null data

    • Equivalent to missing element or attribute
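Both issues show up as soon as values are read out of a document. The sketch below converts the strings from a cut-down sales order into typed values and treats a missing element as null (`None` in Python). The day.month.two-digit-year date format is an assumption based on the earlier sales order example; a different source could use another format, which is exactly the conversion problem. The helper names are invented.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from decimal import Decimal

def text_or_none(parent, tag):
    """Return the child's text, or None (null) when the element is missing."""
    child = parent.find(tag)
    return None if child is None else child.text

def parse_order(xml_text):
    """Convert the string values of an order into typed Python values."""
    root = ET.fromstring(xml_text)
    raw_date = text_or_none(root, "Date")
    raw_price = text_or_none(root, "Price")
    return {
        "number": int(text_or_none(root, "Number")),
        # Assumed format: day.month.two-digit year, e.g. 29.10.00
        "date": datetime.strptime(raw_date, "%d.%m.%y").date() if raw_date else None,
        "price": Decimal(raw_price) if raw_price else None,
    }

# Price is absent, so it maps to None rather than to an empty string or zero.
parsed = parse_order("<Order><Number>1234</Number><Date>29.10.00</Date></Order>")
```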


Data transfer issues (cont.)

  • Binary data

    • No standard way to store in XML

    • Commonly stored as unparsed entities or Base64

  • Character sets

    • XML can use any encoding, including Unicode

    • Databases often require single encoding

    • Unicode is inefficient to store
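Of the two common options for binary data, Base64 is the easier one to show. The sketch below encodes a few bytes into an element and decodes them again. The `Thumbnail` element and the `encoding` attribute are invented conventions for this example, not part of any standard.

```python
import base64
import xml.etree.ElementTree as ET

def embed_binary(tag, payload):
    """Wrap raw bytes in an element as Base64 text."""
    elem = ET.Element(tag, {"encoding": "base64"})  # attribute name is an assumption
    elem.text = base64.b64encode(payload).decode("ascii")
    return elem

def extract_binary(elem):
    """Decode the Base64 text of an element back into bytes."""
    return base64.b64decode(elem.text)

# Sample bytes, including values that would be '<' and '&' if read as text.
original = bytes([0, 255, 16, 60, 38])
xml_text = ET.tostring(embed_binary("Thumbnail", original), encoding="unicode")
restored = extract_binary(ET.fromstring(xml_text))
```

Base64 output uses only letters, digits, `+`, `/`, and `=`, so it never collides with XML markup, at the cost of roughly a one-third increase in size.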


Storing data in a native XML database

  • Data stored in XML (document) format

  • Pros

    • Handles semi-structured data efficiently

    • Fast retrieval of whole documents

    • Support for XML query languages, XLinks, etc.


Storing data in a native XML database (cont.)

  • Cons

    • Slow retrieval of views outside the document hierarchy

    • No referential integrity

    • Data not accessible by non-XML applications



Goals

  • Preserve entire document

    • Data: elements, attributes, PCDATA

    • Logical structure: element hierarchy, sibling order

    • Physical structure: entities, CDATA, encoding...

    • Other: DTD, comments, processing instructions...

  • Preserve document identity


Storing documents as BLOBs

  • Pros

    • Exploits existing capabilities: transactions, security...

    • Many databases have text search tools

  • Cons

    • Text-based searches of XML unreliable
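Two quick cases show why plain text search over stored XML can mislead. In this sketch, a substring search for an author name matches a document where the name appears only in the content, and misses a document where the name is written with a character reference. Both sample brochures are invented.

```python
import xml.etree.ElementTree as ET

def text_search(xml_text, term):
    """Naive search: is the term anywhere in the raw document text?"""
    return term in xml_text

def author_is(xml_text, name):
    """Structure-aware search: compare the parsed Author element."""
    return ET.fromstring(xml_text).findtext("Author") == name

# "Chen" appears in the content, but the author is someone else.
QUOTED = ("<Brochure><Author>Okafor</Author>"
          "<Content>Foreword by Chen.</Content></Brochure>")
# The author is Chen, but the first letter is a character reference (&#67; is C).
ENCODED = "<Brochure><Author>&#67;hen</Author></Brochure>"

false_positive = text_search(QUOTED, "Chen") and not author_is(QUOTED, "Chen")
false_negative = author_is(ENCODED, "Chen") and not text_search(ENCODED, "Chen")
```

Both flags come out true: the text search is wrong in each direction, while the parsed comparison is right in both.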


Indexing XML BLOBs with “side tables”

  • Consider the following DTD

    <!ELEMENT Brochure (Title, Author, Content)>
    <!ELEMENT Title (#PCDATA)>
    <!ELEMENT Author (#PCDATA)>    <!-- To be indexed -->
    <!ELEMENT Content (%Inline;)>  <!-- Inline entity from XHTML -->

  • Store complete documents in one table

    Brochures
    ---------
    BrochureID  INTEGER      <--------- Index brochure IDs
    Brochure    LONGVARCHAR  <--------- Complete XML documents


Indexing XML BLOBs with “side tables” (cont.)

  • Store elements to be indexed in separate table

    Authors
    ----------------------
    Author      VARCHAR(50)  <--------- Index authors
    BrochureID  INTEGER

  • Search index table and join to document table

    SELECT Brochure FROM Brochures WHERE BrochureID IN (SELECT BrochureID FROM Authors WHERE Author='Chen')
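The side-table approach can be tried end to end with an in-memory SQLite database. This is a sketch under several assumptions: SQLite's `TEXT` stands in for `LONGVARCHAR`, the author is extracted with `xml.etree.ElementTree` at insert time, and the two brochures, the `AuthorIndex` index name, and the `store_brochure` helper are made up for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

def store_brochure(conn, brochure_id, xml_text):
    """Store the whole document, plus its author in the side table."""
    conn.execute("INSERT INTO Brochures VALUES (?, ?)", (brochure_id, xml_text))
    author = ET.fromstring(xml_text).findtext("Author")
    conn.execute("INSERT INTO Authors VALUES (?, ?)", (author, brochure_id))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Brochures (BrochureID INTEGER PRIMARY KEY, Brochure TEXT)")
conn.execute("CREATE TABLE Authors (Author VARCHAR(50), BrochureID INTEGER)")
conn.execute("CREATE INDEX AuthorIndex ON Authors (Author)")

# Hypothetical documents; the second mentions Chen only in its content.
store_brochure(conn, 1, "<Brochure><Title>Tea</Title><Author>Chen</Author>"
                        "<Content>Green and black.</Content></Brochure>")
store_brochure(conn, 2, "<Brochure><Title>Coffee</Title><Author>Okafor</Author>"
                        "<Content>Praised by Chen.</Content></Brochure>")

found = conn.execute(
    "SELECT Brochure FROM Brochures WHERE BrochureID IN "
    "(SELECT BrochureID FROM Authors WHERE Author = ?)", ("Chen",)
).fetchall()
```

Only the first brochure comes back, even though both contain the string "Chen". The side table has to be kept in sync whenever a stored document changes, which is the main maintenance cost of this design.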


Storing documents in native XML databases

  • Store whole XML documents in “native” form

  • Define a (logical) model for an XML document

    • Minimal model is elements, attributes, PCDATA, and document order

    • Store and retrieve documents according to that model

  • Have normal database features

    • Query language, indexes, transactions, security, etc.


Implementation strategies for native XML databases

  • Text-based

    • Store documents as text

    • Proprietary or file-system storage

  • Model-based

    • Store pre-parsed documents according to model

    • Relational, object-oriented, hierarchical, or proprietary storage


Persistent DOMs (PDOMs)

  • Implement DOM over persistent storage

  • Returned DOM tree is “live”

  • Used by DOM applications that process very large XML documents

  • Database is usually local


Content management systems

  • Manage document fragments (content)

  • Hide database from user

  • Maintain versions, document metadata

  • Include editors, publishing systems, etc.

  • Extensible through scripting or programming


Resources

  • Ronald Bourret’s Papers Page

    • http://www.rpbourret.com/xml/index.htm

  • XML:DB.org’s Resources Page

    • http://www.xmldb.org/resources.html

  • XML:DB Mailing List

    • http://www.xmldb.org/projects.html


Questions?

Ronald Bourret
http://www.rpbourret.com

