
By Intan, Chan & Lina February, 2003



Presentation Transcript


  1. XML Databases By Intan, Chan & Lina February, 2003

  2. Contents • Introduction • XML Databases • XML-Enabled Databases • Native XML Databases • XML Database Products, Benchmarks and Cost Issues • XML Database Applications • Future Trends • Conclusion

  3. 1. Introduction What is XML? • XML (eXtensible Markup Language) is an open W3C (World Wide Web Consortium) standard for describing data • used for defining data elements on a Web page and in business-to-business documents • uses a tag structure similar to that of HTML • HTML uses predefined tags, but XML allows tags to be defined by the developer of the page
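The point about developer-defined tags can be illustrated with a minimal Python sketch. The document and its tag names (`salesOrder`, `customer`, `item`, `part`, `quantity`) are invented for this example and do not come from the slides; only the standard-library parser is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: every tag name below was chosen by the document's
# author; none of them is predefined by XML itself.
ORDER_XML = """
<salesOrder number="1234">
  <customer>Acme Ltd</customer>
  <item><part>A-10</part><quantity>3</quantity></item>
  <item><part>B-22</part><quantity>1</quantity></item>
</salesOrder>
"""


def tag_names(xml_text):
    """Return the distinct element names used in a document, in first-seen order."""
    root = ET.fromstring(xml_text)
    seen = []
    for element in root.iter():
        if element.tag not in seen:
            seen.append(element.tag)
    return seen


names = tag_names(ORDER_XML)
print(names)
```

A generic parser handles the document without knowing the vocabulary in advance, which is what distinguishes XML from the fixed tag set of HTML.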

  4. 1. Introduction Data-centric documents • are documents that use XML as a data transport • designed for machine consumption • characterised by fairly regular structure, fairly consistent organisation of detail and fine-grained data, with little or no mixed content • examples are sales orders, flight schedules, scientific data, and stock quotes

  5. 1. Introduction Document-centric documents • designed for human consumption • characterised by less regular or irregular structure, larger-grained data and highly mixed content • examples are books, email, advertisements, and almost any hand-written XHTML document
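Mixed content (text interleaved with child elements) is one of the characteristics that separates the two kinds of document. The following Python sketch is a rough heuristic only, not a definitive classifier; the function name and both sample documents are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def has_mixed_content(xml_text):
    """True if any element holds both child elements and non-whitespace text."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        children = list(element)
        if not children:
            continue
        # Text directly inside the element, before its first child.
        if element.text and element.text.strip():
            return True
        # Text that follows a child element but still belongs to the parent.
        if any(child.tail and child.tail.strip() for child in children):
            return True
    return False


# Hypothetical samples: a stock quote (data-centric) and a paragraph (document-centric).
DATA_CENTRIC = "<quote><symbol>XYZ</symbol><price>10.50</price></quote>"
DOCUMENT_CENTRIC = "<p>Our <b>new</b> product ships <i>today</i>.</p>"

print(has_mixed_content(DATA_CENTRIC), has_mixed_content(DOCUMENT_CENTRIC))
```

Real documents often fall between the two extremes, so a check like this can only hint at which kind of database is a better fit.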

  6. 1. Introduction Data, Documents and Databases • the distinction between data-centric and document-centric is not always clear • characterising documents as data-centric or document-centric helps to decide what kind of database to use • data-centric documents are stored in a traditional database, such as a relational, object-oriented, or hierarchical database • document-centric documents are stored in a native XML database or a content management system

  7. 2. XML Databases • XML & Database: two very different concepts driven by two very different communities with different expectations and requirements. • Yet, an increasing demand for consistent and reliable methods to manage XML data suggests the marriage of the two.

  8. 2. XML Databases • Is XML a database? An XML document is a database only in the strictest sense of the term, since it is basically a collection of data. • XML provides some of the facilities commonly found in databases, such as storage, schemas, query languages, programming interfaces, etc.

  9. 2. XML Databases • It may be possible to use an XML document as a database only in a scenario with a small volume of data, few users, and modest performance requirements. • It won’t function satisfactorily in a production environment that has many users, strict data integrity requirements, and the need for good performance.

  10. 2. XML Databases • An XML document database (or more generally an XML database, since every XML database must manage documents) can be defined to be a collection of XML documents and their parts, maintained by a system having capabilities to manage and control the collection itself and the information represented by that collection.

  11. 2. XML Databases • XML databases are schema agnostic. • Capable of managing XML data in a way that supports extensibility and granular access simultaneously. • Ideal for information that is likely to change unpredictably. • Unique and targeted at solving new and different problems.

  12. 2. XML Databases • XML databases manage active data that is being shared between legacy systems, partners, and web services. • The management process can be automated, audited, and dynamically improved. • XML’s inherent flexibility and extensibility make it easy to design and build an infrastructure for business information interoperability that is designed for change.

  13. 2. XML Databases • Further demands: • Closely related W3C specifications that extend the capabilities specified in XML 1.0 should be accommodated. • XML database systems should include Internet resource management. • An SGML document was always associated with a DTD, and the DTD could be used in many different ways to support the data management.

  14. 2. XML Databases • Benefits of XML Databases • Unrivaled performance: designed for quick handling of very large data volumes, and benefit from technologies that speed up execution. • Data independence: XML databases inherit the benefits of XML, which is easy to use, flexible and extensible. • Quick access and high-speed retrieval: provide lightning-fast access to any type of stored data, either from a single resource or from a distributed system across a network.

  15. 2. XML Databases • Benefits of XML Databases • Manage and access all types of data: even allow storage of and access to audio, video or other files, and handling of several nested objects. • Support for major application servers: with proper API services, an XML database can play the role of a content store. • Significantly reduce production costs for business: automating business processes from order through delivery reduces production costs significantly.

  16. 2. XML Databases • Data Models • Modeling document collections as well as enterprises: the data model must support the description of the documents. The W3C has defined abstract structures for XML documents in four different specifications, namely the Infoset model, the XPath data model, the DOM model, and the XQuery 1.0 and XPath 2.0 data model, which are often used to encode enterprise data.

  17. 2. XML Databases • Data Models • Conceptual model for documents: the conceptual model incorporates not only all the objects and relationships, but also all the document components that are to be made available to any XML application.

  18. 2. XML Databases • Data Models • Well-defined equivalence: the W3C has proposed that Canonical XML be used to test whether two documents are equivalent. Another possible solution is to define document equivalence in terms of a model that includes all document features, after which such equivalence can be specified by applying document equivalence to application-specific transformations.
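A minimal Python sketch of equivalence testing through Canonical XML follows. It assumes Python 3.8 or later, where `xml.etree.ElementTree.canonicalize` produces the C14N 2.0 form; the helper name and the three sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET


def canonically_equal(first_xml, second_xml):
    """Compare two documents by their Canonical XML (C14N 2.0) serialisation."""
    return ET.canonicalize(first_xml) == ET.canonicalize(second_xml)


# Hypothetical samples. FIRST and SECOND differ only in attribute order,
# quote style and empty-element syntax; THIRD differs in an attribute value.
FIRST = '<order id="1" status="open"><item/></order>'
SECOND = "<order status='open' id='1'><item></item></order>"
THIRD = '<order id="2" status="open"><item/></order>'

print(canonically_equal(FIRST, SECOND), canonically_equal(FIRST, THIRD))
```

Canonicalisation removes purely lexical differences, so two byte-wise different files can still count as the same document. Differences in text content, including whitespace inside elements, are kept by default.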

  19. 2. XML Databases Query Languages • Three kinds of query language are currently in use • Template-Based Query Language • SQL-Based Query Language • XML Query Language (Bourret, 2003)

  20. 2. XML Databases Template-based Query Language • most common query language that returns XML from relational databases • no predefined mapping between the document and the database • SELECT statements are embedded in a template and the data transfer software processes the results
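The template idea can be sketched in Python with the standard `sqlite3` and `xml.etree.ElementTree` modules. This is an illustration under stated assumptions, not the behaviour of any particular product: the `SelectStmt`, `Results` and `Row` element names, the `Flights` table and the `expand_template` function are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical template: a SELECT statement embedded in an XML document.
TEMPLATE = """<FlightInfo>
  <Introduction>The following flights have available seats:</Introduction>
  <SelectStmt>SELECT Airline, FltNumber FROM Flights ORDER BY FltNumber</SelectStmt>
</FlightInfo>"""


def expand_template(template_text, connection):
    """Replace each <SelectStmt> element with the rows its query returns."""
    root = ET.fromstring(template_text)
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag != "SelectStmt":
                continue
            cursor = connection.execute(child.text)
            columns = [description[0] for description in cursor.description]
            results = ET.Element("Results")
            for row in cursor:
                row_element = ET.SubElement(results, "Row")
                for name, value in zip(columns, row):
                    ET.SubElement(row_element, name).text = str(value)
            results.tail = child.tail  # keep the template's layout whitespace
            parent[index] = results
    return root


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE Flights (Airline TEXT, FltNumber INTEGER)")
connection.executemany(
    "INSERT INTO Flights VALUES (?, ?)", [("ACME", 123), ("ACME", 456)]
)
document = expand_template(TEMPLATE, connection)
print(ET.tostring(document, encoding="unicode"))
```

The surrounding elements of the template pass through unchanged, which is why no mapping between document structure and database schema has to be declared in advance.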

  21. 2. XML Databases SQL-based Query Language • uses modified SELECT statements, the results of which are transformed to XML • a number of proprietary SQL-based languages are currently available • the simplest of these SQL-based languages uses nested SELECT statements, which are transformed directly to nested XML

  22. 2. XML Databases XML Query Language • XML Query Language was designed by Microsoft, Texcel and WebMethods specifically to query XML documents • XML query languages can be used over any XML document, unlike the previous two, which can be used only with relational databases • To use these with relational databases, the data in the database must be modeled as XML, thereby allowing queries over virtual XML documents
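As a small illustration of querying an XML document directly, the Python sketch below uses the limited XPath subset supported by the standard library's `ElementTree`. The document and its element names are invented for this example, and the path syntax shown is ElementTree's subset rather than full XPath or XQL.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; no relational database is involved.
ORDER = ET.fromstring("""
<salesOrder>
  <item><part>A-10</part><quantity>3</quantity></item>
  <item><part>B-22</part><quantity>1</quantity></item>
</salesOrder>
""")

# Path expression: the <quantity> of every <item> whose <part> child is 'B-22'.
matches = ORDER.findall("./item[part='B-22']/quantity")
quantities = [element.text for element in matches]
print(quantities)
```

The query is written against the document's own structure, so the same expression would work on any document of this shape regardless of where it is stored.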

  23. 3. XML-Enabled Databases XML-Enabled Database Concept • Using a BLOB (Binary Large Object) to store XML documents, which preserves document extensibility. Weakness: does not support node-level access, updates, or any structure-dependent query such as XPath or XQuery. • Mapping XML documents to tables in relational databases or objects in object-oriented databases. Weakness: does not support extensibility or important features such as round-tripping.
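The BLOB trade-off can be seen in a short Python sketch using `sqlite3`. The `documents` table and the sample order are assumptions made for this example.

```python
import sqlite3
import xml.etree.ElementTree as ET

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, body BLOB)")

# Hypothetical document stored as opaque bytes.
xml_bytes = b"<order><customer>Acme Ltd</customer><total>42</total></order>"
connection.execute("INSERT INTO documents (id, body) VALUES (?, ?)", (1, xml_bytes))

# The database can only hand back the whole document...
(stored,) = connection.execute("SELECT body FROM documents WHERE id = 1").fetchone()

# ...so node-level access has to happen in the application, after a full parse.
customer = ET.fromstring(stored).findtext("customer")
print(customer)
```

The document comes back byte for byte, whatever its structure, but the SQL layer cannot see inside it: finding a single node means fetching and parsing the complete document.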

  24. 3. XML-Enabled Databases Mapping Document Schemas to Database Schemas • To transfer data between XML documents and a database, it is necessary to map the XML document schema to the database schema • Two types of mapping are used to map an XML document schema to the database schema • Table-based Mapping • Object-Relational Mapping (Bourret, 2003)

  25. 3. XML-Enabled Databases Table-based Mapping • used by many of the middleware products that transfer data between an XML document and a relational database • documents that use table-based mappings often include table and column metadata • useful for serialising relational data, such as when transferring data between two relational databases

  26. 3. XML-Enabled Databases Table-based Mapping (cont’d)

      <database>
        <table>
          <row>
            <column1>...</column1>
            <column2>...</column2>
            ...
          </row>
          <row>
            ...
          </row>
          ...
        </table>
        <table>
          ...
        </table>
        ...
      </database>
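The `database` / `table` / `row` / column layout above can be produced from a relational table with a few lines of Python. This is a minimal sketch: the `table_to_xml` function, the `items` table, and the choice to carry the table name in a `name` attribute are assumptions for illustration, not part of any specific middleware product.

```python
import sqlite3
import xml.etree.ElementTree as ET


def table_to_xml(connection, table_name):
    """Serialise one table as <database><table><row><column>...</column></row>...

    The table name is interpolated into the SQL text, so only trusted
    names should be passed in.
    """
    cursor = connection.execute(f'SELECT * FROM "{table_name}"')
    columns = [description[0] for description in cursor.description]
    database = ET.Element("database")
    # Assumption: table metadata is carried as a 'name' attribute.
    table = ET.SubElement(database, "table", name=table_name)
    for row in cursor:
        row_element = ET.SubElement(table, "row")
        for column, value in zip(columns, row):
            ET.SubElement(row_element, column).text = "" if value is None else str(value)
    return database


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE items (part TEXT, quantity INTEGER)")
connection.execute("INSERT INTO items VALUES ('A-10', 3)")
xml_text = ET.tostring(table_to_xml(connection, "items"), encoding="unicode")
print(xml_text)
```

Because the XML mirrors the table one to one, this mapping suits moving data between two relational databases but cannot represent documents with a deeper or irregular structure.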

  27. 3. XML-Enabled Databases Object-Relational Mapping • used by all XML-enabled relational databases, and some middleware products • models the data in the XML document as a tree of objects that are specific to the data in the document • the model is then mapped to relational databases using traditional object-relational mapping techniques or SQL3 object views

  28. 3. XML-Enabled Databases Object-Relational Mapping (cont’d) [Figure: tree of objects. Node labels: Sales Order, Customer, Item, Item, Price, Price]
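A rough Python sketch of the object-relational idea, using the Sales Order, Customer, Item and Price objects named on the slide. The XML layout, the attribute names, the class definitions and the two-table row format are all assumptions made for illustration.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Item:
    name: str
    price: float


@dataclass
class SalesOrder:
    number: str
    customer: str
    items: list


def order_from_xml(xml_text):
    """Step 1: model the document as a tree of document-specific objects."""
    root = ET.fromstring(xml_text)
    items = [
        Item(name=element.get("name"), price=float(element.findtext("Price")))
        for element in root.findall("Item")
    ]
    return SalesOrder(
        number=root.get("number"), customer=root.findtext("Customer"), items=items
    )


def order_to_rows(order):
    """Step 2: map the objects to rows, with the order number as foreign key."""
    order_row = (order.number, order.customer)
    item_rows = [(order.number, item.name, item.price) for item in order.items]
    return order_row, item_rows


# Hypothetical document matching the object tree.
ORDER_XML = """<SalesOrder number="1234">
  <Customer>Acme Ltd</Customer>
  <Item name="widget"><Price>9.5</Price></Item>
  <Item name="gadget"><Price>20.0</Price></Item>
</SalesOrder>"""

order_row, item_rows = order_to_rows(order_from_xml(ORDER_XML))
print(order_row, item_rows)
```

Anything in the document that the object model does not capture, such as comments or the physical order of siblings of different types, is lost on the way into the tables.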

  29. 4. Native XML Databases Native XML Database Concept • designed especially to store XML documents • A native XML database defines a (logical) model for an XML document, and stores and retrieves documents according to that model.

  30. 4. Native XML Databases Native XML Database Concept • Database management features • transaction management • security • multi-user access • interface APIs

  31. 4. Native XML Databases Text-based Native XML Database • Stores XML documents as text • a BLOB in a relational database or • a proprietary text format • Likely to outperform other approaches when retrieving and returning data according to a predefined path

  32. 4. Native XML Databases Model-based Native XML Database • Stores an internal object model of the document • Performance similar to text-based native XML databases

  33. 4. Native XML Databases Features of Native XML Databases • Data Definition • Support the notion of collections, similar to a table in a relational database or a directory in a file system • Allow storing schema-independent XML documents • Risk of lower data integrity

  34. 4. Native XML Databases Features of Native XML Databases • Data Manipulation • Query languages: XPath and XQL • XPath lacks grouping, sorting, cross-document joins, and support for data types • Use XSLT • A more database-oriented language: XQuery.
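The gap in XPath around sorting and data types can be made concrete with a Python sketch. The path expression only selects nodes; ordering and type conversion are done in the host language here. The catalog document is invented for this example, and the path syntax is the subset supported by the standard library's `ElementTree`.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog.
CATALOG = ET.fromstring("""
<catalog>
  <item><name>widget</name><price>9.50</price></item>
  <item><name>gadget</name><price>20.00</price></item>
  <item><name>bolt</name><price>0.25</price></item>
</catalog>
""")

# The path expression returns items in document order only.
items = CATALOG.findall("./item")

# Without data types, prices compare as strings: "20.00" sorts before "9.50".
names_as_text = [
    item.findtext("name")
    for item in sorted(items, key=lambda item: item.findtext("price"))
]

# Converting to numbers in the host language gives the intended order.
names_by_price = [
    item.findtext("name")
    for item in sorted(items, key=lambda item: float(item.findtext("price")))
]
print(names_as_text, names_by_price)
```

A more database-oriented language such as XQuery moves this kind of ordering and typing into the query itself instead of leaving it to application code.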

  35. 4. Native XML Databases • Data Manipulation • Updates and Deletes • a real area of weakness for current NXDs • XML:DB XUpdate from the XML:DB initiative • Indexes • Management Tools • programmatic API, ODBC-like interface • Round-Tripping • get the same document back again • External Entities • how to handle external entities?

  36. 4. Native XML Databases Differences between Native XML Databases & Relational Databases • Relational databases are built on Codd’s well-established relational theory • XML is still immature • Relational databases are best for long-term storage of durable data at the back end • XML databases sit in the middle tier and manage active data between systems

  37. 5. XML Database Products • Middleware • XML-Enabled Databases • Native XML Databases • XML Servers • Content Management Systems • Discontinued Products • Related products: XML Query Engines and XML Data Binding

  38. 5. XML Database Products • What to choose? • If your goal is to store and retrieve data-centric documents, the right choice might be an XML-enabled database, middleware or an XML server. • If it is to store document-centric documents, a native XML database or content management system might be appropriate.

  39. 5. Benchmarks • Ten challenges have to be met: • Bulk loading • Reconstruction • Path traversals • Casting • Missing elements • Ordered access • References • Joins • Construction of large results • Containment, full-text search

  40. 5. Benchmarks • Infrastructure and total cost of ownership. • E.g. access protocols, result representations, responsiveness versus completeness, the expressiveness of the query language, and data throughput. • An XML database API enables a common access mechanism to XML databases.

  41. 5. Cost Issues • Comparison of products available in the market. • The total cost of ownership. • Installation effort • Generality support • Consistency support • Preparation effort • Training • Interaction paradigm • Updates

  42. 6. XML Database Applications • Key applications include web services, B2B document exchange and e-commerce, which most likely require online and often interactive processing. • And all information-rich scenarios: corporate information portals, membership databases, product catalogs, parts databases, patient information tracking, etc.

  43. 7. Future Trends • The XML:DB initiative is working very hard on benchmarking for the XML database industry, to be made into the standard toolset used by IT departments worldwide. • Conforming to the XML:DB API, some developers are also working on graphical query.

  44. 7. Future Trends • Better solutions for query optimization in the web context, compressing XML data and guaranteeing transparent access to compressed data through existing APIs. • New XML-related languages are being created, such as the XML Update Language (XUpdate) and the Simple XML Manipulation Language. • A potential future project is XML Access Control.

  45. 8. Conclusion • XML is changing the way that data and documents are represented, exchanged and integrated among heterogeneous computing systems • it is also inducing and facilitating the convergence of the World Wide Web, the Internet and database research communities • it is expected that XML databases will be extensively used in numerous domains and applications in the near future
