
XML-Based Information Systems

National Cheng Kung University, Department of Electrical Engineering, DSLab. Shang-Rong Tsai.



  1. XML-Based Information Systems National Cheng Kung University Department of Electrical Engineering DSLab Shang-Rong Tsai

  2. Outline • Background • XML-based databases and information systems • An XML-based Information Server

  3. Background • What is XML? • Is XML a Database? • What is an XML Database? • What is the goal of XML Database? • What is the difference between RDB and XDB? • XML in the Web

  4. What is XML? • XML stands for eXtensible Markup Language • XML is a textual encoding system for describing structured documents • HTML documents are SGML documents which conform to the HTML DTDs • DTDs (Document Type Definitions) are the syntax defined in SGML to describe the tag structure for a particular type of document.

  5. What is XML? (cont.) • XML is a subset of Standard Generalized Markup Language (SGML) defined by the World Wide Web Consortium (it uses only about 10% of SGML to express 90% of SGML's power) • HTML is for presentation only • XML allows developers to define their own markup languages to express their information more meaningfully • XML lets developers describe, deliver and exchange structured data between applications, including Web servers and browsers.

  6. The features of XML • Extensible • Self-describing • Separates data from presentation • Text based, platform neutral • Unified if documents conform to the schema of a specific domain • Integration

  7. XML Technologies • XML/DTD • XML Namespaces • XSL/XSLT • XLink/XPointer/XPath • XML Schema • XML data query • XHTML

  8. XML and Database • XML is basically a data format; we still need a persistent store • Much of the information on the Web comes from databases • Data model of XML and RDBMS / OODBMS • XML mismatches with relational databases

  9. XML and Database (cont.) • Schema mapping between XML documents and RDBMS • data unit as XML document/element/attribute • keys for relational tables • data type mapping • relationship between the stored tables
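
The mapping issues listed above (data unit, keys, data types, relationships between tables) can be sketched in a few lines. This is a minimal illustration, assuming Python's standard `xml.etree.ElementTree` and `sqlite3` modules; the sample document, the table layout, and the `shred_invoice` helper are hypothetical and are not taken from the presentation.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are assumptions.
INVOICE_XML = """
<invoice id="42">
  <orderDate>1999-01-21</orderDate>
  <item sku="A1" qty="2">Widget</item>
  <item sku="B7" qty="1">Gadget</item>
</invoice>
"""

def shred_invoice(conn, xml_text):
    """Map one <invoice> document onto two related tables."""
    root = ET.fromstring(xml_text)
    invoice_id = int(root.get("id"))          # attribute -> primary key
    order_date = root.findtext("orderDate")   # child element -> column
    conn.execute("INSERT INTO invoice VALUES (?, ?)", (invoice_id, order_date))
    for item in root.findall("item"):         # repeated element -> child table rows
        conn.execute(
            "INSERT INTO item VALUES (?, ?, ?, ?)",
            (invoice_id, item.get("sku"), int(item.get("qty")), item.text),
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoice (id INTEGER PRIMARY KEY, order_date TEXT)")
conn.execute(
    "CREATE TABLE item (invoice_id INTEGER REFERENCES invoice(id), "
    "sku TEXT, qty INTEGER, name TEXT)"
)
shred_invoice(conn, INVOICE_XML)

# The foreign key carries the parent/child relationship that nesting expressed in XML.
rows = conn.execute(
    "SELECT i.order_date, t.sku, t.qty FROM invoice i "
    "JOIN item t ON t.invoice_id = i.id ORDER BY t.sku"
).fetchall()
print(rows)
```

Note how the nesting of `<item>` inside `<invoice>` has to be re-expressed as a key relationship, and how every value needs an explicit type conversion; both are examples of the mismatch the slides describe.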

  10. XML and Database (cont.) • Query/update languages • Indexing and search • A new database system for XML ? • XML-enabled database. • native XML database (the data is actually stored as XML internally)

  11. Is XML a Database? • Something similar • data storage (XML documents) • DTD/Schema • Query languages (XQuery, XPath, XQL, XML-QL, QUILT, etc.) • Programming interface (DOM/SAX)
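
To make the "programming interface (DOM/SAX)" point concrete, the following sketch reads the same small document both ways, assuming Python's standard `xml.dom.minidom` and `xml.sax` modules. The sample document and the `TitleHandler` class are hypothetical names chosen for illustration.

```python
import xml.sax
from xml.dom import minidom

DOC = "<bib><book><title>A</title></book><book><title>B</title></book></bib>"

# DOM: parse the whole document into an in-memory tree, then navigate it.
dom = minidom.parseString(DOC)
dom_titles = [t.firstChild.data for t in dom.getElementsByTagName("title")]

# SAX: receive a stream of events; the handler keeps only what it needs.
class TitleHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect it until the end tag.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False

handler = TitleHandler()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_titles = handler.titles
print(dom_titles, sax_titles)
```

DOM is convenient when the document fits in memory and needs random access; SAX trades that convenience for a small, constant memory footprint on large inputs.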

  12. Is XML a Database? (cont.) • Something it lacks • transaction • security • indexing • concurrent access • query from multiple data objects • data integrity

  13. XML as platform independent data format

  14. Data integration with XML

  15. What is an XML Database? • Databases that store XML documents and provide a view of operational data, generally either as indexed text or as some variant of the DOM mapped to an underlying data store.

  16. The Goal of XML Database • Solve the problem of mismatches between XML-structured data and the data model that RDB products support • Provide a complete solution for storing, accessing and manipulating XML documents • Make data integration and exchange easier • Support the original goal of the Web • Human communication through shared knowledge • The universe of network-accessible information • Represent data more meaningfully and clearly than HTML

  17. Difference between RDB and XDB • Data • Table vs. XML documents • Modeling • Logical Model • Entity-Relationship vs. XML model • Physical Model • Interface • SQL vs. XQuery • Application • Transaction-based vs. Document-based

  18. Storing and Retrieving XML Documents • File System • BLOB (Binary Large OBject) • Native XML Databases • Persistent DOMs (PDOMs) • Content Management Systems • Systems for managing fragments of human-readable documents, including support for editing, version control, and building new documents from existing fragments.

  19. Data oriented vs. Document oriented • Data oriented • Documents that use XML as a data transport • Designed for machine consumption • Regular structure, fine-grained data, little or no mixed content • Document oriented • Designed for human consumption • Irregular structure, larger-grained data, lots of mixed content

  20. Two typical examples of XML instances

      Data oriented:

      <invoice>
        <orderDate>1999-01-21</orderDate>
        <shipDate>1999-01-25</shipDate>
        <billingAddress>
          <name>Ashok Malhotra</name>
          <street>123 Microsoft Ave.</street>
          <city>Hawthorne</city>
          <state>NY</state>
          <zip>10532-0000</zip>
        </billingAddress>
        <voice>555-1234</voice>
        <fax>555-4321</fax>
      </invoice>

      Document oriented:

      <memo importance='high' date='1999-03-23'>
        <from>Paul V. Biron</from>
        <to>Ashok Malhotra</to>
        <subject>Latest draft</subject>
        <body>
          We need to discuss the latest draft <emph>immediately</emph>.
          Either email me at <email>mailto:paul.v.biron@kp.org</email>
          or call <phone>555-9876</phone>
        </body>
      </memo>
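
One practical way to tell the two kinds of instance apart is to look for mixed content, that is, elements holding both text and child elements. The sketch below assumes Python's standard `xml.etree.ElementTree`; the `has_mixed_content` helper and the shortened sample documents are hypothetical, and a real classifier would likely weigh more signals than this one.

```python
import xml.etree.ElementTree as ET

def has_mixed_content(root):
    """Return True if any element contains both child elements and non-whitespace text."""
    for elem in root.iter():
        if len(elem) == 0:
            continue  # leaf element: text only, so not mixed
        # Text inside an element is split between elem.text and each child's tail.
        pieces = [elem.text] + [child.tail for child in elem]
        if any(p and p.strip() for p in pieces):
            return True
    return False

invoice = ET.fromstring(
    "<invoice><orderDate>1999-01-21</orderDate><voice>555-1234</voice></invoice>"
)
memo = ET.fromstring(
    "<memo><body>We need to discuss the latest draft "
    "<emph>immediately</emph>.</body></memo>"
)
invoice_is_mixed = has_mixed_content(invoice)
memo_is_mixed = has_mixed_content(memo)
print(invoice_is_mixed, memo_is_mixed)
```

The invoice has only element-only and text-only nodes, while the memo's `<body>` interleaves prose with markup, which is what makes document-oriented XML awkward to map onto relational tables.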

  21. Taxonomy of XML Database • Native XML Database (NXD) • A database fundamentally designed to store and manipulate XML data. • Defines a (logical) model for an XML document and stores and retrieves documents according to that model. • Has an XML document as its fundamental unit of (logical) storage, just as a relational database has a row in a table as its fundamental unit of (logical) storage. • It is NOT required to have any particular underlying physical storage model. • XML Enabled Database (XEDB) • A database that has an added XML mapping layer provided either by the database vendor or a third party

  22. Applications of XML Database • Corporate information portals • Membership databases • Product catalogs • Parts databases • Patient information tracking • Business to business document exchange

  23. Some related standards • W3C • XML Schema • XPath • XQuery • XML:DB Initiative • XML:DB API • XUpdate

  24. XML Schema • The purpose of a schema is to define a class of XML documents, and so the term "instance document" is often used to describe an XML document that conforms to a particular schema.

  25. XML Schema • The purpose of XML Schema is to define and describe a class of XML documents by using [schema] constructs to constrain and document the meaning, usage and relationships of their constituent parts. • Structure • Data type

  26. An Example of XML Schema

      <?xml version="1.0" encoding="Big5"?>
      <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                  xmlns="http://chip.ee.ncku.edu.tw/buy"
                  targetNamespace="http://chip.ee.ncku.edu.tw/buy">
        <xsd:element name="Message">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="request" type="xsd:string"/>
              <xsd:element name="name" type="xsd:string"/>
              <xsd:element name="telephone" type="phoneType" maxOccurs="1"/>
              <xsd:element name="buyitem" type="buyitemType" minOccurs="1" maxOccurs="unbounded"/>
              <xsd:element name="rule" type="xsd:string" maxOccurs="unbounded"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
        <!-- Definition of buy items: string content with a required num attribute -->
        <xsd:complexType name="buyitemType">
          <xsd:simpleContent>
            <xsd:extension base="xsd:string">
              <xsd:attribute name="num" type="xsd:positiveInteger" use="required"/>
            </xsd:extension>
          </xsd:simpleContent>
        </xsd:complexType>
        <!-- Definition of telephone type: two digits, a hyphen, seven digits -->
        <xsd:simpleType name="phoneType">
          <xsd:restriction base="xsd:string">
            <xsd:pattern value="\d{2}-\d{7}"/>
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:schema>
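
The two data-type constraints in the schema above (the `phoneType` pattern and the `positiveInteger` type of `num`) can be mimicked in a few lines to see what they accept. This is only a sketch using Python's standard `re` module; the helper names are hypothetical, and it checks individual values rather than validating a document, which would normally be left to a schema-aware validator.

```python
import re

# XML Schema patterns must match the whole value, so fullmatch is the closest equivalent.
PHONE_PATTERN = re.compile(r"\d{2}-\d{7}")

# Lexical form of xsd:positiveInteger: optional plus sign, then digits, value > 0.
POSITIVE_INTEGER = re.compile(r"\+?[0-9]+")

def is_valid_phone(value):
    """Check a value against the phoneType pattern from the schema."""
    return PHONE_PATTERN.fullmatch(value) is not None

def is_positive_integer(value):
    """Check a value against the type of the num attribute."""
    return POSITIVE_INTEGER.fullmatch(value) is not None and int(value) > 0

for candidate in ("06-2757575", "062757575", "06-27575751"):
    print(candidate, is_valid_phone(candidate))
```

Only the first candidate satisfies the pattern; the second lacks the hyphen and the third has one digit too many.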

  27. XPath • The primary purpose of XPath is to address parts of an XML document. • XPath is also designed so that it has a natural subset that can be used for matching. • XPath models an XML document as a tree of nodes. • Element nodes • Attribute nodes • Text nodes

  28. Examples of XPath • Collections – ‘element’ and ‘.’ • ./first-name • Selecting children and descendants – ‘/’ and ‘//’ • author/first-name • bookstore//title • Collecting element children – ‘*’ • author/* • book/*/last-name • Finding an attribute – ‘@’ • @style • price/@exchange
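
Most of the path expressions above can be tried with Python's standard `xml.etree.ElementTree`, which supports a limited subset of XPath. The sketch below is written under that assumption; the sample `bookstore` document is hypothetical and only reuses the element names from the slide. ElementTree does not return attribute nodes, so the `@` examples are approximated with `.get()`.

```python
import xml.etree.ElementTree as ET

# Hypothetical document built around the element names used in the slide.
BOOKSTORE = ET.fromstring("""
<bookstore>
  <book style="novel">
    <title>First Book</title>
    <author><first-name>Ann</first-name><last-name>Lee</last-name></author>
    <price exchange="0.7">12</price>
  </book>
  <book style="textbook">
    <title>Second Book</title>
    <author><first-name>Bob</first-name><last-name>Ray</last-name></author>
    <price exchange="0.9">30</price>
  </book>
</bookstore>
""")

book = BOOKSTORE.find("book")  # context node: the first <book>

first_names = [e.text for e in book.findall("author/first-name")]  # author/first-name
all_titles = [e.text for e in BOOKSTORE.findall(".//title")]       # bookstore//title
author_children = [e.tag for e in book.findall("author/*")]        # author/*
last_names = [e.text for e in BOOKSTORE.findall("book/*/last-name")]  # book/*/last-name

# Attribute access stands in for @style and price/@exchange.
style = book.get("style")
exchange = book.find("price").get("exchange")

print(first_names, all_titles, author_children, last_names, style, exchange)
```

A full XPath implementation would evaluate `@style` and `price/@exchange` directly as node selections; the `.get()` calls here are a workaround for the standard library's subset.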

  29. XQuery • A query language that uses the structure of XML intelligently can express queries across all these kinds of data, whether physically stored in XML or viewed as XML via middleware.

  30. The XML data used in the XQuery example

      <bib>
        <book year="1994">
          <title>TCP/IP Illustrated</title>
          <author><last>Stevens</last><first>W.</first></author>
          <publisher>Addison-Wesley</publisher>
          <price>65.95</price>
        </book>
        <book year="1992">
          <title>Advanced Programming in the Unix environment</title>
          <author><last>Stevens</last><first>W.</first></author>
          <publisher>Addison-Wesley</publisher>
          <price>65.95</price>
        </book>
        <book year="2000">
          <title>Data on the Web</title>
          <author><last>Abiteboul</last><first>Serge</first></author>
          <author><last>Buneman</last><first>Peter</first></author>
          <author><last>Suciu</last><first>Dan</first></author>
          <publisher>Morgan Kaufmann Publishers</publisher>
          <price>39.95</price>
        </book>
        <book year="1999">
          <title>The Economics of Technology and Content for Digital TV</title>
          <editor>
            <last>Gerbarg</last><first>Darcy</first>
            <affiliation>CITI</affiliation>
          </editor>
          <publisher>Kluwer Academic Publishers</publisher>
          <price>129.95</price>
        </book>
      </bib>

  31. An example of XQuery • List each publisher whose books have an average price greater than 100, together with that average price

      for $p in distinct-values(doc("bib.xml")//publisher)
      let $a := avg(doc("bib.xml")//book[publisher = $p]/price)
      where $a > 100
      return
          <publisher>
              <name>{ $p }</name>
              <avgprice>{ $a }</avgprice>
          </publisher>
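
As a cross-check of what the query above computes, the same grouping and filtering can be written out by hand. The sketch assumes Python's standard `xml.etree.ElementTree`; the `expensive_publishers` helper is a hypothetical name, and the inline data keeps only the `publisher` and `price` values from the sample `bib.xml`.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Publisher and price values copied from the sample data; other fields omitted.
BIB = ET.fromstring("""
<bib>
  <book year="1994"><publisher>Addison-Wesley</publisher><price>65.95</price></book>
  <book year="1992"><publisher>Addison-Wesley</publisher><price>65.95</price></book>
  <book year="2000"><publisher>Morgan Kaufmann Publishers</publisher><price>39.95</price></book>
  <book year="1999"><publisher>Kluwer Academic Publishers</publisher><price>129.95</price></book>
</bib>
""")

def expensive_publishers(bib, threshold=100.0):
    """Return {publisher: average price} for publishers whose average exceeds threshold."""
    prices = defaultdict(list)
    for book in bib.iter("book"):              # every <book> in the document
        prices[book.findtext("publisher")].append(float(book.findtext("price")))
    result = {}
    for publisher, values in prices.items():   # one group per distinct publisher
        average = sum(values) / len(values)
        if average > threshold:                # the query's "where" condition
            result[publisher] = average
    return result

print(expensive_publishers(BIB))
```

With the sample data only Kluwer Academic Publishers passes the filter, since its single book is priced at 129.95. The declarative query expresses in four clauses what takes an explicit grouping loop here.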

  32. XML:DB API • XML:DB API is being developed by the XML:DB Initiative to facilitate the development of applications that function with minimal change on more than one XML database. • This is roughly equivalent to the functionality provided by JDBC or ODBC for providing access to relational databases.

  33. XUpdate • XUpdate is a specification under development by the XML:DB Initiative to enable simpler updating of XML documents. • XUpdate gives you a declarative method to insert nodes, remove nodes, and change nodes within an XML document.
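
The three kinds of change XUpdate describes (insert, remove, change) are shown below in imperative form for comparison. This is not XUpdate syntax and does not use any XUpdate processor; it is a sketch with Python's standard `xml.etree.ElementTree` on a hypothetical address document, meant only to show the operations an XUpdate document would state declaratively.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element names are assumptions for this sketch.
doc = ET.fromstring(
    "<addresses><address id='1'><name>Ann</name><town>Tainan</town></address></addresses>"
)
address = doc.find("address[@id='1']")  # select the node to modify

# Insert a node: append a new <phone> child to the selected address.
phone = ET.SubElement(address, "phone")
phone.text = "06-2757575"

# Change a node: replace the text of <town>.
address.find("town").text = "Taipei"

# Remove a node: delete <name> from its parent.
address.remove(address.find("name"))

remaining_tags = [child.tag for child in address]
print(ET.tostring(doc, encoding="unicode"))
```

In the declarative style each of these steps would be a single instruction that names the target with a path expression, leaving the traversal and the tree manipulation to the processor.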

  34. Some XML database products • Commercial • Tamino • X-Hive • Excelon • Open Source (All Java based) • Xindice (dbXML Core) • eXist • Ozone

  35. Present Web System

  36. The Original Goal of the Web • Human communication through shared knowledge. Working together: • Social efficiency, understanding and scaling • The universe of network-accessible information

  37. The problems of Current Web • HTML is for presentation only • Not agent and search engine friendly • Web Automation is difficult • Enter, search and click… • Integration is difficult • Data format is not unified and extensible

  38. The Web System in the future

  39. Features of the System • A large-scale Information Server based on XML technologies • Tools for data input, query and presentation • Information/document sharing, exchange and integration • Multimedia content support • Document-oriented • A systematic way to retrieve useful and precise information • Serves as XML data storage for specific XML-based applications

  40. XML-based Information Server

  41. Four types of users

  42. The Architecture of XML Storage

  43. The XML Data Input Subsystem

  44. The Data Capture Template Editor

  45. The Schema Editor for the DB Designer

  46. The GUI for the Information Provider

  47. The Input form generated by the Data Capture Template Processor

  48. Data Presentation generated by XSL and XSL Processor

  49. Epilogue • XML makes the Web easier to automate. • More and more Internet applications use XML technology • Information sharing using XML would be more effective than the HTML approach • XML can describe data in a more appropriate way than the relational model • XML plays an important role in the database area. More effort is being devoted to XML-based database development.

  50. Reference • XML Database Overview • Oasis: XML and Databases, http://www.oasis-open.org/cover/xmlAndDatabases.html • XML and Database, http://www.rpbourret.com/xml/XMLAndDatabases.htm • Programming • Java XML Tutorial, http://java.sun.com/xml/tutorial_intro.html • Java World, http://www.javaworld.com • http://xml.apache.org • http://jakarta.apache.org
