
XML and Databases



Presentation Transcript


  1. XML and Databases Aug’10 – Dec ’10

  2. Introduction: The volume of XML used by businesses is increasing. Many websites use XML as a data store, which is transformed into HTML or XHTML for online display. XML data is also supplied directly to data stores such as Microsoft Access or SQL Server from forms filled in by a variety of information workers. XML is being used increasingly for business-critical data, some of which is particularly confidential, which raises issues such as security, scalability, and reliability.

  3. This chapter includes: ❑ Use cases for XML-enabled database systems ❑ How to perform foundational tasks using eXist, an open-source native XML database ❑ How to use some of the XML functionality in Microsoft SQL Server and MySQL, two major relational databases with XML functionality

  4. The Need for Efficient XML Data Stores: If XML is stored as text documents, how can it be processed efficiently? As volumes of XML data grow, the efficiency of searching becomes important, and adding indexes to speed up searching that XML becomes increasingly necessary. Example: in XML web services, performance is of great importance if the user is to feel that the system is sufficiently responsive. If data is stored as something other than XML, how fast that data can be transformed into XML comes into play. Issues of reliability also come into play.

  5. The Increasing Amount of XML: XML has enormous flexibility in representing data. It can represent data structures that are difficult or inefficient to represent as relational data. A native XML database is a database designed primarily or only to handle XML data. The term structured data refers primarily to relational data. Semi-structured data is a term used to refer to nonrelational data, very often XML data. Loosely structured data typically refers to document-centric XML.

  6. Comparing XML-Based Data and Relational Data: In a relational database there is no inherent ordering of data; in XML, document order is intrinsically present. Relational databases, as typically structured, have no hierarchy; XML documents are intrinsically hierarchical. A relational database uses keys to express relationships between tables. Storing even simple XML data in a relational table loses its ordering, so you may need to reassemble the data as XML at a later date to recapture the original structure.

  7. Approaches to Storing XML. Storing XML on File Systems: The very notion of an XML “document” suggests storage on disk, just as you store any other kind of “document” on your desktop. Many applications continue to store XML documents on file systems. Why have XML databases been so slow to take off? Because storing XML documents on file systems works so well: the hierarchical organization of a file system is very similar to the hierarchical organization of an XML document.

  8. Limitations of storing XML documents on file systems. Document Size: the important factor is the granularity of the information you need to retrieve. If you need to retrieve small pieces from big documents through DOM or XPath, you incur a huge overhead because you have to read the full document before you can extract anything. Updates: if you want to enable multiple users to update these documents or, even worse, if you’re writing a transactional application, you need to take extra care when performing these updates. Solution: use a version control system such as Subversion (http://subversion.tigris.org/).
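The document-size point can be illustrated with a small Python sketch. It is only an illustration under assumptions: the `<items>` document, the function names, and the element layout are hypothetical and not part of the slides. The first function builds the whole tree before extracting one title; the second streams the file and stops at the first match, which reduces memory use but still has to read everything that precedes the match.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical large document: 1000 <item> entries in a single file.
BIG_DOCUMENT = "<items>" + "".join(
    f'<item id="{i}"><title>Entry {i}</title></item>' for i in range(1, 1001)
) + "</items>"


def title_via_full_parse(xml_text, item_id):
    """Build the whole tree in memory, then pick out a single title."""
    root = ET.fromstring(xml_text)
    node = root.find(f"./item[@id='{item_id}']/title")
    return None if node is None else node.text


def title_via_streaming(xml_text, item_id):
    """Stream the document and stop at the first matching <item>."""
    source = io.BytesIO(xml_text.encode("utf-8"))
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "item":
            if elem.get("id") == str(item_id):
                return elem.findtext("title")
            elem.clear()  # drop the children of items we no longer need
    return None
```

Neither variant can jump straight to the wanted item; without an index, the cost grows with the size of the document rather than with the size of the result.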

  9. Limitations contd… Indexes: queries are an issue if you store your documents on disk, because you need to implement some kind of indexing mechanism. If you have only a few predefined fields to index, you can use the directory structure as an index. With Subversion, you can easily get a list of documents for a specific version, committed by a specific user, modified between two dates, and so on. For full-text search, use a search engine such as Lucene (http://lucene.apache.org/). Building Your Own: although most issues can be worked around, keeping XML documents on disk with write access and indexes is a “build your own” kind of solution and exposes you to a fair amount of integration work. By contrast, XML databases give you a much more packaged approach.
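A minimal Python sketch of the "directory structure as an index" idea follows. The layout (`<root>/<category>/<id>.xml`) and the helper names `store` and `find_by_category` are assumptions made for illustration; the slides do not prescribe any particular layout.

```python
import tempfile
from pathlib import Path


def store(root, category, doc_id, xml_text):
    """Write a document to <root>/<category>/<doc_id>.xml; the path is the index."""
    folder = Path(root) / category
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{doc_id}.xml"
    path.write_text(xml_text, encoding="utf-8")
    return path


def find_by_category(root, category):
    """List the ids of all documents in a category without opening any file."""
    return sorted(p.stem for p in (Path(root) / category).glob("*.xml"))


# Usage: three hypothetical documents spread over two categories.
with tempfile.TemporaryDirectory() as root:
    store(root, "XML", "1", "<item id='1'/>")
    store(root, "XML", "2", "<item id='2'/>")
    store(root, "English", "3", "<item id='3'/>")
    xml_ids = find_by_category(root, "XML")
```

This works only for the one field encoded in the path. A document that belongs to several categories, or a query on any other field, already needs extra machinery, which is the integration work the slide warns about.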

  10. Build your own… XML databases may not have all the features you find in a version control system or a full-text search engine, and most XML databases do not match search engine features. Even so, you can save a lot of time by using a stable XML database instead of adding a bunch of software on top of your file system storage to implement features that are natively available in these databases.

  11. Using XML With Conventional Databases: Relational databases are one of the most popular ways to store data. They are mature, very well suited to storing structured data, hold a huge amount of legacy data, and are well understood by a large number of developers. These reasons make them good candidates to use together with XML. Producing XML from Relational Databases: large numbers of HTML and XHTML websites are created, directly or indirectly, from relational data. Data is stored conventionally as relational tables, and the programmer writes code to create HTML or XHTML, sometimes using XML as an intermediate stage. It is possible to map relational data to hierarchical XML structures and return those hierarchical structures to a user.
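As a rough sketch of mapping relational rows to a hierarchical XML structure, the Python example below uses an in-memory SQLite database. The `item` and `category` tables, the `position` column, and the function name `items_to_xml` are assumptions chosen for illustration; they are not taken from the slides or from any particular product.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical schema: one row per item, one row per category of an item.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE category (item_id INTEGER REFERENCES item(id),
                           position INTEGER, name TEXT);
    INSERT INTO item VALUES (1, 'Working on Beginning XML');
    INSERT INTO category VALUES (1, 1, 'English'), (1, 2, 'XML');
""")


def items_to_xml(conn):
    """Nest each item's category rows under its <item> element."""
    root = ET.Element("items")
    for item_id, title in conn.execute("SELECT id, title FROM item ORDER BY id"):
        item = ET.SubElement(root, "item", id=str(item_id))
        ET.SubElement(item, "title").text = title
        rows = conn.execute(
            "SELECT name FROM category WHERE item_id = ? ORDER BY position",
            (item_id,),
        )
        for (name,) in rows:
            ET.SubElement(item, "category").text = name
    return ET.tostring(root, encoding="unicode")


xml_out = items_to_xml(conn)
```

The foreign key becomes parent-child nesting, and the explicit `position` column supplies the ordering that the relational model does not provide by itself.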

  12. Moving XML to Relational Databases: Many relational databases allow XML to be returned to the user from data held in relational tables. Similarly, many relational database management systems can now accept XML data from a user, convert it into relational form, and store that data in relational tables. Shredding refers to processing XML and inserting its contents into standard database tables. Depending on how the data is shredded, it may be possible to reconstitute the original XML document. Data Binding: data binding frameworks acknowledge that several representations of the same data need to coexist in applications, and they automate the mapping between those representations. The representations supported are XML, SQL databases, and objects.
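Shredding can be sketched in a few lines of Python with SQLite. This is a simplified illustration, not how any specific database implements it: the two-table schema, the `position` column, and the `shred` function are assumptions, and the sample document is a cut-down version of the blog entry used later in the deck.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Cut-down sample document (assumed for illustration).
SOURCE = """<item id="1">
  <title>Working on Beginning XML</title>
  <category>English</category>
  <category>XML</category>
  <category>Books/Livres</category>
</item>"""


def shred(conn, xml_text):
    """Split one <item> document into rows of the item and category tables."""
    item = ET.fromstring(xml_text)
    item_id = int(item.get("id"))
    conn.execute(
        "INSERT INTO item (id, title) VALUES (?, ?)",
        (item_id, item.findtext("title")),
    )
    # Record the position explicitly; otherwise document order is lost.
    for position, cat in enumerate(item.findall("category"), start=1):
        conn.execute(
            "INSERT INTO category (item_id, position, name) VALUES (?, ?, ?)",
            (item_id, position, cat.text),
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE category (item_id INTEGER, position INTEGER, name TEXT)")
shred(conn, SOURCE)

stored_title = conn.execute("SELECT title FROM item WHERE id = 1").fetchone()[0]
categories = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM category WHERE item_id = 1 ORDER BY position"
    )
]
```

Whether the original document can be reconstituted depends on what the shredding keeps. Here the element order survives through `position`, but whitespace, comments, and any elements outside the schema would not.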

  13. Data binding cont… Data binding frameworks that can directly map XML and SQL databases include ADO.NET (http://msdn.microsoft.com/data/ref/adonet/) in Microsoft’s world and Castor (http://www.castor.org/) in the Java open-source community. Depending on the situation, the XML or XHTML is generated manually, through templates, or through another data binding library.

  14. Native XML Databases: A native XML database is designed to store XML. It might choose to implement XML using a model such as the XML Infoset, the XML DOM, or XPath. It is also likely to capture aspects of an XML document such as document order. Native XML databases are recent, do not have the same theoretical underpinning as relational databases, and are still evolving. A native XML database product also maps an XML document to its storage model, but that mapping differs substantially in detail from shredding.

  15. Native XML DB contd… Native XML databases often store XML documents in collections, and queries can be made across a collection. A collection may be defined by a schema or may contain documents of differing structure. Many native XML databases use XQuery as the query language (XQuery 1.0 became a W3C Recommendation in January 2007). Updates to native XML databases currently lack standardization: XQuery 1.0 lacks insert, delete, and update functionality. Microsoft’s SQL Server, Oracle, Sybase Adaptive Server Enterprise, and IBM’s DB2 9 can store XML in a dedicated xml data type without discarding their traditional strengths as relational database management systems.

  16. Native XML DB contd… Whether you use a native XML database or an XML-enabled relational database product doesn’t matter! Three very different examples of native XML databases and XML-enabled database management systems: ❑ eXist is the most mature open-source XML database, written in Java. ❑ SQL Server is a Microsoft enterprise-capable relational database management system with some XML functionality. ❑ MySQL is the open-source database most widely used to power websites. Its XML capabilities are still well behind those of its commercial competitors.

  17. Using Native XML Databases. Obtaining and Installing eXist: eXist is a ready-to-run native XML database that can be used in three different modes: ❑ You can use eXist as a Java library to embed a database server in your own Java application. ❑ You can run it as a standalone database server, as you would run a SQL database server. ❑ You can run it embedded in a web server and get the features of both a standalone database and a web interface to access the database.

  18. Using eXist in the last two modes: each mode uses a different set of scripts, which you can find in the bin subdirectory: ❑ server (.sh or .bat, depending on your platform) is used to run eXist as a standalone database server. ❑ startup (.sh or .bat) is used to start eXist embedded in a web server, and shutdown (.sh or .bat) is used to stop this web server.

  19. Opening eXist home page

  20. Using the Web Interface. Administration: log in with a user name and password. Only registered users are allowed. Once logged in, you have access to the administration commands.

  21. Browsing Collection

  22. Create Collection: Create a new collection, Collection1. Upload XML documents to the newly created collection. The documents will be stored in /db/Collection1.

  23. Newblog.xml
      <?xml version="1.0"?>
      <item id="1">
        <title>Working on Beginning XML</title>
        <description>
          <p>
            <a href="http://www.wrox.com/WileyCDA/WroxTitle/productCd-0764570773.html">
              <img src="http://media.wiley.com/product_data/coverImage/73/07645707/0764570773.jpg" align="left"/>
            </a>
            I am currently working on the next edition of
            <a href="http://www.wrox.com/WileyCDA/WroxTitle/productCd-0764570773.html">WROX’s excellent “Beginning XML”.</a>
          </p>
        </description>
        <category>English</category>
        <category>XML</category>
        <category>Books/Livres</category>
        <pubDate>2006-11-13T17:32:01+01:00</pubDate>
        <comment-count>0</comment-count>
      </item>

  24. XQuery Sandbox: a web interface for querying XML documents, available at http://localhost:8080/exist/sandbox/sandbox.xql . Query example: /item[@id='1']

  25. XQuery Sandbox, Newblog.xml: To retrieve the title, id, and links of blog entries that link to the Wrox site:
      for $item in /item
      where $item//a[contains(@href, 'wrox.com')]
      return
        <match>
          <id>{string($item/@id)}</id>
          {$item/title}
          {$item//a[contains(@href, 'wrox.com')]}
        </match>
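For readers without an eXist installation, the logic of the query on this slide can be mirrored in plain Python. This is a sketch under assumptions, not a replacement for XQuery: it works on a single shortened copy of Newblog.xml rather than on a collection, and the function name `wrox_match` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened copy of Newblog.xml (assumed for illustration).
NEWBLOG = """<item id="1">
  <title>Working on Beginning XML</title>
  <description>
    <p>
      I am currently working on the next edition of
      <a href="http://www.wrox.com/WileyCDA/WroxTitle/productCd-0764570773.html">Beginning XML</a>
    </p>
  </description>
</item>"""


def wrox_match(xml_text):
    """Return a <match> element if the item links to wrox.com, else None."""
    item = ET.fromstring(xml_text)
    # Equivalent of: $item//a[contains(@href, 'wrox.com')]
    links = [a for a in item.iter("a") if "wrox.com" in a.get("href", "")]
    if not links:
        return None  # the "where" clause filters this item out
    match = ET.Element("match")
    ET.SubElement(match, "id").text = item.get("id")
    title = item.find("title")
    if title is not None:
        match.append(title)
    match.extend(links)
    return match


result = wrox_match(NEWBLOG)
```

The Python version needs an explicit loop and filter where XQuery expresses the same selection declaratively, and it handles one document at a time, whereas the sandbox query runs across every document the database holds.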

  26. XQuery Sandbox

  27. eXist client: a standalone graphical tool that can perform the same kinds of operations as the web interface. Once logged in with a user name and password, you can: browse collections; open and edit documents; query documents using XQuery or XPath. A Trace tab in the results window shows the execution path of queries.

  28. eXist client

  29. eXist client

  30. WebDAV (Web-based Distributed Authoring and Versioning): defines how HTTP can be used not only to read resources but also to write them. Properties: creation, removal, and querying of information about resources. Collections: resources are grouped into collections that are organized like a file system, similar to a directory or desktop folder. Locking: locks prevent others from editing the same content you are working on.
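The sketch below shows what WebDAV requests look like at the HTTP level, using only Python's standard library. The requests are built but deliberately not sent. The base URL is an assumption about where a local server exposes its WebDAV interface and must be adjusted for a real installation; the helper names are hypothetical.

```python
import urllib.request

# Assumed WebDAV base URL of a local server; adjust for a real installation.
BASE_URL = "http://localhost:8080/exist/webdav/db"


def propfind_request(collection, depth="1"):
    """Build (but do not send) a PROPFIND request listing a collection's members."""
    req = urllib.request.Request(f"{BASE_URL}/{collection}/", method="PROPFIND")
    req.add_header("Depth", depth)  # "1" = the collection and its direct children
    return req


def put_request(collection, name, xml_text):
    """Build (but do not send) a PUT request writing a document into a collection."""
    req = urllib.request.Request(
        f"{BASE_URL}/{collection}/{name}",
        data=xml_text.encode("utf-8"),
        method="PUT",
    )
    req.add_header("Content-Type", "application/xml")
    return req


listing = propfind_request("Collection1")
upload = put_request("Collection1", "Newblog.xml", "<item id='1'/>")
```

Passing either request to `urllib.request.urlopen` would send it; a real server would normally also require authentication, which is omitted here. PROPFIND corresponds to the "properties" and "collections" features on the slide, and PUT to writing resources.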

  31. XML IDE: One feature not present in WebDAV is the capability to execute queries. An XML IDE can access the eXist database through WebDAV, and queries can also be executed from the IDE itself.
