Characterizing XML • XML is a set of rules for building markup languages. It is not just glorified HTML or only for the internet. • XML is a family of technologies that can do everything from formatting documents to filtering data. • XML is a philosophy for information handling that seeks to maximize usefulness and flexibility of data by refining it to its purest and most structured form.
XML – A Family of Technologies • XML to Formatted Presentation (HTML, PDF, user created format, etc.) • Combining what we previously wanted to keep separate: markup and style. • Why? • Author can concentrate on meaning • Designer can concentrate on appearance • More options for presentation
XML Application – DocBook (http://www.oasis-open.org/docbook/) • DocBook – a markup language (DTD) for technical documentation of computer hardware and software. • Consists of several hundred elements. • Elements include <bibliography>, <keyword>, <itemizedlist>, <figure>, <table>
DocBook vs. Word Processor • <table> keeps tabular data as searchable markup, so a user does not need to paste it in as an image that cannot be searched. • Papers would be easier to write! Each publisher could still use its own display format (different CSS or XSLT style sheets) and authors only need to worry about content. • Searching for scientific papers on a topic would be easier and faster. • Search methods not possible now could be done. • Can search by element instead of just by keyword. • Can have more complex querying
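To make "search by element instead of just by keyword" concrete, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The sample document and the helper `search_by_element` are hypothetical, written only for illustration; the element names loosely follow the DocBook examples above and are not validated against the DocBook DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical DocBook-like fragment, for illustration only.
DOC = """
<article>
  <title>Parsing XML</title>
  <keyword>parser</keyword>
  <para>A table is not a parser.</para>
  <table>
    <row><entry>parser</entry><entry>SAX</entry></row>
  </table>
</article>
"""

def search_by_element(root, tag, term):
    """Return the text of every <tag> element whose text contains term."""
    return [el.text for el in root.iter(tag) if el.text and term in el.text]

root = ET.fromstring(DOC)

# The same word can be looked up in keywords only, or in table cells only.
keyword_hits = search_by_element(root, "keyword", "parser")
entry_hits = search_by_element(root, "entry", "parser")
print(keyword_hits)
print(entry_hits)
```

A plain keyword search would return every occurrence of "parser" in the file; restricting the search to one element type is what the markup makes possible.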
XML Applications and Tools • XML Application – a markup language derived from XML rules (e.g. DocBook) • XML Software applications – set of components, each performing a crucial step on an assembly line (XML processors) • XML Tools – commercially available programs that help a user work with XML
XML Processors • Set of components, each performing a crucial step on an assembly line: 1. Parser – translates XML markup and data into tokens 2. Event Switcher – routes tokens to event handling routines (CSS) 3. Tree Representation – a tree structure is built if more complicated processing of an XML document is needed a. Simple – hierarchy of nodes b. Object Model – each node represented as an object 4. Tree Processor – traverses a tree so operations can be done on the tree model; ranges from a validity checker to a full transformation engine
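The assembly line above can be sketched with Python's standard library, assuming `xml.sax` stands in for the parser and event switcher and `xml.etree.ElementTree` for the tree representation. The sample document, the `CountingHandler` class and the `walk` function are hypothetical names chosen for this illustration.

```python
import xml.sax
import xml.etree.ElementTree as ET

XML = "<fruit_basket><banana/><banana/><apple/></fruit_basket>"

class CountingHandler(xml.sax.ContentHandler):
    """Event handling routine: called by the parser for each start tag."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1

# Steps 1-2: the parser tokenizes the markup and routes events to the handler.
handler = CountingHandler()
xml.sax.parseString(XML.encode("utf-8"), handler)

# Step 3: build a tree representation (each node is an Element object).
root = ET.fromstring(XML)

# Step 4: a tree processor walks the tree; here it records each node's depth.
def walk(node, depth=0):
    yield depth, node.tag
    for child in node:
        yield from walk(child, depth + 1)

outline = list(walk(root))
print(handler.counts)
print(outline)
```

Event handling is enough for simple tasks such as counting; the tree is only needed when an operation has to see more than one element at a time.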
XML Software Applications (http://www.garshol.priv.no/download/xmltools/cat_ix.html) • Editing and Composition • Electronic Delivery • Control Information and Development • Conversion • Document Storage and Management • Parsers and Engines
XML Software Applications • XML has a lot of different parts and at first seems to be very complex. • Each part by itself is simple. • There is no need to be an expert in all of XML to be able to use it productively.
XML Software Applications • Editing and composition (Tools for interactive creation, modification and composition of XML documents.) • XML Editors • Text – emacs, vi • Graphical – HTML-Kit, etc. • Electronic delivery (Tools for electronic delivery and display of XML documents.) • XML Browsers • Amaya (W3C) – HTML/XHTML browser/editor with CSS and XLink support • InDelv XML Client – XSLT style sheets for display, supports XPath and XPointer • IE5.5 – displays XML with CSS or XSLT • Mozilla (Netscape) – displays XML with CSS • IBM Alphaworks – DTD aware
XML Software Applications • Control information and development (Tools for creating, modifying and documenting DTDs, XSL style sheets etc.) • CSS Editors/DTD Editors (similar to XML editors – some overlap) • DTD Documenters • LiveDTD – parses XML DTDs and generates HTML documentation files from the DTDs with cross-links to element and parameter entity definitions. • DTD Generators • Data Descriptors by Example – automatically generates an XML DTD or schema from a set of document instances. • Rhythmyx XSpLit – claims to be able to automatically generate an XML DTD and an XSLT style sheet from a sample HTML document. The XSLT style sheet can then be combined with XML documents that conform to the XML DTD to produce HTML pages with the same design as the original, but with new content. • SAXON – includes a small application that can generate DTDs from sample input files • DTD Parsers • Schema Converters • DTD2RELAX – converts a DTD into a RELAX schema module.
XML Software Applications • Control information and development (cont'd) • XSL Checkers • XSL Lint – checks XSLT style sheets for mistakes • XSL Trace – debugger for XSLT style sheets • XSL Converters • XSLT is a subset of the more general XSL; these programs convert from XSL to XSLT • XSLT Editors • XSLT Generators • WH2FO – reads HTML files produced by Microsoft Word and converts them into an XML document, with two XSLT style sheets: one for conversion back to HTML and one for conversion to XSL-FO (Extensible Stylesheet Language Formatting Objects – another subset of XSL) • Rhythmyx XSpLit
XML Software Applications • Conversion (Tools for scripted creation and modification of XML documents.) • General N-converters – convert from non-XML (usually a word processing document) to XML • General S-converters – process XML documents (Transformation) • Publishing Converters • TeXML – an XML DTD and a Java application called TeXMLatte. TeXMLatte takes TeXML documents and converts them to TeX. This can be used with, for example, an XSL XML-to-TeXML conversion to produce TeX output from XML source documents. TeXML can also convert to plain text.
XML Software Applications • Document Storage and Management (Tools for supporting document management, such as document databases and search engines.) • XML document database systems (systems for persistently storing XML documents and providing access to their structure and individual parts) • Lore – a DBMS built specifically for the XML data model, complete with query language, query optimizer, indexing, multi-user support and recovery. • XML-DBMS – a Java library that can be used to move data from XML to a relational database and also back again. • XML search engines • Xset – an XML search engine oriented towards performance. It keeps its working set in memory (using paging to support large documents) and can be accessed through RMI. The query language is very simple. • sgrep – a general tool for searching and indexing text that supports XML (and SGML). It also has its own very powerful query language.
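The idea of moving data from XML into a relational database and back can be illustrated with a small Python sketch. This is not the XML-DBMS library or its API; it assumes an in-memory SQLite database and a hypothetical one-table mapping (one row per `banana` element), chosen only to show the round trip.

```python
import sqlite3
import xml.etree.ElementTree as ET

XML = """
<fruit_basket>
  <banana banana_number="1"/>
  <banana banana_number="2"/>
  <banana banana_number="3"/>
</fruit_basket>
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE banana (position INTEGER, banana_number INTEGER)")

# XML -> relational: one row per <banana> element, keeping document order.
root = ET.fromstring(XML)
for position, banana in enumerate(root.findall("banana"), start=1):
    conn.execute(
        "INSERT INTO banana VALUES (?, ?)",
        (position, int(banana.get("banana_number"))),
    )

# Relational -> XML: rebuild the document from the stored rows.
rebuilt = ET.Element("fruit_basket")
for (number,) in conn.execute(
    "SELECT banana_number FROM banana ORDER BY position"
):
    ET.SubElement(rebuilt, "banana", banana_number=str(number))

round_trip = ET.tostring(rebuilt, encoding="unicode")
print(round_trip)
```

The `position` column is what preserves document order; relational tables have no inherent row order, so a mapping of this kind has to store it explicitly.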
XML Software Applications • Parsers and Engines (XML parsers, parsing toolkits, HyTime engines and DSSSL engines) • DOM implementations – a set of Java interfaces declaring methods that the developer must implement • DSSSL engines – for formatting SGML docs • RDF parsers • SGML/XML parsers • XLink/XPointer engines • Jaxen – parses XPath expressions and evaluates them against an XML tree representation • XML Middleware – general software packages for making XML-aware applications of some form. • XML Parsers – translate XML markup and data into tokens to be processed • XML Validators – software for validating XML documents by means other than DTDs. • XSL engines – engines that support the XSL formatting objects specification. • XSLT engines – engines that support the XSL Transformations specification.
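As a rough illustration of what an XPath engine does (evaluating path expressions against a tree), the sketch below uses the limited XPath subset built into Python's `xml.etree.ElementTree`. It is not one of the Java tools listed above, and the sample document is hypothetical.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<fruit_basket>"
    '<banana banana_number="1">ripe</banana>'
    '<banana banana_number="2">green</banana>'
    "<apple>red</apple>"
    "</fruit_basket>"
)

# Child step: every banana directly under the root.
all_bananas = root.findall("banana")

# Attribute predicate: the banana whose banana_number is 2.
second = root.find("banana[@banana_number='2']")

# Positional predicate: the last banana child.
last = root.find("banana[last()]")

print(len(all_bananas), second.text, last.text)
```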
DTD vs. XML Schema • DTD drawbacks: • Enforcing an element’s range of occurrences • A fruit_basket can have between 5 and 7 banana elements • <!ELEMENT fruit_basket ( (banana, banana, banana, banana, banana) | (banana, banana, banana, banana, banana, banana) | (banana, banana, banana, banana, banana, banana, banana) )> • Enforcing a numbering scheme on child elements • A fruit_basket can have 3 bananas each numbered 1 through 3 • Cannot be done with a DTD; the declarations below only restrict the attribute's allowed values, not their order or uniqueness • <!ELEMENT fruit_basket (banana*)> <!ELEMENT banana EMPTY> <!ATTLIST banana banana_number (1 | 2 | 3) "1">
DTD vs. XML Schema • Enforcing an element’s range of occurrences • Use XML Schema (here, between 9 and 11 banana elements) • <xsd:complexType name="fruit_basket"> <xsd:sequence> <xsd:element name="banana" minOccurs="9" maxOccurs="11"/> </xsd:sequence> </xsd:complexType>
DTD vs. XML Schema • Enforcing a numbering scheme on child elements – XSLT style sheet • <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="http://www.w3.org/1999/xhtml" version="1.0"><!-- Process fruit_basket element(s) --> <xsl:template match="fruit_basket"> <html> <body><!-- Validate number of banana children in fruit_basket --> <xsl:choose><!-- Note escaped form of boolean > and < operators --> <xsl:when test="count(banana) &gt; 8 and count(banana) &lt; 12"> <h3># of banana children OK</h3> </xsl:when> <xsl:otherwise> <h3>Whoops! # of banana children is <xsl:value-of select="count(banana)"/></h3> </xsl:otherwise> </xsl:choose><!-- Set up table of info about banana children --> <table border="1"> <tr> <th>banana #</th> <th>banana_number</th> </tr><!-- Process all banana children of fruit_basket --> <xsl:apply-templates select="banana"/> </table>
DTD vs. XML Schema • </body> </html> </xsl:template> <!-- Process banana element(s) --> <xsl:template match="banana"><!-- Each banana element goes in its own table row --> <tr> <th><xsl:value-of select="position()"/></th> <td><!-- Test for banana's position matching banana_number attribute value --> <xsl:choose> <xsl:when test="position() = @banana_number"> OK </xsl:when> <xsl:otherwise> <strong>Whoops!</strong>... <xsl:value-of select="@banana_number"/> </xsl:otherwise> </xsl:choose> </td> </tr> </xsl:template></xsl:stylesheet>
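For readers who do not have an XSLT engine at hand, the two checks the style sheet performs (the number of banana children, and each banana's position matching its banana_number attribute) can be approximated in plain Python. This is a sketch of the same logic, not an XSLT implementation; the function `check_basket` and its default range of 9 to 11 (taken from the style sheet's test) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def check_basket(xml_text, low=9, high=11):
    """Return (count_ok, misnumbered) for a fruit_basket document.

    count_ok    -- True when the number of banana children is in [low, high]
    misnumbered -- list of (position, banana_number) pairs that disagree
    """
    basket = ET.fromstring(xml_text)
    bananas = basket.findall("banana")
    count_ok = low <= len(bananas) <= high
    # Compare as strings; the style sheet compares numerically instead.
    misnumbered = [
        (pos, b.get("banana_number"))
        for pos, b in enumerate(bananas, start=1)
        if b.get("banana_number") != str(pos)
    ]
    return count_ok, misnumbered

# Nine bananas numbered 1..9: both checks pass.
good = "<fruit_basket>" + "".join(
    f'<banana banana_number="{i}"/>' for i in range(1, 10)
) + "</fruit_basket>"

# Two bananas, both numbered 2: too few, and the first is misnumbered.
bad = (
    '<fruit_basket><banana banana_number="2"/>'
    '<banana banana_number="2"/></fruit_basket>'
)

print(check_basket(good))
print(check_basket(bad))
```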
Cascading Style Sheets • Limitations to CSS • Its simplicity limits more complex formatting • Elements are processed in the order of their appearance • Arithmetic operations on element positions or values cannot be done • May be replaced by XSL-FO • More detailed than CSS • Is an XML application • Is more closely tied to XML’s nested-container structure
XSLT (subset of XSL) • XSLT (Extensible Stylesheet Language Transformations) • Allows data in a document to be used later for applications that do searches, queries and other sophisticated operations. • Transforms a document into something else
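"Transforming a document into something else" can be illustrated without an XSLT engine. The sketch below uses plain Python and `xml.etree.ElementTree` in place of XSLT, so it shows the idea of a transformation rather than the XSLT language itself. It reorders elements and does arithmetic on their values, two things listed earlier as beyond CSS. The sample data and the function `to_html` are hypothetical.

```python
import xml.etree.ElementTree as ET

SOURCE = """
<fruit_basket>
  <fruit name="banana" weight="120"/>
  <fruit name="apple" weight="150"/>
  <fruit name="cherry" weight="8"/>
</fruit_basket>
"""

def to_html(xml_text):
    """Transform a fruit_basket document into an HTML list, sorted by name."""
    basket = ET.fromstring(xml_text)
    fruits = sorted(basket.findall("fruit"), key=lambda f: f.get("name"))

    ul = ET.Element("ul")
    for fruit in fruits:
        li = ET.SubElement(ul, "li")
        li.text = f'{fruit.get("name")}: {fruit.get("weight")} g'

    # Arithmetic over element values, added as a final list item.
    total = sum(int(f.get("weight")) for f in fruits)
    ET.SubElement(ul, "li").text = f"total: {total} g"
    return ET.tostring(ul, encoding="unicode")

html = to_html(SOURCE)
print(html)
```

The source document is left untouched; the output is a new tree with a different vocabulary, which is the essential difference between transformation and styling.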