Chapter 7 - Transformation - XSLT Learning XML by Erik T. Ray

Slides were developed by Jack Davis, College of Information Science and Technology, Radford University

XSLT
• Transformation is one of the most important and useful techniques for working with XML.
• Transforming an XML document serves many purposes:
  - changing a non-presentational document into a form that can be displayed
  - changing one tag set into another
  - extracting specific pieces of information and formatting them in a different way
  - changing an XML document into text, e.g. transforming an XML data file into a comma-delimited format for Excel
  - reformatting or generating new content
  - many, many more

XSLT Concepts
• An XSLT processor takes two inputs:
  - an XSLT stylesheet to govern the transformation
  - an input document (the source tree)
• The XSLT processor generates one output:
  - the result tree (usually a document)
• The XSLT stylesheet controls the transformation process. XSLT is really a script or program, not just a stylesheet.
• The XSLT processor is a state engine: at any point in time it has a state, and rules drive processing forward based on that state.
  - The state consists of a set of nodes, and the process is recursive: each node processed may have children that also need processing. (The current node may be set aside until child processing is complete.)

XSLT Concepts (cont.)
• An XSLT engine begins by reading the XSLT stylesheet and caching it as a lookup table. For each node in the source it looks in the table for the best matching rule. The rule specifies what to put in the output tree and also how to continue.
• Starting from the root node, rules are found, nodes are processed, and results are put in the output tree. This process continues until all input nodes are processed.
• Consider the following XML document excerpt:
  - example 7-01
• Suppose we want to transform the XML document to an html document:
  - XSL stylesheet
• A template is a mixture of markup, text content, and XSLT instructions. The instructions may be conditional statements, content formatting functions, or instructions to redirect processing to other nodes.
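As a rough illustration of such a stylesheet (this is not the book's example 7-01, which is not reproduced in these slides), the sketch below maps a tiny document to HTML. The element names memo, subject, and body are made up for this sketch.

```xml
<?xml version="1.0"?>
<!-- Hypothetical source assumed by this sketch:
     <memo><subject>Lunch</subject><body>Noon, room 12.</body></memo> -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Rule for the document element: emit the HTML skeleton, then
       hand the children back to the processor. -->
  <xsl:template match="memo">
    <html>
      <head><title><xsl:value-of select="subject"/></title></head>
      <body><xsl:apply-templates/></body>
    </html>
  </xsl:template>

  <!-- Each source element is mapped to a different result element. -->
  <xsl:template match="subject">
    <h1><xsl:apply-templates/></h1>
  </xsl:template>

  <xsl:template match="body">
    <p><xsl:apply-templates/></p>
  </xsl:template>
</xsl:stylesheet>
```

Each template mixes literal markup (the HTML elements) with XSLT instructions (value-of, apply-templates), which is the mixture the last bullet describes.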

XSLT Output
• The output of the XSLT transformation process:
  - html document
• The elements in the source tree have been mapped to different elements in the result tree.
• If the transformation will be done by the web server or client, you must include a reference to the stylesheet in the document as a processing instruction, similar to the one used to associate documents with CSS stylesheets.
    <?xml-stylesheet type="text/xml" href="mytrans.xsl"?>
• The type attribute for IE 6 must read as follows: type="text/xsl"

Namespaces
• XSLT can be extended by the implementer to perform special functions not contained in the specification. For example, you can add a feature to redirect output to multiple files. These extensions are identified by a separate namespace that you must declare if you want to use them. To make things clear for the XSLT engine, you should also set the attribute extension-element-prefixes to contain the namespace prefixes of the extensions.
• The element below declares namespaces for XSLT control elements (prefix xsl) and implementation-specific elements (prefix ext). Finally, it specifies version 1.0 of XSLT in the last attribute.
    <xsl:stylesheet
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:ext="http://www.myxslt.org/extentions"
        extension-element-prefixes="ext"
        version="1.0">

Templates
• An XSLT stylesheet is essentially a collection of templates. Each template associates a condition (for example, an element in the source tree with a particular attribute) with a mixture of output data and instructions.
• Templates are matched to nodes based on priority; typically the most specific match has a higher priority than a more general match. For example, one template may match all elements with the XPath expression *. Another may match a specific element, while a third matches that element and further requires an attribute. Templates can also have a priority specified using a priority attribute.
• Templates are compact pieces of code that are easy to read and manage. The match and priority attributes show exactly when each template is to be used.
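The three levels of specificity described above can be sketched as follows. The element name para and the attribute role are assumptions chosen for illustration; the priorities in the comments are the XSLT 1.0 default priorities for these pattern forms.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Most general: any element. Default priority -0.5. -->
  <xsl:template match="*">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- More specific: one named element. Default priority 0. -->
  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>

  <!-- Most specific: the named element plus an attribute test.
       Default priority 0.5. -->
  <xsl:template match="para[@role='note']">
    <p class="note"><xsl:apply-templates/></p>
  </xsl:template>
</xsl:stylesheet>
```

For a para element with role="note", all three patterns match, and the third template is chosen because it has the highest priority.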

Matching Nodes
• XSLT patterns used inside the match attributes of template elements are a subset of XPath expressions. The first restriction on XSLT patterns is that only descending axes may be used: child and attribute.
• Paths are evaluated right to left, not left to right as is usual with XPath.
• As the processor moves through the source tree, it keeps a running list of nodes to process next, called the context node set. Each node in this set is processed in turn. The processor looks at the set of rules in the stylesheet, finds those that apply to the node to be processed, and out of this set selects the best matching rule.
• The right-to-left processing helps the XSLT engine prioritize eligible templates.
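A few patterns of the kind allowed in a match attribute are sketched below. The element and attribute names (chapter, title, entry, id, section, para) are assumptions for illustration only.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Read right to left: a title element whose parent is a chapter. -->
  <xsl:template match="chapter/title">
    <h2><xsl:apply-templates/></h2>
  </xsl:template>

  <!-- Attribute axis: an id attribute that belongs to an entry element. -->
  <xsl:template match="entry/@id">
    <xsl:text>#</xsl:text>
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- The // operator is also permitted: a para with a section
       somewhere among its ancestors. -->
  <xsl:template match="section//para">
    <p><xsl:apply-templates/></p>
  </xsl:template>
</xsl:stylesheet>
```

In each case the processor starts from the candidate node (the rightmost step) and checks the steps to its left against the node's parent or ancestors.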

Rule Conflicts
• More than one rule may match a node. In this case, the XSLT processor must select one rule from the candidates. The basic assumption is that rules that are more specific in their application take precedence over rules that are more general. The rules for selecting the best match are:
  - if the pattern contains multiple alternatives separated by the vertical bar ( | ), each alternative is treated with equal importance.
  - a pattern that contains specific hierarchical information has higher priority than a pattern that contains general information.
  - a wildcard is more general than a specific element or attribute name and therefore has lower priority.
  - a pattern with a successful test expression in square brackets ([]) overrides a pattern with no test expression.
  - other information, such as position in the stylesheet, may be considered if these rules don't establish a priority.

Rules (cont.)
• The xsl:template element has an optional priority attribute that can be set to give it precedence over other rules and override the process of determination. The value must be a real number, written as an integer or a decimal (for example 1, 0.5, or -2), and can be positive, negative, or zero. A larger number overrides a smaller number.
• Default Rules
  - XSLT defines a set of default rules to make the job of writing stylesheets easier. If no rule from the stylesheet matches, the default rules provide an emergency backup system.
  - Their general behavior is to carry over any text data in elements from the source tree to the result tree, and to assume an implicit xsl:apply-templates element to allow recursive processing.
  - Attributes are not visited by the implicit xsl:apply-templates, so attributes without matching templates are not processed.
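The default rules behave roughly as if the templates below were present in every stylesheet. This is a sketch based on the built-in rules of XSLT 1.0; the stylesheet wrapper is added here only to make the fragment well-formed, and real built-in rules have lower precedence than anything you write yourself.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Elements and the root: produce nothing themselves, but keep
       descending into the children. -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Text and attribute nodes: copy the value to the result tree.
       Attributes only reach this rule if some template selects them. -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Comments and processing instructions: dropped. -->
  <xsl:template match="processing-instruction()|comment()"/>
</xsl:stylesheet>
```

A stylesheet with no templates of its own therefore outputs just the text content of the source document.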

apply-templates instruction
• The apply-templates element interrupts the current processing in the template and forces the XSLT engine to move on to the children of the current node, or to the nodes named in its select attribute. This enables recursive behavior so that processing can descend through the tree of a document. It is called apply-templates because the processor has to find new templates to process those nodes.
    <xsl:template match="manual">
      <html>
        <head><title>Instructions Guide</title></head>
        <body>
          <h1>Instructions Guide</h1>
          <xsl:apply-templates select="parts-list" />
        </body>
      </html>
    </xsl:template>

for-each instruction
• The for-each element creates a template-within-a-template. Instead of relying on the XSLT engine to find matching templates, this directive encloses its own region of markup. Inside that region, the context node set is redefined to a different node set, again determined by a select attribute. Once outside the for-each, the old context node set is restored.
    <xsl:template match="book">
      <xsl:for-each select="chapter">
        <xsl:text>Chapter </xsl:text>
        <xsl:value-of select="position()" />
        <xsl:text>. </xsl:text>
        <xsl:value-of select="title" />
        <xsl:text>&#10;</xsl:text>
      </xsl:for-each>
      <xsl:apply-templates />
    </xsl:template>

Output
• Here's the output from the for-each loop in the previous XSLT stylesheet.
    Chapter 1. Teething on Transistors: My Early Years
    Chapter 2. Running With the Geek Gang
    Chapter 3. My First White Collar Crime
    Chapter 4. Hacking the Pentagon

Named Templates
• Named templates are similar to functions in programming. You set aside a block of code and give it a name. Later, you can reference that function and pass it data through arguments. This makes code simpler and easier to read overall, and functions keep frequently used code in one place for easier maintenance.
• A named template is like any other template except that it has a name attribute. You can use this with a match attribute or in place of one. Its value is a qualified name that uniquely identifies the template.
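A minimal sketch of defining and calling a named template follows. The template name labelled-line, the parameter label, and the source elements warning and note are all hypothetical names chosen for this illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- The "function": a named template with one parameter.
       'Note' is the default used when no argument is passed. -->
  <xsl:template name="labelled-line">
    <xsl:param name="label" select="'Note'"/>
    <p>
      <b><xsl:value-of select="$label"/>: </b>
      <!-- call-template does not change the current node, so this
           processes the children of the calling template's node. -->
      <xsl:apply-templates/>
    </p>
  </xsl:template>

  <!-- Call with an explicit argument. -->
  <xsl:template match="warning">
    <xsl:call-template name="labelled-line">
      <xsl:with-param name="label" select="'Warning'"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Call relying on the default parameter value. -->
  <xsl:template match="note">
    <xsl:call-template name="labelled-line"/>
  </xsl:template>
</xsl:stylesheet>
```

The shared paragraph formatting lives in one place, and each matching template only decides which label to pass.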

Processing Instructions
• Creating processing instructions and comments is a simple task. The element processing-instruction takes a name attribute and some textual content to create a processing instruction:
    <xsl:template match="marker">
      <xsl:processing-instruction name="formatter">
        pagenumber=<xsl:value-of select="@page" />
      </xsl:processing-instruction>
    </xsl:template>
• This rule creates the following output:
    <?formatter pagenumber=1?>
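Comments are created the same way with the comment element. The sketch below is an alternative rule for the same marker element, assuming it carries a page attribute as in the rule above; the wording of the comment text is made up for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Emits an XML comment whose text includes the page attribute. -->
  <xsl:template match="marker">
    <xsl:comment> page <xsl:value-of select="@page"/> starts here </xsl:comment>
  </xsl:template>
</xsl:stylesheet>
```

For a marker with page="1", this writes a comment reading "page 1 starts here" into the result tree.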

Sorting
• Elements often must be sorted to make them useful. Catalogs and surveys are two examples of documents that require sorting. Imagine a telephone book sorted by three keys: last name, first name, and town. The document might look like this:
    <telephone-book>
      …
      <entry id="44456">
        <surname>Mentary</surname>
        <firstname>Rudy</firstname>
        <town>Simpleton</town>
        <street>123 Bushwack Ln</street>
        <phone>555-1234</phone>
      </entry>
      …
    </telephone-book>
• The sort elements are applied in the order written, so in the rule below town is the primary key, followed by surname and then firstname:
    <xsl:template match="telephone-book">
      <xsl:apply-templates>
        <xsl:sort select="town" />
        <xsl:sort select="surname" />
        <xsl:sort select="firstname" />
      </xsl:apply-templates>
    </xsl:template>

Examples
• classic_cars1, inventory.xsl
• classic_cars2, inventory2.xsl
• classic_cars3, inventory3.xsl
• classis_cars4, inventory4.xsl
• classic_cars_if, inventory_if.xsl

Combining Stylesheets
• XSLT provides two ways to combine stylesheets: inclusion and importing.
• Including a stylesheet means inserting its contents directly into the target stylesheet. All the rules and directives will be treated as if they were in your stylesheet all along. The include element has an href attribute, which holds a URI for the stylesheet to include. This element can be inserted anywhere in a stylesheet as long as it isn't inside a rule.
• The element import can also be used to insert one stylesheet into another. It also uses an href attribute to specify a stylesheet, but it can be placed only at the very top of the stylesheet, before any other rules or directives.
• The advantage of the import element is that your own rules can override parts of a more complete set of rules to customize the results. Include puts rules in at the same level of precedence as your own; import gives you more control over the remote set, allowing you to pick and choose among rules.

Combining Stylesheets (cont.)
• There may be times when you want to override your own rules in favor of those that are imported for a localized region. The element apply-imports is analogous to apply-templates, except that it considers only imported rules and ignores those that are physically present. Also, it operates only on the current node, whereas apply-templates operates on whatever nodes you select (the child nodes by default).
• You can include or import any number of stylesheets. The order of the import elements is used to break ties between conflicting rules from different imported sets: in XSLT 1.0, a stylesheet imported later has higher import precedence, so later imports override earlier ones.
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:import href="basic_style.xsl" />
      <xsl:import href="table_styles.xsl" />
      <xsl:import href="chem_formulae.xsl" />
      …..
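A minimal sketch of apply-imports follows. It assumes, purely for illustration, that basic_style.xsl already contains a rule matching a para element; the element name para and the role attribute are hypothetical.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:import href="basic_style.xsl"/>

  <!-- This local rule overrides the imported rule for warning
       paragraphs, wraps the result in a div, and then hands the same
       node back to the imported rule. -->
  <xsl:template match="para[@role='warning']">
    <div class="warning">
      <xsl:apply-imports/>
    </div>
  </xsl:template>
</xsl:stylesheet>
```

This keeps the imported formatting for para intact while adding local markup around it, instead of duplicating the imported rule's body.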

Modes
• At times we want to treat nodes differently depending on where they are used in the document. For example, you may want footnotes in tables to be lettered instead of numbered. XSLT provides special rule modifiers called modes to accomplish this.
• To set up a mode, add a mode attribute set to a particular label to the affected template and to the template-calling elements. The mode label can be anything you want as long as it's unique among mode labels.
  - mode example ch 7
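The footnote case could be sketched roughly as below. The element names table and footnote and the mode label in-table are assumptions for this sketch, not taken from the chapter's own example.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Entering a table switches processing into the "in-table" mode. -->
  <xsl:template match="table">
    <table>
      <xsl:apply-templates mode="in-table"/>
    </table>
  </xsl:template>

  <!-- Default mode: footnote marks are numbered. -->
  <xsl:template match="footnote">
    <sup><xsl:number level="any" format="1"/></sup>
  </xsl:template>

  <!-- in-table mode: footnote marks are lettered, counting from the
       start of the enclosing table. -->
  <xsl:template match="footnote" mode="in-table">
    <sup><xsl:number level="any" from="table" format="a"/></sup>
  </xsl:template>
</xsl:stylesheet>
```

Elements inside the table that have no in-table template fall through to the built-in rules, which stay in the same mode while descending. A complete stylesheet would add in-table templates for rows and cells to keep their markup.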
