PROFESSIONAL VISUAL BASIC 6 XML PART 4 - VARIABLES AND PARAMETERS (Page 2)

DESCRIPTION

By the time you have read this section, you will have found that its title is a bit deceptive. As a programmer, you have certain expectations when you start learning a new programming language. One of them is that you can store values in variables, change them and retrieve them later. Although XSLT has an element called variable, you cannot actually do much with it: once a value is bound, it cannot be changed. This may sound unbelievable, but it follows from the way XSLT works. Beginning XSLT programmers often have difficulty grasping this.


This free tutorial is a sample from the book Professional Visual Basic 6 for XML.


html

If the method attribute on the output element is set to html, the results of some of the other attributes change a bit compared to the xml method.

  • The version attribute now refers to the version of HTML, with a default value of 4.0. The processor will try to make the output conform to the HTML specification.
  • Empty elements in the destination document will be output without a closing tag. Think of HTML elements like BR, HR, IMG, INPUT, LINK, META and PARAM.
  • Textual content of the script and style elements will not be escaped. So if the XSLT document contains this literal fragment:

<script>if (a &gt; b) doSomething()</script>

This will be output as:

<script>if (a > b) doSomething()</script>

  • If any non-ASCII characters are used, the processor should try to use HTML escaping in the output (&euml; instead of &#235;).
  • If an encoding is specified, the processor will try to add a META element to the HEAD of the document. This will also contain the value for media-type (default is text/html).

<HEAD>
<META http-equiv="Content-Type" content="text/html; charset=EUC-JP">
...
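
As a minimal sketch, the stylesheet below combines the attributes discussed above in one output element. The EUC-JP encoding is taken from the META example; the TITLE text and the page content are purely illustrative assumptions.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- html method: empty elements lose their closing tag,
       script and style content is not escaped -->
  <xsl:output method="html" version="4.0"
    encoding="EUC-JP" media-type="text/html"/>

  <xsl:template match="/">
    <HTML>
      <HEAD>
        <!-- the processor should add the Content-Type META element
             directly after the HEAD start tag -->
        <TITLE>Illustrative page</TITLE>
      </HEAD>
      <BODY>
        <!-- written as an empty XML element, serialized as <BR> -->
        <BR/>
      </BODY>
    </HTML>
  </xsl:template>
</xsl:stylesheet>
```

Whether the META element actually appears depends on the processor, since this is a recommendation for the processor rather than a guarantee.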

text

If the method attribute is set to text, the output is restricted to the string value of every node. The media-type defaults to text/plain, but you can use other MIME types. Think of generating RTF documents from an XML source document. These contain no XML markup, so the most appropriate method is text, with media-type set to application/msword. The encoding attribute can still be used, but the default value is system dependent (on most Windows PCs it will be ISO-8859-1).
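
The RTF case could be sketched as follows. This is an illustration under assumptions, not a complete RTF generator: it wraps the string value of the document element in a bare RTF header and does not escape characters that are special in RTF (backslash and braces).

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- no markup is produced, only text -->
  <xsl:output method="text" media-type="application/msword"
    encoding="ISO-8859-1"/>

  <xsl:template match="/">
    <!-- minimal RTF header; braces are plain text inside xsl:text -->
    <xsl:text>{\rtf1\ansi </xsl:text>
    <!-- string value of the document element of the source -->
    <xsl:value-of select="/*"/>
    <xsl:text>}</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```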

Let's have a look at an example. The following stylesheet is used:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" indent="yes"/>
<xsl:template match="/">
<HTML><BODY>
 <TEST>
  This is literal text with an ëxtended character
  <BR/>
  <TABLE>
   <TR><TD>Cell data</TD>
   <TD>Second cell</TD></TR>
  </TABLE>
 </TEST>
</BODY></HTML>
</xsl:template>
</xsl:stylesheet>

We apply this stylesheet to an arbitrary well-formed XML document. Because the template contains only literal elements, the result tree is always the same, whatever the source. We will now change only the output method and look at the result. First, the result for the xml method:

<?xml version="1.0" encoding="utf-8"?>
<HTML>
<BODY>
<TEST>This is literal text with an ëxtended character
 <BR/>
<TABLE>
<TR>
<TD>Cell data</TD>
<TD>Second cell</TD>
</TR>
</TABLE>
</TEST>
</BODY>
</HTML>

Note that every element starts on a new line. This is the result of the indent="yes" attribute. If this had not been specified, all content would be concatenated on one line. This XSLT processor has defaulted its output to encoding UTF-8. UTF-8 supports the extended character ë, so this is not escaped.

Setting the method to html would generate:

<HTML>
<BODY>
<TEST>This is literal text with an &euml;xtended character
 <BR>
<TABLE>
<TR>
<TD>Cell data</TD><TD>Second cell</TD>
</TR>
</TABLE>
</TEST>
</BODY>
</HTML>

Note that the XML declaration has disappeared and that the processor has chosen slightly different formatting around the TD elements. The processor has been instructed to indent the resulting document, but in html mode it may do so only in places where the extra white space cannot influence the appearance of the document in a browser. Also, the non-ASCII character ë is escaped using the preferred HTML entity &euml; (not the numeric character reference &#235;).

Using the text method, the result would be:

This is literal text with an ëxtended character
 Cell dataSecond cell

Only the string values of the nodes have been printed. The specified encoding is used, so the special character is no problem. Note that no white space appears between the values of the two TD elements. We will see more on white space in the next sections.
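
If a separator between the two cell values is wanted, one way is to add it explicitly with xsl:text, whose content is never stripped. The reduced stylesheet below is a sketch of that idea, keeping only the table row from the example.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <TR>
      <TD>Cell data</TD>
      <!-- a white space-only text node survives only inside xsl:text -->
      <xsl:text> </xsl:text>
      <TD>Second cell</TD>
    </TR>
  </xsl:template>
</xsl:stylesheet>
```

With the text method this yields "Cell data Second cell", because the line breaks between the literal elements are stripped from the stylesheet while the space inside xsl:text is kept.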

strip-space and preserve-space

What exactly happens to the white space in a document and in the XSLT document itself? This is one of the subjects that often puzzle XML developers. Spaces, tabs and linefeeds seem to emerge and disappear at random. And then there are the XSLT elements to influence them: strip-space, preserve-space and the indent attribute on the output element. Let's take a closer look.

During a transformation, there are basically two moments when white space can appear or vanish:

  • When parsing the source and stylesheet documents and constructing a tree.
  • When serializing the generated result tree to the destination document.

Before any processing occurs, the XSLT processor loads the source and stylesheet into memory and starts to strip unnecessary white space. It removes every text node that meets all of the following conditions. The text nodes:

  • consist entirely of white space characters.
  • have no ancestor node with the xml:space attribute set to preserve.
  • are not children of a white space-preserving element.

For the stylesheet, the only white space-preserving parent element is xsl:text. For the source document, the list of white space-preserving elements can be set using the strip-space and preserve-space elements in the stylesheet. By default, all elements in the source document preserve white space. With the elements attribute of strip-space, you can specify which elements should not preserve white space. Adding elements back to the list of elements that have their white space preserved is done with preserve-space. The elements attribute of both accepts a white space-separated list of name tests (an element name, a prefix followed by :*, or * for all elements), not full XPath expressions. If an element in the source matches more than one name test, the conflict is resolved following the rules for conflicts between matching templates.

So if a stylesheet contained these white space elements:

<xsl:strip-space elements="*"/>
<xsl:preserve-space elements="PRE CODE"/>

The processor would strip all white space-only text nodes in the source document, except for those directly inside a PRE element or a CODE element.
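
To see the effect for yourself, a small test stylesheet such as the sketch below can help. It counts the text nodes that remain in the source tree; the sample source shown in the comment is an assumption made for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical source document:
       <DOC>
         <P>text</P>
         <PRE> </PRE>
       </DOC>
  -->

  <xsl:strip-space elements="*"/>
  <xsl:preserve-space elements="PRE CODE"/>

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- number of text nodes left after white space stripping -->
    <xsl:value-of select="count(//text())"/>
  </xsl:template>
</xsl:stylesheet>
```

Running it with and without the strip-space and preserve-space elements shows how many white space-only text nodes they remove. Note that some parsers can also discard white space while loading the source, before the XSLT processor sees it, which changes the count.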

After stripping space from the source and stylesheet documents, the processing occurs. The generated tree of nodes is then persisted to a string or file. By default, no new white space is added to the result document, except if the output element has its indent attribute set to yes.

attribute-set

At the top level of the stylesheet, it is possible to define groups of attributes that you need to include together in many elements. By grouping them, the XSLT document can be smaller and easier to maintain:

<xsl:template match="chapter/heading">
<font xsl:use-attribute-sets="title-style">
 <xsl:apply-templates/>
</font>
</xsl:template>

<xsl:attribute-set name="title-style">
<xsl:attribute name="size">3</xsl:attribute>
<xsl:attribute name="face">Arial</xsl:attribute>
</xsl:attribute-set>

Here the attribute-set element defines a group of two attributes that are often used together. In the template for chapter headings, the attribute set is applied to a literal element through the xsl:use-attribute-sets attribute. The same can be done on the element, copy and attribute-set elements, where the attribute is written without a prefix as use-attribute-sets. Be careful that an attribute set never uses itself (directly or indirectly), as this would generate an error.
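
The following sketch shows the unprefixed form on both attribute-set and element. The base-font set is a hypothetical name introduced for this illustration; title-style is split so that it builds on it.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- hypothetical base set, reused by title-style -->
  <xsl:attribute-set name="base-font">
    <xsl:attribute name="face">Arial</xsl:attribute>
  </xsl:attribute-set>

  <!-- an attribute set may include other sets, but never itself -->
  <xsl:attribute-set name="title-style" use-attribute-sets="base-font">
    <xsl:attribute name="size">3</xsl:attribute>
  </xsl:attribute-set>

  <xsl:template match="chapter/heading">
    <!-- on xsl:element the attribute carries no xsl: prefix -->
    <xsl:element name="font" use-attribute-sets="title-style">
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

The generated font element carries both the face and the size attribute.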

namespace-alias

The namespace-alias element is used in very special cases, especially when transforming a source document to an XSLT document. In this case, you want the destination document to hold the XSLT namespace and lots of literal XSLT elements, but you don't want these to interfere with the transformation process. See the problem? You are shooting yourself in the foot there.

Using namespace-alias, you can use another namespace in the stylesheet, but have the declaration for that namespace show up in the destination document with another URI:

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:axsl="http://www.w3.org/1999/XSL/TransformAlias">

<xsl:namespace-alias stylesheet-prefix="axsl" result-prefix="xsl"/>

<xsl:template match="/">
<axsl:stylesheet>
 <xsl:apply-templates/>
</axsl:stylesheet>
</xsl:template>
...
</xsl:stylesheet>

Instead of declaring the literal XSLT output elements in their real namespace, they have a fake namespace in this document. In the destination document, the same prefixes will be used, but they will refer to another URI:

 
<?xml version="1.0" encoding="utf-8"?>
<axsl:stylesheet xmlns:axsl="http://www.w3.org/1999/XSL/Transform">
...
</axsl:stylesheet> 
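
A slightly fuller sketch shows why the alias matters once the generated stylesheet contains templates of its own. The FIELD element and its name attribute are assumptions about the source document, made only for this illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:axsl="http://www.w3.org/1999/XSL/TransformAlias">

  <xsl:namespace-alias stylesheet-prefix="axsl" result-prefix="xsl"/>

  <xsl:template match="/">
    <!-- literal element: copied to the result, not executed -->
    <axsl:stylesheet version="1.0">
      <xsl:apply-templates select="//FIELD"/>
    </axsl:stylesheet>
  </xsl:template>

  <!-- FIELD and @name are hypothetical source names -->
  <xsl:template match="FIELD">
    <!-- the braces are an attribute value template, evaluated now -->
    <axsl:template match="{@name}">
      <!-- this value-of is output literally and runs only later,
           when the generated stylesheet is used -->
      <axsl:value-of select="."/>
    </axsl:template>
  </xsl:template>
</xsl:stylesheet>
```

Elements with the xsl prefix are instructions for the current transformation, while elements with the axsl prefix end up in the generated stylesheet, bound to the real XSLT namespace URI.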

key

The key element is a very special one. It will take a little time to discover its full potential. It is more or less analogous to creating an index on a table in a relational database. It allows you to access a set of nodes in a document directly with the key() function, using an identifier of that node that you specify. Let's describe an example. We could, using the key element, define that the key person-by-name gives us access to PERSON elements by passing the value of their name attribute. If the key is set up correctly, we would use key('person-by-name', 'Teun') to get a result set of PERSON elements that have their name attribute set to 'Teun'.

To set up this key, you would use the element like this:

<xsl:key name="person-by-name" match="PERSON" use="@name"/>

Consider what each of the attributes name, match and use specifies. The name attribute is simple: it identifies a specific key, of which there may be many. The match attribute holds a pattern that nodes must match to be indexed by this key; it uses the same pattern syntax as the match attribute of a template. It is not a problem if the same node is indexed by multiple keys. For each matching node, the XPath expression in the use attribute is evaluated with that node as context. The string value of the result is the value by which the node can be retrieved. Multiple nodes can yield the same value; when the key() function is called with this value, it returns a node set holding all of them. The result of the use expression can also be a node set. In this case, each of its nodes is converted to a string, and each of these strings can be used to retrieve the indexed node.
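
The node-set case is easiest to see in a sketch. Assume, hypothetically, that each PERSON element has any number of NICKNAME child elements; neither NICKNAME nor the key name person-by-nickname comes from the examples in this chapter.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- use selects a node set: every NICKNAME child becomes a key value -->
  <xsl:key name="person-by-nickname" match="PERSON" use="NICKNAME"/>

  <xsl:template match="/">
    <!-- all PERSON elements with at least one NICKNAME equal to 'Ted' -->
    <xsl:for-each select="key('person-by-nickname', 'Ted')">
      <xsl:value-of select="@name"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

A PERSON with three NICKNAME children can be retrieved through any of the three values.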

Don't worry if you can't see the point of this yet. We will work through an extensive example. Suppose we have this XML document:

<?xml version="1.0"?>
<FAMILY>
<TRADITIONAL_NAMES>
 <NAME>Peter</NAME>
 <NAME>Mary</NAME>
</TRADITIONAL_NAMES>
<PERSON name="Peter">
 <CHILDREN>
  <PERSON name="Peter"/>
  <PERSON name="Archie"/>
 </CHILDREN>
</PERSON>
</FAMILY>

You are transforming the XML source with an XSLT document that starts like this:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >

<xsl:key name="all-names" match="PERSON" use="@name"/>
<xsl:key name="parents-names" match="PERSON[CHILDREN/PERSON]" 
use="@name"/>
...

If you now use the key() function, your results will be:
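
As a sketch, the complete stylesheet below makes a few key() calls against the FAMILY document and prints the size of each returned node set. The counts in the comments are worked out from the two key definitions and the source document above.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:key name="all-names" match="PERSON" use="@name"/>
  <xsl:key name="parents-names" match="PERSON[CHILDREN/PERSON]"
    use="@name"/>

  <xsl:template match="/">
    <!-- 2: the outer PERSON and its child that is also named Peter -->
    <xsl:value-of select="count(key('all-names', 'Peter'))"/>
    <xsl:text> </xsl:text>
    <!-- 1: the child named Archie -->
    <xsl:value-of select="count(key('all-names', 'Archie'))"/>
    <xsl:text> </xsl:text>
    <!-- 1: only the outer Peter has PERSON children -->
    <xsl:value-of select="count(key('parents-names', 'Peter'))"/>
    <xsl:text> </xsl:text>
    <!-- 0: Archie has no children, so this key does not index him -->
    <xsl:value-of select="count(key('parents-names', 'Archie'))"/>
    <xsl:text> </xsl:text>
    <!-- 2: a node-set argument looks up each NAME in turn;
         Peter matches twice, Mary matches no PERSON -->
    <xsl:value-of
      select="count(key('all-names', /FAMILY/TRADITIONAL_NAMES/NAME))"/>
  </xsl:template>
</xsl:stylesheet>
```

The last call shows that the second argument of key() can itself be a node set, in which case the results for the string value of each node are combined.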

Now what are the cases where using a key is a good idea? Think of situations where XML elements often refer to each other using some sort of ID, but without using the validation rules for IDs (because these are sometimes too rigid). The key construct can:

  • keep your code more readable.
  • help performance, depending on the implementation. The XSLT processor can keep a hash-table structure in memory of all key references in the source document. If these references are often used, performance gains can be substantial.
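
A typical cross-reference lookup could be sketched like this. The ORDER and CUSTOMER elements, their attributes and the key name customer-by-id are hypothetical and serve only to illustrate the pattern.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- index every CUSTOMER by its id attribute -->
  <xsl:key name="customer-by-id" match="CUSTOMER" use="@id"/>

  <!-- each ORDER refers to a CUSTOMER through its customer attribute -->
  <xsl:template match="ORDER">
    <!-- the same lookup without a key would be
         //CUSTOMER[@id = current()/@customer]
         which may scan the whole document for every ORDER -->
    <xsl:value-of select="key('customer-by-id', @customer)/@name"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- suppress other text so only the looked-up names are printed -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

The key() call states the intent of the lookup directly, and it leaves the processor free to answer it from an index instead of searching the document again.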

Continued...



Saturday 21st November 2009  © COPYRIGHT 2009 - VISUALSOFT