Saturday, August 30, 2008

Schema aware processing with XSLT 1.0

I wrote an article about implementing some of the Schema aware stylesheet ideas (as defined in the XSLT 2.0 spec) using an XSLT 1.0 processor, Java extensions, and a suitable validating XML parser. A requirement from an XSLT user on this blog motivated me to work on this idea.

The article on my web site shows how a few of the XSLT 2.0 Schema aware facilities, such as validating the output tree (or even a result tree fragment) prior to serialization, can be implemented. This covers one of the most important aspects of Schema aware XSLT 2.0 stylesheet design.

I think it's not possible to implement the following Schema aware XSLT 2.0 facilities using XSLT 1.0 and extensions alone:

1) Passing the validated XML instance tree to the XSLT processor, so that the stylesheet can access the Schema annotated XPath 2.0 data model tree. The XSLT 2.0 and XPath 2.0 languages define various Schema related instructions and expressions (e.g., element(*, typeName)), which cannot be simulated with XSLT 1.0 and extensions.

2) Since we cannot access the Schema annotated XPath 2.0 data model tree in an XSLT 1.0 stylesheet, we cannot refer to XML Schema type names in the stylesheet, which rules out the enhanced static typing features of XSLT 2.0.

Using a complete Schema aware XSLT 2.0 system allows very rich static typing in XSLT stylesheets out of the box.
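
To make the contrast concrete, here is a minimal sketch of the kind of thing a Schema aware XSLT 2.0 stylesheet can express. It assumes a schema-aware XSLT 2.0 processor and an input document that has been validated, so that its elements carry type annotations. The namespace http://example.com/po, the schema file po.xsd and the type po:AddressType are hypothetical names used only for illustration.

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:po="http://example.com/po"
                version="2.0">

  <xsl:output method="xml" indent="yes" />

  <!-- hypothetical schema; importing it makes its type names usable in the stylesheet -->
  <xsl:import-schema namespace="http://example.com/po" schema-location="po.xsd" />

  <xsl:template match="/">
    <address-elements>
      <!-- select every element whose type annotation is po:AddressType (or a type derived from it),
           whatever the element is called -->
      <xsl:for-each select="//element(*, po:AddressType)">
        <found name="{name()}" />
      </xsl:for-each>
    </address-elements>
  </xsl:template>

</xsl:stylesheet>

The selection is driven by the schema type rather than by element names, which is exactly the information an XSLT 1.0 stylesheet has no access to.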

Friday, August 22, 2008

Nice use case for xsl:analyze-string instruction

I thought that this was interesting to share.

Recently an XSLT user discussed a problem on xsl-list, which was solved by Jeni Tennison using XSLT 1.0 a long time ago.

I presented an XSLT 2.0 solution for the same problem. The 2.0 solution is a lot shorter than the 1.0 solution, and uses the XSLT 2.0 instruction xsl:analyze-string.

The thread is at http://www.biglist.com/lists/lists.mulberrytech.com/xsl-list/archives/200808/msg00383.html.

This problem could be a nice use case for the xsl:analyze-string instruction.
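
For readers who have not used the instruction, here is a minimal sketch of xsl:analyze-string. It is not the solution from the thread; the input string and the result element names (tokens, number, text) are made up for illustration, and an XSLT 2.0 processor is assumed.

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

  <xsl:output method="xml" indent="yes" />

  <xsl:template match="/">
    <tokens>
      <!-- split a string into runs of digits and everything in between -->
      <xsl:analyze-string select="'order 42 shipped on day 7'" regex="[0-9]+">
        <xsl:matching-substring>
          <number><xsl:value-of select="." /></number>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
          <text><xsl:value-of select="." /></text>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </tokens>
  </xsl:template>

</xsl:stylesheet>

In XSLT 1.0 the same split has to be written as a recursive named template over substring-before() and substring-after(), which is why the 2.0 version tends to be so much shorter.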

Thursday, August 14, 2008

Transforming tree structure from one format into another

An interesting question was asked on xsl-list.

The input XML file looks something like this:

<Objs>
<obj name="a" child="b"/>
<obj name="b" child="c"/>
<obj name="b" child="d"/>
<obj name="c" child="e"/>
</Objs>

Let's say that the tree described by the XML file has only one root node.

The output XML file would be:

<Obj name="a">
<Obj name="b">
<Obj name="c">
<Obj name="e"/>
</Obj>
<Obj name="d"/>
</Obj>
</Obj>


A tree structure is defined in the input XML by the 'name' and 'child' attributes. The output represents a true logical tree. We should be able to cater to an unlimited number of tree nodes.

We need to write an XSLT stylesheet for this.

At first thought, I imagined that this could be a tough problem. But a little bit of patience helped me to write the stylesheet for this. The solution is presented below.

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">

<xsl:output method="xml" indent="yes" />

<xsl:template match="Objs">
<!-- the root is the obj whose name never appears as a child -->
<xsl:variable name="start" select="obj[not(@name = ../obj/@child)]" />
<!-- start from every obj that shares the root name, so that a root with several non-leaf children is also handled -->
<xsl:call-template name="makeTree">
<xsl:with-param name="list" select="obj[@name = $start[1]/@name]" />
</xsl:call-template>
</xsl:template>

<xsl:template name="makeTree">
<xsl:param name="list" />

<Obj name="{$list[1]/@name}">
<xsl:for-each select="$list">
<xsl:variable name="child" select="@child" />
<xsl:choose>
<xsl:when test="not(../obj[@name = $child])">
<Obj name="{$child}" />
</xsl:when>
<xsl:otherwise>
<xsl:call-template name="makeTree">
<xsl:with-param name="list" select="../obj[@name = $child]" />
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each>
</Obj>
</xsl:template>

</xsl:stylesheet>

At first thought, I felt that XSLT 2.0 constructs would be required to solve this problem. But the problem can be solved completely with an XSLT 1.0 stylesheet.

My belief that XSLT is a wonderful language for processing XML data became stronger after solving this problem.
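
As a variation on the stylesheet above, the same tree can be built with xsl:key lookups instead of repeated ../obj scans. This is only a sketch: the key names byName and byChild and the template name node are my own choices, and it assumes the input has a single root, no cycles and no duplicate name/child pairs.

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:output method="xml" indent="yes" />

  <!-- byName finds the edges leaving a node, byChild finds the edge arriving at it -->
  <xsl:key name="byName" match="obj" use="@name" />
  <xsl:key name="byChild" match="obj" use="@child" />

  <xsl:template match="Objs">
    <!-- the root is the first name that never appears as a child -->
    <xsl:call-template name="node">
      <xsl:with-param name="name" select="obj[not(key('byChild', @name))][1]/@name" />
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="node">
    <xsl:param name="name" />
    <Obj name="{$name}">
      <!-- one recursive call per outgoing edge; a leaf has none and stays empty -->
      <xsl:for-each select="key('byName', $name)">
        <xsl:call-template name="node">
          <xsl:with-param name="name" select="@child" />
        </xsl:call-template>
      </xsl:for-each>
    </Obj>
  </xsl:template>

</xsl:stylesheet>

For the sample input this produces the same nesting (a containing b, b containing c and d, c containing e), and the leaf case needs no special handling because an empty key lookup simply yields an empty Obj element.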

Sunday, August 3, 2008

Multiple values for XSLT keys

An XSLT user asked the following question on xsl-list.

How do I use multiple key values?

Declaration:

<xsl:key name="keyname" match="subroot" use="ccc"/>

During the usage, I want to specify multiple values:

<xsl:variable name="keyname" select="key('keyname', '11' or '22')"/> ==> Here I want to use multiple values 11 and 22.

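The attempt above does not work, because '11' or '22' is a boolean expression; key() receives its string value, 'true', rather than the two values. xsl-list members suggested useful options: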

1. David Carlisle

<xsl:variable name="keyname" select="key('keyname', '22')|key('keyname', '11')"/>

2. Michael Kay

In XSLT 2.0, you can supply a sequence:

key('keyname', ('111', '222'))

In 1.0, you can supply a node-set with one value per node - but of course it's hard to set that up, you need the xx:node-set() function.

I worked upon Mike's idea for an XSLT 1.0 solution:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exslt="http://exslt.org/common"
exclude-result-prefixes="exslt"
version="1.0">

<xsl:output method="xml" indent="yes" />

<xsl:key name="x" match="subroot" use="ccc"/>

<xsl:variable name="x-values">
<v>11</v>
<v>22</v>
</xsl:variable>

<xsl:template match="/root">
<result>
<xsl:for-each select="key('x', exslt:node-set($x-values)/v)">
<value>
<xsl:value-of select="eee" />
</value>
</xsl:for-each>
</result>
</xsl:template>

</xsl:stylesheet>
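
To see the lookup in action, assume a hypothetical input document like the one below. The element names root, subroot, ccc and eee come from the stylesheet; the values are made up for illustration.

<root>
  <subroot><ccc>11</ccc><eee>first</eee></subroot>
  <subroot><ccc>22</ccc><eee>second</eee></subroot>
  <subroot><ccc>33</ccc><eee>third</eee></subroot>
</root>

With a processor that supports the exslt:node-set() function, the key is looked up once for each v element, and the subroot elements whose ccc is 11 or 22 are returned in document order, so the result should look like this (apart from the XML declaration):

<result>
  <value>first</value>
  <value>second</value>
</result>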

Mukul: I think this could be better than David's suggestion, because if we want to search the key with a large number of different values, we only have to change the following code fragment:

<xsl:variable name="x-values">
<v>11</v>
<v>22</v>
<!-- more values -->
</xsl:variable>

G. Ken Holman responded to my post and provided a brilliant idea:

The node-set extension can be avoided to achieve what you want.

<xsl:for-each select="key('x', exslt:node-set($x-values)/v)">

The above can be replaced with standard XSLT 1.0 to read the stylesheet file as a source node tree.

<xsl:for-each
select="key('x',document('')/*/xsl:variable[@name='x-values']/v)">

Ken further wrote:

I grant, though, that if your stylesheet is large then putting this into a small included or imported fragment would keep any overhead of building the tree small.

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">

<xsl:output method="xml" indent="yes" />

<xsl:key name="x" match="subroot" use="ccc"/>

<xsl:include href="ken2values.xsl"/>

<xsl:template match="/root">
<result>
<xsl:for-each select="key('x',$x-values)">
<value>
<xsl:value-of select="eee" />
</value>
</xsl:for-each>
</result>
</xsl:template>

</xsl:stylesheet>

ken2values.xsl
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">

<xsl:variable name="x-values-data">
<v>11</v>
<v>22</v>
</xsl:variable>

<xsl:variable name="x-values"
select="document('')/*/xsl:variable[@name='x-values-data']/v"/>

</xsl:stylesheet>

I think Ken's idea of having an included stylesheet (ken2values.xsl, above) is brilliant: document('') then only has to build a tree for that small file, which keeps memory use low, and we are able to avoid the node-set extension (as mentioned earlier).