Sunday, July 27, 2008

XML Schema 1.1 assertions implementation in Xerces-J

One of the interesting enhancements happening recently in the Xerces code base is the XML Schema 1.1 implementation.

The Xerces-J team motivated me to implement some of the XML Schema 1.1 features in Xerces. I've started with the implementation of the XML Schema 1.1 facility called "assertions".

I've completed quite a bit of this work, and I'm hoping that the assertions support I'm writing will be available in Xerces-J in the near future.

2008-11-04: The Xerces team has approved my work so far on the assertions implementation and has committed my patch to the Xerces code base. In the coming weeks, I will be working on integrating XPath 2.0 processing for assertions.

Saturday, July 26, 2008

An elegant XSLT solution

We often see some nice posts on the xsl-list.

David Carlisle recently posted a very elegant XSLT solution to a question asked on xsl-list. It's archived here: http://www.biglist.com/lists/lists.mulberrytech.com/xsl-list/archives/200807/msg00574.html

I really liked the following expression, in David's solution:

<xsl:attribute name="level">
<xsl:value-of select="sum(.|preceding::sect[1]/@depth)"/>
</xsl:attribute>

I might have solved this problem differently, but not as elegantly as this. Nice thought, David!

Friday, July 18, 2008

XSLT 2.0 shines over 1.0

I was pondering XSLT 2.0's advantages over XSLT 1.0, and came up with a simple example that illustrates them.

Below are a 2.0 and a 1.0 stylesheet for finding the first n Fibonacci numbers (the analysis follows them):

XSLT 2.0

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:x="http://localhost"
version="2.0">

<xsl:output method="text" />

<xsl:param name="n" />

<xsl:template match="/">
<xsl:for-each select="1 to $n">
<xsl:value-of select="x:fibonacci(position())" /><xsl:text> </xsl:text>
</xsl:for-each>
</xsl:template>

<xsl:function name="x:fibonacci" as="xs:integer">
<xsl:param name="n" as="xs:integer" />

<xsl:sequence select="if (($n = 1) or ($n = 2)) then 1 else x:fibonacci($n - 1) + x:fibonacci($n - 2)" />
</xsl:function>

</xsl:stylesheet>

XSLT 1.0

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">

<xsl:output method="text" />

<xsl:param name="n" />

<xsl:template match="/">
<xsl:call-template name="iterateAndFib">
<xsl:with-param name="x" select="1" />
</xsl:call-template>
</xsl:template>

<xsl:template name="iterateAndFib">
<xsl:param name="x" />

<xsl:if test="$x &lt;= $n">
<xsl:call-template name="fibonacci">
<xsl:with-param name="n" select="$x" />
</xsl:call-template>
<xsl:text> </xsl:text>
<xsl:call-template name="iterateAndFib">
<xsl:with-param name="x" select="$x + 1" />
</xsl:call-template>
</xsl:if>
</xsl:template>

<xsl:template name="fibonacci">
<xsl:param name="n" />

<xsl:choose>
<xsl:when test="($n = 1) or ($n = 2)">
<xsl:value-of select="1" />
</xsl:when>
<xsl:otherwise>
<xsl:variable name="x">
<xsl:call-template name="fibonacci">
<xsl:with-param name="n" select="$n - 1" />
</xsl:call-template>
</xsl:variable>
<xsl:variable name="y">
<xsl:call-template name="fibonacci">
<xsl:with-param name="n" select="$n - 2" />
</xsl:call-template>
</xsl:variable>
<xsl:value-of select="$x + $y" />
</xsl:otherwise>
</xsl:choose>
</xsl:template>

</xsl:stylesheet>
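
To try the 1.0 stylesheet above from Java, the n parameter can be supplied through JAXP's Transformer.setParameter. Here is a minimal sketch; the class name RunStylesheet, the method name transform, and the dummy input document are my own choices for illustration. Since both stylesheets only match "/", any well-formed input document should do. The JDK's built-in processor handles XSLT 1.0 only, so the 2.0 stylesheet would need an XSLT 2.0 processor such as Saxon on the classpath.

```java
import java.io.File;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, not part of the original post.
class RunStylesheet {

    // Runs the given stylesheet with the top-level parameter "n" set,
    // and returns the transformation output as a string.
    static String transform(Source stylesheet, int n) throws TransformerException {
        Transformer transformer =
                TransformerFactory.newInstance().newTransformer(stylesheet);
        transformer.setParameter("n", n);

        // The stylesheets ignore the input content, so a dummy document is enough.
        Source input = new StreamSource(new StringReader("<dummy/>"));
        StringWriter out = new StringWriter();
        transformer.transform(input, new StreamResult(out));
        return out.toString();
    }

    // Usage: java RunStylesheet <stylesheet file> <n>
    public static void main(String[] args) throws Exception {
        Source stylesheet = new StreamSource(new File(args[0]));
        System.out.println(transform(stylesheet, Integer.parseInt(args[1])));
    }
}
```

Keep n small when experimenting: both stylesheets recompute each Fibonacci number from scratch, so the number of recursive calls grows quickly.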

Here is why I think XSLT 2.0 is better than 1.0:

1. A 2.0 stylesheet can be written in far fewer lines of code than a 1.0 stylesheet. In this case, the 2.0 stylesheet is 22 lines and the 1.0 stylesheet is 51 lines (as formatted above).

2. In a 2.0 stylesheet, there is no need for recursion just to iterate. xsl:for-each can iterate directly over a numeric range, such as 1 to $n.

3. In the 2.0 stylesheet, we can use the xsl:function construct to write shorter code whose logic is easier to follow. The recursive calls in xsl:function in this example are easy to understand.

In a 1.0 stylesheet, we need to write a named template to achieve recursive calls, which can get cumbersome if the logic is complex.

4. In XSLT 2.0, the data model's type system has many more data types than in 1.0: all built-in XML Schema types, as well as user-defined types, can be used in XSLT 2.0 stylesheets.

I have no doubt that XSLT 2.0 shines over XSLT 1.0.

I read Norman Walsh expressing the following thought in a blog post: "Every experience that I have with XSLT 2.0 increases my enthusiasm for it." I totally agree with Norm.

Saturday, July 12, 2008

Constructing StreamSource and StreamResult for JAXP transformation

Recently, I had a tough time converting file paths containing spaces into correct StreamSource and StreamResult objects for use by the JAXP transformer.

I figured out that the approach below is probably the best way to solve this issue.

String sourceSystemId = "file:///C:/... .xml"; (use this form only if you are sure that the URI string doesn't contain any illegal characters, such as spaces)

OR

String sourceSystemId = (new File(pathname)).toURI().toString(); (this will correctly escape characters that are illegal in URIs)

Then construct the StreamSource or StreamResult as follows (outputSystemId is built the same way as sourceSystemId):

StreamSource source = new StreamSource(sourceSystemId);

StreamResult result = new StreamResult(outputSystemId);
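
Putting the pieces above together, here is a minimal sketch of the File.toURI() approach. The class name SpacesInPaths, the helper names, and the temporary file names are my own, chosen only to exercise paths with spaces; the identity transformer stands in for whatever stylesheet you actually run.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class, not part of the original post.
class SpacesInPaths {

    // File.toURI() escapes characters that are illegal in URIs (a space becomes %20).
    static String toSystemId(File file) {
        return file.toURI().toString();
    }

    // Copies input to output using the identity transformer,
    // addressing both files by their system IDs.
    static void copy(File input, File output) throws TransformerException {
        StreamSource source = new StreamSource(toSystemId(input));
        StreamResult result = new StreamResult(toSystemId(output));
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.transform(source, result);
    }

    public static void main(String[] args) throws Exception {
        // Directory and file names deliberately contain spaces.
        Path dir = Files.createTempDirectory("jaxp demo ");
        File in = dir.resolve("in put.xml").toFile();
        File out = dir.resolve("out put.xml").toFile();
        Files.write(in.toPath(), "<doc>hello</doc>".getBytes(StandardCharsets.UTF_8));

        System.out.println(toSystemId(in));
        copy(in, out);
        System.out.println(new String(Files.readAllBytes(out.toPath()), StandardCharsets.UTF_8));
    }
}
```

The exact system ID printed depends on the platform and the temporary directory, but the spaces in it should appear as %20.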

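XPath 2.0 also has a useful function, encode-for-uri, which applies the %HH escaping convention to a URI part, escaping both disallowed characters and reserved characters such as "/" and ":". Its signature is encode-for-uri($uri-part as xs:string?) as xs:string. Because it escapes "/" and ":" too, it is meant for a single path segment or query value, not for a whole URI.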
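
For comparison, the same escaping rule can be sketched in plain Java. This is my own illustrative helper, not a JDK or JAXP API: it leaves only the RFC 3986 unreserved characters (letters, digits, "-", "_", "." and "~") alone and writes every other character as the %HH form of its UTF-8 bytes, which is my reading of how encode-for-uri is specified.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original post.
class EncodeForUri {

    // Characters that are never escaped (RFC 3986 "unreserved").
    private static final String UNRESERVED =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.~";

    static String encodeForUri(String uriPart) {
        StringBuilder sb = new StringBuilder();
        // Work on UTF-8 bytes so non-ASCII characters become multi-byte %HH sequences.
        for (byte b : uriPart.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            if (v < 0x80 && UNRESERVED.indexOf((char) v) >= 0) {
                sb.append((char) v);
            } else {
                sb.append('%').append(String.format("%02X", v));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeForUri("my file/name:1.xml"));
    }
}
```

Note that java.net.URLEncoder is not a drop-in replacement here, since it encodes a space as "+" instead of %20.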