Saturday, May 30, 2009

Function parameters vs global variables

I ran into this problem recently, and thought of sharing my experience here.

There are occasions where I have to write an XSLT function and simply use it. For example (this is just an illustration; there could be more function parameters and a different return type):

<xsl:function name="fn:somefunction" as="xs:boolean">
<xsl:param name="pName" as="xs:string" />

<!-- use $pName and the variable, $someList (defined below) -->
<xsl:sequence select="something.." />
</xsl:function>


The evaluation of this function also depends on data other than the parameters passed to it. This external information, on which the function depends, could be a global variable. For example:

<xsl:variable name="someList" as="element()+">
<x>a</x>
<x>b</x>
..
..
</xsl:variable>


In my case, this data (the variable $someList) is fairly static, and it is needed for the evaluation of the above function.

I could see two options for how the function may use external data ($someList in this case):
1) Have the external data as a global variable (as illustrated above)
2) Supply this data as an additional function parameter

I was in a sort of dilemma recently, where I had to decide whether I should go for option 1) or 2).

In my case, I opted for option 1) i.e., the global variable.

I can think of a few pros and cons of the two options above:
1. Having a global variable: This is good if the external information is fairly static and is perhaps a big chunk of data. A global variable could also be useful if the data is shared between multiple functions.
2. Having a parameter for the data: This option looks good from the point of view of composability, and functional programming advocates like this idea. I think that in classical computer science theory, a function (a callable module) is an abstraction which takes some input and produces some output. The notion of a function accessing data which exists outside its body is, I think, a mechanism devised by specific programming languages, and is not as such defined by computer science theory. From this point of view, having a parameter for the data is a good option. In fact, I would also support this option as far as possible.
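To make the comparison concrete, here is a minimal XSLT 2.0 sketch of both options side by side. The function names (my:inGlobalList, my:inList), the my namespace URI, the parameter name pList and the membership test itself are assumptions made up for illustration; the real function body ("something..") is not shown in this post.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my-functions"
    exclude-result-prefixes="xs my">

  <xsl:output method="text"/>

  <!-- A small static list, in the same shape as $someList above -->
  <xsl:variable name="someList" as="element()+">
    <x>a</x>
    <x>b</x>
  </xsl:variable>

  <!-- Option 1 (hypothetical function): reads the global variable directly -->
  <xsl:function name="my:inGlobalList" as="xs:boolean">
    <xsl:param name="pName" as="xs:string"/>
    <!-- general comparison: true if $pName equals any item of $someList -->
    <xsl:sequence select="$pName = $someList"/>
  </xsl:function>

  <!-- Option 2 (hypothetical function): the list is an explicit parameter -->
  <xsl:function name="my:inList" as="xs:boolean">
    <xsl:param name="pName" as="xs:string"/>
    <xsl:param name="pList" as="element()*"/>
    <xsl:sequence select="$pName = $pList"/>
  </xsl:function>

  <!-- Runs against any input document and calls both variants -->
  <xsl:template match="/">
    <xsl:value-of select="my:inGlobalList('a'), my:inList('a', $someList)"/>
  </xsl:template>

</xsl:stylesheet>
```

With option 2 the caller decides which list is consulted, so the same function can be reused with different data; with option 1 the call sites stay shorter, but the dependency on $someList is hidden inside the function body.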

In my case, I was working with XSLT. But I guess these concepts would apply to many other programming languages as well.

This topic could turn into a discussion about how we should write good computer programs.

Any ideas are welcome.

Tuesday, May 26, 2009

PsychoPath XPath 2.0 processor update

We recently implemented quite a few built-in XSD numeric data types in the PsychoPath XPath 2.0 processor (ref, http://www.w3.org/TR/xmlschema-2/#built-in-datatypes). Now all (I mean, really all of the xs:decimal ones :)) the data types in the xs:decimal hierarchy are available in PsychoPath, and these should be available in Eclipse WTP 3.2 M1 (which should be released soon after the Eclipse Galileo release, around the end of June 2009).

Now almost all the major built-in Schema types are available in PsychoPath, except for a few subtypes of xs:string (like xs:normalizedString, xs:token etc.). These shouldn't be very difficult to add.

Dave Carver reported that the improvements we have made recently in PsychoPath have significantly improved its compliance with the W3C XPath 2.0 test suite.

The PsychoPath processor version has now been raised from 1.0 to 1.1.

Sunday, May 17, 2009

Xerces-J XSD 1.1 assertions and PsychoPath XPath 2.0 processor update

I recently contributed a few patches to the Eclipse PsychoPath XPath 2.0 engine, to support schema-aware XPath 2.0 expressions. These patches enhance the schema-aware support in the PsychoPath engine for element and attribute nodes, for the XML Schema primitive types.

These enhancements make it possible for the PsychoPath engine to evaluate XPath expressions like the following:

person/@dob eq xs:date('2006-12-10') (: if dob is an attribute, and of schema type xs:date :)

person/dob eq xs:date('2006-12-10') (: if dob is an element, and of schema type xs:date :)

@max ge @min (: this would work if 'max' and 'min' have, say, the schema type xs:int :)

The patch in my local environment already exhibits these improvements. As promised by Dave Carver (the PsychoPath engine project lead), users would likely get these improvements in Eclipse WTP (Web Tools Platform) 3.2.
2009-05-24: These changes are now committed to the Eclipse CVS server, and the improvements are flagged to be delivered in Eclipse WTP 3.2 M1, which should be out quite soon. Thanks to Dave Carver for testing all my patches, and committing them to the server.

The PsychoPath engine already has a framework (thanks to Andrea Bittau and his team) for supporting schema awareness (based on the Xerces-J XSD schema model). I just added small pieces of code to the attribute and element node implementations (particularly, improving the "typed value" of attribute and element nodes for built-in XSD schema types), to enhance the schema-aware support.

I think we are gradually moving towards more mature schema-aware support in PsychoPath.

I'm also using these new PsychoPath processor capabilities, to implement schema aware XPath 2.0 evaluations in Xerces-J XSD assertions support.

I'm currently working on constructing a typed XPath data model (XDM) instance for XSD 1.1 assertion evaluations. Having this capability would allow users to write XPath expressions like the following:

@max ge @min

or

person/@dob eq xs:date('2006-12-10')

In the absence of this (i.e., typed XDM nodes), users currently have to write explicit cast operations, like the following:

xs:int(@max) ge xs:int(@min)

or

xs:date(person/@dob) eq xs:date('2006-12-10')
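The reason for the casts is that, on untyped nodes, the XPath 2.0 value comparison operators treat xs:untypedAtomic operands as strings. A minimal XSLT 2.0 sketch of that behaviour follows; the range element and its min and max values are hypothetical data chosen for illustration, and the stylesheet assumes a processor that does not validate the data against a schema.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Hypothetical, unvalidated data: both attribute values are xs:untypedAtomic -->
  <xsl:variable name="range" as="element()">
    <range min="9" max="10"/>
  </xsl:variable>

  <!-- Runs against any input document -->
  <xsl:template match="/">
    <!-- Untyped operands of 'ge' are compared as strings: '10' ge '9' is false -->
    <xsl:value-of select="$range/@max ge $range/@min"/>
    <xsl:text>&#10;</xsl:text>
    <!-- With explicit casts the comparison is numeric: 10 ge 9 is true -->
    <xsl:value-of select="xs:int($range/@max) ge xs:int($range/@min)"/>
  </xsl:template>

</xsl:stylesheet>
```

With a typed XDM instance, where max and min carry the schema type xs:int, the first comparison would be numeric as well and the casts would no longer be needed.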

The XML Schema 1.1 assertions spec recommends a typed XDM instance.

We hope to provide this capability within Xerces-J, in line with the XML Schema 1.1 assertions specification.

2009-05-23: These improvements are now implemented, and I've submitted the code to the Apache Xerces-J JIRA server. I'm hoping we'll have these improvements committed to the Xerces-J SVN server some time soon.

Friday, May 8, 2009

Became Apache Xerces-J committer

As per the voting process for becoming an Apache project committer, the Apache Xerces-J team granted me the committer status for the Xerces-J project on May 5, 2009.

This gives me an opportunity to contribute to the Xerces-J codebase, in a more direct way.

It's indeed a privilege for me to be part of the core Xerces team. Starting as a Xerces user (a long time ago :)), the path to becoming a project committer has been a rewarding journey in numerous ways.

Sunday, May 3, 2009

Apache Xerces-J assertions implementation and PsychoPath XPath 2.0 processor

Some time back I wrote on this blog about the work I am doing on XML Schema 1.1 assertions support in Xerces-J. XML Schema 1.1 assertions processing requires an XPath 2.0 processor for performing Schema validation.

The Xerces-J team has opted to use the open source XPath 2.0 processor, PsychoPath. PsychoPath was developed by Andrea Bittau and his team. The PsychoPath team donated the PsychoPath code base to the Eclipse community, where it is now formally used in the Eclipse Web Tools Platform project. Future enhancements to PsychoPath are now taking place at Eclipse.

Since Xerces-J is using the PsychoPath XPath 2.0 engine, we wish PsychoPath to be ideally 100% compliant with the XPath 2.0 specification, so that Xerces-J users can use most of the facilities of the XPath 2.0 language while using XML Schema 1.1 assertions.

After looking at the PsychoPath source code and using it quite a bit, my personal observation is that PsychoPath has a pretty good XPath 2.0 implementation. Please refer to this documentation to learn more about PsychoPath and its current compliance status.

The Eclipse WTP team is working actively to resolve any remaining non-compliant items in PsychoPath. Incidentally, I have been working recently to help improve PsychoPath's compliance with the XPath 2.0 spec, and have contributed a few patches to Eclipse.

We are also planning to run the W3C XPath 2.0 test suite on PsychoPath, with the target that PsychoPath passes 100% of the test suite. This should give PsychoPath adopters more confidence while using it.