Saturday, February 28, 2009

Ptarithmetic: logic in computer science

A post on the comp.theory newsgroup referred to this paper by Giorgi Japaridze. I have read a bit of it, and the ideas presented there seem quite promising (at the very least, I found the goal of the study quite ambitious). People with an interest in logic in computer science might find it worth a look.

This paper introduces ptarithmetic (short for "polynomial time arithmetic") - a formal number theory similar to the well-known Peano arithmetic, but based on the more recent computability logic instead of classical logic.

The arithmetic primitives defined in this theory compute in polynomial time (as opposed to classical-logic-based arithmetic, where operations can take exponential or combinatorial time).

An ideal compiler architecture

I found the compiler architecture description at http://lambda.uta.edu/cse5317/notes/node5.html to be the most appealing of all the descriptions I have read up to now.

In particular, I like the following idea:

"Suppose that you want to build compilers for n programming languages (eg, FORTRAN, C, C++, Java, BASIC, etc) and you want these compilers to run on m different architectures (eg, MIPS, SPARC, Intel, alpha, etc). If you do that naively, you need to write n*m compilers, one for each language-architecture combination.

The holy grail of portability in compilers is to do the same thing by writing n + m programs only. How? You use a universal Intermediate Representation (IR) and you make the compiler a two-phase compiler. An IR is typically a tree-like data structure that captures the basic features of most computer architectures. The first phase of this compilation scheme, called the front-end, maps the source code into IR, and the second phase, called the back-end, maps IR into machine code. That way, for each programming language you want to compile, you write one front-end only, and for each computer architecture, you write one back-end. So, totally you have n + m components."


Though the above idea of compiler construction sounds ideal, the author adds the following caution:

"But the above ideal separation of compilation into two phases does not work very well for real programming languages and architectures. Ideally, you must encode all knowledge about the source programming language in the front end, you must handle all machine architecture features in the back end, and you must design your IRs in such a way that all language and machine features are captured properly."

Still, this is something interesting to think about: how good would it be if all major programming languages translated to the same intermediate representation?
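
Just to make the n + m idea concrete for myself, here is a rough sketch in Java. The names (IRNode, FrontEnd, BackEnd, TwoPhaseCompiler) are my own and not from the course notes: each source language contributes one front-end, each target architecture one back-end, and the two halves talk only through the shared IR.

import java.util.List;

// The shared IR: a simple tree-like structure, as the quoted notes describe.
class IRNode {
    final String operation;
    final List<IRNode> children;

    IRNode(String operation, List<IRNode> children) {
        this.operation = operation;
        this.children = children;
    }
}

// One implementation per source language (n front-ends in total).
interface FrontEnd {
    IRNode translateToIR(String sourceCode);
}

// One implementation per target architecture (m back-ends in total).
interface BackEnd {
    String emitMachineCode(IRNode ir);
}

// A two-phase compiler is just the composition of one front-end with one
// back-end, so n front-ends and m back-ends cover all n * m combinations.
class TwoPhaseCompiler {
    private final FrontEnd frontEnd;
    private final BackEnd backEnd;

    TwoPhaseCompiler(FrontEnd frontEnd, BackEnd backEnd) {
        this.frontEnd = frontEnd;
        this.backEnd = backEnd;
    }

    String compile(String sourceCode) {
        return backEnd.emitMachineCode(frontEnd.translateToIR(sourceCode));
    }
}

The whole difficulty, as the author's caution above suggests, lies in designing the IR vocabulary well enough that every language and every machine architecture can live with it.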

Saturday, February 21, 2009

Some thoughts about IPv6

I read an article at IANA about the current state of IPv6 deployment on the internet. I thought of sharing a few points from the IANA IPv6 report here, to help spread some awareness about this important issue. The article presents the following facts:

1. IPv4 (the current, traditional scheme of IP address allocation) uses 32-bit addresses, so its theoretical address space is 2^32, roughly 4.3 billion IP addresses. When this scheme was designed, as the modern internet came into existence, it was assumed that around 4 billion devices would be a comfortable upper limit on the number of devices connected to the internet.

But with the growing popularity of the internet across the world, the IPv4 scheme has proved insufficient for scaling the internet to new devices. As of October 2007, only about 17% of IPv4 addresses remained available for allocation, and the remaining pool is expected to be exhausted somewhere around 2010-11. After that, shifting to the IPv6 scheme will be inevitable for adding more IP devices to the internet.

Apart from computers needing to connect to the internet, many other kinds of devices are now being designed with IP capability (many home appliances, for example). To assign all of these new devices globally unique IP addresses, adoption of IPv6 is bound to happen.

2. During the transition period from IPv4 to IPv6, IPv4 and IPv6 hosts can coexist and interoperate. But interoperation between the two schemes will require an investment of time and money from hardware vendors and network service providers.
The IPv4 scheme is expected to remain in operation for several more years from now (my guess is 5-10 years).

3. IPv6 support in software does not seem to be a problem; major OS vendors already have IPv6 support built into their operating systems. The challenging part is IPv6 support from hardware and network equipment providers. I am sure that vendors who intend to be in the internet business for a long time will commit the time and money needed to bring about this vital expansion of the internet.

Interestingly, JDK 1.4 and above provide support for IPv6 addresses, so writing IPv6-enabled client applications is already possible with the Java JDK.
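
As a small illustration (using only the standard java.net classes), the snippet below parses an IPv6 literal and checks which address family it belongs to:

import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.UnknownHostException;

public class IPv6Check {
    public static void main(String[] args) throws UnknownHostException {
        // "::1" is the IPv6 loopback address; any IPv6 literal or an
        // IPv6-enabled host name could be used here instead.
        InetAddress address = InetAddress.getByName("::1");

        if (address instanceof Inet6Address) {
            System.out.println("IPv6 address: " + address.getHostAddress());
        } else {
            System.out.println("IPv4 address: " + address.getHostAddress());
        }
    }
}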

Google is also serious about IPv6 adoption: the Google search service is now available to IPv6-enabled devices. Please refer to http://googleblog.blogspot.com/2008/05/looking-towards-ipv6.html for more information about Google's efforts to promote the IPv6 scheme. http://ipv6.google.com/ is the URL for doing a Google search from IPv6 machines. If you try to reach this host from an IPv4-only machine (as I do, from the IP address 192.168.1.8), you'll get an "unknown host" error; the machine needs an IPv6 address to access this search service from Google.

Wednesday, February 11, 2009

DOM vs JAXB

The DOM API and JAXB (Java Architecture for XML Binding) look a bit similar to me for the task of mapping Java objects to XML documents (the concept popularly known as XML data binding).

JAXB essentially does the following [1] (a minimal sketch follows this list):

1. Marshalling of Java objects to XML.

2. Unmarshalling of the XML data to Java objects.
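
Here is a minimal sketch of both directions with the javax.xml.bind API (the PurchaseOrder class and its fields are hypothetical, chosen to match the Schema-type example further down in this post):

import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.annotation.XmlRootElement;

public class JaxbExample {

    // A hypothetical domain class; JAXB binds its public fields by default.
    @XmlRootElement
    public static class PurchaseOrder {
        public String item;
        public int quantity;
    }

    public static void main(String[] args) throws Exception {
        JAXBContext context = JAXBContext.newInstance(PurchaseOrder.class);

        // 1. Marshalling: Java object -> XML
        PurchaseOrder order = new PurchaseOrder();
        order.item = "Widget";
        order.quantity = 3;
        StringWriter xml = new StringWriter();
        context.createMarshaller().marshal(order, xml);
        System.out.println(xml);

        // 2. Unmarshalling: XML -> Java object
        PurchaseOrder copy = (PurchaseOrder) context.createUnmarshaller()
                .unmarshal(new StringReader(xml.toString()));
        System.out.println(copy.item + " x " + copy.quantity);
    }
}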

The DOM API also does the following [2] (again, a minimal sketch follows this list):

1. Transformation of a DOM tree to XML (known as XML serialization).

2. Transformation of XML documents to a DOM object tree.
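
And a corresponding minimal DOM sketch (the XML content is made up for illustration): it parses an XML string into a DOM tree, reads an element generically through org.w3c.dom.Element, and then serializes the tree back to XML using the standard JAXP identity Transformer.

import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class DomExample {
    public static void main(String[] args) throws Exception {
        String xml = "<purchaseOrder><item>Widget</item><quantity>3</quantity></purchaseOrder>";

        // XML -> DOM tree
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Every element is just an org.w3c.dom.Element; there is no PurchaseOrder class here.
        Element root = doc.getDocumentElement();
        String item = root.getElementsByTagName("item").item(0).getTextContent();
        System.out.println("item = " + item);

        // DOM tree -> XML (serialization via an identity transform)
        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(out));
        System.out.println(out);
    }
}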

At a broad level, the approaches [1] and [2] look similar.

But in my view, the following are the differences between the two approaches:

1. The JAXB transformation is driven by a Schema. The JAXB binding compiler generates Java type definitions for the Schema components. With JAXB, the object model (generated by the JAXB framework) maps closely to the concepts in the problem domain (e.g., there will be a Java class PurchaseOrder corresponding to a Schema type PurchaseOrder).

With DOM, on the other hand, the Schema does not drive the XML-to-Java mapping: all the elements in an XML document map to the DOM class org.w3c.dom.Element. The name of the element, the data inside it, and its other properties are accessible through the methods provided by the DOM API.

2. There are perhaps some performance differences between JAXB and DOM. Santiago Pericas-Geertsen has done a study on this, shared in these blog posts:

http://weblogs.java.net/blog/spericas/archive/2005/12/dom_vs_jaxb_per.html
http://weblogs.java.net/blog/spericas/archive/2005/12/dom_vs_jaxb_per_1.html

In my opinion, the DOM API can meet XML data binding needs under certain circumstances: namely, when a Schema does not drive the data binding process and only a raw mapping between XML and Java is required.