May 8, 2008

Terracotta at JavaOne

Today I went to see what was going on at JavaOne, which is held a couple of blocks from my office. The Pavilion was not much different from last year, so I ended up talking mostly to the Terracotta guys, and brainy Taylor in particular. I asked him various nasty questions about their approach, and he was kind enough to talk to me for at least half an hour. Right after that he talked to Brian Goetz himself.

My opening salvo, quite naturally, was about their messaging framework. I was essentially told that JGroups clusters of more than 4 servers were widely known to be problematic, that Apache Tribes was not an exact fit, and that bug-fixing was not quick enough. From what I know, the whole JBoss stack (at least in its JBoss AS 5 reincarnation) is supposed to depend on JGroups, so I am at a loss to reconcile such contradictory statements. Probably the key is the number of servers in a cluster.

Taken at face value, it means there is no reliable open source messaging framework that scales beyond a couple of servers. Given that pretty much anything which is not yet "in the cloud" is clustered nowadays, that sounds odd. It also drives home how truly challenging it is to bring reliable messaging to production quality.

I learned more about their positioning as well. They are after the middle market of, roughly speaking, up to 50 servers in domains such as web applications. That, I guess, implies that JBoss is more of a competitor than Coherence/Gigaspaces, which go after larger clusters in finance.

An active-active L2 server configuration is expected by the end of the year, although the common belief is that the 10 seconds currently required to switch to a backup server are tolerable. As I understood it, they are planning to send separate updates to both L2 servers instead of using multicast or replication between the two. I might have misconstrued something though.

We talked about their paradigm a little bit. I admit to being rather uncomfortable with it because they are the only company I know of that literally exploits the conceptual similarity of concurrent and distributed systems (i.e. CPUs sitting on the same bus differ from servers in a cluster only by communication delays, which are much more pronounced in the case of a LAN). It is so different from pretty much any other product (which would expose a real API in terms of actual interfaces, as in, say, JCache, as opposed to delimiting transactions with monitorenter/monitorexit pairs) that either they have invented the best thing since sliced bread or they are likely to fail as mavericks. They might as well be the next "the network is the computer" after all.
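To make the contrast concrete, here is a minimal sketch of the two styles in plain Java. All names in it (ParadigmSketch, SimpleCache, LocalCache, Counter) are made up for illustration; this is neither the JCache API nor anything from Terracotta. It only shows where the boundary is visible to the programmer: in the first style you call an explicit interface, in the second you write an ordinary synchronized block, which javac compiles to a monitorenter/monitorexit pair.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout; not Terracotta's or JCache's actual API.
class ParadigmSketch {

    // Style 1: an explicit cache-style API. The caller knows it is talking
    // to a store (possibly a distributed one) because it calls get/put on it.
    interface SimpleCache<K, V> {
        V get(K key);
        void put(K key, V value);
    }

    // A purely local stand-in implementation, just to keep the sketch runnable.
    static final class LocalCache<K, V> implements SimpleCache<K, V> {
        private final Map<K, V> store = new HashMap<>();

        @Override
        public synchronized V get(K key) {
            return store.get(key);
        }

        @Override
        public synchronized void put(K key, V value) {
            store.put(key, value);
        }
    }

    // Style 2: plain Java with no API at all. The synchronized block below
    // compiles to monitorenter ... monitorexit. In the paradigm discussed in
    // this post, that pair is what would delimit a transaction on a shared
    // object; here it is only an ordinary local monitor.
    static final class Counter {
        private int value;

        int increment() {
            synchronized (this) {
                value++;
                return value;
            }
        }
    }

    public static void main(String[] args) {
        SimpleCache<String, Integer> cache = new LocalCache<>();
        cache.put("hits", 1);
        System.out.println("cache style: " + cache.get("hits"));

        Counter counter = new Counter();
        counter.increment();
        System.out.println("monitor style: " + counter.increment());
    }
}
```

The point of the second style is that the code reads exactly like single-JVM concurrent code, which is both its appeal and the reason it makes me uneasy.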

The foundational paradigm of Terracotta as a distributed JVM (complete with a DGC) evokes the same kind of argument as the JVM itself did ten years ago. Back then the idea of Java performance comparable with C++ was ridiculous, although it was said at the very beginning that JIT-style dynamic optimizations would do the trick one day. It looks like the JVM guys have pulled off the trick after all, so this lesson may have significant implications for Terracotta.

As an example, Terracotta can detect that a particular instance is used exclusively by one L1 server and transfer lock ownership from the central L2 host to that L1 server, effectively avoiding distributed locking. As a result we have a sort of buddy replication (between L1 and L2). It is like a JVM silently eliminating synchronization in a sequential program. Theoretically neat :)
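Here is a toy, single-threaded simulation of that lock handoff idea as I understood it. Everything in it is my own assumption for illustration (the names LockLeaseSketch, Server, Client, acquireLease, the notion of a "lease", the round-trip counter); it is not Terracotta's algorithm or API, and it deliberately ignores real mutual exclusion and unlocking. It only counts how often a client would have to go to the central server.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model with hypothetical names; not Terracotta's actual implementation.
class LockLeaseSketch {

    // Stand-in for the central L2 server: remembers which client currently
    // owns the lease for each lock id.
    static final class Server {
        private final Map<String, Client> leaseHolder = new HashMap<>();
        int roundTrips = 0; // simulated network calls made to the server

        // Hands lock ownership to the caller, recalling it from the
        // previous holder if there was a different one.
        void acquireLease(String lockId, Client caller) {
            roundTrips++;
            Client previous = leaseHolder.put(lockId, caller);
            if (previous != null && previous != caller) {
                previous.revoke(lockId);
            }
        }
    }

    // Stand-in for an L1 node (an application JVM).
    static final class Client {
        private final Server server;
        private final Set<String> leases = new HashSet<>();

        Client(Server server) {
            this.server = server;
        }

        // Goes to the server only when the lease is not already held locally;
        // otherwise taking the lock is a purely local operation.
        void lock(String lockId) {
            if (!leases.contains(lockId)) {
                server.acquireLease(lockId, this);
                leases.add(lockId);
            }
        }

        void revoke(String lockId) {
            leases.remove(lockId);
        }
    }

    public static void main(String[] args) {
        Server l2 = new Server();
        Client a = new Client(l2);
        Client b = new Client(l2);

        // Only client a touches the lock: one server call, then all local.
        for (int i = 0; i < 1000; i++) {
            a.lock("order-42");
        }
        System.out.println("round trips, single user: " + l2.roundTrips);

        // A second client shows up: the lease is recalled and moves.
        b.lock("order-42");
        a.lock("order-42");
        System.out.println("round trips, contended: " + l2.roundTrips);
    }
}
```

As long as only one client uses the lock, the cost of distributed locking collapses to a single trip to the server; once a second client shows up, every change of ownership costs a trip again.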

One thing I can safely say is that, in contrast to many companies in this field, Terracotta is not afraid to share its source code. They do not pretend, like the Coherences of this world do, that someone can steal anything from the code base and ruin their empire (anybody heard of a new M$ after the Windows source code was leaked on the net?). As a developer I believe that code quality says a lot about the corresponding system (not to mention the things one can learn from a large successful system), and I applaud Terracotta for their bravery.