Nikita Dolgov's technical blog: December 2014

Dec 10, 2014

Aeron

I expect Aeron will follow a trajectory similar to the one Disruptor had. Namely, fewer people will actually use it in production than will learn valuable lessons from its design principles. With the exception of a market feed processing system, I have actually seen the Disruptor used only in a low latency logger. But I still remember how surprising the emphasis on a single writer and all the nifty non-blocking tricks looked a few years ago.

So I went through the original Strange Loop presentation for a first taste of extreme messaging wisdom. Well, for better or worse, actually learning any new low-level details will take some source code digging. But from the first pass one technique in particular somehow reminded me of both Disruptor and Kafka thinking.

On the sender side, there are three buffers. A clean one (empty for the time being), a dirty one (completely filled, could be used to go a little back in sent message history) and the current one (partially filled, to be used for the next send request). There is a current offset pointer shared by all the sender threads. When a message is being sent, the sender thread first tries to advance that pointer with a CAS, retrying if necessary. Once the CAS succeeds, the claimed region belongs to that thread alone, so multiple senders can do most of the work, including actually writing a message to the output buffer, without synchronization.
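To make the claiming step concrete, here is a minimal sketch of how I picture it. It is not Aeron's actual code: `SharedTermBuffer`, `offer` and the `-1` "buffer is full" return value are names and conventions I made up, and only the current buffer is modeled (no rotation to the clean one).

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the "current" sender buffer; not Aeron's real API.
final class SharedTermBuffer {
    private final ByteBuffer buffer;
    // The offset pointer shared by all sender threads.
    private final AtomicInteger tail = new AtomicInteger(0);

    SharedTermBuffer(int capacity) {
        this.buffer = ByteBuffer.allocate(capacity);
    }

    // Returns the offset the message was written at, or -1 if it does not fit
    // (assumption: a real sender would switch to the clean buffer at that point).
    int offer(byte[] message) {
        int offset;
        do {
            offset = tail.get();
            if (offset + message.length > buffer.capacity()) {
                return -1;
            }
            // Retry if another sender moved the pointer in the meantime.
        } while (!tail.compareAndSet(offset, offset + message.length));

        // [offset, offset + length) now belongs to this thread only,
        // so the copy itself needs no locking. Absolute puts leave the
        // buffer's position untouched.
        for (int i = 0; i < message.length; i++) {
            buffer.put(offset + i, message[i]);
        }
        return offset;
    }

    int tail() {
        return tail.get();
    }

    byte byteAt(int index) {
        return buffer.get(index);
    }
}
```

The only contended operation is the CAS on `tail`; the byte copy happens after the region has been claimed. What the sketch leaves out is how a reader learns that a claimed region has been completely written, which presumably needs some per-message marker.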

On the receiver side, the setup is similar but there are two offset pointers. One for the high water mark (the highest message offset seen so far) and the other for the last message received in sequential order (i.e. without missing messages with lower offsets). The two pointers make it possible to asynchronously restore the original sequence for messages received out of order. In addition, another thread can periodically check the difference between the two offsets to detect a loss of messages in transit and so the need for re-transmission.
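The two-pointer bookkeeping can be sketched roughly like this. Again, this is my own illustration and not Aeron's code: `ReceiverWindow`, `onFrame` and `gap` are made-up names, frames are reduced to an (offset, length) pair, and I assume a single thread calls `onFrame` while any other thread may read the two pointers.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the receiver-side offsets; not Aeron's real API.
final class ReceiverWindow {
    // Frames that arrived ahead of a gap: offset -> length.
    private final TreeMap<Long, Integer> pending = new TreeMap<>();
    // Everything below this offset has been received without holes.
    private volatile long contiguous = 0;
    // End of the highest frame seen so far.
    private volatile long highWaterMark = 0;

    // Assumption: called from a single receiver thread.
    void onFrame(long offset, int length) {
        long end = offset + length;
        if (end > highWaterMark) {
            highWaterMark = end;
        }
        if (end <= contiguous) {
            return; // duplicate of data we already have in order
        }
        pending.put(offset, length);
        // Advance the contiguous pointer over every frame that now touches it.
        while (!pending.isEmpty() && pending.firstKey() <= contiguous) {
            Map.Entry<Long, Integer> frame = pending.pollFirstEntry();
            contiguous = Math.max(contiguous, frame.getKey() + frame.getValue());
        }
    }

    long contiguous() {
        return contiguous;
    }

    long highWaterMark() {
        return highWaterMark;
    }

    // A non-zero gap that persists between periodic checks suggests lost
    // messages and so a need for re-transmission.
    long gap() {
        return highWaterMark - contiguous;
    }
}
```

For example, after frames at offsets 0 and 20 (10 bytes each) the gap is 20; once the frame at offset 10 shows up, the contiguous pointer jumps to 30 and the gap closes.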

Dec 5, 2014

AsterixDB

The other day I noticed that the Hyracks research prototype had a continuation. A few years ago it was quite interesting, back in the day when the Dryad paper was in vogue but its source code was unavailable. The sql-on-hadoop segment was not as overcrowded then. Heck, the term itself did not exist if I remember correctly. I perused their recent papers and I am not quite sure what to think about them.

I liked the one on data storage because it nicely illustrates the usage of LSM tree indexes. Their streaming support looks somewhat unnatural in comparison with the way event-oriented systems are built and used Storm-style. The critique of mainstream Java memory wastefulness is nice but too basic. They don't even consider vectorization, which is such a hot topic in analytics. I'd rather see what the researchers think about using off-heap storage with the latest JDK.

I understand that academics avoid making real contributions to production systems. I am not convinced it's always justified though. Consider something like a DB cost-based optimizer. Judging from the literature it's a legit academic topic. There is only one open source optimizer project I know of, Apache Calcite. It's actually a very mature framework with a long history and major products using it. It is also very meta, which makes it extremely difficult to really grasp despite impressive javadocs. I have not found any design papers on it and I regret that deficiency very much. Anyway, the AsterixDB guys don't even mention Calcite/Optiq even though they admit that their own optimizer is rule-based. The way they describe their optimizer makes me think Algebricks wants to be Calcite when it grows up.

On a lighter note, we are running out of previously unused words. When I hear that AsterixDB is a BDMS I conjure up images of a venerable video genre. When I read about Algebricks I remember Algebird.