Mar 19, 2008

Consistent Hashing

An interesting and conceptually simple approach to choosing what to put on a new cluster member or where to move the state of a failed one.

Mar 10, 2008

Interview-quality concurrent primitives

The exciting thing about interviews is that you hardly ever know how any of those will be conducted. Take concurrency as an example. It could be ignored, some people may ask you about plain vanilla wait/notify semantics, other about using primitives from the util.concurrent library, and then someone is likely to ask you about arguably more intellectual algorithmic side. In the latter case, a typical task is to implement something like a thread-safe bounded buffer or a read-write lock.

Well, I guess except for Messrs Lea and Goetz few would probably attempt to roll out a production quality CAS-based design having just a white board and half an hour. Which means that knowing simpler approaches makes sense at least in the context of interviews. Here where a simplistic book can help.

A half of it is dedicated to the RTSJ which seems to be more of a research toy. The other half provides a rather messy introduction to concurrency. I used to consider this book to be mostly useless (too thin theoretically, insufficiently util.concurrent-oriented for real-life development). But a few basic implementations of primitives such as semaphores and locks certainly can be used for interview purposes.

Mar 3, 2008

Distributed algorithms

Distributed algorithms are a fascinating but often overlooked area in which academic research meets real-life challenges. Well-known practical applications such as the JGroups framework tend to simplify (for performance reasons obviously, certain levels of consistency are prohibitively expensive in terms of network traffic or latency) the approaches suggested by academicians. On the other hand, virtually all such frameworks are firmly rooted in hard-core research (e.g. Bela Ban was influenced by research done in Cornell where Ken Birman seems to be the founding guru) and you won't go far on common sense and re-invented wheels.

In mainstream development layers upon layers of middleware protect programmers from the challenges of real engineering. Quite few are lucky enough to closely work with JGroups-level components and even fewer use NIO and MINA-like frameworks (BTW MINA's lead developer has just been hired by JBoss) to build infrastructure of their own. In my experience, even among CS graduates not everyone has a good grasp of this field. Once upon a time I struggled mightily to learn basics of reliable messaging and its internals. As a result, I have been keeping my eye on the literature but surprisingly enough for our age of grid computing there are only a few sources catering to practicing engineers. Most CS books tend to jump to the theoretical aspects and never come back to something more practical.

So far I can truly recommend only a couple of books paying attention to real code.
  • Concurrent and distributed computing in Java - a very good introductory book covering both essential concurrency issues and distributed algorithms. The author did a very good job of actually explaining many concepts and their simple Java implementations (the code is probably not perfect but it makes so much easier to grasp the ideas) without overwhelming the reader with gratuitous mathematical rigor. The book consists of chapters of roughly equal size which makes it possible to read in manageable chunks. Each chapter refers to selected original articles and so explains the history of the field, again very succinctly.
As a side-note, I found the JGroups source code rather messy. I frequently saw places where refactoring was long due. Consequently, I doubt that framework is a good educational instrument at least for low-level mechanics especially if you do not have a pretty good idea of what exactly to expect. My UML sketch of TCP/UDP stacks looks rather unwieldy but they say most highly useful pieces of software look puzzling inside.