Mar 3, 2008

Distributed algorithms

Distributed algorithms are a fascinating but often overlooked area in which academic research meets real-life challenges. Well-known practical applications such as the JGroups framework tend to simplify (for performance reasons obviously, certain levels of consistency are prohibitively expensive in terms of network traffic or latency) the approaches suggested by academicians. On the other hand, virtually all such frameworks are firmly rooted in hard-core research (e.g. Bela Ban was influenced by research done in Cornell where Ken Birman seems to be the founding guru) and you won't go far on common sense and re-invented wheels.

In mainstream development layers upon layers of middleware protect programmers from the challenges of real engineering. Quite few are lucky enough to closely work with JGroups-level components and even fewer use NIO and MINA-like frameworks (BTW MINA's lead developer has just been hired by JBoss) to build infrastructure of their own. In my experience, even among CS graduates not everyone has a good grasp of this field. Once upon a time I struggled mightily to learn basics of reliable messaging and its internals. As a result, I have been keeping my eye on the literature but surprisingly enough for our age of grid computing there are only a few sources catering to practicing engineers. Most CS books tend to jump to the theoretical aspects and never come back to something more practical.

So far I can truly recommend only a couple of books paying attention to real code.
  • Concurrent and distributed computing in Java - a very good introductory book covering both essential concurrency issues and distributed algorithms. The author did a very good job of actually explaining many concepts and their simple Java implementations (the code is probably not perfect but it makes so much easier to grasp the ideas) without overwhelming the reader with gratuitous mathematical rigor. The book consists of chapters of roughly equal size which makes it possible to read in manageable chunks. Each chapter refers to selected original articles and so explains the history of the field, again very succinctly.
As a side-note, I found the JGroups source code rather messy. I frequently saw places where refactoring was long due. Consequently, I doubt that framework is a good educational instrument at least for low-level mechanics especially if you do not have a pretty good idea of what exactly to expect. My UML sketch of TCP/UDP stacks looks rather unwieldy but they say most highly useful pieces of software look puzzling inside.

No comments: