Nov 30, 2014

Kafka design ideas


A new company behind Kafka development and a fascinating new high-throughput messaging framework that is making waves inspired me to look at the major ideas in Kafka's design. Its general strategy is to maximize throughput through non-blocking, sequential access combined with decentralized decision making.
  • A partition corresponds to a logical file, which is implemented as a sequence of approximately same-size physical files.
  • New messages are appended to the current physical file. 
  • Each message is addressed by its global offset in the logical file. 
  • A consumer reads from a partition file sequentially (pre-fetching chunks in the background). 
  • No application level cache. Kafka relies on the file system page cache. 
  • The Unix sendfile API (available in Java via FileChannel::transferTo()) is used to minimize data copying between application and kernel buffers.
  • A consumer is responsible for remembering how much it has consumed, which avoids broker-side bookkeeping.
  • A message is deleted by the broker after a configured timeout (typically a few days) for the same reason. 
  • A partition is consumed by a single consumer (from each consumer group) to avoid synchronization. 
  • Brokers, consumers and partition ownership are registered in ZooKeeper (ZK).
  • Broker and consumer membership changes trigger a rebalancing process. 
  • The rebalancing algorithm is decentralized and relies on typical ZK idioms. 
  • Consumers periodically update the ZK registry with the last consumed offset.
  • Kafka provides at-least-once delivery semantics. When a consumer crashes, the last offset it recorded in ZK may lag behind what it actually read, so the consumer that takes over will re-deliver the events from that window.
  • The event-consuming application itself is expected to apply deduplication logic if necessary.
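
The partition layout described in the first bullets (a logical file made of same-size physical files, with messages addressed by global offset) can be sketched roughly as follows. This is a minimal in-memory model, not Kafka's code: the class name `SegmentedLog`, the byte-offset addressing, and the rule that a message never spans two segments are all assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory model of a partition split into size-bounded segments. */
public class SegmentedLog {
    private final int maxSegmentBytes;
    // Base (global) offset of each segment -> segment contents, kept sorted.
    private final TreeMap<Long, ByteArrayOutputStream> segments = new TreeMap<>();
    private long nextOffset = 0;

    public SegmentedLog(int maxSegmentBytes) {
        this.maxSegmentBytes = maxSegmentBytes;
        segments.put(0L, new ByteArrayOutputStream());
    }

    /** Appends to the current segment (rolling to a new one when full) and returns the global offset. */
    public long append(byte[] message) {
        ByteArrayOutputStream active = segments.lastEntry().getValue();
        if (active.size() > 0 && active.size() + message.length > maxSegmentBytes) {
            active = new ByteArrayOutputStream();
            segments.put(nextOffset, active); // the new segment starts at the current end of the log
        }
        long offset = nextOffset;
        active.write(message, 0, message.length);
        nextOffset += message.length;
        return offset;
    }

    /** Reads `length` bytes starting at a global offset by locating the owning segment. */
    public byte[] read(long offset, int length) {
        Map.Entry<Long, ByteArrayOutputStream> entry = segments.floorEntry(offset);
        if (entry == null || offset >= nextOffset) {
            throw new IllegalArgumentException("offset out of range: " + offset);
        }
        int local = (int) (offset - entry.getKey()); // position inside the segment
        byte[] data = entry.getValue().toByteArray();
        if (local + length > data.length) {
            throw new IllegalArgumentException("read crosses segment end");
        }
        return Arrays.copyOfRange(data, local, local + length);
    }

    public int segmentCount() {
        return segments.size();
    }
}
```

The sorted map keyed by base offset is what makes global addressing cheap here: finding the right physical file is a single `floorEntry` lookup, and the rest is a subtraction.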
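
For the FileChannel::transferTo() bullet, a minimal sketch of sending a byte range of a file to a channel might look like this. The class and method names (`ZeroCopySend`, `send`) are hypothetical; only `FileChannel.transferTo(position, count, target)` is the real JDK call. Whether the transfer actually bypasses user-space buffers depends on the platform and on the target channel type (a socket channel is the typical case).

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper that streams part of a log file to a channel via transferTo. */
public class ZeroCopySend {
    /** Sends up to `count` bytes starting at `position`; returns the number of bytes transferred. */
    public static long send(Path file, long position, long count, WritableByteChannel target)
            throws IOException {
        try (FileChannel source = FileChannel.open(file, StandardOpenOption.READ)) {
            long sent = 0;
            // transferTo may move fewer bytes than requested, so loop until done.
            while (sent < count) {
                long n = source.transferTo(position + sent, count - sent, target);
                if (n <= 0) {
                    break; // end of file, or the target cannot accept more right now
                }
                sent += n;
            }
            return sent;
        }
    }
}
```

In a broker-like setting the `target` would presumably be a `SocketChannel` for the consumer connection, and `position` the offset the consumer asked for.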
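
The bullets on rebalancing say only that the algorithm is decentralized. One common way to get that property, shown here purely as an assumption and not as Kafka's exact algorithm, is for every consumer to apply the same deterministic function to the same membership view read from ZK. `RangeAssignment` and `ownedPartitions` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical range-style assignment that each consumer can compute on its own. */
public class RangeAssignment {
    /** Returns the partition ids owned by `self`, given the group members and partition count. */
    public static List<Integer> ownedPartitions(List<String> consumers, int partitionCount, String self) {
        List<String> sorted = new ArrayList<>(consumers);
        Collections.sort(sorted); // every consumer sorts the same way, so all agree on the order
        int index = sorted.indexOf(self);
        if (index < 0) {
            throw new IllegalArgumentException("unknown consumer: " + self);
        }
        int base = partitionCount / sorted.size();
        int extra = partitionCount % sorted.size();
        // The first `extra` consumers each take one additional partition.
        int start = index * base + Math.min(index, extra);
        int count = base + (index < extra ? 1 : 0);
        List<Integer> owned = new ArrayList<>();
        for (int p = start; p < start + count; p++) {
            owned.add(p);
        }
        return owned;
    }
}
```

Because the result depends only on the sorted member list and the partition count, no consumer has to talk to any other: each partition ends up with exactly one owner within the group as long as all members see the same registry contents.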
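
The at-least-once behaviour in the last bullets can be illustrated with a small simulation. Everything here is hypothetical scaffolding: `OffsetStore` stands in for the offset kept in ZK, `DedupSink` for the application, and the sketch assumes the sink's "highest offset seen" survives the consumer crash (for example because it is stored together with the application's output).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical simulation of periodic offset commits, a crash, and application-side dedup. */
public class AtLeastOnceDemo {
    /** Stand-in for the last consumed offset recorded in ZK. */
    public static class OffsetStore {
        public long committed = 0;
    }

    /** Application-side sink that ignores offsets it has already applied. */
    public static class DedupSink {
        private long highestSeen = -1;
        public final List<String> applied = new ArrayList<>();
        public int duplicatesDropped = 0;

        public void handle(long offset, String event) {
            if (offset <= highestSeen) {
                duplicatesDropped++; // re-delivered after a takeover
                return;
            }
            highestSeen = offset;
            applied.add(event);
        }
    }

    /**
     * Consumes from the committed offset, committing every `commitEvery` events.
     * Stops without committing when it reaches `crashAt` (pass -1 to never crash).
     */
    public static void consume(List<String> partition, OffsetStore store, DedupSink sink,
                               int commitEvery, long crashAt) {
        int sinceCommit = 0;
        for (long offset = store.committed; offset < partition.size(); offset++) {
            if (offset == crashAt) {
                return; // simulated crash: progress since the last commit is not recorded
            }
            sink.handle(offset, partition.get((int) offset));
            if (++sinceCommit == commitEvery) {
                store.committed = offset + 1;
                sinceCommit = 0;
            }
        }
        store.committed = partition.size();
    }
}
```

Running one consumer until it crashes and then a second one from the same `OffsetStore` re-delivers the events between the last commit and the crash point; the sink drops them because offsets within a partition only grow.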