I expect Aeron will follow a trajectory similar to the Disruptor's: more people will learn valuable lessons from its design principles than will actually use it in production. Apart from one market feed processing system, the only place I have seen the Disruptor used is a low-latency logger. But I still remember how surprising its emphasis on a single writer and all the nifty non-blocking tricks looked a few years ago.
So I went through the original Strange Loop presentation for a first taste of extreme messaging wisdom. For better or worse, actually learning any new low-level details will take some source code digging. But on the first pass, one technique in particular reminded me of both Disruptor and Kafka thinking.
On the sender side, there are three buffers: a clean one (empty for the time being), a dirty one (completely filled, which could be used to go a little way back in the sent message history) and the current one (partially filled, to be used for the next send request). A current offset pointer is shared by all the sender threads. To send a message, a sender thread first tries to advance that pointer with a CAS, retrying if necessary. Once the CAS succeeds, the region between the old and the new offset belongs to that thread alone, so multiple senders can do most of the work, including actually writing the message to the output buffer, without locking.
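To make the claim step concrete, here is a minimal sketch of how I picture it. It is my own illustration, not Aeron's code: the class `ClaimingBuffer` and its methods are hypothetical names, it models only the current buffer, and it leaves out both the rotation to the clean buffer and whatever a real implementation does to signal that a claimed region has been fully written.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the "claim space with a CAS, then write without locks" idea.
// Only the current buffer is modelled; rotation to a clean buffer is left out.
final class ClaimingBuffer {
    private final byte[] buffer;
    // Offset of the first free byte, shared by all sender threads.
    private final AtomicInteger tail = new AtomicInteger(0);

    ClaimingBuffer(int capacity) {
        this.buffer = new byte[capacity];
    }

    // Reserves `length` bytes. Returns the start offset of the reserved region,
    // or -1 if the message does not fit into what is left of this buffer.
    int claim(int length) {
        while (true) {
            int current = tail.get();
            if (length > buffer.length - current) {
                return -1; // a fuller version would switch to the clean buffer here
            }
            // Only one thread can move the tail from `current`; the losers retry.
            if (tail.compareAndSet(current, current + length)) {
                return current;
            }
        }
    }

    // Claims a region and copies the message into it. The copy needs no lock
    // because no other thread can have claimed the same region.
    int write(byte[] message) {
        int offset = claim(message.length);
        if (offset >= 0) {
            System.arraycopy(message, 0, buffer, offset, message.length);
        }
        return offset;
    }

    int tail() {
        return tail.get();
    }

    public static void main(String[] args) throws InterruptedException {
        ClaimingBuffer buffer = new ClaimingBuffer(1024);
        byte[] payload = {1, 2, 3, 4};
        Runnable sender = () -> {
            for (int i = 0; i < 100; i++) {
                buffer.write(payload);
            }
        };
        Thread first = new Thread(sender);
        Thread second = new Thread(sender);
        first.start();
        second.start();
        first.join();
        second.join();
        // Every successful claim advanced the tail by exactly payload.length bytes.
        System.out.println("tail after both senders: " + buffer.tail());
    }
}
```

The only contended operation is the `compareAndSet` on the tail. The copy into the buffer, which is the expensive part, happens outside the retry loop.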
On the receiver side, the setup is similar but there are two offset pointers: one for the high water mark (the highest message offset seen so far) and one for the last message received in sequential order (i.e. with no messages at lower offsets still missing). Together the two pointers make it possible to restore the original sequence asynchronously when messages arrive out of order. In addition, another thread can periodically check the difference between the two offsets to detect that messages were lost in transit and need to be re-transmitted.
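The two-pointer bookkeeping can be sketched in a few lines. Again, this is my own illustration under stated assumptions rather than Aeron's implementation: `ReceiverPosition` and its methods are hypothetical names, both pointers are tracked as byte offsets just past the end of a message, a single receiver thread is assumed to call `onMessage`, and messages are assumed never to overlap partially.

```java
import java.util.TreeMap;

// Hypothetical sketch of the receiver's two offsets.
// Assumes a single receiver thread calls onMessage(); other threads only read.
final class ReceiverPosition {
    // Messages that arrived ahead of a gap: start offset -> length.
    private final TreeMap<Integer, Integer> pending = new TreeMap<>();
    // End offset of the furthest message seen so far.
    private volatile int highWaterMark = 0;
    // End offset of the contiguous prefix received so far.
    private volatile int completed = 0;

    void onMessage(int offset, int length) {
        int end = offset + length;
        if (end <= completed) {
            return; // duplicate of something already received in order
        }
        if (end > highWaterMark) {
            highWaterMark = end;
        }
        pending.put(offset, length);
        // Advance the contiguous prefix for as long as the next message is present.
        Integer next;
        while ((next = pending.remove(completed)) != null) {
            completed += next;
        }
    }

    int highWaterMark() {
        return highWaterMark;
    }

    int completed() {
        return completed;
    }

    // What a monitoring thread would poll: a value that stays positive for too
    // long suggests a lost message and the need to ask for re-transmission.
    int gapBytes() {
        return highWaterMark - completed;
    }
}
```

A message that arrives early only raises the high water mark. The in-order pointer catches up in one go once the missing message finally shows up, so the gap drops back to zero.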