Mar 21, 2016

AWS SQS for reactive services

Last year I left the cozy world of self-managed systems for the now-predominant AWS platform. It's been a mixed blessing so far. On the positive side, devops are relieved of many of the daily troubles of running infrastructure components such as message brokers. On the negative side, developers are deprived of many conveniences provided by any JMS/AMQP-style product, such as topics and push delivery that needs no polling.

It's very common to use a message broker to implement an asynchronous API for your backend services, especially in a world where major RPC frameworks (I am looking at you, PBs/Avro/Thrift) have only partial support for fully asynchronous clients and servers. Not everyone is running Akka or Finagle in production. All you need is message broadcasting (i.e. topics) and an instance identifier, so that each instance can ignore responses sent to someone else.
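To make the instance-identifier idea concrete, here is a minimal sketch in plain Java with no broker involved. The names `Reply`, `ReplyRouter`, `register` and `onReply` are all made up for illustration, as is the assumption that every reply on the topic carries the requester's instance id and a correlation id.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** A reply as it might appear on a broadcast topic; the field names are illustrative. */
final class Reply {
    final String instanceId;
    final String correlationId;
    final String body;

    Reply(String instanceId, String correlationId, String body) {
        this.instanceId = instanceId;
        this.correlationId = correlationId;
        this.body = body;
    }
}

/** Matches broadcast replies to pending requests and ignores replies meant for other instances. */
class ReplyRouter {
    private final String instanceId = UUID.randomUUID().toString();
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    String instanceId() {
        return instanceId;
    }

    /** Registers a pending request; publishing the request itself is left to the caller. */
    CompletableFuture<String> register(String correlationId) {
        CompletableFuture<String> future = new CompletableFuture<>();
        pending.put(correlationId, future);
        return future;
    }

    /** Called for every reply seen on the topic. Returns true if this instance consumed it. */
    boolean onReply(Reply reply) {
        if (!instanceId.equals(reply.instanceId)) {
            return false; // addressed to another instance
        }
        CompletableFuture<String> future = pending.remove(reply.correlationId);
        if (future == null) {
            return false; // unknown or already completed request
        }
        future.complete(reply.body);
        return true;
    }
}
```

A caller would register a correlation id, publish the request together with `instanceId()`, and feed every message received from the topic into `onReply`. Timeouts and cleanup of abandoned requests are left out.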

By contrast, SQS is designed for essentially one topology: a number of worker nodes processing a shared queue. There is a gap between SQS and Kinesis which I still find to be rather painful. But for services with one-way requests only, SQS is indeed simple to use.

For people coming from the traditional message broker world there are a few noteworthy differences:
  • you need to poll a queue to receive messages; there is no broker to push messages to you
  • you can receive up to ten messages in one request
  • there is a visibility timeout associated with messages; messages received by one consumer but not deleted from the queue within the visibility timeout are returned to the queue (and become available to other consumers)
  • an absolute queue name and a queue URL are two different strings
  • you can list queues by queue name prefix
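
The receive, visibility timeout and delete behaviour above can be modelled with a toy in-memory queue. This is not an SQS client: `ToyQueue` and its methods are invented for illustration, it deletes by message id where SQS uses receipt handles, and it hands messages out in insertion order, which a standard SQS queue does not guarantee.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.LongSupplier;

/** Toy in-memory queue that mimics receive / visibility timeout / delete semantics. */
class ToyQueue {
    static final int MAX_BATCH = 10; // at most ten messages per receive request

    private final Map<String, String> bodies = new LinkedHashMap<>(); // id -> body
    private final Map<String, Long> invisibleUntil = new LinkedHashMap<>(); // id -> deadline (ms)
    private final long visibilityTimeoutMillis;
    private final LongSupplier clock; // injectable so the timeout can be exercised without sleeping
    private long nextId = 0;

    ToyQueue(long visibilityTimeoutMillis, LongSupplier clock) {
        this.visibilityTimeoutMillis = visibilityTimeoutMillis;
        this.clock = clock;
    }

    synchronized String send(String body) {
        String id = "m" + nextId++;
        bodies.put(id, body);
        return id;
    }

    /** Returns the ids of up to maxMessages visible messages and hides them for the timeout. */
    synchronized List<String> receive(int maxMessages) {
        int limit = Math.min(maxMessages, MAX_BATCH);
        long now = clock.getAsLong();
        List<String> batch = new ArrayList<>();
        for (String id : bodies.keySet()) {
            if (batch.size() == limit) {
                break;
            }
            Long deadline = invisibleUntil.get(id);
            if (deadline == null || deadline <= now) {
                invisibleUntil.put(id, now + visibilityTimeoutMillis);
                batch.add(id);
            }
        }
        return batch;
    }

    /** Deleting is the only way a message leaves the queue for good. */
    synchronized void delete(String id) {
        bodies.remove(id);
        invisibleUntil.remove(id);
    }
}
```

A consumer that receives a message and crashes before calling `delete` loses nothing: once the deadline passes, the next `receive` hands the same message to whoever asks.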

I have a handy example illustrating SQS message processing with the AWS SDK. It implements a typical message processing sequence with the following participants:

  • SqsQueuePoller is supposed to be called periodically to poll a queue
  • AsyncSqsClient is a half-sync/half-async-style wrapper around the AWS SDK client
  • Handler represents a service capable of processing multiple requests concurrently
  • MessageRepository keeps track of the messages being processed
  • VisibilityTimeoutTracker makes sure the visibility timeout never expires for the messages being processed
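
How these pieces might fit together can be sketched in plain Java. This is not the code of the example itself: the `QueueClient` and `Handler` interfaces, the `Msg` class and every method signature below are assumptions made for illustration, with `QueueClient` standing in for AsyncSqsClient and a simple map playing the role of MessageRepository.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Minimal message view; a real SQS message carries more than this. */
final class Msg {
    final String receiptHandle;
    final String body;

    Msg(String receiptHandle, String body) {
        this.receiptHandle = receiptHandle;
        this.body = body;
    }
}

/** Hypothetical async facade over the queue, standing in for AsyncSqsClient. */
interface QueueClient {
    CompletableFuture<List<Msg>> receive(int maxMessages);
    CompletableFuture<Void> delete(String receiptHandle);
    CompletableFuture<Void> extendVisibility(String receiptHandle, int timeoutSeconds);
}

/** Hypothetical handler contract: the future completes once the request is fully processed. */
interface Handler {
    CompletableFuture<Void> handle(Msg message);
}

class PollerSketch {
    private final QueueClient client;
    private final Handler handler;
    private final int visibilityTimeoutSeconds;
    // Plays the MessageRepository role: messages received but not yet deleted.
    private final Map<String, Msg> inFlight = new ConcurrentHashMap<>();

    PollerSketch(QueueClient client, Handler handler, int visibilityTimeoutSeconds) {
        this.client = client;
        this.handler = handler;
        this.visibilityTimeoutSeconds = visibilityTimeoutSeconds;
    }

    /** One polling step (the SqsQueuePoller role); meant to be invoked periodically. */
    CompletableFuture<Void> pollOnce() {
        return client.receive(10).thenCompose(messages -> {
            CompletableFuture<?>[] done = messages.stream()
                    .map(this::process)
                    .toArray(CompletableFuture[]::new);
            return CompletableFuture.allOf(done);
        });
    }

    private CompletableFuture<Void> process(Msg message) {
        inFlight.put(message.receiptHandle, message);
        return handler.handle(message)
                // Delete only after successful processing.
                .thenCompose(ignored -> client.delete(message.receiptHandle))
                // On failure the message is simply forgotten here; it becomes
                // visible again once its visibility timeout expires.
                .<Void>handle((ignored, error) -> {
                    inFlight.remove(message.receiptHandle);
                    return null;
                });
    }

    /** The VisibilityTimeoutTracker role: call periodically, well before the timeout elapses. */
    CompletableFuture<Void> extendInFlight() {
        CompletableFuture<?>[] done = inFlight.keySet().stream()
                .map(handle -> client.extendVisibility(handle, visibilityTimeoutSeconds))
                .toArray(CompletableFuture[]::new);
        return CompletableFuture.allOf(done);
    }

    int inFlightCount() {
        return inFlight.size();
    }
}
```

In a real service both `pollOnce` and `extendInFlight` would presumably be driven by a scheduler, with the extension period comfortably shorter than the visibility timeout.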