Apr 20, 2017

Rules of three in software development

Most people agree that software development is a craft. As such, it accumulated heuristics and rules of thumb that are hard to prove in any remotely scientific way. But see enough code bases over a few years and certain patterns will certainly look plausible. I would like to ponder a couple of representative examples that struck a chord with me.

Make it work, make it work right, make it work fast

I believe this saying originated in the heydays of the OO. You could probably interpret it as "building a vertical slice of the system" in RUP or "YAGNI and iterations" in Agile. If you squint enough even MVP-centered thinking common in startups could be another reincarnation of the same principle.

In practical terms it boils down to:
  • find a small use-case, choose OSS libraries that can help implement it quickly, prepare project infrastructure (e.g. build system, code tree structure), write the code, produce something small but runnable and call it a prototype
  • once you acquire basic experience with new technologies and have your prototype to share with other people you are ready to have a meaningful technical discussion with them; if all goes well, extend the prototype into an MVP that could be deployed to production
  • once you have a basic version running in production, start working on a second major release that can improve scalability and stability

My last experience with this flow was with Spark and Parquet. In mid-2015 I prototyped a few basic ideas such as exporting data from our system as Parquet files and running Spark locally to work with Parquet files. I made a few observations about small details not discussed in the documentation. So when ideas for a new ETL pipeline were discussed I could show actual code examples closely related to our particular needs.

Once we could show some numbers from running the prototype in Spring 2016 other people were comfortable with making an architectural decision. By Fall 2016 we had an MVP that could do most of what the legacy code was capable of. While migration to the new pipeline started in production in late Fall, we could switch attention to the second major release which addressed stability issues and applied data storage optimizations.

OSS/3d party, low-level API, less generic in-house implementation 

Another evolution I see frequently goes like that:
  • Come up with big ideas for a new service
  • Quickly develop a prototype using an OSS product that implements some of the big ideas
  • In case of encouraging performance test results, finish the MVP and go to production
  • Once initial performance gets insufficient, start digging into the OSS code and documentation is search for obscure lower-level APIs; replace usage of simple and easy APIs with more efficient but cumbersome ones
  • Once you understand your system trade-offs and data access patterns consider having your own, less generic but more optimized for your circumstances replacement for the initially used OSS component  
A recent example would be solving the challenge of data retrieval for interactive analytics. Postgres was not fast enough anymore. The big idea was to use search engine-style approaches such as fast bitmap indices. The obvious OSS candidate was Lucene. We used a less popular but still high-level API with additional optimizations added later. 

Once we had it in production the urgency subsided and we had time to recognize some context-specific assumptions about our data we could make. The second major release dived much deeper into Lucene internals and resulted in a few customizations built to take advantage of what we found. At some point in the future the next step could be going the same way as Druid did - replacing Lucene altogether with an in-house implementation of the inverted index.

Three kinds of developers

By the virtue of being a human issue and not just a technical question this example is harder to discuss. As a matter of fact, there is a beautiful post that you should read even if you ignore the rest of that highly recommended blog. 

There is no question that both individual psychology and level of experience play a huge role in this. Unfortunately all I can imagine to be actionable about it is to try to be self-aware enough to see those patterns in yourself and your teammates. 

It might be also the case that if you are a settler you will not like A-round companies because of all the chaos, poor engineering, and one-man components. Conversely, a pioneer might feel strange after a B round when a larger team starts building real technology and team communications and documentation needs grow in importance.

Probably because of the time I spent in B-round startups I believe I have seen more pioneers and settlers than town planners. The latter are probably those mythical "enterprise developers" frequently mentioned on HackerNews.