How Bloomberg is Advancing C++ at Scale

John Lakos manages the Bloomberg Development Environment group, which offers a set of C++ software libraries, development tools, and methodology to well over a thousand Bloomberg developers. He is an authority on large-scale C++ software infrastructure, receiving recent acclaim for two publications by Pearson Education on methodology for industrial software development [Part 1, Part 2]. BDE and its libraries are open source and can be found on GitHub. In this conversation, Lakos discusses the importance of instilling process and discipline in all software development projects.

The conversation with John has been edited for length.

How are large development projects different from small ones?

Large projects differ in complexity and difficulty along multiple dimensions, which kick in at different magnitudes. For example, once software size crosses the threshold where frequently recompiling the entire system becomes infeasible, you need to take insulation techniques seriously.

There are three global techniques available in C++, two of which are architectural and one of which is not. The procedural interface, the first of the two architectural techniques, is very specific to C APIs; the second is the pure abstract interface, or protocol, which we use routinely throughout BDE and in system integration in general. The non-architectural technique uses a concrete class that holds a pointer to its implementation, an approach also called PIMPL, or “pointer to implementation”. All three, however, are totally insulating, meaning that with any of them you can insulate the entire implementation.
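As a rough illustration of the three techniques named above, the following single-file sketch shows a protocol, a PIMPL-style concrete class, and a procedural interface side by side. All names here (`PriceSource`, `FixedPriceSource`, `Portfolio`, `portfolio_create`, and so on) are hypothetical and are not taken from BDE; in a real code base the parts marked "header" and "implementation" would live in separate files.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// --- Technique: pure abstract interface ("protocol") ---
// Clients compile against this class only; it has no data and no logic.
class PriceSource {
  public:
    virtual ~PriceSource() = default;
    virtual double price(const std::string& ticker) const = 0;
};

// One possible implementation; clients never need to see this definition.
class FixedPriceSource : public PriceSource {
    double d_price;  // same price reported for every ticker

  public:
    explicit FixedPriceSource(double price) : d_price(price) {}
    double price(const std::string&) const override { return d_price; }
};

// --- Technique: concrete class with a pointer to implementation (PIMPL) ---
// "Header" part: the only data member is an opaque pointer.
class Portfolio {
    class Impl;                    // defined in the "implementation" part
    std::unique_ptr<Impl> d_impl;  // all private state lives behind this

  public:
    Portfolio();
    ~Portfolio();  // defined where 'Impl' is a complete type
    void add(const std::string& ticker, int shares);
    double value(const PriceSource& source) const;
};

// "Implementation" part: can change without touching the "header" part.
class Portfolio::Impl {
  public:
    std::map<std::string, int> d_positions;  // ticker -> share count
};

Portfolio::Portfolio() : d_impl(new Impl) {}
Portfolio::~Portfolio() = default;

void Portfolio::add(const std::string& ticker, int shares)
{
    d_impl->d_positions[ticker] += shares;
}

double Portfolio::value(const PriceSource& source) const
{
    double total = 0.0;
    for (const auto& entry : d_impl->d_positions) {
        total += entry.second * source.price(entry.first);
    }
    return total;
}

// --- Technique: procedural interface ---
// Free functions over an opaque pointer, in the style of a C API.
Portfolio* portfolio_create() { return new Portfolio; }
void portfolio_add(Portfolio* p, const char* ticker, int shares)
{
    p->add(ticker, shares);
}
void portfolio_destroy(Portfolio* p) { delete p; }
```

In each case, a client that sees only the declarations would not need to recompile when the hidden parts change, which is the property the text calls total insulation.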

How can small projects take advantage of large-scale methodologies?

As mentioned in my article, successful small projects often become large projects, and having to stop at some point and reorganize before continuing is not a productive use of a team’s time.

Besides, the component-based development methodology we employ naturally lends itself to projects of virtually all sizes. Although not everything that makes large-scale software development challenging is present in smaller projects, many aspects, such as the need for concise documentation and thorough unit testing, are, and they should be practiced uniformly, irrespective of size.

What changes once your program exceeds just a single file that holds ‘main’?

My goal is not to have a process change, but to instill a process and methodology that works regardless of scale. This is an early but important size threshold for software. Having all the source of a program in a single file makes sense only for the tiniest of projects. Once we have multiple physical pieces (we call them components), it immediately makes sense to talk about their physical dependencies, which we ensure are acyclic, and to have a separate, standalone unit test driver associated with each component.
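To make the acyclic-dependency requirement concrete, here is a minimal sketch of how a build tool might check it. It assumes the dependencies have already been collected into a map from each component name to the components it includes; the type `DependencyGraph` and the functions `hasCycleFrom` and `isAcyclic` are hypothetical names for illustration, not part of any BDE tool.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Maps each component to the components it depends on directly.
using DependencyGraph = std::map<std::string, std::vector<std::string>>;

// State of a component during the depth-first traversal.
enum class Mark { Visiting, Done };

// Returns true if a dependency cycle is reachable from 'component'.
bool hasCycleFrom(const std::string&            component,
                  const DependencyGraph&        graph,
                  std::map<std::string, Mark>&  marks)
{
    auto found = marks.find(component);
    if (found != marks.end()) {
        // Reaching a component that is still being visited means we have
        // followed the dependencies back to where we started.
        return found->second == Mark::Visiting;
    }

    marks[component] = Mark::Visiting;
    auto node = graph.find(component);
    if (node != graph.end()) {
        for (const std::string& dependency : node->second) {
            if (hasCycleFrom(dependency, graph, marks)) {
                return true;
            }
        }
    }
    marks[component] = Mark::Done;
    return false;
}

// Returns true if no component depends, directly or indirectly, on itself.
bool isAcyclic(const DependencyGraph& graph)
{
    std::map<std::string, Mark> marks;
    for (const auto& entry : graph) {
        if (hasCycleFrom(entry.first, graph, marks)) {
            return false;
        }
    }
    return true;
}
```

When the graph is acyclic, each component can be tested by its own driver using only components that sit below it, which is one reason the two practices are mentioned together.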

Why does a group of developers necessarily need to go from ad hoc to more process-oriented as it grows?

To put it bluntly, because “ad hoc” doesn’t scale. Techniques that may be adequate for small or even medium-sized projects are simply not sufficient to address issues – such as consistent packaging and unique naming – that arise as software systems grow to arbitrary size and serve an unbounded number of clients. As I said earlier, rather than changing processes mid-stream, it’s much better to have a single seamless process, as we do, that scales up (and down) to projects of any size.

What about junior developers; how long does it take them to learn the process?

Surprisingly, it often takes them less time than it takes a seasoned developer, because a seasoned developer has typically formed strong opinions – and it can, at times, be hard to undo what they think they know. A junior person can often just learn how to do it properly the first time, right off the bat, without complaints. In other words, un-training people can often be more expensive than training them.

How about when a team reaches hundreds or thousands of developers: how does your process handle that?

We address scale with hierarchy – both in terms of the people we train and manage, and also in terms of the software work product we ultimately produce. For example, every fundamental unit of software we write takes the form of a component, which has certain fundamental properties. Related components roll up into packages, and then related packages reside within a single Unit of Release, which we call a package group.
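One way to picture the component, package, and package group hierarchy is through names that encode it. The sketch below assumes a purely illustrative convention in which a component name is a package name, an underscore, and a local name, and in which the first three characters of the package name identify the package group. That prefix rule, the struct `ComponentLocation`, and the function `locateComponent` are assumptions made for this example and are not stated in the interview.

```cpp
#include <cassert>
#include <string>

// Where a component sits in the three-level hierarchy.
struct ComponentLocation {
    std::string packageGroup;  // unit of release, e.g. "xyz"
    std::string package;       // e.g. "xyza"
    std::string component;     // e.g. "xyza_date"
};

// Splits 'name' according to the assumed convention.  Returns false, and
// leaves 'result' untouched, if the name has no underscore, has nothing
// after the underscore, or has a package prefix shorter than three
// characters.
bool locateComponent(const std::string& name, ComponentLocation* result)
{
    const std::string::size_type underscore = name.find('_');
    if (underscore == std::string::npos || underscore < 3 ||
        underscore + 1 == name.size()) {
        return false;
    }
    result->component    = name;
    result->package      = name.substr(0, underscore);
    result->packageGroup = name.substr(0, 3);
    return true;
}
```

Under this assumed scheme, the hypothetical component `xyza_date` would belong to package `xyza` within package group `xyz`, so a developer can tell from the name alone where a piece of software lives.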

Herein lies the challenge: The larger a team gets, the more potential there is for inefficiency due to the inherent costs of wide-area communication, especially across distinct geographical locations, let alone widely disparate time zones. It is essential that we have a cohesive process that allows everyone to participate in a uniform way – especially from an integration perspective.