DevOps KPIs: Distance between latest and oldest production versions

Recently I’ve been thinking a lot about which KPIs should be used in DevOps. Among them, maybe the most important is the number of different versions the team has to maintain at any one time. This particular indicator frequently comes to the top of my list.

To me, this indicator mainly resembles Supply Lines in the military. If a Supply Line becomes too long, it is hard to manage and hard to defend against attacks. And if a Supply Line is cut, that is a huge problem, which may sometimes lead to losing a war.

A famous example is Napoleon’s invasion of Russia in 1812. His army went deep into Russian territory, which over-extended its supply lines. Even though Napoleon managed to be victorious in pretty much every battle while on the offensive, he ultimately lost the war because those supply lines were too stretched out.

So how does this translate to DevOps?
My current view is that in DevOps, the Supply Line starts at the oldest version we need to maintain, which is usually the oldest version deployed to production. The Supply Line then stretches all the way to our latest and greatest version, the one that has just been introduced in a test / development environment.

The length of the Supply Line is measured in the number of versions between the oldest one we maintain and the latest and greatest version we have deployed somewhere. For example, if we have versions 1.0.0, 1.1.0, 1.2.0, 1.3.0, 1.3.4 all deployed somewhere and under maintenance, then the length of our Supply Line is 5 (5 versions that we maintain at this very moment).
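
The count described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the environment names are hypothetical, versions are assumed to be plain dotted numbers, and the helper names `parse_version` and `supply_line` are mine rather than part of any tool.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn '1.3.4' into (1, 3, 4) so versions sort numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def supply_line(deployed: dict[str, str]) -> tuple[int, str, str]:
    """Return (length, oldest, newest) for the versions currently deployed.

    `deployed` maps a (hypothetical) environment name to the version running
    there. Several environments on the same version count only once.
    Assumes at least one environment is present.
    """
    distinct = sorted(set(deployed.values()), key=parse_version)
    return len(distinct), distinct[0], distinct[-1]


# The example from the text: five versions deployed somewhere.
deployed = {
    "customer-a-prod": "1.0.0",
    "customer-b-prod": "1.1.0",
    "customer-c-prod": "1.2.0",
    "staging": "1.3.0",
    "dev": "1.3.4",
}

length, oldest, newest = supply_line(deployed)
print(f"Supply Line length: {length} (from {oldest} to {newest})")
```

For the five versions in the example, this reports a Supply Line of length 5, stretching from 1.0.0 to 1.3.4.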

Why do I compare the number of versions under maintenance to a Supply Line? Imagine an update, say a critical patch, that needs to be deployed across all systems. If our Supply Line spans 5 versions, that means 5 merges, 5 test cycles, and as many deployments as 5 multiplied by the number of environments per version (for example, if each version has test, staging and production environments, that is 5*3 = 15 deployments, and also 15 environments to maintain in total). So we move our little patch piece by piece from one version to the next, and from one environment to the next. In other words, we move something essential along our Supply Line, just like food and supplies are moved along Army Supply Lines.
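
The arithmetic for a single patch can be written down as a rough sketch. It assumes, as in the example, that every maintained version has the same number of environments; the names `PatchCost` and `patch_cost` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatchCost:
    merges: int
    test_cycles: int
    deployments: int


def patch_cost(version_count: int, environments_per_version: int) -> PatchCost:
    """Estimate the work needed to carry one patch along the Supply Line.

    Assumes one merge and one test cycle per maintained version, and one
    deployment per environment of each version.
    """
    return PatchCost(
        merges=version_count,
        test_cycles=version_count,
        deployments=version_count * environments_per_version,
    )


# 5 versions, each with test, staging and production environments.
cost = patch_cost(version_count=5, environments_per_version=3)
print(cost)
```

With 5 versions and 3 environments each, the result is 5 merges, 5 test cycles and 15 deployments, and every extra version adds another merge, another test cycle and 3 more deployments.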

If some of the versions we have to support are no longer compatible with each other, our Supply Line is broken: a patch can no longer simply be carried from one version to the next, and that multiplies the amount of work each patch requires.

Now, if we think a little deeper, different versions also have different sub-components, dependencies, references, you name it. So having long Supply Lines in DevOps essentially translates into a combinatorial explosion problem. Yes, I come from the AI / Big Data / NLP world, where we have to deal with this term a lot. But it is no better in DevOps, where many versions of different components are at stake and require maintenance.
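
To get a feel for how quickly this grows, here is a small sketch that counts the worst case: if each component can independently be at any of its versions still in use, the number of possible combinations is the product of the per-component counts. The component names and versions are invented for illustration, and real systems usually run only a subset of these combinations.

```python
from math import prod


def possible_combinations(versions_in_use: dict[str, set[str]]) -> int:
    """Upper bound on the distinct component-version combinations.

    `versions_in_use` maps a (hypothetical) component name to the set of its
    versions that are still deployed somewhere.
    """
    return prod(len(versions) for versions in versions_in_use.values())


components = {
    "api": {"1.0", "1.1", "1.3"},
    "database-schema": {"4", "5"},
    "client-library": {"2.0", "2.1", "2.2"},
}

print(possible_combinations(components))  # 3 * 2 * 3 = 18
```

Adding just one more version of a single component here raises the bound from 18 to 24, which is why a longer Supply Line hurts more than its length alone suggests.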

Going back to the KPI view, there is a certain theoretical threshold to the length of a DevOps Supply Line (it depends on project structure and complexity, quality of management and headcount). Once that threshold is hit, the whole operation becomes too stretched out and more or less unmaintainable.

This threshold is usually established by experiment, but there are also indirect indicators that we are approaching it, such as the time it takes the team to release a trivial update to production (start to finish), the number of branches into which the engineering team needs to merge a fix, and the opinions individual developers express during one-on-ones.

How do we remedy this? The first thing is to keep track of it. It is imperative to be mindful of the existing Supply Lines and their lengths. If they become too stretched out, focus on updating the oldest environments to something newer and stable, and try to prioritize such updates over delivering new features.

Remember that new features are essentially at the other end of the equation here. Focusing solely on new features will inevitably extend the Supply Line, and if it is close to its threshold, this will put the whole operation at risk. So there has to be a balance between releasing new features and maintaining a healthy Supply Line. Breaking this balance would essentially compromise both maintainability and the velocity of new features. That is why I believe it is very important to capture the length of Supply Lines as one of the KPIs and track it together with the other important indicators for the organization.
