Lego World – No One DevOps Solution Fits All

DevOps Software Lego World
Photo by James Pond on Unsplash

I had been entertaining the idea of Lego-style DevOps software for quite a while. I wanted to write about it pretty much right after the success of my previous post about microservices, but then the whole situation with the virus started to unfold – which led me to a bit of paralysis. Today I'm trying to break out of that paralysis and write this down as a sort of return to normalcy 😉

Let me start with software in general and then move to DevOps specifically. This topic is really general, but as I'm building a startup in the DevOps field, I chose DevOps as being "close to home".

The key narrative here is that since the start of the computer era it has progressively become easier to install a piece of software. For end users, app stores were the culmination of this ease – you click the install button in the store, and voila – you now have the software up and running. How cool is that!

But there is always a gap between end users and businesses in how we deal with IT and software.

For businesses – small, medium, enterprise – we still have the old IT Ops narrative that things are complex and therefore must be hard. Fortunately, even with all that resistance, things are getting easier on different fronts with tools like Apt, Ruby Gems, npm, pre-built AMIs from vendors, and most recently – you guessed it – containers, compose files and helm charts. I'm absolutely positive the puck doesn't stop where we are now – many things still have a lot of room to grow, and API integration is still a major pain point.

However, the trend is clearly to install the software you need when you need it, quickly and easily. Businesses are made of people – and those people want the ease of an app store to suit their needs just like everyone else. Who wants to be stuck with 10-year-old tooling just because it cost a fortune back then? Yet questions like this one on StackExchange make me doubt whether IT departments actually get it.

So, as a business professional, I should have access to the tooling that suits my job best as fast as possible (ideally – immediately). Does this sound right to you too?

Now, let's transition this thinking to the DevOps world. If I like EC2 on AWS but I'm positive that Azure Container Registry is a better product than ECR, I shouldn't have an issue mixing and matching the two. If I want to use GitHub for code storage but CircleCI for integration – that's perfectly fine. These are examples of tools that are relatively easy to mix and match.

However, there is also an opposite push in the market – namely, to lock people into a DevOps vertical and thus create a moat around all your products.

What is a DevOps vertical? It means you simply have your VCS Repository, your CI, your artifact storage, your CD, your infrastructure management and your actual cloud – all from the same vendor! How does that sound?

This certainly sounds great from the perspective of that vendor. But as a user, I don't want to be forced into ECR just because I'm using ECS – if there is a better product on the market. Luckily, the ECR / ECS pair works fine here – I can mix and match and replace ECR with, say, Docker Hub easily.

But other vertical solutions actually try to lock users into pairs of products that cannot simply be mixed and matched with alternatives. Even though alternatives exist, they are not connected or not compatible for subtle reasons. For example, try to assign your 2nd-level domain name to an AWS Load Balancer without using Route 53 for DNS. Possible, but very difficult (because AWS only gives you a CNAME for the load balancer and not actual IP addresses, and you can't add a CNAME record to a 2nd-level domain).

Now, if I were to switch to a lock-in solution at this moment, what would happen in the future? Does it mean I'm fully at the mercy of the vendor for the time being? Can they raise prices any time they want, like Google just did with GKE? Plus, as their vertical grows and they add more components to it, can it get worse indefinitely? Does it remind you of the "old" Microsoft?

My answer to all that is to always plan for a switch to another product. When choosing a product, consider ease of transition in and out of it as one of the key priorities. Stay away from those that force you onto the same vertical – not only because of today's considerations but also because of the future and the vendor's philosophy. It's a Lego world. A product's moat should be its ability to inter-operate with other software, not a lock placed on its users making them unable to get out.

Simple Card Shuffler For Mafia Game

Due to self-isolation we switched from playing the weekly mafia (werewolf) game offline to online (via Zoom). But we needed card shuffle mechanics, so I wrote this one over the weekend.

Source code on GitHub:
Back-end –
UI –

Note that this assumes classic rules, namely only 4 roles: villager, mafia, godfather, sheriff. Sample rules can be found here.

Get sha256 hash on a directory

Update (2020-03-06): Following this conversation on Reddit, with an issue raised by u/atoponce, I updated the result to include file renames and moves and added the LC_ALL=C section.

Today I started building a new use case for Reliza Hub where we match the file system digest of a deployed directory to what we have in our metadata. We do such matching via sha256 hashes.

Previously we were mostly covering Docker images or archive files, where digest extraction was trivial. But this time around it's a file system, and the sha256sum utility in Linux does not have a built-in option to compute a digest on a directory.

I first encountered this problem some time ago when we were building the Reliza Hub Playground and the corresponding monorepo sample repository. The use case was to integrate this command into a GitHub Actions CI script so it would create releases of sub-projects in the monorepo only if those projects actually changed.

To do so, at CI run the script would call Reliza Hub to check if this sha256 was already registered, and only if it was not would we create a new release. So to get a sha256 on a directory back then, I just did a quick DuckDuckGo search, which brought me to this superuser post and this askubuntu post. Switching to sha256sum from md5sum and sha1sum brought me to:

find /path/to/dir/ -type f -exec sha256sum {} \; | sha256sum

And this is what I initially used for the Reliza Hub Playground Helper project. It worked perfectly.

However, when I started step 2 of the same workflow – where we promote the same file system to the instance and need to match it from the instance to the sha256 recorded on the Reliza Hub side – I realized with disappointment that the sha256 digest suddenly didn't match when I executed the command above on the target instance. In other words, in the CI build and on the target instance I got 2 different sha256 values for the same git code base.

Why? After quick debugging I realized that the first find command included file paths, and those were unsurprisingly different. To deal with that I used awk to keep only the digests of the files, then called sha256sum again as follows:

find /path/to/dir/ -type f -exec sha256sum {} \; | awk '{print $1}' | sha256sum

Looks good so far – but it still did not match! Another round of debugging – and I realized that on different machines the find command would return files in a different order.

My next attempt, based on this superuser post, was to sort by date. But it quickly turned out that since we were using git clone frequently, dates on files were not matching either, so the sort order was not universal.

The next idea I came up with was to change the sort to use file names and dictionary order instead of dates. Surprisingly, that was also inconsistent across different Linux boxes (slightly, but enough to get different digests). After further research, I discovered that LC_ALL=C comes to the rescue here.

So in the end after couple of hours overall, I came up with the following solution:

  1. Find all files in the directory and subdirectories using find with -type f
  2. Execute sha256sum on these files to get digests
  3. Use awk to only take digests from the previous command
  4. Sort those digests in the alphabetic order
  5. Only now compute final sha256sum on sorted digests

This worked and finally provided me with a universal way to compute a sha256 hash on a directory across different platforms. Here is the same thing in code:

find /path/to/dir/ -type f -exec sha256sum {} \; | awk '{print $1}' | sort -d | sha256sum | cut -d ' ' -f 1

Happy, I posted this on r/bash, but as mentioned above, u/atoponce luckily pointed out that this solution would ignore file renames or moves within the repository. He suggested this great solution:

dir=<mydir>; (find "$dir" -type f -exec sha256sum {} +; find "$dir" -type d) | LC_ALL=C sort | sha256sum

That is great, but we still have an issue of absolute versus relative paths, with digests computed differently depending on them. E.g., dir=/home/myuser/path/to and cd /home/myuser && dir=path/to produce different sha256 hashes. To solve this I decided to use sed with a regex, following this Stack Overflow post. And the final-final solution I have at the moment is:

dir=<mydir>; find "$dir" -type f -exec sha256sum {} \; | sed "s~$dir~~g" | LC_ALL=C sort -d | sha256sum
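For convenience, the final command can be wrapped in a small reusable function. A minimal sketch – dirsha256 is just a hypothetical name, and the trailing cut (borrowed from the earlier variant) trims the output to the bare digest:

```shell
# Sketch: wrap the final solution in a function so the directory prefix
# is stripped consistently before hashing.
dirsha256() {
  dir="$1"
  find "$dir" -type f -exec sha256sum {} \; \
    | sed "s~$dir~~g" \
    | LC_ALL=C sort -d \
    | sha256sum \
    | cut -d ' ' -f 1
}
```

Because the relative paths stay in the sorted input, renames and moves still change the digest, while the absolute location of the directory does not.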

That’s it! I also published details about the actual use case on medium.

Microservices – Combinatorial Explosion of Versions

Combinatorial Explosion of Versions of Microservices
Combinatorial Explosion of Component Versions

As the IT world transitions to microservices and tools like Kubernetes are roaring, there is one lingering issue that is slowly coming on full force. That is the combinatorial explosion of versions of various microservices. The community expectation is that this is potentially much better than the dependency hell of the previous era. But nonetheless, versioning of products built on microservices is a pretty hard problem. To prove the point, articles like "Give Me Back My Monolith" immediately come to mind.

If you're wondering what this is all about, let me explain. Suppose your product consists of 10 microservices. Now suppose each of those microservices gets 1 new version. Just 1 version – we can all agree that sounds pretty trivial and insignificant. Now look back at our product. With just 1 new version of each component we now have 2^10 – that is 1024 – permutations of how we can compose our product.

If this is not entirely clear, let me explain the math. We have 10 microservices, each with one update. So we have 2 possible versions of each microservice (either the old one or the updated one). Now, for each component we can use either of those 2 versions. That is equivalent to a binary number with 10 places. For example, let's say 1's are new versions and 0's are old versions, so one possible permutation would be 1001000000, with the 1st and 4th components updated and all others not. From math we know that a binary number with 10 places has 2^10 or 1024 variations. That is exactly the number we are dealing with here.
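The arithmetic above is easy to sanity-check right in a shell – v possible versions for each of n components gives v^n compositions:

```shell
# v versions for each of n microservices -> v^n possible product compositions
n=10; v=2
echo $(( v ** n ))   # 1024
```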

Now to continue this line of thinking – what happens if we have 100 microservices and 10 possible versions of each? The whole thing gets pretty ugly – it's now 10^100 permutations – an enormous number. To me, it's good to state it like this, because now we're not hiding behind words like "kubernetes" but facing this hard problem head on.

Why am I so captivated by this problem? Partly because, coming from the NLP / AI world, we were actively talking about the problem of combinatorial explosion in that field maybe 5-6 years ago. Just instead of versions we would have different words, and instead of products we would have sentences and paragraphs. Now, while the NLP and AI problem remains largely unsolved, the fact of the matter is that substantial progress has been made recently (to me the progress could be faster if people were a little less obsessed with machine learning and a little more open to other techniques – but that would be off-topic).

Back to the DevOps world of containers and microservices. We have this enormous elephant of a problem in the room, and frequently what I hear is – just take kubernetes and helm and it'll be fine. Guess what, it won't be fine on its own. What's more, a closed-form solution to such a problem is not feasible. Like in NLP, we should first approach this problem by limiting the search space – that is, by pruning outdated permutations.

One thing that helps is something I mentioned last year in this blog – the need to keep a minimal span of versions in production. It is also important to note that a good CI/CD process helps a lot in pruning variations. However, the current state of CI/CD is not enough without proper accounting, tracking and tooling to handle the actual permutations of components.

What we need are larger-scale integration-stage experiments where we could establish a risk factor per component, with some automated process to upgrade different components and test without human intervention to see what's working and what's not.

So the system could look like:

  1. Developers writing tests (this is crucial – because otherwise there is no reference point, it’s like labeling data in ML)
  2. Every component (project) has its own well-defined CI pipeline – this process is well established by now and CI problem per-component is largely solved
  3. A "Smart Integration Engine" sits on top of the various CI pipelines, assembles component projects into the final product, runs the tests, figures out the shortest path to completion of desired features given the present components, and computes risk factors. If upgrades are not possible, the engine alerts developers about the best possible candidates and where it thinks things are failing. Again, tests are crucial – the integration engine uses tests as the reference point.
  4. CD pipeline then pulls data from Smart Integration Engine and performs the actual roll-out. This completes the cycle.

In summary, to me one of the biggest pains right now is the lack of an integration engine that would mix various components into a product and thus allow for proper traceability of how things actually work in the complete product. I would appreciate thoughts on this. (Spoiler alert – I'm currently working on Reliza to act as that "Smart Integration Engine".)

One final thing I want to mention – to me, a monolith is not the answer for any project of substantial size. So I would be very skeptical of any attempt to actually improve lead times and quality of deliveries by going back to a monolith. First, a monolith has a similar problem of dependency management between various libraries, but it's largely hidden at development time. As a result, people can't really make changes in the monolith, so the whole process slows to a crawl.

Microservices make things better, but then they hit the versioning explosion at the integration stage. Yes, essentially we moved the same problem from the dev stage to the integration stage. But in my view it is still better, and teams actually perform faster with microservices (likely just because of a smaller batch size). Still, the improvement we have achieved so far by dismantling monoliths into microservices is not enough – the version explosion of components is a huge problem with a lot of potential to make things better.

Link to discuss on HN.

Japanese translation by IT News: マイクロサービスにおけるバージョンの組み合わせ爆発

Chinese translation by InfoQ: 微服务——版本组合爆炸!

Russian translation by me on Микросервисы — комбинаторный взрыв версий

Reliza Hub Tutorial Using Our Playground

Reliza Hub is a DevOps metadata management system. It helps manage software releases in the era of Kubernetes and microservices. The tutorial covers the following:

  • Projects and Products, and how to create releases for them
  • How to connect CI script to generate new Project releases (we use GitHub Actions as an example)
  • How to send data from instances to Reliza Hub and how to request back data with details about target releases

Be sure to check Reliza Hub Playground and corresponding GitHub repository.

Reliza Hub Playground Tutorial

Automatic Version Increments With Reliza Hub: 2 Strategies

This article describes how to set up automated version increments for use in CI build pipelines. I will go over 2 possible strategies: for a simple CalVer workflow I will be using the open-source Reliza Versioning tool; for a fully synchronized workflow I will be using the Reliza Hub SaaS.

Update: Please check our tutorial for more advanced use cases of Reliza Hub.

I Choosing Versioning Schema

For a project architect, one of the necessary first steps is to choose a versioning schema. The two most popular conventional models today are SemVer and CalVer.

Both have their pros and cons. Discussing them in detail is out of scope for this article; however, I will highlight the differences very briefly.

The main benefit of SemVer is that it has a strict convention and allows one to estimate the amount of change between versions just by looking at the versions themselves.

For CalVer, the main benefit is that it allows one to quickly see a version's relevance from today's perspective (by establishing the difference between the version's date and today's date). This part is essentially missing from SemVer, since SemVer versions tell nothing about when they were created.

However, the downside of CalVer is, predictably, the lack of difference semantics – for example, a year's difference between CalVer versions may come down to a single line of code, and a CalVer version usually does not carry enough semantics to compensate. CalVer is also less conventionalized and actually presents a class of version schemas that share a common pattern (usually year and month).

So with these and other considerations (e.g., certain tools may require a particular schema), it is necessary to pick a schema for the project.

II Simple workflow with Reliza Versioning OSS

Simple standalone workflow is usually based on automatic increment of previous version referenced somewhere in the source code or in the build process.

Reliza Versioning Open Source Solution has CLI that may be used in-place for version auto-increments in such workflows.

Let’s say we are using Ubuntu flavor CalVer (YY.0M.Micro). If we need to generate first version, we would run

docker run --rm relizaio/versioning -s YY.0M.Micro

This would produce a CalVer version based on today's date. Since I'm writing this in February 2020, I'm currently getting 20.02.0.

Let's now assume that I have an old version referenced that happens to be 20.01.3 and I need to do a CalVer-style bump on it. This means that if the date has changed, it will bump the date first. So if I perform:

docker run --rm relizaio/versioning -s YY.0M.Micro -v 20.01.3 -a bump

I would get 20.02.0 (again, I’m writing this in February 2020).

Note that if we're still in February 2020 and our previous version is 20.02.4, running a simple bump on it would produce 20.02.5, since only the micro component can be bumped.

Now, if I deliberately want to bump only the micro component and not the date, I can run:

docker run --rm relizaio/versioning -s YY.0M.Micro -v 20.01.3 -a bumppatch

This would in turn produce 20.01.4.
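To make the semantics above concrete, here is a toy sketch of the CalVer bump logic in plain shell. This is NOT the actual relizaio/versioning tool – bump_calver is a hypothetical name, and now_ym stands in for the current year-month that the real tool derives from today's date:

```shell
# Toy illustration of Ubuntu-style YY.0M.Micro bump semantics.
bump_calver() {
  prev="$1" now_ym="$2" action="${3:-bump}"
  prev_ym="${prev%.*}"   # e.g. 20.01
  micro="${prev##*.}"    # e.g. 3
  if [ "$action" = "bumppatch" ] || [ "$prev_ym" = "$now_ym" ]; then
    echo "$prev_ym.$((micro + 1))"   # same month, or forced patch: bump micro
  else
    echo "$now_ym.0"                 # month changed: bump date, reset micro
  fi
}

bump_calver 20.01.3 20.02            # -> 20.02.0 (date bump)
bump_calver 20.02.4 20.02            # -> 20.02.5 (micro bump)
bump_calver 20.01.3 20.02 bumppatch  # -> 20.01.4 (micro only)
```

The three calls reproduce the three examples from the article.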

Simple enough? All that is left is to introduce this run command into the build pipeline.

Now, similar strategy works with SemVer:

docker run --rm relizaio/versioning -s semver 

would initialize version at 0.1.0.

docker run --rm relizaio/versioning -s semver -v 3.8.2 -a bump

would produce 3.8.3.

If we wanted to bump minor instead (and get 3.9.0), we would run:

docker run --rm relizaio/versioning -s semver -v 3.8.2 -a bumpminor

Or to bump major (and obtain 4.0.0):

docker run --rm relizaio/versioning -s semver -v 3.8.2 -a bumpmajor 
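As with CalVer, the SemVer actions can be sketched as a toy shell function. Again, this is NOT the actual relizaio/versioning tool – bump_semver is a hypothetical name, just an illustration of the bump / bumpminor / bumpmajor semantics:

```shell
# Toy illustration of the SemVer bump actions.
bump_semver() {
  IFS=. read -r major minor patch <<< "$1"
  case "$2" in
    bump)      echo "$major.$minor.$((patch + 1))" ;;
    bumpminor) echo "$major.$((minor + 1)).0" ;;
    bumpmajor) echo "$((major + 1)).0.0" ;;
  esac
}

bump_semver 3.8.2 bump       # -> 3.8.3
bump_semver 3.8.2 bumpminor  # -> 3.9.0
bump_semver 3.8.2 bumpmajor  # -> 4.0.0
```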

III Synchronized Automated Workflow using Reliza Hub

Reliza Hub is a deployment and release metadata management tool. Among other features, it keeps track of project version schemas and version pins in project branches.

What is a version pin in a branch? Suppose we have a SemVer schema, a release branch built in December 2019, and our regular master branch. To distinguish between these branches, it is good practice to keep them on different minor versions (or, in certain cases, on different major versions).

This means that the master branch may have a 1.3.x pin, while the release branch may have a 1.2.x pin. This way we can tell which branch a release belongs to just by looking at the major and minor components of the version.

A similar effect may be achieved with CalVer versioning – suppose we're using Ubuntu-style CalVer as above (YY.0M.Micro). Then we may choose to pin some stable production branch to, say, 2019.11.Micro, while keeping our master branch on the latest (YY.0M.Micro) schema. Effectively, Reliza will bump the version according to the current date and resolve conflicts via increments of the Micro component. It is very similar to SemVer; the main difference is that the pin is usually set on the date and not on a major / minor combination. More details about different version components can be found in the README of the Reliza Versioning repository on GitHub.

Let us now discuss how to mount a fully automated workflow on Reliza Hub (note: Reliza Hub is currently in public preview until mid-June 2020, after which there will be a free tier for individual developers – see more pricing details here).

First, navigate to , read the terms and, if you agree, either authenticate with GitHub or click OK. Then create your organization and navigate to Projects:

Projects in Reliza Hub
Projects in Reliza Hub

Then click on plus-circle icon to create new Project:

Add New Project - Select Project Version Menu in Reliza Hub
Add New Project – Select Project Version Menu in Reliza Hub

Enter the desired project name and select one of the provided version schema templates, or click Custom and enter your own custom version schema (again, refer to the Reliza Versioning GitHub readme for details on available components).

You may also enter details of your VCS repository for this project, or skip this step for now – after all, it is not required for the version synchronization workflow we are discussing.

Click “Submit”. Your project is now created.

Notice that the system has automatically created a "master" branch for you. If you click on it, you will see the releases of this branch registered in the system (predictably, there are none at this point). Also notice that the master branch's version pin matches the project's version schema exactly.

If you want, you may modify the version pin after clicking on the wrench icon above the releases, which expands the branch settings.

Branch View in Reliza Hub
Branch View in Reliza Hub

Now, if you click on the plus-circle icon in the Branch column, you will be able to create a release manually, and the system will auto-assign the first version for you – 0.0.0 in our case. Every subsequent click on the plus-circle (Add Release) calls the version synchronization logic and yields the next version (making sure that every version is returned only once).

However, what we really want here is a programmatic approach. Here is how:

First of all, we need to generate our API key. For this, expand the project settings menu by clicking on the wrench icon in the Project column:

Reliza Hub - Wrench Icon To Open Project Settings
Reliza Hub – Wrench Icon To Open Project Settings

Then, in the project settings menu, click Generate Api Key. It is best to store the obtained id and key in your favorite vault solution. Note that subsequent clicks on "Generate Api Key" will re-generate the key and invalidate the old one.

Once you have the key, you may use the Reliza Go Client docker container to obtain a version assignment from the system. The call needs to be made as follows:

docker run --rm relizaio/reliza-go-client getversion -i $your_api_key_id -k $your_api_key -b master 

Notice here that the getversion keyword is a trigger to obtain version details from Reliza Hub. The -i parameter stands for your API key id, the -k parameter for the API key itself, and the required -b parameter denotes the branch.

The tool will return the next version for the branch in JSON format, such as {"version":"1.1.5"}.
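If the pipeline needs the bare version string rather than the JSON, it can be extracted with standard tools. A minimal sketch – $response here just holds the example value above, standing in for the actual client output (jq would be a cleaner alternative where available):

```shell
# Hedged sketch: pull the version field out of the getversion JSON response.
response='{"version":"1.1.5"}'
version=$(printf '%s' "$response" | sed 's/.*"version" *: *"\([^"]*\)".*/\1/')
echo "$version"   # 1.1.5
```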

It is also possible to supply the optional --pin flag to the Reliza Go Client, which is required for new branches and updates the version pin for existing branches. For example, if we want to create a new Feb2020 branch with the SemVer version pin 1.2.patch, we would issue the command as:

docker run --rm relizaio/reliza-go-client getversion -i $your_api_key_id -k $your_api_key -b Feb2020 --pin 1.2.patch

More details about Reliza Go Client are provided on its GitHub page.


We covered above:

1. Simple workflow to auto-increment versions in the build pipeline using open source reliza versioning tool.

2. The more advanced automated synchronization workflow using the Reliza Hub metadata management solution. Note that version synchronization is a small portion of Reliza Hub's features, but discussing other functionality is out of scope for this article.

2 thoughts today while snowboarding

I tried snowboarding for the 2nd time in my life (it was not too bad relative to the 1st time 😉 ).
I had these 2 thoughts in the process:
1. Mountain skiing and snowboarding are really great sports for treating OCD: if you exert too much control, you can't get speed – you stop and you fall; if you exert no control whatsoever – you go too fast and, again, fall. So the idea is to find that optimal balance with some control but not too much (you can't control everything, after all).
2. One thing the coronavirus story should reinforce is that a remote workforce is the only way to go in the modern world. How many times has it happened that somebody comes to work sick, then everybody in the office goes out sick, and the cycle repeats throughout the year? That is especially bad in crammed places like call centers. Has anyone tried to estimate the loss of productivity due to sickness (not even mentioning other things such as quality of life or life expectancy)? It is absolutely ridiculous to force everybody to work from the same space when there is no real need for it.

Kubernetes – list all deployed images with sha256 hash

While there is official documentation on how to list all Kubernetes images here, it is missing the imageID field that includes the sha256 hash. The sha256 digest is crucial for our use case at Reliza, so here is a working command to list all imageIDs:

# get all imageIDs (with sha256 hash digest)
kubectl get pods --all-namespaces -o jsonpath="{.items[*].status.containerStatuses[0].imageID}"


  • Tried on Kubernetes 1.17
  • Returned imageIDs are whitespace-separated
  • ImageIDs are duplicated (one per pod) – but it should be trivial to de-duplicate
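The de-duplication can be done by splitting the whitespace-separated list into lines and piping through sort -u. A minimal sketch – $ids here is sample data standing in for the kubectl jsonpath output (note also that containerStatuses[*] instead of [0] would cover multi-container pods):

```shell
# Hedged sketch: de-duplicate a whitespace-separated imageID list.
ids="nginx@sha256:aaa redis@sha256:bbb nginx@sha256:aaa"
printf '%s\n' $ids | sort -u   # unquoted on purpose: split on whitespace
```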

How to make microk8s work with helm 3

This is a quick note to self. When running microk8s and trying to wire up helm 3, I was getting "Error: Kubernetes cluster unreachable". The workaround I found is the following:

mkdir /etc/microk8s
microk8s.config > /etc/microk8s/microk8s.conf
export KUBECONFIG=/etc/microk8s/microk8s.conf

The block above pretty much does the trick. Obviously, for production or near-production use it's worth adding a cron job to refresh the config, and adding the export command to something like .bash_profile.
P.S. What helped me a lot was this discussion of a similar issue for k3s: