Nelson integrates Kubernetes
I was thrilled earlier this week to receive a pull request from Target that added support for Kubernetes to Nelson - the open-source continuous delivery system. Whilst this support is a work in progress, it demonstrates several really important (and validating) aspects which we will discuss in this article. Before we do that however, a little bit of context:
In recent years, the battle to become predominant (or even popular at all) within the cluster scheduling space has really exploded. Mesos, Nomad, and Kubernetes are some of the more popular contenders, each bringing something slightly different to the table. For example, Mesos sits at one end of the spectrum, providing a low-level toolkit for building custom two-level schedulers. Kubernetes is at the other end of the spectrum, with a monolithic scheduler and many ancillary bells and whistles bundled right into the project (discovery, routing, etc.). This leaves Nomad somewhere in the middle between Mesos and Kubernetes, providing a kick-ass monolithic scheduler, but little in the way of prescriptive choices higher up the stack.
Whilst these systems all carry a very different set of trade-offs and operational experiences, they are often operated in a similar manner and all equally suffer from several distinct drawbacks:
Scheduling systems typically democratize access to compute resources within an organization and increase development iteration velocity significantly. Such improvements are a boon for the organization as a whole, but they introduce a slew of additional complexities that are seldom considered ahead of time. One such complexity that is highly problematic is garbage collection, and the associated lifecycle management. Stated simply: if you previously deployed your monolithic application once a week, but you are now deploying microservices 100 times a day, then every week you accrue up to 499 stale deployments that are simply wasting resources or serving customers with old or buggy code revisions. Engineering staff seldom spend time during the day going back to figure out which unnecessary revisions they need to clean up - and frankly, it is not a good use of engineering time to have them do so, especially when the robots can do a better job (more on this in the following section).
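To make the lifecycle problem concrete, here is a minimal sketch of the kind of policy a robot can apply: keep the active revisions plus the N most recent, and expire anything older than a grace period. This is purely illustrative Python - not Nelson's actual implementation, and the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta

def stale_revisions(revisions, active, keep_latest=2, grace_days=7, now=None):
    """Return revision names that are safe to garbage collect.

    revisions: list of (name, deployed_at) tuples
    active:    set of revision names currently serving traffic
    A revision survives if it is active, among the keep_latest most
    recently deployed, or still inside the grace period.
    """
    now = now or datetime.utcnow()
    ordered = sorted(revisions, key=lambda r: r[1], reverse=True)
    keep = {name for name, _ in ordered[:keep_latest]} | set(active)
    return [name for name, deployed_at in ordered
            if name not in keep and now - deployed_at > timedelta(days=grace_days)]
```

A real system layers a dependency graph on top of this (a revision still consumed by another service must not be reaped), which is exactly the graph-pruning work Nelson takes on.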
More often than not, operators of cluster schedulers end up with multiple distinct clusters. This is often an artifact of Conway's law (very prevalent in large companies), but more broadly it stems from historical operational thinking, where implementors had hard separation between "environments" and look for an analog (with many operators not yet trusting micro-segmentation of the network, or application-layer TLS alone). Another common cause of multiple distinct clusters is a desire for global distribution: having separate clusters for East Coast America versus West Coast America, for example. Whatever the cause, the result is swaths of incidental complexity, because having many control planes hampers operational use cases when considering the organization at large. For example, how can an operator quickly assess, for a given application, which clusters or datacenter domains it is deployed in, and discern which of those deployments are active? Often the answer is that this is not possible, or an operator will pull out some janky bash script to scrape the result from every available cluster sequentially.
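For illustration, a single control plane can answer the "where is this deployed, and what is active?" question by fanning out to every known cluster in parallel rather than scraping them one by one. The sketch below is hypothetical Python, not Nelson code; the `query` callable stands in for whatever per-cluster API you actually have.

```python
from concurrent.futures import ThreadPoolExecutor

def deployments_of(app, clusters, query):
    """Ask every cluster about an application concurrently.

    clusters: mapping of cluster name -> API endpoint
    query:    function (endpoint, app) -> list of (revision, is_active)
    Returns a mapping of cluster name -> that cluster's answer.
    """
    with ThreadPoolExecutor(max_workers=max(len(clusters), 1)) as pool:
        futures = {name: pool.submit(query, endpoint, app)
                   for name, endpoint in clusters.items()}
        return {name: future.result() for name, future in futures.items()}
```

The point is not the threading; it is that one system owns the cluster inventory, so "which deployments of X are active, anywhere?" becomes a single call instead of N ad-hoc scrapes.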
Scheduling systems often provide a great deal of control over low-level runtime parameters, sandboxing configurations, networking, security, and so forth. A powerful tool, to be sure. However, this power and flexibility come with cyclomatic and cognitive complexity - is this a complexity cost that you wish every single developer or user of your cluster to pay? Typically this cost is too high, and instead we as operators look for the minimally powerful tools which we can distribute to a wider engineering organization. For example, in most organizations, each and every developer is not deciding how to manage ingress edge traffic, service-to-service traffic, or secure introduction (the act of provisioning credentials or secrets which should not be known by most - or any - staff). These are typically defined by a central group, or a cross-functional set of staff who decide on these policies for everybody - often such structures are required to ensure compliance or governance - which results in everybody else simply copying these configurations into their projects verbatim. Over time this broadens the security and maintenance surface area significantly, rather than decreasing it, making evolution and improvement ever more difficult. For example, consider needing to update thousands of project repositories simply because the preferred TLS cipher list needs to be updated to account for another cipher being compromised.
Not only are these challenges not new, they are extremely widespread. At one point or another, any team operating a scheduling system will run into one or more of these problems. During my tenure running infrastructure engineering at Verizon, my group set about building a solution to these problems. That solution is Nelson.
First and foremost I'd like to reiterate how awesome it is to be receiving community contributions for major features (just look how little code is needed). This is a testament to how easy Nelson is to extend and that its pure functional composition of algebras cleanly demarcates areas of functionality. From a more practical perspective I have a few goals with the Kubernetes support:
- Nelson itself should be deployable either standalone or via Kubernetes. This should be near-zero cost to make happen, but it is an explicit goal as there are users out there who want to "kubernetes everything".
- Vault support (and automatic policy management) should work just as they do for the Nomad-based Magnetar workflow. For the unfamiliar reader, this essentially means that Nelson generates a policy on the fly for use by the deployed pod(s), which at runtime determines what credentials are supplied to the runtime containers.
- When using Kubernetes, Nelson will have its routing control plane disabled. Istio is already becoming the de facto routing system for Kubernetes, and as such we will simply make the Nelson workflow integrate with the Istio pilot APIs. The net effect here is that users of Nelson can still specify traffic-shifting policies, but they will be implemented via Istio at runtime.
- Cleanup works exactly as-is for Kubernetes and is first-class, just like any other scheduler integration. Nelson's graph pruning and logical lifecycle management systems will work across all scheduling domains Nelson is aware of (i.e. multiple datacenters, clusters, etc.).
- The addition of a health-checking algebra to Nelson, such that we can remove the last hard dependency on Consul and provide a pluggable interface. Whilst a key tenet of Nelson is that it is not in the runtime hot path, health checking (or delegation to some health-aware system) is required for Nelson to know if an application successfully warmed up and indicated it was ready to receive traffic. Without this, applications could fail and Nelson would erroneously report said applications as "ready".
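On the Vault goal above, the generated policy might look something like the following fragment. This is illustrative only: the path layout, service name, and capabilities shown here are hypothetical, not Nelson's actual naming convention.

```hcl
# Hypothetical per-deployment policy: scope the pod's Vault token to
# exactly the secrets its stack unit should be able to read.
path "secret/howdy-http/prod/*" {
  capabilities = ["read", "list"]
}
```

Because the policy is generated per deployment, retiring the deployment can also revoke the policy, keeping the secret surface area as small as the deployment graph itself.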
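On the Istio goal above, a Nelson traffic-shifting policy might ultimately be realized as something like the following Istio resource. This YAML is illustrative - the host, subsets, and weights are hypothetical - but it shows the general shape of a 90/10 shift between two revisions.

```yaml
# Illustrative only: an Istio VirtualService expressing a weighted
# traffic shift between two revisions of a service.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: howdy-http
spec:
  hosts:
  - howdy-http
  http:
  - route:
    - destination:
        host: howdy-http
        subset: v1
      weight: 90
    - destination:
        host: howdy-http
        subset: v2
      weight: 10
```

The user still declares their traffic-shifting policy to Nelson as before; the workflow would simply translate it into resources like this at runtime.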
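The shape of the health-checking algebra mentioned above can be sketched as follows. This is illustrative Python rather than Nelson's Scala, and the class names are hypothetical; the point is that Nelson needs only a single "is it ready?" question answered, which Consul, Kubernetes readiness probes, or anything else could implement behind the interface.

```python
from abc import ABC, abstractmethod

class HealthChecker(ABC):
    """Hypothetical pluggable health-checking interface.

    Nelson only needs one question answered: did this deployment warm
    up and report itself ready to receive traffic?
    """
    @abstractmethod
    def is_ready(self, deployment_id: str) -> bool:
        ...

class StaticHealthChecker(HealthChecker):
    """Trivial in-memory implementation, useful for tests; a real
    backend would consult Consul or the Kubernetes readiness API."""
    def __init__(self, ready_ids):
        self._ready = set(ready_ids)

    def is_ready(self, deployment_id: str) -> bool:
        return deployment_id in self._ready
```

With an interface like this, the Consul dependency becomes just one implementation among several, rather than a hard requirement.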
Whilst we will make a concerted effort to make the initial Kubernetes support broadly functional and reliable, I'm certain there are going to be areas of friction, given the much more prescriptive nature of the Nelson interface (which is constrained by design). Additionally, I would love to think that a single Kubernetes workflow will suffice, but in all probability there will be a variety of needs. If this becomes an intractable problem, the project could revisit earlier exploration around a mechanism to externalize workflow definitions (an eDSL for our internal workflow algebra). As such, I would really welcome feedback from users - or potential users - about these trade-offs. Striking the best balance between minimally powerful tools and sufficient flexibility is a frequent challenge in software engineering in the main.
That's about all for now. If you're interested in learning more about Nelson, please visit the documentation or check out a talk I gave earlier in the year. If you prefer something more interactive, we have a Gitter channel that is relatively active.