Scaling Containers

Containers are rapidly rushing to the fore. They’re the darling du jour of DevOps and it’s a rare conversation on microservices that doesn’t invoke their BFF, containers. SDx Central’s recent report on containers found that only 17% of respondents were not considering containers at all. That’s comparable with Kubernetes’ State of the Container World (Jan 2016) assertion that 71% of folks were actively using containers, though Kubernetes found a much higher percentage of respondents who say they’re running containers in production (50%) than SDx found (7%).

Regardless, containers are “the thing” right now. And as at least some portion of containerized applications are making it to production, one of the questions that might (should) come up is: how do you scale those bad boys?

From 50,000 feet (or the executive office), scaling containerized apps is really no different than scaling any other app. You slap a load balancer in front of it and voila! It’s scaled.

But the reality in the trenches is there’s a lot more going on than meets the eye. Traditional views of scalability focused solely on the practical execution of scale. How do we configure the network? What algorithm should we use? Are we load balancing based on TCP or leveraging HTTP, too?

Today’s view of scalability is far more complex and includes not just the operational minutiae, but the operational orchestration that manages the process of scalability. We’re not just concerned with what to do but how to do it efficiently and without delay.

Never before has the emphasis DevOps places on collaboration (sharing) been more critical. The success or failure of a scalability strategy no longer relies just on architecture and configuration, but on coordination across and between the systems required to scale containerized applications up and down.

This process must be automated and orchestrated. And I use MUST in the prescriptive RFC sense because our tolerance for delay is decreasing rapidly. When virtualization first hit the scene we had visions of virtual machines popping up and down with dizzying frequency. But the reality was that it takes time to launch a virtual machine. We’re talking minutes here, during which we have plenty of time to execute the processes required to add it to or remove it from the load balancer. Auto-scaling with virtual machines isn’t nearly the frenetic process we imagined.

But it could well be with containers, whose average lifespan is measured in minutes and whose spin-up times are counted in seconds. But spin-up time doesn’t mean available time, because a newly launched container isn’t ready for prime time until it’s been added to the load balancer (or service registry, if you’re playing with microservices) for distribution.
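To make the gap between "spun up" and "available" concrete, here is a minimal Python sketch. It assumes the load balancer pool can be stood in for by a plain list and that readiness can be checked by some callable; `register_when_ready`, `is_ready`, and the example address are hypothetical names for illustration, not any particular product's API.

```python
import time
from typing import Callable, List


def register_when_ready(
    address: str,
    is_ready: Callable[[str], bool],   # hypothetical readiness check
    pool: List[str],                   # stand-in for a load balancer pool
    timeout_s: float = 30.0,
    interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll a readiness check, then add the address to the pool.

    Returns True if the container was registered, False if it never
    became ready before the timeout.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if is_ready(address):
            # Only now is the container "available": it is in rotation.
            pool.append(address)
            return True
        sleep(interval_s)
    return False


if __name__ == "__main__":
    attempts = {"count": 0}

    def fake_ready(addr: str) -> bool:
        # Pretend the app needs a few polls before it answers.
        attempts["count"] += 1
        return attempts["count"] >= 3

    pool: List[str] = []
    registered = register_when_ready("192.0.2.10:32768", fake_ready, pool,
                                     interval_s=0.01)
    print(registered, pool)
```

The point of the sketch is the ordering: the container counts for nothing until the registration step succeeds, so that step has to be as automated as the launch itself.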

And that’s where APIs and DevOps and automation become critical components of this equation. It’s not just a nice-to-have, this operationalization of the infrastructure required to deliver an application to its ultimate destination (that’s the user). It’s essential, and that means making sure that infrastructure is enabled with APIs and templates that provide the means to integrate it with whatever orchestration system you’re using to make this process go as fast as you can. You don’t have time to submit a ticket, or call Alice, or even yell at Bob in the next cube. Each piece needs to be automated and the entire process, from end to end, must be orchestrated.

But let’s not pretend it’s this simple, either. One doesn’t just spin up a container and voila! It’s ready to go. There are networking considerations there, as well, that must be addressed. Containers are “born” with an isolated, private IP address that is only accessible to other containers on the same host. Letting a container talk to other containers (and services and systems) residing on other hosts requires some iptables magic (involving NAT), and it’s the resulting externally reachable address (and port, don’t forget the PORT) that needs to be added to the load balancer, not the container’s private, personal one.
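As a rough illustration of that address swap, the Python sketch below models the NAT mapping as a simple lookup table. It assumes the host publishes each container port on a host port; the `Endpoint` type, the `routable_endpoint` function, and the sample addresses and ports are all made up for the example and do not read real iptables state.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Endpoint:
    ip: str
    port: int


def routable_endpoint(
    host_ip: str,
    container: Endpoint,
    published_ports: Dict[Tuple[str, int], int],
) -> Endpoint:
    """Translate a container's private endpoint to the host-side one.

    published_ports is a hypothetical stand-in for the NAT rules:
    it maps (container_ip, container_port) -> host_port.
    """
    key = (container.ip, container.port)
    if key not in published_ports:
        raise LookupError(f"no published port for {container.ip}:{container.port}")
    # The load balancer needs the host address and the mapped port,
    # not the private address the container was born with.
    return Endpoint(host_ip, published_ports[key])


if __name__ == "__main__":
    nat_table = {("172.17.0.2", 8080): 32768}
    private = Endpoint("172.17.0.2", 8080)
    print(routable_endpoint("192.0.2.10", private, nat_table))
```

Whatever actually performs the translation, the orchestration has to capture its result, because the private address alone is useless to a load balancer sitting outside the host.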

And once you’ve got that you also have to remember that adding it to the pool isn’t just a “here it is” kind of thing. You have to know what pool you’re adding it to and be ready to tell the load balancer how to monitor that new container’s status. After all, we don’t want to distribute traffic to an inactive or broken app, do we?
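Here is a small Python sketch of what "know the pool and tell it how to monitor" might look like. It is an in-memory toy, assuming a round-robin pool whose members each carry their own monitor callable; `Pool`, `Member`, `run_monitors`, and `pick` are illustrative names, not a real load balancer API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Member:
    address: str
    monitor: Callable[[str], bool]  # hypothetical health check
    healthy: bool = False           # unknown members start out of rotation


@dataclass
class Pool:
    name: str
    members: List[Member] = field(default_factory=list)
    _next: int = 0

    def add(self, address: str, monitor: Callable[[str], bool]) -> None:
        """Adding a member means supplying its monitor too."""
        self.members.append(Member(address, monitor))

    def run_monitors(self) -> None:
        """Refresh each member's status; a failing check marks it down."""
        for member in self.members:
            try:
                member.healthy = bool(member.monitor(member.address))
            except Exception:
                member.healthy = False

    def pick(self) -> str:
        """Round-robin across healthy members only."""
        healthy = [m for m in self.members if m.healthy]
        if not healthy:
            raise RuntimeError(f"pool {self.name!r} has no healthy members")
        member = healthy[self._next % len(healthy)]
        self._next += 1
        return member.address


if __name__ == "__main__":
    web = Pool("web")
    web.add("192.0.2.10:32768", lambda addr: True)
    web.add("192.0.2.11:32769", lambda addr: False)  # broken app
    web.run_monitors()
    print(web.pick())
```

Note that a freshly added member receives no traffic until its monitor has passed at least once, which is one reasonable way (among others) to avoid sending users to an app that isn't up yet.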

Conversely, scaling down isn’t just “get rid of it”. Oh, you can, but shame on you if you do. What about all those users (or other services) that were connected to that instance? You’re not really just gonna cut them off like that, are you?

Of course not. They might be in the middle of executing a query or a transaction or leaving you a love note. Proper scaling etiquette requires that you take the container out of the rotation, yes, but you don’t actually cut it off until there are no more active connections. That’s called quiescence (or connection draining) and it’s not only polite, it’s a requirement if you want to make sure business stakeholders don’t chase you around with abandonment reports.
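The quiescence step can be sketched in a few lines of Python. This is a simplified model, assuming the load balancer tracks an active connection count per member and that something else (here a `wait` callback) lets time pass while connections finish; `DrainableMember`, `scale_down`, and `stop` are hypothetical names for illustration.

```python
from typing import Callable, List


class DrainableMember:
    def __init__(self, address: str) -> None:
        self.address = address
        self.accepting = True  # still in rotation
        self.active = 0        # connections currently open

    def open_connection(self) -> bool:
        """New connections are refused once draining has started."""
        if not self.accepting:
            return False
        self.active += 1
        return True

    def close_connection(self) -> None:
        if self.active > 0:
            self.active -= 1

    def start_drain(self) -> None:
        self.accepting = False


def scale_down(
    pool: List[DrainableMember],
    member: DrainableMember,
    stop: Callable[[str], None],   # hypothetical "stop the container" hook
    wait: Callable[[], None],      # hypothetical pause between checks
    max_waits: int = 60,
) -> bool:
    """Take a member out of rotation, wait for connections, then stop it.

    Returns True if every connection finished, False if the wait budget
    ran out and the member was removed anyway.
    """
    member.start_drain()
    for _ in range(max_waits):
        if member.active == 0:
            break
        wait()
    drained = member.active == 0
    pool.remove(member)
    stop(member.address)
    return drained


if __name__ == "__main__":
    m = DrainableMember("192.0.2.10:32768")
    m.open_connection()
    pool = [m]
    # Simulate the last user finishing up during the first wait.
    print(scale_down(pool, m, stop=print, wait=m.close_connection))
```

The `max_waits` cap is a design choice worth arguing about: without it, one stuck connection keeps the container alive forever, and with it, somebody's love note occasionally gets cut short.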

So you see, it’s not so easy after all, and the shorter lifespans (and faster spin-up times) make scaling containers not just an exercise in technical architecture and management, but one of automating tasks and orchestrating processes in order to accomplish that scale (up and down) as responsively as you can. That’s why API-enabled infrastructure is such a big deal: that’s the way integration (between systems, apps, and services) happens today.

Conceptually, scaling containers is nothing new. But the devil, as they say, is in the details, and with containers, those details are all about automating and orchestrating the scaling process so that it doesn’t become the bottleneck that slows down the mad rush to get that app to market.

Published Mar 17, 2016

Version 1.0