#virtualization load balancing in a virtualized world is the same as it ever was, but different.

trad vs virt lb diagram

The introduction of virtualization and cloud computing to data centers has been heralded as “transformational” and “disruptive” and “game changing.” From an operational IT perspective, that’s absolutely true.

But like transformational innovation in other industries, such disruption is often not in how the core solution is leveraged or used, but how it impacts operations and the broader ecosystem, rather than the individual tasked with using the solution. The transformation of the auto-industry, for example, toward alternative fuel-sourced vehicles is disruptive and changes much about the industry. But it doesn’t change the way you drive a car; it still works on the same principles and the skills you’ve learned driving gas powered cars are still applicable to alternative fuel-source cars.

What changes for the operator – just as within IT -  is there may be new concerns with which you must contend.

Load balancing virtualized applications is in this category. While the core principles you’ve always applied to load balancing applications still applies, there are a few additional concerns that arise from the use of virtualization that you’re going to have to take into consideration.


Let’s remember quickly how load balancing traditional applications works, shall we?

The load balancing service presents to the end-user a single endpoint, i.e. “the application”. Users communicate exclusively with that endpoint. The load balancing service communicates with a pool of resources comprised of one or more application instances. It is by adding instances to the pool that an application is able to scale horizontally to meet demand.

In the most common traditional load balancing environment, each application instance is hosted on a single, physical server. The availability of the “application” is maintained by insuring there are always enough instances (nodes) available to compensate for any failures that might occur at the physical server, operating system, platform, or application layers.

Load balancing services also allow for the designation of “back up” nodes. Each node in a pool may have a back up node that is only activated in the event of a failure. This is used primarily for high-availability purposes to ensure continuous application availability rather than for scaling purposes.

Now, when we replace the physical servers with virtual servers, we have pretty much the same system. There still exists a pool of resources that comprise “the application”, the load balancing service still mediates for the end-user, and there are still enough application instances in the pool to compensate for failure, thus ensuring availability of “the application.”

However, there are some new potential sources of failure that must be addressed that impact the topology – the physical placement – of the application instances in the pool.


One of the most important changes coming from virtualization that must be addressed is fault isolation. Assume for a moment that we took all four physical nodes and consolidated them on a single, physical virtualized platform.

phys virt consolidation

In theory, nothing changes. The load balancing service views a “node” as a unique combination of IP address and TCP port, and whether that’s hosted on a virtual platform or a physical server is irrelevant to the load balancing service. The load balancing algorithms still work the same way, nodes are selected as directed by configured policies, backup nodes are still used to ensure continuous availability, and nothing about the way in which load balancing works changes. 

But it’s very relevant to operations because this type of server-consolidated deployment model introduces higher unrecoverable failure scenarios and it will directly impact the performance (in a bad way) of “the application.”

There are a couple operational axioms at work here:

1. Shared infrastructure (network, compute, storage) means shared risk.

2. As load increases, performance decreases.

Let’s say “Node 1” fails. In both the physical and virtual deployments, the load is simply shifted to the remaining active nodes. No problem.

But what if the network connectivity between the load balancing service and “Node 1” fails? In a physical deployment, no problem – each node has its own physical connection and is unlikely to impact the other nodes. But what about the virtual deployment? Each node has its own virtual network connection, certainly, but does it have its own physical network connection or is it shared? If it’s a shared physical connection and it fails, then all nodes will fail – leaving “the application” unavailable.

Load Balancing Virtualized Applications Rule #1: Team and Trunk.

Physical network redundancy is a must. Modern server platforms are generally enabled with at least 2 if not 4 GBE connections, use them.

So now you’ve got your network topology designed to ensure that a physical failure will not take out every application instance on the server. Next you need to consider how the application instances are isolated and deployed to ensure that a failure at the hypervisor layer does not disrupt all application instances.

Consider that there are two possible reasons you are implementing load balancing: scalability and availability. In the former, you’re trying to ensure supply meets demand. In the latter, you’re trying to mitigate potential failure in a way to ensure “the application” is always available, regardless of failure. If there is a failure at the hypervisor layer, all instances relying on that hypervisor will be impacted (and not in a good way). Regardless of why you’re implementing load balancing, the result of such a failure is the same, instances are unavailable. Similarly, if the physical device on which virtualized applications are deployed fails, every instance on that device will be down.

In both cases, if all your virtual eggs are in one basket and there’s a failure at the hypervisor layer, you’re in trouble.

Load Balancing Virtualized Applications Rule #2: Divide and Conquer.

Application instance redundancy is a must. Never put all your application instances on a single virtualized or physical platform. Spread them across at least two, to isolate potential failures in the virtualization layer or at the physical server layer.

Node backups should always be located on physically separate devices. Load balancing services are adept at discerning failure but they are not necessarily able to determine the source. A failure to communicate with an application instance could be caused by a bad cable, a failed port, an unresponsive network stack, or an application error. The load balancing service knows the application instance is down, but not necessarily why it’s down. If it’s a crashed instance, then failing over to a back up instance on the same server is probably going to work out fine. But if the root cause is a failed port or bad cable, failing over to a backup instance on the same server isn’t going to help – because it is down too.

It is imperative to ensure availability that there are always at least two of everything – and that means physical devices, as well. Never put all your eggs in one basket – at any layer.


Aside from general availability issues, there is also the very real possibility that where you deploy virtualized application instances will impact performance of “the application.” Remember that even though you can designate CPU and memory on a per application instance, they still ultimately shared I/O – both storage and network. That means even if you use rate limiting technologies to try to manage bandwidth consumption as a means to reduce congestion or latency, ultimately you’re impacting performance. If you don’t use rate limiting or other bandwidth-focused solutions to manage the shared network resource, you run the risk of congestion and increasing latency on the wire.

Similarly, shared storage is even more problematic because when you trace I/O down through the system, you end up at a single, shared I/O controller that is going to have some serious limitations on it. I/O intense application instances deployed on the same physical device are going to cause contention in the underlying system, which is going to negatively impact performance.

Again, divide and conquer. Disperse such instances across two (or more) physical servers. The number of servers will depend on the overall scale of the application and the resource consumption rate. Load balancing will  be able to assist in maintaining performance across instances if you take advantage of a response-time aware algorithm such as fastest response time (the assumption is that response time correlates directly to load and in most cases, this is true). This keeps any given instance from becoming overwhelmed.

phys virt arch

Ultimately, what this means is that you have to be a little more aware of physical deployment location for application instances than you did with pure physical deployments. Consolidation is a great way to reduce operational and capital expenditures, but it also means consolidating risk.


This is a particularly tough nut to crack especially when combined with the desire to implement auto-scaling operations in a more cloud-like environment. The idea that you can leverage “whatever idle resources” you can find to scale out applications on-demand is powerful, but it’s also potentially fraught with risk if you’re unable to control placement at all. While the possibility that every instance would end up deployed on a single server or even a select handful of servers is minimal, there is the possibility that multiple instances could be deployed in a way that means a single server failure could eliminate a sizeable number of application instances, resulting in an unacceptable degradation of performance or even downtime for some percentage of users.

In the end, location really does matter when it comes to load balancing virtualized applications. Where they are deployed and in what groupings becomes a critical factor for maintaining performance and availability. The tendency to increase VM density is high, but that tendency can lead to highly disruptive situations in the event of a failed component. Be aware that cost savings from mass-consolidation and “high efficiency” through increasing VM density metrics may look good now, but may not look so good through the lens of hindsight.