Availability means more than the dreaded “d” word


The focus on making servers unhackable to prevent service disruption (that’s such a politic way of saying the dreaded “d” word – downtime) is admirable, but it exposes the tendency of technical folks to go down rat holes when discussing application delivery challenges, and specifically the challenge of assuring the availability of applications and services. What generally seems to happen when we start talking about availability in the cloud is that we go down the rat hole of talking specifically about the cloud and not the applications deployed within it.

Lighting the way like a blazing torch so we can go even further down that rat hole is the spate of articles on cloud and reliability that mention it’s a problem but never touch on how it should be addressed. In fact, the real rat hole in the cloud is that almost everything these days gets turned into a discussion on security. It’s closely related, after all, but it is a rat hole. We need to talk about availability on its own and, even more importantly, we need to define it in a way that keeps us from ending up down one rat hole or another before we reach the end of the discussion.

The lack of a definition is problematic because without understanding what it means for an application or service to “be available” it’s kind of hard to focus the conversation and determine how best to make that happen.


Availability of an application should never be construed to mean “the server is up and running.” Never. Just prepare to unlearn if you think that’s true. Do not pass go. Do not collect $200. Clear your mind and let go of that definition. Ready? Good. Let’s continue then.

A running server is merely the minimum requirement for an application to be considered “available”; in reality there’s a lot more that goes into the definition. Availability should be understood to mean:

  1. The server (physical or virtual) and the application are running and accessible.
  2. The application is responding as expected to all requests.
  3. The application is responding in a timely manner to all requests.

The first two are absolutely necessary. The third should be as well; depending on the application and organizational goals it may be less important to you, but it is still part of the definition for the user, so keep in mind that it may matter more than you think. If you’ve ever watched the flurry of tweets that goes by when Twitter or GMail is performing poorly, then you know that users consider timely responses part of being “available” whether you make that part of your official definition or not.
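The three criteria above can be sketched as a single check. This is a hypothetical illustration, not any vendor’s API: the function name, the `expected_marker` content check, and the five-second budget (echoing the threshold discussed below) are all assumptions for the sake of the sketch.

```python
# Hypothetical sketch: evaluate the three availability criteria for one
# response. "Available" requires all three, not just a live server.
def is_available(status_code, body, response_seconds,
                 expected_marker="OK", timeout=5.0):
    reachable = status_code is not None           # 1. server responded at all
    correct = (status_code == 200                 # 2. expected response...
               and expected_marker in body)       #    ...with expected content
    timely = response_seconds <= timeout          # 3. within the time budget
    return reachable and correct and timely

# A 500 error and a six-second response both fail the full definition:
print(is_available(500, "Internal Server Error", 0.2))  # False
print(is_available(200, "status: OK", 6.0))             # False
print(is_available(200, "status: OK", 0.3))             # True
```

The point of folding all three into one boolean is that a user experiences them as one thing: if any criterion fails, the application is, to that user, down.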

The first requirement is the easy one. The server is either up or down; the application is either running or it isn’t. But the second two requirements are harder to determine, especially in a cloud environment. The proliferation of simple load balancing as a means to scale traditional and virtual environments is partially to blame. Simple load balancing and even rudimentary application delivery systems do not go beyond the most simplistic of measures to determine the correctness and timeliness of responses. An application returning a 500 Internal Server Error, or one that doesn’t accept a request for five or more seconds, should for all practical purposes be considered unavailable. But the simple load balancing solutions employed by some cloud providers, and by even more organizations, are incapable of understanding and supporting these two requirements. By supporting I mean ensuring that if one application or virtual application instance isn’t meeting a requirement, then another one that will is immediately selected instead. It’s that decision-making capability that matters in ensuring availability.

Application fluency, an integral component of context-awareness, is a necessary attribute of application delivery systems in order to assure the availability of applications. It’s the ability to understand the responses coming from an application that forms the basis for the decision-making capability that makes it possible for infrastructure to support availability requirements. This is especially true in a volatile environment like cloud computing, where applications can be provisioned and de-provisioned on a more frequent basis than is true of traditional data center architectural models.

This is where application health monitoring becomes an important part of a successful application delivery strategy – specifically, application health monitoring at the application and data layers. Without the ability of a load balancer or application delivery platform to monitor application health at several levels, it is impossible to determine the correctness of responses. Without an agile solution capable of reacting to that information, it is further impossible to do anything about it.

For example, it is one thing for an application delivery platform to identify that a particular instance of an application is not responding correctly, but it is quite another for it to take action based on that information – action such as retrying the request on another instance, or informing the management or provisioning systems that the application is unavailable and a new instance needs to be launched immediately. It is similarly one thing for an application delivery platform to recognize that a particular instance of an application is responding poorly, but quite another for it to take steps to remediate the situation – either by trying another server or by applying optimization and acceleration techniques to improve the response time of the application.
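The “act on that information” step can be sketched as follows. This is a minimal illustration, not a real load balancer: the `dispatch` and `probe` names, the `ConnectionError` convention, and the flat five-second budget are assumptions introduced for the example.

```python
# Hypothetical sketch: try instances in order, skipping any whose response
# is incorrect or too slow, and record failures so a provisioning system
# could be told to launch replacements.
TIMEOUT = 5.0  # assumed timeliness budget, in seconds

def dispatch(request, instances, probe):
    """probe(instance, request) -> (status, body, seconds), or raises
    ConnectionError if the instance is unreachable."""
    failed = []
    for instance in instances:
        try:
            status, body, seconds = probe(instance, request)
        except ConnectionError:
            failed.append(instance)      # unreachable: flag for relaunch
            continue
        if status == 200 and seconds <= TIMEOUT:
            return body, failed          # healthy instance answered in time
        failed.append(instance)          # wrong or slow: retry elsewhere
    raise RuntimeError("no available instance")

# Usage with a fake probe: instance "a" errors, "b" is down, "c" is healthy.
def fake_probe(instance, request):
    if instance == "a":
        return (500, "err", 0.1)
    if instance == "b":
        raise ConnectionError
    return (200, "hello", 0.2)

body, failed = dispatch("GET /", ["a", "b", "c"], fake_probe)
print(body, failed)  # hello ['a', 'b']
```

Returning the `failed` list alongside the response is the second half of the argument: detection alone is worthless unless the information flows onward to whatever can launch a replacement instance.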

It is this flexibility and intelligence that makes a piece of application network infrastructure dynamic.


So how do you monitor the applications you have deployed in a cloud environment? What are the possibilities? Can you do intelligent health monitoring, or is the determination of availability made via an ICMP ping (very bad) or a TCP open/close (only slightly better than bad)? As you’re sitting around considering how to build out your own dynamic/cloud environment, are you thinking about how you’re going to monitor and determine availability?

In a dynamic environment the ability to deep-dive into the health of an application and determine true availability is paramount to success. A virtual machine bearing an instance of your critical application isn’t actually available until the application is up and running and responding correctly to requests. Using ancient techniques like ICMP ping, or even TCP open/close, in a virtualized architecture is asking for trouble because neither is capable of determining the state of an application. It is the application we’re concerned about, and while the state of the virtual machine, the network connection, and the underlying operating system is certainly part of the equation, it’s all just so many bits and bytes of data unless the application itself is accessible, responding correctly, and performing up to expectations.
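The gap between those two kinds of checks can be made concrete. A rough sketch, assuming hypothetical helper names (`tcp_check`, `app_check`) and a made-up `"healthy"` content marker – the point is only that the transport-level check passes while the application is broken:

```python
import socket

# Transport-level check: only proves something is listening on the port.
# It says nothing about the application behind it.
def tcp_check(host, port, timeout=2.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Application-level check: fetch() returns (status, body). Validating the
# status and content catches the "listening but broken" case that a TCP
# open/close (let alone an ICMP ping) cannot see.
def app_check(fetch):
    try:
        status, body = fetch()
    except OSError:
        return False
    return status == 200 and "healthy" in body

# A process can accept connections yet serve nothing but errors; here the
# TCP-style view would say "up" while the application view says "down":
print(app_check(lambda: (500, "Internal Server Error")))  # False
print(app_check(lambda: (200, "status: healthy")))        # True
```

The same split applies one level down: an ICMP ping can succeed when the host is up but the web server process is dead, which is exactly why ping-based monitoring is the worst of the three.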

Intelligent health monitoring is part and parcel of application delivery, but not necessarily of load balancing, so it’s important to consider just what is offered by your chosen (or soon-to-be-chosen) cloud provider so you can determine whether it’s adequate for your needs. And when you’re building out your own dynamic environment, think carefully about how you’re going to manage those various applications from a load balancing and application delivery point of view. It’s all well and good to utilize included clustering solutions or inexpensive proxies to perform such tasks, but what you don’t pay for in such a solution you’ll end up paying for either in downtime (lost revenue, lost productivity) or in third-party software solutions that monitor the applications’ health – or both. Because the latter is not integrated into the former, there’s a disconnect that can only be resolved by investing the time and effort to integrate them such that the former can act on data provided by the latter.

Let me say that again: a lack of integration between intelligent, active health-monitoring capabilities and request distribution (load balancing/application delivery) will result in additional work on the part of the purchaser in the long run.


Are you starting to see the problem? It’s going to cost you more in the long run to build out an infrastructure composed of multiple (albeit probably individually less expensive) solutions than it would to simply invest up front in an application delivery solution capable of providing both the load balancing and the health monitoring you need to ensure availability.

You’re going to need some kind of load balancing solution if you’re going to build out your own cloud environment, and you might as well consider the ramifications of architecting a solution that does not leverage application intelligence and health monitoring to its fullest extent before you decide what’s going to provide that core functionality. Integration must occur at some point, and it’s much more efficient to acquire a solution that’s already integrated. A unified application delivery infrastructure is that solution, and its internal integration forms the basis for simple deployment of ancillary solutions like security and acceleration that can help you ensure the third requirement in the definition of availability: timeliness of response.

First and foremost, however, your infrastructure needs to know the state of an application, and that means being able to understand the application and its data. That comes through health monitoring, whether it’s integrated into the application delivery solution or not. Recognizing the importance of intelligent health monitoring of applications in any environment is the first step toward putting a flexible, dynamic solution in place that can adapt to the needs of each application and provide the broad capabilities necessary to ensure availability in its complete definition. 
