Amazon’s ELB is an exciting mix of well-executed infrastructure 2.0 and the proper application of SOA, but it takes a lot of work to make anything infrastructure look that easy.
The notion of Elastic Load Balancing, as recently brought to public attention by Amazon’s offering of the capability, is nothing new. The basic concept is pure Infrastructure 2.0, and the functionality offered via the API has been available on several application delivery controllers for many years. In fact, looking through the options for Amazon’s offering leaves me feeling a bit, oh, 1999. As if load balancing hasn’t evolved far beyond the very limited subset of capabilities exposed by Amazon’s API.
That said, that’s just the view from the outside.
Though Amazon’s ELB might be rudimentary in what it exposes to the public, it is certainly anything but primitive in its use of SOA, and it is a prime example of the power of Infrastructure 2.0. In fact, with the exception of GoGrid’s integrated load balancing capabilities, provisioned and managed via a web-based interface, there aren’t many good, public examples of Infrastructure 2.0 in action. Not only has Amazon leveraged Infrastructure 2.0 concepts with its implementation, but it has further taken advantage of SOA in the way it was meant to be used.
NOTE: What follows is just my personal analysis; I don’t have any especial knowledge about what really lies beneath Amazon’s external interfaces. The diagram is a visual interpretation of what I’ve deduced seems likely in terms of the interactions with ELB, given my experience with application delivery and the information available from Amazon, and should be read with that in mind.
WHAT DOES THAT MEAN?
When I say Amazon has utilized SOA in the way it was meant to be used, I mean that their ELB “API” isn’t just a collection of Web Services, or POWS, wrapped around some other API. It’s actually a well-thought-out and well-designed set of interfaces that describe tasks associated with load balancing and not individual product calls. For example, if you take a look at the ELB WSDL you can see a set of operations that describe tasks, not management or configuration options, such as DeleteLoadBalancer and RegisterInstancesWithLoadBalancer.
To understand why these are so significant and most certainly represent tasks and not individual operations, you have to understand how a load balancer is typically configured and how the individual configuration components fit together. Saying “DeleteLoadBalancer” is a lot easier than what really has to occur under the covers. Believe me, it’s not as easy as a single API call to any load balancing solution. There are a lot of relationships inherent in a load balancing configuration between the virtual server/IP address, the (pools|farms|clusters), and the individual nodes, a.k.a. instances in Amazon-speak. Yet if you take a look at the parameters required to “register instances” with the load balancer, you’ll see only a list of instance ids and a load balancer name. All of those relationships must still be configured, but the APIs make this process appear almost magical.
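To make the gap between a task and the configuration behind it a little more concrete, here is a minimal Python sketch. It is not Amazon’s implementation, which isn’t public. The DeviceConfig class, the function names, the "-pool" naming scheme, and the resolve_address lookup are all hypothetical stand-ins for the object model of a generic load balancer. The point is only that a task-level call taking a name and a list of instance ids can fan out into several dependent configuration steps.

```python
from dataclasses import dataclass, field


@dataclass
class DeviceConfig:
    """Hypothetical object model of a generic load balancer (not Amazon's)."""
    virtual_servers: dict = field(default_factory=dict)  # LB name -> pool name
    pools: dict = field(default_factory=dict)            # pool name -> set of node addresses
    nodes: dict = field(default_factory=dict)            # node address -> number of pools using it


def create_load_balancer(cfg, lb_name):
    """The pool must exist before the virtual server that points at it."""
    pool_name = f"{lb_name}-pool"  # assumed naming scheme, for illustration only
    cfg.pools[pool_name] = set()
    cfg.virtual_servers[lb_name] = pool_name
    return [f"create pool {pool_name}", f"create virtual server {lb_name}"]


def register_instances(cfg, lb_name, instance_ids, resolve_address):
    """Task-level call: the caller supplies only a name and instance ids."""
    pool_name = cfg.virtual_servers[lb_name]
    steps = []
    for instance_id in instance_ids:
        address = resolve_address(instance_id)  # hypothetical id -> address lookup
        if address in cfg.pools[pool_name]:
            continue  # already a member, nothing to do
        if address not in cfg.nodes:
            cfg.nodes[address] = 0
            steps.append(f"create node {address}")
        cfg.nodes[address] += 1
        cfg.pools[pool_name].add(address)
        steps.append(f"add {address} to {pool_name}")
    return steps


def delete_load_balancer(cfg, lb_name):
    """One task, several steps, unwound in dependency order."""
    pool_name = cfg.virtual_servers.pop(lb_name)
    steps = [f"delete virtual server {lb_name}"]
    for address in sorted(cfg.pools.pop(pool_name)):
        steps.append(f"remove {address} from {pool_name}")
        cfg.nodes[address] -= 1
        if cfg.nodes[address] == 0:  # no other pool references this node
            del cfg.nodes[address]
            steps.append(f"delete node {address}")
    steps.append(f"delete pool {pool_name}")
    return steps
```

In this sketch, deleting a load balancer with two registered instances produces six device-level steps from a single call, which is the kind of bookkeeping a task-oriented interface keeps away from the consumer.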
The terminology used here indicates (to me at least) an abstraction, which means these operations are not communicating directly with a physical (or even virtual) device but rather are being sent to a management or orchestration system that in turn relays the appropriate API calls to the underlying load balancing infrastructure.
The abstraction here appears to be pure SOA and it is, if you don’t mind my saying, a beautiful thing. Amazon has not only abstracted the actual physical implementation of the management or orchestration system, but also decoupled (as is proper) the physical infrastructure implementation from the services being provided. There is a clear separation of service from implementation, which allows Amazon to use product X or Y, hardware or software, virtual or concrete, and even one or more vendor solutions at the same time without the service consumer being aware of what that implementation may be.
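A small Python sketch may help illustrate that separation of service from implementation. Everything in it is hypothetical: VendorXBackend and VendorYBackend are invented stand-ins for two very different products, and the shape of the response is an assumption made for illustration, not Amazon’s actual format. What it shows is that the consumer-facing call and its result stay the same no matter which implementation sits underneath.

```python
from typing import Protocol


class Backend(Protocol):
    """What the service layer needs from any underlying implementation."""
    def add_member(self, lb_name: str, instance_id: str) -> None: ...


class VendorXBackend:
    """Hypothetical device driven through imperative commands."""
    def __init__(self):
        self.commands = []

    def add_member(self, lb_name, instance_id):
        self.commands.append(f"pool {lb_name} add member {instance_id}")


class VendorYBackend:
    """Hypothetical software balancer driven by a declarative config."""
    def __init__(self):
        self.config = {}

    def add_member(self, lb_name, instance_id):
        self.config.setdefault(lb_name, []).append(instance_id)


class LoadBalancingService:
    """Consumer-facing service; the backend is an internal detail."""
    def __init__(self, backend: Backend):
        self._backend = backend

    def register_instances(self, lb_name, instance_ids):
        for instance_id in instance_ids:
            self._backend.add_member(lb_name, instance_id)
        # The response carries nothing that reveals the implementation.
        return {"LoadBalancerName": lb_name, "Instances": list(instance_ids)}
```

Calling register_instances("web", ["i-1", "i-2"]) on a service built over either backend returns the same response, while one backend has recorded two commands and the other has updated a configuration document. The consumer has no way to tell which one did the work.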
The current offering appears to be pure layer 4 load balancing, which is a good place to start but lacks the robustness of a full layer 7 capable solution. Eventually Amazon will need to address some of the challenges associated with load balancing stateful applications for its customers, challenges that are typically addressed by functionality such as persistence, cookies, and URI rewriting. Some of this functionality appears to be built in, but it is not well documented by Amazon.
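For readers who haven’t run into the stateful application problem, the following Python sketch contrasts a connection-level choice with cookie-based persistence. It is a generic illustration of the technique and says nothing about how ELB behaves. The class names and the LB_INSTANCE cookie name are hypothetical.

```python
import itertools


class RoundRobin:
    """Connection-level (layer 4 style) choice: no knowledge of HTTP."""
    def __init__(self, instances):
        self._cycle = itertools.cycle(instances)

    def pick(self):
        return next(self._cycle)


class CookiePersistence:
    """Layer 7 style choice: honor a persistence cookie when present."""
    COOKIE = "LB_INSTANCE"  # hypothetical cookie name

    def __init__(self, instances):
        self._instances = set(instances)
        self._fallback = RoundRobin(instances)

    def pick(self, cookies):
        """Return (instance, cookies to send back to the client)."""
        pinned = cookies.get(self.COOKIE)
        if pinned in self._instances:
            return pinned, cookies  # returning client stays on its instance
        chosen = self._fallback.pick()
        return chosen, {**cookies, self.COOKIE: chosen}
```

With round robin alone, a client’s second request may land on an instance that has never seen its session. With the cookie in place, the same client keeps returning to the instance that holds its state, and only new clients are distributed.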
For example, the forwarding of client IP addresses is a common challenge with load-balanced applications, and is often solved by using the custom HTTP header X-Forwarded-For. Ken Weiner addresses this in a blog post, indicating Amazon is indeed using common conventions to retain the client IP address and forward it to the instances being load balanced. It may be the case that more layer 7 specific functionality is exposed than it appears, but is simply not as well documented. If the underlying implementation is capable (and it appears to be, given the way ELB addresses client IP address preservation), it is a pretty good bet that Amazon will be able to address other challenges with relative ease given the foundation they’ve already built.
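On the instance side, recovering the client address from that header might look something like the Python sketch below. The client_ip function is a hypothetical helper, not part of any Amazon library. It assumes the common convention that X-Forwarded-For carries a comma-separated list with the originating client first, and it falls back to the address of the TCP peer (the load balancer) when the header is absent.

```python
def client_ip(headers, peer_address):
    """Return the original client address for a proxied request.

    headers: mapping of HTTP header names to values.
    peer_address: address of the TCP peer, i.e. the load balancer.
    """
    for name, value in headers.items():
        if name.lower() == "x-forwarded-for":  # header names are case-insensitive
            first = value.split(",")[0].strip()  # leftmost entry, by convention the client
            if first:
                return first
    return peer_address
```

Keep in mind that a client can send its own X-Forwarded-For value, so the result is only as trustworthy as the path the request took. It is reasonable to rely on it for logging, and riskier to rely on it for access control unless the instances are reachable only through the load balancer.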
That’s agility; that’s Infrastructure 2.0 and SOA. Can you tell I’m excited about this? I thought you might.
The separation of service from implementation gives Amazon some pretty powerful options: it could switch out physical implementations with relative ease, as it desires or needs, with virtually (sorry) no interruption to consumer services. Coupling this nearly perfect application of SOA with Infrastructure 2.0 results in an agility that is often mentioned as a benefit but rarely actually seen in the wild.
THIS IS INFRASTRUCTURE 2.0 IN ACTION
This is a great example of the power of Infrastructure 2.0. Not only is the infrastructure automated and remotely configured by the consumer, but it is integrated with other Amazon services such as CloudWatch (monitoring/management) and Auto Scaling.
The level of sophistication under the hood of this architecture is cleverly hidden by the simplicity and elegance of the overlying SOA-based control plane which encompasses all aspects of the infrastructure necessary to deliver the application and
Several people have been trying to figure out what, exactly, is providing the load balancing under the covers for Amazon. Is it a virtual appliance version of an existing application delivery controller? Is it a hardware implementation? Is it a proprietary, custom-built solution from Amazon’s own developers? The reality is that you could insert just about any Infrastructure 2.0 capable application delivery controller or load balancer into the “?” spot on the diagram above and achieve the same results as Amazon. Provided, of course, you were willing to put the same amount of effort into the design and integration as has obviously been put into ELB.
While it would certainly be interesting to know for sure, the answer to that question is overridden in my mind by a bigger one: what other capabilities does the physical implementation have and will they, too, surface in yet another service offering from Amazon? If the solution has other features and functionality, might they, too, be exposed over time in what will slowly become the Cloud Menu from which customers can build a robust infrastructure comprising more than just simple application delivery? Might it grow to provide security, acceleration, and other application delivery-related services, too?
If the underlying solution is Infrastructure 2.0 capable, and it certainly appears to be, then such service offerings are more likely than not to be feasible.