Cloud offers an appealing "pay only for what you use" model that makes it hard to resist. Paying by the hour of actual usage sounds like a good deal, until you realize that your site is pretty much "always on" thanks to bots, miscreants, and users. In other words, you're paying for 24x7x365 usage, baby, and that's going to add up. Ironically, the answer to this problem is … cloud.
Don and I occasionally discuss how much longer we should actually run applications on our own hardware. After all, the applications we're running are generally pretty lightweight, and only see real usage by "users" about 12-20 hours a week. Yeah, a week. Given that Amazon AWS pricing per hour runs at or below the cost of electricity per kWh necessary to power the hardware, it seems on the surface that it would be cheaper – even when including data transfer costs – to just pack up the servers and deploy those couple of applications "in the cloud."
Then I got curious. After all, if I was going to pay only when the system was in use, it behooved me to check first and ensure that usage really was as low as I thought.
Imagine my surprise when I logged into my BIG-IP, pulled up total throughput for the last 24 hours, and discovered that there was no single period of time in which traffic was not flowing to and from my servers.
Granted, the total amount of traffic is negligible, but I don't recall anything in the AWS (or anyone else's) pricing guides that defines "usage" as X throughput or Y users per hour. If one user, one bot, or one miscreant happens to make a connection to the server in a given one-hour period, I have to pay for the whole hour even if no other users, bots, or miscreants touch the application for the rest of that time period.
Then I thought about the usage patterns of the applications themselves. One of the heavier trafficked ones uses AJAX to update components in real-time as long as the user is logged in. Unfortunately, many users never log out – they just leave the application running in a tab while they’re off doing other things. The updates still happen, every X seconds/minutes, and even though the user isn’t really “using” the application, it’s still running. So between inattentive users, AJAX, bots, spiders, and miscreants there is always someone – or something – using those applications.
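To put a number on how "always on" one forgotten tab really is, here's a back-of-the-envelope sketch. The 30-second poll interval is an assumption for illustration; real applications vary widely.

```python
# Back-of-the-envelope: how many background requests a single
# idle-but-open tab generates via periodic AJAX polling.
# The 30-second interval is an assumed value, not from any real app.
POLL_INTERVAL_SECONDS = 30

def polls_per_day(interval_seconds: int) -> int:
    """Number of background requests one open tab fires in 24 hours."""
    return (24 * 60 * 60) // interval_seconds

requests_per_day = polls_per_day(POLL_INTERVAL_SECONDS)
# One forgotten tab polling every 30 seconds produces 2,880 requests a
# day -- more than enough to make every billable hour an "active" hour.
```

Even a single inattentive user, then, keeps the meter running around the clock.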
Given throughput indicating activity 24x7, moving to the cloud means paying 24x7, because there's likely always going to be some activity that causes the application to "be in use" – whether it's a legitimate, real, actual user or a spider, bot, or miscreant just out poking around.
ALL CONNECTIONS ARE CREATED EQUAL IN THE CLOUD
While the concept of "on-demand" and only paying for resources when they're in use sounds great, the reality is that resources are likely always in use. That's because the cloud doesn't – and can't at the moment – distinguish whether a client opening a connection is a bot, a miscreant, or a user. And honestly, even if it could, would The Cloud care? The Cloud as implemented by most providers today has no reason to, because this little oft-overlooked tidbit is part of what keeps them rolling in revenue. Constant access = constant charges.
Part of what bothers me – and should bother you – is that in The Cloud all connections are treated equally. A bot and a spider and a miscreant have as much validity as a user. There’s no distinguishing between them, no context in which requests are interpreted. For some industries and use cases that’s not acceptable. Sites and applications that are driven by advertising, for example, have to take special care that they are not charging – or collecting – payment for “views” by bots and spiders because they aren’t “real” eyes. There are myriad solutions to this problem, one of which involves the use of network-side scripting to effectively filter out bots and spiders and ensure they aren’t being delivered ads that count against – or toward – hit counts.
Oh, I'm sure the answer to this problem will be offered up by someone: deploy a SoftADC in the cloud that's capable of network-side scripting and do the same thing. Filter out bots and spiders based on the User-Agent in the HTTP headers and … and what? Send them to a different instance? Reject them? What you do with them is irrelevant, because the operative part of that solution is that you're deploying a SoftADC. That means an instance, an image, an application. An application that will be running 24x7x365 filtering out requests and incurring charges on an hourly basis. That didn't solve the problem, it just swapped one term of the equation for an equal but different value: instead of 2 you're using 3-1. Different number, same result.
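For the curious, here's a minimal sketch of the kind of User-Agent classification logic a network-side script would apply. The token list and the labels are illustrative assumptions, not a complete or spoof-proof solution – User-Agent headers are trivially forged, which is part of why this approach only goes so far.

```python
# Minimal sketch of User-Agent-based bot filtering, the sort of check a
# network-side script (e.g. an ADC rule) might perform on each request.
# The token list is an assumed sample; real deployments maintain much
# longer lists, and the header itself can be spoofed.
BOT_TOKENS = ("bot", "spider", "crawler", "slurp")

def classify(user_agent: str) -> str:
    """Return 'bot' if the User-Agent contains a known crawler token,
    otherwise 'user'."""
    ua = user_agent.lower()
    return "bot" if any(token in ua for token in BOT_TOKENS) else "user"

# classify("Googlebot/2.1")                    -> "bot"
# classify("Mozilla/5.0 (Windows NT) Chrome")  -> "user"
```

Whatever you do with the matches, remember the point above: something still has to run this check 24x7, and in a public cloud that something is itself a billable instance.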
VIRTUAL PRIVATE CLOUD AS THE SOLUTION
If you're a large enough organization to already have an infrastructure and you're looking to The Cloud to expand capacity or address availability – but want to keep your CapEx and OpEx lower by not investing in more servers – then a virtual private cloud solution may be the ticket. Because a virtual private cloud basically extends your data center – the internal side, not the external side – into the cloud, you can leverage existing application delivery network infrastructure to solve the problem. You can inspect the requests, filter bots and spiders to existing application instances inside your data center, and only direct real users to The Cloud.
Basically, you have more control over when and how application instances are utilized, which means you have more control over the costs of delivering applications to your users/customers/partners/etc.
It also leverages existing investments in application delivery infrastructure and skill sets, which keeps the costs associated with moving to The Cloud and choosing from among their limited offerings in infrastructure solutions to a minimum.
Before you decide to move to a usage-based billing system, i.e. cloud, it’s a good idea to figure out what that usage really is – and how much it might end up costing you. Now obviously for me and my tiny sites – even though I have a BIG-IP and could certainly implement it – this is not likely a cost-effective option. I simply don’t have the number of servers to manage that would make it cost-effective. But you probably do, so you should consider carefully whether a public cloud deployment or a virtual private cloud “extension” of your data center is best suited to your specific needs – and budget.
If you're curious/interested in figuring out what a move to Amazon might cost you, they offer a simple web-based calculator into which you can enter hourly usage, bandwidth, and the various services you plan to use, and it will come up with a monthly cost. YMMV, of course, but it's a great place to at least start investigating the potential costs.
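The arithmetic behind that calculator is simple enough to sketch yourself. The $0.10/hour instance rate below is an assumed placeholder, not a quoted AWS price, and bandwidth and storage charges are omitted:

```python
# Rough compute-cost estimate: always-on billing vs. the hours the app
# is genuinely used. The hourly rate is an assumed placeholder value;
# bandwidth, storage, and other service charges are not included.
HOURLY_RATE = 0.10          # assumed instance price, USD per hour
HOURS_PER_MONTH = 24 * 30   # a 30-day month, billed around the clock

def monthly_compute_cost(rate: float, active_hours: int) -> float:
    """Instance cost for the given number of billable hours."""
    return rate * active_hours

always_on  = monthly_compute_cost(HOURLY_RATE, HOURS_PER_MONTH)  # $72.00
truly_used = monthly_compute_cost(HOURLY_RATE, 16 * 4)           # ~16 hrs/week -> $6.40
```

If bots and forgotten tabs keep every hour "active," you pay the always-on number, not the one your real usage would suggest.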