Don MacVittie - Persistently Different

posted on Wednesday, March 25, 2009 3:10 PM

Well, now that we have discussed how you got here and what options load balancing offers you, let’s hit upon what you’re up against when porting your application to run under a load balancer. Many resources online say that most applications can be moved behind a load balancer unchanged. I beg to differ: a host of issues crop up even if your application is not retaining state information. So we’ll hop right into it. Note that all of these core issues have solutions, and some have many options for resolving them, but if you don’t know they exist, you’ll be blindsided by them. Forewarned is indeed forearmed.

If you’re new to this series, you can find the complete list of articles in the series on my personal page here

Logging

Your system no doubt uses logs for a variety of things including management reporting, security auditing, and problem resolution. Most applications do, and load balancing makes utilizing logs harder. The problem is that your tools are all pointed at the logs on server1, and that was the entirety of your logs… But now you have logs on server1, server2, server3, and server4, and all must be combined in some way to give you accurate reporting. Some load balancers and most ADCs take care of this for you with centralized logging or even functionality that replaces your reporting. Some, but not all. There are third-party tools out there for log aggregation, some commercial, some open source. I’d get one and familiarize yourself with it while your application is still in test. If you can make a log aggregation tool work for you, then your code doesn’t need to change at all, and none of your functionality breaks… You just have to point your reporting mechanisms at the location you use to store your aggregated logs. One problem resolved.
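
To make the aggregation idea concrete, here is a minimal sketch in Python of what such a tool does at its core: it interleaves per-server logs into one time-ordered stream. It assumes every log line starts with an ISO-8601 timestamp in the same time zone and that each server's log is already in time order. The function names and the log format are mine for illustration, not those of any particular product.

```python
import heapq
from typing import Dict, Iterable, Iterator, Tuple


def _tag(server: str, lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Split each line into (timestamp, server, rest) so it can be sorted."""
    for line in lines:
        timestamp, _, rest = line.rstrip("\n").partition(" ")
        yield timestamp, server, rest


def merge_logs(logs: Dict[str, Iterable[str]]) -> Iterator[str]:
    """Merge per-server logs into a single stream ordered by timestamp.

    Assumption: ISO-8601 timestamps in one time zone, so comparing the
    timestamp strings orders the lines chronologically.
    """
    streams = [_tag(server, lines) for server, lines in logs.items()]
    # heapq.merge lazily interleaves inputs that are already sorted.
    for timestamp, server, rest in heapq.merge(*streams):
        yield f"{timestamp} {server} {rest}"
```

Each merged line carries the name of the server it came from, so reports pointed at the combined stream can still tell the servers apart.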

Client IP tracking

Lori and I have an application that records the IP address of those who log in, which sits behind a load balancer (an ADC, actually). While the IP address recording was just thrown in there because we could, I actually started using that information for reporting on people logging in as guest. All it does (and all most functions of this type do) is pull the client's IP address from the incoming request and throw it into a database. The problem is that our load balancer is a proxy for all users. That allows us to expose the Virtual IP and not actually expose any of our servers to the world except through the IP/Ports we dictate. Unfortunately, by virtue of being a full proxy, it replaces the source IP address with the load balancer’s IP address. So it appeared that everyone in the world was logging into our app from the load balancer. Not the best situation for the reporting I was doing, and really not the best for our web server logs.

The easy fix for this one is to change your source code to work off of the X-Forwarded-For header and make certain that your load balancer is configured to insert it. Sadly, some load balancers don’t support this header, so you’ll have to think of something more inventive in those cases or eliminate the need to track the IP of users.
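
As a rough sketch of that code change, in Python: the function below prefers the X-Forwarded-For header over the connection's peer address, but only when the request actually arrived from a known load balancer. The function name, the lower-cased header keys, and the `trusted_proxies` parameter are assumptions for illustration, since how you read headers depends on your web framework.

```python
from typing import Collection, Mapping


def client_ip(
    headers: Mapping[str, str],
    peer_addr: str,
    trusted_proxies: Collection[str],
) -> str:
    """Return the best available guess at the real client address.

    headers: request headers keyed by lower-cased name (an assumption).
    peer_addr: the address the TCP connection came from; behind a full
        proxy this is the load balancer, not the user.
    trusted_proxies: addresses of your own load balancers.
    """
    # Anyone can send this header, so only believe it when the request
    # came through a load balancer we control.
    if peer_addr not in trusted_proxies:
        return peer_addr
    forwarded = headers.get("x-forwarded-for", "")
    # The header is a comma-separated list; the left-most entry is the
    # originating client and later entries are intermediate proxies.
    first = forwarded.split(",")[0].strip()
    return first or peer_addr
```

If your load balancer appends to an existing header rather than replacing it, the left-most entry is whatever the client claimed, so check how yours is configured before relying on it for anything security related.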

Persistence

Whoo boy, there are few words that make a developer with experience developing behind a load balancer shudder like persistence. Here’s the deal: if your app is tracking state, and that state is stored on the web/app server, then when the user returns and gets directed to a different server, you’ve lost all context for their experience. There are a variety of ways to fix this in any load-balanced environment, but they either aren’t optimal or require not just recoding, but re-architecting portions of your application. If you don’t maintain state, or you use the browser to maintain state for you by having it passed back with each request, then this is a total non-issue for you and you can move along. Because the client will supply context info each time it returns, you don’t have to worry about which server you’re going to.

Which brings us back to the first option: use the browser to track state. In large applications with lots of database interaction this option isn’t feasible, but in smaller applications that just have controls on a page being fed back with each submit, you can do this rather readily. It is more difficult in newer applications, such as AJAX apps and those using advanced .NET functionality, but it’s wholly doable. It just takes some forethought about how your application is used and what goes where.
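
Here is a minimal Python sketch of browser-held state, assuming the state is small, JSON-serializable, and travels in a cookie or hidden form field. The HMAC signature is my addition rather than something the approach requires: it lets any server in the pool detect tampering, provided they all share the same key. `SECRET_KEY`, `pack_state`, and `unpack_state` are hypothetical names.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical key; every server behind the load balancer must share it.
SECRET_KEY = b"replace-with-a-key-shared-by-all-servers"


def _sign(payload: str) -> str:
    return hmac.new(SECRET_KEY, payload.encode("ascii"), hashlib.sha256).hexdigest()


def pack_state(state: dict) -> str:
    """Serialize and sign state so the browser can carry it between requests."""
    raw = json.dumps(state, sort_keys=True).encode("utf-8")
    payload = base64.urlsafe_b64encode(raw).decode("ascii")
    return f"{payload}.{_sign(payload)}"


def unpack_state(token: str) -> dict:
    """Verify the signature and return the state, whichever server we are on."""
    payload, _, signature = token.rpartition(".")
    try:
        expected = _sign(payload)
    except UnicodeEncodeError:
        raise ValueError("state token failed verification") from None
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("ascii")):
        raise ValueError("state token failed verification")
    return json.loads(base64.urlsafe_b64decode(payload.encode("ascii")))
```

The state is signed but not encrypted, so the user can read it. Anything secret still has to live on the server side.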

The easiest solution to this whole problem is the sidearm/server affinity/persistent connections group of options. All of these let you pass off a request to a server and then always return to the same server (though how you return is different for each), but this introduces some issues of its own. For one, the ability to balance load amongst your servers is reduced because the load balancer makes its decision only on the first trip to the server. With some advanced load balancing algorithms that take server feedback, these technologies can actually negate the benefits of load balancing. Still, this is the right solution for apps that have a pretty evenly spread load across all pages, so consider your client use cases and think about whether one of these technologies will solve your problem.
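
A toy Python model of the affinity idea shows where the limitation comes from. This is a sketch of the concept only, not how any particular load balancer implements it; the `StickyBalancer` class and the `client_id` argument (which might be a cookie value or source address in practice) are hypothetical.

```python
import itertools
from typing import Dict, Sequence


class StickyBalancer:
    """Round-robin on a client's first request, then always the same server."""

    def __init__(self, servers: Sequence[str]) -> None:
        self._round_robin = itertools.cycle(servers)
        self._assigned: Dict[str, str] = {}

    def route(self, client_id: str) -> str:
        # The balancing decision happens exactly once per client.
        if client_id not in self._assigned:
            self._assigned[client_id] = next(self._round_robin)
        # Every later request skips balancing entirely, whatever the
        # current load on the assigned server happens to be.
        return self._assigned[client_id]
```

Once a client is in the table, nothing about the current state of the servers influences where its requests go, which is why feedback-driven algorithms lose much of their value under affinity.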

Another solution is to shift the storage of per-connection persisted data to the database. Unless you have a pretty high-end database server, both in hardware and software, this just moves the problem. If you’re running something like Oracle RAC, it’s a viable option, but if you’ve got a single-instance database on a low-to-mid-tier commodity server, you’re probably not going to be satisfied with this solution – it takes code to implement, and if you rearrange your code and then at the end discover that you have just switched the load and the single point of failure to your database, you will likely not be a happy camper. Thus, I don’t recommend this course, though in some situations it might be the right one.
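
For the situations where it is the right course, a minimal Python sketch of the database approach looks like the following. It uses the standard library's `sqlite3` purely as a stand-in for a shared database server, and the `SessionStore` class, table name, and column names are all hypothetical.

```python
import json
import sqlite3


class SessionStore:
    """Keeps per-user state in a shared database instead of server memory."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS sessions ("
            "session_id TEXT PRIMARY KEY, data TEXT NOT NULL)"
        )

    def save(self, session_id: str, data: dict) -> None:
        # The connection as a context manager commits on success.
        with self._conn:
            self._conn.execute(
                "INSERT OR REPLACE INTO sessions (session_id, data) VALUES (?, ?)",
                (session_id, json.dumps(data)),
            )

    def load(self, session_id: str) -> dict:
        row = self._conn.execute(
            "SELECT data FROM sessions WHERE session_id = ?", (session_id,)
        ).fetchone()
        # An unknown session simply starts out empty.
        return json.loads(row[0]) if row else {}
```

Every request now costs at least one extra database round trip, which is exactly the load shift described above, so measure it before committing to this design.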

Finally, you could rewrite the app to not utilize state at all. This is more work if your app is already finished… But it is the most robust of all of these solutions. If you’re just designing an app that you hope to be huge, avoid the Fail Whale and write it this way from day one.

SSL persistence

Another nasty bit, one that is very similar to the Persistence section above but has unique problems of its own, is SSL persistence. Yes indeedy, it’s passingly difficult to decrypt a stream that was encrypted for another server’s public key. And while this issue can be resolved by giving all the app servers that run a particular application the same cert (this is done, though I suspect SANS doesn’t approve), there are other issues. Like the fact that when a client comes back and is directed by the load balancer to a different server, there is no existing SSL session, so the client and server have to renegotiate. Some load balancers and most ADCs provide SSL termination to resolve this issue, terminating the SSL session at the load balancer and communicating from the load balancer to the backend server in the clear, because it is all on your private network. For security reasons, in some applications this is not a viable option, and even if it is, you need to check with your load balancing vendor to see if they support this mechanism. The most common solution to this problem is the sidearm/server affinity/persistent connections set of solutions mentioned above, because once a client connects to a server it is always redirected to that same server and all of these issues go away. Just test the effect this will have on your load balancing algorithm before going this route.

Other options and issues

Of course in this short blog post I can’t hit everything, but these are the major issues I’ve seen. And I’m not touching on the things that a full-blown ADC can do for you that load balancers don’t. I think I’ll take next week’s blog to review load balancing algorithms so you know which does what, then we’ll start to peel away the power of an ADC – which really is amazing in comparison to simple layer 4 (commonly called L4 by networking folks) load balancing.

Until next time,

Don.



Feedback

3/28/2009 6:47 AM
Nice article, thanks Don. You might consider adding 'Clustering' to your persistence options. With tools like Terracotta, it doesn't even have to be difficult.

Keep writing, this is good!

Rick
Rick
3/30/2009 3:15 PM
Hi Don,

Thanks for the article.

I have access to an F5 LTM in our perf labs and I have been playing with the web interface to learn a little about it in my spare time. I wrote an iRule for redirecting new user logins while allowing existing sessions to continue, which can be used when our internal database server locks up due to heavy load. This is now used by our operations folks in production too.

I am now looking at ways to improve our server utilization in production. Since I am doing this in my spare time, I don't have much info from our 'official' route. So I am not sure what load balancing method is used in production for now (I can ask, but I would rather experiment and understand things before that).

In the web GUI of our perf Big-IP, I can see Load Balancing under the 'virtual server -> resources'. And there are dropdowns for default pool, and persistence profiles.

However, beyond this I haven't been able to find info on how to change the load balancing algorithms - round-robin, ratio based, connection based, dynamic etc. I am also interested to know if we can implement our own algorithm using iRules (and if this places a load on the F5 as compared to using an inbuilt algorithm).

Could you point me to the relevant manual or write-up which might have this info?

Thanks,
Sriram
Sriram
3/30/2009 3:36 PM
Hey Sriram!

Very cool that you're getting things done that are moving into production!

It's kind of hidden. Once you know where to look, it makes sense, but the first time it's a little painful. Log into the GUI, choose pools, select any pool, and then choose members. It's the first dropdown.

I can argue that makes sense because the members are what is balanced... But I expected to find it at the top level of Pool when I first went looking.

Tomorrow's article is about the different algorithms - all of the ones we support and ones with a following that we don't, mostly those used by popular software vendors.

Hope that helps!
Don.
Don MacVittie
3/30/2009 4:23 PM
Thanks Don. I found some docs too, relevant to what I am trying to learn/do:

https://support.f5.com/kb/en-us/products/big-ip_ltm/manuals/product/bigip9_0config/ConfigGuide9_0-05-1.html
https://support.f5.com/kb/en-us/products/big-ip_ltm/manuals/product/bigip9_0config/ConfigGuide9_0-15-1.html#wp1172375

The ability to plug WMI as an input to load balancing seems especially interesting.
sriram
3/31/2009 9:53 PM
Thanks for the solid, high-level summary of the gotchas developers should watch out for when developing apps for use with load balancers. In order to take advantage of technologies, particularly the performance enhancing variety, adapting code is an unavoidable way of life.

This issue reminds me of the early 1990s when multi-threaded programming models were just catching on. I attended a USENIX session by Brian Kernighan (one half of the famous K & R tandem that wrote every developer’s favorite reference on C programming) and heard the promise of performance gains set against the pain of having to code (or re-code) applications to achieve those gains.

Moving persistence down to a database is a fairly common way to address the problem. And, it is important to know that if you are looking to move persistence to the data tier, there are options besides Oracle RAC to ensure the database does not become the new bottleneck in your quest to crank out better end-to-end performance.

I am a co-founder and chief strategy officer for xkoto, which offers GRIDSCALE, a database load balancer. GRIDSCALE sits in front of a dynamic pool of active-active databases (DB2 or SQL Server today) so that data queries get load balanced and even DMLs get asynchronously replicated to each database with a tight consistency model. We have learned a great deal about the performance enhancements that often are accompanied by development gotchas. In working with developers, we have analyzed hundreds of custom and packaged applications from startups to big Fortune 100s. The outcome of all this analysis is that there are programming best practices that should be followed to take full advantage of an active-active database architecture.

For example, if your application is going to have multiple consistent instances of its data in a pool, you’ll want to consider things like timestamps. The common use of default timestamps on database table columns can result in different values being generated in each database in the pool. Fortunately, GRIDSCALE can overcome issues like this with its query optimization capabilities but if the app is coded to pass the timestamp to the database instead of letting the database default the value, then the consistency issue disappears altogether.

In an upcoming post (www.xkoto.com/index.php/blog) you’ll see more of these active-active database best practices – stay tuned – and insist to persist.
Albert
3/31/2009 10:59 PM
Albert,
Thanks for commenting! Mentioning RAC was certainly not meant as an exclusive reference, just wanted to mention someone most of my readers would recognize. It probably helps that they're a partner that I've been working with for the last few months, so RAC sprang to mind while writing.
I just read the database scale-out document on your site, looks great, though that's of course a first-blush impression.

Your example leaves me thinking of queue processing... How to ensure that only one server is servicing a DB-based queue... Hmmm. ;-)

Thanks again,
Don.
Don MacVittie
3/31/2009 11:00 PM
Rick:
Agreed, but wanted to leave that for the ADC discussion because it is a more advanced solution. Will check out your stuff before I get there.

Regards,
Don.
Don MacVittie
