If you have a finite and reasonable number of objects to persist on, then storing the persistence table in memory is a great solution. As your numbers show, the 'persist uie' method is relatively inexpensive. Hashing, and especially consistent hashing such as CARP/Election Hash, will incur higher CPU, but it provides other potentially important advantages, such as an unlimited number of persistable objects and the ability for multiple LTMs to compute the same persistence results.
A small optimization that is often overlooked: the content you are looking to hash on is in the HTTP query, not the path portion of the URI. Changing [findstr [HTTP::uri] ...] to [findstr [HTTP::query] ...] will be more efficient, as the LTM doesn't need to search the entire URI. In this example: http://www.f5.com/folder1/index.html?uid=nathan&browser=firefox, www.f5.com is the HTTP::host, /folder1/index.html is the HTTP::path, and uid=nathan&browser=firefox is the HTTP::query. When searching against HTTP::uri, the LTM searches the path first, then the query.
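To illustrate the split described above, here is a small Python sketch (not iRule code) that breaks the example URL into its parts and pulls out the uid value. The findstr helper is a hypothetical stand-in that only approximates the iRule command of the same name; its exact semantics here are an assumption.

```python
from urllib.parse import urlsplit


def findstr(text, search, skip=0, terminator=None):
    """Rough stand-in for iRule findstr (assumed semantics).

    Finds `search` in `text`, skips `skip` characters from the start of
    the match, and returns everything up to `terminator` (or the end).
    """
    start = text.find(search)
    if start < 0:
        return ""
    rest = text[start + skip:]
    if terminator is not None:
        end = rest.find(terminator)
        if end >= 0:
            rest = rest[:end]
    return rest


url = "http://www.f5.com/folder1/index.html?uid=nathan&browser=firefox"
parts = urlsplit(url)

host = parts.netloc          # what HTTP::host would hold
path = parts.path            # what HTTP::path would hold
query = parts.query          # what HTTP::query would hold
uri = path + "?" + query     # what HTTP::uri would hold

# Same result either way, but the query is a shorter string to scan.
uid_from_query = findstr(query, "uid=", 4, "&")
uid_from_uri = findstr(uri, "uid=", 4, "&")
```

Both calls return the same value; the difference is only how many characters have to be scanned before "uid=" is found. The skip count of 4 is the length of "uid=".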
In order to reduce the CPU overhead of consistent hashing, you can (and have) substituted CRC32 for MD5. All requests being equal, you'll find some lumpiness in the server utilization, but you'll shave ~30-50% CPU off the impact of running the iRule. Right now you are seeing 10% CPU with a simple rule doing memory-based persistence. I'm guessing this rule is adding not much more than 1% CPU, and the rest is just the CPU required to push packets, load balance, provide protocol sanitization, etc. You are seeing 50% with CRC32 and 70% with MD5. So I'm again guessing that the iRule impact is ~1% for persist uie, 40% for CRC32, and 60% for MD5, plus roughly 10% CPU for general load balancing overhead.
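To make the MD5 versus CRC32 trade-off concrete, here is a minimal Python sketch of an election-style hash. It is not iRule code and not the exact rule you are running: the member addresses, the elect helper, and the way the member and key are concatenated are all assumptions for illustration. The hash function is the only thing that changes between the two variants.

```python
import hashlib
import zlib


def md5_score(member: str, key: str) -> int:
    """Score a member for a key using MD5 (128-bit result)."""
    digest = hashlib.md5((member + key).encode()).digest()
    return int.from_bytes(digest, "big")


def crc32_score(member: str, key: str) -> int:
    """Score a member for a key using CRC32 (32-bit result, cheaper)."""
    return zlib.crc32((member + key).encode())


def elect(members, key, score=md5_score):
    """Return the member with the highest score for this key.

    One hash is computed per member, so the cost grows with pool size.
    """
    return max(members, key=lambda m: score(m, key))


# Hypothetical pool members, for illustration only.
members = [f"10.0.0.{i}:80" for i in range(1, 9)]

winner = elect(members, "nathan")
winner_crc = elect(members, "nathan", score=crc32_score)
```

Because every member is scored independently, removing a member that did not win leaves the result unchanged, and any LTM with the same member list computes the same winner. CRC32 is cheaper but mixes its input less thoroughly than MD5, which is consistent with the lumpier distribution mentioned above.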
The real CPU hit is not from the number of requests per second, but the number of servers in the pool. The LTM is performing 420 hashes per request at 1,000 requests per second, or about 420,000 hashes per second on your LTM 3600. That's a staggering number, and a real testament to just how fast the LTM really is to be able to accomplish this. An easy way to bring the CPU impact down is to use the 'compound election hash' method described at the end of this article: http://devcentral.f5.com/Default.aspx?tabid=63&PageID=153&ArticleID=135&articleType=ArticleView The idea is to break up your 420 servers into a number of smaller pools. So if you had 4 pools of 105 servers each and used the compound variant of the iRule, you would be performing 1 + 105 hashes per request, times 1,000 req/s, or 106,000 hashes per second. If your MD5 hashing rule had an impact of 60% (plus 10% for LB), I would make a rough guess that you will see your total CPU at about 25% (a little over 15% for the MD5 rule + 10%). You could lower that number further by breaking up the servers into a greater number of pools.
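The hash counts above can be checked with a short Python sketch. This is a model under stated assumptions, not the iRule from the linked article: the member addresses are made up, picking the pool with a single hash modulo the number of pools is my assumption about how the first step could work, and the CPU estimate assumes the rule's cost scales linearly with the number of hashes.

```python
import hashlib


def score(text: str) -> int:
    """MD5 of the text as an integer."""
    return int.from_bytes(hashlib.md5(text.encode()).digest(), "big")


def flat_election(members, key):
    """Score every member; return (winner, number of hashes computed)."""
    scored = [(score(m + key), m) for m in members]
    return max(scored)[1], len(scored)


def compound_election(pools, key):
    """One hash picks a pool (assumed: modulo), then elect inside it."""
    pool = pools[score(key) % len(pools)]
    winner, hashes = flat_election(pool, key)
    return winner, 1 + hashes


# 420 hypothetical members, split into 4 pools of 105.
members = [f"10.0.{i // 250}.{i % 250 + 1}:80" for i in range(420)]
pools = [members[i:i + 105] for i in range(0, 420, 105)]

_, flat_hashes = flat_election(members, "nathan")
compound_winner, compound_hashes = compound_election(pools, "nathan")

REQS_PER_SEC = 1000
flat_per_sec = flat_hashes * REQS_PER_SEC          # 420 * 1000
compound_per_sec = compound_hashes * REQS_PER_SEC  # (1 + 105) * 1000

# Rough estimate, assuming rule CPU is proportional to hashes per request.
MD5_RULE_CPU = 60.0   # percent, the guess from the text
LB_CPU = 10.0         # percent, general load balancing overhead
estimated_cpu = MD5_RULE_CPU * compound_hashes / flat_hashes + LB_CPU
```

Under these assumptions the rule's share comes out a little over 15%, in line with the rough guess above. Note that a plain modulo pool choice reshuffles keys if the number of pools changes, so treat that step as a placeholder rather than a recommendation.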
The better answer is to upgrade to version 10. Yes, I know upgrades are a pain, but it's absolutely worth it for all the goodies and improvements it brings. v10 incorporates a similar variant of Election Hash into the kernel-level code, and is insanely faster. You can still persist on any custom data by using an iRule to find the data together with the built-in Hash Persistence profile, which now has a new option to select the hash algorithm. You'd want to choose CARP (Cache Array Routing Protocol).
Here's an example of the iRule you'd use. You'll select this iRule in the Hash Persistence profile page.
when HTTP_REQUEST {
    # Skip 4 characters ("uid=") and read up to the next "&"
    persist hash [findstr [HTTP::query] "uid=" 4 "&"]
}
A new 'persist carp' command, documented in the Wiki, has also been added into version 10.1 to make it easier to create and modify the Election Hash algorithm. This is not required if you are using the native Hash profile.