COMMENTS
I think the description above is incorrect, as it should be [md5 URL] % [pool_length] and not the other way around - the iRule at the bottom seems correct
(Typical hash section)
Looks like you're right, I have corrected the initial equation. Thanks!
/deb
Hi,
I think the command:
expr [md5 [HTTP::uri]] % [active_members ]]
will not work, as the remainder operation (expr ... % ...) requires both arguments to be integers.
The md5 command returns a binary or ASCII string, so we should get messages like:
syntax error in expression ?? ?????????? ???? ??????B~ 6: character not legal in expressions while executing expr ...
So we need to convert the MD5 (or simply the HTTP URI) to a decimal (integer) format.
Is there a way to do that?
Thank you.
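The conversion being asked about can be pictured as follows: take the digest in printable hex form, parse it as an integer, and only then apply the modulo. This is a Python sketch rather than an iRule, and `pick_member`, the sample URI, and the pool size of 4 are hypothetical names and values chosen for illustration.

```python
import hashlib

def pick_member(uri: str, pool_size: int) -> int:
    """Map a URI to a member index in the range [0, pool_size)."""
    # hexdigest() is a printable hex string, unlike the raw 16-byte digest
    hex_digest = hashlib.md5(uri.encode("utf-8")).hexdigest()
    # Parse the hex string as a base-16 integer so that % is well defined
    return int(hex_digest, 16) % pool_size

index = pick_member("/images/logo.gif", 4)
print(index)
```

The same idea should carry over to any language: the modulo needs a number, so the digest has to be decoded into one first.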
Yes, I am having a similar issue.
Is there any way to work around it?
TCL error: cache-persist HTTP_REQUEST - syntax error in expression àÈW¡¶[¢ð-ŸBš’ 2: character not legal in expressions while executing expr [md5 [HTTP::uri]] [active_members ehcache]
You may want to have a look at http://devcentral.f5.com/Default.aspx?tabid=53&forumid=5&postid=9834&view=topic
- The simple examples here had crc32 replaced with md5, but this didn't really work as md5 returns a string.
It appears that you've ported some of the ideas from the CARP (http://icp.ircache.net/carp.txt) algorithm into an iRule. See section 3.1, 'Hash Function'. Have you tried using the less CPU-intensive CARP algorithm? It relies on a lightweight binary bit rotate rather than a CPU-intensive md5 call. While TCL doesn't have a built-in bit rotate like C does, you can code an equivalent function to do it.
I'm not sure how to do it in TCL, but here is how I approached the bitwise rotate left in Perl (note: I'm not a programmer, so expect ugly Perl).
sub Rotate_Left {
    # $x: the 32-bit integer to rotate; $n: how many positions to rotate
    my ($x, $n) = @_;
    # Keep only the low 32 bits, since Perl integers may be 64 bits wide
    $x &= 0xFFFFFFFF;
    $n %= 32;
    # Bits shifted out on the left re-enter on the right
    return (($x << $n) | ($x >> (32 - $n))) & 0xFFFFFFFF;
}
The rotate is only part of the CARP hash algorithm; the rest includes some further simple bit calculations that help to further randomize the node name/IP, since they are often similar. The algorithm is clearly laid out in the specification. I have more ugly Perl code for the whole algorithm if you're interested.
So while the iRule would be much longer and a bit more complicated, it could (in theory) be much more efficient and scalable. What do you think?
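For readers who want to experiment with the rotate outside Perl, here is a minimal Python sketch of a 32-bit rotate left, plus a rotate-and-add string hash in the spirit of the CARP hash function. The names `rotl32` and `rotate_hash` are hypothetical, and the shift of 19 is an assumption that should be checked against carp.txt; this is not the complete CARP algorithm, which also combines the URL hash with a per-member hash.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x: int, n: int) -> int:
    """Rotate the low 32 bits of x left by n positions."""
    x &= MASK32
    n %= 32
    # Bits shifted out on the left re-enter on the right
    return ((x << n) | (x >> (32 - n))) & MASK32

def rotate_hash(text: str, shift: int = 19) -> int:
    """Rotate-and-add string hash; shift=19 is an assumption, verify against carp.txt."""
    h = 0
    for byte in text.encode("utf-8"):
        h = (h + rotl32(h, shift) + byte) & MASK32
    return h

print(hex(rotl32(0x12345678, 8)))
print(rotate_hash("/index.html"))
```

The explicit mask matters in languages without fixed-width integers (Python, and Perl on 64-bit builds), because a plain left shift would otherwise keep growing past 32 bits.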
That sounds very cool! I'll have to go browse around a bit and take a look.
Thanks!
#Colin
One issue I have found with this is around object distribution.
If you have a reasonable number of caches, say around 30 and you use the example of:
[crc32 $CURRENT_NODE[HTTP::uri]]
You end up with a very large number of small objects residing on one cache, and if you lose that cache they all get redistributed to a single cache. The favicon is a good example.
I am testing out this:
[crc32 $CURRENT_NODE[HTTP::host][HTTP::uri]]
It is more specific and seems to prevent this from occurring. It also adds only a few extra characters to the CRC calculation.
On a secondary note, however, I am not sure how far this will scale: the higher the volume of HTTP transactions and the larger the pool of nodes, the more CRC calculations you will have to do.
So a simple question is: with CMP, will the iRule execute on separate TMM processes? So far I see the traffic flow on each of the TMM processes, but the iRule is only executed in TMM0, which worries me a bit. Does anyone have an idea on this?
NB: I have no persistence, ASM, Web Accelerator, or global variables in the rule.
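The effect described here can be sketched in a few lines of Python. This is an illustration rather than the iRule itself: `elect` is a hypothetical helper that picks the node with the highest crc32 of node plus key, and the node addresses and host names are made up. Hashing on the URI alone sends every site's /favicon.ico to the same cache, while adding the host to the key lets each site's copy be elected independently.

```python
import zlib

def elect(nodes, hash_key):
    """Return the node whose crc32(node + hash_key) is highest."""
    return max(nodes, key=lambda node: zlib.crc32((node + hash_key).encode("utf-8")))

nodes = [f"10.0.0.{i}:80" for i in range(1, 31)]      # hypothetical pool of 30 caches
hosts = [f"site{i}.example.com" for i in range(50)]   # hypothetical sites
uri = "/favicon.ico"

# URI-only key: every host's favicon elects the same cache
by_uri = {elect(nodes, uri) for _ in hosts}
# Host + URI key: each host's favicon is elected independently
by_host_uri = {elect(nodes, host + uri) for host in hosts}

print(len(by_uri), len(by_host_uri))
```

Because the winner is simply the highest score, removing a node that was not the winner for a given key leaves that key's placement unchanged, which is the property that keeps redistribution small when a cache is lost.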
As an update to my earlier post:
It turns out that if you have one VIP forwarding to another both must be CMP capable.
If the first VIP is not, then the second will not switch between TMM processes, as this is one flow within that TMM and you are restricted to the same TMM.
It's been terrific to explore the world of consistent hashing and see how it can affect real world scenarios. I've had the chance to work with several dozen companies that have deployed this iRule with great success. Recognizing the popularity of this feature, F5 has incorporated the Election Hash concept into the native LTM code. You can now find this in v10 and later buried under Profiles > Persistence > Hash > Algorithm > CARP. Select this, create a small iRule (below) that finds the content you wish to hash upon, apply to your virtual server, and you are good to go. This will scale to vastly higher numbers of nodes with minimal CPU. Give it a try.
when HTTP_REQUEST {
persist hash [HTTP::uri]
}
This is welcome news.
A couple of questions:
1. Is this for one pool of nodes only? That seems to be the case.
2. Is there a limit when you start to worry about the number of nodes?
3. What do higher numbers equate to in the F5 Lab?
4. What http request volume was it tested against?
I think I'll run off now and try this out.
There are about 420 members and 1,000 HTTP requests per second. MD5 uses 70% CPU and CRC32 uses 50% CPU on an LTM 3600. Is there any way to use less CPU?
BTW, using persist uie only uses 10% CPU.
Are you able to share your iRule for this?
Are you doing 1 crc32 calc per node per request?
If you are, then 50% CPU is very respectable and better than I can get with a VIPRION blade (85% for CRC calcs) doing 8k requests/sec and only 30 nodes.
Have you tried the CARP persistence method at all that is in 10.1?
The rule does not seem to be any different, and I can only run it on v9.4.7. Because persist uie only runs on tmm0, I use the second rule below to run all traffic with CMP. However, that rule seems to use more CPU than persist uie, because there are so many members in the pool. Is there a better way to handle all the traffic?
when HTTP_REQUEST {
set ucid_uri [findstr [HTTP::uri] "uid=" 5 "&"]
persist uie $ucid_uri 1200
}
when HTTP_REQUEST {
    set ucid_uri [findstr [HTTP::uri] "uid=" 5 "&"]
    # Highest hash seen so far and the member that produced it
    set S ""
    foreach N [active_members -list [LB::server pool]] {
        # Hash each member/key pair once and reuse the value
        set H [crc32 $N$ucid_uri]
        if { $H > $S } {
            set S $H
            set W $N
            # log local0. "the crc32 is $S and pool member is $W"
        }
    }
    pool [LB::server pool] member [lindex $W 0] [lindex $W 1]
}
If you have a finite and reasonable number of objects you could persist upon, then storing the persist table in memory is a great solution. As your numbers show, the 'persist uie' method is relatively inexpensive. Hashing, and especially consistent hashing such as CARP/Election Hash, will incur higher CPU, but it provides other potentially important advantages, such as an unlimited number of persistable objects and multiple LTMs that can compute the same persistence results.
A small optimization that is often overlooked: the content you are looking to hash on is in the HTTP query, not the path portion of the URI. Changing [findstr [HTTP::uri] to [findstr [HTTP::query] will be more efficient, as the LTM doesn't need to perform this operation on the entire URI. In this example, http://www.f5.com/folder1/index.html?uid=nathan&browser=firefox, www.f5.com is the HTTP::host, /folder1/index.html is the HTTP::path, and uid=nathan&browser=firefox is the HTTP::query. When doing a search against the HTTP::uri, the LTM searches the path first, then the query.
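To make the host/path/query split concrete, here is a small Python sketch using the standard library's URL parser on the example URL. It only illustrates which part of the URL each piece corresponds to; it is not how the LTM parses requests internally.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.f5.com/folder1/index.html?uid=nathan&browser=firefox"
parts = urlsplit(url)

host = parts.netloc   # corresponds to HTTP::host
path = parts.path     # corresponds to HTTP::path
query = parts.query   # corresponds to HTTP::query

# Searching only the query string avoids scanning the path for "uid="
uid = parse_qs(query)["uid"][0]
print(host, path, query, uid)
```

The longer the path, the more the query-only search saves, since the path never needs to be scanned for the "uid=" marker.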
To reduce the CPU overhead of consistent hashing, you can substitute CRC32 for MD5 (as you have). All requests being equal, you'll find some lumpiness in server utilization, but you'll shave ~30-50% CPU off the impact of running the iRule. Right now you are seeing 10% CPU with a simple rule doing memory-based persistence. I'm guessing this rule adds not much more than 1% CPU, and the rest is just the CPU required to push packets, load balance, provide protocol sanitization, etc. You are seeing 50% with CRC32 and 70% with MD5. So I'm again guessing that the iRule impact is ~1% for persist UIE, ~40% for CRC32, and ~60% for MD5, plus roughly 10% CPU for general load balancing overhead.
The real CPU hit comes not from the number of requests per second, but from the number of servers in the pool. The LTM is performing 420 hashes per request at 1,000 requests per second, or about 420,000 hashes per second on your LTM 3600. That's a staggering number, and a real testament to just how fast the LTM is. An easy way to bring the CPU impact down is to use the 'compound election hash' method described at the end of this article: http://devcentral.f5.com/Default.aspx?tabid=63&PageID=153&ArticleID=135&articleType=ArticleView The idea is to break up your 420 servers into a number of smaller pools. So if you had 4 pools of 105 servers each and used the compound variant of the iRule, you would be performing 1 + 105 hashes per request, times 1,000 req/s, or 106,000 per second. If your MD5 hashing rule had an impact of 60% (plus 10% for LB), I would make a rough guess that you will see your total CPU at about 25% (a little over 15% for the MD5 rule + 10%). You could lower that number further by breaking up the servers into a greater number of pools.
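The arithmetic behind that estimate can be written out as a short Python sketch. The 60% and 10% figures are the guesses from the post, and the assumption that rule CPU scales linearly with the number of hashes is mine, so treat the result as a rough estimate only.

```python
REQUESTS_PER_SECOND = 1_000
MEMBERS = 420
FLAT_RULE_CPU = 60.0   # percent, guessed impact of the MD5 rule on one flat pool
BASELINE_CPU = 10.0    # percent, guessed general load-balancing overhead

def hashes_per_second(hashes_per_request, requests_per_second=REQUESTS_PER_SECOND):
    return hashes_per_request * requests_per_second

# Flat pool: one hash per member for every request
flat = hashes_per_second(MEMBERS)

# Compound variant: 1 hash to pick a pool, then one per member of that pool
pools = 4
compound = hashes_per_second(1 + MEMBERS // pools)

# Assumption: rule CPU scales linearly with hashes per second
estimated_cpu = BASELINE_CPU + FLAT_RULE_CPU * compound / flat
print(flat, compound, round(estimated_cpu, 1))
```

Raising `pools` shrinks the per-request hash count further, which is why splitting the servers into more pools lowers the estimate.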
The better answer is to upgrade to version 10. Yes, I know upgrades are a pain, but it's absolutely worth it for all the goodies and improvements it brings. v10 incorporates a similar variant of Election Hash into the kernel-level code, and is insanely faster. You can still persist on any custom data by using an iRule to find the data together with the built-in Hash Persistence profile, which now has a new option to select the hash algorithm. You'd want to choose CARP (Cache Array Routing Protocol).
Here's an example of the iRule you'd use. You'll select this iRule in the Hash Persistence profile page.
when HTTP_REQUEST {
persist hash [findstr [HTTP::query] "uid=" 5 "&"]
}
A new 'persist carp' command, documented in the Wiki, has also been added into version 10.1 to make it easier to create and modify the Election Hash algorithm. This is not required if you are using the native Hash profile.
Hash election or CARP is most useful when you want to spread the cache load across a farm AND not have duplicate objects stored in each cache, thereby gaining the most benefit you can from the storage and processing power you have.
Another benefit I hadn't considered until Nathan pointed it out is that this behavior is consistent across physical LTM devices, so multiple active LTMs can help you continue to scale if required. That is quite powerful.
Nathan, I understand that it's the CRC calcs that are causing the load. What I can't understand is how a 3600 can do a staggering 420k/sec when a VIPRION blade can't do more than 70k/sec. It seems incongruous.
We use one hash to select a pool (we have 3) and then another hash to select the node (10 in each pool), so we only do 13 CRC calcs per HTTP request. When the farm is full we hit 85% CPU at ~7k req/sec, so that is ~91k CRC calcs/sec. That is by no means insignificant, but it's far short of 420k @ 70% ;-).
I have modified the irule so that it is CMP capable and spread across all CPU cores as well.
The only thing we are doing differently is checking the TCP payload for the GET string before we send it to the hash election VIP, so it is a VIP targeting a VIP. But this should be quite low in terms of processing required, I hope.
When you say it is 'insanely faster', have you seen any metrics on that? Things like the number of nodes tested, etc.
Also, can you still hash based on the pools implementation, or do all nodes need to be in a single pool? This might be covered by 'persist carp', I think?
This is turning into a forum thread. I'll post one later today.