TCP redirect on LB_FAILED for in-band health check.

Question

We have several situations in the enterprise where it is desirable to have a large number of farmed services run on a single pool of servers.  New instances come online all the time, and only TCP health checks are required, but we don't want to configure an explicit pool, complete with monitor, each time someone starts up a listening process on a port.  &nbsp;
&nbsp;We want to use a Layer 3 virtual server like this:&nbsp;
&nbsp;virtual moo {
&nbsp; destination 1.1.1.1:any
&nbsp; ip protocol tcp 
&nbsp; pool moo
&nbsp; rule moo
&nbsp;} &nbsp;
&nbsp;pool moo {
&nbsp; member server1:any
&nbsp;} &nbsp;
&nbsp;pool foo {
&nbsp; member server2:any
&nbsp;}&nbsp;
&nbsp;What I'd like to be able to do is create a rule like this:&nbsp;
&nbsp;rule moo {
&nbsp; when LB_FAILED {
&nbsp;  log "connection to [IP::server_addr] failed"
&nbsp;  use pool foo
&nbsp;}&nbsp;
&nbsp;This would enable an on-the-fly TCP health check, essentially -- if the host is not responding on that port, try the other server.  I don't see any reason this shouldn't be possible, but it doesn't work.  I simply get disconnected when LB_FAILED.  LB_FAILED is working, based on LTM output:&nbsp;
&nbsp;May  2 16:20:05 tmm tmm[1049]: 01220002:6: Rule moo : connection failed:  144.203.239.34&nbsp;
&nbsp;Also, it is not the case that LB_FAILED is processed after the client flow is closed.  This rule works:&nbsp;
&nbsp;rule moo {
&nbsp;        when LB_FAILED {
&nbsp;                log "connection failed:  [IP::server_addr]"
&nbsp;                TCP::respond "sorry, dude, your server's down."&nbsp;
&nbsp;        }
&nbsp;}&nbsp;
&nbsp;Observe:&nbsp;
&nbsp;zuul /u/ineteng/Data/f5 239$ telnet 10.165.29.17 23
&nbsp;Trying 10.165.29.17...
&nbsp;Connected to 10.165.29.17.
&nbsp;Escape character is '^]'.
&nbsp;sorry, dude, your server's down.Connection closed by foreign host.
&nbsp;zuul /u/ineteng/Data/f5 240$ &nbsp;
&nbsp;Anyone have any ideas?  This sure would be useful!&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;

krzysztof_kozlo · Answer

If no one has any experience or tips to offer on getting this working, can I ask if anyone at least sees this functionality as useful?  Folks I've talked to here are pretty excited about the possibilities.&nbsp;
&nbsp;What we want to do in effect is set up a Layer 3 rule with no monitoring, but make sure that any connections on any port are directed to a server that's listening on that port.  If nothing is listening, the connection would be dropped.&nbsp;
&nbsp;Combined with, say, source IP persistence, this would allow us to load balance services that talk on arbitrary port ranges, or our present use case, in which we want to be able to start up servers arbitrarily on the pool members and have them load balanced (or at least highly available) without having to touch the LTM.&nbsp;
&nbsp;If we can't do this today, it sounds like a ripe, low-hanging feature request for the dev team at the least!  I don't know of any other vendor who can claim in-band TCP health checking...

jrahm · Answer

Actually, Cisco LocalDirector (yes, that dinosaur) did this passive monitoring.  It removed members from the pool after X number of failed tcp handshake attempts, then occasionally would throw bones back at it in attempts to bring it back "online"&nbsp;
&nbsp;I was hoping that the passive monitoring hyped for 9.4 was in line with this, but it is not the same.

krzysztof_kozlo · Answer

This is great!  The documentation for 9.2.3 does not list "LB::reselect" as a method.  (F5, send your doc writers back to the salt mines.)  Initial results seem positive.  I'll doc my full iRule when and if I get it working.

krzysztof_kozlo · Answer

According to the iRules Wiki (which I just discovered, thank you very much):&nbsp;
&nbsp;This command is used to advance to the next available node in a pool, either using the load balancing settings of that pool, or by specifying a member explicitly. ****Note that the reselection is currently limited to two tries.**** (emphasis added)&nbsp;
&nbsp;If this is correct, it means that a loop is not possible, and the logic&nbsp;
&nbsp;   when LB_FAILED {
&nbsp;                if { [LB::server addr] == "" } {
&nbsp;                        log "connection failed:  no servers available"
&nbsp;                }  else {
&nbsp;                        log "connection failed:  [LB::server addr]"
&nbsp;                        LB::reselect
&nbsp;                }
&nbsp;   }&nbsp;
&nbsp;is all we need.  It also means that this technique is limited to pools with three or fewer members (two retries) unless that documentation is obsolete.&nbsp;&nbsp;

bl0ndie_127134 · Answer

Ok, I would like to kill this urban legend that Passive monitoring is limited to HTTP right now. 'LB::status' can be used from most reasonable events such as LB_FAILED HTTP_RESPONSE etc.

Forum Discussion

TCP redirect on LB_FAILED for in-band health check.

10 Replies

Recent Discussions

F5 Rseries HA

Need advise to setup a policy on F5

F5 Rseries R2600 Tenant sizing

Can iRule mask the payload content on event logs of security

Pricing when used with aws waf

Related Content

F5 Distributed Cloud (XC) API Security in Out-of-Band Mode using BIG-IP

F5 Releases Out of Band Disclosure in Conjunction With Rapid7

Anchor Link Redirect

url redirect

iRule redirection