LB_FAILED behavior, expected or not?

Question

I have the following iRule, which I use so that connections are sent to a different page when my nodes are down.
when LB_FAILED {
    switch [LB::server pool] {
        default {
            # Log who asked for what, then send the client to the maintenance page
            set remoteip [IP::remote_addr]
            set uri [HTTP::uri]
            set hostname [HTTP::host]
            log local0. "$remoteip is looking up Hostname $hostname and URI $uri"
            HTTP::redirect "http://maint.my.com"
        }
    }
}
I found something today that behaves differently than I would have expected. I have 3 servers in my pool, and when one of them fails for whatever reason, the rule above will actually send 1/3 of my traffic to this maintenance page until the BigIP marks the server as down. My health check runs at 5-second intervals and fails the server at 16 seconds. So all the traffic that is sent to the one down server that has yet to be marked down will hit the redirect.
Is this expected behavior? It sounds to me like LB_FAILED should actually be LB_SERVER_FAILED.

hooleylist · Answer

Hi,
That sounds right. LB_FAILED is documented as:
"Triggered when the system fails to select a pool member or when a selected pool member fails to respond to a connection request."
If you wanted, you could reselect a new node in the pool instead of redirecting, using LB::reselect. There are a few posts on this in the forums. I think the max retries is hardcoded to two, though. I haven't tested it, but I would assume you could achieve something similar by setting the pool's 'Action on Service Down' to reselect.
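The reselect approach could be sketched roughly as follows. This is an untested sketch that assumes an HTTP profile on the virtual server; the lb_retries variable and the cap of two attempts are illustrative choices, not something taken from the original rule. Nothing here guarantees that the reselect lands on a different member, since the failed member is still considered available until the monitor marks it down.

```tcl
when HTTP_REQUEST {
    # Reset the retry counter for each request (lb_retries is a made-up name)
    set lb_retries 0
}
when LB_FAILED {
    if { $lb_retries < 2 } {
        # Ask for another load balancing decision within the same pool
        incr lb_retries
        LB::reselect pool [LB::server pool]
    } else {
        # Retries exhausted: fall back to the maintenance page as before
        log local0. "[IP::remote_addr]: no member of [LB::server pool] answered, redirecting"
        HTTP::redirect "http://maint.my.com"
    }
}
```

With this in place the redirect would only be reached after the retries are used up, rather than on the first failed connection attempt.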
If you wanted to mark the non-responding node down from the LB_FAILED event, you could do so using LB::status; but then in effect you're setting your monitor timeout to "one request".
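A rough, untested sketch of marking the member down from the rule is shown here. It assumes LB::down is the command that actually sets the status (as far as I know, LB::status only reports it), and that calling LB::down without arguments applies to the member that was just selected.

```tcl
when LB_FAILED {
    # Only mark something down if a member was actually selected
    if { [LB::server addr] ne "" } {
        log local0. "[LB::server addr]:[LB::server port] did not respond, marking it down"
        # Assumption: with no arguments, LB::down acts on the selected member
        LB::down
    }
    # This request still goes to the maintenance page
    HTTP::redirect "http://maint.my.com"
}
```

The trade-off is the one described above: a single failed connection is enough to take the member out of rotation, so one transient hiccup has the same effect as a failed monitor.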
Aaron

al_carandang_11 · Answer

Yes, this is expected behaviour: the BigIP will keep sending traffic to the down server until it is marked down, which happens after your health checks fail for the configured number of retries.
I was just wondering, though, why you have the switch in your code when all you test for is the default condition...

jrahm · Answer

FYI, if you tune your TCP profile to limit the SYN retransmissions to 2, you can get an LB_FAILED event at around 9 seconds, which would occur before your monitor timeout of 16 seconds. Please see this thread for a more detailed discussion of this from deb:
http://devcentral.f5.com/Default.aspx?tabid=53&forumid=5&tpage=1&view=topic&postid=1523815269
