iRule Event LB_FAILED - Triggered for Active Server-Side Connections?

Our application creates a "connection pool" to a VIP with 3 LDAP servers. These connections are created at application initialization.

The VIP's pool uses "Load Balancing Method = Least Connections (member)" and "Priority Group Activation = Less than 1". The pool members have these Priority Group values:

ldap-01 : 100
ldap-02 : 90
ldap-03 : 80

When the application is started, all of its 'connection pool' connections go to ldap-01, which is what we want. And of course we want them to stay there unless ldap-01 becomes unavailable.

The application does not handle errors well, but cannot be changed (easily). I am trying to minimize the chance of the app seeing an error when we stop ldap-01 for maintenance. I want the VIP to switch to ldap-02 for an 'in flight' transmission that suffers a time-out or RST.

My question is this: is it possible to use an iRule, such as the one below, to trigger selection of ldap-02 if an already-established connection suffers an error, such as a timeout or RST, when ldap-01 is stopped?

Reading the description of LB_FAILED, I think the answer is no. LB_FAILED fires when the initial connection attempt (SYN) to the selected pool member fails, but that appears to be the only connection-related condition that triggers it(?). If the established server-side connection suffers a transmission timeout or an RST, that will not trigger LB_FAILED, correct?
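If it helps anyone reading, a quick way to check this empirically would be a throwaway logging iRule; this is only a sketch of my own (the log text is arbitrary, and which events actually fire is exactly what I'd be trying to confirm):

```tcl
when LB_FAILED {
    # Expected to fire when the initial server-side connection
    # attempt (SYN) to the selected member fails.
    log local0. "LB_FAILED: member [LB::server addr] for client [IP::client_addr]"
}

when SERVER_CLOSED {
    # Fires whenever the server side closes, orderly or not, so it
    # should show whether a mid-connection RST reaches any event.
    log local0. "SERVER_CLOSED: server side to [IP::server_addr] closed"
}
```

Stopping ldap-01 while traffic is flowing and watching /var/log/ltm should show which of these actually fires.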

We have a 2-second health check, but the app is super busy and there are essentially always active requests in flight.

If the iRule below is not the right approach, is there another iRule event that will allow an LB::reselect to choose ldap-02 and re-send the request?

=== candidate iRule ===

when CLIENT_ACCEPTED {
    set retries 0
}
when LB_FAILED {
    if { $retries < [active_members [LB::server pool]] } {
        LB::reselect
        incr retries
    }
}


Answers to this Question

USER ACCEPTED ANSWER & F5 ACCEPTED ANSWER

Avery,

The LB_FAILED event does not trigger mid-connection, only during server selection. There are the CLIENT_CLOSED and SERVER_CLOSED events, which fire when the connection between the client and the F5, or between the server and the F5, is closed, respectively, but nothing that fires only on a TCP error. It seems F5 has not yet added a way to determine whether a TCP connection was closed normally through a four-way close or by an RST or timeout. As you can see in this article, it has been an issue for some time.
If you explain more about your situation, maybe a workaround can be configured.

Comments on this Answer
Comment made 3 weeks ago by Avery Salmon

Thanks Rico, much appreciated. For sure you get it, but to be specific, here is the VIP's pool definition:

-------- Pool Definition -------
load-balancing-mode least-connections-member
members {

    dal9-dev-cldfmkt-01:ldaps {
        address 10.121.53.108
        priority-group 100
        session monitor-enabled
        monitor-rule LDAP_CHECK and /Common/gateway_icmp (pool monitor)
        state up
    }

    dal9-dev-cldfmkt-02:ldaps {
        address 10.121.53.109
        priority-group 90
        session monitor-enabled
        monitor-rule LDAP_CHECK and /Common/gateway_icmp (pool monitor)
        state up
    }
}

The VIP IP is 10.155.189.9, and when the app (10.120.224.158) initializes it creates a pool of connections that all go to member-01 (10.121.53.108), because it is "priority-group 100", like this:

// tmsh show sys connection | grep 10.155.189.9: | more

10.120.224.158:38024 10.155.189.9:636 10.155.189.6:38024 10.121.53.108:636 tcp 234 (tmm: 0) none
10.120.224.158:38006 10.155.189.9:636 10.155.189.6:38006 10.121.53.108:636 tcp 234 (tmm: 0) none
10.120.224.158:37988 10.155.189.9:636 10.155.189.6:37988 10.121.53.108:636 tcp 234 (tmm: 0) none
------- etc etc ------
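Side note for anyone repeating this: tmsh can filter the connection table itself instead of piping through grep (if I have the filter option names right; the addresses are from the output above):

```shell
# Connections whose client-side destination is the VIP
tmsh show sys connection cs-server-addr 10.155.189.9

# Flows whose server-side destination is pool member 01
tmsh show sys connection ss-server-addr 10.121.53.108
```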

What I am hoping to do is shield the app from any server-side failures, even for an established client-side connection. So if a given server-side connection fails (orderly close, RST, timeout) the F5 would create a server-side connection to pool member 2 and retry the request.

Apologies if this is a newbie Q, but is it correct that once a pool member is chosen for a client-side connection, that LB decision is kept until the server-side and client-side connections are closed? That is, are there any scenarios where the server-side connection is switched to a new pool member while the client-side connection is unaffected (for example, after a failed health check)?

I get the feeling that using LB::reselect in the SERVER_CLOSED event is perhaps dangerous?

Comment made 3 weeks ago by Rico

There are ways to alter only the server-side connection, such as LB::reselect, as you mentioned. LB::reselect can keep the client-side connection alive while changing the connection to the server. Also, as you guessed, it can be problematic because there is no limit on the number of reselections. That, plus the fact that the F5 cannot tell the difference between an intended close and an unintended close such as a reset or timeout, means that this command has very specific use cases.
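To illustrate, one way to blunt the unlimited-reselect problem is to cap it yourself. This is only a sketch (the cap value and counter variable are my own choices, and whether LB::reselect behaves sensibly in SERVER_CLOSED for your TCP profile is something to confirm in a lab first):

```tcl
when CLIENT_ACCEPTED {
    # Per-connection retry budget (arbitrary choice of 2).
    set reselects 0
}

when SERVER_CLOSED {
    # Caveat: the F5 cannot tell an orderly close from an RST here,
    # so this retries on ANY server-side close, bounded by the cap.
    if { $reselects < 2 } {
        incr reselects
        LB::reselect
    }
}
```

The cap keeps a normally-closed connection from reselecting forever, but it does not solve the underlying ambiguity: an orderly close still looks the same as a failure.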

The other way to change the server-side connection is by triggering a pool reassignment manually in an iRule. The main issue here is that I do not know enough about the app to know whether there is something in the response that we could key off to trigger the change.

In short, there are ways to change the server-side connection without affecting the client-side connection, the issue is that the F5 does not have a good way to know when it should do that.

If you have any more questions, I am sure I can help.

Comment made 2 weeks ago by Avery Salmon

Thanks again Rico,

I ASSUME that if the pool's health-check monitor fails, the F5 will move the existing server-side connections to a new pool member, keeping the existing client-side connections unchanged. Is that correct?

I will test this, but (again) I imagine this is an F5 newbie-type question.

We have an LDAP-type health monitor on the pool, as provided "out of the box" by BIG-IP, so it has the ability to check the LDAP response (which, for what it is worth, is a binary protocol).

This codeshare page: https://devcentral.f5.com/codeshare/ldap-proxy has a wealth of iRule examples that can be used for LDAP traffic.

Comment made 2 weeks ago by Rico

Avery,

A health monitor is mainly for deciding whether traffic should be sent at all. If a health monitor fails, it won't really change an existing server-side connection; rather, it will direct the next request to a pool member with a passing health monitor, or simply reset the connection if none are available.

In your case, the health monitor is the best way of determining whether there has been a loss of connection or an error on one of the servers. If a server went down in the middle of an LDAP transaction, that session data would be lost, but after two seconds another server would be available to retry the same transaction.

If all you are doing is taking a server down for maintenance, you can simply disable the pool member you want to perform maintenance on. Disabling a pool member allows current connections to continue accessing the server but all new requests get sent to the available pool members. This should allow you to do what you need to do without disrupting users.
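For example, from tmsh (the pool name ldap_pool is a placeholder for whatever your pool is actually called; the member name comes from your pool listing):

```shell
# Drain ldap-01: existing connections continue, new requests go to
# the remaining available members.
tmsh modify ltm pool ldap_pool members modify { dal9-dev-cldfmkt-01:ldaps { session user-disabled } }

# Re-enable it after maintenance.
tmsh modify ltm pool ldap_pool members modify { dal9-dev-cldfmkt-01:ldaps { session user-enabled } }
```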

If you want to change pool members based on errors or a reset, you could perhaps create an iRule with a timeout, specified separately from the TCP timeout, that would reselect a server. I will keep looking into it, but for right now I am not seeing a way to detect an error faster than that.
