Code review: HTTP conditional reselect for active connections, based on persistence, when a member is down or disabled
I am looking for a concept/code review or alternative suggestions if I'm going about this all wrong.
What we are trying to do is maintain servers and release web application updates while disrupting as few users as possible.
We use source address persistence on the HTTP virtual server. The pool is selected based on the request URI using an existing rule. Each pool has a monitor that hits a keep-alive page with just a receive string defined - generally 'OK3' or something similar. We do not use OneConnect. BIG-IP version is 10.2.4 HF7.
The trouble is that even after we remove the keep-alive page and the member is marked down, requests on already-active connections still go to the down server. Some of these requests come from external browsers that use HTTP keep-alive to hold the connection open. As long as they are not idle, their connections persist. Other requests come from other web applications, which also pool their connections and keep them open. New connections are rebalanced, as expected, but this means any requests that rely on persistence for web server state - e.g. images stored locally - break.
What I would like is this: when we are about to take a member out for maintenance, we drain its requests by setting the Disable string on the monitor. While the member is disabled, each request - even on an active connection - is reselected and its persistence record updated, but only if I know from the URI that the request does not require persistence. Otherwise the connection stays on the same member. We monitor the connections while the member drains. When everyone is off, or we have waited long enough, we bring it all the way down and do the maintenance.
I wrote the following to do this.
when HTTP_REQUEST {
    set lb_server_addr [LB::server addr]
    if { $lb_server_addr ne "" } {
        set lb_server_pool [LB::server pool]
        set lb_server_status [LB::status pool $lb_server_pool member $lb_server_addr [LB::server port]]
        if { $lb_server_status eq "down" || $lb_server_status eq "session_disabled" } {
            # Reselect active connections when pool member is out
            # Only do this if there are other active members available..
            if { [active_members $lb_server_pool] > 0 } {
                # Only if the client is NOT requesting stateful things
                # Switch on path after partner, e.g. for /tenant/foo/bar.aspx, this switches on /foo/bar.aspx
                switch -glob [URI::path [HTTP::path] 2] {
                    "/ThisPathNeedsPersistence/*" -
                    "/This/Path/Needs/Persistence" {
                        # Do nothing - allow request through
                    }
                    default {
                        LB::detach
                        # Delete any persistence record so we get a new one. The record is automatically
                        # invalidated when server is DOWN, but not when DISABLED
                        if { [persist lookup source_addr [IP::client_addr] node] eq $lb_server_addr } {
                            persist delete source_addr [IP::client_addr]
                        }
                        pool $lb_server_pool
                    }
                }
            }
        }
    }
}
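To check the branch conditions without a BIG-IP, the decision in the iRule above can be modeled in plain Tcl. This is only a minimal sketch: the proc name should_reselect and its three parameters are hypothetical stand-ins for the LB::status result, the active_members count, and the trimmed request path. It does not model LB::detach, the persistence record handling, or the URI::path trimming.

```tcl
# Hypothetical off-box model of the iRule's decision (plain Tcl, no BIG-IP commands).
# Returns 1 when the request would be detached and re-load-balanced, 0 otherwise.
proc should_reselect {status active_count path} {
    # Only act when the current member is down or disabled.
    if {$status ne "down" && $status ne "session_disabled"} {
        return 0
    }
    # Never reselect when no other member can take the request.
    if {$active_count <= 0} {
        return 0
    }
    switch -glob -- $path {
        "/ThisPathNeedsPersistence/*" -
        "/This/Path/Needs/Persistence" {
            return 0
        }
        default {
            return 1
        }
    }
}

puts [should_reselect "session_disabled" 2 "/foo/bar.aspx"]
puts [should_reselect "session_disabled" 2 "/ThisPathNeedsPersistence/img.png"]
puts [should_reselect "up" 2 "/foo/bar.aspx"]
puts [should_reselect "down" 0 "/foo/bar.aspx"]
```

Run with tclsh, the four calls cover the reselect case, a path that needs persistence, a healthy member, and a pool with no active members.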
It seems to work in testing, but is it OK? Are there any weird conditions you can think of that would break this and that I should account for? Are there alternatives? Or is the approach right in general but I'm misusing something?
I did try setting Action On Service Down to Reselect, but it never seems to do anything when the node is marked down - or at least not consistently. Even if it did work, I assume in-flight requests could be lost. It would also give us no opportunity to drain the connections that have persistence requirements, so those would break, too.
Let me know if I can clarify anything.