Forum Discussion

K-Dubb
Sep 30, 2013

HTTP:Retry and LB:Reselect not working as expected.

Hello,

We are using ASP.NET session state via SQL Server behind our BIG-IP, version 10.2.4. Session state is working great. We do not want the user to ever see a Service Unavailable error. We tried setting the HTTP monitor to a very low value, but there was still a delay before we were finally moved to a different server. In the meantime, the user would see 503 errors. We contacted F5, and they said this was the only way to do it. However, I found this iRule, which seemed like it would do the trick: https://devcentral.f5.com/wiki/iRules.HTTP__retry.ashx

As I understand it, it inspects the response to each request and, if there is a 5xx error, retries the request against a different pool member. The problem is that it does not work. The client still sees 5xx errors until the HTTP monitor marks the pool member down. If I look in the LTM log, I can see it logging that a 5xx error was received and that it is retrying and re-load-balancing. However, from the client's point of view, this is not occurring. Am I missing something in the setup? Here is my code:

    # Retry requests to the virtual server's default pool if the server responds with an error code (5xx status)

    when CLIENT_ACCEPTED {
        # On each new TCP connection track that we have not retried a request yet
        set retries 0

        # Save the name of the virtual server default pool
        set default_pool [LB::server pool]
    }

    when HTTP_REQUEST {
        # We only want to retry GET requests to avoid having to collect POST payloads
        # Only save the request headers if this is not a retried request
        if { [HTTP::method] eq "GET" && $retries == 0 } {
            set request_headers [HTTP::request]
        }
    }

    when LB_SELECTED {
        # Select a new pool member from the VS default pool if we are retrying this request
        if { $retries > 0 } {
            LB::reselect pool $default_pool
            log local0. "Re-loadbalancing"
        }
    }

    when HTTP_RESPONSE {
        # Check for server errors
        if { [HTTP::status] starts_with "5" } {

            # Server error, retry the request if we have not already retried more times than there are pool members
            incr retries
            log local0. "5xx error caught: retry $retries out of [active_members $default_pool]"

            if { $retries < [active_members $default_pool] } {
                # Retry this request
                HTTP::retry $request_headers

                # Exit this event from this iRule so we do not reset retries to 0
                return
            }
        }
        # If we are still in the rule we are not retrying this request
        set retries 0
    }

Thanks.

9 Replies

  • The number of retries is limited to the number of available pool members. Once the retry counter reaches that number, the 5xx response is forwarded to the client. Please check /var/log/ltm for the value of $retries versus the number of active members. Perhaps this will explain the 5xx codes on the client side.
  • K-Dubb
    OneConnect is not enabled. Basically, for this test we look to see which app server our session is on. We then stop the app pool on that server. All other pool members are still active.
  • K-Dubb
    According to the logs, it is retrying over and over until it hits the number of members in the pool. I don't know why this would be happening, as the other pool members are up. When the HTTP monitor finally marks the pool member down, my session comes back up on a different pool member.
  • Please extend the log statement in the LB_SELECTED event so that it also shows the selected server:

    log local0. "Re-loadbalancing to [LB::server]"

    You will probably notice that the same server is being selected continuously.

    As indicated by Jason, OneConnect may help to solve the problem. Alternatively, you can try turning off keep-alive on your web servers.
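
    To check whether the same member really keeps being chosen, the logging suggested above can be combined with a log line in HTTP_RESPONSE that records which server actually answered. The following is only a sketch of two events from the original rule with extra logging. It assumes `$retries`, `$default_pool` and `$request_headers` are still set in CLIENT_ACCEPTED and HTTP_REQUEST as in the original post. `IP::server_addr` does not appear anywhere in this thread and is added here purely for diagnosis.

    ```tcl
    # Diagnostic variant of two events from the original rule.
    # $retries, $default_pool and $request_headers are assumed to be set in
    # CLIENT_ACCEPTED and HTTP_REQUEST exactly as in the original post.

    when LB_SELECTED {
        if { $retries > 0 } {
            # Member picked for this retry, logged before asking for a reselect
            log local0. "Retry $retries: LB selected [LB::server]"
            LB::reselect pool $default_pool
        }
    }

    when HTTP_RESPONSE {
        if { [HTTP::status] starts_with "5" } {
            incr retries
            # IP::server_addr is an addition to the thread's code: it logs the
            # address of the server that sent this response
            log local0. "[HTTP::status] from [IP::server_addr], retry $retries of [active_members $default_pool]"

            if { $retries < [active_members $default_pool] } {
                HTTP::retry $request_headers
                return
            }
        }
        set retries 0
    }
    ```

    If every "5xx from" line in /var/log/ltm shows the same address, the retries are landing on the failed member again, which would match the behavior described above.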
  • I've got it working in v10.2.4 HF7, but only for source address affinity in combination with OneConnect (plain OneConnect profile used; by the way, I recommend a 32-bit mask). I'm deleting the persistence record, and things work fine so far. There is still an issue with the retry limit: the counter stops at twice the number of pool members if each server returns a '503':

    when CLIENT_ACCEPTED {
        set retry 0
        set default_pool [LB::server pool]
    }
    
    when HTTP_REQUEST {
            if { ([HTTP::method] eq "GET") && ($retry == 0) } {
            set request_headers [HTTP::request]
            log local0. "Current retries of $retry for [HTTP::uri]"
        }
    }
    
    when LB_SELECTED {
        if { $retry > 0 } {
            persist delete source_addr [IP::client_addr]
            log local0. "Re-loadbalancing to [LB::server] retries $retry [HTTP::uri]"
            LB::reselect pool $default_pool
        }
    }
    
    when HTTP_RESPONSE {
        if { [HTTP::status] starts_with "5" } {
            incr retry
            log local0. "5xx error caught: retry $retry out of [active_members $default_pool]"
    
            if { $retry < [active_members $default_pool] } {
                log local0. "will retry now"
                HTTP::retry $request_headers
                return
            }
        } else {
            set retry 0
        }
    }
    

    Another attempt with cookie persistence has failed so far. But I'm running out of time right now and need to stop at this point, sorry.

  • This one works a bit better with cookie persistence (plain OneConnect applied). But it still does a couple of reselects even after it got a proper response, so I'm not fully happy yet.

    I need to investigate a bit more (and build in a cookie decoding routine), but as I'm a freelancer I need to allocate time for it (I hope for your understanding).

    when CLIENT_ACCEPTED {
        set retry 0
        set reselect 0
        set default_pool [LB::server pool]
    }
    
    when HTTP_REQUEST {
        if { ([HTTP::method] eq "GET") && ($retry == 0) } {
            set request_headers [HTTP::request]
            log local0. "Current retries of $retry for [HTTP::uri]"
        } elseif { ([HTTP::method] eq "GET") && ($retry > 0) } {
            persist none
        }
    }
    
    when LB_SELECTED {
        if { ($retry > 0) && ($reselect == 1) } {
            log local0. "Re-loadbalancing to [LB::server] retries $retry [HTTP::uri]"
            LB::reselect pool $default_pool
        }
    }
    
    when HTTP_RESPONSE {
        if { [HTTP::status] starts_with "5" } {
            LB::down
            set reselect 1
            incr retry
            log local0. "5xx error caught: retry $retry out of [active_members $default_pool]"
    
            if { $retry < [active_members $default_pool] } {
                log local0. "will retry now"
                HTTP::retry $request_headers
                return
            }
        } else {
            persist cookie
            set retry 0
            set reselect 0
        }
    }
    
  • The one below seems to work as expected in v10.2.4.

    Perhaps there is still a little overhead. I will investigate further but concentrate on v11.2.1.

    when CLIENT_ACCEPTED {
        set retry 0
        set reselect 0
        set default_pool [LB::server pool]
    }
    
    when HTTP_REQUEST {
        if { ([HTTP::method] eq "GET") && ($retry == 0) } {
            set request_headers [HTTP::request]
            log local0. "Current retries of $retry for [HTTP::uri]"
        } elseif { ([HTTP::method] eq "GET") && ($retry > 0) && ($reselect > 0) } {
            persist none
        } elseif { ([HTTP::method] eq "GET") && ($retry > 0) && ($reselect == 0) } {
            persist cookie insert "BIGipServer$default_pool"
        }
    }
    
    when LB_SELECTED {
        if { ($retry > 0) && ($reselect == 1) } {
            log local0. "Re-loadbalancing to [LB::server] retries $retry [HTTP::uri]"
            LB::reselect pool $default_pool
            persist cookie insert "BIGipServer$default_pool"
        }
    }
    
    when HTTP_RESPONSE {
        if { [HTTP::status] starts_with "5" } {
            LB::down
            set reselect 1
            incr retry
            log local0. "5xx error caught: retry $retry out of [active_members $default_pool]"
    
            if { $retry < [active_members $default_pool] } {
                log local0. "will retry now"
                HTTP::retry $request_headers
                return
            }
        } else {
            persist cookie insert "BIGipServer$default_pool"
            set retry 0
            set reselect 0
        }
    }
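
    One detail shared by all the rules in this thread: the request is saved only for GET, but HTTP_RESPONSE calls `HTTP::retry $request_headers` for any 5xx. On a connection whose first request is not a GET the variable would not exist yet, and after an earlier GET on the same keep-alive connection it would still hold that older request. A possible guard is sketched below. It is untested, leaves out the persistence handling from the rules above, and relies on the standard Tcl commands `info exists` and `unset -nocomplain`, which are not used elsewhere in this thread.

    ```tcl
    when CLIENT_ACCEPTED {
        set retry 0
        set default_pool [LB::server pool]
    }

    when HTTP_REQUEST {
        # As in the original rule, only touch the saved copy when this is not a retried request
        if { $retry == 0 } {
            if { [HTTP::method] eq "GET" } {
                set request_headers [HTTP::request]
            } else {
                # Not retryable: forget any GET saved earlier on this connection
                unset -nocomplain request_headers
            }
        }
    }

    when LB_SELECTED {
        if { $retry > 0 } {
            LB::reselect pool $default_pool
        }
    }

    when HTTP_RESPONSE {
        # Only retry when a saved GET exists for the current request
        if { ([HTTP::status] starts_with "5") && [info exists request_headers] } {
            incr retry
            if { $retry < [active_members $default_pool] } {
                HTTP::retry $request_headers
                return
            }
        }
        set retry 0
    }
    ```

    With this guard, a 5xx on a non-GET request is passed straight to the client instead of triggering a retry with a missing or stale request.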