Forum Discussion

Ed_Summers
Nimbostratus
Feb 18, 2014

LTM Monitor - Requiring multiple failures before marking pool member down

Each server in a client's pool of HTTP servers exposes a URI that, when polled, runs some health checks on the server itself and returns a text result of "PASS" if all is well. If we don't receive "PASS", we mark the member as down. The monitor generally works well and we have no issues with that basic function.

 

However, for whatever reason, the server sometimes does not respond to the poll within the timeout value. Log and packet analysis shows that the server completes the TCP handshake but leaves the response to the monitor's GET request hanging. After the monitor timeout period the pool member is marked down, but it is marked up again only a few seconds later because the next poll is typically successful.

 

Yes - the server admin needs to investigate and fix the reason for the server ignoring the poll. Understand that effort is underway and separate.

 

(Relevant to BIG-IP 8900, version 10.2.3)

 

I'm looking for a way to configure a monitor so that it ultimately fails only after a number of failed polls. I would prefer this number to be configurable but at this time 2 polls would be sufficient. I believe this can be accomplished via a custom "External Monitor" but would like to know if there is a built-in way to do this since External Monitors consume additional system resources.

 

I'm continuing to search on AskF5 as well as DevCentral but would appreciate any pointers to existing posts or SOL that describe how this is done, or if it is/is not possible.

 

Thanks! Ed

 

2 Replies

  • The purpose of setting the Timeout to (3 x Interval) + 1 second is to let the LTM send multiple polls before it marks the server down.

     

    Some info on this topic in the following posts:

     

    https://devcentral.f5.com/questions/monitor-settings-interval-timeout-etc

     

    https://devcentral.f5.com/questions/monitor-statistics

     

    Eric

     

  • Eric,

     

    Thanks! Bonehead moment on my part as I had spoken to the server admin earlier about the purpose of the timeout. Your note sparked a thought that's making me review the packet capture and analysis for the monitor traffic in this scenario.

     

    Ed
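
The (3 x Interval) + 1 rule from Eric's reply can be illustrated with a small Python model. This is only a sketch under stated assumptions, not F5's implementation: the function names are hypothetical, a successful poll is assumed to answer instantly, a failed poll is assumed never to answer, and the member is assumed to go down once Timeout seconds pass without a successful response.

```python
from typing import List, Optional


def recommended_timeout(interval: int, polls: int = 3) -> int:
    """Timeout in seconds following the (polls x interval) + 1 rule."""
    return polls * interval + 1


def marked_down_at(interval: int, timeout: int,
                   poll_results: List[bool]) -> Optional[int]:
    """Return the time (seconds) at which the member is marked down, or None.

    Simplified model (an assumption, not F5 code): polls are sent at
    interval, 2*interval, ... and the member is marked down once `timeout`
    seconds elapse since the last successful response. Monitoring is
    assumed to start at t=0 with the member up.
    """
    last_success = 0
    for i, ok in enumerate(poll_results, start=1):
        now = i * interval
        deadline = last_success + timeout
        if now > deadline:
            # The timeout expired before this poll was even sent.
            return deadline
        if ok:
            last_success = now
    deadline = last_success + timeout
    end = len(poll_results) * interval
    return deadline if deadline <= end else None


if __name__ == "__main__":
    interval = 5
    # Timeout barely longer than the interval: one missed poll is enough.
    print(marked_down_at(interval, interval + 1, [True, False, True]))
    # Timeout of 3 x interval + 1: two missed polls are tolerated.
    print(marked_down_at(interval, recommended_timeout(interval),
                         [True, False, False, True]))
```

In this model, a timeout only slightly longer than the interval marks the member down after a single unanswered poll, while the 3 x Interval + 1 setting rides out two unanswered polls as long as a later one succeeds.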