Forum Discussion

lkchen
May 05, 2010

Down a pool if a monitor fails?

While it makes no sense to me, I've been asked to implement a monitor for a pool of web servers that does a 'GET /monitor/status.html' and, if the response does not contain the string "SUMMARY: OK", disables the entire pool.

 

 

The pools are for our main production web site, and the normal fallback URL in our HTTP profile points to a (different) status page on the main site.

 

 

The purpose is so that the admins can take the site down for maintenance without bothering me, the F5 admin. This is even though they've formed a committee to blame the F5 for a recent outage caused by all members in the pool being down.

 

 

Lawrence

 

3 Replies

  • Wow....That's awesome.

    Hoolio's idea is interesting.

    I would suggest doing it in two parts.

    Part 1: Custom Health Monitor...

    Type: HTTP

    Send String: GET /monitor/status.html

    Receive String: SUMMARY: OK

    Part 2: iRule that, when no pool members are available, either redirects to another website or sends the traffic to another pool.

    
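    when HTTP_REQUEST {
      # Check if the default pool has less than one active member
      if { [active_members [LB::server pool]] < 1 } {
        # Option 1: redirect the client to another website
        HTTP::redirect "http://www.google.com"
        # Option 2 (use instead of the redirect): send the request to an alternate pool
        # pool alternate.pool.of.servers
      }
    }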
    

    http://devcentral.f5.com/Wiki/default.aspx/iRules/LB__server.html

    The server administrators would then have to take down all of the servers themselves, and the F5 iRule would then redirect the traffic. That would make it harder for one of them to "accidentally" take down the entire pool and cause an outage, and it would still allow them to take down individual servers for troubleshooting.

    Just a thought.
  • lkchen
    Well, that was surprising....I went with "no, it can't be done" and they accepted that response.

     

     

    Meanwhile, they've formed a blue-ribbon committee to determine how the F5 is at fault for failing to serve content for our website when all the members were down, and how they are going to monitor the F5 so that they can document its failure to handle this situation properly. Wonder if that's why some services seem slow on the F5 now...
  • I feel for ya. Perhaps the answer is getting lost in translation.

     

     

    Do they realize that a "Pool" is a group of one or more servers, and that if one of them is down, you don't need to kill the rest?

     

     

    Most people just want to view the F5 as a "Black Box": they know what they want, but have no clue how to get it.

     

     

    I would ask them: if you want to take down all of the servers that are hosting the website, what do you want to show the users instead? Then implement the iRule above and tell them to test it for functionality.
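
    If their answer to "what do you want to show the users" turns out to be a simple maintenance notice, one possible variation is to have the BIG-IP answer the request itself instead of redirecting. This is only a rough sketch: the HTML body, the 503 status, and the Retry-After value are placeholders I made up, not anything from this thread, and it assumes the virtual server has an HTTP profile and a default pool assigned.

    ```tcl
    when HTTP_REQUEST {
      # Same test as the iRule above: does the default pool have any active members?
      if { [active_members [LB::server pool]] < 1 } {
        # Hypothetical maintenance page served directly by the BIG-IP.
        # Replace the HTML and the Retry-After value (in seconds) with
        # whatever the site owners decide on.
        HTTP::respond 503 content {<html><body><h1>Down for maintenance</h1><p>Please try again shortly.</p></body></html>} "Content-Type" "text/html" "Retry-After" "300"
      }
    }
    ```

    Like the redirect version, this only kicks in once every member of the pool is marked down, so taking a single server out for troubleshooting would not affect users.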

     

     

    Hope they figure it out....