j_s_23783
Nimbostratus
Feb 18, 2014

Best way to order health checks for 500+ sites

We have 500+ sites configured across 18 web servers. There is a single pool with 18 members, and the web servers use host headers to tell the sites apart.

At the moment we have a single health check, which queries the default website for a check file. When that file changes, the whole server is pulled from the pool. We mostly use this check file for deployments and for taking whole servers offline.

As a first step, I have arranged for a check file to be placed on every site. Now I have to configure the checks.

The ultimate goal is to have each site checked individually, so that a pool member is removed only for the site that has gone down on it. We also want a "master check" that pulls the member out completely for every site (e.g. for deployments).

The long way round, as I understand it, is to create a unique pool for every site, each with its own check. Are there any shortcuts? And how would the master check then pull the member out of all 500+ pools?

 

2 Replies

  • The easiest way to make mass changes is from the CLI using tmsh. You can prepare a file of the necessary commands and paste it in, or you can script the change.

    Regarding the monitors: assign both the site-specific monitor and the server-specific monitor to each pool, and set "Availability Requirement" on the pool to All. If the site-specific monitor fails, the server is marked down only in the pool where it failed. If the server-specific monitor fails, the server is marked down in all pools.

     

    Eric
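
As a rough illustration of scripting the change described above, the following Python sketch prints one monitor and one pool definition per site, with each pool requiring both its site monitor and a shared master monitor. Everything named here is an assumption for illustration only: the host names, member addresses, check-file path, expected file content, and the `mon_master` monitor (which would have to be created separately). The tmsh syntax shown, including the `monitor a and b` form, should be verified against the tmsh reference for your BIG-IP version before pasting anything into a live system.

```python
# Sketch: generate tmsh commands for one monitor and one pool per site.
# All names, addresses, and paths below are hypothetical placeholders.

SITES = ["site1.example.com", "site2.example.com"]  # stand-ins for the 500+ host headers
MEMBERS = ["10.0.0.11:80", "10.0.0.12:80"]          # stand-ins for the 18 web servers
MASTER_MONITOR = "mon_master"                       # server-wide check, assumed to exist already
CHECK_PATH = "/check.txt"                           # hypothetical check file
RECV_TEXT = "SERVER_UP"                             # hypothetical content of the check file


def safe_name(host):
    """Turn a host header into a string usable in monitor and pool names."""
    return host.replace(".", "_").replace("-", "_")


def site_commands(host):
    """Return the two tmsh commands (monitor, pool) for a single site."""
    name = safe_name(host)
    # The send string is emitted with literal \r\n escapes for tmsh to interpret.
    send = f"GET {CHECK_PATH} HTTP/1.1\\r\\nHost: {host}\\r\\nConnection: close\\r\\n\\r\\n"
    monitor = (
        f"create ltm monitor http mon_{name} defaults-from http "
        f'send "{send}" recv "{RECV_TEXT}"'
    )
    # Listing monitors with "and" is intended to require all of them to pass.
    pool = (
        f"create ltm pool pool_{name} "
        f"members add {{ {' '.join(MEMBERS)} }} "
        f"monitor mon_{name} and {MASTER_MONITOR}"
    )
    return [monitor, pool]


def all_commands(sites):
    """Return the tmsh commands for every site, in order."""
    commands = []
    for host in sites:
        commands.extend(site_commands(host))
    return commands


if __name__ == "__main__":
    print("\n".join(all_commands(SITES)))
```

The output can be reviewed as a plain text file before it is pasted into tmsh, which keeps the generation step separate from the change itself.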

     

  • We have a similar setup, where a handful of web servers host about a hundred sites. For what you describe as a "master check", we put an HTTP monitor at the node level on each of these servers. If a box fails the HTTP check, its node is marked down, which pulls it from every pool it belongs to. The other advantage is that the LTM runs a single HTTP check against each server. Had that check been in the pools instead, the LTM would be checking the same thing once per pool (i.e. hundreds of times instead of once).
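
To put numbers on that last point, here is a small Python sketch using the figures from the original question (500 pools, 18 servers). It assumes, as the reply above states, that a master check attached at pool level is run once per pool for every server; the function names are made up for this example.

```python
# Sketch: count "master check" probes per monitor interval under each design.
# Assumes a pool-level monitor is run once per pool for every member,
# as described in the reply above.


def master_probes_pool_level(pools, members_per_pool):
    """Master monitor attached to every pool: one probe per pool member."""
    return pools * members_per_pool


def master_probes_node_level(nodes):
    """Master monitor attached to each node: one probe per node."""
    return nodes


if __name__ == "__main__":
    pools, servers = 500, 18
    print(master_probes_pool_level(pools, servers))  # 9000
    print(master_probes_node_level(servers))         # 18
```

The site-specific checks still cost one probe per site per server in either design. Only the master check collapses from 9000 probes to 18.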