Forum Discussion

Keith_106951
Nov 19, 2007

Design help for high TPS application

Feel free to point me to posts that answer my question if this is somewhat redundant. We are deploying a 2000+ TPS (transactions/second) application that is load balanced by a pair of BIG-IPs. We have to maintain some level of persistence, keeping sessions pinned for 10 minutes, so "persist hash" has been used on the virtual server.

 

 

A monitor exists to check for a status HTTP page on each system.

 

 

monitor someapp {
   defaults from http
   recv "someapp"
   send "GET / HTTP/1.1\nHost:x.x.x.x\n etc.etc.\n\n"
}

 

 

The problem is that when a pool member with sessions pinned to it goes down, the time it takes to mark the member offline can result in 500 or more failed transactions. Any design ideas or help would be appreciated.

 

 

Thanks,

 

 

Keith

1 Reply

  • Hi Keith,

     

     

    When the pool member goes down, does it continue to accept TCP connections? Does it send back HTTP 500s? Or does it stop answering TCP connection requests altogether?

     

     

    What do you have the monitor interval and timeout set to? How long is it taking to mark a failing pool member down? How long do you want it to take? You might be able to adjust the timeout to get better results.
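
To get a feel for how the monitor settings relate to the number of failed transactions, here is a rough back-of-the-envelope sketch in Python. It is not BIG-IP code. It assumes traffic is spread evenly across the pool, that a failed member keeps receiving its share of traffic until it is marked down, and that mark-down happens somewhere between (timeout - interval) and timeout seconds after the failure. The pool size of 4 and the two interval/timeout pairs are hypothetical values chosen for illustration; only the 2000 TPS figure comes from the original post.

```python
def failed_transactions(total_tps, members, detection_seconds):
    """Estimate transactions sent to a dead member before it is marked down."""
    per_member_tps = total_tps / members  # assumes an even spread across members
    return per_member_tps * detection_seconds


def detection_bounds(interval, timeout):
    """Rough best/worst-case seconds from failure to mark-down (simplified model)."""
    return max(timeout - interval, 0), timeout


TOTAL_TPS = 2000  # from the original post
MEMBERS = 4       # hypothetical pool size, not stated in the thread

# Illustrative (interval, timeout) pairs in seconds, not recommendations
for interval, timeout in [(5, 16), (1, 4)]:
    best, worst = detection_bounds(interval, timeout)
    low = failed_transactions(TOTAL_TPS, MEMBERS, best)
    high = failed_transactions(TOTAL_TPS, MEMBERS, worst)
    print(f"interval={interval}s timeout={timeout}s: "
          f"about {low:.0f} to {high:.0f} failed transactions")
```

Under these assumptions, 500 failed transactions corresponds to roughly one second of one member's traffic, so the answers to the questions above (actual interval, timeout, and pool size) largely determine how much room there is to improve by tuning the monitor alone.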

     

     

    Another option might be to detect a failure on a single connection in the LB_FAILED event. In theory, you should be able to select a new pool member and resubmit the request, but I haven't tested this myself, and I've seen a few posts that indicate there are issues with this approach.

     

     

    LB_FAILED event

     

    http://devcentral.f5.com/Default.aspx?tabid=53&view=topic&postid=15238

     

     

    LB_FAILED behavior, expected or not?

     

    http://devcentral.f5.com/Default.aspx?tabid=53&forumid=5&tpage=1&view=topic&postid=15999

     

     

    lb::reselect fails to select another node

     

    http://devcentral.f5.com/Default.aspx?tabid=53&forumid=5&tpage=1&view=topic&postid=15864

     

     

    How do I create a passive health monitor?

     

    http://devcentral.f5.com/Default.aspx?tabid=53&forumid=5&tpage=1&view=topic&postid=1343313621

     

     

    There is also a solution on AskF5 which describes an issue with LB::reselect. I haven't seen any additional info on the issue though.

     

     

    The iRule LB::reselect command does not select the next pool member

     

    https://support.f5.com/kb/en-us/solutions/public/8000/000/sol8033.html

     

     

    Does anyone else have practical experience reselecting a pool member when one fails to respond? If so, did you use LB::reselect?

     

     

    Aaron