
jethro_106302
Nimbostratus
Apr 08, 2009

source port collision when using snatpool

Hi,

 

I'm having the problem described in https://support.f5.com/kb/en-us/solutions/public/5000/000/sol5089a.html for a few applications based on a half^H^H^Homebaked protocol I'm loadbalancing with bigip.

 

These requests have no kind of session, typically last no more than half a second, and the clients run on Windows systems (through Citrix), hence rely on Microsoft's source port selection algorithm, which is incremental and by default uses the range 1024-5000 if I'm not mistaken. The servers run Windows as well, and have the default TIME_WAIT setup.

 

On these bigip, I also loadbalance quite a number of applications not suffering from this problem. This is a one armed bandit loadbalancing setup, hence the need for snat to make sure the bigip sees the replies.

 

Users started complaining about failed connection attempts to the VIP for that protocol, and after investigation we noticed that sometimes, different client systems might connect to the VIP using the same source port, and get snat'ed using the same snat IP (thanks to the client source port selection algorithm by MS). In such cases the first connection is OK, and other connections fail until the TIME_WAIT state is over for the first SNATIP:sport->NODEIP:dport socket registered on the node. (this is all described in SOL5089).
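
To make the failure mode concrete, here is a minimal Python sketch of what the node sees. It is a toy model, not BIG-IP or Windows code: the `Node` class, the `snat` helper, the addresses and ports, and the 240-second TIME_WAIT value are all assumptions chosen for illustration. It also assumes the node closes the connection first (so the node is the side holding TIME_WAIT) and ignores the cases where a real TCP stack may accept a new SYN for a tuple in TIME_WAIT.

```python
# Toy model of the node's TCP table; all names and values are illustrative assumptions.
TIME_WAIT_SECONDS = 240  # assumed value; the real one depends on the node's OS settings


class Node:
    """Tracks sockets in TIME_WAIT, keyed by (src_ip, src_port, dst_ip, dst_port)."""

    def __init__(self):
        self.time_wait_until = {}

    def connect(self, four_tuple, now):
        """Return True if the connection is accepted, False if it collides."""
        expiry = self.time_wait_until.get(four_tuple)
        if expiry is not None and now < expiry:
            return False  # the same 4-tuple is still in TIME_WAIT on the node
        return True

    def close(self, four_tuple, now):
        """Assume the node closes first, so it keeps the tuple in TIME_WAIT."""
        self.time_wait_until[four_tuple] = now + TIME_WAIT_SECONDS


def snat(client_ip, client_port, snat_ip, node_ip, node_port):
    """SNAT with the source port preserved: the client IP is replaced, the port is kept."""
    return (snat_ip, client_port, node_ip, node_port)


node = Node()
# Two different clients that happen to use the same source port and the same SNAT IP.
a = snat("10.0.0.11", 1500, "192.0.2.10", "192.0.2.50", 8000)
b = snat("10.0.0.12", 1500, "192.0.2.10", "192.0.2.50", 8000)

first = node.connect(a, now=0.0)    # first client: accepted
node.close(a, now=0.5)              # request done after half a second
second = node.connect(b, now=1.0)   # second client: identical 4-tuple, refused
later = node.connect(b, now=0.5 + TIME_WAIT_SECONDS)  # TIME_WAIT over: accepted

print(first, second, later)  # True False True
```

From the node's point of view the two clients are indistinguishable once the client IP has been replaced, which is why only the SNAT IP and the preserved source port matter here.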

 

We tried using bigger snatpools, but still run into the problem occasionally. I know of a few options to make sure this won't happen again:

 

1/ change the architecture, and have the servers use the F5 as the default route, to avoid the need for snat pools. This is unlikely to happen as long as we have other options, because management doesn't like the idea of modifying the architecture for this critical pair of bigip.

 

2/ apply the workaround described in SOL5089 which I think will severely impact my ability to debug problems at the network level when needed. For the moment, I can easily make a link between client->F5 and F5->node flows because source port is preserved.

 

3/ lower the TIME_WAIT timer.

 

Other options I imagined are the following:

 

4/ use an iRule to assign specific snatpools to different kinds of clients connecting to this VIP, based on info I can extract from the protocol (I'm already doing some kind of X-Forwarded-For for this traffic, and know there are fields I can base my decision on to pick a snatpool).

5/ have the client application pick random ports, which would lower the probability of source port collision, but not make it disappear.
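
The idea of picking a snatpool per kind of client can be sketched outside the BIG-IP to check that the decision is stable. The Python below is only a model of the selection logic, not an iRule: the `client_group` field, the pool names, and the choice of a CRC32 hash are all assumptions made for illustration.

```python
import zlib

# Hypothetical snatpool names; on a real BIG-IP these would be configured objects.
SNATPOOLS = ["snatpool_a", "snatpool_b", "snatpool_c", "snatpool_d"]


def pick_snatpool(client_group, pools=SNATPOOLS):
    """Map a protocol-level field to a snatpool deterministically.

    CRC32 is used because it is stable across runs (unlike Python's built-in
    hash() for strings), so the same client group always gets the same pool.
    """
    index = zlib.crc32(client_group.encode("utf-8")) % len(pools)
    return pools[index]


# Example: "citrix-farm-1" is a made-up value for the field extracted from the protocol.
print(pick_snatpool("citrix-farm-1"))
```

This only narrows the problem: clients mapped to different pools can no longer collide with each other, but two clients in the same group that reuse a source port can still collide if the pool hands them the same SNAT address.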

 

I'm looking for advice here. Am I right about 2/, or is it still easy to make the link using the ISN and TCP sequence numbers? (I quite doubt that, but would like to be proven wrong :p) If I'm right about 2/, can I use iRules to do something similar to TM.PreserveClientPort = disable for only a given set of VIPs?

 

I hope my post is clear. Feel free to ask for further information if needed, and thanks in advance for your input.

 

regards

2 Replies

  • I just read about the new translate iRule command in v10. Would this help in such a case? Does it work on a per-VS basis or not?
  • Hi,

    Your iRule idea of using different snatpools based on something in the protocol sounds pretty good. I don't know that your 2/ would be that bad. I don't usually have to go to the level of matching up flows by ephemeral port, but I see your point. I don't think an iRule can manipulate db commands, though.

    Denny