Forum Discussion

Fabien_V__28825
Oct 04, 2012

Bug or problem in VS TCP behavior?

I'm looking into a performance problem that we have had for several days on 8800 and 8900 platforms running versions 10.2.1 and 10.2.0 (with a hotfix), on some standard virtual servers (VS).

 

A standard HTTP VS seems to slow down TCP connections on the client side (HTTP requester/client) from time to time, so web page objects such as images take 5 s to load...

 

If I run tcpdump on interface 0.0 on the F5 and on the client, I see:

 

  • No TCP problems between the HTTP VS and the server (to and from our own caching solution, or simply a customer's Apache / IIS server).
  • TCP desynchronisation, and therefore a lot of retransmissions, between the HTTP VS and the client / requester, with long idle periods (from 1.2 s to more than 2 s without any traffic, ACK or retransmit on either the client or the F5 HTTP VS) for some TCP streams (the majority are fine and fast, even with TCP desync and retransmits).
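The idle periods described above can be quantified once a capture has been reduced to per-packet timestamps. The following is a minimal Python sketch, assuming the tcpdump output has already been parsed into `(timestamp_seconds, flow_id)` pairs; the function name `find_idle_gaps`, the tuple layout and the flow labels are illustrative assumptions and not part of any F5 or tcpdump tooling. The 1.2 s default threshold is taken from the shortest gap mentioned above.

```python
from collections import defaultdict


def find_idle_gaps(packets, threshold=1.2):
    """Return (flow_id, gap_start, gap_length) for every silence >= threshold.

    packets: iterable of (timestamp_seconds, flow_id) pairs, in any order.
    flow_id is any hashable label for one TCP stream, for example a
    "client_ip:port->vs_ip:port" string (hypothetical format).
    """
    # Group packet timestamps by TCP stream.
    by_flow = defaultdict(list)
    for timestamp, flow_id in packets:
        by_flow[flow_id].append(timestamp)

    gaps = []
    for flow_id, stamps in by_flow.items():
        stamps.sort()
        # Compare each packet with the next one seen on the same stream.
        for previous, current in zip(stamps, stamps[1:]):
            if current - previous >= threshold:
                gaps.append((flow_id, previous, current - previous))
    return gaps


if __name__ == "__main__":
    # Made-up sample data: stream "c1" goes silent for 1.5 s, "c2" does not.
    sample = [(0.0, "c1"), (0.5, "c1"), (2.0, "c1"), (0.0, "c2"), (0.25, "c2")]
    for flow_id, start, length in find_idle_gaps(sample):
        print(f"{flow_id}: idle for {length:.2f}s starting at t={start:.2f}s")
```

Running the same function over the client-side and F5-side captures would show whether the silent periods line up on both ends of the affected streams.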

The result is that a complete web page with 30 to 50 objects (JPEG, JS and other static objects) sometimes takes more than 10 to 15 s to load on our LAN and through our domestic providers. The problem seems to be amplified in foreign countries because of transit, where loading times can become enormous (more than 60 s in some attempts...). It is always caused by a few objects: 1 or 2 images or JS files of less than 20 KB that take more than 4 to 5 s to load over the LAN, which is approximately the full page load time.

 

At first I thought it could be due to our providers or our LAN, but with FastL4 I didn't see any sign of the problem described above.

 

So I started experimenting with the TCP profile (the HTTP profile is wan-optimized with compression) to try to find some improvements. After disabling Nagle's algorithm, bandwidth delay, slow start, etc., as suggested in some posts on this forum, I see some improvement, but the slow loading of some objects and the idle periods in TCP transmission are still there and appear completely at random (on our two F5 8800 / 8900 units, which are in 2 different datacenters with different telco providers and different VSs).

 

Before opening a case with F5 support, I would like to know whether there are other options or configurations I can try. Is v10.2.0/1 known to have TCP bugs like these idle periods in HTTP VS traffic? Tomorrow at work I will try to reproduce this on a spare 6400 we have, and upgrade to v11 to see whether it's a v10.x problem, but any advice is welcome!

 

Thanks in advance for your advice and help!

 

9 Replies

  • Could you post the full Virtual Server, HTTP Profile and TCP Profile configurations please? Also, can you tell us what the external VLAN MTU is set to please.

     

  • For the VS:

    virtual CACHE-HTTP {
       pool EDGE-LB
       destination 10.10.10.10:http
       ip protocol tcp
       persist CARP
       profiles {
          http-wan-optimized-compression {}
          oneconnect-255 {}
          tcp {}
       }
    }

     

     

    We have rolled back to the standard TCP profile.

    MTU is 1500 for all VLAN interfaces on our 2x10Gb LACP trunk (vPC with 2 Nexus 5548).

    Thanks for your help.

    We have identified one reason why the problem has become more severe over the last few weeks: probably one of our transit providers. But even locally (inside the DC, directly in front of the LTM) we see the same problem...
  • OK, thanks. Any high latency links in place? Also, is any authentication occurring?

     

     

    BTW, I'd change the following compression settings: set Preferred Method to Deflate, set gzip Compression Level to 7 or 8, and disable Vary Header Insertion.
  • Any IPsec involved as well? If so, it might be worth dropping the VLAN MTU to prevent fragmentation occurring elsewhere in the network and introducing latency.
  • No, sorry, I mean any IPsec being used in the path from client to server: VPNs, tunnels, clients RASing in, whatever?
  • No. Even directly in front of the LTM the problem is still visible (less so, without the loss and probable operator fragmentation, but it's totally random). In this case we are in a LAN configuration: no WAN, no IPsec...
  • OK, understood. What are the pool members? Looks like they might be caches? Also, can you quickly try and test without compression enabled, that might help us narrow things down.
  • We have found part of the answer: international provider problems...

    But this does not explain why TCP desynchronisation is so significant between the client and the VS.

    After testing v11.0.0, the problem seems to be resolved... Does anyone here know of bugs in the TCP stack / behavior with 10.2.x?