Note: As of 11.4, WebAccelerator is now a part of BIG-IP Application Acceleration Manager.

This is article two of ten in a series on DevCentral’s implementation of WebAccelerator. Join Colin Walker and product manager Dawn Parzych as they discuss the ins and outs of WebAccelerator. Colin discusses his take on implementing the technology first hand (with an appearance each from Jason Rahm and Joe Pruitt) while Dawn provides industry insight and commentary on the need for various optimization features.

 

 

While there are near endless options when it comes to web acceleration, and we will explore many of them, it’s usually best to start from the beginning, as it were. In this case, as with almost anything on the wire, “the beginning” happens to be the TCP stack. While most may immediately want to jump to web server and browser settings when posed with the “how do you get more out of your application?” question, they would honestly be missing a fair quantity of possible gains. We will certainly tweak those things as well, but let’s work our way up to that.

To begin with, we want to ensure that we’re using optimized TCP settings. There are numerous options at this layer that can be customized to suit your particular application needs. While each of these can absolutely be custom tweaked, we also offer profiles on the BIG-IP that are excellent starting points. Profiles allow you to configure a set of options for a particular scenario or application and easily re-use or apply it as desired. To start with, we’ll be selecting the appropriate profiles for our application.

Dawn Says...

 

TCP is the delivery mechanism for web applications. If there is congestion along the route, or packets get lost going from point A to point B, then the delivery of the application will be impacted. Fine tuning TCP parameters helps make the most of the route the packets travel.

When I first ventured into the on-line world it was with a 2400 baud modem, not the fastest way to get on-line. Today, network access speeds are significantly faster, but some of the standard values for TCP parameters have not changed. The initial congestion window is one of these parameters. The standard value of at most 4 segments for the initial congestion window has remained unchanged even though network speeds keep increasing. There are proposals out there to increase the standard value from 4 segments to 10. Being able to tune this value for a web application can have an impact on how quickly flows can finish.
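To get a rough feel for why the initial congestion window matters, here is a minimal Python sketch. It is an idealized model, not a measurement: it assumes no packet loss, a congestion window that doubles every round trip during slow start, a 1460-byte segment size, and no limit from the receive window. The function name and the response sizes are made up for illustration.

```python
def round_trips_to_send(total_bytes, init_cwnd, mss=1460):
    """Count round trips needed to deliver total_bytes under idealized slow start.

    Assumptions (not from any real stack): no loss, cwnd doubles each
    round trip, the receive window never limits the sender.
    """
    segments_left = -(-total_bytes // mss)  # ceiling division
    cwnd = init_cwnd
    trips = 0
    while segments_left > 0:
        segments_left -= cwnd  # send one full window this round trip
        cwnd *= 2              # slow start: window doubles per round trip
        trips += 1
    return trips


# Compare the standard value of 4 segments with the proposed 10.
for size in (14600, 102400):
    print(size, round_trips_to_send(size, 4), round_trips_to_send(size, 10))
```

In this model a response of ten segments needs two round trips with an initial window of 4 and only one with an initial window of 10, which is the kind of saving that makes short flows finish sooner.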

Compression is a key component of application delivery optimization, yet a surprising number of sites still do not enable it. In the early days of the web, users avoided compression because of multiple browser bugs. Today many of these bugs have been resolved. The two biggest reasons for uncompressed content today are:

  1. Anti-virus or proxy software that removes the accept-encoding header from the client in order to inspect the responses.
  2. Misconfigured web servers

Compression is a great way to reduce the amount of data being sent on the wire. If you’re not using it, you should be.
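As a small illustration of the savings, the following Python sketch compresses a block of repetitive markup with the standard library's gzip module and compares sizes. The sample payload is invented, and real pages will compress more or less depending on their content, so treat the numbers it prints as illustrative only.

```python
import gzip


def compressed_size(payload: bytes, level: int = 6) -> int:
    """Return the size in bytes of payload after gzip compression."""
    return len(gzip.compress(payload, compresslevel=level))


# Invented sample: repetitive HTML, which is typical of list-heavy pages.
html = ('<li class="item">WebAccelerator article</li>\n' * 200).encode("utf-8")

original = len(html)
compressed = compressed_size(html)
print(f"original: {original} bytes, gzip: {compressed} bytes")
```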

Keep in mind that we’re tuning two TCP stacks here, not just one. Since the BIG-IP is a full proxy it has both a client-side and a server-side TCP stack. We will want to address these individually because the environment in which they operate is dramatically different. Namely, in our case, the client-side stack communicates across the internet to any inbound users making requests. The server-side stack communicates with, as you might imagine, the servers. This means that the client-side options should be tuned for a WAN type environment, and the server-side for a LAN type environment.

It just so happens that we have profiles for exactly that, and you can bet that’s not a coincidence, as this is a very common configuration. In this case we chose the wam-tcp-wan-optimized profile for the client-side and the wam-tcp-lan-optimized profile for the server-side, with a couple of adjustments. This gets us a solid base of TCP optimizations such as window sizing, buffers, etc. These settings differ between the client-side (WAN optimized) and server-side (LAN optimized) profiles, for good reason. Things like Nagle's algorithm can be quite beneficial in certain cases, and rather detrimental in others. Turning it on in a LAN environment may not get you what you want. The table below shows the progression from parent to child for each of the client-side and server-side TCP profiles. Both custom stacks begin with the tcp profile, which is the parent of tcp-[l|w]an-optimized, which is the parent of wam-tcp-[l|w]an-optimized, which is the parent of dc-wam-tcp-[l|w]an-opt. The highlighting is provided to denote changes from the immediate parent. We could have created the dc-wam-tcp-[l|w]an-opt profiles directly from the tcp profile, but since the other child profiles were already mostly tuned, it just makes sense to use them as the parents. Notice our final profiles took just two changes each.

[Table image wa_tcp_3: settings for the client-side and server-side TCP profiles, from parent to child]

Once we had a solid base for both stacks, we went through and made a couple of minor tweaks to each, based on recommendations from some of our more experienced and decorated WA experts here within F5. We made sure that slow-start was enabled on both the WAN and LAN profiles, and then set init-cwnd (Initial Congestion Window Size) to 10 on the external (WAN) profile, and both init-cwnd and init-rwnd (Initial Congestion Window Size and Initial Receive Window Size, respectively) to 16 on the internal (LAN) profile. This is to wring some further benefit out of our particular setup, given the way our application behaves.
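The parent-to-child idea can be sketched in a few lines of Python. This is only a model of how inherited settings resolve, not BIG-IP code: the Profile class is hypothetical, and the value "<inherited default>" is a placeholder standing in for whatever the parent profiles actually define, since those values are not listed here. Only the overrides on the dc-wam profiles come from the text above.

```python
class Profile:
    """Hypothetical model of a profile that inherits settings from a parent."""

    def __init__(self, name, parent=None, overrides=None):
        self.name = name
        self.parent = parent
        self.overrides = dict(overrides or {})

    def get(self, key):
        """Walk up the parent chain until some profile defines the key."""
        profile = self
        while profile is not None:
            if key in profile.overrides:
                return profile.overrides[key]
            profile = profile.parent
        raise KeyError(key)

    def lineage(self):
        """Return profile names from this profile up to the root."""
        names, profile = [], self
        while profile is not None:
            names.append(profile.name)
            profile = profile.parent
        return names


# Placeholder base values; the real defaults are not stated in this article.
PLACEHOLDER = "<inherited default>"
tcp = Profile("tcp", overrides={
    "slow-start": PLACEHOLDER, "init-cwnd": PLACEHOLDER, "init-rwnd": PLACEHOLDER,
})

wan = Profile("dc-wam-tcp-wan-opt",
              Profile("wam-tcp-wan-optimized", Profile("tcp-wan-optimized", tcp)),
              {"slow-start": "enabled", "init-cwnd": 10})
lan = Profile("dc-wam-tcp-lan-opt",
              Profile("wam-tcp-lan-optimized", Profile("tcp-lan-optimized", tcp)),
              {"slow-start": "enabled", "init-cwnd": 16, "init-rwnd": 16})

print(wan.lineage(), wan.get("init-cwnd"), wan.get("init-rwnd"))
print(lan.lineage(), lan.get("init-cwnd"), lan.get("init-rwnd"))
```

The point of the design is that a child only records what it changes, so anything we did not touch keeps following the parent if the parent is tuned later.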

Jason Says...

 

Research (here and here) has shown that higher initial congestion windows are of greater benefit to connections with round trip times of 200ms or greater, whereas connections with round trip times below 200ms should be configured with a smaller initial congestion window. If you have clientele in both ranges, you might benefit from configuring two versions of the virtual server: one for default traffic of <200ms and one with a higher initial congestion window. You would then track the RTT of client connections and redirect client sessions trending higher than 200ms to the virtual server with the higher initial congestion window.
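The decision itself is simple enough to sketch in Python. This is a hypothetical illustration of the logic only: the virtual server names are invented, the RTT samples would have to come from whatever tracking you have in place, and a plain average is just one possible way to decide that a client is "trending" above the threshold.

```python
from statistics import mean


def pick_virtual_server(rtt_samples_ms, threshold_ms=200.0):
    """Choose a (hypothetical) virtual server name from recent RTT samples.

    Clients whose average RTT is above the threshold go to the virtual
    server with the higher initial congestion window; everyone else,
    including clients with no samples yet, stays on the default.
    """
    if not rtt_samples_ms:
        return "vs_default_cwnd"
    if mean(rtt_samples_ms) > threshold_ms:
        return "vs_high_cwnd"
    return "vs_default_cwnd"


print(pick_virtual_server([40, 60, 55]))
print(pick_virtual_server([250, 310, 280]))
```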

With that, we did some more testing, and were confident that we had gotten to a comfortable point with our TCP optimization, and it was now time to move up the stack to layer 7 where there is ripe fruit to be had in the world of HTTP optimization.

The next most logical thing for us to look at was compression. This is a very common, basic function that most web servers will perform to increase HTTP application performance. By reducing the size of the data sent to the client on each response we’re able to make better use of their available bandwidth and reduce load times. This is, of course, only if bandwidth is a consideration, but that’s a discussion for another time.

While we could easily have enabled this on our webserver, it is a function that Web Accelerator performs quite well, and with some added intelligence. Not to mention it allows us to free up the server to do what it’s supposed to do: serve. So on goes the compression option within Web Accelerator, offloading that responsibility from the application server and allowing us to set the level of that compression based on page or object type. For our application we’re only compressing text/HTML files with the default compression engine, and it’s set to a flat compression level. There are all sorts of bells and whistles in WA for modified compression options, but given our application, those weren’t needed.
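To make the "only compress text/HTML" rule concrete, here is a hedged Python sketch of that kind of decision. It is not how Web Accelerator is configured or implemented; the function name is invented and the header parsing is deliberately simplified (it ignores quality values such as q=0). It also checks the client's Accept-Encoding header, since a response can only be compressed for a client that still advertises support for it.

```python
def should_compress(content_type, accept_encoding, allowed_types=("text/html",)):
    """Decide whether a response is a candidate for gzip compression.

    Simplified illustration: compares the media type against an allow-list
    and checks that the client listed gzip in Accept-Encoding.
    """
    # "text/HTML; charset=utf-8" -> "text/html"
    media_type = content_type.split(";", 1)[0].strip().lower()
    # "gzip, deflate;q=0.5" -> {"gzip", "deflate"}
    encodings = {
        token.split(";", 1)[0].strip().lower()
        for token in accept_encoding.split(",")
    }
    return media_type in allowed_types and "gzip" in encodings


print(should_compress("text/HTML; charset=utf-8", "gzip, deflate"))
print(should_compress("image/png", "gzip"))
print(should_compress("text/html", ""))
```

The last call shows the case Dawn mentions: if something along the path strips the Accept-Encoding header, the response goes out uncompressed even though the content type qualifies.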

Now that we had the TCP stack tuned appropriately and some basic compression set up, it was time to move on to more complex methods of wringing the most out of our HTTP-based application. Fortunately WA is very much up to that task.