Web Application Performance

by Patrick Chang, F5 System Architect, Web Acceleration

Editors' Note: While much of this document is generic and applies to WebAccelerator in general, much of it is specific to 5.x. If you are looking for 9.x solutions, the version 9.4.3 users' guide can be found in PDF format here.

Two main things affect web application performance: time to generate the HTTP response, and time to deliver the HTTP response over the network.  Most of the time, it is the time to deliver the HTTP response over the network that is responsible for poor web application performance.

Until recently, web acceleration usually meant caching static objects in the RAM of a proxy server, offloading SSL from the web servers to a network device, and offloading compression from the web servers to a network device.  This type of acceleration will improve the time to generate the HTTP response if the web servers are overloaded.  However, it will not improve performance if the time to deliver the HTTP response over the network is responsible for poor performance or if the web servers are not overloaded.  In addition, caching static objects in the RAM of a proxy server is of relatively low value.  Serving static content consumes very few web server resources, and web servers are very inexpensive.  The only real drawback to adding more web servers is the cost of managing more devices on the network.

The Web Accelerator changes the entire game.  It does everything described above, and then goes one step further by safely caching content at the browser.  Subsequent browser requests get the majority of their content from their own cache without the risk of using outdated objects.  Use of bandwidth across the wide area network drops dramatically and the effects of latency are defeated by not making the network round trip at all.  The following describes how the Web Accelerator can be used to optimize web application performance.
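The "safe" browser caching described above is commonly built on content fingerprinting: object URLs are rewritten to embed a fingerprint of the content and served with a far-future cache lifetime, so a changed object gets a new URL and the browser can never serve a stale copy.  A minimal sketch of the idea in Python (the helper names are illustrative, not WebAccelerator APIs):

```python
import hashlib

def versioned_url(path, content):
    """Embed a content fingerprint in the URL so the object can be
    cached indefinitely: new content -> new URL -> no stale objects."""
    digest = hashlib.md5(content).hexdigest()[:8]
    return f"{path}?v={digest}"

def cache_headers():
    # A far-future lifetime is safe because the URL changes with the content.
    return {"Cache-Control": "public, max-age=31536000"}

url = versioned_url("/img/logo.gif", b"GIF89a...")
print(url)
```

Because the fingerprint is derived from the bytes of the object itself, a republished object is fetched exactly once and every unchanged object is served from the browser's own cache with no network round trip at all.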

HTTP Application Performance

Introduction


While measuring application performance is as often simple as determining the length of time required to render a requested page in a given browser, optimizing application performance is a multifaceted problem which can require different strategies depending on the application, network topology, and infrastructure.
While the Web Accelerator comes standard with profiles for many common Web applications, an understanding of the various optimization technologies employed and when to enable them is useful.  The following Application Performance Notes demonstrate how to employ the Web Accelerator to achieve maximal performance.
 

SECTION 1 Optimizing HTTP Performance Using Compression

SECTION 2 Optimizing HTTP Performance Using Application Smart Caching

SECTION 3 Optimizing HTTP Performance Using Intelligent Browser Referencing

SECTION 4 Optimizing HTTP Performance Using MultiConnect

SECTION 5 Optimizing HTTP Performance Using NTLM Acceleration

SECTION 6 Optimizing HTTP Performance Using SSL Termination

SECTION 7 Optimizing HTTP Performance Using ESI Includes

SECTION 8 Optimizing HTTP Performance by Tuning Connection Parameters

SECTION 9 Optimizing HTTP Performance Using Response Rewriting

SECTION 10 Optimizing HTTP Performance Using Document Optimization

 

Determine Your Pain Points

The first step in addressing performance issues is characterizing the application and determining the pain points.  To help you, F5 recommends you run the F5 Performance Measurement Tool to quantify your current Web application performance.  This tool records page load times, packets transferred, and bytes received over a configurable navigation path. 

Once you’ve characterized your current performance, you may choose to read the following descriptions of some general types of issues encountered.  These explanations are intended to help you understand how F5’s technologies work and how they may best be applied.

Understanding the Issues

The industry qualifies performance issues by their location relative to the application server.  First Mile problems are bottlenecks that occur within the confines of the datacenter.  These include issues such as bottlenecks in database access, server inefficiency or connectivity issues, and lengthy page generation times.  Middle Mile problems refer to network latency issues and other wide area network or internet characteristics.  Last Mile problems pertain to poor user connectivity and content design issues. 

The sections that follow detail the major contributors to application performance problems, provide specific strategies to maximize performance using the Web Accelerator, and end with a couple of special cases that don’t neatly fit the industry characterizations given above.

Addressing Server Induced Performance Issues – The First Mile

Background:  How not to waste resources and ensure fresh content. 

First Mile bottlenecks - also referred to as generation bottlenecks - appear when the enterprise database, application server, and web server attempt to meet the ever-increasing demands of users and application complexity.  A key indicator that a Web application’s performance is gated by generation inefficiencies is that it performs poorly regardless of client bandwidth or latency.

Initially, First Mile performance issues were easily identifiable.  Web servers and application servers for public sites were overloaded by the sheer volume of requests they were receiving.  Standard system load indicators such as CPU utilization, memory utilization, and disk IO all indicated an overloaded server.  Modern enterprise applications often show no such indicators.  Enterprise applications perform much more work per request than external web sites at lower levels of concurrency.  As a result, per request latency for internal applications is often much greater than for external applications at the same levels of utilization.  Additionally, serial data aggregation by portals and other enterprise applications delays server responses without taxing the system. 

First Mile Performance Strategies

The Web Accelerator is capable of leveraging multiple optimization technologies to offload Web application infrastructure.  

1. Improving Web Server Scalability

SECTION 6 Optimizing HTTP Performance Using SSL Termination

SECTION 8 Optimizing HTTP Performance by Tuning Connection Parameters

2. Improving Application Server Scalability

SECTION 2 Optimizing HTTP Performance Using Application Smart Caching

SECTION 6 Optimizing HTTP Performance Using SSL Termination

SECTION 7 Optimizing HTTP Performance Using ESI Includes

SECTION 8 Optimizing HTTP Performance by Tuning Connection Parameters

3. Improving Database Scalability

SECTION 2 Optimizing HTTP Performance Using Application Smart Caching

SECTION 7 Optimizing HTTP Performance Using ESI Includes

 

Addressing Network Induced Performance Issues – The Middle Mile

Background:  How network characteristics contribute to application latency 

TCP is the transport protocol used by the Web to deliver client requests and server responses.  TCP breaks both requests and responses into segments and inserts header information with each segment.  This additional header information allows reassembly, reliable delivery, flow control, and connection management.  This is important as it means that TCP imposes an additional “byte-count” overhead for every request and response.

In addition, TCP is a connection-based protocol.  This means that a very specific message exchange must take place between the client and server before any information can be transmitted.  This three-way handshake adds a delay to every new connection between client and server that is directly proportional to the “round-trip” time between the two.  This is a second source of latency and results in a reduction of effective bandwidth. 

Furthermore, to ensure that both the client and the server remain in sync, TCP requires the client and server to periodically wait for acknowledgement of data transmitted.  How often acknowledgements are required is determined by the advertised window size.  The TCP standard limits this to at most 65,535 bytes, a limit that can come into play when transferring large objects.  Because Web applications usually deal with the transfer of relatively small objects, TCP congestion control “slow start” is often more of an issue.  Slow start allows only two segments to be transferred before an ACK exchange is required.  This penalizes the transfer by introducing a “round-trip” time delay.  Upon successful transmission and acknowledgement, the window is opened up to four segments, then eight, and so on; but for small objects the transmission is usually complete before full bandwidth can ever be realized unless persistent connections are employed.  Windowing and congestion control represent yet another source of network latency and again reduce the effective bandwidth of a given connection.
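The cost of slow start can be approximated with a few lines of arithmetic.  The sketch below counts the round trips required before an object finishes transferring, assuming 1460-byte segments and an initial window of two segments that doubles on each successful exchange, as described above:

```python
def slow_start_round_trips(object_bytes, mss=1460, init_segments=2):
    """Estimate the round trips needed to deliver an object when the
    congestion window starts at `init_segments` segments and doubles
    after each acknowledged round trip (TCP slow start)."""
    window = init_segments
    remaining = object_bytes
    trips = 0
    while remaining > 0:
        remaining -= window * mss  # segments sent this round trip
        window *= 2                # window doubles on success
        trips += 1
    return trips

# A 30 KB object needs windows of 2, 4, 8, and 16 segments:
print(slow_start_round_trips(30 * 1024))  # 4 round trips
```

On a 100 ms WAN link those four round trips alone contribute roughly 400 ms, regardless of how much bandwidth the link offers, which is why persistent connections that keep an already-opened window alive matter so much for small objects.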

Upon examination of layer 7, even more performance obstacles arise.  Unlike the majority of client-server applications, HTTP is a very “chatty” protocol.  Pages are not retrieved in a single request, but rather in a series of exchanges.  This additional level of exchanges on top of TCP results in increased performance degradation over high latency connections.  Finally, additional inefficiencies are introduced by authentication, authorization, and encryption layers as well as less than optimal protocol implementations. 
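The effect of this chattiness compounds with round-trip time: if every object costs at least one round trip and the browser opens only a few parallel connections, total latency scales with the object count.  A rough lower bound, ignoring transfer time (the two-connection default is the classic HTTP/1.1 browser behavior, assumed here for illustration):

```python
import math

def min_round_trip_delay_ms(num_objects, rtt_ms, parallel_connections=2):
    """Lower bound on page latency when each object costs one round
    trip, fetched over a few parallel persistent connections."""
    rounds = math.ceil(num_objects / parallel_connections)
    return rounds * rtt_ms

# 40 embedded objects over a 100 ms WAN link with 2 connections:
print(min_round_trip_delay_ms(40, 100))  # 2000 ms of pure round trips
```

Two seconds of unavoidable round trips before a single byte of transfer time is counted: this is the latency that technologies such as Intelligent Browser Referencing and MultiConnect attack, by eliminating requests entirely or raising the effective parallelism.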

Middle Mile Performance Strategies

The Web Accelerator allows the adoption of different performance strategies depending on the typical user connection.  


Scenario   Bandwidth/Latency                             Network Type
A          High Bandwidth/Low Latency                    1000/100/10 Mbps LAN
B          High Bandwidth/High Latency                   Satellite/DSL/T1/T3
C          Low Bandwidth/Low Latency                     Frame Relay/ISDN - WAN
D          Low Bandwidth/High Latency                    Dial Up - WAN
E          Low Bandwidth/High Latency - Small Objects    Wireless/PDA - WAN

1. Avoid Connection Churn (Applies to: A, B, C, and D)

SECTION 5 Optimizing HTTP Performance Using NTLM Acceleration

SECTION 8 Optimizing HTTP Performance by Tuning Connection Parameters

2. Reduce packets transferred (Applies to A, B, C, and D)

SECTION 1 Optimizing HTTP Performance Using Compression

SECTION 3 Optimizing HTTP Performance Using Intelligent Browser Referencing

 

3. Parallelize transfers. (Applies to B and E)

SECTION 4 Optimizing HTTP Performance Using MultiConnect

 

Addressing Client Side Performance Issues – The Last Mile

Background: Big Page, Small Pipe

In the beginning, web pages were small.  The majority of Web sites had simple layouts and used only a handful of images.  Total page size, including images, was often less than 35 KB.  Over time, page layout has increased many fold in complexity and images are used liberally as spacers, buttons, and eye candy.  Through the use of complex table design, modern Web applications often produce base pages (not including images) in the 100 KB to 250 KB range. 

While some Last Mile connections have kept pace, many others have not.  Frame relay users and dial up users are often restricted to transfer speeds of 7 KB/second or less.  This mismatch between end user connectivity and application network requirements has resulted in slow and sometimes unusable Web applications.
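The arithmetic behind that mismatch is stark.  Assuming the figures above, a 7 KB/second effective payload rate (roughly a 56 kbps dial-up line after protocol overhead) and a 250 KB page:

```python
page_kb = 250        # modern base page plus images, per the range above
throughput_kb_s = 7  # effective dial-up payload rate, ~56 kbps line

transfer_seconds = page_kb / throughput_kb_s
print(f"{transfer_seconds:.0f} seconds")  # ~36 seconds
```

Over half a minute for a single page, before any server generation time or round-trip latency is counted, which is why compression and the elimination of repeat transfers are the decisive optimizations for this class of user.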

Dial up users are further hindered by the additional latency their modems induce.  Despite having similar data transfer rates, analog modems and frame relay can differ by an order of magnitude in network latency.  While this additional latency is less significant across such limited bandwidth connections, there are technologies to address it as well. 

Last Mile Performance Strategies

The Web Accelerator implements several technologies critical to improving end-user performance across the Last Mile.  When deployed together, these technologies address both the limited bandwidth common to Last Mile connections and the network latency induced by modems. 

SECTION 1 Optimizing HTTP Performance Using Compression

SECTION 3 Optimizing HTTP Performance Using Intelligent Browser Referencing

SECTION 9 Optimizing HTTP Performance Using Response Rewriting

SECTION 10 Optimizing HTTP Performance Using Document Optimization