Note: As of 11.4, WebAccelerator is now a part of BIG-IP Application Acceleration Manager.

This is article nine of ten in a series on DevCentral’s implementation of WebAccelerator. Join Colin Walker and product manager Dawn Parzych as they discuss the ins and outs of WebAccelerator. Colin discusses his take on implementing the technology first hand (with an appearance each from Jason Rahm and Joe Pruitt) while Dawn provides industry insight and commentary on the need for various optimization features.

 

 

In a recent performance project, I was asked to automate the page-level metric collection for the various optimization profiles we were testing on DevCentral.  The idea was to come up with a simple way to compare the many performance settings side by side and, hopefully, get a better view of which would most benefit the end users accessing the site.

I’ve done some automation work with HttpWatch in the past, so I thought that would be a good starting point.  I’ve blogged in the past about the HttpWatch COM interface in Windows, but I will give a little refresher again to give some context to the work done in this script.

Dawn Says...

 

There is a specific process that needs to be followed to get accurate, actionable metrics in relation to web page performance. It can be a very time-consuming process and is prone to errors. Generally speaking, you want to run a minimum of 3 test runs for each scenario and take an average of the 3. My preference is to run 10 test runs, but time typically doesn’t allow for that. If you are running 4 scenarios and need to run each a minimum of 3 times, that is at least 12 test runs. Unfortunately, I don’t think I have ever been able to run just 12, as I always make mistakes. Take the scenario of running as a first-time user with a clean cache. For every test run you need to open the browser, clear the cache, launch a measurement tool of some kind like HttpWatch, access the page, save the report, and close the browser. Then repeat. The following mistakes are common in this process (at least I pretend they are common so I don’t feel like I’m the only one who makes them):

  • Forgetting to clear the cache when running as a first time user. I see the objects being returned with a response code of “Cache” and I want to scream.
  • Loading the page before starting to record with the measurement tool.
  • Closing the browser and forgetting to save the report.

 

Having a way to automate the process eliminates these mistakes and enables the running of multiple test runs without wasting a lot of time. I would much rather go get a cup of tea, come back, and have all my testing completed.

I chose to implement the solution in PowerShell as that gave me an easy way to interact with the HttpWatch COM object from a scripting environment available on all recent versions of Windows.

Script Requirements

The colleague who offered me this great opportunity supplied the following functional steps that needed to occur:

Baseline: VIP:w1.x1.y1.z1; Domain:devcentral.f5.com; URL: /

1. Test Initial Visits
    - Empty browser cache
    - Access test page with an empty cache
    - Record time and data transfer metrics
    - Close browser
    - Repeat “x” number of times and calculate averages

2. Test Repeat visits
    - Reopen browser
    - Access page as repeat visitor
    - Record time and data transfer metrics
    - Close browser
    - Repeat “x” number of times and calculate averages

Optimization: VIP:w2.x2.y2.z2; Domain:devcentral.f5.com; URLs:/tcp, /compress, /ibr, /img

1. Repeat baseline logic for each URL.

To let us dynamically change optimization profiles for a given connection, we implemented a nifty bit of iRules that trapped the “/tcp”, “/compress”, “/ibr”, and “/img” URLs and issued an HTTP 301 redirect to the test page along with a cookie naming the corresponding BIG-IP profile to apply. An iRule sitting in front of the site looked for that cookie and then enabled the associated profile.  As a result, each optimization test contains one extra request (the 301 redirect) that needs to be accounted for and subtracted from the totals (more on that later).

HttpWatch.Controller

The core object is the HttpWatch.Controller.  It contains the following members:

  • FireFox – Returns a reference to the FireFox object.  Use this property if you want to use FireFox in your testing.
  • IE – Returns a reference to the Internet Explorer object.  Use this property if you want to use IE for your testing.
  • IsBasicEdition – Returns true if the product is the basic edition, false for Pro.
  • OpenLog(logFileName) – This allows existing HttpWatch log files to be opened and examined using the automation library.
  • Wait(plugIn, timeOutSecs) – Waits for a page to be fully loaded and is normally used after the GotoURL method.
  • Version – A string containing the current version of HttpWatch.

The PowerShell code to create this object looks something like this.

$script:HTTPWATCH = New-Object -ComObject "HttpWatch.Controller";

By calling the FireFox.New() or IE.New() method, you get a plugin object that lets you control the HttpWatch interface.  For more info on the HttpWatch API, check out the HttpWatch Automation Overview.

Main Application Logic

The script takes in the following parameters:

  • num_retries - the number of times to perform each test for averages
  • browser - use “ff” for FireFox, or “ie” for Internet Explorer.  The default is “ff”
  • WA - an optional switch that determines whether the script is run against the regular site or the optimization profiles.
  • SubtractRedirects - If supplied, this parameter will cause the 301 redirects mentioned above to be subtracted from the metrics on the optimization tests.

param(
  $num_retries = 5,
  $browser = "ff",
  [switch]$WA = $false,
  [switch]$SubtractRedirects = $False
);

$script:SUBTRACT_REDIRECTS = $SubtractRedirects;
$script:LOGDIR = "$(Split-Path $MyInvocation.MyCommand.Path)\WATests";
if ( -not [System.IO.Directory]::Exists($script:LOGDIR) ) { mkdir $script:LOGDIR; }

$script:HTTPWATCH =  New-Object -ComObject "HttpWatch.Controller";

#----------------------------------------------------------------------------
# Main App Logic
#----------------------------------------------------------------------------

$tests = @("/");
if ( $WA )
{
  $tests = @("/tcp", "/compress", "/ibr", "/img");
}
$results = Do-DCPerfTest -num_retries $num_retries -browser $browser -tests $tests;
Run-Report.CSV -results $results;

Functional Test Logic

The Do-DCPerfTest() function performs each test the specified number of times.  The only difference between the initial and return visits is the cache-clearing step performed before each initial visit.

function Do-DCPerfTest()
{
  param(
    $num_retries = 5,
    $browser = "ff",
    $tests = @("/")
  );
  
  $results = @();
  
  foreach( $test in $tests )
  {
    Write-Host "------------------------------------------------------------";
    Write-Host "- Testing https://devcentral.f5.com${test} initial visit";
    Write-Host "------------------------------------------------------------";
    
    for($i=0; $i -lt $num_retries; $i++)
    {
      Clear-BrowserCache -browser $browser;
      
      # Pass the browser through; otherwise New-Plugin always uses its default ("ff").
      $plugin = New-Plugin -browser $browser;
      $results += Run-TimingTest -plugin $plugin -domain "devcentral.f5.com" -url $test -index $i -note "I";
      $plugin.CloseBrowser();
    }

    Write-Host "------------------------------------------------------------";
    Write-Host "- Testing https://devcentral.f5.com${test} return visitor";
    Write-Host "------------------------------------------------------------";
    
    for($i=0; $i -lt $num_retries; $i++)
    {
      # Pass the browser through; otherwise New-Plugin always uses its default ("ff").
      $plugin = New-Plugin -browser $browser;
      $results += Run-TimingTest -plugin $plugin -domain "devcentral.f5.com" -url $test -index $i -note "R";
      $plugin.CloseBrowser();
    }
  }
  
  $results;
}

Single Timing Test

The Run-TimingTest() function will first clear the HttpWatch internal log, load the URL and wait for the page to render, save the HttpWatch log (.hwl file) locally for future analysis, and then process the logs to build a PowerShell object with the results.

function Run-TimingTest()
{
  param(
    $plugin = $null,
    $domain = $(Throw "domain required!"),
    $url = $(Throw "url required!"),
    $index = "",
    $note = "",
    $is_redirect = $null
  );
  
  $results = @();
  if ( $null -ne $plugin )
  {
    Write-Host "Clearing HttpWatch Log...";
    $plugin.Clear();
  
    # Load the url
    Load-WebPage -plugin $plugin -url "https://${domain}${url}";

    # save logs
    $ip = Get-IPFromHostname -hostname $domain;
    
    $logname = "${ip}-${domain}-${url}-${note}-${index}".Replace("/", "");
    $log = "$($script:LOGDIR)\${logname}.hwl";
    Write-Host "Saving log '$logname'";
    $plugin.Log.Save($log);

    # Process logs
    $results += Process-Logs -plugin $plugin -note "${logname}";

    Write-Host "Clearing HttpWatch Log...";
    $plugin.Clear();
  }
  $results;
}
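
To see why the baseline entries in the results carry a double dash in their names, the naming expression can be pulled out on its own. This is only a minimal sketch: Get-LogName is a hypothetical helper that is not part of the original script, and the IP placeholders stand in for whatever Get-IPFromHostname returns.

```powershell
# Hypothetical helper (not in the original script) that mirrors the
# $logname expression used inside Run-TimingTest.
function Get-LogName
{
  param(
    $ip = $(Throw "ip required!"),
    $domain = $(Throw "domain required!"),
    $url = $(Throw "url required!"),
    $note = "",
    $index = ""
  );

  # Slashes are stripped so the result can be used as a file name.
  "${ip}-${domain}-${url}-${note}-${index}".Replace("/", "");
}

Get-LogName -ip "w1.x1.y1.z1" -domain "devcentral.f5.com" -url "/" -note "I" -index 0;
Get-LogName -ip "w2.x2.y2.z2" -domain "devcentral.f5.com" -url "/tcp" -note "I" -index 0;
```

For the root URL the path collapses to an empty string, which leaves two adjacent dashes (w1.x1.y1.z1-devcentral.f5.com--I-0), while /tcp produces w2.x2.y2.z2-devcentral.f5.com-tcp-I-0.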

Controlling HttpWatch

The New-Plugin() function will use the HttpWatch Controller to instantiate a new instance of the plugin for the given browser.  The instance of that plugin is returned to the calling code.

function New-Plugin()
{
  param($browser = "ff");
  
  $plugin = $null;
  
  # Compare with -eq; a bare "=" would assign "ff" and always evaluate to true.
  if ( $browser -eq "ff" ) { $plugin = $script:HTTPWATCH.FireFox.New(); }
  else { $plugin = $script:HTTPWATCH.IE.New(); }

  $plugin;
}

For this test, we wanted to isolate any issues of caching between plugin sessions, so we opted to have new plugins created for each functional step.  This forced a refresh of the plugin and browser objects.  The Clear-BrowserCache() function creates a new plugin instance, clears the browser’s cache with the plugin.ClearCache() method, and then closes the browser with the plugin.CloseBrowser() method.

function Clear-BrowserCache()
{
  # The parameter must be named $browser: callers pass -browser and the body reads $browser.
  param($browser = "ff");
  
  $plugin = New-Plugin -browser $browser;
  if ( $null -ne $plugin )
  {
    Write-Host "Clearing browser cache...";
    $plugin.ClearCache();
    $plugin.CloseBrowser();
  }
}

The Load-WebPage() function first initiates a transport-level recording with the plugin.Record() method.  It then calls the asynchronous plugin.GotoURL() method and waits for that load to complete with the controller’s WaitEx() method.  Recording is then stopped with the plugin.Stop() method and control is returned to the calling code.

function Load-WebPage()
{
  param(
    $plugin = $(Throw "Plugin parameter required!"),
    $url = $(Throw "url parameter required!"),
    [switch]$ClearCache = $false
  );
  
  if ( $null -ne $plugin )
  {
    if ( $ClearCache )
    {
      $plugin.ClearCache();
    }

    Write-Host "Loading page '${url}'...";
    $plugin.Record();
    $plugin.GotoURL($url);
    #$HttpWatch.Wait($plugin, -1) | Out-Null;
    $wait = $script:HTTPWATCH.WaitEx($plugin, -1, $true, $false, 1);
    $plugin.Stop();
  }
}

The last piece of the puzzle is processing the HttpWatch log that is captured during the page load and extracting all the required data.  There is a lot going on here, but these are the main things to look at:

  • $plugin.Log - This is where all the log data is stored from recording start to stop.
  • $plugin.Log.Pages - Each URL loaded with the GotoURL() method will get its own Page item.
  • $plugin.Log.Pages.Item(0) - There was only one URL loaded so I’ll look at the first Page only.
  • $plugin.Log.Pages.Item(0).Events - Some aggregate data for the page including DOMLoad, PageLoad, HTTPLoad, and RenderStart times.
  • $plugin.Log.Pages.Item(0).Entries - This is a list of all the items requested for the given GotoURL request.  This includes redirects, embedded images, scripts, etc.

Once you get those items down, the Process-Logs() function should hopefully make sense.

function Process-Logs()
{
  param(
    $plugin = $null,
    $note = ""
  );
  
  $page = $plugin.Log.Pages.Item(0)
  
  $summary = $page.Entries.Summary;
  $events = $page.Events;
  $entry = $page.Entries.Item(0);
  
  $url = $entry.URL;
  $redirectTime = 0;
  $redirectBytesSent = 0;
  $redirectBytesReceived = 0;
  $redirectRoundtrips = 0;
  
  if ( $entry.IsRedirect -eq $true )
  {
    $url += " -> ";
    $url += $entry.RedirectURL;
    $redirectTime = $entry.Time;
    $redirectBytesSent = $entry.BytesSent;
    $redirectBytesReceived = $entry.BytesReceived;
    $redirectRoundTrips = 1;
  }
  
  $DOMLoad = $events.DOMLoad.value;
  $PageLoad = $events.PageLoad.value;
  $HTTPLoad = $events.HTTPLoad.value;
  $RenderStart = $events.RenderStart.value;
  
  $roundTrips = $summary.RoundTrips;
  $bytesReceived = $summary.BytesReceived;
  
  if ( $script:SUBTRACT_REDIRECTS )
  {
    $PageLoad -= $redirectTime;
    $HTTPLoad -= $redirectTime;
    $roundTrips -= $redirectRoundTrips;
    $bytesReceived -= $redirectBytesReceived;
  }
  
  $compressionSavedBytes = $summary.CompressionSavedBytes
  
  Write-Host "TIMING: $note - rt: $RoundTrips; pl: $PageLoad";
  
  $o = 1 | select Note, URL, RenderStart, PageLoad, HTTPLoad, BytesReceived, CompressionSavedBytes, RoundTrips, RedirectTime
  $o.Note = $note;
  $o.URL = $url;
  $o.RenderStart = $RenderStart;
  $o.PageLoad = $PageLoad;
  $o.HTTPLoad = $HTTPLoad;
  $o.BytesReceived = $bytesReceived;
  $o.CompressionSavedBytes = $compressionSavedBytes;
  $o.RoundTrips = $roundTrips;
  $o.RedirectTime = $redirectTime;
  $o;
}
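
The effect of the redirect subtraction is easiest to see with numbers. The sketch below applies the same arithmetic to hand-supplied values; Remove-RedirectCost is a hypothetical name that does not appear in the original script, and the figures are illustrative values shaped like a /tcp initial visit rather than anything read from a live HttpWatch log.

```powershell
# Hypothetical helper (not in the original script): applies the same
# subtraction Process-Logs performs when $script:SUBTRACT_REDIRECTS is set.
function Remove-RedirectCost
{
  param(
    [double]$PageLoad,
    [double]$HTTPLoad,
    [int]$RoundTrips,
    [double]$RedirectTime
  );

  $o = 1 | select PageLoad, HTTPLoad, RoundTrips;
  # Rounding only hides floating-point noise in the displayed values.
  $o.PageLoad = [math]::Round($PageLoad - $RedirectTime, 3);
  $o.HTTPLoad = [math]::Round($HTTPLoad - $RedirectTime, 3);
  # The redirect is counted as one round trip, as in Process-Logs. Treating a
  # zero RedirectTime as "no redirect" is an assumption of this sketch.
  $o.RoundTrips = $RoundTrips;
  if ( $RedirectTime -gt 0 ) { $o.RoundTrips = $RoundTrips - 1; }
  $o;
}

Remove-RedirectCost -PageLoad 5.381 -HTTPLoad 5.378 -RoundTrips 54 -RedirectTime 0.555;
```

With these inputs the adjusted page load comes out at 4.826 seconds over 53 round trips, which is the figure that should be compared against a baseline run that had no redirect.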

Running the Script

Since the requirements called for different IP addresses for the two virtuals, I had to make a host entry before each run of the script to make sure that the browser was going to the correct VIP.

#for VIP 1 (substitute w.x.y.z for real IP address)
w1.x1.y1.z1 devcentral.f5.com
#for VIP 2
w2.x2.y2.z2 devcentral.f5.com

Then the script can be run like the following:

PS> # Change IP Address to VIP 1
PS> .\DC-WATest.ps1 > results.csv
PS> # Change IP address to VIP 2
PS> .\DC-WATest.ps1 -WA >> results.csv
PS> type results.csv
BytesReceived,CompressionSavedBytes,HTTPLoad,Note,PageLoad,RedirectTime,RenderStart,RoundTrips,URL
1519542,467368,4.713,w1.x1.y1.z1-devcentral.f5.com--I-0,4.716,0,1.046,54,https://devcentral.f5.com/
1514991,467368,4.432,w1.x1.y1.z1-devcentral.f5.com--I-1,4.435,0,1.044,54,https://devcentral.f5.com/
1515020,467368,4.548,w1.x1.y1.z1-devcentral.f5.com--I-2,4.55,0,1.051,54,https://devcentral.f5.com/
...
BytesReceived,CompressionSavedBytes,HTTPLoad,Note,PageLoad,RedirectTime,RenderStart,RoundTrips,URL
1350376,637177,5.378,w2.x2.y2.z2-devcentral.f5.com-tcp-I-0,5.381,0.555,1.06,54,https://devcentral.f5.com/tcp -> https://devcentral.f5.com/
1345751,637177,4.401,w2.x2.y2.z2-devcentral.f5.com-tcp-I-1,4.403,0.637,1.171,54,https://devcentral.f5.com/tcp -> https://devcentral.f5.com/
1345662,637177,4.538,w2.x2.y2.z2-devcentral.f5.com-tcp-I-2,4.54,0.606,1.155,54,https://devcentral.f5.com/tcp -> https://devcentral.f5.com/
...

The raw data can then be imported into a spreadsheet and processed to determine the overall effectiveness of the various optimization profiles. 
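
As an alternative to a spreadsheet, the averaging could be sketched in PowerShell itself. The example below is an illustration under a few assumptions: Get-ScenarioAverages is a hypothetical helper, the rows are pasted inline from the sample output instead of being read from results.csv, and runs are grouped by stripping the trailing run index from the Note column. It also drops the second header row that appending with >> introduces.

```powershell
# Rows copied from the sample output; a real run might use Get-Content results.csv.
$lines = @(
  "BytesReceived,CompressionSavedBytes,HTTPLoad,Note,PageLoad,RedirectTime,RenderStart,RoundTrips,URL",
  "1519542,467368,4.713,w1.x1.y1.z1-devcentral.f5.com--I-0,4.716,0,1.046,54,https://devcentral.f5.com/",
  "1514991,467368,4.432,w1.x1.y1.z1-devcentral.f5.com--I-1,4.435,0,1.044,54,https://devcentral.f5.com/",
  "1515020,467368,4.548,w1.x1.y1.z1-devcentral.f5.com--I-2,4.55,0,1.051,54,https://devcentral.f5.com/",
  "BytesReceived,CompressionSavedBytes,HTTPLoad,Note,PageLoad,RedirectTime,RenderStart,RoundTrips,URL",
  "1350376,637177,5.378,w2.x2.y2.z2-devcentral.f5.com-tcp-I-0,5.381,0.555,1.06,54,https://devcentral.f5.com/tcp -> https://devcentral.f5.com/",
  "1345751,637177,4.401,w2.x2.y2.z2-devcentral.f5.com-tcp-I-1,4.403,0.637,1.171,54,https://devcentral.f5.com/tcp -> https://devcentral.f5.com/",
  "1345662,637177,4.538,w2.x2.y2.z2-devcentral.f5.com-tcp-I-2,4.54,0.606,1.155,54,https://devcentral.f5.com/tcp -> https://devcentral.f5.com/"
);

# Hypothetical helper (not in the original script).
function Get-ScenarioAverages
{
  param([string[]]$lines);

  $header = $lines[0];
  # Keep the first header only; skip repeated headers, "..." and blank lines.
  $data = $lines | Select-Object -Skip 1 |
    Where-Object { $_ -ne $header -and $_ -ne "..." -and $_.Trim() -ne "" };
  $records = (@($header) + @($data)) | ConvertFrom-Csv;

  # "...-I-0" and "...-I-1" both become "...-I", so runs of one scenario group together.
  $records | Group-Object { $_.Note -replace '-\d+$', '' } | ForEach-Object {
    $avg = ($_.Group | ForEach-Object { [double]$_.PageLoad } |
      Measure-Object -Average).Average;
    $o = 1 | select Scenario, Runs, AvgPageLoad;
    $o.Scenario = $_.Name;
    $o.Runs = $_.Count;
    $o.AvgPageLoad = [math]::Round($avg, 3);
    $o;
  };
}

Get-ScenarioAverages -lines $lines | Format-Table -AutoSize;
```

Note that the /tcp rows here still include the redirect time, because the sample run did not pass -SubtractRedirects; averages are only comparable across scenarios when both sides are measured the same way.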

Conclusion

HttpWatch proves to be a very valuable tool to integrate into your automation and monitoring environments.  For more information on HttpWatch, go to httpwatch.com.  The documentation for the Automation interface is located at the HttpWatch Automation Overview site.

Source Code

The full source for this script can be found in the Advanced Design and Config wiki under “Programmatic Performance Testing with HttpWatch”.