Monitoring Your Network with PRTG - Custom Sensors Part 2

In my last article, “Custom Sensors Part 1”, I covered what sensors are, the wire format for custom sensors, and the main logic in the sensor we use to monitor our BIG-IP GTM, LTM, ASM, and WA devices. In this article, I will continue with the examples and walk you through what we use to test the website itself.

The Application

One may ask why the simple HTTP monitor is not sufficient to test our web application. The reason is that DevCentral consists of the following three applications stitched together with application logic and some iRule goodness:

Main application CMS
Wiki
Blogs

I could have created several monitors for the various applications, but I wanted to come up with something that determined the overall “health” of the system. We wanted a single monitor that aggregates the health of the various applications into a “global” health, and a custom monitor turned out to be perfect for the job.

So without further ado, we’ll dig into the various logic and functions in my script.

Usage

The script takes only two parameters. The first is the address that we are testing against, and the second is a switch that selects HTTPS instead of HTTP. I have several instances of this script running at the various layers in our network: pointed directly at the app servers over HTTP, and through our WA and ASM over HTTPS. This gives me a nice view of timings at each layer for troubleshooting.

PS> .\DC-HealthCheck.ps1 -Server ww.xx.yy.zz -SSL

-Server  - the IP address that we want to test our app against.
-SSL     - a switch to tell the script whether to test HTTP or HTTPS.

Utilities

Since the main job of my custom sensor is validating content in web page responses, I created a function that wraps the curl.exe command line. I could have done this with native .NET methods, but I took the easy way out since I had a controlled environment with curl present. I made use of curl’s built-in --resolve option so the script can request https://devcentral.f5.com/s while controlling which IP address each request is sent to.
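
As a rough sketch of the idea above, this is how the curl arguments could be assembled so that the port passed to --resolve always follows the URL scheme. Get-CurlArguments and the 192.0.2.10 address are made up for illustration and are not part of the original script.

```powershell
# Hypothetical helper (not in the original script): builds the argument list
# for curl.exe so the --resolve port always matches the URL scheme.
function Get-CurlArguments
{
  param(
    [string]$HostName,
    [string]$IP,
    [string]$Url = "/",
    [switch]$SSL
  );

  if ( $SSL ) { $scheme = "https"; $port = 443; }
  else        { $scheme = "http";  $port = 80;  }

  # --resolve takes host:port:address and skips DNS for that host/port pair.
  @(
    "--insecure", "-sN",
    "--max-time", "10",
    "--resolve", "${HostName}:${port}:${IP}",
    "${scheme}://${HostName}${Url}"
  );
}

# 192.0.2.10 is a documentation-only address used as a placeholder.
$curlArgs = Get-CurlArguments -HostName "devcentral.f5.com" -IP "192.0.2.10" -SSL;
$curlArgs -join " ";
```

The resulting array can be splatted onto the executable with `& curl.exe @curlArgs`. Keeping the scheme and port in one place avoids an HTTPS request being paired with a port 80 mapping, in which case curl would ignore the mapping and fall back to DNS.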

The Get-WebPage() function does the curl request, times the request, and builds a response containing the page content and the time it took to request the page. The time is used later on to give an “average” response time for the site.

The Build-Result() function generates a PRTG sensor result item for the response, to be included in the aggregate sensor output by the various component functions below.

# Creates an empty response object with Content and Time properties.
function New-Response()
{
  1 | select "Content", "Time";
}

# Requests a page with curl.exe and returns its content and the elapsed seconds.
function Get-Webpage()
{
  param(
    $Server,
    $IP = $null,
    $Url
  );

  $Port = 80;
  if ( $script:SSL )
  {
    $fullUrl = "https://${Server}${Url}";
    $Port = 443;
  }
  else
  {
    $fullUrl = "http://${Server}${Url}";
  }
  $Output = "";

  if ( $IP -ne $null )
  {
    # --resolve pins devcentral.f5.com to $IP; its port must match the URL scheme.
    $sec = (Measure-Command {$Output = curl.exe --insecure -sN --resolve "devcentral.f5.com:${Port}:${IP}" $fullUrl }).TotalSeconds;
  }
  else
  {
    # $UserAgent is not defined in this excerpt and must be set elsewhere in the script.
    $sec = (Measure-Command {$Output = curl.exe --insecure -sNA $UserAgent --max-time 10 $fullUrl }).TotalSeconds;
  }

  $R = New-Response;
  $R.Time = $sec;
  $R.Content = $Output;

  $R;
}

# Builds one <result> element; $Value is given in seconds and reported in msec.
function Build-Result()
{
  param(
    $Channel,
    $Unit = "Custom",
    $CustomUnit = "msec",
    $Mode = "Absolute",
    $Value,
    [switch]$Warning = $False
  );

  $Value = [Convert]::ToInt32($Value * 1000);

  $w = 0;
  if ( $Warning )
  {
    $script:WARNING = $True;
    $script:STATUS = "ERROR: At least one channel had an error";
    $w = 1;
  }
  $s = @"
<result>
  <channel>${Channel}</channel>
  <unit>${Unit}</unit>
  <customunit>${CustomUnit}</customunit>
  <mode>${Mode}</mode>
  <showchart>1</showchart>
  <showtable>1</showtable>
  <warning>$([Convert]::ToInt32($w))</warning>
  <float>0</float>
  <value>${Value}</value>
</result>
"@;
  $s;
}

Testing the Database

The first layer in our stack is the database. We have a database healthcheck page that collects some connection metrics on the database itself, without the overhead of the application code in our main applications. This function calls Get-WebPage with the database healthcheck URL and reports its findings to the caller.

# Checks the database healthcheck page and returns a "Database Connection" result.
function Test-DB()
{
  param($IP);

  $R = Get-WebPage -Server "devcentral.f5.com" -IP "${IP}" -Url "/dbhealth.aspx";
  if ( $R.Content -match "Database Status: UP" )
  {
    $s = Build-Result -Channel "Database Connection" -Unit "Custom" -Value $R.Time;
    $script:TIMES += $R.Time;
  }
  else
  {
    # Failed check: report 0 with the warning flag and record the check name.
    $s = Build-Result -Channel "Database Connection" -Unit "Custom" -Value 0 -Warning;
    $script:TIMES += 0;
    $script:BADCHECKS += "DB";
  }

  $s;
}

Testing the Homepage

Now that the database has been checked, we’ll test the main landing page for DevCentral. The function is very similar to the database check, except that it requests a different URL and looks for different text in the response. Like the database check, it adds its name to the $script:BADCHECKS variable if the check fails. This list is reported to PRTG so that you can tell in the GUI which individual “channels” failed.

# Checks the landing page and returns a "Home Page" result.
function Test-HomePage()
{
  param($IP);

  $R = Get-WebPage -Server "devcentral.f5.com" -IP "${IP}" -Url "/";
  if ( $R.Content -match "Welcome to F5 DevCentral" )
  {
    $s = Build-Result -Channel "Home Page" -Unit "Custom" -Value $R.Time;
    $script:TIMES += $R.Time;
  }
  else
  {
    # Failed check: report 0 with the warning flag and record the check name.
    $s = Build-Result -Channel "Home Page" -Unit "Custom" -Value 0 -Warning;
    $script:TIMES += 0;
    $script:BADCHECKS += "HomePage";
  }

  $s;
}
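
Test-DB and Test-HomePage share the same shape: fetch a page, match a pattern, then update the bookkeeping variables. A minimal sketch of that shared step is below. Test-Content is a hypothetical helper that is not in the original script, and canned response objects stand in for real Get-WebPage calls so it runs offline.

```powershell
$script:TIMES = @();
$script:BADCHECKS = @();

# Hypothetical helper: applies the match-and-record step used by Test-DB and
# Test-HomePage to a response object that was already fetched.
function Test-Content
{
  param(
    $Response,   # object with Content and Time, as returned by Get-WebPage
    $Name,       # short name recorded in $script:BADCHECKS on failure
    $Pattern     # regular expression the page content must match
  );

  if ( $Response.Content -match $Pattern )
  {
    $script:TIMES += $Response.Time;
    $True;
  }
  else
  {
    $script:TIMES += 0;
    $script:BADCHECKS += $Name;
    $False;
  }
}

# Canned responses stand in for real requests.
$good = New-Object PSObject -Property @{ Content = "Welcome to F5 DevCentral"; Time = 1.249 };
$bad  = New-Object PSObject -Property @{ Content = "Service Unavailable"; Time = 0.2 };

Test-Content -Response $good -Name "HomePage" -Pattern "Welcome to F5 DevCentral";
Test-Content -Response $bad  -Name "DB"       -Pattern "Database Status: UP";
"Failed checks: $([string]::Join(',', $script:BADCHECKS))";
```

With these two canned responses the first call passes, the second fails, and only "DB" ends up in the failed list.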

Main Sensor Logic

The main logic in the sensor calls the various “Test-*” functions for the database, homepage, wiki, and blogs. It then averages the response times and builds a top-level channel for the “Avg App Response” time. This gives a quick view of how the app is performing as a whole.

The response is then built from the average response time channel plus the channels for the individual tests, and the result is returned to PRTG. If any of the checks failed, a PRTG error is generated instead, with the list of failed checks in the error text.

param(
  $Server = $(Throw "Server parameter is required"),
  [switch]$SSL = $False
);

$script:SSL = $SSL;
$script:WARNING = $false;
$script:STATUS = "OK";
$script:TIMES = @();
$script:BADCHECKS = @();

# Run every check; each one returns a <result> element and records its time.
$Tests = @"
$(Test-DB -IP $Server)
$(Test-Homepage -IP $Server)
$(Test-Wiki -IP $Server)
$(Test-Blogs -IP $Server)
"@;

$t = $script:TIMES | Measure-Object -Average;

$s = @"
<prtg>
$(Build-Result -Channel "Avg App Response" -Unit "Custom" -Value $($t.Average))
$Tests
<text>$($script:STATUS)</text>
</prtg>
"@;

# If any check failed, replace the whole response with a PRTG error.
if ( $script:BADCHECKS.Length -gt 0 )
{
  $s = @"
<prtg>
  <error>1</error>
  <text>An error occurred in the following checks: $([string]::join(',', $script:BADCHECKS))</text>
</prtg>
"@;
}

$s;

$rc = [Convert]::ToInt32($script:WARNING);

exit $rc;
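
The averaging step is easy to check by hand. The sketch below feeds Measure-Object the four channel values from the manual test run later in this article, converted back to seconds, and then repeats the calculation with one check recorded as 0 the way a failed check would be. The variable names are only for illustration.

```powershell
# Channel times in seconds, taken from the manual test run in this article.
$times = @(0.416, 1.249, 2.688, 0.494);

$t = $times | Measure-Object -Average;
$avgMsec = [Convert]::ToInt32($t.Average * 1000);
"Avg App Response: $avgMsec msec";          # 1212

# A failed check is recorded as 0, which pulls the average down.
$withFailure = @(0.416, 1.249, 0, 0.494);
$f = $withFailure | Measure-Object -Average;
"With one failed check: $([Convert]::ToInt32($f.Average * 1000)) msec";   # 540
```

In the script itself the lowered average is never shown, because a failed check replaces the whole response with the error document.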

Running/Testing the Script Manually

To test the script, put it in the Custom Sensors\EXEXML directory under your PRTG Network Monitor installation folder and run it from the command line. The results are below:

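PS C:\Program Files (x86)\PRTG Network Monitor\Custom Sensors\EXEXML> .\DC-HealthCheck.ps1 -Server ww.xx.yy.zz
<prtg>
<result>
  <channel>Avg App Response</channel>
  <unit>Custom</unit>
  <customunit>msec</customunit>
  <mode>Absolute</mode>
  <showchart>1</showchart>
  <showtable>1</showtable>
  <warning>0</warning>
  <float>0</float>
  <value>1212</value>
</result>
<result>
  <channel>Database Connection</channel>
  <unit>Custom</unit>
  <customunit>msec</customunit>
  <mode>Absolute</mode>
  <showchart>1</showchart>
  <showtable>1</showtable>
  <warning>0</warning>
  <float>0</float>
  <value>416</value>
</result>
<result>
  <channel>Home Page</channel>
  <unit>Custom</unit>
  <customunit>msec</customunit>
  <mode>Absolute</mode>
  <showchart>1</showchart>
  <showtable>1</showtable>
  <warning>0</warning>
  <float>0</float>
  <value>1249</value>
</result>
<result>
  <channel>Wiki</channel>
  <unit>Custom</unit>
  <customunit>msec</customunit>
  <mode>Absolute</mode>
  <showchart>1</showchart>
  <showtable>1</showtable>
  <warning>0</warning>
  <float>0</float>
  <value>2688</value>
</result>
<result>
  <channel>Blogs</channel>
  <unit>Custom</unit>
  <customunit>msec</customunit>
  <mode>Absolute</mode>
  <showchart>1</showchart>
  <showtable>1</showtable>
  <warning>0</warning>
  <float>0</float>
  <value>494</value>
</result>
<text>OK</text>
</prtg>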
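
When testing by hand, it can help to confirm that the output is well-formed before handing it to PRTG. The sketch below assumes the sensor emits the `<prtg>` XML document described in Part 1, with `<result>`, `<channel>`, `<value>`, and `<customunit>` elements. The sample is a shortened, hand-written stand-in for the real output, not captured from the script.

```powershell
# Shortened sample output with two channels (hand-written for illustration).
$raw = @'
<prtg>
<result>
  <channel>Avg App Response</channel>
  <customunit>msec</customunit>
  <value>1212</value>
</result>
<result>
  <channel>Database Connection</channel>
  <customunit>msec</customunit>
  <value>416</value>
</result>
<text>OK</text>
</prtg>
'@;

# The [xml] cast throws if the text is not well-formed XML.
$doc = [xml]$raw;

foreach ( $r in $doc.SelectNodes("/prtg/result") )
{
  $name  = $r.SelectSingleNode("channel").InnerText;
  $value = $r.SelectSingleNode("value").InnerText;
  $unit  = $r.SelectSingleNode("customunit").InnerText;
  "{0} = {1} {2}" -f $name, $value, $unit;
}
"Status: $($doc.SelectSingleNode('/prtg/text').InnerText)";
```

To check a real run, the script output could be captured into `$raw` instead, for example with `$raw = (.\DC-HealthCheck.ps1 -Server ww.xx.yy.zz) -join "`n"`.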

The Graphs in the GUI

When you drill into the custom sensor in the PRTG GUI, you’ll see that the output data has been imported and that a channel has been created for each of the timings and added to the report timeline.

Conclusion

With the support for PowerShell, it’s very easy to build a custom sensor. If you have needs beyond the basic supported sensors, go ahead and jump in and write your own - it’s easy!

Published Sep 18, 2012

Version 1.0