Advanced iRules: Scan

Scan parses strings. It takes a string and, based on the format specifiers you supply, stores the matches in one or more variables. It also returns the number of conversions performed, so it can be used in a conditional as well. For all of the options available in the command, see the scan man page at http://tmml.sourceforge.net/doc/tcl/scan.html. I'll highlight a few of the options I see used in iRules examples in Q&A below.

 

scan string format ?varName varName ...?

 

Options

  • d - The input substring must be a decimal integer.
  • s - The input substring consists of all the characters up to the next white-space character.
  • n - No input is consumed from the input string. Instead, the total number of characters scanned from the input string so far is stored in the variable.
  • [chars] - The input substring consists of one or more characters in chars. The matching string is stored in the variable.
  • [^chars] - The input substring consists of one or more characters not in chars.
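
To make the options above concrete, here is a minimal sketch that can be pasted into a Tcl shell. The input strings and variable names are made up for illustration; each line exercises one conversion from the list.

```tcl
# %d: decimal integer; scanning stops at the first non-digit
scan "42abc" {%d} n                  ;# n = 42

# %s: characters up to the next white space
scan "hello world" {%s} word         ;# word = hello

# %[chars]: one or more characters from the set
scan "abc123" {%[a-z]} letters       ;# letters = abc

# %[^chars]: one or more characters NOT in the set
scan "key=value" {%[^=]} key         ;# key = key

# %n: consumes nothing, stores the number of characters scanned so far
scan "key=value" {%[^=]=%n} key pos  ;# pos = 4

puts "$n $word $letters $key $pos"
```

Note that the format strings are wrapped in curly braces so the square brackets are not treated as command substitution.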

Examples

So how do we put scan to use in iRules? Consider this first example:

when HTTP_REQUEST {
  if { [scan [HTTP::host] {%[^:]:%s} host port] == 2 } {
    log local0. "Parsed \$host:\$port: $host:$port"
  }
}

Here we are scanning the contents of the Host header. HTTP::host includes a port only when the client sent one, which clients typically do only for non-default ports (anything other than 80 for http traffic or 443 for ssl traffic). For a standard port, the second conversion (%s) never populates the port variable, scan returns 1, and the conditional is false. In the scan command's format string, %[^:] tells scan to store all the characters from the beginning of the string up to the first colon. The literal colon before the %s (which tells scan to store the remaining characters up to the end or the next white space) keeps the colon out of the port variable's contents. Also note that the format string is wrapped in curly braces so that the brackets are not evaluated as a command. Below is the functionality of the scan command in a Tcl shell:

% set httphost "www.test.com:8080"
www.test.com:8080
% scan $httphost {%[^:]:%s} host port
2
% puts "$host $port"
www.test.com 8080
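
For contrast, here is a small sketch of the case where the host value carries no port. The hostvalue variable and the fallback to port 80 are assumptions for illustration only, not part of the iRule above.

```tcl
# Hypothetical Host header value with no explicit port
set hostvalue "www.test.com"

# Only the first conversion succeeds, so scan returns 1 and never sets port
set count [scan $hostvalue {%[^:]:%s} host port]
puts "count=$count host=$host port_set=[info exists port]"

# Assumed fallback for illustration: use 80 when no port was parsed
if {$count < 2} {
    set port 80
}
puts "$host $port"
```

Because the return value is 1 rather than 2, a conditional like the one in the iRule skips its body, which is why checking the count matters before touching the port variable.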

Another use case--splitting up an IP address into multiple variables--is accomplished in one easy step below.

% set ip 10.15.25.30
10.15.25.30
% scan $ip %d.%d.%d.%d ip1 ip2 ip3 ip4
4
% puts "$ip1 $ip2 $ip3 $ip4"
10 15 25 30
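
Since scan reports how many conversions succeeded, the same format can double as a rough sanity check. The sketch below is only that: looks_like_ipv4 is a hypothetical helper name, and the check is deliberately loose (for example, %d tolerates leading white space and a sign).

```tcl
# Hypothetical helper: return 1 if addr looks like a dotted-quad IPv4 address
proc looks_like_ipv4 {addr} {
    # The trailing %c only converts if something follows the fourth octet,
    # so a result other than 4 means too few octets or trailing characters
    if {[scan $addr {%d.%d.%d.%d%c} a b c d extra] != 4} {
        return 0
    }
    # %d accepts any integer, so range-check each octet separately
    foreach octet [list $a $b $c $d] {
        if {$octet < 0 || $octet > 255} {
            return 0
        }
    }
    return 1
}

puts [looks_like_ipv4 10.15.25.30]
puts [looks_like_ipv4 10.15.25]
puts [looks_like_ipv4 10.15.25.300]
```

The first call should report a match and the other two should not: one is missing an octet and the other has an octet out of range.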

As with most things in iRules, there are many paths to the same result, even if some require more steps. Here's another way to arrive at the same split IP with a variable for each octet. This method requires four nested split/lindex evaluations to achieve the same result.

% set ip 10.15.20.25
10.15.20.25
% set ip1 [lindex [split $ip "."] 0]
10
% set ip2 [lindex [split $ip "."] 1]
15
% set ip3 [lindex [split $ip "."] 2]
20
% set ip4 [lindex [split $ip "."] 3]
25
% puts "$ip1 $ip2 $ip3 $ip4"
10 15 20 25
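
Once each octet sits in its own variable, integer arithmetic on them is straightforward. Here is a minimal sketch of combining octets with bit shifts; the subnet_id name and the choice of the first three octets (a /24) are assumptions for illustration.

```tcl
set ip 10.15.20.25

if {[scan $ip {%d.%d.%d.%d} ip1 ip2 ip3 ip4] == 4} {
    # Shift each octet into its own byte and OR them together.
    # Using only three octets keeps the value well under 2^24.
    set subnet_id [expr {($ip1 << 16) | ($ip2 << 8) | $ip3}]
    puts $subnet_id
}
```

For 10.15.20.25 this prints 659220, and any other address in 10.15.20.0/24 yields the same value.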

If you're wondering why you'd split the IP like this, the use case in the forums was to extract each octet and then do some bit shifting to create a unique ID for their stores based on IP subnets. One final example before closing. The scan string is refined over a few steps to show the elimination of unwanted characters in the variables.

% set sipinfo {<sip:214365981110@10.15.20.25:3232>}
<sip:214365981110@10.15.20.25:3232>
% scan $sipinfo {%[^:]%s} garbage sessid
2
% puts "$garbage $sessid"
<sip :214365981110@10.15.20.25:3232>

You can see that at the first colon it dumped the contents up to that character into the garbage variable. Everything else, including the colon, is dumped into the sessid variable. Close, but we don't want that colon, so we need to include it in the scan format string.

% scan $sipinfo {%[^:]:%s} garbage sessid
2
% puts "$garbage $sessid"
<sip 214365981110@10.15.20.25:3232>
Good. Now we need to break off the host and the port as well. We want all characters up until the @ sign for the session id, then all the characters between the @ sign and the colon for the host, and finally all the characters after the colon for the port.

% scan $sipinfo {%[^:]:%[^@]@%[^:]:%s} garbage sessid host port
4
% puts "$sessid $host $port"
214365981110 10.15.20.25 3232>

OK, all looks good except the port. We definitely don't want the > on the port. So one final fix.

% scan $sipinfo {%[^:]:%[^@]@%[^:]:%[^>]} garbage sessid host port
4
% puts "$sessid $host $port"
214365981110 10.15.20.25 3232
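
If you need this parse in more than one place, the final format string can be wrapped in a proc that also checks the conversion count. This is a sketch under assumptions: parse_sip_uri is a hypothetical name, and the sample value is assumed to have the <sip:sessid@host:port> shape that matches the output above.

```tcl
# Hypothetical helper: returns {sessid host port}, or an empty list on no match
proc parse_sip_uri {uri} {
    # Same format string as the final step; all four conversions must succeed
    if {[scan $uri {%[^:]:%[^@]@%[^:]:%[^>]} scheme sessid host port] != 4} {
        return [list]
    }
    return [list $sessid $host $port]
}

# Assumed sample value for illustration
puts [parse_sip_uri {<sip:214365981110@10.15.20.25:3232>}]

# An input without the expected separators yields an empty list
puts [llength [parse_sip_uri {no-separators-here}]]
```

Returning an empty list on a partial match means the caller never reads a variable that scan did not populate.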

Scan is a great command to have available in your iRules arsenal. Thanks to Hoolio, cmbhatt, sre, and natty76 for some great examples. I've archived the article, but for another great scan example, check out my Revisiting the TCL Scan Command article from the original iRules 101 series.

Updated Oct 02, 2023
Version 3.0

1 Comment

  • @Jason,

     

    Great job! This tip helped me a lot, and I was able to solve a problem I had with a SIP header.

    The SIP "Contact" header contains several "routerids", between six and eight of them. Because I needed to use SNAT, the BIG-IP changed the original IP in the "routers", and this caused trouble for SIP communication between the client and the SBC proxy: when the SBC received the packet, it sent the reply back to the BIG-IP and not to the client.

    Using your example, I was able to save the routerid and the client IP into variables and replace them in the SIP response.

    Here is my iRule:

     

    when SIP_REQUEST {
        set addr [IP::client_addr]
    }

    when SIP_RESPONSE {
        # Check for 302 responses
        if {[SIP::response code] == 302} {
            log local0. "302 OK"
            if {[scan [SIP::header value "Contact"] {%[^@]%[^,]%[^@]%[^,]%[^@]%[^,]%[^@]%[^,]%[^@]%[^,]%[^@]%s} a b c d e f g h i j k l] == 12} {
                log local0. "Parsed before change [SIP::header value "Contact"] into $a - $b - $c - $d - $e - $f - $g - $h - $i - $j - $k - $l"
                SIP::header remove "Contact"
                SIP::header insert "Contact" "${a}@$addr>${c}@$addr>${e}@$addr>${g}@$addr>${i}@$addr>${k}@$addr>"
                log local0. "Parsed after change [SIP::header value "Contact"] into $a - $b - $c - $d - $e - $f - $g - $h - $i - $j - $k - $l"
            }
        }
    }

     

    My question now is whether this rule can increase the CPU load, because when I look at the dump from before the rule, the packet took 7 ms to arrive, and after the rule, 35 ms.

    Thanks a lot, Luis Araujo