Forum Discussion

Dan_Markhasin_1
Nimbostratus
Aug 20, 2015

Using dict in iRules?

Hi,

 

I'm wondering if anyone has had any luck using dict in iRules? I've found some old threads dating back ~5 years stating that iRules don't support dict since they are based on Tcl 8.4, but perhaps something has changed since then?

 

What I'm trying to do is to check if multiple values are in a whitelist for every request to a given virtual server. For example, I am checking if the provided credentials are in a whitelist, if the host header is in a whitelist, etc. Currently I just have 5-6 datagroups - each having a list - that correspond to what I am checking. I'd really like to get rid of that and just use a single datagroup that would contain a dictionary, so I would be able to access it like this:

 

set dictionary [class lookup $somevar datagroup_that_contains_dictionaries]
set whitelisted_accounts [dict get $dictionary accounts]
set whitelisted_extension [dict get $dictionary extensions]

etc.

 

But, I am getting an error that dict is an undefined procedure 😞

 

I considered using a list of lists instead, but it would force me to hard-code the indexes of the different elements I need...
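
For illustration, the list-of-lists workaround would look roughly like this (the datagroup name is made up, and this assumes it runs inside HTTP_REQUEST):

set biglist [class lookup $somevar datagroup_that_contains_lists]

# element positions have to be hard-coded and kept in sync everywhere
set whitelisted_hosts    [lindex $biglist 0]
set whitelisted_accounts [lindex $biglist 1]
set whitelisted_exts     [lindex $biglist 2]

if { [lsearch -exact $whitelisted_hosts [HTTP::host]] != -1 } {
    # host is whitelisted
}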

 

Has anyone implemented something similar and can share some tips?

 

4 Replies

  • dict is not supported. How are your datagroups organized? You mention that you have 5 or so. Do these correspond to customer (e.g., with each element being a match type, like "header" and "ip"); do they correspond to match type (e.g., with each element being a customer); or something else?

  • Yes, they are keyed on customer, with the value currently being a list (a list of allowed host headers, a list of allowed URIs, etc.).

     

  • I assume that you have a datagroup per customer. I assume further that each datagroup is indexed by a match type (e.g., "host" for the Host header), and that the value for each entry is a space-delimited list. Finally, I assume you use something like lsearch to see if there is a match.

    If that's correct, I suspected that a class match plus split plus lsearch would be substantially more expensive than a simple class match. Indeed, testing has suggested that class match executes in roughly O(1) time, while list iteration is naturally O(n), which among other things means that it will perform increasingly poorly as the lists get longer.
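
    Concretely, I'm picturing your current check doing something along these lines on each request (the datagroup and key names are my guesses):

    when HTTP_REQUEST {
        # pull the space-delimited whitelist for the "host" key from the
        # per-customer datagroup, turn it into a Tcl list, then scan it
        set allowed_hosts [split [class match -value "host" equals customer01_whitelist] " "]
        if { [lsearch -exact $allowed_hosts [HTTP::host]] == -1 } {
            HTTP::respond 403 content "host not allowed"
            return
        }
    }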

    An alternative approach is to "flatten" the datagroup key space. Right now, if you have something like this as a datagroup entry:

    host {www.foo.com www.bar.com www.baz.com}
    

    (where host is the index and the rest is the value), you could change the data group so it is:

    host:www.foo.com {}
    host:www.bar.com {}
    host:www.baz.com {}
    

    then do something like this:

    if { [class match "[HTTP::host]:www.foo.com" equals match-test-customer01] } { ... }
    

    I did a fair amount of testing, and the second approach is, as I suspected, faster.
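
    Put together, a request-time check against the flattened datagroup might look roughly like this (which selector you match on, HTTP::path here, depends on what you actually whitelist):

    when HTTP_REQUEST {
        # one flattened datagroup per customer; each record key combines the
        # match type and an allowed value, e.g. "host:www.foo.com"
        if { ![class match "host:[HTTP::host]" equals match-test-customer01] } {
            HTTP::respond 403 content "host not whitelisted"
            return
        }
        # HTTP::path is the URI without the query string
        if { ![class match "uri:[HTTP::path]" equals match-test-customer01] } {
            HTTP::respond 403 content "uri not whitelisted"
            return
        }
    }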

    What follows are the details of this analysis:

    I compared two approaches. I believe the first is similar to what you are doing now (a per-customer datagroup with a key that points to a string, which you expand to a list). The second concatenates the match type and the value into a single key and looks that key up directly:

    APPROACH 1 (LIST EXPANSIONS):

    ltm data-group internal list-test-customer01 {
        records {
            hosts {
                data "www.foo.com www.bar.com www.baz.com www.bing.com"
            }
            uris {
                data "/foo/bar /foo/baz / /index.html"
            }
        }
        type string
    }
    
    ltm rule test-list {
        when HTTP_REQUEST {
            set start [clock clicks -milliseconds]
    
            # I expect that $x is always 1000000 and $y is always 0
            set x 0
            set y 0
            for { set i 0 } { $i < 1000000 } { incr i } {
                set ll [split [expr { [class match -value hosts equals list-test-customer01] }] " "]
                if { [lsearch -exact $ll www.foo.com] != -1 } {
                    incr x
                }
                if { [lsearch -exact $ll www.bruno.com] != -1 } {
                    incr y
                }
            }
    
            set end [clock clicks -milliseconds]
    
            set delta [expr { $end - $start }]
            log local0. "TEST RUN LIST: x = ($x); y = ($y); delta = ($delta)"
    
            HTTP::respond 200 content "foo$delta\r\n"
        }
    }
    

    In this case, I test the best-case list match (it matches the first list item when using lsearch) and the worst-case (it matches no items in the list so must iterate through the entire list).

    The second approach looks like this:

    APPROACH 2 (FLATTENED CLASS MATCHING):

    ltm data-group internal match-test-customer01 {
        records {
            host:www.bar.com { }
            host:www.baz.com { }
            host:www.foo.com { }
            uri:/ { }
            uri:/foo/bar { }
            uri:/foo/baz { }
            uri:/index.html { }
        }
        type string
    }
    
    ltm rule test-match {
        when HTTP_REQUEST {
            set start [clock clicks -milliseconds]
    
            # I expect that $x is always 1000000 and $y is always 0
            set x 0
            set y 0
            for { set i 0 } { $i < 1000000 } { incr i } {
                if { [class match "host:www.foo.com" equals match-test-customer01] } {
                    incr x
                }
                if { [class match "host:www.bruno.com" equals expand-test] } {
                    incr y
                }
            }
    
            set end [clock clicks -milliseconds]
    
            set delta [expr { $end - $start }]
            log local0. "TEST RUN MATCH: x = ($x); y = ($y); delta = ($delta)"
    
            HTTP::respond 200 content "foo$delta\r\n"
        }
    }
    

    I ran each of these in a pseudo-random order for 100 iterations using the following script from off-box:

    #!/bin/bash
    
    list_time_total=0
    match_time_total=0
    
    list_count=0
    match_count=0
    
    for i in {1..100}; do
        echo -n "."
        port=$(expr 8080 + $(expr $RANDOM % 2))
    
        S="$(date +%s%N)"
        curl http://10.11.212.100:$port >/dev/null 2>&1
    
        E="$(date +%s%N)"
    
        if [ "$port" == "8080" ]; then
            match_time_total=$(($match_time_total + ($E - $S)))
            match_count=$(($match_count + 1))
        else
            list_time_total=$(($list_time_total + ($E - $S)))
            list_count=$(($list_count + 1))
        fi
    done
    
    echo
    
    dm=$((match_time_total / $match_count))
    dl=$(($list_time_total / $list_count))
    
    echo "m = $match_time_total; mc = $match_count; m/mc = $dm; m/mc/1e6 = $(($dm / 1000000))"
    echo "l = $list_time_total; lc = $list_count; l/lc = $dl; l/lc/1e6 = $(($dl / 1000000))"
    

    10.11.212.100:8080 corresponds to the "match-test" and 10.11.212.100:8081 corresponds to the "list-test". Here are the results:

    m = 62587059396; mc = 51; m/mc = 1227197243; m/mc/1e6 = 1227
    l = 142448719265; lc = 49; l/lc = 2907116719; l/lc/1e6 = 2907
    

    So clearly, from off-box, the concatenate ("match") method is substantially faster (2907 vs. 1227, or roughly 2.4x).

    I then modified each iRule, removing the for loop, and executed again in pseudo-random order with 10,000 iterations. I turned on iRule timing. These are the results:

    root@(b212)(cfg-sync Standalone)(Active)(/Common)(tmos) show ltm rule test-list
    
    ---------------------------------------
    Ltm::Rule Event: test-list:HTTP_REQUEST
    ---------------------------------------
    Priority                    500
    Executions
      Total                    4.9K
      Failures                    0
      Aborts                      0
    CPU Cycles on Executing
      Average                 85.1K
      Maximum                276.1K
      Minimum                 53.4K
    
    root@(b212)(cfg-sync Standalone)(Active)(/Common)(tmos) show ltm rule test-match
    
    ----------------------------------------
    Ltm::Rule Event: test-match:HTTP_REQUEST
    ----------------------------------------
    Priority                    500
    Executions
      Total                    5.0K
      Failures                    0
      Aborts                      0
    CPU Cycles on Executing
      Average                 69.0K
      Maximum                173.0K
      Minimum                 43.9K
    

    The performance difference from this perspective isn't nearly as dramatic (about 1.2x for the average case and 1.6x for the maximum), but the match approach is still faster.
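
    For reference, the loop-free variant of the match rule that was timed here would have been little more than this (a sketch; the exact modified rules aren't reproduced above):

    ltm rule test-match {
        when HTTP_REQUEST {
            set x 0
            if { [class match "host:www.foo.com" equals match-test-customer01] } {
                incr x
            }
            HTTP::respond 200 content "foo$x\r\n"
        }
    }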

    I'll admit that I'm at a loss to explain why the off-box results are starker than those measured on-box. The result sets are quite stable, which implies to me that it is a timing precision issue on one side or the other.

    This is undoubtedly WAAAAY more information than you wanted or needed, but I was intrigued by this question. 🙂

  • Wow, thanks for the detailed explanation ! 🙂

    Currently, my data groups are not per customer, but rather per "item". So I have for example a data group that lists the allowed hostnames, i.e.:

    customer1 : {"www.foo.com" "www.bar.com"}
    customer2 : {"www.bar.com" "www.baz.com"}
    

    etc.

    a similar data group for accounts, etc. So basically, for every connection I need to look up multiple data groups to get the data I need. Like I mentioned in the original post, I found it possible to create a single data group based on a list of lists per customer, such as:

    customer1: { {"www.foo.com" "www.bar.com"} { "account1" "account2" } {".jpg" ".gif"} }
    

    And then I access the different elements in that list by index, e.g. set accountslist [lindex $biglist 1], and then I use lsearch (without split) to check if a given account is in that list.

    I will definitely look into your suggestion of using class match, thanks!

    p.s. still, a shame that dict is not supported. Would have made it much easier for me...