Real Least Connections iRule

Problem this snippet solves:

Because of the BIG-IP TMOS CMP (Clustered MultiProcessing) architecture, customers who want least-connections load balancing (LB) will occasionally see uneven distribution. Each TMM operates independently, so two TMMs could (for example) each send one connection to pool member A, for a total of two connections, while pool member B has none. This iRule addresses the issue by using table memory to track the highest number of concurrent connections on any server (table entry "highconn"). If a new LB decision would cause that count to be exceeded, the iRule checks whether all servers already have that number of connections; if not, it forces an LB::reselect.
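
The CMP effect described above can be reproduced outside BIG-IP with a few lines of plain Tcl. This is only an illustrative sketch for tclsh, not iRule code: the proc name pick_least, the two-member pool, and the alphabetical tie-break are assumptions made for the example, and real TMM tie-breaking may differ.

```tcl
# Plain Tcl sketch (tclsh 8.5+, not an iRule): each simulated TMM keeps its
# own connection counts and cannot see what the other TMM has done.
proc pick_least {countsVar} {
    upvar 1 $countsVar counts
    set best ""
    # Walk members in sorted order; ties go to the first name (assumption)
    foreach member [lsort [dict keys $counts]] {
        if {$best eq "" || [dict get $counts $member] < [dict get $counts $best]} {
            set best $member
        }
    }
    # Record the new connection in this TMM's local view only
    dict incr counts $best
    return $best
}

# Two TMMs, each with an independent view of pool members A and B
set tmm0 [dict create A 0 B 0]
set tmm1 [dict create A 0 B 0]

set first  [pick_least tmm0]
set second [pick_least tmm1]
puts "TMM0 picked $first, TMM1 picked $second"
```

Both simulated TMMs pick member A, because each sees zero connections everywhere, so A ends up with two connections while B has none.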

This iRule is not recommended for virtual servers with high connection rates, but it is useful for any application that is particularly sensitive to uneven load balancing (e.g. a terminal services or VDI scenario).

How to use this snippet:

Apply the iRule to the virtual server and use whichever load-balancing algorithm you prefer (BIG-IP uses it to make the initial LB decision); the iRule will do everything it can to "force" least-connections behavior.

Code:

# Real Least Connections - For applications sensitive to imperfect least connections load balancing that occurs due to BIG-IP TMOS CMP Architecture
# c.jenison at f5.com (Chad Jenison)

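when RULE_INIT {
    # Prefix for every table key used below; "ssh7" appears to be the author's virtual server name, so change it to match yours
    set static::vs ssh7
    # Set to 0 to turn off the per-connection log messages
    set static::debug 1
}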

when CLIENT_ACCEPTED {

}

when LB_SELECTED {
   if {[table lookup $static::vs+highconn] >= 0} {
        if {[table lookup $static::vs+[LB::server addr][LB::server port]] >= [table lookup $static::vs+highconn]} {
            if { $static::debug != 0 } {
                log "server [LB::server addr]:[LB::server port] with conns: [table lookup $static::vs+[LB::server addr][LB::server port]] at or higher than highconn: [table lookup $static::vs+highconn]"
            }
            if {[table keys -subtable $static::vs+[table lookup $static::vs+[LB::server addr][LB::server port]] -count] < [active_members [LB::server pool]] && [table lookup $static::vs+[LB::server addr][LB::server port]] != 0} {
                LB::reselect
                if { $static::debug != 0 } {
                    log "Reselecting (default LB picked: [LB::server addr]:[LB::server port]) due to uneven load"
                }
            } else {
                table incr $static::vs+highconn
                if { $static::debug != 0 } {
                    log "Increased highconn to: [table lookup $static::vs+highconn]"
                }
            }
        } else {
            if { $static::debug != 0 } {
                log "Selected Server [LB::server addr]:[LB::server port] has: [table lookup $static::vs+[LB::server addr][LB::server port]] connections, adding another."
                log "highconn: [table lookup $static::vs+highconn]"
            }
        }
   } else {
        table set $static::vs+highconn 1
        if { $static::debug != 0 } {
            log "Pool [LB::server pool] in use with max of 1 connection"
        }
   }
   
}

when SERVER_CONNECTED {
    if {[table lookup $static::vs+[LB::server addr][LB::server port]] >= 0} {
        table incr $static::vs+[LB::server addr][LB::server port]
    } else {
        table set $static::vs+[LB::server addr][LB::server port] 1
    }
    table set -subtable $static::vs+[table lookup $static::vs+[LB::server addr][LB::server port]] [LB::server addr][LB::server port] ""
    if { $static::debug != 0 } {
        log "Connected to Pool Member: [LB::server addr]:[LB::server port] - ConnCount: [table lookup $static::vs+[LB::server addr][LB::server port]]"
    }
}

when SERVER_CLOSED {
    if { $static::debug != 0 } {
        log "[IP::server_addr]:[TCP::server_port] closed connection"
    }
    if {[table keys -subtable $static::vs+[table lookup $static::vs+[IP::server_addr][TCP::server_port]] -count] == 1 && [table lookup $static::vs+[IP::server_addr][TCP::server_port]] == [table lookup $static::vs+highconn]} {
        table incr $static::vs+highconn -1
        if { $static::debug != 0 } {
            log "Decremented highconn to: [table lookup $static::vs+highconn]"
        }
    }
    table delete -subtable $static::vs+[table lookup $static::vs+[IP::server_addr][TCP::server_port]] [IP::server_addr][TCP::server_port]
    table incr $static::vs+[IP::server_addr][TCP::server_port] -1
    if { $static::debug != 0 } {
        log "Server Connection Closed for [IP::server_addr]:[TCP::server_port] - Count: [table lookup $static::vs+[IP::server_addr][TCP::server_port]]"
    }
}
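
The LB_SELECTED handler above looks up the same table entries several times. As a minimal sketch (not tested on BIG-IP, and not the author's revision), the same decision logic can be written with each value looked up once and kept in a local variable. The variable names high_key, member_key, highconn, conns and peers are introduced here for illustration only, and the debug logging is left out for brevity.

```tcl
when LB_SELECTED {
    # Build each key once and reuse it
    set high_key   "$static::vs+highconn"
    set member_key "$static::vs+[LB::server addr][LB::server port]"

    # One table lookup per value instead of one per use
    set highconn [table lookup $high_key]
    set conns    [table lookup $member_key]

    if { $highconn eq "" } {
        # No high-water mark recorded yet: start at 1, as the original does
        table set $high_key 1
        return
    }
    if { $conns eq "" || $conns < $highconn } {
        # Selected member is below the high-water mark: keep the selection
        return
    }
    # Number of members currently recorded at this connection count
    set peers [table keys -subtable "$static::vs+$conns" -count]
    if { $peers < [active_members [LB::server pool]] && $conns != 0 } {
        # Some member has fewer connections: ask for another pick
        LB::reselect
    } else {
        # Every member is at the high-water mark: raise it
        table incr $high_key
    }
}
```

The sketch keeps the original's default table timeouts and its single LB::reselect, so it changes how often the table is queried, not what the iRule decides.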

Tested this on version:

12.0
Published Feb 06, 2017
Version 1.0

3 Comments

  • Edited the rule to remove the 86400 lifetime on table entries; the edited version uses an indefinite timeout.

  • Hi Chad,

    if you skip the timeout and lifetime parameters on the [table] command, it defaults to a 180-second timeout with an indefinite lifetime. Once the [table] data begins to time out, the iRule will produce more or less unpredictable results. To counter the timeout problem, you may want to either use long-lasting timeouts again, or initialize some after X_msec -periodic { set x [table lookup xyz] } handlers during the SERVER_CONNECTED event to refresh the counters of the individual [table] records right before they time out.

    Note: The [LB::reselect] command allows just a single reselection during the LB_SELECTED event (it's an undocumented behavior). So if the pool contains a couple of nodes or more, the chance of getting equal balancing with this iRule is rather small. E.g. if [active_members [LB::server pool]] = 10 and [table keys -subtable XYZ -count] = 9, there is only a ~10% chance that LB::reselect picks the node which currently has the fewest connections. If a wrong LB::reselect happens, the highconn [table] value is in addition not increased, causing the SERVER_CONNECTED event to overwrite existing [table -subtable] data, which will in the end cause uncounted connections and may lead to an uneven distribution.

    Note2: In addition, you may try to store the result of a [table] call in a variable and reuse this data as much as possible. The reason is that each [table] call causes a TMM parking situation and therefore slows down the iRule execution considerably. Some code blocks call the same [table] commands over and over without any reason...

    Cheers, Kai

  • Kai, I just saw your comment now (5 months later). I will take the feedback into consideration and revise the iRule. Good feedback, thank you. I realize now that I never got my head around lifetime vs. timeout. In light of this, the whole approach of the iRule might be untenable: the table entries set early on (when connection counts are low) are almost inevitably going to time out, no matter what I set the timeout to, on any virtual server where connection counts don't regularly drop to 0 (many will not).

    Re: Note2, I wasn't so worried about the expense of executing this iRule, because I think the situations that call for "improved" least connections (as opposed to the built-in) to deal with CMP idiosyncrasies are virtual servers with relatively low connection establishment rates (e.g. VDI), and all the code executes only during connection establishment and teardown.
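
As a rough illustration of the timeout discussion in the comments, the sketch below shows one way the SERVER_CONNECTED counters could be created with an explicit indefinite timeout and lifetime instead of the 180-second default timeout. It is a minimal, untested sketch rather than a revised version of the iRule; the variable names member_key and conns are assumptions for the example, and the debug logging is omitted.

```tcl
when SERVER_CONNECTED {
    set member_key "$static::vs+[LB::server addr][LB::server port]"

    # table add creates the entry only if it is missing; "indef indef" sets
    # both the timeout and the lifetime to indefinite
    table add $member_key 0 indef indef

    # table incr returns the new value, so no separate lookup is needed
    set conns [table incr $member_key]

    # Record the member in the subtable for its new connection count,
    # again without an expiry
    table set -subtable "$static::vs+$conns" "[LB::server addr][LB::server port]" "" indef indef
}
```

The highconn entry set in LB_SELECTED would need the same treatment. The trade-off is that entries which never expire also never clean themselves up, so a counter left wrong by a missed SERVER_CLOSED stays wrong until it is corrected or deleted by hand.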