Forum Discussion

Dave_24664
Dec 11, 2007

problem with Eduardo Saito irule_limit_num_connections_googlebot

I am new.

I tried just cutting and pasting Eduardo Saito's winning iRule into my 6400, but I get an error.

The code is:

when RULE_INIT {
   array set ::active_clients { }
}
when CLIENT_ACCEPTED {
   switch -glob [string tolower [HTTP::header "User-Agent"]] {
      "*googlebot*" {
         set client_ip [IP::remote_addr]
         if { [info exists ::active_clients($client_ip)] } {
            if {$::active_clients($client_ip) > 10 } {
               reject
               log local0. "Reject GOOGLEBOT IP $client_ip ($::active_clients($client_ip))"
               return
            } else {
               incr ::active_clients($client_ip)
            }
         } else {
            set ::active_clients($client_ip) 1
         }
      }
   }
}
when CLIENT_CLOSED {
   switch -glob [string tolower [HTTP::header "User-Agent"]] {
      "*googlebot*" {
         set client_ip [IP::remote_addr]
         if { [info exists ::active_clients($client_ip)] } {
            incr ::active_clients($client_ip) -1
            if { $::active_clients($client_ip) <= 0 } {
               unset ::active_clients($client_ip)
            }
         }
      }
   }
}

 

 

The error I get is:

 

 

 

01070151:3: Rule [googlebot] error: line 6: [command is not valid in current event context CLIENT_ACCEPTED] [HTTP::header User-Agent] line 25: [command is not valid in current event context CLIENT_CLOSED] [HTTP::header User-Agent]

 

 

 

What am I doing wrong?

8 Replies

  • Thanks...

    I'm surprised that it won with something that doesn't work.

     

  • Yes, there's a bit of irony. I'm not sure if it was a typo or what. Maybe someone could update it with the correct event name.

    Aaron
  • Here is an updated version based on Eduardo's which sends back a 503 response to a bot which exceeds the maximum number of concurrent HTTP requests. It passes the syntax check, but isn't tested.

    
    when RULE_INIT {
       # Maximum number of concurrent HTTP requests
       set ::max_conc_http_requests 10
       # Response content to send to a client which exceeds maximum number of concurrent HTTP requests
       set ::response_content "Some titleRetry later"
       # Initialize an empty array to track bot IP addresses and current HTTP request counts
       array set ::active_clients { }
    }
    when HTTP_REQUEST {
       # Look for bots by their User-Agent string
       switch -glob [string tolower [HTTP::header "User-Agent"]] {
          "*somebot*" -
          "*googlebot*" {
             # Save the client IP; the variable's existence also marks this as a bot
             set client_ip [IP::client_addr]
             # Check if there is an existing entry in the array for this bot IP
             if { [info exists ::active_clients($client_ip)] } {
                # Check if the bot already has more than the maximum number of concurrent requests
                if {$::active_clients($client_ip) > $::max_conc_http_requests } {
                   # Log an entry to syslog-ng
                   log local0. "Reject GOOGLEBOT IP $client_ip ($::active_clients($client_ip))"
                   # Send a 503 status back to client
                   HTTP::respond 503 content $::response_content
                } else {
                   # Bot IP exists in the array, but the client is under the max
                   incr ::active_clients($client_ip)
                }
             } else {
                # Bot IP doesn't exist in the array so add it
                set ::active_clients($client_ip) 1
             }
          }
       }
    }
    when HTTP_RESPONSE {
       # Check if this is a response to a bot and that the IP exists in the array
       if { [info exists client_ip] and [info exists ::active_clients($client_ip)] } {
          # Decrement the count in the array
          incr ::active_clients($client_ip) -1
          if { $::active_clients($client_ip) <= 0 } {
             # Delete the array entry once the count drops to zero
             unset ::active_clients($client_ip)
          }
       }
    }

    Aaron
  • This just doesn't work for me.

    I put in some debugging, and it appears that the counter goes up and up, but never really gets decremented.

    And once you are in the penalty box, i.e. you are above the limit, you are done until you delete the array containing the counters.

    It's almost as if HTTP_RESPONSE doesn't happen for all HTTP_REQUESTs.

     

  • Ok, update.

    It only happens if I have a caching profile selected on the virtual server, e.g.:

    http-lan-optimized-caching

    http-wan-optimized-compression-caching

    If I use just the compression ones, it works fine.

    I guess my next question is: can you turn off caching for bots?
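
    On the question above, one option might be to bypass the cache for bot requests, so that every counted request reaches the pool and triggers HTTP_RESPONSE. The fragment below is a minimal, untested sketch. It assumes the CACHE::disable command is allowed in the HTTP_REQUEST event on your LTM version, which is worth confirming against the wiki before relying on it.

    ```tcl
    when HTTP_REQUEST {
       # Match bots by User-Agent, the same way the rule above does
       switch -glob [string tolower [HTTP::header "User-Agent"]] {
          "*googlebot*" {
             # Assumption: CACHE::disable is valid in HTTP_REQUEST on this version.
             # It asks the caching profile to skip the cache for this request only.
             CACHE::disable
          }
       }
    }
    ```

    The trade-off is that bot requests would then always hit the web servers, even for content that could have been served from cache.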

     

  • I hadn't considered LTM caching with that rule. I haven't worked with caching events in iRules. I would assume you wouldn't want to limit requests which would be answered from cache, as there isn't a resource hit on the web servers in the pool. If that's the case, you'd need to figure out whether the request was going to be answered from cache and either not increment the request counter, or decrement it on the cache response event. Reading over the wiki pages for the cache events, it isn't clear to me what triggers them.

     

     

    Can anyone shed light on how the HTTP_REQUEST/HTTP_RESPONSE and CACHE_REQUEST/CACHE_RESPONSE events work together when you have caching enabled on the HTTP profile? I would assume the HTTP_REQUEST event is triggered first. Could you determine in the HTTP_REQUEST event whether the request will be answered from cache? If not, I would think you'd want to increment the request counter in HTTP_REQUEST and decrement the counter in both CACHE_RESPONSE and HTTP_RESPONSE (assuming only one would possibly be called per request).

     

     

    Aaron
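
The idea in the last reply (count in HTTP_REQUEST, release in whichever response event fires) could be sketched roughly as follows. This is untested and rests on assumptions the thread does not confirm: that exactly one of HTTP_RESPONSE or CACHE_RESPONSE fires per request, and that connection-local variables are still readable in CLIENT_CLOSED. The `counted` flag is a hypothetical name introduced here so a request is never released twice, and so a non-bot request on the same keep-alive connection does not decrement the counter.

```tcl
when RULE_INIT {
   # Same globals as in the rule posted earlier in the thread
   set ::max_conc_http_requests 10
   array set ::active_clients { }
}
when HTTP_REQUEST {
   # Hypothetical per-request flag: 1 only while this request holds a slot
   set counted 0
   switch -glob [string tolower [HTTP::header "User-Agent"]] {
      "*googlebot*" {
         set client_ip [IP::client_addr]
         if { ![info exists ::active_clients($client_ip)] } {
            set ::active_clients($client_ip) 0
         }
         if { $::active_clients($client_ip) > $::max_conc_http_requests } {
            # Over the limit: answer directly and do not take a slot
            HTTP::respond 503 content "Retry later"
         } else {
            incr ::active_clients($client_ip)
            set counted 1
         }
      }
   }
}
when HTTP_RESPONSE {
   # Response came from a pool member: release the slot if one is held
   if { [info exists counted] && $counted } {
      set counted 0
      if { [info exists ::active_clients($client_ip)] } {
         incr ::active_clients($client_ip) -1
         if { $::active_clients($client_ip) <= 0 } {
            unset ::active_clients($client_ip)
         }
      }
   }
}
when CACHE_RESPONSE {
   # Assumption: this fires instead of HTTP_RESPONSE when the cache answers
   if { [info exists counted] && $counted } {
      set counted 0
      if { [info exists ::active_clients($client_ip)] } {
         incr ::active_clients($client_ip) -1
         if { $::active_clients($client_ip) <= 0 } {
            unset ::active_clients($client_ip)
         }
      }
   }
}
when CLIENT_CLOSED {
   # Safety net: the client went away before any response event fired
   if { [info exists counted] && $counted } {
      set counted 0
      if { [info exists ::active_clients($client_ip)] } {
         incr ::active_clients($client_ip) -1
         if { $::active_clients($client_ip) <= 0 } {
            unset ::active_clients($client_ip)
         }
      }
   }
}
```

If the goal is instead to leave cache hits uncounted, as suggested above, this sketch still counts them for the short time between HTTP_REQUEST and CACHE_RESPONSE; whether that matters depends on how quickly the cache answers.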