hooleylist
Dec 11, 2007 (Cirrostratus)
Here is an updated version of Eduardo's rule that sends a 503 response to any bot that exceeds the maximum number of concurrent HTTP requests. It passes the syntax check, but hasn't been tested.
when RULE_INIT {
    # Maximum number of concurrent HTTP requests
    set ::max_conc_http_requests 10
    # Response content to send to a client which exceeds the maximum number of concurrent HTTP requests
    set ::response_content "<html><head><title>Some title</title></head><body>Retry later</body></html>"
    # Initialize an empty array to track bot IP addresses and current HTTP request counts
    array set ::active_clients { }
}
when HTTP_REQUEST {
    # Look for bots by their User-Agent string
    switch -glob [string tolower [HTTP::header "User-Agent"]] {
        "*somebot*" -
        "*googlebot*" {
            # Save the client IP; its presence marks this connection as a bot
            set client_ip [IP::client_addr]
            # Check if there is an existing entry in the array for this bot IP
            if { [info exists ::active_clients($client_ip)] } {
                # Check whether the bot has already reached the limit on concurrent requests
                if { $::active_clients($client_ip) >= $::max_conc_http_requests } {
                    # Log an entry to syslog-ng
                    log local0. "Rejecting bot IP $client_ip ($::active_clients($client_ip))"
                    # Send a 503 status back to the client
                    HTTP::respond 503 content $::response_content
                } else {
                    # Bot IP exists in the array, and the client is under the max
                    incr ::active_clients($client_ip)
                }
            } else {
                # Bot IP doesn't exist in the array so add it
                set ::active_clients($client_ip) 1
            }
        }
    }
}
when HTTP_RESPONSE {
    # Check if this is a response to a bot and that the IP exists in the array
    if { [info exists client_ip] and [info exists ::active_clients($client_ip)] } {
        # Decrement the count in the array
        incr ::active_clients($client_ip) -1
        if { $::active_clients($client_ip) <= 0 } {
            # Remove the array entry once the bot has no requests in flight
            unset ::active_clients($client_ip)
        }
    }
}
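Since the rule hasn't been run against live traffic, here is a rough way to exercise the counting logic in plain tclsh, with no BIG-IP commands involved. This is only a sketch of the per-IP counter: request_start and request_end are made-up helper names standing in for the HTTP_REQUEST and HTTP_RESPONSE events, and 10.0.0.1 is an arbitrary test address. The comparison here is >=, so exactly max_conc_http_requests requests are admitted; a strict > would admit one more.

```tcl
# Plain-Tcl model of the per-IP concurrent request counter.
# request_start / request_end are hypothetical helpers, not iRule commands.
set max_conc_http_requests 10
array set active_clients {}

# Models HTTP_REQUEST: returns 1 if admitted, 0 if the 503 would be sent.
proc request_start {ip} {
    global active_clients max_conc_http_requests
    if {[info exists active_clients($ip)]} {
        if {$active_clients($ip) >= $max_conc_http_requests} {
            return 0
        }
        incr active_clients($ip)
    } else {
        set active_clients($ip) 1
    }
    return 1
}

# Models HTTP_RESPONSE: decrement and drop the entry when it reaches zero.
proc request_end {ip} {
    global active_clients
    if {[info exists active_clients($ip)]} {
        incr active_clients($ip) -1
        if {$active_clients($ip) <= 0} {
            unset active_clients($ip)
        }
    }
}

# Open 12 requests from one address without completing any of them.
set admitted 0
set rejected 0
for {set i 0} {$i < 12} {incr i} {
    if {[request_start 10.0.0.1]} {incr admitted} else {incr rejected}
}
puts "admitted=$admitted rejected=$rejected"
# prints "admitted=10 rejected=2"
```

Calling request_end once for that address frees a slot, so the next request_start is admitted again. The model does not cover connections that close before a response arrives, which in the real rule would leave the count too high.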
Aaron