iRule for web scraper block and redirection on ASM decision

I've been working on an ASM iRule to handle web scrapers and bot traffic, and I'd appreciate a review and any suggestions for improvement, whether around performance or functionality. This is one of my first attempts at an iRule, so I especially want to make sure I didn't make any egregious errors.

The iRule is designed to do the following things:

  • Redirect a client IP on a known scraper whitelist to a dedicated scraper farm
  • Redirect a client IP on a blacklist to a simple HTTP block page
  • If a client IP doesn't match the whitelist or blacklist but triggers an ASM scraper event:
      • If during a certain time of day, send the scraper to the dedicated pool
      • Otherwise, send the client to the HTTP block page

Thank you for any assistance in advance! Hope this also can help others looking to do the same.

##########################################################
# iRule Name: Scraper Block and Redirection
# Purpose: Allows whitelisted IPs, blocks blacklisted IPs,
#     and redirects detected scrapers to a secondary farm
#     during a specific time window
##########################################################

# Check the client IP against the whitelist and blacklist.
# This runs at HTTP_REQUEST rather than CLIENT_ACCEPTED because
# HTTP::redirect is only valid once an HTTP request is in flight.
when HTTP_REQUEST {
    set client_ip [IP::client_addr]

    # Check the client IP against the whitelist; on a match, always send to the scraper pool
    if { [class match $client_ip equals ip_whitelist] } {
        # Log the whitelist allow
        log local0. "Client connection allowed, client IP: $client_ip  Client on scraper whitelist"
        # Send the client to the scraper pool at all hours
        pool scrapper_pool
    # Check the client IP against the blacklist; on a match, send to the block page
    } elseif { [class match $client_ip equals ip_blacklist] } {
        # Log the blacklist block
        log local0. "Client connection blocked, client IP: $client_ip  Client on blacklist"
        # Send the client to the block page
        HTTP::redirect "http://example.com/blockpage.html"
    }
}

# If an ASM violation is a web scraping detection, log the traffic and
# direct it to the scraper pool if the request falls inside the allowed
# window; otherwise direct it to the block page
when ASM_REQUEST_VIOLATION {
    set x [ASM::violation_data]

    # Check whether the ASM violation is a web scraper detection
    if { [lindex $x 0] contains "VIOLATION_WEB_SCRAPING_DETECTED" } {
        # Log the scraper violation details
        for {set i 0} { $i < 7 } {incr i} {
            switch $i {
                0 { log local0. "violation=[lindex $x $i]" }
                1 { log local0. "support_id=[lindex $x $i]" }
                2 { log local0. "web_application=[lindex $x $i]" }
                3 { log local0. "severity=[lindex $x $i]" }
                4 { log local0. "source_ip=[lindex $x $i]" }
                5 { log local0. "attack_type=[lindex $x $i]" }
                6 { log local0. "request_status=[lindex $x $i]" }
            }
        }

        # Set the start and end time of the scraper pool window
        set static::start_time "23:00"
        set static::end_time "05:00"
        # Convert the start/end times to seconds since the epoch for easier comparisons
        set static::start [clock scan $static::start_date]
        set static::end [clock scan $static::end_date]
        # Get the current time in seconds since the Unix epoch (1970-01-01)
        set now [clock seconds]

        # If the traffic arrives during scraper pool hours, send it to the scraper pool
        if { $now > $static::start && $now < $static::end } {
            # Log the scraper redirect
            set client_ip [IP::client_addr]
            log local0. "Violation=[lindex $x 0]"
            log local0. "Decided to route scraper ip $client_ip to different pool"
            # Send the scraper to the alternate pool
            pool scrapper_pool
        } else {
            # Traffic is outside scraper pool hours; log the block
            log local0. "Violation=[lindex $x 0]"
            log local0. "Scraper traffic out of hours, sent to block page"
            # Send the client to the block page
            HTTP::redirect "http://example.com/blockpage.html"
        }
    }
}
Comments on this Discussion
Comment made 04-Mar-2014 by ericc01
Any input on this iRule?

Replies to this Discussion


Super cool iRule!

There's something awry in the time code: "start_time" and "end_time" are computed but never used, and the comparison references "start_date" and "end_date" instead. If that's a typo and you really meant "start_time", see if you can have the code reference the value directly instead of computing it on every request.
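One way to avoid recomputing the window per request, and to handle the overnight wrap (a `$now > start && $now < end` check can never be true for a 23:00-05:00 window), is to set the boundary hours once in RULE_INIT and compare the current hour of day. This is only a sketch, not a drop-in fix; the hour values assume your original 23:00-05:00 window:

```tcl
when RULE_INIT {
    # Window boundaries as hours of the day (23:00 through 04:59)
    set static::scraper_start_hour 23
    set static::scraper_end_hour 5
}

when ASM_REQUEST_VIOLATION {
    # %H gives the zero-padded hour; scan converts it to a plain integer
    scan [clock format [clock seconds] -format "%H"] %d hour
    # The || handles a window that wraps past midnight
    if { $hour >= $static::scraper_start_hour || $hour < $static::scraper_end_hour } {
        pool scrapper_pool
    } else {
        HTTP::redirect "http://example.com/blockpage.html"
    }
}
```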

Also, "scrapper" appears in at least one place.

If this gets triggered a lot, it's a two-second change to make it work with high-speed logging instead.
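Roughly like this, swapping `log local0.` for HSL; note that `syslog_pool` is a placeholder name for whatever logging pool you have defined:

```tcl
when ASM_REQUEST_VIOLATION {
    # Open a high-speed logging handle to a pool of syslog servers.
    # "syslog_pool" is hypothetical; substitute your own logging pool.
    set hsl [HSL::open -proto UDP -pool syslog_pool]
    # Send the violation summary via HSL instead of the local syslog.
    # <190> = syslog priority for facility local7, severity info.
    HSL::send $hsl "<190> scraper violation from [IP::client_addr]\n"
}
```

HSL sends log traffic off-box from TMM directly, so it avoids the disk and CPU overhead of `log local0.` under heavy violation volume.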

Whatever you do, don't redirect the scraper to your competition! Haha!
