Forum Discussion

Blaisure_212538
Nimbostratus
Mar 15, 2016

Web Scraping and Bot Detection

We have tried several times, in several different ways, to implement bot detection and web scraping prevention to keep unwanted bots and attackers at bay.

 

Problems: The bot detection uses JavaScript that causes 302s, which has adverse SEO implications. The web scraping profile seems to block legitimate traffic regardless of how high we set the thresholds on transaction anomalies. If we try to block based on session opening, we don't seem to catch anything.

 

Is there any way within ASM to capture the actual "session count" for a normalized user? We use IBM's Tealeaf, and apparently it interprets sessions differently than ASM does.

 

1 Reply

  • BinaryCanary_19
    Historic F5 Account

    With web scraping, you can configure DNS on the system and add known search engines to your allowed list. As long as DNS validation succeeds, legitimate search engine bots will not get any hindrance from ASM. See https://support.f5.com/kb/en-us/products/big-ip_asm/manuals/product/asm-implementations-11-6-0/5.html?sr=52379255

     

    Session Opening Anomaly is useful for clients that drop cookies. A client that discards cookies gets a new session every time it does so, and this allows you to configure thresholds that will affect such clients.

     

    Note that issuing a JavaScript challenge is not necessarily blocking: if the client supports JavaScript, it should simply execute the challenge. Bots tend not to execute JavaScript, so from that point of view, they are blocked. Real users should find it easy to get through the JavaScript challenge successfully, and like I mentioned earlier, real search engines are exempt, as long as their source IP addresses match what ASM sees in DNS.
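
To make the DNS validation idea above more concrete, here is a rough Python sketch of a forward-confirmed reverse DNS check, which is the general technique commonly used to verify search engine crawlers. It is not ASM's internal implementation. The function name `is_verified_crawler`, the `ALLOWED_SUFFIXES` list, and the injectable `reverse_lookup` / `forward_lookup` parameters are all assumptions made for illustration.

```python
import ipaddress
import socket

# Illustrative allow list of crawler hostname suffixes (an assumption, not
# taken from ASM). Must be a tuple so str.endswith() accepts it.
ALLOWED_SUFFIXES = (".googlebot.com", ".google.com", ".search.msn.com")


def _reverse(ip):
    """PTR lookup: return the primary hostname for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def _forward(hostname):
    """A/AAAA lookup: return the set of addresses for a hostname."""
    return {info[4][0] for info in socket.getaddrinfo(hostname, None)}


def is_verified_crawler(ip, suffixes=ALLOWED_SUFFIXES,
                        reverse_lookup=None, forward_lookup=None):
    """Return True if ip passes a forward-confirmed reverse DNS check."""
    reverse_lookup = reverse_lookup or _reverse
    forward_lookup = forward_lookup or _forward

    # Step 1: reverse lookup. No PTR record means the claim is unverified.
    try:
        hostname = reverse_lookup(ip)
    except OSError:
        return False

    # Step 2: the hostname must belong to an allowed search engine domain.
    hostname = hostname.rstrip(".").lower()
    if not hostname.endswith(suffixes):
        return False

    # Step 3: forward lookup must map back to the original source address,
    # otherwise anyone controlling a PTR record could spoof the hostname.
    try:
        addresses = forward_lookup(hostname)
    except OSError:
        return False
    target = ipaddress.ip_address(ip)
    return any(ipaddress.ip_address(addr) == target for addr in addresses)
```

Calling `is_verified_crawler("66.249.66.1")` would perform live DNS lookups, so the result depends on your resolver. The lookup functions are injectable mainly so the logic can be exercised without network access.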