Forum Discussion

Ross_79174
Nimbostratus
Mar 11, 2009

Help caching URLs with unique date/time stamps

Hello iRule gurus!

We have an issue on our production service where we need to cache our RSS feeds to offload our backend servers. The problem is that each RSS URL has a unique date/time stamp attached to it, so it will never "hit" in the cache since no two instances are ever the same. However, we are willing to give up some security for now and "strip" the URL so that the date/time stamp is not put into the cache key, but rather just the part of the URL that stays consistent.

The complete URL looks something like this:

/rss/Pepcom/Pepcom+June+2008?oauth_consumer_key=ced9fcdbcae5bd941f51cf82421e6413&oauth_nonce=6191575&oauth_signature_method=HMAC-SHA1&oauth_timestamp=1236751715

However, we only need this much of the URL to serve the feed properly:

/rss/Pepcom/Pepcom+June+2008?oauth_consumer_key=ced9fcdbcae5bd941f51cf82421e6413

We can drop everything from the first "&" on and still serve the RSS feed properly.
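
To sketch the intent in plain Tcl (illustrative only; the variable names are made up), the cache key we are after is everything before the first "&":

    set full_uri "/rss/Pepcom/Pepcom+June+2008?oauth_consumer_key=ced9fcdbcae5bd941f51cf82421e6413&oauth_nonce=6191575&oauth_signature_method=HMAC-SHA1&oauth_timestamp=1236751715"
    set amp [string first "&" $full_uri]
    if { $amp > -1 } {
        set cache_key [string range $full_uri 0 [expr {$amp - 1}]]
    } else {
        set cache_key $full_uri
    }
    # cache_key => /rss/Pepcom/Pepcom+June+2008?oauth_consumer_key=ced9fcdbcae5bd941f51cf82421e6413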

We would also like to keep each feed in the cache for 10 minutes before it expires.

Can anyone help?

Thanks,

Ross

3 Replies

  • Me again,

    Here is what I have started with, but from looking at the cache hits, I know the trimright is not working correctly.

    when RULE_INIT {
       set ::fifteen_minutes 900
    }
    when HTTP_REQUEST {
       if { [HTTP::uri] starts_with "/rss" } {
          persist none
          CACHE::enable
          CACHE::uri [string trimright [HTTP::uri] &]
          set cachetime $::fifteen_minutes
          pool rss
       } else {
          pool gallery
          log local0. "matches rss"
       }
    }
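
    From a quick tclsh test, it looks like string trimright strips trailing characters that belong to the given character set rather than cutting the string at the first "&", which would explain why the keys never match (made-up URI, just to illustrate):

    % string trimright "/rss/feed?oauth_consumer_key=abc&oauth_nonce=123" &
    /rss/feed?oauth_consumer_key=abc&oauth_nonce=123
    % string trimright "/rss/feed?oauth_consumer_key=abc&" &
    /rss/feed?oauth_consumer_key=abc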

    Thanks.
  • Hi Ross,

    TMM will crash if you force caching with CACHE::enable and the request does not contain a Host header (not required in HTTP/1.0) or a URI (required in all HTTP versions). This is described in SOL9617. So it would be good to add a check that the Host header value and the path both have a length:

    if { [HTTP::uri] starts_with "/rss" && [string length [HTTP::host]] && [string length [HTTP::path]] } {

    To parse the portion of the URI you mention in the first post, you can use HTTP::path to get the URI minus the query string, and URI::query to pull out just the oauth_consumer_key parameter value.

    So to get this:

    /rss/Pepcom/Pepcom+June+2008?oauth_consumer_key=ced9fcdbcae5bd941f51cf82421e6413

    You can use this:

    "[HTTP::path]?oauth_consumer_key=[URI::query [HTTP::uri] oauth_consumer_key]"

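    In the rule, that expression would go where the trimright line is now, e.g. (untested sketch, using only the commands above):

    CACHE::enable
    CACHE::uri "[HTTP::path]?oauth_consumer_key=[URI::query [HTTP::uri] oauth_consumer_key]"
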
    Lastly, 'persist none' will disable persistence for the duration of the TCP connection. If there are multiple HTTP requests over the same TCP connection, with an RSS request followed by a non-RSS request, the non-RSS request wouldn't be persisted within the gallery pool. You would want to explicitly set persistence in both cases: persist none for the RSS requests and an explicit persist command for the gallery requests.
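
    For example (cookie persistence for the gallery pool is only an assumption here; substitute whatever persistence that application actually needs):

    if { [HTTP::uri] starts_with "/rss" } {
       persist none
       pool rss
    } else {
       persist cookie insert
       pool gallery
    }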

    Aaron
  • Hi Aaron,

    Thanks for your prompt response. The string manipulations all worked and we are now happily caching RSS feeds.

    I appreciate your help!

    -Ross