Forum Discussion

Mike_Lowell_456
Historic F5 Account
May 08, 2006

URI decoding H-E-DOUBLE-HOCKEY-STICKS

I'm trying to compare the HTTP URI that I receive in a request to an HTTP URI string of my choosing in a way that's robust (i.e. "%46oo" == "foo").

In this case I have Windows servers, so not only do I want to decode %xx notation (and +), I also want to compare in a way that's case-insensitive, i.e. foO, %46oo and so on.
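
For illustration, the comparison described above can be sketched in plain Tcl. This is a minimal sketch, not the behavior of URI::decode or URI::compare: the helper names uri_decode and uri_equal_nocase are hypothetical, the code assumes single-byte characters, and it treats "+" as a space everywhere (which strictly applies only to query strings). It is written as procs so it can be tried in tclsh; as far as I know `proc` is not available inside an iRule on 9.x, so the body would have to be inlined there.

```tcl
# Hypothetical helper: percent-decode a string, treating "+" as a space.
# Malformed escapes (e.g. "%zz" or a trailing "%4") are passed through as-is.
proc uri_decode {s} {
    set s [string map {+ " "} $s]
    set out ""
    set len [string length $s]
    for {set i 0} {$i < $len} {incr i} {
        set c [string index $s $i]
        set hex [string range $s [expr {$i + 1}] [expr {$i + 2}]]
        if {$c eq "%" && [regexp {^[0-9A-Fa-f]{2}$} $hex]} {
            # Convert the two hex digits to the character they encode.
            scan $hex %x code
            append out [format %c $code]
            incr i 2
        } else {
            append out $c
        }
    }
    return $out
}

# Hypothetical helper: compare two URIs after decoding, ignoring case.
proc uri_equal_nocase {a b} {
    return [string equal -nocase [uri_decode $a] [uri_decode $b]]
}
```

With these definitions, `uri_equal_nocase "/%46oO" "/foo"` is true, because %46 decodes to "F" and the comparison ignores case.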

I'm running BIG-IP version 9.2.2, and I've hit a few snags. I hope you can direct me to better solutions to my problems.

Issues (in no particular order):

1) URI::compare doesn't have a "-nocase" option. This means it's not useful for comparisons with Windows servers, where "/foo" == "/FoO". Given that IIS runs >30% of the internet, it seems like a good enhancement for iRules.

http://news.netcraft.com/archives/web_server_survey.html

2) URI::encode doesn't work for HTTP::uri values. If you encode "/foo/a.html", you will get "%2ffoo%2fa.html". I think there should be an option that doesn't encode slashes in the HTTP path, doesn't encode the ? query separator, and doesn't encode & or = in the HTTP query. The current form of URI::encode cannot be used on HTTP::uri values because this encoding makes the URI invalid. For example, if you send "GET %2f HTTP/1.0\r\n\r\n" to an Apache server, it will give you a 404. The same problem applies to ?, &, and =.
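
To make the request in item 2 concrete, here is a minimal sketch in plain Tcl of an encoder that leaves "/", "?", "&" and "=" alone. It is not an existing URI::encode option; the name uri_encode_keep_delims is hypothetical, and the code assumes single-byte characters. Note the trade-off: a literal "&" or "=" inside a decoded query value is also left unencoded, so the result can be ambiguous for such values.

```tcl
# Hypothetical helper: percent-encode every character except the RFC 3986
# unreserved set (letters, digits, "-", ".", "_", "~") and the delimiters
# "/", "?", "&" and "=" that give an HTTP URI its structure.
proc uri_encode_keep_delims {s} {
    set out ""
    foreach c [split $s ""] {
        if {[regexp {^[A-Za-z0-9._~/?&=-]$} $c]} {
            append out $c
        } else {
            # %c in scan yields the character code without skipping spaces.
            scan $c %c code
            append out [format %%%02X $code]
        }
    }
    return $out
}
```

Under these assumptions, "/foo/a.html" comes back unchanged, while a space in "/foo/a b.html?x=1&y=2" becomes %20 and the rest of the string is untouched.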

3) URI::path doesn't seem to include the filename as part of the PATH, i.e. a request for "/foo/a.html" returns "/foo/" as the URI::path. This seems incorrect based on my reading of RFC2616 section 3.2.2 and RFC2396 section 3 -- they seem pretty clear in indicating that "PATH" is everything before the query separator (if one exists). They have no concept of "basename" as used in iRules. Normally I would just use HTTP::path (which is right), but in this case I need to extract the PATH from a variable, so URI::path is required.
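
The reading of "PATH" in item 3 (everything before the query separator) can be sketched in plain Tcl as follows. The name uri_path_full is hypothetical and this is not how URI::path is implemented; it assumes the input is a request URI such as "/foo/a.html?x=1" with no scheme or host part.

```tcl
# Hypothetical helper: return everything before the first "?",
# or the whole string when there is no query separator.
proc uri_path_full {uri} {
    set q [string first "?" $uri]
    if {$q < 0} {
        return $uri
    }
    return [string range $uri 0 [expr {$q - 1}]]
}
```

Unlike the URI::path result described above, this keeps the filename: "/foo/a.html?x=1" yields "/foo/a.html".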

Because of these issues, the iRule that I've written is much longer than I had hoped and seems kludgy. I hope you can advise better methods for me.

BTW, please know that this iRule is intentionally verbose / declares many variables / is slow. It's not written for performance at all, and this is okay for now (I'll fix that later).


when HTTP_REQUEST {
  # grab a copy of the URI as it exists in the request
  set raw_uri [HTTP::uri]
  # decode the URI for later comparison
  set decoded_uri [URI::decode $raw_uri]
  # normalize the URI to lower-case
  set uri [string tolower $decoded_uri]
  if { $uri starts_with "/foo/" } {
    # extract the part of the URI after /foo/
    set end_of_uri [string range $uri 5 end]
    # create the new URI with /bar/, preserving the rest of the URI
    set uri "/bar/$end_of_uri"
    # encode the URI (important so the webserver doesn't end up parsing double-encoded content)
    set encoded_uri [URI::encode $uri]
    log "ORIGINAL ENCODED_URI: $encoded_uri"
    # fix over-encoding so I can parse PATH and QUERY from $encoded_uri
    set encoded_uri [string map {%2f / %3f ?} $encoded_uri]
    # fix over-encoding of QUERY
    set path [URI::path $encoded_uri]
    set basename [URI::basename $encoded_uri]
    set query [URI::query $encoded_uri]
    if { $query ne "" } {
      set query [string map {%3d = %20 + %26 & / %2f} $query]
      set new_uri "$path$basename?$query"
    } else {
      set new_uri "$path$basename"
    }
    log "ORIGINAL URI: $raw_uri, ENCODED_URI: $encoded_uri, PATH: $path, QUERY: $query, NEW_URI: $new_uri, BASENAME: $basename"
    # change the URI in the request to the new URI that points to /bar/
    HTTP::uri $new_uri
  }
}

I appreciate any advice!!

BTW, if there are no better ways to achieve my goal, can BIG-IP be enhanced to include new functionality like I describe above? Thanks!

a1l0s2k9

3 Replies

  • To eliminate case, use the standard Tcl command [string tolower $string]

    I'll defer to the developers for your other questions.
  • Mike_Lowell_456
    Historic F5 Account
    Yes, I know. As you can see in the iRule provided above, I do use [string tolower ...]. However, this doesn't help in the case of URI::compare where the strings must be decoded (internally) before they can be made lower-case for the purposes of comparison.

    In thinking about the problem more, it seems it would be very nice if URI::compare not only had a "-nocase" option, but also had an option to specify the comparison operator like matchclass (i.e. starts_with, equals, ...).

    An additional idea comes to mind as well. Knowing that the starts_with operator is probably used most often to compare directory names in the HTTP URI, it would be nice if there was a starts_with_dir operator that eliminated the need for two conditions in a single "if".

    Instead of this:

    
      if { $uri starts_with "/foo/" || $uri equals "/foo" } {

    It would be just this instead:

    
      if { $uri starts_with_dir "/foo" } {
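
For illustration, the proposed starts_with_dir semantics can be sketched as a plain Tcl proc. No such operator exists in iRules; the proc name simply mirrors the proposal, and the sketch assumes the URI has already been decoded and lower-cased and carries no query string.

```tcl
# Hypothetical helper: true when $uri is exactly the directory $dir
# or lies beneath it. A trailing "/" on $dir is ignored, so "/foo"
# and "/foo/" behave the same.
proc starts_with_dir {uri dir} {
    set dir [string trimright $dir "/"]
    if {$uri eq $dir} {
        return 1
    }
    # "$dir/" must appear at index 0, so "/foobar" does not match "/foo".
    return [expr {[string first "$dir/" $uri] == 0}]
}
```

This matches "/foo" and "/foo/a.html" for the directory "/foo" but rejects "/foobar", which is the same result as the two-part condition shown above.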

    It seems that if URI::compare could be extended a bit, my iRule could be greatly simplified and robust URI comparisons could be made very easily. This seems like a good area to enhance iRules, since they are used so much for parsing the HTTP URI. As I read a lot of the forum posts here, I'm surprised to see how few people take URI encoding into account. I wonder how many administrators are parsing request URIs in a way that can be defeated with simple URI encoding (and I wonder for how many this exposes data they didn't intend to expose).

    Lastly, it seems the iRules wiki example for URI::compare maybe isn't right?

    http://devcentral.f5.com/wiki/default.aspx/iRules/URI__compare.html

    It shows a comparison between "http://www.foo.com/somepath" and [HTTP::uri] (which would only contain the "/somepath" part of the request). If URI::compare really does use RFC2616 section 3.2.3 for comparison, it seems this should fail.

  • Sorry, I was looking at your question and breezed by your code. What you've done with your rule is the path I would have gone down as well.