Rewrite a uri (not redirect)

I have a need to rewrite a uri, and make this look transparent to the end user. What I have come up with is this:

   
when HTTP_REQUEST {
set loop 0
set max [llength $::shorturi]
while {$loop < $max}{
set tmpstr [lindex $::shorturi $loop]
if {[HTTP::uri] starts_with $tmpstr}{
set uri1 {findstr [http_uri] $tmpstr}
set uri2 {regexp -all -inline {$tmpstr::longuri}}
subst {[HTTP::uri] $uri2}
}
}
}


I have two Data Groups - long_uris and short_uris:

short_uris:

someuri1
someuri2

long_uris:

rewrite/someuri1
rewrite/someuri2

the end of the long uri will always match what the short uri is

I have not had a chance to test this yet - still building out the test environment, but it does not kick any errors on the LTM. I am just looking for some input - as I am not a developer (I'm a network guy).

So - to re-iterate - the long term goal is this:

user comes in to http://www.somesite.com/someuri
and the LTM rewrites the uri on the backside to:
http://www.somesite.com/rewrite/someuri (the rewrite portion is much much longer in the real data)

Thanks in advance

Answers to this Question

USER ACCEPTED ANSWER & F5 ACCEPTED ANSWER
I think there are a number of things that can be improved in your example. But, let's not start there yet.

First, let's gather more information about what you're trying to do.

It sounds like you want to insert some path data before certain uris.
So, the first question that comes to mind is whether the inserted rewrite text is different depending on the original uri?

Second, I'm not quite sure how/why you are relating the two datagroups with each other. Perhaps you really just want one datagroup with two fields.

For example, I might have a datagroup that looks like this:
 
class my_uri_mappings {
"/index.html /some/place/for/html"
"/pretty.gif /some/place/for/images"
"/unknown.html /some/place/for/hackers"
}

I might then have a rule that looks up the uri in the datagroup and inserts the second portion of the matching entry. This example relies on the fact that when the findclass command is supplied with a separator (the 3rd argument), it returns the latter portion of the entry that matches the beginning portion:
 
when HTTP_REQUEST {
    # Look up the uri; with the " " separator, findclass returns the 2nd field
    set rewrite [findclass [HTTP::uri] $::my_uri_mappings " "]
    if { $rewrite ne "" } {
        # Prepend the rewrite path (plain concatenation, no separating space)
        HTTP::uri "$rewrite[HTTP::uri]"
    }
}

The next example relies on the fact that the matchclass command returns the list index + 1 of a matching entry. Since it doesn't appear that you are actually rewriting the original portion of the uri, but only inserting a leading string, this approach could also be used:
 
class my_uri_mappings {
"/some/place/for/html/index.html"
"/some/place/for/images/pretty.gif"
"/some/place/for/hackers/unknown.html"
}

 
when HTTP_REQUEST {
    # matchclass returns 0 for no match, else the matching entry's index + 1
    set rewrite [matchclass $::my_uri_mappings ends_with [HTTP::uri]]
    if { $rewrite } {
        HTTP::uri [lindex $::my_uri_mappings [expr {$rewrite - 1}]]
    }
}

USER ACCEPTED ANSWER & F5 ACCEPTED ANSWER
Again - this just shows why I am not a programmer - I used the second example, and that works rather well in all respects except one: I don't want the client to see the new uri - I just want it to be passed on the back end. Any way we can do that?

Thanks for the examples
USER ACCEPTED ANSWER & F5 ACCEPTED ANSWER
I did a little more testing - works great if the uri is equal to something in the data group. In my case - if the uri starts with something in the data group, I want to rewrite that portion of the uri, example:

http://www.somesite.com/uri/index.html

would become

http://www.somesite.com/rewrite/uri/index.html

I always need to preserve and pass the "end" of the uri, if the beginning matches something in the data group.

USER ACCEPTED ANSWER & F5 ACCEPTED ANSWER
Just an update - here is the final rule that I ended up going with:

Assuming I have a data group called uris formatted like this:
/uri1 /longuri/directoryone
/uri2 /longuri/directorytwo
/uri3 /longuri/someotherdirectory


   
when HTTP_REQUEST {
if { [getfield /[findclass [findstr [http_uri] "" 1 "/"] $::uris] " " 1] ne ""}{
HTTP::uri "[findclass /[findstr [http_uri] "" 1 "/"] $::uris " " ][HTTP::uri]"
}
}


USER ACCEPTED ANSWER & F5 ACCEPTED ANSWER
Nice work!

I would suggest the following subtle optimizations:

1) Extract the first directory from the uri just once and save it in a variable.
2) Use matchclass which is far more efficient than findclass to determine if the class actually contains the first directory.

 
when HTTP_REQUEST {
    # Extract first directory from uri (double quotes so the command is substituted)
    set first_dir "/[findstr [HTTP::uri] "" 1 "/"]"
    # See if class contains an entry matching the first directory
    if { [matchclass $::uris starts_with $first_dir] } {
        # Retrieve the 2nd field of the class member and rewrite the uri
        HTTP::uri "[findclass $first_dir $::uris " "][HTTP::uri]"
    }
}

USER ACCEPTED ANSWER & F5 ACCEPTED ANSWER
For the non-coders on this forum, (ok, idiot ME!) can you explain why that is an optimized version of the previous example? Thanks.
USER ACCEPTED ANSWER & F5 ACCEPTED ANSWER
Well, I thought I explained that... I'll try again...

1) Instead of searching/parsing the first level directory twice, only do it once and save it in a variable.

2) Use matchclass instead of findclass to check whether the class contains the directory. Although in reviewing this, it would probably be even more efficient to only call findclass once.

In general, doing searches or parsing only once and saving the result in a variable is more optimal than doing the same thing multiple times.

Here's an even further optimization that only searches the class once.
 
when HTTP_REQUEST {
    # Extract first directory from uri (double quotes so the command is substituted)
    set first_dir "/[findstr [HTTP::uri] "" 1 "/"]"
    # Look up the class entry for the first directory (empty string if none)
    set rewrite_dir [findclass $first_dir $::uris " "]
    if { $rewrite_dir ne "" } {
        # Prepend the 2nd field of the matching entry to the original uri
        HTTP::uri "$rewrite_dir[HTTP::uri]"
    }
}
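
For anyone who wants to check the lookup logic away from the LTM, here is a rough plain-Tcl sketch of the same idea that should run in an ordinary tclsh. The proc name rewrite_uri and the list standing in for the uris data group are made up for illustration. It also compares the first field exactly, which is a simplification and may not match how findclass behaves on the device.

```tcl
# Hypothetical stand-in for the "uris" data group: each entry is "<short> <long>".
set uris {
    "/uri1 /longuri/directoryone"
    "/uri2 /longuri/directorytwo"
    "/uri3 /longuri/someotherdirectory"
}

# Return the rewritten uri, or the original uri when no entry matches.
proc rewrite_uri {uri uris} {
    # First path segment, e.g. "/uri1" from "/uri1/index.html".
    set rest [string range $uri 1 end]
    set slash [string first "/" $rest]
    if {$slash >= 0} {
        set rest [string range $rest 0 [expr {$slash - 1}]]
    }
    set first_dir "/$rest"

    foreach entry $uris {
        # Split each entry on the " " separator: short form, then long form.
        lassign [split $entry " "] short long
        if {$short eq $first_dir} {
            # Prepend the long path and keep the whole original uri.
            return "$long$uri"
        }
    }
    return $uri
}

puts [rewrite_uri "/uri1/index.html" $uris]
puts [rewrite_uri "/other/index.html" $uris]
```

The first call should print the uri with /longuri/directoryone in front of it, and the second should print its input unchanged, since /other is not in the list.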


USER ACCEPTED ANSWER & F5 ACCEPTED ANSWER
So - setting a variable before a rule matches will not cause additional overhead when processing a rule? My thought was I did not want to set a variable, unless the traffic passed my if statement - and at that point - I figured, why set a variable at all, if I can just splat it with one line. I will try the latest example once my test environment is back online and let everyone know how it goes.

Thanks again for all the pointers and help!
USER ACCEPTED ANSWER & F5 ACCEPTED ANSWER
That is a very good point. You have obviously thought about this. Of course, it will all really depend on just how often you expect to match. If it does not match often, then you are completely correct. If it matches regularly, then you would likely want to save the result in a variable. Another factor to weigh is the number of elements in the class/datagroup.

For those that are interested and paying attention, I'm now going to mention a YASF (yet another stealth feature):

You can enable timing statistics in a rule which will allow you to see just how many cycles are spent evaluating a given rule event. The way you do this is with the "timing on" statement.

An example that enables timing for all subsequent events in a rule is:
   
rule my_fast_rule {
timing on
when HTTP_REQUEST {
# Do some stuff
}
}

An example of only timing a specific event is:
   
rule my_slow_rule {
when HTTP_REQUEST timing on {
# Do some other stuff
}
}

This will then collect timing information each time the rule is evaluated and can be viewed with "b rule <name> show all". You'll likely only want to look at the average and min numbers as max is often way, way out there due to the optimizations being performed on the first run of the rule. Additionally, enabling timing does have some overhead, though it should be negligible.

USER ACCEPTED ANSWER & F5 ACCEPTED ANSWER
This rule will be used on a very busy e-comm type site. My understanding is they peak at 52 million unique page clicks per day, during the holiday season, with 20 million unique page clicks being the average . . .

My understanding of HTTP_REQUEST is that it will be evaled for every client header request - so pretty much every "hit" to these sites will use this rule - correct?
USER ACCEPTED ANSWER & F5 ACCEPTED ANSWER
Ok.

Here's another way to look at that traffic:

With the busiest traffic being 52 million page clicks (we will assume this actually refers to unique HTTP requests), then that works out to be an average of 602 req/sec assuming around the clock clicking, or an average of 1806 req/sec if concentrated into an 8 hour busy period, or an average of 14444 req/sec if all 52 million clicks were concentrated into 1 hour. Any of these levels of traffic are well below the upper limits of v9 software on most of our platforms.
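
If you want to redo that arithmetic with your own numbers, a quick plain-Tcl sketch (run in tclsh, not as an iRule) could look like this. The proc name avg_rate is made up for illustration, and the three time windows are simply the ones assumed above.

```tcl
# Average requests per second if all clicks fall within the given window.
proc avg_rate {clicks seconds} {
    return [expr {double($clicks) / $seconds}]
}

set clicks 52000000 ;# peak daily figure quoted in this thread
foreach {label seconds} {"24 hours" 86400 "8 hours" 28800 "1 hour" 3600} {
    puts [format "%-9s %8.1f req/sec" $label [avg_rate $clicks $seconds]]
}
```

Swap in the 20 million average figure, or a different busy window, to see how the rate moves.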

I'm not sure where we were going with this, but it doesn't sound like you've got a whole lot to be concerned about other than writing a really, really crappy rule (which you have by no means done). The idea behind making an efficient rule is not always how much it costs the current connection, but how much it costs the entire system, which might be processing lots of different sites. In that case you obviously want to make it as efficient as possible.

It sounds like you are the right person to make the decision as you know the traffic patterns the best and also know the pros and cons to doing it either way. I was just trying to point out the subtle differences.

To finish, your understanding of HTTP_REQUEST is correct. It is evaluated on every client request whether it's the first request on a new connection or a subsequent request on an existing connection.

On an off topic side note: you could use "event HTTP_REQUEST disable" if you actually wanted to turn off this rule event for subsequent requests.
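
As a rough sketch of that side note, assuming you only care about the first request on each connection (probably not what you want for the rewrite rule in this thread, since later requests on a keep-alive connection would then skip it):

```tcl
when HTTP_REQUEST {
    # Whatever should happen for the first request on this connection goes here.

    # Stop HTTP_REQUEST from firing for later requests on the same connection.
    event HTTP_REQUEST disable
}
```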

Good luck and let us know how things work out.