Forum Discussion

Jamey_Price_105
Jun 21, 2006

Trying to rewrite page content

I have an iRule I'm adapting from some posted on this site. My goal is to take the content that clustered internal servers send to the client and strip out their hostnames and replace them with the hostname of the site.

 

 

Needless to say, this does not go so well.

when HTTP_REQUEST {
    # Don't allow data to be chunked
    if { [HTTP::version] eq "1.1" } {
        if { [HTTP::header exists "Connection"] } {
            # I commented this out. It was in the original
            # iRules I'm basing this off of, but I got an
            # error whenever my rule hit this line.
            # HTTP::header replace "Connection" "Keep-Alive"
        }
        HTTP::version "1.0"
    }
}

when HTTP_RESPONSE {
    # Only check responses that are a text content type
    # (text/html, text/xml, text/plain, etc).
    if { [HTTP::header "Content-Type"] starts_with "text/" } {
        # Get the content length so we can request the data to be
        # processed in the HTTP_RESPONSE_DATA event.
        log "Response type is [HTTP::header "Content-Type"]"
        if { [HTTP::header exists "Content-Length"] } {
            set content_length [HTTP::header "Content-Length"]
        } else {
            set content_length 4294967295
        }
        if { $content_length > 0 } {
            HTTP::collect $content_length
        }
        # This is the line that is the source of the rather gnarly
        # and question-mark laden text I've pasted in below.
        log "My content type is [HTTP::header "Content-Type"] and my data is [HTTP::payload]"
    }
}

when HTTP_RESPONSE_DATA {
    # Length of internal host name
    set target_length 23
    # Length of site host name
    set correction_length 26
    set offset [expr $correction_length - $target_length]
    # Initialize the counter for the number of replacements done
    # I'm counting the number of replacements because I figure
    # each time I do one it's going to screw with the offset of
    # every pair that comes after.
    set replacement_counter 0
    set host_indices [regexp -all -inline -indices "target-host0.\.domain" [HTTP::payload]]
    foreach host_idx $host_indices {
        log "Loop iteration $replacement_counter and index is $host_idx"
        set host_start [expr {[lindex $host_idx 0] + {$replacement_counter * $offset}}]
        set host_len [expr {[lindex $host_idx 1] - $host_start + 1 + {$replacement_counter * $offset}}]
        log "Host_start is $host_start and Host_len is $host_len, replacement_counter is $replacement_counter"
        HTTP::payload replace $host_start $host_len "my.website.com"
        set replacement_counter [expr $replacement_counter + 1]
    }
}
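
A hedged sketch of how the replacement loop in HTTP_RESPONSE_DATA might be tightened, assuming the payload has already arrived as uncompressed text. It departs from the rule above in three ways, all of them assumptions rather than a confirmed fix: the pattern is written in braces so Tcl hands the backslash to the regexp engine unchanged (inside double quotes, "\." is reduced to "."), the arithmetic uses plain expr operands instead of nested braces (which expr treats as string literals), and the matches are walked from last to first so that no running offset is needed. The hostnames are the placeholders from the post.

```tcl
when HTTP_RESPONSE_DATA {
    # Braces stop Tcl from consuming the backslash before regexp sees it.
    set host_indices [regexp -all -inline -indices {target-host0.\.domain} [HTTP::payload]]

    # Replace from the last match back to the first. Changing the payload
    # only shifts positions after the edit, so earlier indices stay valid.
    for { set i [expr {[llength $host_indices] - 1}] } { $i >= 0 } { incr i -1 } {
        set host_idx   [lindex $host_indices $i]
        set host_start [lindex $host_idx 0]
        # Each index pair is {start end}, inclusive at both ends.
        set host_len   [expr {[lindex $host_idx 1] - $host_start + 1}]
        HTTP::payload replace $host_start $host_len "my.website.com"
    }
}
```

Working backwards removes the need for target_length, correction_length and replacement_counter, since the length of each match comes straight from its own index pair.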

 

 

This is the output I end up with in my log. Call me crazy, but I'm pretty sure this is why my regexp isn't matching.

 

 

My content type is text/html; charset=ISO-8859-1 and my data is ????????????????is??????????????????V0??????)????%??S????'?? ??????????????o????}{!??$nl????????C??h??????????q??????????,;??i??nGCv5????????N????Uo??????n??uv??;??|l; ????x??Sm_??????:??v????>9$BQ??????U??(??Nu35t??i]????!????????c?????? ????R??hZ??{B????7:9::??????N??????~w???????????????? ????;(??????????7??}??I??????????y????7??q/4??G??????????\????k??????\&??^??$??????????a????????????C??????????>????K??[G. ??M??!??s)qiMS????l??????????????6????C????h????????????,????Vjv

 

 

I'm also pretty sure I just need to decode whatever it is that my server is throwing back at me. There's an excellent chance that the internal hosts are compressing the page before they deliver it. Can someone help me out with how to figure out exactly what is going on with my content and how to get it back to nice, easy-to-manipulate text?
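
One way to test the compression theory, sketched under the assumption that the internal hosts only compress when the client advertises support for it: strip the Accept-Encoding request header before the request reaches the pool, and log the Content-Encoding response header to see what comes back. Whether these servers honor a missing Accept-Encoding is an assumption, not something the post confirms.

```tcl
when HTTP_REQUEST {
    # Assumption: with no Accept-Encoding header the servers reply uncompressed.
    HTTP::header remove "Accept-Encoding"
}

when HTTP_RESPONSE {
    # An empty value here suggests the body is not gzip or deflate encoded.
    log local0. "Content-Encoding is [HTTP::header "Content-Encoding"]"
}
```

If the log still shows a Content-Encoding value after this change, the payload would have to be decoded some other way before any text matching can work.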

 

 

Thanks,
Jamey

5 Replies

  • Hi Jamey,

    I'm not sure why the original rule you're trying is failing, but it might be more efficient and easier to use the stream profile to perform the substitution.

    There are a few related posts you can find if you search for stream. These two should get you started:

    Click here
    Click here
    9.2 Config Guide: Stream Profile (Click here)

    Aaron
  • We're using 9.1.0, and when I create a stream profile like this...

    profile stream reports.fmm.com-stream {
        defaults from stream
        source "server03.domain.com:6666"
        target "productionsite.domain.com"
    }

    I end up getting nothing on the client side when the pool picks server03. When it picks server04, the page loads as normal with all the internal hostnames in the content.

    Does this have something to do with the source and target being different lengths?

    Also, seeing as we need to catch both server03 and server04, we are going to have to upgrade to 9.2, correct?

    Jamey

     

  • As if this weren't strange enough, the behavior depends on the browser. With IE, when the request is routed to the host that matches the stream, you get nothing. With Firefox, you get a very long delay, after which a portion of the page content arrives, and that portion has indeed been rewritten.

    I'm trying to get a TCP dump of this oddity now.

    Jamey
  • Deb_Allen_18
    Historic F5 Account
    The behaviour you are describing is what we've seen when the content length is incorrect due to stream replacement.

    To eliminate the issue you're experiencing, change Response Chunking to "Rechunk" on the http profile for that virtual.

    HTH

    /deb
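
    The content-length mismatch can be made concrete with the two strings from the stream profile posted earlier in the thread. This is a small plain-Tcl sketch (for tclsh, not an iRule), and the idea that an unchanged Content-Length header then under-reports the body is the explanation given above, not something measured here.

    ```tcl
    # Source and target strings copied from the stream profile in the thread.
    set source "server03.domain.com:6666"
    set target "productionsite.domain.com"

    # Bytes added to the body by each replacement (25 - 24 = 1).
    set delta [expr {[string length $target] - [string length $source]}]
    puts "each replacement changes the body length by $delta byte(s)"
    ```

    Even a one-byte difference per match leaves the original Content-Length too small, which fits a client that stalls or shows a truncated page. Rechunking sidesteps the problem because a chunked response does not rely on a single up-front length.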
  • Yeah, the rechunking allowed us to actually get content, which is nice. We've also upgraded to 9.2.3 so I'll be working on getting the stream to rewrite all the content it needs to. Hopefully all will be well. Thanks for the help.

    Jamey