Jamey_Price_105
Jun 21, 2006
Trying to rewrite page content
I have an iRule that I'm adapting from several posted on this site. My goal is to take the content that the clustered internal servers send to the client and replace their hostnames with the hostname of the site.
Needless to say, this does not go so well.
when HTTP_REQUEST {
    # Don't allow data to be chunked
    if { [HTTP::version] eq "1.1" } {
        if { [HTTP::header exists "Connection"] } {
            # I commented this out. It was in the original
            # iRules I'm basing this off of, but I got an
            # error whenever my rule hit this line.
            # HTTP::header replace "Connection" "Keep-Alive"
        }
        HTTP::version "1.0"
    }
}
when HTTP_RESPONSE {
    # Only check responses that are a text content type
    # (text/html, text/xml, text/plain, etc).
    if { [HTTP::header "Content-Type"] starts_with "text/" } {
        # Get the content length so we can request the data to be
        # processed in the HTTP_RESPONSE_DATA event.
        log "Response type is [HTTP::header "Content-Type"]"
        if { [HTTP::header exists "Content-Length"] } {
            set content_length [HTTP::header "Content-Length"]
        } else {
            set content_length 4294967295
        }
        if { $content_length > 0 } {
            HTTP::collect $content_length
        }
        # This is the line that is the source of the rather gnarly
        # and question-mark laden text I've pasted in below.
        log "My content type is [HTTP::header "Content-Type"] and my data is [HTTP::payload]"
    }
}
when HTTP_RESPONSE_DATA {
    # Length of internal host name
    set target_length 23
    # Length of site host name
    set correction_length 26
    set offset [expr {$correction_length - $target_length}]
    # Initialize the counter for the number of replacements done.
    # I'm counting the number of replacements because I figure
    # each time I do one it's going to screw with the offset of
    # every pair that comes after.
    set replacement_counter 0
    # Braces keep Tcl from stripping the backslash before regexp sees it.
    set host_indices [regexp -all -inline -indices {target-host0.\.domain} [HTTP::payload]]
    foreach host_idx $host_indices {
        log "Loop iteration $replacement_counter and index is $host_idx"
        # Inside expr, parentheses group sub-expressions; braces would make a string literal.
        set host_start [expr {[lindex $host_idx 0] + ($replacement_counter * $offset)}]
        set host_len [expr {[lindex $host_idx 1] - $host_start + 1 + ($replacement_counter * $offset)}]
        log "Host_start is $host_start and Host_len is $host_len, replacement_counter is $replacement_counter"
        HTTP::payload replace $host_start $host_len "my.website.com"
        incr replacement_counter
    }
}
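As an aside on the offset bookkeeping in the loop above: one way to avoid it might be to apply the replacements from the last match back to the first, so that the indices of the matches still to be processed never move. Here is a minimal sketch in plain Tcl, with no iRule commands; the sample string is made up for illustration and the variable `payload` stands in for the collected response data.

```tcl
# Hypothetical sample data standing in for the collected HTTP payload.
set payload "a target-host01.domain b target-host02.domain c"
set replacement "my.website.com"

# Each element is a {start end} pair of inclusive character indices.
set host_indices [regexp -all -inline -indices {target-host0.\.domain} $payload]

# Walk the matches last-to-first so that edits never shift the
# indices of the matches that have not been handled yet.
for {set i [expr {[llength $host_indices] - 1}]} {$i >= 0} {incr i -1} {
    set pair [lindex $host_indices $i]
    set payload [string replace $payload [lindex $pair 0] [lindex $pair 1] $replacement]
}

puts $payload
```

In the iRule itself the `string replace` call would presumably become `HTTP::payload replace` with a start index and a length of end - start + 1, but I have not tried that form.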
This is what the log statement in HTTP_RESPONSE writes to my log. Call me crazy, but I'm pretty sure this is why my regexp isn't matching.
My content type is text/html; charset=ISO-8859-1 and my data is ????????????????is??????????????????V0??????)????%??S????'?? ??????????????o????}{!??$nl????????C??h??????????q??????????,;??i??nGCv5????????N????Uo??????n??uv??;??|l; ????x??Sm_??????:??v????>9$BQ??????U??(??Nu35t??i]????!????????c?????? ????R??hZ??{B????7:9::??????N??????~w???????????????? ????;(??????????7??}??I??????????y????7??q/4??G??????????\????k??????\&??^??$??????????a????????????C??????????>????K??[G. ??M??!??s)qiMS????l??????????????6????C????h????????????,????Vjv
I'm also pretty sure I just need to decode whatever it is that my server is throwing back at me. There's an excellent chance that the internal hosts are compressing the page before they deliver it. Can someone help me out with how to figure out exactly what is going on with my content and how to get it back to nice, easy-to-manipulate text?
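A minimal sketch of how I could test the compression theory, assuming the servers only compress when the client advertises support through the Accept-Encoding request header. I have not confirmed that assumption for these servers, and the log wording is only illustrative.

```tcl
when HTTP_REQUEST {
    # Assumption: with no Accept-Encoding header on the request,
    # the servers fall back to sending the page uncompressed.
    HTTP::header remove "Accept-Encoding"
}
when HTTP_RESPONSE {
    # A value such as gzip or deflate here would confirm that the
    # payload is compressed rather than plain text.
    if { [HTTP::header exists "Content-Encoding"] } {
        log local0. "Content-Encoding is [HTTP::header "Content-Encoding"]"
    }
}
```

If the Content-Encoding line stops appearing once Accept-Encoding is removed, and the logged payload becomes readable, that would point to compression as the cause. The `HTTP::header remove` line could go into my existing HTTP_REQUEST event rather than a separate one.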
Thanks,
Jamey