Regex in STREAM::expression

I am looking to implement a regex within STREAM::expression to find the User Comments section on our web pages, which we mark with specific begin and end comments in the HTML. I am using the following in the iRule (with the usual STREAM::disable / STREAM::enable etc. stripped out here for readability), which in theory should perform a string replacement within that section of the web page.

STREAM::expression {@sectionbegin.+sectionend@}

when STREAM_MATCHED {
   log local0. "matched: [STREAM::match]"
   STREAM::replace "[string map {[^A-Za-z][fF][iI][nN][dD][^A-Za-z] *******} [STREAM::match]]"
}
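
One thing to watch in the snippet above: Tcl's string map substitutes literal strings, so the bracketed pattern is treated as plain text rather than as a regex and will not match the word itself. Below is a minimal sketch of a regex-based alternative. It assumes that regsub with the word-boundary escapes \m and \M is an acceptable stand-in for the [^A-Za-z] guards; the sample text and variable names are made up for illustration.

```tcl
# Sample text standing in for [STREAM::match]; purely illustrative.
set section "Please find the file. FIND it, but keep findings intact."

# string map compares the key literally, so nothing is replaced here.
set literal [string map {[^A-Za-z][fF][iI][nN][dD][^A-Za-z] *******} $section]

# regsub treats the pattern as a regex; \m and \M anchor on word boundaries.
regsub -all -nocase {\mfind\M} $section {*******} masked

puts "string map: $literal"
puts "regsub:     $masked"
```

Inside STREAM_MATCHED the same idea would be to run regsub against [STREAM::match] and pass the result to STREAM::replace. Note that \m and \M also treat digits and underscores as word characters, which differs slightly from the original [^A-Za-z] test.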

The difficulty I'm having is getting the regex in STREAM::expression to match across line feeds / carriage returns. I've tried [.\r\n]+, [\s\s]+ and various others without success. The match partly works if I leave sectionend out of my example, but it still does not match across a line feed.
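
For reference, a dot inside a bracket expression is a literal dot, so [.\r\n]+ only matches runs of dots and line breaks, and [\s\s]+ only matches whitespace; neither can span ordinary text between the markers. The sketch below checks this in plain Tcl against a made-up sample string. It assumes the stream profile's regex engine behaves like Tcl's on these points, which is not confirmed here.

```tcl
# Made-up sample: two markers separated by text and CRLF line breaks.
set html "sectionbegin\r\nline one\r\nline two\r\nsectionend"

# Patterns taken from the question; results collects 1 (match) or 0 (no match).
set results {}
foreach pattern {
    {sectionbegin.+sectionend}
    {sectionbegin[.\r\n]+sectionend}
    {sectionbegin[\s\s]+sectionend}
} {
    lappend results [regexp $pattern $html]
    puts "$pattern -> [lindex $results end]"
}
```

In plain Tcl the unbracketed dot matches line breaks by default, so only the first pattern spans the whole section.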

Any suggestions or the correct regex for this would be most appreciated.

Thanks

Darren C

Answers to this Question

USER ACCEPTED ANSWER & F5 ACCEPTED ANSWER
Could you post the match that you're trying to use? It sounds like you're trying to include the carriage return in the search, which might be mucking things up a bit. I've never had any problems with the STREAM profile not crossing lines, so I'd like to see what it is you're searching for exactly, if possible.

Thanks,
#Colin
USER ACCEPTED ANSWER & F5 ACCEPTED ANSWER
Hi Colin,

Thanks for getting back to me. Here is the iRule as it stands at present; it's the only iRule on my test VIP. It seems to work if the end marker sits quite close to the start marker in the HTML output. For example, ending the match with .+> seems to work because that is hit fairly quickly (hence my theory that line feeds / carriage returns are what's giving me difficulty).

when HTTP_RESPONSE {
   #HTTP::header remove server

   foreach header {Server Date X-Powered-By} {
      while { [HTTP::header exists $header] } {
         # log local0. "Removing- $header: [HTTP::header value $header]"
         HTTP::header remove $header
      }
   }

   # Disable the stream filter by default
   STREAM::disable

   # Check if response type is text
   if {[HTTP::header value Content-Type] contains "text"} {

      # Match the section between the begin and end markers
      # (part of the next commented-out expression appears to have been lost in the forum formatting)
      #STREAM::expression {@.*>@} - This doesn't break the page
      STREAM::expression {@comments.start.here.+comments.end.here@}
      #STREAM::expression {@comments.start.here.*comments.end.here@}
      # Enable the stream filter for this response only
      STREAM::enable
   }
}

when STREAM_MATCHED {
   log local0. "matched: [STREAM::match]"
   #STREAM::replace "[string map {[^A-Za-z][bB][uU][sS][^A-Za-z] *******} [STREAM::match]]"
   # STREAM::replace "[string map {helloooooooo XXXXXXXXXXX} [STREAM::match]]"
}
USER ACCEPTED ANSWER & F5 ACCEPTED ANSWER
I just tested with a simple STREAM::expression rule which matches across new lines:


when HTTP_REQUEST {
    STREAM::disable
}
when HTTP_RESPONSE {
    if {[HTTP::header content-type] starts_with "text"}{
        STREAM::expression {@This.*yet@@}
        STREAM::enable
    }
}
when STREAM_MATCHED {
    log local0. "[STREAM::match]"
}


Here is a request direct to the web server (with spaces inserted in the HTML tags to prevent the forum code from removing them):


# curl 10.1.0.100
< html >< body >< h1 >It works!< /h1 >
< p >This is the default web page for this server.< /p >
< p >The web server software is running but no content has been added, yet.< /p >
< /body >< /html >


And here is a request through the VS and iRule:


# curl 10.1.0.15
< html >< body >< h1 >It works!< /h1 >
< p >.< /p >
< /body >< /html >


And the debug log from STREAM::match:

< STREAM_MATCHED >: This is the default web page for this server.

The web server software is running but no content has been added, yet



Aaron
USER ACCEPTED ANSWER & F5 ACCEPTED ANSWER
Another thought: is there a limit to the amount of data that STREAM::match will accept? I've not amended the iRule, but the last bit of HTML I tested against finished on
USER ACCEPTED ANSWER & F5 ACCEPTED ANSWER
Is it possible that your matching terms are over 4096 bytes apart? This is the default match length that will be buffered:

http://devcentral.f5.com/wiki/default.aspx/iRules/STREAM__max_matchsize

Aaron
USER ACCEPTED ANSWER & F5 ACCEPTED ANSWER
I think this is possible, as sometimes the pages just break. Looking at your example, I'm doing essentially the same thing, so it appears this may not be due to line feeds after all.

However, that gives me another problem. When I add STREAM::max_matchsize, I get the following error.

[command is not valid in current event context (HTTP_RESPONSE)] [STREAM::max_matchsize 2048]

I've found an example of the same problem on the forum.

http://devcentral.f5.com/Community/GroupDetails/tabid/1082223/asg/39/aft/21001/showtab/groupforums/Default.aspx

I've tried adding STREAM::max_matchsize in the LB_SELECTED and STREAM_MATCHED events where I get no syntax errors, but I don't think it makes much difference.

Thanks

Darren
USER ACCEPTED ANSWER & F5 ACCEPTED ANSWER
Based on Spark's reply, I think STREAM::max_matchsize should work in LB_SELECTED. I think it's too late to use in STREAM_MATCHED.

Can you use LB_SELECTED to set it, test and view the source HTML on a page that doesn't get rewritten to count the number of bytes (or characters) between the start and end tags you're trying to match on?
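
One way to do that count offline is sketched below in plain Tcl. The marker_span helper and the marker strings are hypothetical (the real marker text is not shown in this thread, only the regex with dots between the words), and the sample HTML is made up; the result would then be compared against the 4096-byte default mentioned above.

```tcl
# Hypothetical helper: number of bytes from the first character of the begin
# marker to the last character of the end marker, or -1 if either is missing.
proc marker_span {html begin end} {
    set start [string first $begin $html]
    if {$start < 0} { return -1 }
    set stop [string first $end $html [expr {$start + [string length $begin]}]]
    if {$stop < 0} { return -1 }
    return [expr {$stop + [string length $end] - $start}]
}

# Inline sample; for a real page, read the saved file in binary mode instead
# so that string length counts bytes rather than decoded characters.
set html "<p>intro</p>comments start here\r\n<p>hello</p>\r\ncomments end here<p>footer</p>"
set span [marker_span $html "comments start here" "comments end here"]
puts "span: $span bytes"
```

If the reported span exceeds the configured match size, the stream filter cannot hold the whole section, which would fit the symptoms described earlier.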

Aaron
USER ACCEPTED ANSWER & F5 ACCEPTED ANSWER
Thanks again Aaron. I agree that STREAM_MATCHED would be too late. The volume of data I'm working on is 9964 bytes, so definitely larger than the default allowed, and I'm now pretty sure that is where the problem lies.

I've added the following to my iRule

when LB_SELECTED {
STREAM::max_matchsize 999999
}

This causes no syntax errors but does not resolve the issue; I've tried various smaller sizes as well. I also notice that I always get to the same point in the HTML no matter what value I set, so I'd assume this is 4096 bytes in (though I've not checked that yet). Could the STREAM::disable statements, or something else in my logic, be resetting this setting? If I'm correct, iRule events are processed in the following order: HTTP_REQUEST, LB_SELECTED, HTTP_RESPONSE, STREAM_MATCHED. I've tried removing the STREAM::disable in HTTP_RESPONSE, but still no success.

I'll keep trying and update if I get to the bottom of it but if you've any further ideas, I'd be most grateful.

FYI, here is my current iRule.

when HTTP_REQUEST {

   # Explicitly disable the stream profile for each request so it doesn't stay
   #   enabled for subsequent HTTP requests on the same TCP connection.
   STREAM::disable
}

when LB_SELECTED {
   STREAM::max_matchsize 999999
}

when HTTP_RESPONSE {

   #HTTP::header remove server

   foreach header {Server Date X-Powered-By} {
      while { [HTTP::header exists $header] } {
         # log local0. "Removing- $header: [HTTP::header value $header]"
         HTTP::header remove $header
      }
   }

   # Disable the stream filter by default
   STREAM::disable

   # Check if response type is text
   if {[HTTP::header value Content-Type] contains "text"} {
      # Match and replace it with nothing
      #STREAM::expression {@comments.start.here.*>@@}
      STREAM::expression {@comments.start.here.*comments.end.here>@@}
      # Enable the stream filter for this response only
      STREAM::enable
   }
}

when STREAM_MATCHED {
   log local0. "1st MATCH [STREAM::match]"
   # Check if the matched string meets some condition that can't easily be checked for using a single regex in STREAM::expression
   if {[STREAM::match] contains "Darren"} {

      # Replace Darren with XYXYXY and do the replacement
      STREAM::replace "[string map {Darren XYXYXY} [STREAM::match]]"
      log local0. "[IP::client_addr]:[TCP::local_port]: matched: [STREAM::match], replaced with: [string map {Darren XYXYXY} [STREAM::match]]"
      log local0. "2nd MATCH"
   }
}

Thanks

Darren
USER ACCEPTED ANSWER & F5 ACCEPTED ANSWER
Hi Darren,

I'm not sure why that wouldn't work. It seems logical. Can you open a case with F5 Support and ask them to investigate this? If you do, make sure to explain that you're not asking them to write an iRule for you, but want help troubleshooting an issue with an existing iRule that should be working.

Thanks, Aaron
USER ACCEPTED ANSWER & F5 ACCEPTED ANSWER
I've got a ticket open with F5 Support (raised on Friday last week). I'll update on how it goes.

Thanks

Darren
USER ACCEPTED ANSWER & F5 ACCEPTED ANSWER

FYI, I hit this issue recently. I'm not sure I fully understand what a partial match is but regardless, I doubled the value of the buffer, within the HTTP_RESPONSE event:

when HTTP_RESPONSE {
  STREAM::max_matchsize 8192
  STREAM::enable
  ...
}

Note, the actual response payload that was giving me issues was around 64kB. Considering the issue was intermittent, it's possible the buffer relates to all traffic handled by the Virtual Server, not each connection.

Using the STREAM::max_matchsize command in the STREAM_MATCHED event (as I'd seen suggested elsewhere) doesn't work.
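
Pulling the thread together, a diagnostic iRule might look like the sketch below. It assumes the TMOS version in use accepts STREAM::max_matchsize in HTTP_RESPONSE, as in the snippet above; an earlier post in this thread got a context error there, so LB_SELECTED may be needed instead. The 16384 value and the marker names are placeholders, not recommendations.

```tcl
when HTTP_REQUEST {
    # Keep the stream filter off unless a response explicitly enables it.
    STREAM::disable
}

when HTTP_RESPONSE {
    STREAM::disable
    if {[HTTP::header value Content-Type] starts_with "text"} {
        # Assumption: a value above the measured distance between the markers.
        STREAM::max_matchsize 16384
        STREAM::expression {@comments.start.here.*comments.end.here@@}
        STREAM::enable
    }
}

when STREAM_MATCHED {
    # Log only the length so that large matches do not flood the log.
    log local0. "matched [string length [STREAM::match]] bytes"
    # Put the match back unchanged; swap in a rewritten string once it works.
    STREAM::replace [STREAM::match]
}
```

Logging the match length rather than the match itself makes it easier to see whether the whole section was captured or whether the match stopped at the buffer limit.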
