In this session of iRules Security 101, I'll walk you through a process to strip unnecessary content from your outbound application responses.  Section 3.2.5 of RFC 1866 (Hypertext Markup Language - 2.0) allows comments to be enclosed within HTML content.  In certain cases, this could lead to unwanted information being exposed.

<body>
  ...
  <!-- Pull user info from Users table in database and format a list from that data -->
  ...
</body>


In this article, I'll show you how to remove all those HTML comments from all HTTP traffic that's leaving your network.

An early part of every (respectable) developer's training is learning to thoroughly comment one's code.  Typically this is a safe practice, as source code is either compiled or obfuscated before being made available to the client.  But with the advent of web-based applications, situations can occur that cause your source code to be leaked.  While stripping out HTML comments will not prevent every content breach, it does a small part in making sure that internal information included in asp/jsp/html development does not reach the masses.

The following example will inspect all HTML responses for patterns matching HTML comments and replace those characters with spaces, effectively erasing them from the outside world.

 

when HTTP_REQUEST {
  # Don't allow data to be chunked. This ensures we don't get
  # a comment that is spread across two chunked boundaries.
  if { [HTTP::version] eq "1.1" } {
    if { [HTTP::header is_keepalive] } {
      HTTP::header replace "Connection" "Keep-Alive"
    }
     HTTP::version "1.0"
  }
}
when HTTP_RESPONSE {
  # Ensure all of the HTTP response is collected
  if { [HTTP::header exists "Content-Length"] } {
     set content_length [HTTP::header "Content-Length"]
  } else {
     set content_length 1000000
  }
  if { $content_length > 0 } {
     HTTP::collect $content_length
  }
}
when HTTP_RESPONSE_DATA {
  # Find the HTML comments
  set indices [regexp -all -inline -indices {<![ \r\n\t]*--([^\-]|[\r\n]|-[^\-])*[^/][^/]--[ \r\n\t]*>} [HTTP::payload]]
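  # Replace the comments with spaces in the response.
  # With -inline -indices, each match returns the range of the whole comment
  # followed by the range of the capture group, so walk the list two pairs
  # at a time and only use the first pair.
  #log local0. "Indices: $indices"
  foreach {idx sub_idx} $indices {
     set start [lindex $idx 0]
     set len [expr {[lindex $idx 1] - $start + 1}]
     #log local0. "Start: $start, Len: $len"
     HTTP::payload replace $start $len [string repeat " " $len]
  }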
}
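
If you want to experiment with the match-and-blank step without a BIG-IP handy, here is a minimal sketch in plain Tcl that should run in tclsh.  It uses the same regular expression as the iRule, but works on an ordinary string instead of HTTP::payload.  The proc name blank_html_comments and the sample markup are my own placeholders for illustration, not iRules commands.

```tcl
# Plain Tcl sketch (no BIG-IP required).
# blank_html_comments is a hypothetical helper name, not an iRules command.
proc blank_html_comments {payload} {
    # Same pattern as in the iRule above.
    set pattern {<![ \r\n\t]*--([^\-]|[\r\n]|-[^\-])*[^/][^/]--[ \r\n\t]*>}

    # With -inline -indices, every match yields a {start end} pair for the
    # whole match followed by one pair for the capture group.
    set matches [regexp -all -inline -indices $pattern $payload]

    foreach {whole group} $matches {
        set start [lindex $whole 0]
        set end   [lindex $whole 1]
        set len   [expr {$end - $start + 1}]
        # Overwrite with the same number of spaces so the payload length,
        # and therefore the remaining indices, stay valid.
        set payload [string replace $payload $start $end [string repeat " " $len]]
    }
    return $payload
}

# Sample markup (made up for this example).
set sample "<body>\n  <!-- Pull user info from Users table -->\n  <p>ok</p>\n</body>"
puts [blank_html_comments $sample]
```

Because each comment is overwritten with the same number of spaces, the length of the content never changes, which is why the iRule can leave the Content-Length header alone.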

The special sauce in here is the regular expression used to search for the comments.  I'll leave it to you all to figure out how the regular expression works and possibly rehash it when I start the "iRules Ninja" series. 

Bonus points to anyone who can comment on why I added the "[^/][^/]" towards the end of the regexp.
