Forum Discussion

Jeff_42220
Aug 24, 2009

URI::encode question

 

Does the "URI::encode" command replace unsafe ASCII characters and thus prevent XSS vulnerabilities? I think that is the point of the "URI::encode" command, but the DevCentral explanation of this command doesn't go into detail.

 

 

Thanks,

 

 

Jeff

3 Replies

  • 2 part answer:

     

     

    I believe that URI::encode is meant to encode URI strings, while most XSS exploits live in the body content. You don't see a lot of JavaScript URI exploits unless the app is written really badly. Technically you can URI::encode anything, including body content, but that job is usually better left to the application environment (e.g. htmlentities() in PHP).

     

     

    While URI::encode does provide some level of protection from the more mundane "<script>" type exploits, I wouldn't rely on it alone to protect your site from XSS. Good coding practices, input validation, and whitelisting are your best bets against cross-site scripting attacks.

     

     

    Kevin
  • Hi Jeff,

     

     

    Sorry I didn't reply to your previous post. I saw your reply and then forgot. I was suggesting that you URL encode the original requested URI before including it as a parameter value in the query string of the redirect location. This is required by RFC 2616, as you want to prevent the application from interpreting the value of the parameter as additional parameter names and values. For example, if a client made a request to www.example.com/path/to/file.ext?param_name1=param_value1&param_name2=param_value2, and you used 'HTTP::redirect "http://sorryserver?original_url=[URI::encode $url]"' to redirect them, the effective redirect would be:

     

     

    http://sorryserver?original_url=www.example.com%2fpath%2fto%2ffile.ext%3fparam_name1%3dparam_value1%26param_name2%3dparam_value2

     

     

    The value of the original_url parameter is www.example.com%2fpath%2fto%2ffile.ext%3fparam_name1%3dparam_value1%26param_name2%3dparam_value2

     

     

    If you did not URL encode the value, the redirect location would be:

     

     

    http://sorryserver?original_url=www.example.com/path/to/file.ext?param_name1=param_value1&param_name2=param_value2

     

     

    When the app tries to parse the parameters of the non-URL-encoded value, it would expect the format param_name1=param_value1&param_name2=param_value2, so it would parse out the following parameter names and values:

     

     

    original_url = www.example.com/path/to/file.ext?param_name1=param_value1

     

    param_name2 = param_value2

     

     

    So the idea behind URL encoding the original_url parameter value is to safely encode the value so the app interprets it correctly. The app would then need to URL decode the value of the parameter and then ideally HTML encode it before displaying the value to the client. The two encoding methods provide different functionality.
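The parsing difference described above can be sketched outside of an iRule. This is only an illustration in Python, not what the BIG-IP does internally: `quote(..., safe="")` stands in for URI::encode (an assumption; it emits uppercase hex such as %2F rather than %2f), and `parse_qs` stands in for whatever query-string parser the application uses.

```python
from urllib.parse import quote, urlsplit, parse_qs

# Original request from the example in this post.
original_url = ("www.example.com/path/to/file.ext"
                "?param_name1=param_value1&param_name2=param_value2")

# Redirect with the value URL encoded; safe="" also encodes "/".
encoded_redirect = "http://sorryserver?original_url=" + quote(original_url, safe="")

# Redirect with the value pasted in as-is.
raw_redirect = "http://sorryserver?original_url=" + original_url

# Parse each query string the way a typical application would.
encoded_params = parse_qs(urlsplit(encoded_redirect).query)
raw_params = parse_qs(urlsplit(raw_redirect).query)

print(encoded_params)  # one parameter holding the whole original URL
print(raw_params)      # original_url is truncated, param_name2 leaks out
```

With the encoded redirect, the app gets back a single original_url parameter containing the full original URL. With the raw redirect, the "&" splits the value and param_name2 shows up as a separate parameter of the sorry page.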

     

     

    HTML encoding prevents the client from interpreting any metacharacters as the actual metacharacter. For example, if you HTML encode a script like it becomes . The browser would HTML decode this and display , but would not execute the resulting string.

     

     

    HTML encoding should provide reasonable protection against an attacker using the redirect action in a cross-site scripting attack. You can read more about XSS attacks and prevention methods on OWASP's page: http://www.owasp.org/index.php/XSS_Attacks and this page: http://tldp.org/HOWTO/Secure-Programs-HOWTO/cross-site-malicious-content.html.

     

     

    Aaron
  • Sorry, the HTML encoding didn't make it through.

     

     

    HTML encoding prevents the client from interpreting any metacharacters as the actual metacharacter. For example, if you HTML encode a script like <script>alert('xss')</script> it becomes &lt;script&gt;alert('xss')&lt;/script&gt;. The browser would HTML decode this and display <script>alert('xss')</script>, but would not execute the resulting string.

     

     

    As a good example, the DC forum web app is HTML encoding the post content, so the script tags are displayed by the client browser but not executed as scripts.
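The decode-then-HTML-encode step on the application side might look roughly like the sketch below. It is Python purely for illustration and assumes the example payload is <script>alert('xss')</script>; the variable names are hypothetical, and a real app would use its own framework's escaping helper (htmlentities() in PHP, for instance).

```python
import html
from urllib.parse import quote, unquote

# Assumed example payload arriving as a query-string parameter value.
payload = "<script>alert('xss')</script>"

# What the value looks like on the wire once it has been URL encoded.
url_encoded = quote(payload, safe="")

# Step 1: the app URL decodes the parameter value.
decoded = unquote(url_encoded)

# Step 2: the app HTML encodes it before writing it into the page.
# html.escape also encodes quotes by default (' becomes &#x27;).
safe_for_html = html.escape(decoded)

print(safe_for_html)
```

The browser renders the escaped string as visible text rather than parsing it as a script element, which is the behaviour described for the forum app above.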

     

     

    Aaron