Lori MacVittie - Two Different Socks

posted on Thursday, January 07, 2010 3:58 AM

Being an efficient developer often means abstracting functionality so that a single function can serve a variety of uses across an application. Even as this decreases the risk of errors, the time to develop, and the attack surface that must be secured, it also makes implementing security more difficult.

Over the holidays I had the opportunity to do some coding on my latest web application project. I won’t bore you with the details of what it is because it’s to support a hobby of Don and mine except to say that it’s running on a LAMP stack and heavily data-driven. But then what isn’t data-driven on the web these days?

Now I’m an old skool OO (Object Oriented) programmer and a typical developer. That is to say that I’m basically lazy and hate to code and recode the same thing over and over so I employ every trick I can to avoid doing so. That means abstraction and taking advantage of some of the more flexible capabilities of loosely-typed scripting languages like PHP. Reuse is my best friend, and I’ll take a little extra time to write a single method if I think I can reuse it across the entire application and thus save a lot of extra time. I also rely heavily on AJAX (the PHP XAJAX framework to be exact) to provide a more interactive application for our users.

I was debugging one of those reusable functions, one that’s called often via AJAX, when it occurred to me how difficult it is to secure such a beast. There are several reasons, but the primary one is that securing this single function would basically negate all the productivity and efficiency I’d gained by implementing it in the first place.


REUSABLE CODE + SECURITY –> BLOAT

The combination of a reusable function designed to update the database plus AJAX means that it’s necessarily leveraging variables to format a database query. Those variables come from the application and the user, but all are submitted via JavaScript. This is a huge security risk, as it’s possible to exploit the function by manipulating the parameters in such a way as to inject malicious code (SQL injection, or XSS stored for later display) or otherwise corrupt the database.

I won’t include the entire function, but here’s the relevant code, which is used to update just about any column in any table that uses an identifier column called “id”:

    function updateField($tableName, $colName, $value, $id) {
            ...
            $q = "update $tableName set $colName=\"$value\", lastmodified=NOW() where id=$id";
            $r = mysql_query($q, $dbh);
            ...
    }

In order to prevent malicious code from being injected it is necessary to scrub $value based on $colName. Sometimes it’s an integer, sometimes it’s a string. Sometimes HTML is allowed and sometimes it isn’t. Sometimes there’s a range of values allowed for $colName and sometimes there isn’t. The constraints, which could be placed on the database tables themselves (though that comes with its own set of problems), are peculiar to (1) the column type, (2) the database, and (3) the column itself. In order to really effectively secure this function I’d need to validate the parameters against all possible attack signatures and, if I wanted to be really secure, I’d further validate the parameters based on the constraints for the specific column and table combination.
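
To give a feel for what that per-column scrubbing might look like, here is a minimal sketch. The column names, limits, and the `validateValue` helper are all hypothetical; they are not taken from the application, and the `strip_tags` comparison is a crude stand-in for real HTML filtering.

```php
<?php
// Hypothetical per-column rules. Names and limits are illustrative only.
const COLUMN_RULES = [
    'quantity' => ['type' => 'int', 'min' => 0, 'max' => 999],
    'title'    => ['type' => 'string', 'maxlen' => 80, 'html' => false],
    'notes'    => ['type' => 'string', 'maxlen' => 2000, 'html' => true],
];

// Returns true only if $value satisfies the rule registered for $colName.
function validateValue(string $colName, $value): bool
{
    if (!array_key_exists($colName, COLUMN_RULES)) {
        return false; // unknown column: reject rather than guess
    }
    $rule = COLUMN_RULES[$colName];

    if ($rule['type'] === 'int') {
        // Integer columns: must parse as an integer inside the allowed range.
        $opts = ['options' => ['min_range' => $rule['min'], 'max_range' => $rule['max']]];
        return filter_var($value, FILTER_VALIDATE_INT, $opts) !== false;
    }

    // String columns: enforce a length limit and, where HTML is not allowed,
    // reject anything that strip_tags() would alter.
    if (!is_string($value) || strlen($value) > $rule['maxlen']) {
        return false;
    }
    if (!$rule['html'] && $value !== strip_tags($value)) {
        return false;
    }
    return true;
}
```

Even this toy version needs one rule entry per column, which is exactly where the bloat starts: every new column the function touches means another entry to write and maintain.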

Making it more difficult is that $id is actually an auto-incremented value in most database tables, and as use of the application grows so too does the valid range of values for $id. The valid range, then, is highly dependent on $tableName because some tables will be used more than others and thus the range of ids will be different across the application. And of course we’d want to limit the possible set of $tableName and $colName combinations to those that actually exist because it’s too easy otherwise to manipulate this particular query to inject XSS or other malicious code.
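
As a rough sketch of limiting the table and column combinations to those that actually exist, the function could check both identifiers against an allowlist and bind $value and $id as parameters. This is not the original function: it swaps in PDO for mysql_query, uses CURRENT_TIMESTAMP in place of NOW() so the statement is not MySQL-specific, and the `UPDATABLE` map and `updateFieldSafely` name are assumptions made for illustration.

```php
<?php
// Hypothetical allowlist: table name => columns that may be updated.
const UPDATABLE = [
    'items'  => ['title', 'quantity'],
    'owners' => ['name'],
];

// Sketch of an allowlisted, parameterized variant of updateField().
function updateFieldSafely(PDO $dbh, string $tableName, string $colName, $value, $id): bool
{
    // Identifiers cannot be bound as parameters, so they must match
    // an allowlist entry exactly before being placed in the query.
    $allowed = array_key_exists($tableName, UPDATABLE) ? UPDATABLE[$tableName] : [];
    if (!in_array($colName, $allowed, true)) {
        return false;
    }

    // $id only has to be a positive integer; no upper bound is enforced
    // because the valid range grows as rows are added.
    $id = filter_var($id, FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]);
    if ($id === false) {
        return false;
    }

    // $value and $id travel as bound parameters, never as query text.
    $stmt = $dbh->prepare(
        "UPDATE $tableName SET $colName = :value, lastmodified = CURRENT_TIMESTAMP WHERE id = :id"
    );
    return $stmt->execute([':value' => $value, ':id' => $id]);
}
```

Binding the parameters deals with query manipulation, but not with whether $value is acceptable for the column in question; that still needs column-specific checks.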

Adding all the necessary code to enforce limits and ranges and to scrub the data would add literally hundreds of lines of code, making the function far more complex than it is now and introducing the risk that I’d miss a combination or place a constraint on a table/column combination that would later break the application. By the time I got done adding all the necessary code I might as well have written individual functions to update each table/column combination. Tedious, to say the least, and exactly the kind of busy-work coding I was trying to avoid in the first place. And every time I used the function for a new table and column combination I’d have to update the function. Even if I abstracted a security function based on the table name and called that automatically, I’d still have to write additional code to support security every time (1) I used the function for a new table or column and (2) a newly discovered vulnerability might be applicable.

The data input validation, too, must almost necessarily exist on the server-side of the application. Even though the functions are called repeatedly via AJAX and thus Javascript, adding a ton of database schema-related input validation code into the client-side of the application would necessarily bloat it and possibly degrade performance. Including that level of detail in the client-side Javascript – unless heavily obfuscated – would also provide a much easier way for miscreants to “map” out the database schema and understand how better to manipulate the parameters such that a successful attack could be perpetrated.


A WEB APPLICATION FIREWALL TAKES the TEDIOUS out of SECURITY

Obviously my first answer is to employ the capabilities of a web application firewall. The same tedious table/column combination constraints will exist, but they’ll be automatically – and dynamically – generated by the firewall and codified into an enforceable policy without requiring that I destroy the efficiency of a reusable function by adding very parameter-specific code.

A web application firewall does not, however, really address the $id problem. The range of $id values will continue to change over time based on usage of the application, and any constraints placed upon it in code or in web application firewall policies would likely break the application. The web application firewall can, of course, recognize the parameter and changes in its values over time, but there is a possibility that if we try to restrict it, even based on $tableName (which we can do with some WAF implementations), we’ll eventually break the application by using a “bad” $id according to the WAF. The best we can do, then, is to codify a constraint into the policy that allows only numerical values for $id regardless of what database table is being updated. We would be, essentially, enforcing a schema on what is schema-less data. The same constraint system would allow us to enforce a table-column relationship on all possible combinations, because the WAF would collect the possible combinations and allow us to restrict, allow, or constrain them much more easily than if we were to codify that in the function. This latter capability is of vast importance if there might be new tables, columns, and combinations thereof in the future, as the WAF will recognize what it sees as an anomaly and provide an opportunity to allow, deny, or further constrain that new combination without requiring that we modify the application code.
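
The numeric-only constraint on $id amounts to something like the check below. It is written in PHP purely for illustration: this is not any WAF product’s policy syntax, and the `idPassesPolicy` name and the ten-digit cap are assumptions.

```php
<?php
// Hypothetical stand-in for a WAF parameter rule on $id.
function idPassesPolicy($id): bool
{
    // Request parameters arrive as strings. Accept digits only, with a
    // generous length cap instead of an upper bound tied to table size.
    return is_string($id) && preg_match('/\A[0-9]{1,10}\z/', $id) === 1;
}
```

Because the rule says nothing about how large $id may be, it keeps working as the tables grow, at the cost of not catching an id that is numeric but does not exist.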

A WAF can also scan the parameters for possible exploitation in the form of SQLi or XSS injection easily enough, so we don’t have to concern ourselves with the fact that parameters are being passed in via JavaScript that are used as the basis for the entire query. Parameters, whether passed in via JavaScript or as a query parameter in a URI, all pass through the WAF the same way and can easily be evaluated for potential malicious content of a thousand different types.

The second answer is to write additional code that is, unfortunately, tedious and specific to $tableName and $colName and requires updating in a variety of circumstances. In other words, it may be necessary to just shut up and write some more code.

And that is really what makes reusable code so hard to secure: it was almost certainly designed by a developer who not only dislikes tedious busy-work coding in the first place but is also likely over-subscribed. Thus getting them to write tedious, busy-work security related code is going to be difficult and, if the current statistics regarding the exploitability of web applications are true, it’s obvious that a whole lot of people have been unable to do just that.

Feedback

1/9/2010 4:01 AM
So basically, what you are saying is that because you have to validate your data it all becomes so complicated?

Plus, that query you posted above is the most inefficient thing anyone could use.

Reusable code should apply to core functionality alone, other application specific components should not be used from anywhere else.
francois
1/9/2010 4:34 AM
Personally, I'd go with some kind of code generation here (either generating the D/B side tables from the object model, a la Rails, or generating objects from the DB metadata).

That way you can also get some performance gains from prepared statements (reducing the parser hits on your DB), although it may be your DB library is handling that for you under the hood anyway.

As for validation and constraints - I don't think we've ever reached a satisfactory answer to this problem - really, we want them in every level - going back to the server, even with Ajax, to do simple field level validation feels wrong (particularly over networks of unknown speed) but having the validation only in the front end is definitely wrong (and militates against reusing your business objects) . . . while constraints on the database provide a fundamental final check that your data has integrity - but you don't want to wait until you are saving to the database and then handle constraint exceptions.

And worse still, we're talking a different programming language at every tier, even when we just want to express something simple (Name = 50 chars, alphabetical + some symbols) - with RegExp as perhaps the most common solution.

Should be possible to generate JavaScript based validators along with the page though (although I'm painfully aware of the problems that occur as soon as you start doing so, like unintentionally disabling short-cut keys, or international characters - ø ü - etc).
JulesLt
1/9/2010 8:23 AM
Another Xajax fan, nice!

I understand your comment about bloat and security.

Let's say though that you buy 100% into "Filter Input Escape Output" and you leave the Filtering to an Xajax-ified layer, you could put all your trust in PDO's ability to escape everything properly before putting it into your database.

And then escape everything for displaying as html when you get it out?

Having said that, I am not talking about Xajax/CRUD on a public-facing website, but a backend CMS where I can detect any monkey business and chuck them out/penalise them, and if they turn JS off, well just nothing works anyway.

PaulG
1/10/2010 1:26 AM
First, now with forms submitted with ajax, the need for client-side field validation is not that important. The server can validate the fields and in case of invalid submission, just return names of erroneous fields (unless you do really care about every byte transferred between client and server).

Second, you will never succeed if you try to create schema-based validation. I believe that cleverly designed model layer will solve all your issues - including the most essential part - database updates.
Dennis
