posted on Thursday, March 25, 2010 3:22 AM
Never, never trust content from a user, even if that user is another application.
Web 2.0 is as much about integration as it is about interactivity. Thus it’s no surprise that an increasing number of organizations are including a feed of their recent Twitter activity on their sites. But as with any user-generated content, and it is user-generated after all, integrating such content without validation poses a potential risk to the organization and its visitors.
A recent political effort in the UK included launching a web site that integrated a live Twitter stream based on a particular hashtag. That’s a fairly common practice, nothing to get excited about. What happened next, however, is something we should get excited about and pay close attention to, because as Twitter streams continue to flow into more and more web sites, it is likely to happen again.
Essentially, the Twitter stream was corrupted. Folks figured out that if they tweeted JavaScript instead of plain old messages, the web site would republish the script unescaped and visitors’ browsers would execute it as a legitimate part of the page. You can imagine where that led – Rickrolling and redirecting visitors to political opponents’ sites were the least obnoxious of the results.
It [a web site] was also set up to collect Twitter messages that contained the hashtag #cashgordon and republish it in a live stream on the home page. However a configuration error was discovered as any messages containing the #cashgordon hashtag were being published, as well as whatever else they contained.
Trend Micro senior security advisor Rik Ferguson commented that if users tweeted JavaScript instead of standard messages, this JavaScript would be interpreted as a legitimate part of the Cash Gordon site by the visitor's browser. This would redirect the user to any site of their choosing, and this saw the site abused to the point of being taken offline.
The abuse was noted and led to Twitter users sending users to various sites, including pornography sites, the Labour Party website and a video of 1980s pop star Rick Astley.
– Conservative effort at social media experiment leaves open source Cash Gordon site directing to adult and Labour Party websites, SC Magazine UK
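To make the failure mode concrete, here is a minimal Python sketch of the difference between republishing tweet text verbatim and escaping it first. The function names and the sample tweet are hypothetical illustrations; this is not the code the Cash Gordon site actually ran.

```python
import html


def render_tweet_unsafe(text: str) -> str:
    # Vulnerable: the tweet is dropped into the page verbatim, so any
    # markup it contains becomes part of the site's own HTML.
    return f'<li class="tweet">{text}</li>'


def render_tweet_escaped(text: str) -> str:
    # Safer: HTML metacharacters are escaped, so the browser displays
    # the text instead of interpreting it as markup or script.
    return f'<li class="tweet">{html.escape(text, quote=True)}</li>'


# Hypothetical malicious tweet carrying the campaign hashtag.
tweet = '#cashgordon <script>window.location="http://example.com/"</script>'

print(render_tweet_unsafe(tweet))   # contains a live <script> element
print(render_tweet_escaped(tweet))  # contains only inert text
```

The escaped version still publishes every tweet with the hashtag, but whatever else the tweet contains reaches the visitor as text rather than as code.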
IT’S STILL USER-GENERATED CONTENT, EVEN VIA AN API
While you may trust Twitter as a source of “content,” it’s important to remember that the content it provides still comes from users. This is true for any content or content feed aggregated from another source and integrated into your site. It is feeding you user-generated data and we all know what that means: potential abuse. You might be tempted to trust the third-party site to validate user input and thus pass on only “good bits” to you, but that trust would likely be misplaced and even if it wasn’t, it’s simply a bad idea to trust user-generated content whether it comes direct from the user or via a third-party mechanism such as the Twitter API.
Because it’s user-generated, you need to validate the content before you include it in your site and potentially expose your visitors and customers to malicious content. Remember that your visitors and customers see you as the source of the content, not Twitter or Facebook or Google. You. If they’re infected by malware or sent off to an “adult” site simply by visiting your application, you are going to be blamed.
Unfortunately, validating the content may be more difficult than it first sounds because of the extensive use of “gadgets” or “widgets” to integrate third-party content into sites. You don’t always have access to the source (it may be Flash-based) or control over that stream of data. If you do, of course, you ought to be wrapping that integration in validation and scrubbing logic like a baby in swaddling clothes: tightly.
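As a rough idea of what “tightly” might look like when you do control the integration code, the sketch below validates each incoming message before escaping it. The name `scrub_tweet` and the 140-character cap are assumptions made for illustration; the right checks depend on what your feed actually delivers.

```python
import html
import unicodedata
from typing import Optional

# Assumed limit for illustration: Twitter's message length at the time.
MAX_TWEET_LENGTH = 140


def scrub_tweet(raw: object) -> Optional[str]:
    """Return HTML-safe text for one tweet, or None if it should be dropped."""
    # Validate the type first: the feed is untrusted input.
    if not isinstance(raw, str):
        return None
    # Remove control characters (Unicode category "Cc"), then trim whitespace.
    cleaned = "".join(ch for ch in raw if unicodedata.category(ch) != "Cc")
    cleaned = cleaned.strip()
    # Drop empty or over-long messages rather than trying to repair them.
    if not cleaned or len(cleaned) > MAX_TWEET_LENGTH:
        return None
    # Escape last, so the result is safe to place in HTML text or attributes.
    return html.escape(cleaned, quote=True)
```

Dropping anything that fails validation, instead of attempting to clean it up, keeps the logic small enough to reason about.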
A better option if you have control (and I say better because, with this option, you won’t have to modify the application every time a new vulnerability is discovered) is to leverage a web application firewall to validate and “scrub” the incoming data stream. In the same way a web application firewall protects your applications and data stores from exploitation, it can be used to intercept and inspect incoming content streams for junk and malicious code. Note that this solution, like the others, cannot scrub content that is included via a “widget” or “gadget” that uses JavaScript (and potentially XHR) to retrieve third-party content, because the visitor’s browser fetches that content directly and it never passes through your infrastructure. Content retrieved from third-party sites is inherently a risk because the exchange of data occurs outside the boundaries of organizational control.
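The inspection idea can be illustrated in a few lines of Python, with the caveat that this is only a toy: it does not represent how any particular web application firewall works, the signature list is hypothetical and far from complete, and the feed shape (a JSON array of objects with a `text` field) is an assumption.

```python
import json
import re

# Hypothetical signatures; real rule sets are far more extensive.
SUSPICIOUS = re.compile(
    r"<\s*/?\s*(script|iframe|object|embed)\b"  # dangerous elements
    r"|javascript\s*:"                          # javascript: URLs
    r"|\bon\w+\s*=",                            # inline event handlers
    re.IGNORECASE,
)


def inspect_feed(payload: str) -> list:
    """Parse a JSON array of {"text": ...} items and keep only clean ones."""
    items = json.loads(payload)
    if not isinstance(items, list):
        return []
    clean = []
    for item in items:
        text = item.get("text") if isinstance(item, dict) else None
        # Keep an item only if it has non-empty text with no signature match.
        if isinstance(text, str) and text and not SUSPICIOUS.search(text):
            clean.append(item)
    return clean
```

Signature matching of this kind will produce false positives and can be evaded, so it is best treated as an additional layer in front of output escaping, not a substitute for it.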
As much as widgets and gadgets are useful and make integration easy for anyone, they ignore the larger issue of security and the potential harm to passers-by from malicious or exploitative content swimming upstream. It is therefore in the best interests of any organization to analyze the risk of potential exploitation and weigh that against integrating the old-fashioned way: via code and an API. Doing so takes more time and effort, yes, but it affords the organization control over the content and the way in which that content stream is integrated, which can mitigate many of the risks associated with offering up non-validated user-generated content to your visitors.
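Here is a sketch of what the “old-fashioned” route could look like: the server fetches the feed itself, so every message passes through code you control before it reaches a visitor. The fetching function is injected as a parameter precisely because no real Twitter API call is assumed here; `fake_fetch` is a stand-in for illustration.

```python
import html
from typing import Callable, Iterable, List


def build_feed_html(fetch_tweets: Callable[[], Iterable[str]], limit: int = 10) -> str:
    """Fetch tweets server-side and return an escaped HTML list."""
    items: List[str] = []
    try:
        for text in fetch_tweets():
            if len(items) >= limit:
                break
            # Skip anything that is not non-empty text.
            if not isinstance(text, str) or not text.strip():
                continue
            items.append(f"<li>{html.escape(text.strip(), quote=True)}</li>")
    except Exception:
        # Fail closed: if the feed misbehaves, publish nothing from it.
        items = []
    return '<ul class="tweets">' + "".join(items) + "</ul>"


def fake_fetch() -> List[str]:
    # Stand-in for a real API call; the messages are invented.
    return ["Great rally today #cashgordon",
            "<script>alert('x')</script> #cashgordon"]


print(build_feed_html(fake_fetch))
```

Because the page only ever contains what `build_feed_html` returns, the organization decides what happens when the feed is hostile or unavailable, which is exactly the control a client-side widget does not offer.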