Web 2.0 is built primarily on two technologies: AJAX and RSS. AJAX is used to develop interactive, real-time applications, while RSS is used primarily for integration and syndication. Import a feed, share a feed, drag-n-drop a gadget, widget, or component: it's all RSS (XML) today.
It's further becoming a requirement of Web 2.0 sites that they provide some sort of API through which developers can write add-on applications. Twitter, Tumblr, Facebook. They all offer APIs that are quite heavily used at this time and startups are following suit. Other sites offer richer media, like video or slideware, that can be embedded in blogs and web sites but that is still served from the provider's site.
It's no wonder, then, that sites like Twitter have a hard time scaling up to meet demand. Capacity planning is no longer primarily concerned with "hits" in the traditional sense of users simply loading a set of pages in a browser. Capacity planners now have to consider hits and application flows from multiple sources - all with slightly different performance profiles and usage characteristics.
A user can no longer be equated to a "hit". It's not possible today to predict usage patterns based simply on the number of users. Capacity can't be determined by asking, "If all 60,000 of our users hit the site at once, can we handle it?" because it's not just your 60,000 users you have to serve, it's 60,000 users in addition to a plethora of sites integrated with yours and their users. And that may be information you can't get, or that it just isn't possible to know.
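To make that arithmetic concrete, here is a minimal sketch of a per-source load estimate. Only the 60,000 figure comes from the text above; the source names, client counts, and request rates are invented for illustration, and a real estimate would use measured numbers.

```python
from dataclasses import dataclass


@dataclass
class TrafficSource:
    name: str
    clients: int                        # users, integrated sites, or pollers
    requests_per_client_per_min: float  # assumed average rate, not measured


def requests_per_second(sources):
    """Sum the per-minute demand of every source and convert to per-second."""
    per_minute = sum(s.clients * s.requests_per_client_per_min for s in sources)
    return per_minute / 60.0


# Hypothetical traffic mix: only the 60,000 users appear in the article.
sources = [
    TrafficSource("browser users", 60_000, 2.0),
    TrafficSource("API clients", 5_000, 30.0),
    TrafficSource("RSS pollers", 20_000, 0.1),
]

browser_only = requests_per_second(sources[:1])
all_sources = requests_per_second(sources)
print(f"browser users only: {browser_only:.0f} req/s")
print(f"all sources:        {all_sources:.0f} req/s")
```

Under these made-up rates, a few thousand API clients generate more requests than all 60,000 browser users, which is why counting users alone understates the load.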
It's All About the Architecture
Scalability is quickly becoming an architectural exercise. Just as a house can't stand for long if it's not properly architected, neither can a web site. Application scalability, performance, and reliability today are just as reliant on the architecture of the infrastructure as they are on the application's design and implementation.
It's becoming imperative to understand the load that is placed upon servers based on where the requests are coming from and what they're trying to do. Streaming a video off a site may put little burden on a server when it's embedded in a blog hosted by a service provider with a very large pipe. When it's embedded in a blog hosted by a smaller provider with a smaller pipe, it's going to take more time to stream, which necessarily consumes resources for a longer period of time. Those are resources that can't be used by another user, integration point, or application.
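The effect of a smaller pipe can be put into rough numbers. The sketch below assumes the transfer is limited only by the client-side bandwidth and uses Little's Law (connections in flight = arrival rate x time each one is held); the file size, bandwidths, and request rate are illustrative values, not measurements.

```python
def transfer_seconds(size_megabytes, megabits_per_sec):
    """Time a connection stays open, assuming bandwidth is the only limit."""
    return size_megabytes * 8 / megabits_per_sec


def connections_in_flight(requests_per_sec, seconds_per_request):
    """Little's Law: average concurrency = arrival rate * time in system."""
    return requests_per_sec * seconds_per_request


VIDEO_MB = 50        # hypothetical video size
REQUESTS_PER_SEC = 10  # hypothetical arrival rate

fast = transfer_seconds(VIDEO_MB, 100)  # large pipe, 100 Mbit/s
slow = transfer_seconds(VIDEO_MB, 5)    # small pipe, 5 Mbit/s

print(f"large pipe: {fast:.0f}s per stream, "
      f"{connections_in_flight(REQUESTS_PER_SEC, fast):.0f} connections held")
print(f"small pipe: {slow:.0f}s per stream, "
      f"{connections_in_flight(REQUESTS_PER_SEC, slow):.0f} connections held")
```

The request rate is identical in both cases; only the time each connection is held changes, and the number of server resources tied up at any moment scales with it.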
As sites grew they used to simply buy up a load-balancer when it became necessary and toss it into the mix. And it worked. It provided the scalability and reliability that users demanded. But today that's not necessarily going to work. As Twitter has pointed out, its scalability issues are primarily around architecture and design of its systems. Simply adding a load-balancer to the mix won't necessarily fix its problems.
The architecture of Web 2.0 sites that hope to grow into monsters needs to be planned early, before the site ends up standing on the verge of collapse because of high demand. And while load-balancing is certainly a part of that architecture, it's necessary to move higher up the stack, into the application layer, in order for network-focused products to be a part of the solution from day one. Such solutions need intelligence and application fluency that can be leveraged in the original architecture, such that scalability is once again as easy as adding new servers, especially in light of the growing popularity of "built to fail" infrastructures.
Separating APIs from the presentation layer of a site is a good start. So is running the processes that generate RSS feeds on separate servers from APIs and the general site. Load-balancing such an environment is going to need some application smarts, though, in order to intelligently direct requests to the right server internally. It's going to require some intelligence to understand individual requests and the network conditions that exist in real-time, such that optimization and acceleration features like compression and caching can be employed when it makes sense and when it will be a benefit.
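As a rough illustration of what "application smarts" might look like, the sketch below routes requests to separate server pools by URL path and decides per response whether compression is worthwhile. The pool names, path conventions, and thresholds are all hypothetical; this is not how any particular load-balancer or application delivery controller is configured.

```python
from itertools import cycle

# Hypothetical server pools, one per function, each rotated round-robin.
POOLS = {
    "api": cycle(["api-1", "api-2"]),
    "feeds": cycle(["feed-1"]),
    "web": cycle(["web-1", "web-2", "web-3"]),
}

# Content types assumed to compress well.
COMPRESSIBLE = {"application/json", "application/xml", "application/rss+xml"}


def choose_pool(path):
    """Pick a pool from the request path (assumed URL layout)."""
    if path.startswith("/api/"):
        return "api"
    if path.startswith("/feeds/") or path.endswith(".rss"):
        return "feeds"
    return "web"


def route(path):
    """Return (pool name, next server in that pool)."""
    pool = choose_pool(path)
    return pool, next(POOLS[pool])


def should_compress(content_type, body_bytes, client_rtt_ms):
    """Compress only text-like, non-trivial responses to slower clients.

    The 1 KB and 50 ms cutoffs are placeholder values for illustration.
    """
    text_like = content_type.startswith("text/") or content_type in COMPRESSIBLE
    return text_like and body_bytes >= 1024 and client_rtt_ms >= 50
```

For example, `route("/api/statuses")` would return the `"api"` pool and one of its servers, while `should_compress("video/mp4", 5_000_000, 120)` is false because compressing already-encoded video costs CPU without saving bandwidth.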
Don't Just Serve It, Deliver It
The nature of Web 2.0 sites, with their APIs and integration mechanisms added to the mix of ways in which we can access, integrate, and digest their information, is changing (or should be!) the ways in which we look at scalability and how we can architect smarter, better solutions before scalability becomes a problem. There's no single answer to this problem, because every site, every API, every application is unique in its usage and performance characteristics. What worked for Amazon may not work for Twitter, and vice-versa. That's why it's just not enough to drop in a "load-balancer", and why we continually talk of "application delivery" instead.
A smarter architecture requires smarter components, and that makes an application delivery controller a better fit for achieving true scalability than a simple load-balancer.