Technical Article Infrastructure Scalability Pattern: Sharding Sessions November 01, 2010 by Lori MacVittie 3341 article application delivery architecture availability cloud design dev devops dynamic infrastructure infrastructure infrastructure 2.0 load balancing management performance scalability us virtualization 0 A deeper dive on how to apply scalability best practices using infrastructure services. So it’s all well and good to say that you can apply scalability patterns to infrastructure and provide a high-level overview of the theory but it’s always much nicer to provide more detail so someone can actually execute on such a strategy. Thus, today we’re going to dig a bit deeper into applying a scalability pattern – horizontal partitioning, to be exact – to an application infrastructure as a means to scale out an application in a way that’s efficient and supports growth and that leverages infrastructure, i.e. the operational domain. This is the reason for the focus on “devops”; this is certainly an architectural exercise that requires an understanding of both operations and the applications it is supporting, because in order to achieve a truly scalable sharding-based architecture it’s going to have to take into consideration how data is indexed for use in the application. This method of achieving scalability requires application awareness on the part of not only the infrastructure but the implementers as well. It should be noted that sharding can be leveraged with its sibling, vertical partitioning (by type and by function) to achieve Internet-scale capabilities. SHARDING SESSIONS One of the more popular methods of scaling an application infrastructure today is to leverage sharding. Sharding is a horizontal partitioning pattern that splits (hence the term “shard”) data along its primary access path. This is most often used to scale the database tier by splitting users based on some easily hashed value (such as username) across a number of database instances. Developer-minded folks will instantly recognize that this is hardly as easy as it sounds. It is the rare application that is developed with sharding in mind. Most applications are built to connect to a single database instance, and thus the ability to “switch” between databases is not inherently a part of the application. As with other scalability patterns, applying such a technique to a third-party or closed-source application, too, is nearly impossible without access to the code. Even with access to the code, such as an open-source solution, it is not always in the best interests of long-term sustainability of the application to modify it and in some cases there simply isn’t time or the skills necessary to do so. In cases where it is simply not feasible to rewrite the application it is possible to leverage the delivery infrastructure to shard data along its primary incoming path and implement a horizontal partitioning strategy that, as required, allows for repartitioning as may be necessary to support continued growth. What we want to do is leverage the context-aware capabilities of an application delivery controller (a really smart Load balancer, if you will) to inspect requests and ensure that users are routed to the appropriate pool of application server/database combinations. Assuming that the application is tied to a database, there will be one database per user segment defined. This also requires that the application delivery controller be capable of full-proxy inspection and routing of every request, because if a user whose primary storage is in one database, a request for a user’s data that is stored in another database will need to be directed appropriately. This can be accomplished using the inspection capabilities and a set of network-side scripts that intercept, inspect, and then conditionally route each request based on the determination of which pool of resources holds the requested data. This will often take the form of a hashed value, but as this can add latency to the total response time an alternative to ascertaining the appropriate hashed value for each request is to compute the hashed value (which is used to choose the appropriate pool of resources) and then set a cookie with that value that can be easily accessed upon subsequent requests. Other methods might include simply examining the first X digits of some form of ID carried in a header value. Because the choice at the application delivery layer directly correlates to persistent data, it is important that whatever method is used by always available. Immutable and always available data should be used as the conditional value upon which routing decisions are made, lest users end up directed to a pool of resources in which their persistent data does not reside or, worse, rejected completely. Alternatively, another approach is to shard data based on location of the user. Again, this requires the ability of the infrastructure to determine at request time the location of the user and route requests appropriately. For smaller sites this type of infrastructure may be self-contained but for large “Internet” scale applications this may actually go one step further and leverage global application delivery techniques like location-based load balancing that first directs users to a specific data center, usually physically located near them, and then internal leverages either additional sharding or another partitioning scheme to enable granular scalability throughout the entire infrastructure. SHARDING DATA is COMPLICATED Sharding data – whether within the application or in the infrastructure – is complicated. It is no simple thing to spread data across multiple databases and provide the means by which such data might be shared across all users, regardless of their data “home”. Applications that use a shared session database architecture are best suited for a sharding-based scalability strategy because session data is almost never shared across users, making it easier to implement as a means to scale existing applications without modification. It is not impossible, however, to implement a sharding-based strategy to scale applications in which other kinds of data is split across instances. It becomes necessary then to more carefully consider how data is accessed and what infrastructure components are responsible for performing the necessary direction and redirection to insure that each request lands in the right data pool. The ability to load balance across databases, of course, changes only where the partitioning occurs. Rather than segment application server and database together, another layer of load balancing is introduced between the application server and the database, and requests are distributed across the databases. This change in location of segmentation responsibility, however, does not impact the underlying requirement to distinguish how a request is routed to a particular database instance. Using sharding techniques to scale the database layer still requires that requests be inspected in someway to ensure that requests are routed to where the requested data lives. One of the most common location-based sharding patterns is to separate read and write, with write services being sharded based on location (because inserting into a database is more challenging to scale than reading) with read being either also sharded or scaled using a central data store and making heavy use of memcache technology to alleviate the load on the persistent store. Data must be synchronized / replicated from all write data stores into the centralized read store, however, which eliminates true consistency as a possibility. For applications that require Internet-scale but are not reliant on absolute consistency, this model can scale out much better than those requiring near-perfect consistency. And of course if the data source is shared and scaled out as a separate tier, with sessions stored in the application layer tier, then such concerns become moot because the data source is always consistent and shared and only the session layer need be sharded and scaled out as is required. DEVOPS – FOR THE WIN This horizontal scaling pattern requires even more of an understanding of application delivery infrastructure (load balancing and application switching), web operations administration, and application architecture than vertical scaling patterns. This is a very broad set of skills to expect from any one individual, but it is the core of what comprises the emerging devops discipline. Devops is more than just writing scripts and automating deployment processes, it’s about marrying knowledge and skills from operations with application development and architecting broader, more holistic solutions. Whether the goal is better security, faster applications, or higher availability devops should be a discipline every organization is looking to grow. In our next deep dive we’ll dig into THE Non-Partitioned Dynamic Scalability PATTERN Related blogs & articles: So That was a Bummer (Foursquare, Sharding, and Downtime) Applying Scalability Patterns to Infrastructure Architecture Service Virtualization Helps Localize Impact of Elastic Scalability Infrastructure Scalability Pattern: Partition by Function or Type Cloud Computing: Vertical Scalability is Still Your Problem Scalability Only One Half the Reliability Equation Automating scalability and high availability services Web 2.0: Integration, APIs, and Scalability Vertical Scalability Cloud Computing Style Cloud + BPM = Business Process Scalability Cloud Load Balancing Fu for Developers Helps Avoid Scaling Gotchas The Battle of Economy of Scale versus Control and Flexibility Scalability with multiple networks for Virtual Servers ... last modified: November 01, 2010 0 Comment(s): You must be logged in to post comments.