One of the questions we frequently get from the field and customers is how to appropriately tune the profile for caching. There are lots of settings in the profile and a mis-configuration can actually cause some pretty adverse effects, so getting the settings tuned properly is highly recommended. Of course the answer to this question is my go-to response ‘It depends.’ I am sure many people have gotten tired of always hearing the same answer for every question, but there is no one size fits all answer to this question.

The natural follow on question is “What does it depend on?” Here I can help you with more details. First are you trying to tune caching for RAM cache (AKA Fast cache) or are you trying to tune for Application Acceleration Manager (AAM)? The settings in the profile will perform differently for each of the caches.

How do you determine which objects are cacheable and for how long?

RAM Cache as the name implies is based entirely on RAM memory and is available with every BIG-IP LTM. AAM’s cache on the other hand uses both RAM memory and disk for storing objects. How the two determine which objects to cache and for how long differs. AAM decides if an object is cacheable based on the policy associated with the application assigned to the profile.

AM_Applications

Filters are then applied based on object size, “Responses Cached” and Profile settings. How long an object is cached for is then determined by the lifetime settings within the policy. RAM Cache determines if an object is cacheable and for how long based on the configuration within the profile. The settings are the same for all object types there is no per-object setting as exists with AAM.

clip_image004

This profile can control both AAM and RAM Cache, although the settings mean different things depending on which you are configuring for. The table below outlines the differences

Table 1 highlights the differences between how decisions on caching are made.

Setting

RAM Cache

AAM Cache

Cache Size

Maximum amount of space that can be used per profile. No borrowing occurs.

Minimum amount of space that is dedicated to the profile, borrowing will occur if resources are available.

Max Entries

Maximum number of objects that can be stored

Number of references that are stored for objects in the resource and entity cache. A reference to an object can be evicted from the resource cache but the item still exists in cache and can be served. Responses served from cache may be slightly delayed in these circumstances, but requests will not be proxied to the origin web servers.

How long objects are cached for

Fixed for all objects based on the max-age setting in the acceleration profile

Configurable on a per object or object type basis in the acceleration policy

Determination if an object is cacheable

Based on configuration in the acceleration profile

Based on the acceleration policy responses cached and proxy settings along with the object size setting from the acceleration profile.

How much space can be used for caching?

The maximum amount of space available for caching is half of the RAM a TMM process has been allocated. Depending on which platform you are using will impact how much space is available for caching. RAM is used for smaller objects and disk is used for larger objects. The maximum amount of space, both memory and disk) that is available for caching with AAM is up to 256 GB per profile, if resources are available. This does NOT mean you should set the size on all profiles to 256 GB. AAM will borrow if space is available. The trick is figuring out what the initial value should be. The following provides some guidelines on how to calculate this initial value.

Calculating the ideal cache size

The initial set of variables to care about regarding the cache size: OBJECT_SIZE and lifetime settings. Of course, the values of these variables are going to depend (there’s that pesky word again) on the application, the application content, the traffic patterns, etc. The more unique cacheable objects the application may require a larger cache to run faster, however the frequency of access for those objects, if it's low, may make a large cache to be a waste of space since the objects expire in the cache before the next request needs them based on the lifetime, plus cache latency introduced by the high number of records. See it depends.

When the cache is full, AAM will evict the entry that is deemed less important, in order to make room for a new one, resulting in cache misses if the number of popular entities is higher than what the cache can accommodate. Lifetime settings have meaning here again, since it could be the case where having a high age value forces the cache to keep on rotating (evicting) still valid content. The main goal should be to minimize evictions and maximize the load savings on the origin web servers.

Other "external factors", that dictate amount of memory/disk space available for caching in AAM are:

· Hardware specs.

· Number of applications running on that device.

· Other modules running in the BIG-IP.

As I said in the beginning and you can now see this depends on a number of variables, there's no hard answer that applies to all scenarios. Knowing the specifics of the application makes setting the values easier, however if you don’t know the specifics here are some general guidelines on setting the values-:

clip_image005· Min/Max Object Size: Knowing the distribution of object sizes can help determine what these values could be. If your site is made up of mostly GIFs setting a minimum object size of 10Kb could result in the majority of the objects not being cached. Similarly if your objects are mostly flash files and the maximum object size was set to 100 Kb not many items would be cached. Minimum values of 2-4Kb and maximum values of 1MB are good starting points for these settings

 

· Aging/Lifetime settings: How long should content be cached for is often times a business decision. AAM uses default lifetimes of 4 hours for static content such as images and includes. This means an object will not be revalidated for 4 hours, in most instances this is good. Altering this would determine on how often objects are updated and how long it is safe to serve stale content. In most businesses it is rare for an object to be edited frequently. Yes, new objects and content will be added but the same exact file will likely not change. Take a social site like LinkedIn for example – people are constantly changing their profiles, posting articles, and adding content, but much of the content such as icons and JS files stays the same. The last modified dates of content on my LinkedIn home page range from November 2012 – today. With only a few objects from today. Having a cache serve the objects for 4 hours is relatively safe.

· Cache size: The cache-size value for the LTM web-acceleration profile should be set to a "trivial" value based on the content type. A good starting point could be the default value of 100MB, however if your site serves a lot of heavy images maybe a larger than default value should be used. Remember AAM will borrow space if needed so there is no need to set this to 100 GB. A value between 100-500MB is likely a good starting point. The trick here is making sure the space isn’t over or under utilized (more on this below).

· Number of entries: This should not be set to the total number of objects on the application but rather calculated based on the size of the cache above in either of the following ways:

1) If all content is of primarily a single object type such as images, you can calculate based on the average object size. According to HttpArchive the average image size is 19KB. If you set the cache size to be 100 MB then the max entries could be calculated using the following formula:

Cache size / average object size = Max entries

102400/19 = 5389

I would suggest rounding up to pad slightly to a value of 6000.

2) Now not all caches will cache the same exact type of object there will be objects of varying sizes and content types so an alternative way of calculating the max entries

# of HTML pages * average # of objects per page = Max entries

HttpArchive reports that the average number of objects on a page is 95 and the average number of requests across a single domain is 51. Why the discrepancy and which number to use? With domain sharding and third party content the requests will not all come from a single FQDN. For the purpose of this calculation we are concerned with the objects that are being served from the origin servers no the third party content so I will choose the lower of the numbers. Sadly there is not a metric for the average number of pages if you have access to that number use it otherwise you will have to guess. For the purpose of this example I am going with a nice round number of 300 pages.

300 * 51 = 15300

That’s a lot of objects and honestly is probably too high but we’re not done calculating yet. We assumed that every page will be downloading 51 unique objects from cache, this is not the case. There are likely common items on the pages js, css, images which will be getting served from the browser’s cache and some pages which are only accessed once in a blue moon, it would be safe to estimate that 50-75% of the objects will be getting served from caching resulting in a total of 7650-11475. A number within this range would be a good starting point.

There is a bit of trial and error that goes into configuring the settings. With the above guidance and the process below it becomes a bit easier to narrow in on the best settings.

1.- Set the cache values to a seed value as described above and evaluate.

2.- Let the Application receive the traffic it is expected to receive normally.

3.- Monitor the cache stats:

Via TMSH on box

$ tmsh show ltm profile web-acceleration

Or the TMCTL version which provides the output in csv for scripting analysis & parsing

 

$ tmctl profile_webacceleration_jail_stat

For example:

 

tmctl -c profile_webacceleration_jail_stat | grep | grep

And look for cache_size and cache_evictions. You can run the following (just put the appropriate WEB_ACCEL_PROFILE_NAME and VIRTUAL_SERVER_NAME) to get the simplified table:

% cut_fields=`tmctl -c profile_webacceleration_jail_stat | head -1 | awk 'BEGIN{FS=","; fields="name,vs_name,cache_size,cache_evictions"; split(fields,sfx,","); for (x in sfx) sf[sfx[x]] = sfx[x]; cut_fields=""} { for (i=1; i<=NF; ++i) { if ($i in sf ) cut_fields=cut_fields i"," } } END{ print cut_fields }'`; echo ; echo 'Stats table:' ; tmctl -c profile_webacceleration_jail_stat | head -1 | cut -d ',' -f $cut_fields ; tmctl -c profile_webacceleration_jail_stat | grep WEB_ACCEL_PROFILE_NAME | grep VIRTUAL_SERVER_NAME | cut -d ',' -f $cut_fields; echo

Like:

% cut_fields=`tmctl -c profile_webacceleration_jail_stat | head -1 | awk 'BEGIN{FS=","; fields="name,vs_name,cache_size,cache_evictions"; split(fields,sfx,","); for (x in sfx) sf[sfx[x]] = sfx[x]; cut_fields=""} { for (i=1; i<=NF; ++i) { if ($i in sf ) cut_fields=cut_fields i"," } } END{ print cut_fields }'`; echo ; echo 'Stats table:' ; tmctl -c profile_webacceleration_jail_stat | head -1 | cut -d ',' -f $cut_fields ; tmctl -c profile_webacceleration_jail_stat | grep webacceleration | grep _listener | cut -d ',' -f $cut_fields; echo

This command will output the cache size at that moment, and the cache evictions (the number of objects that were pushed out of the cache to make room for new objects). In the example below the cache is empty and as a result there are no evictions.

clip_image006

4.- Given that applications and traffic patterns are fluid and constantly changing it is recommended to periodically monitor the cache size and store the data in a table to view trends over time. If the maximum cache size is reached frequently or there is a high number of cache evictions then adjusting the cache size would be recommended. On the other hand, if you are barely reaching half the value for the cache size and there are no evictions, consider reducing the setting for a more efficient use of resources.

Maximizing the cache hits, highly depends on the traffic pattern. A pattern that is conducive to caching depends on having a subset of documents out of the entire document space that are highly popular, and a long tail of less popular documents. Ideally we have enough space to fit all the highly popular documents. If not, then whatever can fit in becomes the cacheable popular content and we have to live with it. As cache pressure rears its head, we throw out a document based on a calculated weight that is derived from some of the parameters AAM to pick a document that has been configured as less important to throw out when under pressure.

An important observation here, note that the more objects cached, the greater the time to first byte, so if latency is mentioned as something more important than OWS off-load, you should take note of that.

Look carefully at the traffic. Any content produced by programs or scripts, or that require database accesses may not be useful to cache. If it is useful, a select sub-set of very low recency, high hit count, highly ephemeral objects should be marked as memory only.

 

A very big thank you to my following coworkers Eswar Bala, Sergio Ligregni, Matt Miller and John Stevens for contributing to this article.