If you’re just joining this series, check out Reason #1 and Reason #2 That You Need File Virtualization

Tiering is the one benefit that a salesperson will always tell you about, so I was putting it off a bit, but it's also the thing I've had the most requests for, and it's a benefit most enterprises can relate to. I'll tell you up-front that even though it does offer huge savings, this isn't one of the big drivers for me, for reasons I'll mention below – I knew about the benefits of tiering before I decided that File Virtualization was a good thing – but it is one of the "death by a thousand cuts" reasons, and one that many of you would benefit from.

The Normal Case

You've got high-end storage in your data center, and you've also got low-end storage. Finally, you have tape or a VTL.

Today you're likely trying to control what is stored where through how you expose shares and which users or applications have access to them. This is better than no plan, but it has its weaknesses. Primarily, you're not managing by data; you're managing by user or application. That means you're likely being inefficient about what goes where, because the data itself should tell you where it belongs, not who created or uses it.

There are plenty of studies out there on the difference in cost between SCSI/FC/SAS disk and SATA. We won't delve into them here, but suffice it to say the price difference is large – large enough that you probably don't want all of your data on high-end storage. Any statement beyond that will draw fire from one end or the other of the spectrum and detract from the point of this article. The fact is that most enterprises have both types of storage, which is the key point for this post.

So you have two or more "tiers" of storage. There are speed and capacity differences, there are cost/TCO differences, and there are sometimes location differences.

The Promise of ILM

Keeping your data strictly aligned with access frequency and performance requirements is well-nigh impossible without some tool to help you. This was one of the goals of Information Lifecycle Management (ILM) tools – one that at the time I called "low-hanging fruit" that would be sucked into existing applications and live on long after ILM had passed out of favor, the same way HSM did.

Here we are several years later, and it turns out that File Virtualization products have done just that: pulled in the concept of tiering and made it useful. I can speak definitively about ARX, and it certainly does. While I've read about some competitors' support for the same functionality, I haven't studied them closely or used them personally, so you'll have to ask them about their level of support. In this document, tier one is your uber-fast storage, tier two is your regular storage, and "backup" refers to both tape and VTL – it does not refer to replication, though in some cases replication might fit (I'm saving replication for a separate blog entry).

Reaping the Promise – without the ILM hype

So now you have to define the system that will move files around across tiers. You can do this by several different attributes, but we'll stick to the most common – filename, file size, and age (since last modified). These three attributes cover the most-used scenarios: move all files of a given type (all Microsoft Excel files, for example) off to tier two storage because they're not response-time critical (a few milliseconds' difference opening them won't matter); move all large files off to tier two storage because they're pigs and you want to keep tier one open for more granular stuff; and finally, move documents that haven't been modified in X days off to tier two because they've likely aged out of critical importance.

Once you determine which file servers (or NAS devices, or whatever) are going to be your primary and secondary, you set them up as different folders on the File Virtualization appliance, and then you create rules that tell the system what to move where. These rules use file metadata to pick which files get moved, so you could make a rule that says "move all files ending in .XLS that haven't been modified in a week to tier two." These rules are then placed inside policies that run them on a regular basis against a defined set of folders or shares. You also make rules that return things – "move all files ending in .XLS that have been modified in the last week to tier one" – and apply those to the tier two shares/folders. (Note that rules and policies are ARX terms; other vendors may use different names for the same concepts.)
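If it helps to see the logic spelled out, here's a minimal sketch in Python of what one of those rule pairs expresses. To be clear, this isn't how ARX is configured (the appliance has its own management interface) – it's purely illustrative, and the paths, the .xls extension, and the one-week threshold are assumptions for the example.

```python
import shutil
import time
from pathlib import Path

# Hypothetical tier locations -- in a real deployment these would be shares
# presented through the File Virtualization appliance, not local paths.
TIER1 = Path("/mnt/tier1")
TIER2 = Path("/mnt/tier2")

AGE_LIMIT = 7 * 24 * 60 * 60  # one week, in seconds (assumed threshold)


def age_in_seconds(path: Path) -> float:
    """Seconds since the file was last modified."""
    return time.time() - path.stat().st_mtime


def demote_stale_spreadsheets() -> None:
    """Rule: move .xls files not modified in a week from tier one to tier two."""
    for f in TIER1.rglob("*.xls"):
        if f.is_file() and age_in_seconds(f) > AGE_LIMIT:
            dest = TIER2 / f.relative_to(TIER1)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(f), str(dest))


def promote_active_spreadsheets() -> None:
    """Return rule: move .xls files modified in the last week back to tier one."""
    for f in TIER2.rglob("*.xls"):
        if f.is_file() and age_in_seconds(f) <= AGE_LIMIT:
            dest = TIER1 / f.relative_to(TIER2)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(f), str(dest))


if __name__ == "__main__":
    # A "policy" here is nothing more than running both rules on a schedule
    # (cron or similar) against the defined set of folders.
    demote_stale_spreadsheets()
    promote_active_spreadsheets()
```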

And with that, you've achieved one of the largest benefits promised by ILM – tiering. That really is all there is to it. Of course, implementation will take a bit longer than it took me to write this, but the guts of how it's achieved are here. Over time your shares will sort themselves out to contain the correct content.

And That’s a Big Deal Because…

So, other than keeping your data all neat and tidy, what is the big deal?

You don't have to back up tier two every night. The backup window, which was supposed to shrink with the advent of VTLs and replication, is still dogging us in the enterprise, and this takes a huge chunk of backup data and removes it from the nightly stream. Tier two holds older documents; you simply don't need to back it up every night. If you use one week as your movement timer (your "aging," per se), then you back up tier two once a week and tier one nightly. Tier one will inevitably be a small percentage of your overall storage, requiring a small percentage of your normal backup window. If you use one month as your timer, then more data will remain on tier one, but you only need to back up tier two once a month.
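To put rough numbers on that, here's a back-of-the-envelope sketch. The 10 TB total and the 80/20 split are made-up figures, purely to show the shape of the savings.

```python
# Back-of-the-envelope illustration with made-up numbers: 10 TB of file data,
# 20% of it still active on tier one, weekly aging, weekly tier-two backup.
total_tb = 10.0
tier1_fraction = 0.2  # assumed share of data remaining on tier one

nightly_before = total_tb                        # everything, every night
nightly_after = total_tb * tier1_fraction        # only tier one nightly
weekly_tier2 = total_tb * (1 - tier1_fraction)   # tier two once a week

print(f"Nightly backup before tiering: {nightly_before:.1f} TB")
print(f"Nightly backup after tiering:  {nightly_after:.1f} TB")
print(f"Weekly tier-two backup:        {weekly_tier2:.1f} TB")
```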

Frankly, that is huge. That's why your salesperson will tell you about this option – it's a huge win in backup times, requires very little manual intervention, and needs no changes to your applications beyond the change you make when you put the File Virtualization appliance into place. Since the folders for tier one and tier two can be anywhere on the virtualization appliance as long as they are distinct folders, you don't have to change your directory structure to attain tiering. If you should want to move a folder from primary to secondary, that is possible with replication; we'll touch on that later.

But for now, that's it… You get the best of ILM, you get shorter backup times, and you get fewer calls from application owners who want to cut into your huge backup window. You go home at night and sleep more peacefully, the world is a better place… Okay, maybe not that last bit. ;-)

Until next week,

Don.