Don MacVittie - Persistently Different

posted on Wednesday, May 05, 2010 12:07 PM

YOU GET THE CALL AT 2AM.

“Web Server XYZ is out of disk space” says the voice on the other end of the line in a tone that, at 2am, could pass for Scotty screaming “Capt’n! I don’t know how much longer I can ‘old her together!”

And that’s just the beginning of your nightmare, for this is exactly the scenario that you implemented thin provisioning to avoid, and if it has come to pass anyway, then you know that…

“And now server ABC is crashed!” the voice shouts, reminiscent of Scotty’s “She’s breakin’ up capt’n!” Except in your case, Scotty isn’t there to save the day. Or the night, as the case may be.

Soon you have boxes dropping all over the data center, all with disk space problems. And it’s not even 8am yet.



THIN PROVISIONING

For those who don’t know, thin provisioning is the ability to tell a system it has a ton of disk while it actually consumes only what it needs right now. In short, a pool of disk is placed behind a collection of servers, and each is told it has a certain amount of space dedicated to it. But that space isn’t really dedicated. The point of thin provisioning is to tell two or ten or two hundred machines that they each have 500 GB available, even if only one terabyte is available to the lot of them. Then, without certain knowledge of growth patterns on each machine, you can still let any of them consume more disk without problems, as long as their combined usage stays within the pool.
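To put rough numbers on that, here is a minimal sketch of the arithmetic for ten servers that are each promised 500 GB from a one-terabyte pool. The `ThinPool` class and its method names are made up for illustration (they are not any vendor's API), and decimal units (1 TB = 1000 GB) are assumed.

```python
from dataclasses import dataclass

GB_PER_TB = 1000  # assumption: decimal units, purely for illustration


@dataclass
class ThinPool:
    """Hypothetical model of a shared pool behind thin-provisioned servers."""
    physical_gb: float
    promised_gb_per_server: float
    server_count: int

    def promised_gb(self) -> float:
        # Total space the servers have been told they own.
        return self.promised_gb_per_server * self.server_count

    def overcommit_ratio(self) -> float:
        # How many times over the physical disk has been promised.
        return self.promised_gb() / self.physical_gb

    def average_safe_usage_gb(self) -> float:
        # Average real usage per server at which the pool is exactly full.
        return self.physical_gb / self.server_count


pool = ThinPool(physical_gb=1 * GB_PER_TB,
                promised_gb_per_server=500,
                server_count=10)

print(pool.promised_gb())            # space promised across all servers
print(pool.overcommit_ratio())       # promised space divided by real space
print(pool.average_safe_usage_gb())  # average usage at which the pool fills
```

In this example the pool is promised five times over, so it only works while the servers average less than 100 GB of real usage each.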

This is extremely useful when you aren’t certain of storage growth patterns on a group of servers, but have an idea of the upper bounds or rate of storage usage change of those servers as a group. You can give them a chunk of disk that you know will suit their combined needs for a given time period, and not worry about them. Since we in IT are often asked to “throw up” servers that we can’t be certain of usage patterns for, this is a method of cutting risk while keeping costs contained. We won’t buy 10 Petabytes for that server unless it actually starts to use that much space. Until then it can share with some other servers the disk we can afford to put behind them.

This process absolutely reduces the risk that you will get a call around midnight that Server1 is using 100% of its disk while Server2 sits next to it at 5% disk utilization. It doesn’t eliminate the risk, though; even a completely virtualized infrastructure is going to suffer from physical resource limitations. That’s where cloud providers are supposed to help, because they (theoretically) have enough physical hardware to support more than their current possible workload. I’d check that, though: their focus is on OpEx and CapEx just like yours, and less hardware is more money.

Photo CC-by-SA Michael Moll


CLONE WARS

Most of us know what VM clones are at this point, but just to be sure we’re all on the same page: you can clone the image of a given VM so that you have an exact copy of it. There are a ton of options for how to set up the clone, but the key is that you have an exact copy. If you’re using DHCP to get the address and hostname for your server, then an exact copy is all you need. No changes are required; you can boot the copy and run. Lots of applications don’t do well with DHCP, but changing a static IP or hostname is as easy as running the clone and making the changes in the operating system, so it’s not a big deal even if your application needs static IPs.

And that’s where our story really gets interesting. There is a lot of cloning going on out there.

Not so long ago, a storage fellow I follow on Twitter sent out “with our product, you can make 10,000 clones in X minutes with zero storage costs!” That caused me to give him grief about the idea that those clones were actually disk-free. Of course they’re not. Clones aren’t made to sit and do nothing; they’re made to be used, and in the course of use they will both use disk directly and become modified from their original layout (as with changing an IP). Once the clone is no longer an exact duplicate of the original, it starts using disk indirectly too, to store the differences.
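Storing the differences can be pictured with a toy copy-on-write model. This is only a sketch under simplifying assumptions: the `BaseImage` and `LinkedClone` classes are hypothetical, blocks are plain strings, and real hypervisors and arrays track changes very differently. The point it shows is that a fresh clone costs nothing extra, and every block it changes afterward costs real disk.

```python
class BaseImage:
    """The original VM image, modeled as a fixed list of blocks."""

    def __init__(self, blocks):
        self.blocks = list(blocks)


class LinkedClone:
    """A 'zero storage' clone: reads fall through to the base image,
    writes land in a private delta that consumes real disk."""

    def __init__(self, base):
        self.base = base
        self.delta = {}  # block index -> modified contents

    def read(self, index):
        # Prefer the clone's own copy; otherwise use the shared base block.
        return self.delta.get(index, self.base.blocks[index])

    def write(self, index, data):
        if not 0 <= index < len(self.base.blocks):
            raise IndexError("block index outside the image")
        self.delta[index] = data  # the base image is never modified

    def extra_blocks_used(self):
        # Disk consumed beyond what the base image already occupies.
        return len(self.delta)


base = BaseImage(["os"] * 8)
clone = LinkedClone(base)
fresh_cost = clone.extra_blocks_used()     # nothing stored yet

clone.write(3, "static-ip-config")         # e.g. changing the IP
changed_cost = clone.extra_blocks_used()   # one block now costs real disk
```

Multiply that growing delta by 10,000 clones and "zero storage" stops being a useful description.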

Now clones are A Great Thing(TM) because they allow you to quickly bring up another instance of a server, or to capture a server in a specific state for future reference… But many clones, well, you saw the movie, right? Many clones can be a problem. They use resources. Sometimes lots of resources. I talked to an IT person, who shall remain nameless, who said they had their DB in a VM and had clones of it. I’m guessing their CISO hasn’t found out about this little gem yet. Which brings us to another point: if you are making a clone so that you have a backup, or so you can move it across machines or data centers, your clone will have to be a “full copy”. Otherwise it is dependent upon access to the original, which is not something most of us are thinking about when making the average clone. That means most clones are anything but free, even though I’ve been discussing the so-called “zero storage” versions.


THAT WAY LIES DOOM

And as the more astute of you may have guessed, the risk of getting a 2am call of doom from Scotty goes wayyyyy up in a virtualized environment, particularly if that environment uses thin provisioning. In an environment without thin provisioning you might lose a system or two, but total melt-down? Much more likely when several systems share disk.

Let us consider for a moment. Even in a world where you had physical servers for everything, your actual disk usage rate was elusive. Some companies had it down very well and could tell when App X was going to need more disk; others counted on alarms to tell them when they needed to fill a few more trays; most stumbled along, figuring it out as they went.

But now we’re talking about a multitude of machines, each set up like it has more disk than it needs. While those alarms will still go off and tell you that your big old NetApp is under 10% disk free, now you’ve got a lot more boxes chewing at that 10% – because with thin provisioning you could, even should, have multiple apps of varying usage patterns all hitting that same box, and with Virtualization it is so easy to “bring up another instance” that you likely have more virtual machines chewing up disk than you would have physical.

Thus does IT Armageddon start, with a warning. And soon, with a loud crashing sound. But it won’t just be one or two apps, it will be all the apps that are thin provisioned to the same hardware. And in a Virtualized environment, if you’re not careful, that can be a lot.
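A back-of-the-envelope sketch shows why the warning and the crash sit so close together. The numbers are invented for illustration: the alarm fires with 1000 GB free (10% of an assumed 10 TB pool), and every machine is assumed to grow by a steady 10 GB a day. The `days_until_full` helper is hypothetical and deliberately naive, since real growth is rarely linear.

```python
def days_until_full(free_gb, daily_growth_gb):
    """Days until the shared pool is exhausted, assuming each machine
    keeps growing at its current daily rate (a simplification)."""
    total_growth = sum(daily_growth_gb)
    if total_growth <= 0:
        return float("inf")  # nothing is growing, so the pool never fills
    return free_gb / total_growth


FREE_AT_ALARM_GB = 1000  # assumption: 10% of a 10 TB pool

# Five physical servers sharing the array.
physical_days = days_until_full(FREE_AT_ALARM_GB, [10] * 5)

# Forty VMs thin provisioned against the same array.
virtual_days = days_until_full(FREE_AT_ALARM_GB, [10] * 40)

print(physical_days, virtual_days)
```

Same alarm, same array, but with eight times as many machines chewing on the remaining space you have an eighth of the time to react.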


KNOW THY ENEMY

Thin provisioning is not the problem here, nor are VMs. Control is the problem. You can keep this nightmare scenario at bay simply by knowing what you have, how it is used, and setting alarms a bit higher for those items that have erratic disk usage patterns, and for all other apps on the same physical disks.
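One way to picture "setting alarms a bit higher" is sketched below. Everything here is an assumption for illustration: the 10% and 25% free-space thresholds, the use of the coefficient of variation as the test for "erratic", the 0.5 cutoff, and the function names. The idea it captures is that one erratic app raises the alarm level for the whole pool, because every app on those disks shares the risk.

```python
from statistics import mean, pstdev

BASE_FREE_ALARM_PCT = 10.0     # assumed default: alarm at 10% free
ERRATIC_FREE_ALARM_PCT = 25.0  # assumed earlier alarm for risky pools
ERRATIC_CV = 0.5               # assumed cutoff for "erratic" growth


def is_erratic(daily_growth_gb):
    """Treat an app as erratic when its daily growth varies widely
    relative to its average (coefficient of variation)."""
    if not daily_growth_gb:
        return False
    avg = mean(daily_growth_gb)
    if avg <= 0:
        return False
    return pstdev(daily_growth_gb) / avg > ERRATIC_CV


def pool_alarm_threshold(apps):
    """Free-space alarm level for a pool, given a mapping of
    app name -> recent daily growth samples in GB."""
    if any(is_erratic(samples) for samples in apps.values()):
        return ERRATIC_FREE_ALARM_PCT
    return BASE_FREE_ALARM_PCT


steady_pool = {"intranet": [10, 11, 9, 10], "wiki": [2, 2, 3, 2]}
mixed_pool = {"intranet": [10, 11, 9, 10], "log-collector": [1, 1, 40, 2]}

steady_threshold = pool_alarm_threshold(steady_pool)
mixed_threshold = pool_alarm_threshold(mixed_pool)
```

The steady pool keeps the default alarm, while the pool that hosts the bursty log collector gets the earlier one, which also covers the well-behaved intranet app sitting on the same disks.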

In a virtualized world (be it disk virtualized through thin provisioning or applications virtualized in a VM system) you need to know more, not less, about your environment. Keep on top of it: monitoring should move up the stack, and someone should be responsible for reporting the state of the data center upward. In this case that means the state of data center storage, but it feeds into the state of the data center overall.

In the mad rush to virtualization, just make certain you don’t leave the core function of IT behind – to make sure that systems are running smoothly, with enough resources to do their job.



 



Feedback

5/5/2010 9:07 PM
Please make your article more interesting to read.
attention department
5/5/2010 10:28 PM
Uhhmmm... Okay? You get the point that the comment isn't very enlightening, right? I approved it because you're certainly welcome to your opinion, but I'm suspicious of anyone who leaves no email address and the comment is so vague.

Don.
dmacvittie
9/30/2010 3:19 PM
lol
what a comment!
Article is fine, gets the point across. Tho I am still surprised by people not knowing these things. People who manage VMs should be the people who manage systems, meaning have an NMS with low disk space alerts, do regular maintenance, plan, do maintenance, work.
Matt
