Yeah, I'm an 80s child, slamming guitar riffs, long hair, and the whole shot, so I blatantly stole that line from a song I never really liked... But it works for this topic.

Because it really would be cool. What if you had guaranteed delivery, 100% uptime, and the ability to sleep at night?

Not all that long ago I was building out a brand-new data center. It was one of those projects where some genius on the business side decides that IT is just overhead and they can do it better. I was brought in to set up an actual data center without any deep ties to IT. By the time I left, that theory was gone, and so was the business owner who came up with it, but it did give me some interesting experiences.

We never dreamed of 100% uptime. In fact, with all of the dependencies and variables, we doubted our ability to achieve and maintain even five nines. We had everything a modern data center has (the project had a huge budget) - backup power, redundant servers, multiple paths in and out of the building... But building an aggregation point for collecting data from six or more full-blown systems with hundreds of thousands of endpoints is a sticky business, even when you don't have some of the (real) overhead that comes with deep ties into IT.

The problem is largely one of complexity. You bring in several disparate systems, try to make them all play nice together, and any attempt at controlling how many different platforms you have to support goes out the window. We had systems running on Solaris, AIX, Windows, and Linux, using four different database vendors and apps written in four different languages. It was real-time and near-real-time data collection, so guaranteed uptime was high on our priority list - but how do you guarantee uptime when none of these systems was developed in-house?

In the end, we put a piece of software we developed between the users and the disparate systems and guaranteed its uptime, so data was always accessible - but we still didn't have guaranteed data collection. For the subsystems that allowed buffering of real-time data, we had a limited window should things go wrong, but the window differed between systems, and sometimes even between endpoints.
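To make the idea concrete, here's a minimal sketch of the kind of middle layer and buffering window I'm describing. None of this is the code we actually ran - names like MAX_WINDOW and backend_write are purely illustrative - but it shows why an outage only costs you data once the buffer fills up, and why different window sizes per subsystem made the guarantee uneven.

```python
import time
from collections import deque

# Illustrative only: readings from endpoints land in a bounded buffer, and a
# backend outage loses data only once that buffer (the "window") overflows.
MAX_WINDOW = 10_000   # per-subsystem buffer depth; in practice this varied by system


class CollectorShim:
    def __init__(self, backend_write):
        self.backend_write = backend_write          # callable that pushes to the real subsystem
        self.buffer = deque(maxlen=MAX_WINDOW)      # oldest readings drop once the window is full

    def ingest(self, reading):
        """Accept a reading from an endpoint without ever blocking the endpoint."""
        self.buffer.append((time.time(), reading))
        self.flush()

    def flush(self):
        """Drain the buffer to the backend; stop (and keep buffering) if it's down."""
        while self.buffer:
            ts, reading = self.buffer[0]
            try:
                self.backend_write(ts, reading)
            except ConnectionError:
                return                              # backend is down; data waits in the window
            self.buffer.popleft()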

Zip ahead to today. If I were architecting this setup now (I actually didn't spec the systems, I just had to deal with them when they arrived), I would throw in GTM and move the redundant backup systems to a remote site. I'd hit up Oracle about Grid to make certain the database didn't just "go down", I'd push the software vendors for active-active failover, and I'd use LTM to balance the load... and F5 Acopia ARX to present the same file structures to all (non-database) servers. Using an array for the databases would guarantee that they could be brought back online quickly, too. Though today I'd still do Fibre Channel, the day is near when even that could go iSCSI - but I'd want 10 Gig iSCSI before I put databases on it.
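For readers who haven't lived with GTM and LTM: this isn't F5 configuration, just a toy sketch of the decision the pair embodies - GTM steers clients to whichever site is alive, LTM spreads work across the healthy members inside that site. All the hostnames below are made up.

```python
import random
import urllib.request

# Hypothetical pools: a primary data center and a remote DR site.
SITES = {
    "primary": ["http://app1.dc1.example.com/health", "http://app2.dc1.example.com/health"],
    "dr-site": ["http://app1.dc2.example.com/health", "http://app2.dc2.example.com/health"],
}


def healthy(url, timeout=2):
    """Basic HTTP health monitor: up means a 200 within the timeout."""
    try:
        return urllib.request.urlopen(url, timeout=timeout).status == 200
    except OSError:
        return False


def pick_member():
    """Prefer the primary site, fail over to DR, balance across healthy members."""
    for site, members in SITES.items():        # GTM-style: first site with anything healthy wins
        up = [m for m in members if healthy(m)]
        if up:
            return site, random.choice(up)     # LTM-style: spread load across healthy members
    raise RuntimeError("no healthy members anywhere")
```

The real products do far more (persistence, weighted ratios, topology-based routing), but the core promise is exactly this: as long as one member in one site answers its health check, users get an answer.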

We're not there yet in most organizations: one outage can still pull us below five nines, and a natural disaster or fire can shut us down either partially or completely. But it's coming. You can feel it.

Which means one day you will be able to sleep at night. Now if only it were automated so that it could make incremental adjustments to how your data center runs while you sleep, and in the morning you could go in, look over the changes, and actually fix the problem long-term. But that's coming too - you can see the enabling technology in nearly every area of IT.

To answer the "yes, yes, one day, but when?" question, I'll steal a line from a musician I actually do like: "Don't ask me, I don't know!" But it really is sooner rather than later - you can feel it.

There will still be freak accidents. Our AIX box's RAID array was populated with DeathStar disks, and we lost a lot of data when three of them went out in a single drawer before the rep could arrive to replace the first. But those types of things are flukes, not regular problems, and with the right technology (available today, from the likes of EqualLogic/Dell), you can minimize the damage that kind of issue can cause. For that matter, ARX does something similar, distributing data across multiple arrays so that you have what is, in effect, a RAID of RAIDs. I think if you chose the right RAID level, you could create an infinite loop there... ;-).

That's kind of amusing - when EqualLogic first started up, I thought they were great. Now, years later, I'm no longer writing about storage and they've been bought out, but I still think they've got the right idea if you like iSCSI. But I guess that's a different blog post.

 

Don.

/imbibing: Mountain Dew

/reading: WWII Tank Encyclopaedia in Color: 1939-1945, Restayn