Lori, the Toddler, and I drove down to my mother’s house in Cincinnati (about 9 hours away) for the Fourth of July weekend. Our youngest daughter drove her car with her sister, the sister’s fiancé, and our granddaughter. We stayed in touch via text message and drove through the night.

What does all of this have to do with networking, you ask? Well, I was driving around Indianapolis, Indiana, at about 1 AM and realized that there were an awful lot of cars on the road for the middle of the night, presumably holiday traffic. Still, things were moving along smoothly, even through the ever-present construction zones.

One of the things that a really smart member of our technical staff and I were discussing the other day was the difference between dedupe on-the-wire and in-situ, and how to illustrate it succinctly for those of you who are being barraged with storage dedupe (which is generally in-situ) and WAN Optimization (which is in-flight). And this is it.

When you go to leave for Holiday (hat tip to our UK friends), you pile everyone into the car and hit the road. Think of the road as your Internet connection and each car as representative of some data: the car travels once down the road, with each of the individuals in the car being instances of the same data. So our car had three instances of MacVittie in it – Lori, The Toddler, and me. But when we got to our destination, the three instances re-appeared as we climbed out of the car. Sure, the three instances were stiff, sleep deprived, and cranky, but rehydration will do that. When our daughters arrived, all four of them (two daughters, a granddaughter, and a fiancé) also climbed out of the vehicle to be separate entities again. Our oldest son had to work, and thus was never sent over the wire/roadway, and didn’t get deduplicated.

So you all pile into one car, one car is sent over the roadway, and you all show up at the end. It’s an oversimplification, and the architect I was talking with is probably gritting his teeth, but here comes the example part. Imagine if each individual going on holiday took a different car. We’d have sent seven cars instead of two. Multiply that by the number of cars full of people on the road. Now imagine trying to drive through that. That’s what in-flight dedupe does for you: fewer cars.
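For those who would rather see the carpool in code, here is a minimal sketch of the in-flight idea. It assumes a box at each end that remembers the chunks it has already seen, with the sender swapping repeats for short references and the receiver rehydrating them. The names (`send`, `receive`, `chunk_id`) and the use of SHA-256 as the chunk identifier are illustration-only assumptions, not how any particular WAN Optimization product does it.

```python
import hashlib


def chunk_id(chunk: bytes) -> str:
    # Hypothetical identifier: a SHA-256 digest stands in for whatever
    # fingerprint a real device would use.
    return hashlib.sha256(chunk).hexdigest()


def send(chunks, sender_seen):
    """Build the list of messages that actually cross the wire."""
    wire = []
    for chunk in chunks:
        cid = chunk_id(chunk)
        if cid in sender_seen:
            # Already sent once: ship only a short reference.
            wire.append(("ref", cid))
        else:
            sender_seen.add(cid)
            wire.append(("raw", chunk))
    return wire


def receive(wire, receiver_cache):
    """Rehydrate: every reference becomes a full chunk again."""
    out = []
    for kind, payload in wire:
        if kind == "raw":
            receiver_cache[chunk_id(payload)] = payload
            out.append(payload)
        else:
            out.append(receiver_cache[payload])
    return out


# Three instances of the same data pile into one car...
passengers = [b"MacVittie", b"MacVittie", b"MacVittie"]
wire = send(passengers, set())
# ...and three originals climb out at the far end.
arrived = receive(wire, {})
raw_sent = sum(1 for kind, _ in wire if kind == "raw")
```

Only one full copy travels (`raw_sent` is 1), yet `arrived` is identical to `passengers`. The data is only in its reduced form while it is on the road.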

Your duplicated data transmission.

(compliments of photocarsonline.com)

In storage dedupe – the primary place where deduplication is bandied about these days – five of us would have been eliminated from the world, leaving only two actual people and references to those two. Never again would the originals be seen (short of some work on your part anyway), just a number and a list of differences between the original and this “copy”. On the road (and in on-the-fly dedupe such as that done with WAN Optimization), we all piled into the car, and we all piled out, completely rehydrated, all originals, with just a faint memory of being crammed together in a car.
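The storage side can be sketched the same way. This is a deliberately simplified, assumed model: a hypothetical `DedupeStore` class keeps one copy of each distinct block and records every file as a list of block ids. It leaves out the "list of differences" part entirely, and no real product's on-disk format is implied.

```python
import hashlib


class DedupeStore:
    """Toy in-situ dedupe: one stored copy per distinct block."""

    def __init__(self):
        self.blocks = {}  # block id -> the single stored copy
        self.files = {}   # file name -> ordered list of block ids

    def write(self, name, blocks):
        ids = []
        for block in blocks:
            bid = hashlib.sha256(block).hexdigest()
            # Keep the block only if it has not been seen before.
            self.blocks.setdefault(bid, block)
            ids.append(bid)
        self.files[name] = ids

    def read(self, name):
        # Reading back depends on the store's own block table.
        return [self.blocks[bid] for bid in self.files[name]]


store = DedupeStore()
store.write("car_one", [b"car-one-data"] * 3)
store.write("car_two", [b"car-two-data"] * 4)

stored_copies = len(store.blocks)
references = sum(len(ids) for ids in store.files.values())
```

Seven blocks went in, two copies are kept, and everything else is a reference. Getting the originals back means asking the store, which is the point about what stays at rest versus what only happens in the car.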

In on-the-fly deduplication, you have a box at either end (two of our LTM + WOM boxes, for example), and only during flight does it matter if one of those boxes disappears. In storage deduplication, you have to keep the vendor that did the dedupe, at least until you’ve moved everything out of the dedupe engine’s workspace. Though there are some implementations that can handle vendor change, I don’t see them in use because they don’t offer as much data for management purposes as proprietary solutions do. In on-the-fly dedupe, you have but to shut down the pipe, change the boxes from vendor A to vendor B, and turn the connection back on. No loss, no worries, because only while in-flight (while in the car) is the data changed from the original.

As to compression, we won’t use the freeway analogy to talk about that ;-).


Your deduplicated pipe.

(compliments of www.interstate-guide.com)
