Tony Bourke has a fun little post on the "Gotchas of Load Balancing" - gotchas that really end up being your fault. Sorta.

All very true and common mistakes that many people have made when configuring load balancers. But that got me thinking - and laughing - about a couple of "gotchas" that were my own fault back in the Network Computing lab.

When Bits Don't Match

You cannot plug a 10/100 port into a GigE-only port and expect things to work. Really. One of the core routers in our lab had a GigE-only blade, and I always forgot about it - probably because the RJ45 ports on every other blade in that router were tri-speed (10/100/1000).

This one was such a problem because you tend to blame the configuration of a product before you blame the network. After all, the network was working just fine for everything else - the only difference was the new product, right?
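To make the mismatch concrete, here is a minimal sketch of why the link never comes up. It assumes two ports can only link at a speed both of them support; the set names and the link_speed helper are made up for illustration and are not taken from any router's software.

```python
# Speeds are in Mbps. These sets are illustrative assumptions, not vendor data.
FAST_ETHERNET = {10, 100}        # a 10/100 port
GIGE_ONLY = {1000}               # a port on the GigE-only blade
TRI_SPEED = {10, 100, 1000}      # a 10/100/1000 port


def link_speed(port_a, port_b):
    """Return the fastest speed both ports support, or None if they share none."""
    shared = set(port_a) & set(port_b)
    return max(shared) if shared else None


if __name__ == "__main__":
    print("10/100 into tri-speed:", link_speed(FAST_ETHERNET, TRI_SPEED))
    print("10/100 into GigE-only:", link_speed(FAST_ETHERNET, GIGE_ONLY))
```

The second case returns None: there is no speed the two ports have in common, so no amount of reconfiguring the product under test will bring the link up.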

When Metrics Don't Measure Up

We had two sets of core routers configured to simulate three complete networks - a T1, 100Mbps, and GigE. We used them for a variety of purposes, and one of those was to simulate failure. The GigE connection was the primary, 100Mbps the secondary, and T1 the tertiary. If we wanted to test products over a T1 we just pulled the other two connections. Worked like a charm. We used routing metrics of 10, 5, and 1 for the T1, 100Mbps, and GigE links respectively, so the lowest metric still standing always won. That kept things flowing properly and made it easy to fail over when we pulled a plug.

Well, we had reconfigured things at some point and somehow the metrics on one "side" of the network got mixed up. We had sporadic network problems - lost packets, high latency - for a while and just could not figure out why. Finally Don found the problem while testing some WAN optimization products: he had dropped the network down to simulate a T1, but traffic just wasn't flowing right because the metrics on the two sides no longer matched.
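A rough sketch of the metric logic, for anyone who wants to see the failure on paper. It assumes the usual rule that the lowest metric among the links still up wins, and it assumes GigE=1, 100Mbps=5, T1=10 on the healthy side. The exact values on the mixed-up side were never recorded, so SIDE_B below is purely hypothetical, as are all of the names.

```python
# Metric tables per side of the network. Lower metric = preferred (assumed).
SIDE_A = {"GigE": 1, "100Mbps": 5, "T1": 10}   # intended configuration
SIDE_B = {"GigE": 10, "100Mbps": 5, "T1": 1}   # hypothetical mixed-up side


def best_link(metrics, links_up):
    """Pick the connected link with the lowest metric, or None if all are down."""
    candidates = [name for name in metrics if name in links_up]
    return min(candidates, key=lambda name: metrics[name]) if candidates else None


def paths(links_up):
    """Return the link each side chooses for its outbound traffic."""
    return best_link(SIDE_A, links_up), best_link(SIDE_B, links_up)


if __name__ == "__main__":
    print("all links up:", paths({"GigE", "100Mbps", "T1"}))
    print("GigE pulled: ", paths({"100Mbps", "T1"}))
    print("T1 only:     ", paths({"T1"}))
```

With everything plugged in, the two sides pick different links in this sketch, so traffic goes out over GigE and comes back over the T1. That kind of asymmetry would at least be consistent with the lost packets and high latency we saw.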

The Gateway to ... Nowhere?

When you're setting up more complex networks involving load balancers you may need to point the default gateway on the server to the load balancer. If you forget to do this you will undoubtedly become frustrated when your browser appears to be connecting to the web server but you never seem to get a response.

This was one of those beginner's mistakes that really caused a great deal of frustration in the very early days of learning about load balancers because everything appeared to be working correctly - the load balancer saw the request and sent it to the web server - but all the responses ended up in some network black hole.
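Here is a small simulation of that black hole, as a sketch only. It assumes the load balancer rewrites the destination address (virtual IP to real server) but leaves the client's source address alone, so the reply has to pass back through the load balancer to be translated. All addresses come from the documentation ranges and every name is invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


# Hypothetical addresses for illustration only.
CLIENT = "198.51.100.7"
VIP = "203.0.113.10"          # virtual IP the browser connects to
SERVER = "10.0.0.20"          # real web server behind the load balancer
LOAD_BALANCER = "10.0.0.1"    # load balancer's address on the server network
OTHER_ROUTER = "10.0.0.254"   # some other router on the server network


def load_balance(request):
    """Destination NAT: send the request to the real server, keep the client source."""
    return Packet(src=request.src, dst=SERVER)


def server_reply(request):
    """The server answers whoever the request appeared to come from."""
    return Packet(src=request.dst, dst=request.src)


def return_path(reply, default_gateway):
    """Only the load balancer translates the server address back to the VIP."""
    if default_gateway == LOAD_BALANCER:
        return Packet(src=VIP, dst=reply.dst)
    return reply  # any other router forwards the reply untranslated


def client_accepts(reply):
    """The browser only accepts replies from the address it connected to."""
    return reply.src == VIP and reply.dst == CLIENT


def round_trip(default_gateway):
    request = Packet(src=CLIENT, dst=VIP)
    reply = server_reply(load_balance(request))
    return client_accepts(return_path(reply, default_gateway))


if __name__ == "__main__":
    print("gateway = load balancer:", round_trip(LOAD_BALANCER))
    print("gateway = other router: ", round_trip(OTHER_ROUTER))
```

In the second case the server really does answer, but the reply shows up from an address the browser never talked to, so it gets thrown away. From the browser's side that looks exactly like no response at all.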

And those are just the networking snafus. After years of testing myriad enterprise applications, believe me, there are many more where these came from. Everyone's got at least one story, I'm sure, of encountering problems and mistakenly blaming them on a device - likely with very colorful language - before discovering that the problem was, in fact, of their own making. Got a story to share? Leave a comment!

Imbibing: Mountain Dew
