Surprised? I was, but I shouldn’t have been.

While working on other topics I ran across an interesting slide in a presentation given by Microsoft at TechEd Europe 2009 on virtualization and Exchange. Specifically, the presenter called out the average 12% overhead incurred from the hypervisor on systems in internal testing. Intuitively it seems obvious that a hypervisor will incur overhead; it is, after all, an application that is executing and thus requires CPU, I/O, and RAM to perform its tasks. That led me to wonder if there was more data on the overhead from other virtualization vendors.

I ended up reading an enlightening white paper from VMware on consolidating web applications through virtualization. It observes that configurations running multiple virtual machines actually outperformed, in terms of both capacity and performance, a single native server configured with a similar number of CPUs. Note that this is specifically for web applications, though I suspect that any TCP-heavy application would likely exhibit similar performance characteristics.

Although virtualization overhead varies depending on the workload, the observed 16 percent performance degradation is an expected result when running the highly I/O‐intensive SPECweb2005 workload. But when we added the second processor, the performance difference between the two‐CPU native configuration and the virtual configuration that consisted of two virtual machines running in parallel quickly diminished to 9 percent. As we further increased the number of processors, the configuration using multiple virtual machines did not exhibit the scalability bottlenecks observed on the single native node, and the cumulative performance of the configuration with multiple virtual machines well exceeded the performance of a single native node.

-- “Consolidating Web Applications Using VMware Infrastructure” [PDF, VMware]

We know there’s overhead associated with the hypervisor. Fact. But what’s interesting here is that the overhead turns out to be irrelevant – at least in the case of web applications. What’s important is the initial degradation of performance and its subsequent improvement as additional virtual instances are added. We need to understand why that’s the case, because it has – or should have – an impact on our overall architectural strategy.