DevCentral > Weblogs > Nojan Moshiri - Nojan's blog
 Introducing: Long Distance VMotion with VMWare
posted on Tuesday, February 02, 2010 2:46 PM

It seems like I blinked and 2009 went by, but in that time I've been working on so many interesting projects at F5 that I have a backlog of information to share with the community.  The first post this year is about long distance VMotion with VMware's ESX system.  This is a solution that enables the movement of live, running virtual machines from one data center to another.

The main problems in routing VMotion between data centers are latency, bandwidth, client traffic and security.  In BIG-IP 10.1 we have a solution that compresses and encrypts the VMotion traffic and shields the ESX servers from prevailing WAN conditions, to enable long distance motion of running hosts.  Take a look at the following screencast to see how this works:

 

 

In the chart below are some of the typical improvements we see with long distance VMotion with BIG-IP.  When latency goes up, VMotion is often not possible without BIG-IP in place.  For example, with 100 ms of round-trip latency on an OC3, a virtual machine that has one gigabyte of active RAM takes roughly three and a half minutes to migrate across the WAN.  If you were to try the same VMotion without BIG-IP in place, it would take more than 13 minutes and only succeed about half the time.
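To put those figures in perspective, here is a rough back-of-the-envelope calculation in Python. It is only a sketch built on assumptions that are not in the measurements above: it treats the transfer as exactly 1 GiB (ignoring memory pages that VMotion re-sends because they changed during the copy), it takes the OC3 line rate as 155.52 Mbit/s, and it uses 13 minutes as the time without BIG-IP even though the real figure was "more than" that.

```python
# Back-of-the-envelope numbers for the example above (assumptions noted inline).
OC3_MBPS = 155.52              # OC-3 line rate in Mbit/s
RTT_SECONDS = 0.100            # 100 ms round-trip latency
VM_RAM_BYTES = 1 * 1024 ** 3   # assumption: "one gigabyte" means 1 GiB


def effective_mbps(payload_bytes, seconds):
    """Average throughput in Mbit/s if payload_bytes were sent once in `seconds`."""
    return payload_bytes * 8 / seconds / 1e6


with_bigip = effective_mbps(VM_RAM_BYTES, 3.5 * 60)   # about 3.5 minutes
without_bigip = effective_mbps(VM_RAM_BYTES, 13 * 60)  # at least 13 minutes
speedup = (13 * 60) / (3.5 * 60)

# Bandwidth-delay product: bytes that must be "in flight" to keep the link full.
bdp_bytes = OC3_MBPS * 1e6 * RTT_SECONDS / 8

print(f"with BIG-IP:    ~{with_bigip:.1f} Mbit/s average")
print(f"without BIG-IP: ~{without_bigip:.1f} Mbit/s average (upper bound)")
print(f"speedup:        ~{speedup:.1f}x")
print(f"data in flight needed to fill an OC3 at 100 ms: ~{bdp_bytes / 1e6:.2f} MB")
```

The last number is the one that explains why latency matters so much: unless nearly 2 MB is kept in flight at all times, the link sits partly idle no matter how much bandwidth is available.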

I'm excited about the types of architectures that can be enabled with this kind of solution in place.  F5 is laying the groundwork to make some exciting infrastructures possible.

 

Have a look at the F5 deployment guide, which describes how to set this solution up and how to architect new solutions across your data centers: http://www.f5.com/pdf/deployment-guides/vmware-vmotion-dg.pdf

 



 
      

Feedback


2/9/2010 12:08 PM
I was wondering if iSessions and WOM would help with just basic management of VMware hosts over a WAN. For example, right now if I try to have a remote ESXi host connect to vCenter from ~75-150 ms away, it frequently times out and disconnects. Also, managing the host directly with the VMware client is of course quite sluggish.

I don't have a need for vMotion over a WAN but it would be real handy to do vCenter over a WAN with good performance.
nate

2/9/2010 12:21 PM
Nate, that's an excellent question. With that much latency, TCP optimization would definitely be useful, as would iSessions. However, your environment would have to be one that lends itself to this setup.

Because the administration communication is encrypted via SSL, compression and de-duplication would be of little use without terminating the SSL first. However, if you have a situation where a symmetric deployment of WOM could be set up, there may be some benefit there.
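A quick way to see why compression does little for encrypted traffic is the small Python sketch below. It is an illustration only: the XML-like payload is made up, and random bytes from `os.urandom` stand in for SSL ciphertext (well-encrypted data looks statistically random). It uses the standard `zlib` module, not anything BIG-IP specific.

```python
import os
import zlib

# Hypothetical stand-in for management traffic: repetitive, text-like data.
plaintext = b"<host name='esx01' state='connected' cpu='12' mem='48'/>\n" * 2000

# Stand-in for SSL ciphertext: random bytes of the same length.
ciphertext_like = os.urandom(len(plaintext))


def ratio(data):
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data)) / len(data)


plain_ratio = ratio(plaintext)
cipher_ratio = ratio(ciphertext_like)

print(f"plaintext compresses to {plain_ratio:.1%} of its size")
print(f"ciphertext-like data compresses to {cipher_ratio:.1%} of its size")
```

The repetitive plaintext shrinks to a small fraction of its size, while the random data does not shrink at all, which is why the SSL has to be terminated before the WAN optimization can help.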

Specifically, you would need a BIG-IP to terminate your connection locally (before it enters the WAN). The SSL could be terminated there, and the connection could then be optimized over the WAN to the BIG-IP closest to the ESXi host.

This would work if the ESX servers being managed are in, let's say, San Jose and you are in the office in New York. Our Edge Gateway product would be ideally suited for such an application. You can read more about Edge Gateway here: www.f5.com/.../edge-gateway.html

Anyway, great question. I'll play with this in the lab when I get a chance, under the network conditions you're describing, and let you know what kind of results I see.
nojan

2/9/2010 2:12 PM
Hmm, I suppose the issue then would be to somehow get VMware to talk non-SSL to the LTM. I wonder how that might be configured.

We do have LTMs at all of our sites; soon all will be 3900s running 10.1 (so far just one site is running 10.1). I have been interested in WOM for specifically this purpose for a while now. I look forward to hearing any results you might get, or any tips on configuration on our end for manipulating how vCenter and ESXi talk to each other, since in the coming couple of months we'll have more sites running 10.1 code on our LTMs.

I will look into the Edge Gateway product. I had not heard of that before.

thanks!
nate

2/9/2010 3:57 PM
Terminating SSL for vSphere Client/Server should not be an issue. You would just create a virtual server that has the vCenter Server as its pool member. The client would connect to the virtual server via SSL. BIG-IP would then decrypt the traffic, optimize it, and pass it over the WAN through an encrypted iSession tunnel, which would terminate at the other BIG-IP, where the traffic would then be passed to the vCenter Server.

It sounds more complicated than it is; it's actually a fairly standard way of doing things in these situations. I'll definitely post the results here. If you're interested, drop me an email at n.moshiri @ f5 dot com and I'll email you the results directly.
nojan

2/3/2010 8:27 AM
Wow, that looks impressive! What was the spec of your guest VM?
stuart mchugh

2/3/2010 9:12 AM
Good question. The VM in this test was an Arch Linux host with one gig of RAM. It was running a content management system called Drupal (http://www.drupal.org), which uses PHP, Apache and MySQL. All of the components were running on the box in order to really stress both the CPU and the memory.

When virtual machines are idle, the process of VMotion is pretty straightforward but still subject to the effects of loss, latency and poor bandwidth. When the VMs are completely stressed (swap, CPU pegged, etc.), long distance VMotion becomes nearly impossible.

With BIG-IP and iSessions, as I demonstrate in this video, we see an almost 100% success rate over hundreds of VMotion runs.
nojan

2/3/2010 5:06 PM
Very interesting application of existing features to accommodate vMotion. I was not aware that for vMotion the vmkernel IP addresses of the ESX hosts can be on different subnets. In fact, many VMworld events and documents mention that for vMotion to occur, the ESX hosts' vmkernel network should be the same, i.e. the same layer 2/VLAN.

Below are a few questions I have:

1. On the BIG-IP front, how are the session persistence parameters exchanged between the local and remote BIG-IP? Source-IP persistence, or cookies that are learned from the servers? I understand cookies set by BIG-IP can be interpreted by both BIG-IPs, as they can be mapped to the VM.
2. What happens to the TCP data for a given connection to a VM which got moved to a different data center?

Thanks,
Tech Savy Engineer
Tech Savy Engineer

2/4/2010 2:13 PM
Hi Tech Savy, thanks for the questions. At VMworld last year we previewed and announced this feature. We showed a demo of long distance VMotion over layer 7, in a routed mode. There are some documents on vmworld.com that you can review.

Now we've fully documented the solution, which takes advantage of the full-proxy architecture in BIG-IP and leverages our newly released WAN Optimization Module not only to extend VMotion but to secure it as well.

In regards to your specific questions:

1) For client traffic, session persistence is not shared between the BIG-IPs in the two data centers with our current release. Session persistence needs to be handled by applications using cookie persistence.

For example, by including a value that maps to a host inside the application cookie, BIG-IP can easily direct users back to the same host once it's in the other data center.

2) In regards to the TCP connection, it's important to understand that today we do not carry established connections over to the new data center. We buffer any connections and re-transmit them when appropriate, we rely on global traffic management to "flip" data centers for a particular application virtual IP address, and we rely on session persistence within the application.
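The idea in points 1 and 2 can be sketched in a few lines of Python. This is a conceptual illustration only, not BIG-IP or GTM configuration: the cookie name `apphost`, the host names, the data center names and the `route` and `flip` helpers are all hypothetical. The application puts a host identifier in its cookie, and a location table that is updated after the migration decides which data center serves that host.

```python
from http.cookies import SimpleCookie

# Hypothetical table of where each application host currently runs.
HOST_LOCATION = {"app01": "dc-east", "app02": "dc-west"}
COOKIE_NAME = "apphost"  # hypothetical cookie set by the application


def route(cookie_header, default_dc="dc-east"):
    """Return (host, data_center) for a request's Cookie header."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get(COOKIE_NAME)
    if morsel is None or morsel.value not in HOST_LOCATION:
        return None, default_dc  # new user: no persistence yet
    return morsel.value, HOST_LOCATION[morsel.value]


def flip(host, new_dc):
    """Record that a host has been migrated to another data center."""
    HOST_LOCATION[host] = new_dc


before = route("sessionid=abc; apphost=app02")
flip("app02", "dc-east")  # app02 has been moved with VMotion
after = route("sessionid=abc; apphost=app02")
print(before, after)
```

Because the cookie names the host rather than the data center, returning users land on the same host after the move, even though their original TCP connections are not carried across.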

One of the use cases for this solution is a J2EE application server. With J2EE, memory has been carved out for individual user sessions, JSPs have been compiled, and the application is in a steady running state. Migrating such servers without an outage brings quite a few gains in startup time, user experience, etc. Only the connection between the web server and the application server would need to be re-established once such a server is moved to the new data center.

Anyway, thanks for your question.


nojan