Enhanced vMotion Compatibility

Enhanced vMotion Compatibility (EVC) has been available for a while now, but adoption seems to be slow. Recently VMguru.nl featured an article, “Challenge: vCenter, EVC and dvSwitches”, which illustrates another case where the customer did not enable EVC when creating the cluster. There seems to be a lot of misunderstanding about EVC and the impact it has on the cluster when enabled.

What is EVC?
VMware Enhanced VMotion Compatibility (EVC) facilitates VMotion between different CPU generations through use of Intel Flex Migration and AMD-V Extended Migration technologies. When enabled for a cluster, EVC ensures that all CPUs within the cluster are VMotion compatible.

What is the benefit of EVC?
Because EVC allows you to migrate virtual machines between different generations of CPUs, with EVC you can mix older and newer server generations in the same cluster and be able to migrate virtual machines with VMotion between these hosts. This makes adding new hardware into your existing infrastructure easier and helps extend the value of your existing hosts.

EVC forces newer processors to behave like old processors

Well, this is not entirely true. EVC creates a baseline so that all the hosts in the cluster advertise the same feature set. The EVC baseline does not disable features, but indicates to the virtual machine that a specific feature is not available.

Now it is crucial to understand that EVC only focuses on CPU features, such as SSE or AMD 3DNow! instructions, and not on CPU speed or cache levels. Hardware virtualization optimization features such as Intel VT FlexMigration or AMD-V Extended Migration, and Memory Management Unit virtualization features such as Intel EPT or AMD RVI, will still be available to the VMkernel even if EVC is enabled. As mentioned before, EVC only focuses on the availability of features and instructions of the existing CPUs in the cluster, for example SIMD instructions such as the SSE instruction set.

Let’s take a closer look. Selecting an EVC baseline applies the baseline feature set of the selected CPU generation and exposes those specific features to the virtual machines. If an ESX host joins the cluster, only those CPU instructions that are new and unique to that specific CPU generation are hidden from the virtual machines. For example, if the cluster is configured with an Intel Xeon Core i7 baseline, it will make the standard Intel Xeon Core 2 features plus the SSE4.1, SSE4.2, POPCNT and RDTSCP features available to all the virtual machines. When an ESX host with a Westmere (32nm) CPU joins the cluster, the additional CPU instruction sets like AES (AES-NI) and PCLMULQDQ are suppressed.
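
The masking idea can be sketched with plain sets. This is only an illustration, not how ESX implements EVC: the feature names are the ones mentioned above, the Core 2 set is a simplified stand-in rather than a complete CPUID list, and the function names are made up for this example. The join check assumes a host must offer at least every feature in the baseline.

```python
# Illustrative sketch only: simplified feature sets, hypothetical function names.
# "Core 2" here is a stand-in set, not a complete CPUID feature list.
CORE2_FEATURES = frozenset({"SSE", "SSE2", "SSE3", "SSSE3"})

# Core i7 baseline as described in the article: Core 2 plus four additions.
CORE_I7_BASELINE = CORE2_FEATURES | {"SSE4.1", "SSE4.2", "POPCNT", "RDTSCP"}

# A Westmere host physically supports everything in the baseline plus more.
WESTMERE_HOST = CORE_I7_BASELINE | {"AESNI", "PCLMULQDQ"}


def can_join(host_features, baseline):
    """Assumed rule: the host must offer every feature in the baseline."""
    return baseline <= host_features


def visible_to_vms(host_features, baseline):
    """Features advertised to virtual machines: exactly the baseline."""
    if not can_join(host_features, baseline):
        raise ValueError("host lacks features required by the baseline")
    return frozenset(baseline)


def hidden_from_vms(host_features, baseline):
    """Features the host has but the baseline hides from virtual machines."""
    return frozenset(host_features - baseline)


if __name__ == "__main__":
    print(sorted(hidden_from_vms(WESTMERE_HOST, CORE_I7_BASELINE)))
    print(can_join(CORE2_FEATURES, CORE_I7_BASELINE))
```

Note that nothing is removed from the host's own set: the extra Westmere instructions still exist on the hardware, they are just not advertised to the virtual machines.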

As mentioned in the various VMware KB articles, it is possible, but unlikely, that an application running in a virtual machine would benefit from these features, and that the application performance would be lower as the result of using an EVC mode that does not include the features.

DRS-FT integration and building block approach
When EVC is enabled in vSphere 4.1, DRS is able to select an appropriate ESX host for placing FT-enabled virtual machines and is able to load-balance these virtual machines. The result is a more load-balanced cluster, which likely has a positive effect on the performance of the virtual machines. More info can be found in the article “DRS-FT integration”.

Equally interesting is the building block approach. By enabling EVC, architects can use a predefined set of hosts and resources and gradually expand the ESX clusters. Not every company buys compute power by the truckload; with EVC enabled, clusters can grow by adding ESX hosts with new(er) processor versions.
One potential caveat is mixing hardware of different major generations in the same cluster. As Irfan Ahmad so eloquently put it, “not all MHz are created equal”, meaning that newer major generations offer better performance per CPU clock cycle. This can create a situation where a virtual machine is getting 500 MHz on one ESX host and is then migrated to another ESX host where that 500 MHz is equivalent to only 300 MHz of the original host in terms of application-visible performance. This increases the complexity of troubleshooting performance problems.
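
The arithmetic behind the 500 MHz versus 300 MHz example can be written down as a small sketch. The factor of 0.6 is a hypothetical work-per-clock ratio chosen only to reproduce the numbers above; real ratios depend on the workload and the CPU generations involved, and the function name is made up for this illustration.

```python
def equivalent_mhz(mhz_on_destination, work_per_cycle_ratio):
    """Express MHz received on the destination host in 'source host MHz'.

    work_per_cycle_ratio is a hypothetical factor: the work the destination
    CPU does per clock cycle divided by the work the source CPU does.
    """
    if work_per_cycle_ratio <= 0:
        raise ValueError("work_per_cycle_ratio must be positive")
    return mhz_on_destination * work_per_cycle_ratio


if __name__ == "__main__":
    # Assumed: the older destination host does 60% of the work per cycle.
    print(equivalent_mhz(500, 0.6))
```

Both hosts report the same 500 MHz for the virtual machine, which is exactly why the difference is hard to spot when troubleshooting.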

Recommendations?
Performance loss is unlikely when enabling EVC. With EVC enabled, DRS-FT integration is supported and organizations are more flexible in expanding clusters over longer periods of time; I therefore recommend enabling EVC on clusters. But will it be a panacea for the stream of new major CPU generation releases? Unfortunately not! One possibility is to treat the newest hardware (major releases) as a higher service level than the older hardware and, because of this, create new clusters for it.

4 Comments

  1. Hey Frank,

    Recently our Cisco UCS ESX cluster had an additional host added to it that was from the Core i7 Xeon 5650 family. The other two hosts in the cluster were Xeon 5550s, so they had slightly different CPUs. I had to use EVC for the first time, but it worked like a charm and enabled my cluster to work normally while we waited for different hardware. Awesome, awesome feature.

  2. One scenario where EVC came in handy was when I performed a hardware refresh, replacing servers with Core2 processors with servers using procs in the Nehalem family. I was able to set EVC to the Core2 level, bring in the new hardware and vMotion all VMs to the new hardware. I then pulled out the old hardware, which was later used for DR equipment.

    All of this was done with zero downtime during normal business hours. Once all new servers were identical in the cluster, the EVC level was able to be increased to the i7 level. The VMs just needed to be rebooted during the next maintenance schedule to take advantage of the new instruction set.

  3. Hi Frank,

    Great summary! This concept keeps confusing people and I guess that’s because not many are familiar (nor need to be!) with these processor features.
    Adding a processor with more features to a cluster is relatively easy (just hide those new features as you indicated). What about adding a processor with fewer features though? Hiding existing features on processors that are already in use doesn’t look like a feasible solution. Would that imply that one cannot add a processor with fewer features to a cluster?

  4. We upgraded our cluster from Xeon 5570s to 5670s when a major new application load shifted the balance between memory and CPU speed and we did not want the significant expense of more hosts, more licenses, and more power usage. So we upgraded one host, placed it into a new EVC cluster and manually vMotioned some VMs to it. We then upgraded all the other cluster members, moving them and their share of the workload into the new EVC cluster a piece at a time. All during production hours without any user notice except for the improved performance =)

    Bert, no: to join the cluster, the host must have at least the features of the current EVC mode. What you could do is create a new cluster with the EVC mode set to that of your new host and move the existing cluster members over, but unlike my move that would require downtime, as the guests would suddenly not have access to features that they might be using.

© 2017 frankdenneman.nl