In part 1 of this series on the impact of oversized virtual machines, the NUMA architecture, memory overhead reservation and share levels were reviewed. Part 2 zooms in on the impact of memory overhead reservation and share levels on HA and DRS.
Impact of memory overhead reservation on HA Slot size
The VMware High Availability admission control policy "Host Failures Cluster Tolerates" calculates a slot size to determine the maximum number of virtual machines that can be active in the cluster without violating failover capacity. This admission control policy determines the HA cluster slot size from the largest CPU reservation and the largest memory reservation plus its memory overhead reservation. Even if the virtual machine with the largest reservation has an appropriately sized reservation, when that virtual machine is oversized its memory overhead reservation can still have a substantial impact on the slot size.
The HA admission control policy "Percentage of Cluster Resources Reserved" calculates the memory component of its mechanism by summing the reservation plus the memory overhead of each virtual machine. This allows the memory overhead reservation to have an even bigger impact on admission control than it has in the calculation done by the "Host Failures Cluster Tolerates" policy.
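To make the difference between the two policies concrete, here is a minimal Python sketch of both calculations. The virtual machine names, reservations and overhead values are illustrative assumptions, not output from vCenter.

```python
# Illustrative sketch of how memory overhead reservation feeds into both
# HA admission control policies (all numbers are made up for this example).

vms = [
    # (name, memory reservation in MB, static memory overhead in MB)
    ("web01",  0,   242.51),   # 2 vCPU, 4 GB
    ("db01",   512, 413.91),   # 4 vCPU, 8 GB with a 512 MB reservation
    ("big-vm", 0,   1028.07),  # 8 vCPU, 16 GB, oversized, no reservation
]

# "Host Failures Cluster Tolerates": the memory slot size is the largest
# (reservation + memory overhead) of any powered-on virtual machine.
memory_slot_size = max(res + overhead for _, res, overhead in vms)
print(f"Memory slot size: {memory_slot_size:.2f} MB")  # set by big-vm's overhead alone

# "Percentage of Cluster Resources Reserved": sums reservation + overhead
# of every virtual machine, so every oversized VM inflates the total.
total_reserved = sum(res + overhead for _, res, overhead in vms)
print(f"Reserved memory counted by admission control: {total_reserved:.2f} MB")
```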
DRS initial placement
DRS uses a worst-case scenario during initial placement. Because DRS cannot determine the resource demand of a virtual machine that is not running, it assumes that both the memory demand and the CPU demand are equal to the configured size. Oversizing a virtual machine therefore decreases the options for finding a suitable host for it. If DRS cannot guarantee that the full 100% of the resources provisioned for this virtual machine can be used, it will vMotion other virtual machines away so that it can power on this single virtual machine. If there are not enough resources available, DRS will not allow the virtual machine to be powered on.
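The sketch below illustrates this worst-case reasoning with made-up host capacity numbers: a powered-off virtual machine is treated as if it demands its full configured size, so fewer hosts qualify as placement candidates.

```python
# Illustrative worst-case initial placement check: the demand of a powered-off
# VM is assumed to equal its configured size (host numbers are assumptions).

configured_cpu_mhz = 4 * 2600   # 4 vCPUs on 2.6 GHz cores
configured_mem_mb = 16384       # 16 GB of configured memory

hosts = [
    # (name, available CPU in MHz, available memory in MB)
    ("esx01", 9000, 12000),
    ("esx02", 12000, 20000),
]

candidates = [
    name for name, cpu_mhz, mem_mb in hosts
    if cpu_mhz >= configured_cpu_mhz and mem_mb >= configured_mem_mb
]
print(candidates or "No host can guarantee the full configured size")
```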
Shares and resource pools
When placing a virtual machine inside a resource pool, its shares are relative to the other virtual machines (and resource pools) inside that pool. Shares are relative to all the other components sharing the same parent; an easier way to put it is to call it a sibling share level. Therefore the numeric share values are not directly comparable across pools, because they are children of different parents.

By default a resource pool is configured with an amount of shares equal to that of a 4 vCPU, 16GB virtual machine. As mentioned in part 1, shares are relative to the configured size of the virtual machine, implicitly stating that size equals priority.
Now let's take another look at the image above. The three virtual machines are reparented to the cluster root, next to resource pool 1 and resource pool 2. Suppose they are all 4 vCPU, 16GB machines; their share values are then interpreted in the context of the root pool and they will receive the same priority as resource pool 1 and resource pool 2. This is not only wrong, but also dangerous in a denial-of-service sense: a virtual machine running at the same level as resource pools can suddenly find itself entitled to nearly all cluster resources.
Because of the default share distribution process we always recommend avoiding placing virtual machines at the same level as resource pools. Unfortunately, a virtual machine might be reparented to the cluster root level when it is manually migrated using the GUI; the current workflow defaults to the cluster root level instead of the virtual machine's current resource pool. Because of this it is recommended to increase the number of shares of the resource pool to reflect its priority level. More info about shares on resource pools can be found in Duncan's post on yellow-bricks.com.
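A minimal sketch of the sibling-share math behind this recommendation; the pool contents and share values below are illustrative assumptions based on the default normal share level.

```python
# Illustrative sibling-share comparison at the cluster root level.
# A resource pool with default "normal" shares receives the same amount as a
# 4 vCPU / 16 GB virtual machine: 4 * 1000 CPU shares and 16384 * 10 memory shares.

root_siblings = {
    "resource-pool-1": 4000,   # pool containing, say, 30 virtual machines
    "resource-pool-2": 4000,   # pool containing, say, 30 virtual machines
    "reparented-vm":   4000,   # a single 4 vCPU / 16 GB virtual machine
}

total_cpu_shares = sum(root_siblings.values())
for name, cpu_shares in root_siblings.items():
    print(f"{name}: {cpu_shares / total_cpu_shares:.0%} of the root CPU shares")
# During contention the single VM is entitled to one third of the cluster,
# the same as an entire pool of virtual machines.
```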
Go to Part 3: Impact of oversized virtual machine.
Impact of oversized virtual machines part 1
Recently we had an internal discussion about the overhead an oversized virtual machine generates on the virtual infrastructure. An oversized virtual machine is a virtual machine that consistently uses less capacity than its configured capacity. Many organizations follow vendor recommendations and/or provision virtual machines sized according to the wishes of the customer, i.e. more resources equals better performance. By oversizing the virtual machine you can introduce the following overhead, or even worse, decrease the performance of the virtual machine or other virtual machines inside the cluster.
Note: This article does not focus on large virtual machines that are correctly configured for their workloads.
Memory overhead
Every virtual machine running on an ESX host consumes some memory overhead in addition to the current usage of its configured memory. This extra space is needed by ESX for internal VMkernel data structures such as the virtual machine frame buffer and the mapping table for memory translation, i.e. mapping virtual machine physical memory to machine memory.
The VMkernel calculates a static overhead for the virtual machine based on the number of vCPUs and the amount of configured memory. Static overhead is the minimum overhead that is required for the virtual machine to start up. DRS and the VMkernel use this metric for admission control and vMotion calculations. If the ESX host is unable to provide the unreserved resources for the memory overhead, the virtual machine will not be powered on. In the case of vMotion, the destination ESX host must be able to back the virtual machine reservation and the static overhead, otherwise the vMotion will fail.
The following table displays common static memory overhead values (in MB) encountered in vSphere 4.1. For example, a 4 vCPU, 8GB virtual machine will be assigned a memory overhead reservation of 413.91 MB, regardless of whether it will use its configured resources or not.
| Configured Memory (MB) | 2 vCPUs | 4 vCPUs | 8 vCPUs |
| 2048 | 198.20 | 280.53 | 484.18 |
| 4096 | 242.51 | 324.99 | 561.52 |
| 8192 | 331.12 | 413.91 | 716.19 |
| 16384 | 508.34 | 591.76 | 1028.07 |
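As a quick reference, the table can be expressed as a small lookup; this is a minimal sketch using only the vSphere 4.1 values listed above.

```python
# Static memory overhead lookup for vSphere 4.1, using the values from the
# table above (configured memory in MB -> overhead in MB per vCPU count).
static_overhead_mb = {
    2048:  {2: 198.20, 4: 280.53, 8: 484.18},
    4096:  {2: 242.51, 4: 324.99, 8: 561.52},
    8192:  {2: 331.12, 4: 413.91, 8: 716.19},
    16384: {2: 508.34, 4: 591.76, 8: 1028.07},
}

def overhead(memory_mb: int, vcpus: int) -> float:
    """Return the static memory overhead reservation in MB."""
    return static_overhead_mb[memory_mb][vcpus]

print(overhead(8192, 4))  # 413.91 MB for a 4 vCPU, 8 GB virtual machine
```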
The VMkernel treats the virtual machine overhead reservation the same as a VM-level memory reservation: it will not reclaim this memory once it has been used. Furthermore, memory overhead reservations are not shared by transparent page sharing.
Shares (size does not translate into priority)
By default each virtual machine is assigned a specific number of shares. The number of shares depends on the share level (low, normal or high), the number of vCPUs and the amount of configured memory.
| Share Level | Low | Normal | High |
| Shares per CPU | 500 | 1000 | 2000 |
| Shares per MB | 5 | 10 | 20 |
I.e. a virtual machine configured with 4 vCPUs and 8GB of memory at the normal share level receives 4000 CPU shares and 81920 memory shares. By tying the number of shares to the amount of configured resources, this "algorithm" indirectly implies that a larger virtual machine should receive a higher priority during resource contention. This is not true, as some business-critical applications run perfectly well on virtual machines configured with low amounts of resources.
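The default share assignment can be written down as a small helper; a minimal sketch based on the share-level table above.

```python
# Default share assignment per share level (values from the table above).
CPU_SHARES_PER_VCPU = {"low": 500, "normal": 1000, "high": 2000}
MEM_SHARES_PER_MB   = {"low": 5,   "normal": 10,   "high": 20}

def default_shares(vcpus: int, memory_mb: int, level: str = "normal"):
    """Return (cpu_shares, memory_shares) for a virtual machine."""
    return (vcpus * CPU_SHARES_PER_VCPU[level],
            memory_mb * MEM_SHARES_PER_MB[level])

print(default_shares(4, 8192))  # (4000, 81920) for a 4 vCPU, 8 GB VM
```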
Oversized VMs on NUMA architecture
The vSphere 4.1 CPU scheduler has been optimized to handle virtual machines that contain more vCPUs than the number of available cores in one physical NUMA node. Such a virtual machine (a wide-VM) will be spread across the minimum number of NUMA nodes, but memory locality will be reduced, as memory is distributed among its home NUMA nodes. This means that a vCPU running on one NUMA node might need to fetch memory from another NUMA node, leading to unnecessary latency and CPU wait states, which can increase %ready time for other virtual machines in highly consolidated environments.
Wide-VM NUMA support is of great use when the virtual machine actually runs a load comparable to its configured size, and it reduces overhead compared to the 3.5/4.0 CPU scheduler, but it is still better to size the virtual machine equal to or smaller than the number of available cores in a NUMA node.
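A simple sizing check along these lines is sketched below; the per-node core and memory figures are assumptions for an example host, not values queried from ESX.

```python
# Quick sizing check (illustrative): does the VM fit within a single NUMA node?
cores_per_numa_node = 6
memory_per_numa_node_mb = 49152   # 48 GB of memory local to each node

def fits_single_numa_node(vcpus: int, memory_mb: int) -> bool:
    """True when both vCPU count and memory fit inside one NUMA node."""
    return vcpus <= cores_per_numa_node and memory_mb <= memory_per_numa_node_mb

print(fits_single_numa_node(4, 16384))   # True: stays local to one node
print(fits_single_numa_node(8, 65536))   # False: scheduled as a wide-VM
```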
More information about CPU scheduling and NUMA architectures can be found here:
http://frankdenneman.nl/2010/09/esx-4-1-numa-scheduling/
Go to Part 2: Impact of oversized virtual machine on HA and DRS
Enhanced vMotion Compatibility
Enhanced vMotion Compatibility (EVC) has been available for a while now, but it seems to be adopted slowly. Recently VMguru.nl featured an article "Challenge: vCenter, EVC and dvSwitches" which illustrates another case where the customer did not enable EVC when creating the cluster. There seems to be a lot of misunderstanding about EVC and the impact it has on the cluster when it is enabled.
What is EVC?
VMware Enhanced VMotion Compatibility (EVC) facilitates VMotion between different CPU generations through use of Intel Flex Migration and AMD-V Extended Migration technologies. When enabled for a cluster, EVC ensures that all CPUs within the cluster are VMotion compatible.
What is the benefit of EVC?
Because EVC allows you to migrate virtual machines between different generations of CPUs, you can mix older and newer server generations in the same cluster and still migrate virtual machines between these hosts with VMotion. This makes it easier to add new hardware to your existing infrastructure and helps extend the value of your existing hosts.
EVC forces newer processors to behave like old processors
Well, this is not entirely true; EVC creates a baseline so that all the hosts in the cluster advertise the same feature set. The EVC baseline does not disable the features, but indicates that a specific feature is not available to the virtual machine.
Now it is crucial to understand that EVC only focuses on CPU features, such as SSE or AMD 3DNow! instructions, and not on CPU speed or cache sizes. Hardware virtualization optimization features such as Intel VT FlexMigration or AMD-V Extended Migration, and Memory Management Unit virtualization such as Intel EPT or AMD RVI, will still be available to the VMkernel even if EVC is enabled. As mentioned before, EVC only focuses on the availability of features and instructions of the existing CPUs in the cluster, for example SIMD instructions such as the SSE instruction set.
Let's take a closer look. When selecting an EVC baseline, vCenter applies the baseline feature set of the selected CPU generation and exposes those specific features. When an ESX host joins the cluster, only those CPU instructions that are new and unique to that specific CPU generation are hidden from the virtual machines. For example, if the cluster is configured with an Intel Xeon Core i7 baseline, the standard Intel Xeon Core 2 features plus SSE4.1, SSE4.2, POPCNT and RDTSCP are made available to all the virtual machines. When an ESX host with a Westmere (32nm) CPU joins the cluster, its additional CPU instruction sets such as AES/AESNI and PCLMULQDQ are suppressed.
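Conceptually the baseline works like a set intersection, as in the sketch below; the feature names are only a rough illustration of the masking, not the complete CPUID feature lists.

```python
# Conceptual sketch of EVC baseline masking (feature names are illustrative):
# hosts may offer more than the baseline, but VMs only see the baseline set.
baseline_core_i7 = {"SSE3", "SSSE3", "SSE4.1", "SSE4.2", "POPCNT", "RDTSCP"}
westmere_host    = baseline_core_i7 | {"AES", "AESNI", "PCLMULQDQ"}

exposed_to_vms  = westmere_host & baseline_core_i7
hidden_from_vms = westmere_host - baseline_core_i7
print("Hidden from VMs:", sorted(hidden_from_vms))  # ['AES', 'AESNI', 'PCLMULQDQ']
```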

As mentioned in the various VMware KB articles, it is possible, but unlikely, that an application running in a virtual machine would benefit from these features, and that the application's performance would be lower as a result of using an EVC mode that does not include them.
DRS-FT integration and building block approach
When EVC is enabled in vSphere 4.1, DRS is able to select an appropriate ESX host for placing FT-enabled virtual machines and is able to load-balance these virtual machines, resulting in a better balanced cluster, which likely has a positive effect on the performance of the virtual machines. More info can be found in the article "DRS-FT integration".
Equally interesting is the building-block approach: by enabling EVC, architects can use a predefined set of hosts and resources and gradually expand the ESX clusters. Not every company buys compute power by the truckload; with EVC enabled, clusters can grow by adding ESX hosts with newer processor versions.
One potential caveat is mixing hardware of different major generations in the same cluster. As Irfan Ahmad so eloquently put it, "not all MHz are created equal": newer major generations offer better performance per CPU clock cycle, creating a situation where a virtual machine getting 500 MHz on one ESX host is migrated to another ESX host where that 500 MHz is equivalent to 300 MHz on the original host in terms of application-visible performance. This increases the complexity of troubleshooting performance problems.
Recommendations?
Enabling EVC is unlikely to cause any performance loss. With EVC enabled, DRS-FT integration is supported and organizations are more flexible in expanding clusters over longer periods of time, so we recommend enabling EVC on clusters. But will it be a panacea for the stream of new major CPU generation releases? Unfortunately not! One possibility is to treat the newest hardware (major releases) as a higher tier of service than the older hardware and create new clusters for it.
European distributor for HA and DRS book
As of today, our book "vSphere 4.1 HA and DRS Technical Deepdive" can be ordered via ComputerCollectief. ComputerCollectief is a Dutch computer book and software reseller and ships to most European countries. By using ComputerCollectief, we hope to avoid the long shipping times and accompanying costs.
Go check it out. http://www.comcol.nl/detail/73133.htm
Comcol expects to be able to deliver at the end of this month.
Dutch vBeers
Simon Long of The SLOG is introducing vBeers to Holland. I’ve copied the text from his vBeers blog article.
Every month Simon Seagrave and I try to organise a social get-together of like-minded virtualization enthusiasts held in a pub in central London (and now Amsterdam). We like to call it vBeers. Before I go on, I would just like to state that although it's called vBeers, you do NOT have to drink beer or any other alcohol for that matter. This isn't just an excuse to get blind drunk.
We came up with the idea whilst on the Gestalt IT Tech Field Day back in April. We were chatting and we both recognised that we don't get together enough to catch up, mostly due to busy work schedules and private lives. We felt that if we had a set date each month, the likelihood of us actually making that date would be higher than with previous attempts. So the idea of vBeers was born.
The first Amsterdam vBeers will be held on Thursday the 16th of December, starting at 6:30pm in the 'Herengracht Cafe', which is located close to Leidseplein and Dam Square. This venue serves a fine selection of beers along with soft drinks and bar food.
Drinks will not be paid for and there will not be a tab. When you buy a drink, please pay for it, as no one else will be paying for your drinks.
* Location: The ‘Herengracht Cafe‘ Amsterdam
* Address: Herengracht 435, Herengracht/Leidsestraat
* Nearest Tram Station: Koningsplein – Lijn 1,2,5
* Time: 6:30pm
* Location: Map
