Resource Pools and Sibling Rivalry

July 16, 2018 by frankdenneman

One of the most powerful constructs in the Software Defined Data Center is the resource pool. The resource pool allows you to abstract and isolate cluster compute resources. Unfortunately, it is mostly misunderstood, and it received a bad rap in the past that it cannot seem to shake off.
One of the challenges with resource pools is committing to them fully. Placing virtual machines next to resource pools can have an impact on resource distribution. This article zooms in on sibling rivalry.
But before this adventure begins, I would like to stress that the examples provided in this article are a worst-case scenario. In this scenario, all VMs are 100% active. That is an uncommon situation, but it helps to explain the resource distribution clearly. Later in the article, I use a few examples in which some VMs are active and some are idle. And as you will see, resource pools aren't that bad after all.
Resource Pool Size
Because resource pool shares are relative to other resource pools or virtual machines with the same parent resource pool, it is important to understand how vCenter sizes resource pools.
The values of CPU and memory shares applied to resource pools are similar to those of virtual machines. By default, a resource pool is sized like a virtual machine with 4 vCPUs and 16 GB of RAM. Depending on the selected share level, a predefined number of shares is issued. Similar to VMs, four share levels can be selected. There are three predefined settings, High, Normal, and Low, which specify share values in a 4:2:1 ratio, and the Custom setting, which can be used to specify a different relative relationship.

Share Level   Shares of CPU   Shares of Memory
Low           2,000           81,920
Normal        4,000           163,840
High          8,000           327,680
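To make the presets concrete, here is a minimal Python sketch (an illustrative helper, not a VMware API) that reproduces the table above from the per-unit values behind the presets: 500/1,000/2,000 CPU shares per vCPU and 5/10/20 memory shares per MB for Low/Normal/High, applied to the default resource pool sizing of 4 vCPUs and 16 GB.

```python
# Illustrative helper: per-unit share values behind the Low/Normal/High presets.
CPU_SHARES_PER_VCPU = {"Low": 500, "Normal": 1000, "High": 2000}
MEM_SHARES_PER_MB = {"Low": 5, "Normal": 10, "High": 20}

def share_values(vcpus, memory_mb, level):
    """Return (CPU shares, memory shares) for a VM or resource pool."""
    return vcpus * CPU_SHARES_PER_VCPU[level], memory_mb * MEM_SHARES_PER_MB[level]

# A resource pool is sized like a 4 vCPU / 16 GB (16,384 MB) virtual machine:
for level in ("Low", "Normal", "High"):
    print(level, share_values(4, 16 * 1024, level))
# Low (2000, 81920)
# Normal (4000, 163840)
# High (8000, 327680)
```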

Caution must be taken when placing VMs at the same hierarchical level as resource pools, as VMs can end up with a higher priority than intended. For example, in vSphere 6.7, the largest virtual machine can be equipped with 128 vCPUs and 6 TB of memory. A 128 vCPU, 6 TB VM owns 256,000 (128 x 2,000) CPU shares and 122,560,000 (6,128,000 x 20) memory shares. Comparing the two results in a CPU ratio of 32:1 and a memory ratio of 374:1. This is an extreme example, but the reality is that a 4 vCPU, 16 GB VM is not uncommon anymore. Placing such a VM next to a resource pool results in unfair sibling rivalry.
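Using those same per-unit values, the ratio calculation above can be sketched as follows (the 6,128,000 MB figure for 6 TB follows the article; the variable names are illustrative only):

```python
# Sibling rivalry at the same hierarchical level: a maxed-out VM with High
# shares next to a default-sized resource pool with High shares.
vm_cpu_shares = 128 * 2000            # 256,000
vm_mem_shares = 6_128_000 * 20        # 122,560,000 (6 TB expressed in MB, per the article)
rp_cpu_shares = 4 * 2000              # 8,000   (default 4 vCPU pool, High)
rp_mem_shares = 16 * 1024 * 20        # 327,680 (default 16 GB pool, High)

print(round(vm_cpu_shares / rp_cpu_shares))   # 32  -> 32:1 CPU ratio
print(round(vm_mem_shares / rp_mem_shares))   # 374 -> 374:1 memory ratio
```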
The Family Tree of Resource Consumers
As shares determine the priority of the resource pool or virtual machine relative to its siblings, it is important to determine which objects compete for priority.

In the scenario depicted above, multiple sibling levels are present. VM01 and Resource Pool-1 are child objects of the cluster and are therefore on the same sibling level. VM02 and VM03 are child objects of Resource Pool-1. VM02 and VM03 are siblings, and both compete for the resources provided by Resource Pool-1; DRS compares their share values to each other. The share values of VM01 and the other two VMs cannot be compared with each other because they do not share the same parent and thus do not experience sibling rivalry.
Shares indicate the priority at that particular hierarchical level, but the relative priority of the parent at its level determines the availability of the total amount of resources.
VM01 is a 2-vCPU, 8 GB virtual machine. The share value of Resource Pool-1 is set to High; as a result, the resource pool owns 8,000 CPU shares. The share value of VM01 is set to Normal, and thus it owns 2,000 CPU shares (2 x 1,000). When contention occurs, the cluster distributes its resources between Resource Pool-1 and VM01 based on these share values. If both VM02 and VM03 are 100% utilized, Resource Pool-1 receives 80% of the cluster resources based on its share value.
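A quick way to see the 80/20 split: shares only count between siblings, so a parent's resources are divided proportionally to the share values of its direct children. A minimal sketch of that model (an illustration, not DRS code):

```python
# Simplified model: split a parent's resources across its direct children
# proportionally to their share values.
def distribute(total, shares):
    pool = sum(shares.values())
    return {name: total * value / pool for name, value in shares.items()}

# Cluster level: VM01 (Normal, 2 vCPU -> 2,000 shares) vs Resource Pool-1 (High -> 8,000 shares)
print(distribute(100, {"VM01": 2000, "RP-1": 8000}))
# {'VM01': 20.0, 'RP-1': 80.0}
```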

Resource Pool-1 divides its resources between VM02 and VM03. Both child objects own an equal number of shares and therefore each receive 50% of the resources of Resource Pool-1.

This 50% of Resource Pool-1's resources equals 40% of the cluster resources. For now, both VM02 and VM03 are able to receive more resources than VM01. However, three additional VMs are placed inside Resource Pool-1. Each new VM owns 2,000 CPU shares, increasing the total number of outstanding shares inside the resource pool to 10,000.
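With five equally weighted VMs inside the pool, each VM's slice of the cluster shrinks accordingly. A small sketch of the arithmetic (same simplified model as before):

```python
# Resource Pool-1 still receives 80% of the cluster; inside the pool,
# 5 VMs x 2,000 shares = 10,000 outstanding shares.
rp1_share_of_cluster = 80.0
vm_shares = 2000
total_shares = 5 * vm_shares                              # 10,000
per_vm = rp1_share_of_cluster * vm_shares / total_shares
print(per_vm)                                             # 16.0 -> each VM gets 16% of the cluster
```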

The distribution at the first level remains the same during contention. The cluster distributes its resources amongst its child objects, VM01 and Resource Pool-1: 20% to VM01 and 80% to Resource Pool-1. Please note that this distribution only occurs when all objects are 100% utilized.
If VM01 were generating only 50% of its load while the VMs in Resource Pool-1 are 100% utilized, the cluster would flow the unused resources to the resource pool to satisfy the demand of its child objects.
The dynamic entitlement is adjusted to the actual demand. The VMs inside Resource Pool-1 are equally active and, as a result of the reduced activity of VM01, they each receive 2% more resources.

VM02, VM03, and VM04 start to idle. The resource pool shifts the entitlement and allocates the cluster resources to the VMs that are active, VM05 and VM06. Due to their sibling rivalry, they each get 50% of the resource pool's 80% of the cluster resources, i.e., 40% each.
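The same simplified model can be extended with demand awareness: idle siblings release their entitlement, and the parent's resources are split by shares over the siblings that actually have demand. A hedged sketch (an interpretation of the behavior described above, not DRS source code):

```python
# Simplified demand-aware split: only siblings with active demand compete.
def distribute_active(total, shares, active):
    pool = sum(shares[name] for name in active)
    return {name: (total * shares[name] / pool if name in active else 0.0)
            for name in shares}

rp1 = 80.0                                            # Resource Pool-1's slice of the cluster
shares = {f"VM{i:02d}": 2000 for i in range(2, 7)}    # VM02..VM06, 2,000 shares each
print(distribute_active(rp1, shares, active={"VM05", "VM06"}))
# VM05 and VM06 each receive 40% of the cluster; the idle VMs receive nothing.
```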

Share Levels are Pre-sets, not Classes
A VM that is placed inside, or created in, a resource pool does not inherit the share level of that resource pool. When creating a VM or a resource pool, vCenter assigns the Normal share level by default, independent of the share level of its parent.
Think of share levels as presets of share values. Configure a resource pool or virtual machine with the share level set to High, and it gets 2,000 CPU shares per vCPU. A VM configured with the share level set to Low gets 500 CPU shares per vCPU. If that VM has 4 vCPUs, it owns the same number of shares (2,000) as a 1-vCPU VM with its share level set to High. Both compete with each other based on share amounts, not share level names.
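A two-line check of that statement (illustrative values only):

```python
# Share levels are presets, not classes: DRS compares share amounts, not level names.
CPU_SHARES_PER_VCPU = {"Low": 500, "Normal": 1000, "High": 2000}

low_4vcpu = 4 * CPU_SHARES_PER_VCPU["Low"]    # 2,000 shares
high_1vcpu = 1 * CPU_SHARES_PER_VCPU["High"]  # 2,000 shares
print(low_4vcpu == high_1vcpu)                # True: equal priority among siblings
```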
Next Article
This article is a primer for a question about which direction we should take with resource pools. This week I shall post a follow-up article that zooms in on possible changes to resource pool share behavior in the near future. Stay tuned.

Filed Under: DRS, VMware

Virtually Speaking Podcast about Technical Writing

July 11, 2018 by frankdenneman

Last week Duncan and I were guests on the ever-popular Virtually Speaking Podcast. In this show, we discussed the differences in technical writing, i.e., writing a blog post versus writing a book. We spoke a lot about the challenges of writing a book and the importance of a supporting cast. We received a lot of great feedback on social media, and Pete told me the episode was downloaded more than 1,000 times in the first 24 hours. I think this is especially impressive as he published the podcast on a Saturday afternoon. Due to this popularity, I thought it might be cool to share the episode in case you missed the announcement.
Pete and John shared the links to our VMworld sessions on this page. During the show, I mentioned the VMworld session of Katarina Wagnerova and Mark Brookfield. If you go to VMworld, I would recommend attending this session. It’s always interesting to hear people talk about how they designed an environment and dealt with problems in a very isolated place on earth.
Enjoy listening to the show.

Filed Under: VMware Tagged With: Podcast, vSphere

Resource Consumption of Encrypted vMotion

June 26, 2018 by frankdenneman

vSphere 6.5 introduced encrypted vMotion, which encrypts vMotion traffic if both the source and destination host are capable of supporting it. When encryption is used, vMotion traffic consumes more CPU cycles on both the source and destination host. This article zooms in on the impact of encrypted vMotion's CPU consumption on the vSphere cluster and how DRS leverages this new(ish) technology.

CPU Consumption of vMotion Process

ESXi reserves CPU resources on both the source and destination host to ensure vMotion can consume the available bandwidth. ESXi only takes the number of vMotion NICs and their respective speed into account; the number of concurrent vMotion operations does not affect the total amount of CPU resources reserved. The reservation is 10% of a CPU core for a 1 GbE NIC and 100% of a CPU core for a 10 GbE NIC, and vMotion is configured with a minimum reservation of 30%. Therefore, if you have a 1 GbE NIC configured for vMotion, it still reserves at least 30% of a single core.
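The reservation rule described above can be sketched as follows (an interpretation of the text, not ESXi source code; the function name is illustrative):

```python
# Reservation scales with the number and speed of vMotion NICs, not with the
# number of concurrent vMotion operations, and never drops below 30% of a core.
RESERVATION_PER_NIC = {1: 0.10, 10: 1.00}   # fraction of a CPU core per NIC speed (GbE)
MINIMUM_RESERVATION = 0.30

def vmotion_cpu_reservation(nic_speeds_gbe):
    total = sum(RESERVATION_PER_NIC[speed] for speed in nic_speeds_gbe)
    return max(total, MINIMUM_RESERVATION)

print(vmotion_cpu_reservation([1]))        # 0.3 -> a single 1 GbE NIC still reserves 30% of a core
print(vmotion_cpu_reservation([10, 10]))   # 2.0 -> two 10 GbE NICs reserve two full cores
```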

Encrypted vMotion

As mentioned, vSphere 6.5 introduced encrypted vMotion and, by doing so, also introduced a new stream channel architecture. When an encrypted vMotion process is started, three stream channels are created: Prepare, Encrypt, and Transmit. The encryption and decryption process consumes CPU cycles, and to reduce the overhead as much as possible, the encrypted vMotion process uses the AES-NI instruction set of the physical CPU.
AES-NI stands for Advanced Encryption Standard New Instructions and was introduced with the Intel Westmere-EP generation (2010) and AMD Bulldozer (2011). It's safe to say that most data centers run on AES-NI-equipped CPUs. However, if the source or destination host is not equipped with AES-NI, vMotion automatically reverts to unencrypted traffic if the default setting is selected.
Although encrypted vMotion leverages a special CPU instruction set to offload the overhead, it does increase CPU utilization. The technical paper "VMware vSphere Encrypted vMotion Architecture, Performance, and Best Practices" published by VMware lists the overhead on the source and destination host.

Encrypted vMotion CPU Overhead on the Source Host

Encrypted vMotion CPU Overhead on the Destination Host

Encrypted vMotion is a per-VM setting; by default, every VM is configured with encrypted vMotion set to Opportunistic. The three settings are:
VM Options Encrypted vMotion

Setting       Behavior
Disabled      Does not use encrypted vMotion.
Opportunistic Uses encrypted vMotion if both the source and destination host support it. Only hosts running vSphere 6.5 and later use encrypted vMotion.
Required      Only allows encrypted vMotion. If the source or destination host does not support encrypted vMotion, migration with vMotion is not allowed.
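The decision logic encoded in this table can be summarized in a short sketch (a simplified model of the setting semantics, not the actual vMotion code path):

```python
# Simplified model of the per-VM encrypted vMotion setting.
def vmotion_mode(setting, source_supports_encryption, destination_supports_encryption):
    both = source_supports_encryption and destination_supports_encryption
    if setting == "Disabled":
        return "unencrypted"
    if setting == "Opportunistic":
        return "encrypted" if both else "unencrypted"
    if setting == "Required":
        return "encrypted" if both else "migration not allowed"

print(vmotion_mode("Opportunistic", True, False))  # 'unencrypted' (e.g., a pre-6.5 destination host)
print(vmotion_mode("Required", True, False))       # 'migration not allowed'
```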

Please be aware that encrypted vMotion settings are transparent to DRS. DRS generates a load-balancing migration, and when the vMotion process starts, it verifies the requirements. Due to this transparency, DRS does not take encrypted vMotion settings and host compatibility into account when generating a recommendation.
If you select Required because of security standards, it is important to understand whether you are running a heterogeneous cluster with various vSphere versions. Is every host in your cluster running 6.5 or later? If not, you are impacting the ability of DRS to load-balance optimally. Are different CPU generations present inside the cluster, and do they all support AES-NI? Make sure the BIOS version supports AES-NI and that AES-NI is enabled in the BIOS! Also, verify that the applied Enhanced vMotion Compatibility (EVC) baseline exposes AES-NI.

CPU Headroom

It is important to keep some unreserved and unallocated CPU resources available for the vMotion process, to avoid creating gridlock. DRS needs some resources to run its threads, and vMotion requires resources to move VMs to less utilized ESXi hosts. Since encrypted vMotion taxes the system more, in oversaturated clusters it might be worthwhile to verify whether your security officer actually states encrypted vMotion as a requirement.

Filed Under: VMware

Stretched Clusters on VMware Cloud on AWS, a Really Big Thing

May 16, 2018 by frankdenneman

This week Emad published an excellent article about the stretched cluster functionality of VMware Cloud on AWS. To sum up, you can now deploy a single vSphere cluster across two AWS availability zones.

A trip to Memory Lane
I think the ability to stretch a vSphere cluster across two availability zones is a really big thing. Think back to the days when we had to refactor the application to make it highly available. To reduce application downtime, you typically used clustering software such as Microsoft Cluster Service or Veritas Cluster Server. But not all applications were a fit for this solution.
When we introduced VMware High Availability back in 2006, we brought a big change to the industry. From that point on, you could provide crash-consistent failover for all your workloads. No need to refactor any application, no need to build outlandish hardware solutions. Just enable a few tickboxes at the infrastructure layer, and every workload running inside a VM is protected. And to this day, HA remains the most popular functionality of vSphere.
Amazon Web Services Resiliency Strategy
Amazon urges you to design your application to be resilient to infrastructure outages. Amazon AWS is hosted in multiple locations worldwide. These locations are composed of regions and Availability Zones. Each region is a separate geographic area that has multiple, isolated locations known as Availability Zones. AWS provides the ability to place instances and data in multiple locations.
You can take advantage of the safety and reliability of geographic redundancy by spanning your Auto Scaling group across multiple Availability Zones within a region and then attaching a load balancer to distribute incoming traffic across those Availability Zones. Incoming traffic is distributed equally across all Availability Zones enabled for your load balancer.

And this works very well if you are refactoring your application or building a completely new cloud-native stack. The challenge we face today is that not all applications lend themselves to refactoring, and some applications do not require the journey from monolithic to full FaaS.
Hybrid-Cloud Experience
With stretched clusters in VMware Cloud on AWS, we introduce the same ease of infrastructure resiliency to workloads that run on AWS infrastructure. Merely expand your vSphere cluster to six hosts and select a multi-AZ deployment.

After that, the workloads in the Cloud SDDC are protected against AZ outages. If something happens, HA detects the failed VMs and restarts them on different physical servers in the remaining AZ without any human involvement.
The ability to stretch your vSphere cluster across AZs allows you to easily provide resiliency to your workload within the AWS infrastructure without the Herculean effort of refactoring all your applications.

Filed Under: VMware

Dying Home Lab – Feedback Welcome

May 15, 2018 by frankdenneman

The servers in my home lab are dying on a daily basis. After four years of active duty, I think they have the right to retire. So I need something else. But what? I can’t rent lab space as I work with unreleased ESXi code. I’ve been waiting for the Intel Xeon D 21xx Supermicro systems, but I have the feeling that Elon will reach Mars before we see these systems widely available. The system that I have in mind is the following:

  • Intel Xeon Silver 4108 – 8 cores at 1.8 GHz (85W TDP)
  • Supermicro X11SPM-TF (6 DIMMs, 2 x 10 GbE)
  • 4 x Kingston Premier 16GB 2133
  • Intel Optane M.2 2280 32 GB

CPU
Intel Xeon Silver 4108, 8 cores. I need a healthy number of cores in my system to run some test workloads, primarily to understand host and cluster scheduling. I do not need to run performance tests, so there is no need for screaming-fast CPU cores. It has a TDP of 85W. I know there is a 4109T with a TDP of 70W, but they are very hard to get in the Netherlands.
Motherboard
Supermicro X11SPM-TF. Rock-solid Supermicro, with 2 x Intel X722 10 GbE NICs onboard and IPMI.
Memory
Kingston Premier, 4 x 16 GB 2133 MHz. DDR4 pricing is nearing HP printer ink prices, 2133 MHz is fast enough for my testing, and I don't need to test six channels of RAM at the moment. The motherboard is equipped with 6 DIMM slots, so if memory prices come down, I can expand my system.
Boot Device
Intel Optane M.2 32 GB. ESXi still needs a boot device, and there is no need to put in a 256 GB SSD.
This is the config I’m considering. What do you think? Any recommendations or alternate views?

Filed Under: Home Lab, VMware
