Playing tonight: DRS and the IO controllers

Ever wondered why the band is always mentioned second? Is the band replaceable? Is the sound of the instruments so interchangeable that you can swap out any musician for another? Apparently the front man is the headliner of the show, and if he does his job well he will never be forgotten. The people who truly recognize talent are the ones who care about the musicians. They understand that the artists backing the singer create the true sound of the song. And I think this is also the case when it comes to DRS and its supporting act, the I/O controllers, namely SIOC and NETIOC. If you do it right, the combination creates the music in your virtual datacenter, well at least from a resource management perspective. ;)

Last week Chris Wahl started a discussion about DRS and its perceived inability to load-balance the VMs perfectly amongst hosts. Chris knows that DRS is not a VM distribution mechanism; his argument is more focused on the distribution of load on the backend, the north-south and east-west uplinks. And for this I would recommend SIOC and NETIOC. Let’s do a 10,000-foot flyby over the different mechanisms.

Distributed Resource Scheduler (DRS)
DRS distributes the virtual machines – the consumers – across the ESXi hosts – the producers. Whenever a virtual machine wants to consume more resources, DRS attempts to provide these resources to that virtual machine. It can do this by moving other virtual machines to different hosts, or by moving the virtual machine itself to another host, trying to create an environment where the consumers can consume as much as possible. As workload patterns differ from time to time and from day to day, an equal number of VMs per host does not provide a balanced resource offering; it is better to end up with a mix of idle and active virtual machines per host.

Now think about the size of virtual machines: most environments do not have a virtual machine landscape that uses an identical hardware configuration. And even if that were the case, think about the applications: some are memory bound, some are CPU bound. To make it worse, think about load correlation and load synchronicity. Load correlation defines the relationship between loads running in different machines: one event initiates multiple loads, for example a search query on the front-end webserver resulting in commands in the supporting stack and backend. Load synchronicity is often caused by load correlation but can also exist due to user activity; it is very common to see spikes in workload at specific hours, for example log-on activity in the morning. And for every action there is an equal and opposite reaction: quite often load correlation and load synchronicity introduce periods of collective non- or low utilization, which reduce the displayed resource utilization. All this coordination is done by DRS; fixating on an identical number of VMs per host is, in my opinion, lobotomizing DRS.

But DRS is only focused on CPU and memory. Arguably you can treat network and storage consumption somewhat as CPU consumption as well, but let’s not go that deep. Some applications are storage bound, some applications are network bound. For these, other components are available in your vSphere infrastructure: the forgotten heroes, SIOC and NETIOC.

Storage IO Control (SIOC)
Storage I/O Control (SIOC) provides a method to fairly distribute storage I/O resources during times of contention. SIOC provides datastore-wide scheduling, using virtual disk shares to calculate priority. In a healthy and properly designed environment, every host that is part of the cluster should have a connection to the datastore and all hosts should have an equal number of paths to it. SIOC monitors consumption and, if the latency experienced by the virtual machines exceeds the user-defined threshold, distributes priority amongst the virtual machines hitting that datastore. By default every virtual machine receives the same priority per VMDK per datastore, but this can be modified if the application requires it from a service level perspective.
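To adjust that per-VMDK priority programmatically, something along these lines works; a minimal pyVmomi sketch, with placeholder names for the vCenter server, VM and disk label, and a custom share value picked purely as an example (the default is 1000 shares per VMDK):

```python
# Sketch only: bump the SIOC shares of one VMDK so it wins more I/O slots
# during contention. All names below are placeholders for your environment.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()   # lab shortcut; validate certificates in production
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="VMware1!", sslContext=ctx)
content = si.RetrieveContent()

# Locate the VM by name.
view = content.viewManager.CreateContainerView(content.rootFolder, [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == "app01")

# Edit the storage I/O allocation of "Hard disk 1".
for device in vm.config.hardware.device:
    if isinstance(device, vim.vm.device.VirtualDisk) and device.deviceInfo.label == "Hard disk 1":
        device.storageIOAllocation = vim.StorageResourceManager.IOAllocationInfo(
            shares=vim.SharesInfo(level="custom", shares=2000))   # default is 1000 (normal)
        change = vim.vm.device.VirtualDeviceSpec(operation="edit", device=device)
        vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(deviceChange=[change]))
        break

Disconnect(si)
```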

Network I/O Control (NETIOC)
The east-west equivalent of its north-south brother SIOC. NETIOC provides control for predictable networking performance while different network traffic streams are contending for the same bandwidth. Similar controls are offered, but on a per-traffic-type basis instead of per virtual machine. Similar architectural design hygiene applies here as well: all hosts across the cluster should have the same connection configuration and amount of bandwidth available to them. The article “A primer on Network I/O Control” provides more info on how NETIOC works. VMware published a NETIOC best practices white paper a while ago; most of it is still accurate.
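To see which traffic types get which relative priority, the share values of the network resource pools can be read straight from the distributed switch; a minimal read-only pyVmomi sketch, assuming a connection like the one in the SIOC example and a placeholder switch name:

```python
# Sketch only: list the NIOC network resource pools of a distributed switch
# and the shares/limits that decide bandwidth priority during contention.
from pyVmomi import vim

def print_nioc_pools(content, dvs_name="dvSwitch01"):       # dvs_name is a placeholder
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.DistributedVirtualSwitch], True)
    dvs = next(d for d in view.view if d.name == dvs_name)
    for pool in dvs.networkResourcePool:                     # e.g. virtualMachine, vmotion, nfs
        alloc = pool.allocationInfo
        print("%-25s shares=%-4d (%s)  limit=%s Mbit/s" %
              (pool.name, alloc.shares.shares, alloc.shares.level, alloc.limit))
```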

And the bass guitar player of the virtual datacenter, Storage DRS.
Storage DRS provides virtual machine disk placement and load balancing mechanisms based on both space and I/O capacity. Where SIOC reactively throttles hosts and virtual machines to ensure fairness, SDRS proactively generates recommendations to prevent imbalances from both space utilization and latency perspectives. More simply, Storage DRS does for storage what DRS does for compute resources.

These mechanisms, combined with a healthy – well-architected – environment, will help you distribute the consumers across the producers with the proper context in mind. Which virtual machines are hot and which are not? Much better than playing the numbers game! Now, one might argue: what about failure scenarios? If I have an equal number of VMs running on my hosts, my failover time decreases as well. Well, it depends. HA distributes virtual machines across the cluster and, if DRS is up and running, it moves virtual machines around if it cannot satisfy the resource entitlement of the virtual machines (VM-level reservations). Duncan wrote about DRS and HA behavior a while ago, and of course we touched upon this in our book, the 5.1 clustering deepdive (still fully applicable to 5.5 environments).

In my opinion, trying to outsmart advanced and adaptive computer algorithms with basic math reasoning is really weird, especially when most people are talking about software-defined datacenters and whether you are managing pets versus cattle. When your environment is healthy and laid out in a homogeneous way, you cannot beat computer algorithms. The thing you should focus on is the alignment of resource priority to business service levels, and that’s what you achieve by applying the correct share levels at the DRS, SIOC and NETIOC levels. Maybe you can devops your way into leveraging various scripting languages. ;)

Help, my DRS cluster is not load balancing!

Unfortunately I still see this cry for help appearing on the VMTN forums and on Twitter. It is usually accompanied by screenshots like this:

01-DRS-unbalanced-memory

This screen doesn’t really show you whether your DRS cluster is balanced or not. It just shows whether the virtual machines receive the resources they are entitled to. The reason why I don’t use the word demand is that DRS calculates priority based on virtual machine and resource pool resource settings and resource availability.

To understand whether a virtual machine received the resources it requires, hover over the bar and find the virtual machine. A new window is displayed with the metric “Entitled Resources Delivered”.

02-DRS-VM-resource-entitlement

DRS attempts to provide the resources requested by the virtual machine. If the current host is not able to provide the resources, DRS moves the virtual machine to another host that is able to provide them. If the virtual machine is receiving the resources it requires, there is no need to move it to another host. Moves by DRS consume resources as well and you don’t want to waste resources on unnecessary migrations.
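If you want to check this outside the UI, the same numbers are available through the API as quick stats; a minimal pyVmomi sketch, assuming a connection like the one in the SIOC example and a placeholder cluster name:

```python
# Sketch only: compare CPU demand with the CPU and memory entitlement DRS
# currently delivers to each powered-on VM, per host in the cluster.
from pyVmomi import vim

def report_entitlement(content, cluster_name="Cluster01"):   # placeholder name
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    cluster = next(c for c in view.view if c.name == cluster_name)
    for host in cluster.host:
        print("Host:", host.name)
        for vm in host.vm:
            if vm.runtime.powerState != "poweredOn":
                continue
            qs = vm.summary.quickStats
            print("  %-30s cpu demand %4d MHz, cpu entitled %4d MHz, mem entitled %5d MB" %
                  (vm.name, qs.overallCpuDemand, qs.distributedCpuEntitlement,
                   qs.distributedMemoryEntitlement))
```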

To avoid wasting resources, DRS calculates two metrics: the current host load standard deviation and the target host load standard deviation. These metrics indicate how far the current load distribution of the hosts deviates from the ideal load distribution. The migration threshold determines how far these two metrics may lie apart before the distribution of virtual machines needs to be reviewed. The web client contains this cool water-level image that indicates the overall cluster balance. It can be found on the cluster summary page and should be used as the default indicator of the cluster resource status.

03-DRS-Host Load Standard Deviation
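The two figures behind the water-level widget are also exposed on the cluster summary object, so you can trend them over time; a pyVmomi sketch under the same connection assumption (as far as I know the API reports both values multiplied by 1000):

```python
# Sketch only: print the current and target host load standard deviation of a
# DRS cluster. The values may be unset when DRS is disabled.
from pyVmomi import vim

def print_cluster_balance(content, cluster_name="Cluster01"):   # placeholder name
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    cluster = next(c for c in view.view if c.name == cluster_name)
    summary = cluster.summary
    if summary.currentBalance is not None and summary.targetBalance is not None:
        print("current host load standard deviation: %.3f" % (summary.currentBalance / 1000.0))
        print("target  host load standard deviation: %.3f" % (summary.targetBalance / 1000.0))
```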

One of the main arguments is that a host contains more than CPU and memory resources alone. Multiple virtual machines located on one host can stress or saturate the network and storage paths extensively, whereas a better distribution of virtual machines across the hosts would also result in a better distribution of load on the storage and network paths. And this is a very valid argument; however, DRS is designed to take care of CPU and memory resource distribution and is therefore unable to take these other resource consumption constraints into account.

In reality DRS takes a lot of metrics into account during its load-balancing task. For more in-depth information I would recommend reading the articles “DRS and memory balancing in non-overcommitted clusters” and “Disabling mingoodness and costbenefit”.

3 common questions about DRS preferential VM-Host affinity rules

On a regular basis I receive questions about the behavior of DRS when dealing with preferential VM-Host affinity rules. Rules configured with the rule set “should run on / should not run on” are considered preferential, meaning that DRS prefers to satisfy the requirements of the rule but is somewhat flexible and can run a VM outside the designated hosts. It is this flexibility that raises questions; let’s see how “loosely” DRS can operate within the terms and conditions of a preferential rule:

Question 1: If the cluster is imbalanced does DRS migrate the virtual machines out of the DRS host group?
DRS only considers migrating virtual machines to hosts external to the DRS host group if each host inside the group is 100% utilized. And even then, DRS will consider virtual machines that are not part of a VM-Host affinity rule first; DRS always tries to avoid violating an affinity rule.

Question 2: When a virtual machine is powered on, will DRS start the virtual machine on a host external to the DRS host group?
By default DRS will start the virtual machine on a host listed in the associated DRS host group. If all hosts are 100% utilized – or if they do not meet the virtual machine’s hardware requirements, such as datastore or network connectivity – then DRS will start the virtual machine on a host external to the DRS host group.

Question 3: If a virtual machine is running on a host external to the associated host DRS group, shall DRS try to migrate the virtual machine to a host listed in the DRS host group?
The first action DRS triggers during an invocation is to determine whether an affinity rule is violated. If a virtual machine is running on a host external to the associated DRS host group, DRS will try to correct this violation. This move gets the highest priority, ensuring that it is carried out during this invocation.
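For completeness, this is roughly what such a preferential rule looks like when created through the API; a pyVmomi sketch with placeholder rule and group names, which assumes the VM group and host group already exist on the cluster:

```python
# Sketch only: add a preferential ("should run on") VM-Host affinity rule.
# mandatory=False is what makes the rule preferential instead of a "must" rule.
from pyVmomi import vim

def add_should_rule(cluster, vm_group="VMs-SiteA", host_group="Hosts-SiteA"):
    rule = vim.cluster.VmHostRuleInfo(
        name="should-run-in-site-a",   # placeholder rule name
        enabled=True,
        mandatory=False,               # preferential: DRS may deviate under pressure
        vmGroupName=vm_group,
        affineHostGroupName=host_group)
    spec = vim.cluster.ConfigSpecEx(
        rulesSpec=[vim.cluster.RuleSpec(operation="add", info=rule)])
    return cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)
```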

Migrating datastore clusters by changing storage profiles in a vCloud

vCloud Director 5.1 supports the use of both storage profiles and Storage DRS. One of the coolest and unfortunately relatively unknown features is the ability to live-migrate virtual machines between datastore clusters by changing the storage profile in the vCloud Director portal.

In my lab I’ve set up a provider vDC that contains two compute clusters. Each compute cluster connects to two datastore clusters. Datastore cluster “vCloud-DSC-Gold” is compatible with the VM storage profile “vCloud-Gold-Storage”, while datastore cluster “vCloud-DSC-Silver” is compatible with the VM storage profile “vCloud-Silver-Storage”.
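If you want to double-check that layout on the vSphere side, datastore clusters surface in the API as StoragePod objects; a small read-only pyVmomi sketch, under the same connection assumption as the earlier examples:

```python
# Sketch only: list every datastore cluster (StoragePod) and its member
# datastores, to verify which datastores back the Gold and Silver profiles.
from pyVmomi import vim

def list_datastore_clusters(content):
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.StoragePod], True)
    for pod in view.view:
        print("Datastore cluster:", pod.name)
        for ds in pod.childEntity:            # a StoragePod is a folder of datastores
            if isinstance(ds, vim.Datastore):
                print("  %-20s free %d GB" % (ds.name, ds.summary.freeSpace // (1024 ** 3)))
```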

01

When creating a vApp, the default storage profile of the organization vDC is applied to the vApp and all its virtual machines. In this case, the VM storage profile “vCloud-Gold-Storage” is applied to all the virtual machines in the vApp.

You can determine which VM Storage Profile is associated with the virtual machine by selecting the properties of the virtual machine in the “My Cloud” tab. Please note that vCloud Director does not show the VM Storage Profile at the vApp level!

02

By selecting the drop-down box, all storage profiles that are associated with the organization vDC are displayed.

03

By selecting the storage profile “vCloud-Silver-Storage”, vCloud Director determines that the virtual machine is stored on a datastore that is not compatible with the associated storage profile. In other words, the current configuration is violating the storage level policy.

To correct this violation, vCloud Director instructs vSphere to migrate the virtual machine via Storage vMotion to a datastore that is compatible with the VM storage profile. In this case the datastore cluster “vCloud-DSC-Silver” is selected as the destination. Storage DRS determines the most suitable datastore by using its initial placement algorithm and selects the datastore that has the most free space and the lowest I/O load.
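To make that last sentence concrete, here is a deliberately simplified toy example of the trade-off; this is not the actual Storage DRS algorithm (which also models growth and device performance), just the ranking idea, with made-up names and numbers:

```python
# Toy illustration only: rank candidate datastores on free space and latency,
# the two inputs Storage DRS weighs during initial placement.
candidates = [
    {"name": "datastore-A", "free_gb": 600, "latency_ms": 2.1},
    {"name": "datastore-B", "free_gb": 850, "latency_ms": 4.2},
    {"name": "datastore-C", "free_gb": 900, "latency_ms": 3.0},
]

# Prefer more free space and lower observed latency.
best = max(candidates, key=lambda ds: ds["free_gb"] / (1.0 + ds["latency_ms"]))
print("initial placement candidate:", best["name"])
```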

To demonstrate the feature, I selected the virtual machine “W2K8_R2-SP1”. The VM storage profile “vCloud-Gold-Storage” is applied and Storage DRS determined that the datastore “nfs-f-vcloud03” of the datastore cluster “vCloud-DSC-Gold” was the most suitable location.

04

By changing the storage profile to “vCloud-Silver-Storage”, vCloud Director instructs vSphere to migrate the virtual machine to the datastore cluster that is compatible with the newly associated VM storage profile.

05

When logging into the vCenter server managing the ESXi hosts, the following task is running:

06

After the task is complete, vCenter shows that the virtual machine is now stored on datastore “nfs-f-vcloud06” in the datastore cluster “vCloud-DSC-Silver”.

07

The power of abstraction
The abstraction layer of vCloud Director makes this possible. When changing the storage profile directly at the vSphere layer, nothing happens; vSphere will not migrate the virtual machine to the datastore cluster that is compatible with the selected VM storage profile.

Useful for stretched clusters?
The reason why I was looking into this feature in my lab is a conversation with my esteemed colleagues Lee Dilworth and Aidan Dalgleish. We were looking into an alternative scenario for a stretched cluster. By leveraging the elastic vDC feature of vCloud Director, a separate DRS cluster is created in each site. Due to the automatic initial placement engine at the compute level, we needed to find a construct that could provide us with a more deterministic method of virtual machine placement. We immediately thought of the VM storage profile feature: create two datastore clusters, one per site, and associate a storage profile based on the site name with the respective datastore cluster.

08

When creating the vApp, just select the site-related storage profile to place the virtual machine in a specific site. Due to the compatibility check, vCloud Director determines that, in order to be compliant with the storage profile, it has to place the virtual machine on the compute cluster in the same site. For example, if you want to place a virtual machine in site 1, select the VM storage profile “site 1”. vCloud Director determines that the virtual machine needs to be stored in datastore cluster “DSC-Site-1”. The compute cluster Site-1 is the only compute cluster connected to that datastore cluster, therefore both the compute and storage configuration of the virtual machine are placed in site 1.

This configuration works perfectly if you want to simplify initial placement when you have multiple sites/locations and you always want to keep the virtual machine in the same site. However, this solution might not be optimal for a stretched cluster configuration where failover to another site is necessary.

Connectivity to all datastores necessary
As this feature uses Storage vMotion instead of a combined cross-host/datastore vMotion, the cluster needs to be connected to both datastore clusters.

09
When selecting a different storage profile, the storage state is migrated to another datastore cluster; however, it doesn’t move the compute state of the virtual machine. This means that storage is moved to site B, while the compute state is still in site A. vCloud Director does not provide an option to migrate the virtual machine to a different compute cluster within the provider vDC. You can either solve it by logging into the vCenter server that manages the ESXi hosts and manually vMotioning the virtual machine to the cluster in site B, or by powering off the virtual machine in vCloud Director, changing the storage profile and powering the virtual machine on again. Both “solutions” are not very enterprise-level scenarios, therefore I think this is not yet suitable for a stretched cluster configuration.
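If you do go the manual route, the compute-only move is a plain relocate call; a pyVmomi sketch with hypothetical object variables for the VM, the target host in site B and its resource pool:

```python
# Sketch only: move the compute state (running VM) to a host in the other
# site; because no datastore is specified, the disks stay where they are.
from pyVmomi import vim

def move_compute_state(vm, target_host, target_pool):
    spec = vim.vm.RelocateSpec(host=target_host, pool=target_pool)
    return vm.RelocateVM_Task(spec=spec)   # performs a vMotion when the VM is powered on
```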

Saving a Resource Pool Structure web client feature not suitable for vCD environments

Last week I published the article “Saving a Resource Pool Structure”, describing the RP-tree backup and restore feature of the vSphere 5.1 web client. Multiple people immediately asked whether the feature keeps the Managed Object Reference ID (MoRef) of the resource pools identical when it restores the resource pool tree. This is important for vCloud Director, as it creates a relationship between the vCloud Director organization vDC object and the vSphere-level resource pool; vCloud Director ties the org vDC UUID to the vSphere resource pool MoRef ID within the vCD database. For more information read Chris’s post: “Gotcha: Disabling VMware DRS with vCloud Director“.

Unfortunately the feature just captures the old tree structure and rebuilds a new tree structure. I tested it by using William Lam’s custom Perl script called moRefFinder.pl. Please visit William’s site to download his script.

1-moref-before

Then I proceeded to back up and restore the resource pool tree. vCenter showed the following commands being processed.

2-vCenter-operations

Then I checked if the MoRef ID was the same as prior to disabling DRS.

3-moref-after

As shown, the current MoRef ID of the “00-Infra-mgmt” resource pool is 137, contrary to the MoRef ID of 129 before disabling DRS.
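If you want to repeat this check in your own environment without the Perl script, the MoRef is also directly visible through pyVmomi; a quick sketch (not William Lam’s script, and the resource pool name is a placeholder), assuming a connection as in the earlier examples:

```python
# Sketch only: print the MoRef ID of a resource pool before and after the
# restore to see whether it survived.
from pyVmomi import vim

def print_rp_moref(content, rp_name="00-Infra-mgmt"):
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ResourcePool], True)
    for rp in view.view:
        if rp.name == rp_name:
            print("%s -> %s" % (rp.name, rp._moId))   # e.g. "resgroup-137"
```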

Therefore you should not use this feature when planning to back up and restore the resource pool structure used by vCD for its organization vDCs.