
Limiting the number of Storage vMotions

June 28, 2012 by frankdenneman

When entering datastore maintenance mode, Storage DRS moves virtual machines out of the datastore as fast as it can. The number of virtual machines that can be migrated in or out of a datastore concurrently is limited to 8. This limit is related to the concurrent migration limits of hosts, networks and datastores. To manage and limit the number of concurrent migrations, whether by vMotion or Storage vMotion, a cost and limit factor is applied. Although the term limit is used, a better description of limit is maximum cost.
In order for a migration operation to start, the combined cost cannot exceed the max cost (limit). A vMotion and a Storage vMotion are considered operations. The ESXi host, the network and the datastore are considered resources. A resource has both a max cost and an in-use cost. When an operation is started, the sum of the in-use cost and the new operation cost cannot exceed the max cost.

The operation cost of a Storage vMotion on a host is 4; the max cost of a host is 8. If one Storage vMotion operation is running, the in-use cost of the host resource is 4, allowing one more Storage vMotion process to start without exceeding the host limit.
As a Storage vMotion operation also counts against the storage resource, the max cost and in-use cost of the datastore need to be factored in as well. The operation cost of a Storage vMotion for datastores is set to 16; the max cost of a datastore is 128. This means that 8 concurrent Storage vMotion operations can be executed on a datastore. These operations can be started on multiple hosts, with no more than 2 Storage vMotions per host due to the max cost of a Storage vMotion operation at the host level.
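To make the cost model explicit, here is a minimal sketch of how such a cost/limit admission check can be reasoned about. It is not VMware code; the helper function and its structure are assumptions, only the cost figures (host max 8, Storage vMotion host cost 4, datastore max 128, Storage vMotion datastore cost 16, vMotion datastore cost 1) come from this article.

# Illustrative sketch only -- not VMware code. Cost figures taken from the article.
HOST_MAX_COST = 8
DATASTORE_MAX_COST = 128

SVMOTION_HOST_COST = 4
SVMOTION_DATASTORE_COST = 16
VMOTION_DATASTORE_COST = 1

def can_start(operation_cost, in_use_cost, max_cost):
    # An operation may start only if the in-use cost plus the new
    # operation cost does not exceed the max cost (the "limit").
    return in_use_cost + operation_cost <= max_cost

# One Storage vMotion already active on a host (in-use cost 4):
print(can_start(SVMOTION_HOST_COST, 4, HOST_MAX_COST))   # True  -> a 2nd Storage vMotion fits
print(can_start(SVMOTION_HOST_COST, 8, HOST_MAX_COST))   # False -> a 3rd one is queued

# Seven Storage vMotions active on a datastore (in-use cost 7 * 16 = 112):
print(can_start(SVMOTION_DATASTORE_COST, 112, DATASTORE_MAX_COST))  # True -> the 8th one fits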

[Screenshot: Storage vMotion in progress]

How to throttle the number of Storage vMotion operations?
To throttle the number of Storage vMotion operations and reduce the IO hit on a datastore during maintenance mode, it is preferable to reduce the max cost for provisioning operations to the datastore. Adjusting host costs is strongly discouraged. Host costs are set the way they are because of host resource limitations; adjusting them can impact other host functionality unrelated to vMotion or Storage vMotion processes.
Adjusting the max cost per datastore can be done by editing the vpxd.cfg or via the advanced settings of the vCenter Server Settings in the administration view.
If done via the vpxd.cfg, the value vpxd.ResourceManager.MaxCostPerEsx41Ds is added as follows:

<config>
  <vpxd>
    <ResourceManager>
      <MaxCostPerEsx41Ds>new value</MaxCostPerEsx41Ds>
    </ResourceManager>
  </vpxd>
</config>

As the max cost has not been increased since ESX 4.1, the value name remains the same and is valid for all ESX 4.1+ hosts.
Please remember to leave some room for vMotion when resizing the max cost of a datastore. The vMotion process has a datastore cost as well: during the stun/unstun of a virtual machine the vMotion process hits the datastore, and the cost involved in this process is 1.
For example, changing the max cost per datastore to 112 allows 7 concurrent Storage vMotions against a given datastore in the vCenter inventory. If 7 concurrent Storage vMotions are started on this datastore, a vMotion of a virtual machine using this datastore is queued because it would violate the max cost of the datastore: 7 x 16 = 112, plus 1 for the vMotion = 113, which exceeds 112. The moment a Storage vMotion completes, the vMotion process resumes as resources become available.
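Continuing the earlier sketch (an illustration, not VMware code), the queuing decision in this example works out as follows:

# Hypothetical max cost of 112 instead of the default 128:
CUSTOM_DATASTORE_MAX_COST = 112

in_use = 7 * 16                                        # seven running Storage vMotions
print(in_use + 1 <= CUSTOM_DATASTORE_MAX_COST)         # False: 113 > 112, the vMotion is queued
print((in_use - 16) + 1 <= CUSTOM_DATASTORE_MAX_COST)  # True once one Storage vMotion completes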
Please note that cost and max values are applied to each migration process, impacting normal day-to-day DRS and Storage DRS load-balancing operations as well as manual vMotion and Storage vMotion operations occurring in the virtual infrastructure managed by the vCenter server.
As mentioned before, adjusting costs on the host side can be tricky, as the operation costs and limits are relative to each other, and changing them can even harm other host processes unrelated to migrations. If you still have the urge to change the cost on the host, consider the impact on DRS! Increasing the cost of a Storage vMotion operation on the host reduces the available “slots” for vMotion operations. This might impact DRS load-balancing efficiency while a Storage vMotion process is active and should be avoided at all times.
Get notified of these blog postings and more DRS and Storage DRS information by following me on Twitter: @frankdenneman

Filed Under: DRS, Storage DRS

DRS clusters and allocating reserved memory

May 21, 2012 by frankdenneman

As mentioned in The Admission Control Family article, multiple features on multiple layers check whether there is enough unused reserved memory available. This article is part of a short series on how memory is claimed and provided as reserved memory; other articles will be posted throughout the week.
Refresher
I’ve published two articles that describe memory reservation at the VM level and the resource pool level. These two sources are an excellent way to refresh your memory (no pun intended) on the reservation construct:
• Impact of memory reservation (VM-level)
• Resource pool memory reservations

“Unclaimed” reserved memory?
If a memory reservation is configured on a child object (virtual machine or resource pool), admission control checks if there is enough unused reserved memory available. Which memory can be claimed as reserved memory? And what about host-level and cluster-level memory? Let’s dissect the cluster tree of resource providers and resource consumers, starting with a bottom-up approach.

Host-level to DRS cluster
Both the host and the DRS cluster are resource providers to the resource consumers, i.e. resource pools and virtual machines. When a host becomes a member of a DRS cluster, all its available memory resources are placed at DRS’s disposal. The available memory of a host is the memory that is left after the VMkernel has claimed host memory. The DRS cluster, also called the root resource pool, reserves this remaining memory.

As the DRS cluster reserves this memory per host, all the memory is aggregated inside the root resource pool and is actually designated as reserved memory. However, to prevent confusion, this reserved memory is labeled as unused reserved memory in the vSphere Client user interface and is provided as such to the child resource pools and child virtual machines. The Resource Allocation tab of the cluster lists the Total memory capacity of the cluster as well as the Reserved capacity. The Available capacity is the result of Total capacity – Reserved capacity. Note that if HA is configured, the amount of resources reserved for failover is automatically added to the reserved capacity.
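To make the arithmetic explicit, here is a minimal sketch of the capacity calculation described above. The function and its input names are assumptions for illustration; the relationships (Available = Total – Reserved, with HA failover capacity counted as reserved) come from the text.

# Illustrative sketch, not the vSphere implementation.
def cluster_available_capacity(total_capacity_mb, reserved_by_children_mb,
                               ha_failover_capacity_mb=0):
    # Available capacity = Total capacity - Reserved capacity.
    # If HA is configured, the capacity reserved for failover is
    # automatically added to the reserved capacity.
    reserved = reserved_by_children_mb + ha_failover_capacity_mb
    return total_capacity_mb - reserved

# Example: 128 GB cluster, 32 GB reserved by child objects, 16 GB kept aside for HA failover.
print(cluster_available_capacity(131072, 32768, 16384))  # 81920 MB available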

Child resource pools
Resource pools allow for hierarchical partitioning of the cluster, but they always span the entire cluster. Resource pools draw resources from the root resource pool and do not pick and choose resources from specific hosts. The root resource pool functions as an abstraction layer. When configuring a reservation at the resource pool level, the specified amount of memory is claimed by that resource pool and cannot be allocated by other resource pools.

Note that the claim of reserved resources by the resource pool is done immediately during the creation of the resource pool. It does not matter whether there are running virtual machines inside the resource pool or not. The total configured reservation is withdrawn from the root resource pool and is thus unavailable to other resource pools. Please keep this in mind when sizing resource pools.
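A minimal sketch of this behavior, assuming hypothetical class and method names (the behavior itself, a resource pool reservation claimed from the root pool immediately at creation regardless of running virtual machines, is what the paragraph above describes):

# Illustrative sketch, not the vSphere implementation.
class RootResourcePool:
    def __init__(self, capacity_mb):
        self.unreserved_mb = capacity_mb              # memory left after VMkernel overhead

    def create_child_pool(self, reservation_mb):
        # The reservation is claimed immediately at creation time,
        # even if no virtual machine runs inside the pool yet.
        if reservation_mb > self.unreserved_mb:
            raise ValueError("insufficient unreserved memory in the root resource pool")
        self.unreserved_mb -= reservation_mb

root = RootResourcePool(capacity_mb=81920)    # 80 GB provided by the cluster hosts
root.create_child_pool(reservation_mb=32768)  # 32 GB reservation on an empty resource pool
print(root.unreserved_mb)                     # 49152 MB left for other pools and VMs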
The next article will expand on virtual machines inside a resource pool.

Filed Under: DRS

The Admission Control Family

May 10, 2012 by frankdenneman

It’s funny how sometimes something, in this case a vSphere feature, becomes a “trending topic” on any given day or week. Yesterday I was discussing admission control policies with Rawlinson Riviera (@punchingclouds) and we discussed how to properly calculate a percentage for the percentage-based admission control while keeping consolidation ratios in mind. And today Gabe published an article about his misconception of admission control, which triggered me to write an article of my own about admission control.
When discussing admission control, usually only HA admission control policies are mentioned. However, HA isn’t the only feature using some sort of admission control. Storage DRS, DRS and the ESX(i) host each have their own admission control. Let’s take a closer look at what admission control actually is and see how each admission control fits into the process of a virtual machine power-on operation.
What’s the function of admission control? I like to call it our team of virtual bouncers. Admission control is there to ensure that sufficient resources are available for the virtual machine to function within its specified parameters / resource requirements. The last part about the parameters and requirements is the key to understanding admission control.
During a virtual machine power-on or move operation, admission control checks if sufficient unreserved resources are available before allowing the virtual machine to be powered on or moved into the cluster. If a virtual machine is configured with a reservation (CPU, memory or both), admission control needs to make sure that the datastore cluster, compute cluster, resource pool and host can provide these resources. If one of these constructs cannot allocate and provide these resources, then it cannot provide an environment where the virtual machine can operate within its required parameters. In other words, the moment a virtual machine is configured to have an X amount of resources guaranteed, you want the environment to actually honor that guarantee, and that’s why admission control was developed.
As a vSphere environment can be configured in many different ways, each feature sports its own admission control, as you do not want to introduce dependencies for such a crucial component. Let’s take a closer look at each admission control feature and their checkpoints.

High Availability admission control: During a virtual machine power-on operation, HA checks if the virtual machine can be powered on without violating the required capacity to cope with a host failure event. Depending on the HA admission control policy, HA checks if the cluster can provide enough unreserved resources to satisfy the virtual machine reservation. The internals of each type of admission control policy are outside the scope of this article; more information can be found in the clustering deep dive books or online at Yellow-bricks.
After HA admission control gives the green light, it’s up to Storage DRS admission control if the virtual machine is placed in a Storage DRS datastore cluster. Storage DRS admission control checks datastore connectivity amongst the hosts in the datastore cluster and selects the host with the highest datastore connectivity to ensure the highest portability of the virtual machine. If there are multiple hosts with the same number of datastores connected, it selects the host with the lowest compute utilization.
Up next is DRS admission control to review the cluster state. DRS ensures that sufficient unreserved resources are available in the cluster before allowing the virtual machine to power on. If the virtual machine is placed inside a resource pool, DRS checks if the resource pool can provide enough resources to satisfy the reservation. Depending on the “expandable reservation” setting, the resource pool checks its own pool of unreserved resources or borrows resources from its parent. If a virtual machine is moved into the cluster and EVC is enabled on the DRS cluster, EVC admission control verifies that the EVC mode applied to the virtual machine does not exceed the current EVC baseline of the cluster. DRS selects a host based on the configured VM-VM and VM-Host affinity rules.
The last step is Host admission control. In the end it’s the host that actually needs to provide the compute environment for the virtual machine to operate in. A cluster can have enough unreserved resources available, but they can be fragmented: there may not be enough resources available on any single host to satisfy the virtual machine reservation. To solve this problem, a DRS invocation is triggered to recommend virtual machine migrations to rebalance the cluster and free up space on a particular host for the new virtual machine. If DRS is not enabled, the host rejects the virtual machine due to its inability to provide the required resources. Host admission control also verifies that the virtual machine configuration is compatible with the host: the VM networks and datastores must be available in order to accommodate the virtual machine. If the virtual machine is subject to a “must” VM-Host affinity rule, admission control checks whether the host is listed in the virtual machine’s compatibility list. The last check is whether the host can create a VM swap file on the designated VM swap location.
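The chain of checks described above could be summarized roughly as follows. This is a simplified, hypothetical sketch of the order of operations walked through in this article, not an actual vCenter API or code path; every function and field name is made up for illustration.

# Hypothetical sketch of the admission control chain; not part of the vSphere API.
def power_on(vm, cluster):
    # 1. HA admission control: enough failover capacity left for this reservation?
    if not cluster["ha_check"](vm):
        return "rejected by HA admission control"
    # 2. Storage DRS admission control: prefer hosts with the best datastore connectivity.
    hosts = cluster["sdrs_placement"](vm) if vm.get("datastore_cluster") else cluster["hosts"]
    # 3. DRS admission control: unreserved resources, resource pool, EVC, affinity rules.
    host = cluster["drs_select_host"](vm, hosts)
    if host is None:
        return "rejected by DRS admission control"
    # 4. Host admission control: fragmentation, networks/datastores, must-rules, swap file.
    if not cluster["host_check"](vm, host):
        return "rejected by host admission control"
    return "powered on on " + host

# Toy wiring: every check passes and DRS picks the first candidate host.
cluster = {
    "hosts": ["esx01", "esx02"],
    "ha_check": lambda vm: True,
    "sdrs_placement": lambda vm: ["esx02"],
    "drs_select_host": lambda vm, hosts: hosts[0] if hosts else None,
    "host_check": lambda vm, host: True,
}
print(power_on({"name": "vm01", "datastore_cluster": True}, cluster))  # powered on on esx02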
So there you have it: before a virtual machine is powered on or moved into a cluster, all these admission controls make sure the virtual machine can operate within its required parameters and that no cluster feature requirement is violated.

Filed Under: DRS

Mixing Resource Pools and Virtual Machines on the same hierarchical level

May 9, 2012 by frankdenneman

One of the most frequent questions I receive is about mixing resource pools and virtual machines at the same hierarchical level. In almost all cases we recommend committing to one type of child object: either use resource pools and place virtual machines within them, or place only virtual machines at that hierarchical level. The main reason for this is how resource shares work.
Shares determine the relative priority of the child objects (virtual machines and resource pools) at the same hierarchical level and decide how excess resources (total system resources – total reservations) made available by other virtual machines and resource pools are divided.
Shares are level-relative, which means that the number of shares is compared between the child objects of the same parent. Since they signify relative priorities, the absolute values do not matter; comparing 2:1 or 20,000 to 10,000 will have the same result.
Let’s use an example to clarify. In this scenario the DRS cluster (root resource pool) has two child objects: Resource Pool 1 and virtual machine 1. The DRS cluster issues shares amongst its children: 4000 shares to the resource pool and 2000 shares to the virtual machine. This results in 6000 shares being active on that particular hierarchical level.

During contention the child objects compete for resources as they are siblings and belong to the same parent. This means that the virtual machine with 2000 shares needs to compete with the resource pool that has 4000 shares. As 6000 shares are issued on that hierarchical level, the relative value of each child entity is (2000 of 6000 shares) = 33% for the virtual machine and (4000 of 6000 shares) = 66% for the resource pool.
The problem with this configuration is that the resource pool is not only a resource consumer but also a resource provider: it must claim resources on behalf of its children. Expanding on the first scenario, two virtual machines are placed inside the resource pool.

The resource pool issues shares amongst its children: 1000 shares to virtual machine 2 (VM2) and 2000 shares to virtual machine 3 (VM3). This results in 3000 shares being active on that particular hierarchical level.
During contention the child objects compete for resources as they are siblings and belong to the same parent, which is the resource pool. This means that VM2, owning 1000 shares, needs to compete with VM3, which has 2000 shares. As 3000 shares are issued on that hierarchical level, the relative value of each child entity is (1000 of 3000 shares) = 33% for VM2 and (2000 of 3000 shares) = 66% for VM3.
As the resource pool needs to compete for resources with the virtual machine on the same level, the resource pool can only obtain 66% of the cluster resources. These resources are divided between VM2 and VM3. That means that VM2 can obtain up to 22% of the cluster resources (1/3 of 66% of the total cluster resources is 22%).
Fast forward to scenario 3: two additional virtual machines are created on the same level as Resource Pool 1 and virtual machine 1. The DRS cluster issues 1000 shares to VM4 and 1000 shares to VM5.

As the DRS cluster issued an additional 2000 shares, the total number of issued shares increases to 8000, diluting the relative share values of Resource Pool 1 and VM1. Resource Pool 1 now owns 4000 shares of a total of 8000, bringing its relative value down from 66% to 50%. VM1 owns 2000 shares of 8000, bringing its value down to 25%. VM4 and VM5 each own 12.5% of the shares.
As Resource Pool 1 provides resources to its child objects VM2 and VM3, fewer resources are divided between VM2 and VM3. In this scenario VM2 can obtain up to 16% of the cluster resources (1/3 of 50% of the total cluster resources is roughly 16%). Introducing more virtual machines at the same sibling level as Resource Pool 1 dilutes the resources available to the virtual machines inside Resource Pool 1.
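A minimal sketch of the worst-case share math used in these scenarios (illustrative only; the share values are the ones from the example, the helper function is an assumption):

# Illustrative worst-case share math; reproduces the scenario 3 numbers above.
def relative_values(shares_by_object):
    total = sum(shares_by_object.values())
    return {name: shares / total for name, shares in shares_by_object.items()}

# Root level: Resource Pool 1, VM1, VM4 and VM5 sharing 8000 shares.
root = relative_values({"RP1": 4000, "VM1": 2000, "VM4": 1000, "VM5": 1000})
# Inside Resource Pool 1: VM2 and VM3 sharing 3000 shares.
inside_rp1 = relative_values({"VM2": 1000, "VM3": 2000})

print(root)                              # RP1 0.50, VM1 0.25, VM4 0.125, VM5 0.125
print(root["RP1"] * inside_rp1["VM2"])   # ~0.167 -> VM2 tops out around 16% of the cluster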
This is why we recommend committing to a single type of entity at a specific sibling level. If you create resource pools, stick with resource pools at that level and provision virtual machines inside the resource pools.
Another fact is that a resource pool receives shares similar to a 4-vCPU, 16GB virtual machine, resulting in a default share value of 4000 CPU shares and 163840 memory shares when the Normal share level is selected. When you create a monster virtual machine and place it next to the resource pool, the resource pool will be dwarfed by this monster virtual machine, resulting in resource starvation.
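Those defaults follow from the Normal share level, which awards 1000 CPU shares per vCPU and 10 memory shares per MB of configured memory (the formula is vSphere's documented default; the short calculation below is just an illustration):

# Normal share level: 1000 CPU shares per vCPU, 10 memory shares per MB of configured memory.
vcpus, memory_gb = 4, 16
cpu_shares = 1000 * vcpus                 # 4000   -> matches the resource pool default
memory_shares = 10 * memory_gb * 1024     # 163840 -> matches the resource pool default
print(cpu_shares, memory_shares)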
Note: shares are not simply a weighting system for resources. All scenarios demonstrating the way shares work are based on a worst-case situation: every virtual machine claims 100% of its resources, the system is overcommitted and contention occurs. In real life this situation (hopefully) does not occur very often. During normal operations, not every virtual machine is active and not every active virtual machine is 100% utilized. Activity and the amount of contention are two elements determining the resource entitlement of active virtual machines. For ease of presentation, we tried to avoid as many variable elements as possible and used a worst-case scenario in each example.
So when can it be safe to mix and match virtual machines and resource pools at the same level? When all child objects are configured with reservations equal to their configured size and limits; this results in an environment where shares are overruled by reservations and no opportunistic allocation of resources exists. But this design introduces other constraints to consider.

Filed Under: DRS

(Storage) DRS (anti-) affinity rule types and HA interoperability

February 6, 2012 by frankdenneman

Lately I have received many questions about the interoperability between HA and affinity rules of DRS and Storage DRS. I’ve created a table listing the (anti-) affinity rules available in a vSphere 5.0 environment.

Technology | Type     | Affinity                        | Anti-Affinity                     | Respected by VMware HA
DRS        | VM-VM    | Keep virtual machines together  | Separate virtual machines         | No
DRS        | VM-Host  | Should run on hosts in group    | Should not run on hosts in group  | No
DRS        | VM-Host  | Must run on hosts in group      | Must not run on hosts in group    | Yes
SDRS       | Intra-VM | VMDK affinity                   | VMDK anti-affinity                | N/A
SDRS       | VM-VM    | Not available                   | VM anti-affinity                  | N/A

As the table shows, HA will ignore most of the (anti-) affinity rules in its placement operations after a host failure, except the “Virtual Machine to Host” must-rules. Every type of rule is part of the DRS ecosystem and exists in the vCenter database only. A restart of a virtual machine performed by HA is a host-level operation, and HA does not consult the vCenter database before powering on a virtual machine.
Virtual machine compatibility list
The reason why HA respects the must-rules is DRS’s interaction with the host-local “compatlist” file. This file contains a compatibility matrix for every HA-protected virtual machine and lists all the hosts with which the virtual machine is compatible. This means that HA will only restart a virtual machine on hosts listed in the compatlist file.
DRS Virtual machine to host rule
A “virtual machine to hosts” rule requires the creation of a Host DRS Group; this cluster host group is usually a subset of the hosts that are members of the HA and DRS cluster. Because of the intended use case for must-rules, such as honoring ISV licensing models, the cluster host group associated with a must-rule is pushed down directly into the compatlist.
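Conceptually (and only conceptually; the real compatlist file format is not shown here, so the structure below is an assumption), the restart placement can be thought of as a simple filter over the surviving hosts:

# Hypothetical illustration of the idea only; the actual compatlist format differs.
compatlist = {
    "vm01": ["esx01", "esx02"],                      # host group pushed down by a must-rule
    "vm02": ["esx01", "esx02", "esx03", "esx04"],    # no must-rule: compatible with all hosts
}

def ha_restart_candidates(vm_name, surviving_hosts):
    # HA only restarts the VM on hosts present in its compatibility list.
    return [h for h in surviving_hosts if h in compatlist.get(vm_name, [])]

print(ha_restart_candidates("vm01", ["esx02", "esx03", "esx04"]))  # ['esx02']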
Note
Please be aware that the compatibility list file is used by all types of power-on operations and load-balancing operations. When a virtual machine is powered on, whether manually (by an admin) or by HA, the compatibility list is checked. When DRS performs a load-balancing operation or a maintenance mode operation, it checks the compatibility list. This means that no type of operation can override must-type affinity rules. For more information about when to use must and should rules, please read this article: Should or Must VM-Host affinity rules.
Constraint violations
After HA powers on a virtual machine, it might violate a VM-VM or VM-Host “should” (anti-) affinity rule. DRS will correct this constraint violation in the next invocation and restore “peace” to the cluster.
Storage DRS (anti-) affinity rules
When HA restarts a virtual machine, it does not move the virtual machine files. Therefore, the creation of Storage DRS (anti-) affinity rules does not affect virtual machine placement after a host failure.

Filed Under: DRS, Storage DRS
