vSphere 5.1 Storage DRS Multi-VM provisioning improvement

When a virtual machine is provisioned to a datastore cluster, the Storage DRS algorithm runs to determine the best placement of the virtual machine. The interesting part of this process is how Storage DRS determines the free space of a datastore, or to be more precise, the improvement made in vSphere 5.1 to the free space calculation and to the method of finding the optimal destination datastore.

vSphere 5.0 Storage DRS behavior
Storage DRS is designed to balance the utilization of the datastore cluster. It selects the datastore with the highest free space value to balance the space utilization of the datastores in the datastore cluster and to avoid out-of-space situations.

During the deployment of a virtual machine, Storage DRS initiates a simulation to generate an initial placement operation. This process is an isolated process and retrieves the current datastore free space values. However, when a virtual machine is deployed, the space usage of the datastore is updated once the virtual machine deployment is completed and the virtual machine is ready to power-on. This means that the initial placement process is unaware of any ongoing initial placement recommendations and pending storage space allocations. Let’s use an example that explains this behavior.

In this scenario the datastore cluster contains 3 datastores. Each datastore is 1TB in size and no virtual machines are deployed yet, therefore each reports 100% free space. When deploying a 500GB virtual machine, Storage DRS selects the datastore with the highest reported free space, and as all three datastores are equal it picks the first datastore, Datastore-01. Until the deployment process is complete, the datastore keeps reporting 1000GB of free space.

vSphere 5.0 Storage DRS Initial placement

When deploying a single virtual machine this behavior is not a problem. However, when deploying multiple virtual machines it might result in an unbalanced distribution of virtual machines across the datastores.

As the available space is not updated during the deployment process, Storage DRS might select the same datastore until one (or more) of the provisioning operations completes and the available free space is updated. Using the previous scenario, Storage DRS in vSphere 5.0 is likely to pick Datastore-01 again when deploying VM2 before the provisioning process of VM1 is complete, as all three datastores report the same free space value and Datastore-01 is the first datastore it detected.
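The effect can be reproduced with a few lines of Python. This is only a minimal sketch of the selection rule described above (highest reported free space, first datastore wins a tie), not the actual Storage DRS algorithm; the function name `pick_datastore` and the dictionary layout are assumptions made for illustration.

```python
def pick_datastore(reported_free_gb):
    """Return the datastore with the highest reported free space.

    max() returns the first maximal key it encounters, so a tie is
    resolved in favor of the first datastore in the dictionary.
    """
    return max(reported_free_gb, key=reported_free_gb.get)


# Three empty 1TB datastores, as in the scenario above.
reported_free_gb = {
    "Datastore-01": 1000,
    "Datastore-02": 1000,
    "Datastore-03": 1000,
}

# VM1 (500GB) is still being provisioned when VM2 is placed, so the
# reported free space is identical for both placement decisions.
vm1_target = pick_datastore(reported_free_gb)
vm2_target = pick_datastore(reported_free_gb)

print(vm1_target, vm2_target)
```

Both calls see the same stale values, so both virtual machines end up on Datastore-01.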

vSphere 5.0 Storage DRS Initial placement add VM2

vSphere 5.1 Storage DRS behavior
Storage DRS in vSphere 5.1 behaves differently. Because Storage DRS in vSphere 5.1 supports vCloud Director, it was vital to support the provisioning process of a vApp that contains multiple virtual machines.

Enter the storage lease
Storage DRS in vSphere 5.1 applies a storage lease when deploying a virtual machine on a datastore. This lease “reserves” the space and makes deployments aware of each other, thus avoiding suboptimal or invalid placement recommendations. Let’s use the deployment of a vApp as an example.

The same datastore cluster configuration is used: each datastore is empty, reporting 1000GB of free space. The vApp consists of 3 virtual machines, VM1, VM2 and VM3, which are 100GB, 200GB and 400GB in size respectively. During the provisioning process, Storage DRS needs to select a datastore for each virtual machine. As the main goal of Storage DRS is to balance the utilization of the datastore cluster, it determines which datastore has the highest free space value after each placement during the simulation.

Storage DRS Initial placement process - vApp-step-1

During the simulation VM1 is placed on Datastore-01, as all three datastores report an equal value of free space. Storage DRS then applies the lease of 100GB and reduces the available free space to 900GB.

Storage DRS Initial placement process - vApp-step-2

When Storage DRS simulates the placement of VM2, it checks the free space and determines that Datastore-02 and Datastore-03 each have 1000GB of free space, while Datastore-01 reports 900GB. Although VM2 can be placed on Datastore-01 as it does not violate the space utilization threshold, Storage DRS prefers to select the datastore with the highest free space value. Storage DRS will choose Datastore-02 in this scenario as it picks the first datastore if multiple datastores report the same free space value.

Storage DRS Initial placement process - vApp-step-3

The simulation determines that the optimal destination for VM3 is Datastore-03, as this reports a free space value of 1000GB, while Datastore-02 reports 800GB of free space and Datastore-01 reports 900GB of free space.
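The three steps above can be sketched in Python. This is a simplified illustration under stated assumptions, not the real Storage DRS implementation: the function `place_with_leases` and its data structures are hypothetical, and the sketch ignores the space utilization threshold and I/O load, looking only at free space minus outstanding leases.

```python
def place_with_leases(vm_sizes_gb, free_gb):
    """Place each VM on the datastore with the most unleased free space.

    vm_sizes_gb: mapping of VM name -> size in GB, in placement order.
    free_gb:     mapping of datastore name -> reported free space in GB.
    Returns (placements, leased_gb).
    """
    leased_gb = {name: 0 for name in free_gb}
    placements = {}
    for vm, size in vm_sizes_gb.items():
        # Free space as seen by the simulation: reported value minus leases.
        effective = {name: free_gb[name] - leased_gb[name] for name in free_gb}
        # Highest value wins; on a tie the first datastore is picked.
        target = max(effective, key=effective.get)
        leased_gb[target] += size  # apply the lease for this VM
        placements[vm] = target
    return placements, leased_gb


free_gb = {"Datastore-01": 1000, "Datastore-02": 1000, "Datastore-03": 1000}
vapp = {"VM1": 100, "VM2": 200, "VM3": 400}

placements, leased_gb = place_with_leases(vapp, free_gb)
print(placements)
print(leased_gb)
```

With the vApp from this example, VM1 lands on Datastore-01, VM2 on Datastore-02 and VM3 on Datastore-03, matching the walkthrough.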

This lease is applied during the simulation of placement for the generation of the initial placement recommendation and remains applied until the placement process of the virtual machine is completed or until the operation times out. This means that not only the vApp deployment is aware of the storage lease, but other deployment processes are as well.
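The lifetime of a lease can be modeled roughly as follows. This is a hypothetical sketch: the class `LeaseTable`, its method names and the 600-second timeout are assumptions for illustration only, as the actual timeout value and internal bookkeeping of Storage DRS are not described here. Time is passed in explicitly to keep the example deterministic.

```python
class LeaseTable:
    """Tracks outstanding storage leases per virtual machine (illustrative)."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s  # hypothetical timeout, not a documented value
        self._leases = {}           # VM name -> (datastore, size_gb, start time)

    def acquire(self, vm, datastore, size_gb, now):
        """Reserve space on a datastore when a placement is recommended."""
        self._leases[vm] = (datastore, size_gb, now)

    def complete(self, vm):
        """Drop the lease once provisioning of the VM has finished."""
        self._leases.pop(vm, None)

    def leased_gb(self, datastore, now):
        """Return the space currently leased on a datastore, dropping expired leases."""
        self._leases = {
            vm: lease
            for vm, lease in self._leases.items()
            if now - lease[2] < self.timeout_s
        }
        return sum(size for ds, size, _ in self._leases.values() if ds == datastore)


table = LeaseTable(timeout_s=600)
table.acquire("VM1", "Datastore-01", 100, now=0)
table.acquire("VM2", "Datastore-01", 200, now=100)

before = table.leased_gb("Datastore-01", now=200)          # both leases active
table.complete("VM1")                                      # VM1 finished provisioning
after_complete = table.leased_gb("Datastore-01", now=300)  # only VM2 remains
after_timeout = table.leased_gb("Datastore-01", now=700)   # VM2 lease has timed out
print(before, after_complete, after_timeout)
```

Any placement that consults the table while a lease is active sees the reduced free space, regardless of which deployment created the lease.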

Update to vSphere 5.1
This new behavior is extremely useful when deploying multiple virtual machines in batches, such as vApp deployments or vHadoop environments with the use of Serengeti.

Get notification of these blog postings and more DRS and Storage DRS information by following me on Twitter: @frankdenneman

Comments

  1. Ed says

    I’m going to have to respectfully disagree and say that SDRS is NOT looking at free space because there are well-documented bugs that say it isn’t – it’s subtracting the sum of the allocated storage on the datastore from the size of the datastore. So if you have a 1TB datastore with a total of 800GB of VMDKs allocated, it will use 200GB in its calculations. If you de-dupe the storage (let’s assume 2:1), it will still use 200GB “free” in its calculations even though there is 600GB free. If you have a 1TB datastore with 1.2TB of allocated space but 500GB free, it will say that it’s full.

    This is very significant for customers that de-dupe on their storage controllers (e.g NetApp and ASIS) and is not planned to be fixed in 5.1. I’m forced to remove SDRS completely from my environment.

  2. says

    Hi Ed,

    Thanks for replying. I’m curious about the well-documented bugs. Which bugs are you referring to? As Storage DRS does not have the ability to determine the current state of array-based features such as deduplication, and as no storage vendor provides any information about their functionality states to Storage DRS, it cannot predict the dedup effectiveness of the dedup pool. Therefore Storage DRS algorithms are only applied on the logical level, and that is the datastore level visible to the vSphere infrastructure (ESXi hosts and vCenter). And thus it will only consider VM footprints inside the datastore and not the actual footprint on physical media. Duncan and I wrote the whitepaper: “VMware vSphere Storage DRS Interoperability” which is available at: http://www.vmware.com/files/pdf/techpaper/vsphere-storage-drs-interoperability.pdf
    I recommend reading it as it also provides more detailed information about the interoperability of Storage DRS with other storage array features.

  3. Ed says

    I agree that storage DRS may not know whether or not the array is deduplicated or thin provisioned. However, it definitely does know how much free space is available on a datastore. It will, by current design, refuse to allocate more space on the datastore than the size of the datastore, no matter how much free space it has. It will fail to make a recommendation whether the array is deduplicated or if the virtual disks are thin-provisioned.

    The interoperability whitepaper (which I had already printed out) says that both array-based deduplication and thin provisioning are supported for initial placement. However, initial placement will fail if the storage is over-committed, which is documented in http://kb.vmware.com/kb/2017605. I would like to request that you update the whitepaper to reflect the current limitations – why would we dedupe or thin provision if we weren’t going to over-commit? I can’t go to my storage team and ask for another 50-60TB of disk space (and I’m not exaggerating) so I can turn on storage DRS but not over-commit.

    I don’t need full automation but I’m really frustrated that initial placement doesn’t work. If storage DRS simply looked at the free space of the datastores rather than the sum of the virtual disks, this problem would go away (for me anyway). This is VMware PR# 861923. My support escalation yielded the following response:

    “The reason we don’t have a scheduled fix or patch for this is because this requires a design change on the product at the code level and per the update from engineering, this is not viable with a hot patch.

    The current workaround is not to over commit storage and as long as the storage is not over-committed, the SDRS feature would work as designed.

    This is to be fixed in VC/ESX 6.0 and I will notify you if there’s an earlier fix available.”

  4. Ed says

    At the VMware Optimize and Scale class I recently attended, I mentioned the initial placement bug to the instructor who was actually quite surprised when I told him that thin provisioning wasn’t really supported. I demonstrated the bug very simply during class:
    1. Create a 10GB datastore.
    2. Create a 20GB thin-provisioned virtual machine. This works.
    3. Delete the virtual machine.
    4. Create an SDRS cluster with the same datastore.
    5. Create a 20GB thin-provisioned virtual machine. This fails.

    This is a supported configuration but simply doesn’t work.

  5. says

    Ed,

    It’s not a bug, this is the way it’s designed. We are working on an alternative procedure for initial placement of thin-provisioned disks. I cannot share publicly when this will be released, but if you need the current behavior changed, please file a feature request as this will help to prioritize this new feature.

  6. says

    I was interested to read this article about use of Storage DRS with vCloud Director, as the documentation says that Storage DRS is not to be enabled on vCenters used with vCloud installations.

    Page 9 of the vCloud Director 5.1 Install and Upgrade Guide:

    “vCenter clusters used with vCloud Director must not enable storage DRS.”

    Is the opposite actually the case? Enabling it makes sense as we’ve just had a case in our testing where a single Org was able to fill a datastore, taking out all the VMs on that datastore. Not ideal on a multi-tenant system!