New Storage DRS whitepaper available at VMware.com

Last Friday my last and latest whitepaper about Storage DRS was published on VMware.com. Go to http://www.vmware.com/resources/techresources/10363 and download the whitepaper: “Understanding vSphere 5.1 Storage DRS“.

Download and read this whitepaper if you want to learn more about the five key elements of Storage DRS. Here’s a little snippet from the whitepaper:

Step 1. Determine Whether Datastores Are Violating the Space-Utilization Threshold
If the space utilization of a datastore exceeds 80 percent, the datastore violates the threshold and the vSphere Storage DRS load-balancing algorithm is invoked. vSphere Storage DRS attempts to avoid an out-of-space situation and therefore runs a load-balancing operation as soon as the datastore exceeds its space-utilization threshold. This operation can be outside of the normal load-balancing interval of every 8 hours.
The space-utilization threshold is a soft limit, enabling vSphere Storage DRS to place virtual machines in the datastore cluster even if all datastores exceed the space-utilization threshold. vSphere Storage DRS attempts to generate prerequisite migrations before virtual machine placement. If this fails, the virtual machine is placed on the datastore that provides the best overall cluster balance.

SDRS-whitepaper-screenshot

This performance applies to space-utilization load-balancing operations as well, even if all datastores violate the space-utilization threshold. vSphere Storage DRS tries to keep space utilization near the threshold across
all datastores.

Download:
http://www.vmware.com/files/pdf/vmw-vsphr-5-1-stor-drs-uslet-101-web.pdf

Migrating datastore clusters by changing storage profiles in a vCloud

vCloud director 5.1 supports the use of both storage profiles and Storage DRS. One of the coolest features and unfortunately relatively unknown is the ability to live migrate virtual machines between datastore clusters by changing the storage profile in the vCloud director portal.

In my lab I’ve set up a provider vDC that contains two compute clusters. Each compute cluster connects to two datastore clusters. Datastore Cluster “vCloud-SDC-Gold” is compatible with the VM storage profile “vCloud-Gold-Storage”, while Datastore Cluster “vCloud-SDC-Silver” is compatible with the VM storage profile “vCloud-Silver-Storage”.

01

When creating a vApp the default storage profile of the organization vDC is applied to the vApp and all its virtual machines. In this case, the VM storage profile Gold is applied to all the virtual machines in the vApp.

You can determine which VM Storage Profile is associated with the virtual machine by selecting the properties of the virtual machine in the “My Cloud” tab. Please note that vCloud Director does not show the VM Storage Profile at the vApp level!

02

By selecting the drop-down box, all storage profiles that are associated with the organization vCD are displayed.

03

By selecting the Storage Profile “vCloud-Silver-Storage” vCloud Director determines that the virtual machine is stored on a datastore that is not compatible with the associated storage profile. In other words the current configuration is violating the storage level policy.

To correct this violation, vCloud director instructs vSphere to migrate the virtual machine via Storage vMotion to a datastore that is compatible with the VM storage Profile. In this case the datastore cluster “vCloud-DSC-Silver” is selected as the destination. Storage DRS determines the most suitable datastore by using its initial placement algorithm and selects the datastore that has the most amount of free space and the lowest I/O load.

To demonstrate the feature, I selected the virtual machine “W2K8_R2-SP1”. The VM storage profile “vCloud-Gold-Storage” is applied and Storage DRS determined that the datastore “nfs-f-vcloud03” of the datastore cluster “vCloud-DSC-Gold” was the most suitable location.

04

By changing the Storage Profile to “vCloud-Silver-Storage” vCloud director instructed vSphere to migrate it to the datastore cluster that is compatible with the newly associated VM storage profile.

05

When logging into the vCenter server managing the ESXi hosts the following task is running:

06

After the task is complete, vCenter shows that the virtual machine is now stored on datastore “nfs-f-vcloud06″ in the datastore cluster “vCloud-DSC-Silver”.

07

The power of abstraction
The abstraction layer of vCloud Director makes this possible. When changing the storage profile directly on the vSphere layer, nothing happens. vSphere will not migrate the virtual machine to the appropriate datastore cluster that is compatible with the selected VM storage profile.

Useful for stretched clusters?
The reason why I was looking into this feature in my lab is due to an conversation with my esteemed colleagues Lee Dilworth and Aidan Dalgleish. We were looking to an alternative scenario for a stretched cluster. By leveraging the elastic vDC feature of vCloud director, a seperate DRS cluster is created in each site. Due to the automatic initial placement engine on the compute level, we needed to find a construct that can provide us a more deterministic method of virtual machine placement. We immediately thought of the VM profile storage feature. Create two datastore clusters, one per site and associate a profile storage based on site name to the respective datastore clusters.

08

When creating the vApp, just select the site-related Storage Profile to place the virtual machine in a specific site. Due to the compatibility check, vCloud Director determines that in order to be compliant with the storage profile it places the virtual machine on the compute cluster in the same site. For example, if you want to place a virtual machine in site 1, select the VM storage Profile “site 1”. vCloud director determines that the virtual machine needs to be stored in datastore cluster “DSC-Site-1”. The compute cluster Site-1 is the only compute cluster connected to the datastore cluster, therefor both the compute and storage configuration of the virtual machine is stored in Site 1.

This configuration works perfect if you want to simplify initial placement if you have multiple sites/locations and you always want to keep the virtual machine in the same site. However this solution might not be optimal for a Stretched cluster configuration where failover to another site is necessary.

Connectivity to all datastores necessary
As this feature uses storage vMotion instead of cross-host/datastore vMotion, means that the cluster needs to be connected to both datastore clusters.

09
When selecting the different storage profile, the storage state is migrated to another datastore cluster. However it doesn’t move the compute state of the virtual machine. This means that storage is moved to site B, while the compute state is still in Site A. vCloud director does not provide an option to migrate the virtual machine to a different compute cluster within the provider vDC. You can either solve it by logging into the vCenter server that manages the ESXi hosts and manually vMotion the virtual machines to cluster in Site B, or power-off the virtual machine in vCloud Director, then change the storage profile and power-on the virtual machine. Both “solutions” are not very enterprise-level scenario’s therefor I think this is not yet suitable as a stretched cluster configuration

VCD and initial placement of virtual disks in a Storage DRS datastore cluster

Recently a couple of consultants brought some unexpected behavior of vCloud Director to my attention. If the provider vDC is connected to a datastore cluster and a virtual disk or vApp is placed in the datastore, vCD displays an error when the datastores do not have enough free space available.

Last year I wrote an article (Storage DRS initial placement and datastore cluster defragmentation) describing storage DRS initial placement engine and it’s ability to move virtual machines around the datastore cluster if individual datastores did not have enough free space to store the virtual disk.

I couldn’t figure it out why Storage DRS did not defragment the datastores in order to place the vApp, thus I asked the engineers about this behavior. It turns out that this behavior is by design. When creating vCloud director the engineers optimized the initial placement engine of vCD for speed. When deploying a virtual machine, defragmenting a datastore cluster can take some time. To avoid waiting, vCD reports an error of not enough free space and relies on the vCloud administrator to manage and correct the storage layer. In other words, Storage DRS initial placement datastore cluster defragmentation is disabled in vCloud Director.

I can understand the choice the vCD engineers made, but I also believe in the benefit of having datastore cluster defragmentation. I’m interested in your opinion? Would you trade initial placement speed over reduced storage management?

Storage DRS Initial placement workflow

Last week I received the question how exactly Storage DRS picks a datastore.

On a SDRS the initial placement of a vm is done on the weight calculated based on the storage free and IO. My question is: when I have a similar weight between all the datastore in the cluster, which datastore is choose for the initial placement?

Storage DRS takes the virtual machine configuration into account, the platform & user-defined constraints and the resource utilization of the datastores within the cluster. Let’s take a closer look at the Storage DRS initial placement workflow.

User-defined constraint
When selecting the datastore cluster as a storage destination, the default datastore cluster affinity rule is applied to the virtual machine configuration. The datastore cluster can be configured with a VMDK affinity rule (Keep files together) or a VMDK anti-affinity rule (Keep files separated). Storage DRS obeys the affinity rule and is forced to find a datastore that is big enough to store the entire virtual machine or the individual VMDK files. The affinity rule is considered to be a user-defined constraint.

Platform constraint
The next step in the process is to present a list of valid datastores to the Storage DRS initial placement algorithm. The Storage DRS placement engine checks for platform constraints.

The first platform constraint is the check of the connectivity state of the datastores. Fully connected datastores (datastores connected to all host in the compute cluster) are preferred over partially connected datastores (datastores that are not connected to all host in the cluster) due to the impact of mobility of the virtual machine in the compute cluster.

The second platform constraint is applicable to thin-provisioned LUNs. If the datastore exceeds the thin-provisioning threshold of 75 percent, the VASA provider (if installed) triggers the thin-provisioning alarm. In response to this alarm Storage DRS removes the datastores from the list of valid destination datastores, in order to prevent virtual machine placement on low-capacity datastores.

Resource utilization
After the constraint handling, Storage DRS sorts the valid datastores in order of combined resource utilization rate. The combined resource utilization rate consists of the space utilization and the I/O utilization of a datastore. The best-combined resource utilization rate is a datastore that has a high level of free capacity and Low I/O utilization. Storage DRS selects the datastore that has the best-combined utilization rate and attempts to place the virtual machine. If the virtual machine is configured with a VMDK anti-affinity rule, Storage DRS starts with placing the biggest VMDK first.

Storage DRS Cluster shows error "Index was out of range. Must be non-negative and less than the size of the collection.

Recently I have noticed an increase of tweets mentioning the error “Index was out of range”

Error message:
When trying to edit the settings of a storage DRS cluster, or when clicking on SDRS scheduling settings an error pops up:

“An internal error occurred in the vSphere Client. Details: Index was out of range. Must be non-negative and less than the size of the collection. Parameter name: index”

Impact:
This error does not have any impact on Storage DRS operations and/or Storage DRS functionality and is considered a “cosmetic” issue. See KB article: KB 2009765.

Solution:
VMware vCenter Server 5.0 Update 2 fixes this problem. Apply Update 2 if you experience this error.
As the fix is included in the Update 2 for vCenter 5.0, I expect that vCenter 5.1 Update 1 will include the fix as well, however I cannot confirm any deliverables.