Recently I received the question if you can connect multiple compute (HA and DRS) clusters to a single Storage DRS datastore cluster and specifically how this setup might impact Storage IO Control functionality. Let’s cover sharing a datastore cluster by multiple compute clusters first before diving into details of the SIOC mechanism.
Sharing datastore clusters
Sharing datastore clusters across multiple compute clusters is a supported configuration. During virtual machine placement the administrator selects which compute cluster the virtual machine will run in, Storage DRS selects the host that can provide the most resources to that virtual machine. A migration recommendation generated by Storage DRS does not move the virtual machine at host level, consequently a virtual machine cannot move from one compute cluster to another compute cluster by any operation initiated by Storage DRS.
Please remember that the maximum supported number of hosts connected to a datastore is 64. Keep this in mind when sizing the compute cluster or connecting multiple compute clusters to the datastore cluster. As the maximum number of datastores inside a datastore cluster is 32 I think that the number of host connected is the first limit you hit in such a design as the total supported number of paths is 1024 and a host can connect up to 255 LUNs.
If the datastores are formatted with the VMFS, it’s recommended to enable VAAI on the storage Array if supported. One of the important VAAI primitive is the Hardware assisted locking, also called Atomic Test and Set (ATS).
ATS replaces the need for hosts to place a SCSI-2 disk lock on the LUN while updating the metadata or growing a file. A SCSI-2 disk lock command locks out other host from doing I/O to the entire LUN, while ATS modifies the metadata or any other sector on the disk without the use of a SCSI-2 disk lock. This locking was the focus of many best practices around the connectivity of datastores. To reduce the amount of locking, the best practice was to reduce the number of host attached. By using newly formatted VMFS5 volumes in combination with a VAAI-enabled storage array, scsi-2 disk lock commands are a thing of the past. Upgraded VMFS5 volumes or VMFS3 volumes will fall back to using SCSI-2 disk locks if the ATS command fails. For more information about VAAI and ATS please read the KB article 1021976.
Note: If your array doesn’t support VAAI, be aware that SCSI-2 disk lock commands can impact scaling of the architecture.
Storage DRS IO Load balancing and Storage IO Control
When enabling the IO Metric on the datastore cluster, Storage DRS automatically enables Storage IO Control (SIOC) on all datastores in the cluster. Storage DRS uses the IO injector from SIOC to determine the capabilities of a datastore, however by enabling SIOC it also provides a method to fairly distribute I/O resources during times of contention.
SIOC uses virtual disk shares in order to distribute storage resources fairly and are applied on a datastore wide level. The virtual disk shares of the virtual machine running on that datastore are relative to the virtual disk shares of other virtual machines using that same datastore. To be more specific, SIOC is a host-level module and aggregates the per-host views into a single datastore view in terms of observed latency.
If the observed latency exceeds the SIOC level latency threshold, each host sets its own IO queue length based on the total virtual disks shares of the virtual machines in that host using the datastore. As SIOC and its shares are datastore focused cluster membership of the host has no impact on detecting the latency threshold violation and managing the I/O stream to the datastore.
Previous articles in the SDRS short series Architecture and design of Datastore clusters:
Part1: Architecture and design of datastore clusters.
Part2: Partially connected datastore clusters.
Part3: Impact of load balancing on datastore cluster configuration.
Part4: Storage DRS and Multi-extents datastores.
Connecting multiple DRS clusters to a single Storage DRS datastore cluster.
2 min read