
Partially connected datastore clusters

October 7, 2011 by frankdenneman

The first article in this series about architecture and design decisions focuses on the connectivity of the datastores within the datastore cluster. Connectivity between ESXi hosts and the datastores in the datastore cluster affects the initial placement and load balancing decisions made by DRS and Storage DRS. Although connecting every datastore to all ESXi hosts inside a cluster is common practice, we still come across partially connected datastores in virtual environments.
What is the impact of a partially connected datastore that is a member of a datastore cluster connected to a DRS cluster? What interoperability problems can you expect, and how does this design affect DRS and SDRS load balancing operations?
Let’s start with the basic terminology.

Fully connected datastore clusters
A datastore cluster is fully connected when every datastore is attached to all ESXi hosts in the cluster. This is a recommendation, but it is not enforced.
Partially connected datastore clusters
If a datastore is connected to only a subset of the ESXi hosts inside the DRS cluster, the datastore cluster is treated as a partially connected datastore cluster.

Now what happens if the DRS cluster is connected to partially connected datastores? It is important to understand that the goal of both DRS and SDRS is resource availability, and key to offering resource availability is providing as much mobility as possible. SDRS will not generate any migration recommendation that reduces the compatibility of a virtual machine with regard to datastore connections. Virtual machine to host compatibility is captured in compatibility lists.
Compatibility list
Inside the cluster a VM-host compatibility list is generated for each virtual machine. The compatibility list determines which ESXi hosts in the cluster have the network and storage configurations that allow the virtual machine to successfully come online. Membership of mandatory VM-Host affinity rules is also reflected in the compatibility list. If the network portgroup or datastore is not available on the host, or the host is not listed in the host group of a mandatory affinity rule, the ESXi host is deemed incompatible with that virtual machine.
As mentioned, both DRS and SDRS focus on resource availability and avoiding resource outages; therefore SDRS prefers a datastore that is connected to all hosts over a datastore that is partially connected. Connecting datastores to a subset of hosts shrinks the compatibility list, limiting the mobility of the virtual machine and reducing the efficiency of DRS and SDRS.
Finding a suitable location, or the ability to load balance, becomes more challenging when the cluster and datastore cluster are partially connected. During initial placement the selection of a datastore may limit the mobility of the virtual machine among the hosts, while the selection of a host limits the mobility of the virtual machine among the datastores in the datastore cluster.
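To make the effect on the compatibility list concrete, here is a minimal Python sketch of the interaction, assuming an invented three-host cluster (esx01-esx03) and datastores ds01-ds03. It only models datastore connectivity; the real compatibility list also includes the networking and mandatory affinity rule checks described above.

```python
# Hypothetical connectivity map: which datastores each host can see.
# Host and datastore names are invented for illustration only.
host_datastores = {
    "esx01": {"ds01", "ds02", "ds03"},
    "esx02": {"ds01", "ds02", "ds03"},
    "esx03": {"ds01", "ds02"},          # ds03 is partially connected
}

def compatible_hosts(vm_datastores, host_datastores):
    """Hosts that see every datastore holding the VM's files."""
    return {host for host, visible in host_datastores.items()
            if vm_datastores <= visible}

# A VM stored on the fully connected ds01 can run on any host ...
print(compatible_hosts({"ds01"}, host_datastores))  # all three hosts
# ... while placing it on ds03 shrinks its compatibility list.
print(compatible_hosts({"ds03"}, host_datastores))  # only esx01 and esx02
```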

VM mobility in partially connected datastore clusters

Let's explore this impact a little further. When generating migration recommendations, DRS selects a host that can provide enough resources to satisfy the virtual machine's resource entitlement while lowering the imbalance of the cluster. DRS might come across a lightly utilized host while the other hosts inside the cluster are highly utilized. Unfortunately, if that lightly utilized host is not connected to the datastore containing the virtual machine's files (it might even be lightly utilized because of its limited connectivity), DRS will not consider it due to the incompatibility, even though from a resource load balancing perspective it might be a very attractive option to solve the imbalance. Also keep in mind the impact of this behavior on VM-Host affinity rules: DRS will not migrate the virtual machine to a partially connected host inside the host group.
Something similar happens with SDRS load balancing. Partially connected datastores are not recommended when fully connected datastores are available that do not violate the SDRS space threshold. You might wonder why the space threshold is explicitly mentioned and not the IO load; that is because IO load balancing is disabled when a partially connected datastore is detected in the datastore cluster.
IO load balancing
It is important to understand the impact a single partially connected datastore has on the service level of an entire datastore cluster. When SDRS detects a partially connected datastore, it disables IO load balancing on the entire datastore cluster, not just on that single partially connected datastore, effectively degrading a complete feature set of your virtual infrastructure.
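A small illustrative check along the same lines, again using invented host and datastore names: if any datastore in the datastore cluster is seen by only a subset of the hosts, IO load balancing is treated as off for the whole datastore cluster. This is a sketch of the behavior described above, not SDRS code.

```python
def io_load_balancing_enabled(host_datastores, datastore_cluster):
    """Illustrative rule only: a single partially connected datastore
    disables IO load balancing for the entire datastore cluster."""
    all_hosts = set(host_datastores)
    return all(
        {h for h, visible in host_datastores.items() if ds in visible} == all_hosts
        for ds in datastore_cluster
    )

# Hypothetical three-host cluster; ds03 is not presented to esx03.
connectivity = {
    "esx01": {"ds01", "ds02", "ds03"},
    "esx02": {"ds01", "ds02", "ds03"},
    "esx03": {"ds01", "ds02"},
}
print(io_load_balancing_enabled(connectivity, ["ds01", "ds02"]))          # True
print(io_load_balancing_enabled(connectivity, ["ds01", "ds02", "ds03"]))  # False
```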
Temporary partial connectivity – a real threat?
The connectivity status matters when the SDRS invocation interval expires; during the migration recommendation calculation SDRS checks the connectivity. A temporary all-paths-down condition or a rezoning procedure might not affect SDRS load balancing behavior, but what if good old Murphy decides to pay you a visit during the invocation period? Keep this behavior in mind when scheduling maintenance on the storage platform.
Warning messages
SDRS generates a warning and displays it on the SDRS faults tab in the datastores and datastore clusters view.

Benefits of partially connected datastores
We cannot identify any direct benefit of partially connecting a datastore to a cluster. Partially connected datastores affect initial placement, disable IO load balancing, and impact both DRS load balancing and SDRS space balancing. Therefore a basic design decision would be to connect all datastores to all hosts in the cluster connected to the datastore cluster. If anyone has a good reason for not connecting a datastore to all the hosts, please leave a comment.

Filed Under: Storage DRS Tagged With: Storage DRS, vSphere 5

Mem.MinFreePct sliding scale function

July 26, 2011 by frankdenneman

One of the cool "under the hood" improvements vSphere 5 offers is the sliding scale function of Mem.MinFreePct.
Before diving into the sliding scale function, let's take a look at Mem.MinFreePct itself. MinFreePct determines the amount of memory the VMkernel should keep free. This threshold is subdivided into various memory states, i.e. High, Soft, Hard and Low, and was introduced to prevent performance and correctness issues.
The threshold for the Low state is required for correctness; in other words, it protects the VMkernel layer from PSODs resulting from memory starvation. The Soft and Hard thresholds are about virtual machine performance and memory starvation prevention. The VMkernel triggers more drastic memory reclamation techniques as it approaches the Low state. If the amount of free memory drops just below the MinFreePct threshold, the VMkernel applies ballooning to reclaim memory. Ballooning introduces the least performance impact on the virtual machine because it works together with the guest operating system inside the virtual machine; however, there is some latency involved with ballooning. Memory compression helps to avoid hitting the Low state without impacting virtual machine performance, but if memory demand is higher than the VMkernel's ability to reclaim, a more drastic measure is taken to avoid memory exhaustion: swapping. However, swapping introduces VM performance degradation, and for this reason this reclamation technique is used only when desperate moments require drastic measures. For more information about reclamation techniques I recommend reading the "disable ballooning" article.
vSphere 4.1 allowed the user to change the default MinFreePct value of 6% to a different value and introduced dynamic thresholds for the Soft, Hard and Low states to set appropriate levels and prevent virtual machine performance issues while protecting VMkernel correctness. By default the vSphere 4.1 thresholds were set to the following values:

Free memory state | Threshold          | Reclamation mechanism
High              | 6%                 | None
Soft              | 64% of MinFreePct  | Balloon, compress
Hard              | 32% of MinFreePct  | Balloon, compress, swap
Low               | 16% of MinFreePct  | Swap

Using a default MinFreePct value of 6% can be inefficient now that 256GB or 512GB systems are becoming more and more mainstream. A 6% threshold on a 512GB system results in roughly 30GB sitting idle most of the time. However, not all customers use large systems; some prefer to scale out rather than scale up, and in that scenario a 6% MinFreePct might be suitable. To have the best of both worlds, ESXi 5 uses a sliding scale to determine its MinFreePct threshold.

Free memory state threshold | Range
6%                          | 0-4GB
4%                          | 4-12GB
2%                          | 12-28GB
1%                          | Remaining memory
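
The table translates into simple arithmetic. The sketch below is a back-of-the-envelope Python version of the sliding scale, assuming the brackets listed above; it reproduces the calculation, not the VMkernel implementation.

```python
def min_free_pct_mb(host_memory_gb):
    """Back-of-the-envelope sliding scale: 6% of the first 4GB, 4% of the
    next 8GB, 2% of the next 16GB and 1% of all remaining memory (in MB)."""
    brackets = [(4, 0.06), (8, 0.04), (16, 0.02), (float("inf"), 0.01)]
    remaining = float(host_memory_gb)
    total_mb = 0.0
    for size_gb, pct in brackets:
        portion = min(remaining, size_gb)
        total_mb += portion * 1024 * pct
        remaining -= portion
        if remaining <= 0:
            break
    return round(total_mb, 2)

print(min_free_pct_mb(96))    # 1597.44
print(min_free_pct_mb(512))   # 5857.28, versus 31457.28 for a flat 6%
```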

Let's use an example to explore the savings of the sliding scale technique. On a server configured with 96GB of RAM, the MinFreePct threshold will be set at 1597.44MB, as opposed to the 5898.24MB a flat 6% of the full 96GB would claim.

Free memory state | Threshold | Range             | Result
High              | 6%        | 0-4GB             | 245.76MB
                  | 4%        | 4-12GB            | 327.68MB
                  | 2%        | 12-28GB           | 327.68MB
                  | 1%        | Remaining memory  | 696.32MB
Total High threshold                              | 1597.44MB

Due to the sliding scale, the MinFreePct threshold is set at 1597.44MB, resulting in the following Soft, Hard and Low thresholds:

Free memory state | Threshold          | Reclamation mechanism    | Threshold in MB
Soft              | 64% of MinFreePct  | Balloon                  | 1022.36
Hard              | 32% of MinFreePct  | Balloon, compress        | 511.18
Low               | 16% of MinFreePct  | Balloon, compress, swap  | 255.59
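
Extending the earlier sketch with the percentages from this table, the Soft, Hard and Low values for the 96GB example fall out directly; again this is just an illustration of the arithmetic.

```python
def memory_state_thresholds_mb(min_free_mb):
    """Soft/Hard/Low thresholds as fixed fractions of MinFreePct (in MB)."""
    return {name: round(pct * min_free_mb, 2)
            for name, pct in (("Soft", 0.64), ("Hard", 0.32), ("Low", 0.16))}

print(memory_state_thresholds_mb(1597.44))
# {'Soft': 1022.36, 'Hard': 511.18, 'Low': 255.59}
```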

Although this optimization isn't as sexy as Storage DRS or one of the other new features introduced by vSphere 5, it is a feature that helps you drive your environments to higher consolidation ratios.

Filed Under: Memory Tagged With: vSphere 5

VMware vSphere 5 Clustering Technical Deepdive

July 16, 2011 by frankdenneman

As of today the paperback versions of the VMware vSphere 5 Clustering Technical Deepdive are available at Amazon. We took the feedback into account when creating this book and are offering both a Full Color version and a Black and White edition. Initially we planned to release an ebook and a Full Color version only, but due to the high production cost associated with full color publishing, we decided to add a Black and White edition to the line-up as well.
At this stage we do not have plans to produce any other formats. As this is a self-published release, we developed, edited and created everything from scratch. Writing and publishing a book on new technology has a serious impact on one's life, reducing every social contact, even family life, to a minimum. Because of this, our focus is not on releasing additional formats such as iBooks or Nook at this moment. Maybe at a later stage, but VMworld is already knocking on our door, so little time is left to spend with our families.
When producing the book, the page count rapidly exceeded 400 pages using the 4.1 HA and DRS layout. Many readers told us they loved the compactness of the book, so our goal was to keep the page count increase to a minimum. Adjusting the inner margins of the book was the way to increase the amount of space available for the content. A tip for all who want to start publishing: get accustomed to publisher jargon early in the game; it will save you many failed proof prints! We believe we struck the right balance between white space and content, reducing the number of pages while still offering the best reading experience. Nevertheless, the page count grew from 219 to 348.
While writing the book we received a lot of help, and although Duncan listed all the people in his initial blog post, I want to take a moment to thank them again.
First of all I want to thank my co-author Duncan for his hard work creating content, but also spending countless hours on communication with engineering and management.
Anne Holler – DRS and SDRS engineer – Anne really went out of her way to help us understand the products. I frequently received long and elaborate replies regardless of time and day. Thanks Anne!
Next up is Doug – it's number, Frank, not amount! – Baer. I think most of the time Doug's comments equaled the amount of content inside the documents. Your commitment to improving the book impressed us very much.
Gabriel Tarasuk-Levin for helping me understand the intricacies of vMotion.
A special thanks goes out to our technical reviewers and editors: Keith Farkas and Elisha Ziskind (HA Engineering), Irfan Ahmad and Rajesekar Shanmugam (DRS and SDRS Engineering), Puneet Zaroo (VMkernel scheduling), Ali Mashtizadeh and Doug Fawley and Divya Ranganathan (EVC Engineering). Thanks for keeping us honest and contributing to this book.
I want to thank the VMware management team for supporting us on this project. Doug "VEEAM" Hazelman, thanks for writing the foreword!
Availability
This weekend Amazon made both the Black and White edition and the Full Color edition available. Amazon lists the Black and White edition as VMware vSphere 5 Clustering Technical Deepdive (Volume 2) [Paperback], whereas the Full Color edition is listed with Full Color in its subtitle.
Or select one of the following links to go to the desired product page:
Black and white paperback $29.95
Full Color paperback $49.95
For people interested in the ebook: VMware vSphere 5 Clustering Technical Deepdive (price might vary based on location)
If you prefer a European distributor, ComputerCollectief has both books available:
Black and White edition: http://www.comcol.nl/detail/74615.htm
Full Color edition: http://www.comcol.nl/detail/74616.htm
Pick it up, leave a comment and of course feel free to make those great mugshots again and ping them over via Facebook or our Twitter accounts! For those looking to buy in bulk (> 20) contact clusteringdeepdive@gmail.com.

Filed Under: VMware Tagged With: Clustering Deepdive, Technical deepdive, vSphere 5
