Duncan and I released the vSphere 5.1 Clustering Deepdive book this week. The book covers the new features of the vSphere 5.1 suite. We rewrote the Storage DRS chapter and added a completely new chapter focusing on Stretched Clusters.
Font changes
The challenge for us was to include all the new content without letting the book grow beyond its trademark dimensions. To achieve this we used a different font and decreased the font size. The book still grew by 80 pages, but it ended up at 415 pages instead of the 505 pages it would have been had we kept the previous font. Please note that although we decreased the font size, this did not reduce the legibility of the book.
Special cover
The cover is designed in such a way that you can actually have multiple copies with all different shades of orange; dare I say, 50 shades of orange.
We hope you enjoy the new version of the vSphere Clustering Deepdive series. It's available in Paperback and Kindle format.
Paper copy – $24.95
Kindle version – $7.49
CloudPhysics in a nutshell
Disclaimer: I'm a technical advisor for CloudPhysics.
I'm very happy to see CloudPhysics coming out of stealth mode this week and making their beta product available to the public. In a nutshell, CloudPhysics brings Big Data analytics to the IT environment and provides you with tools to analyze your datacenter. How does it acquire this dataset and what benefit do you get from it?
The Observer Appliance
To gather all that data, an Observer Appliance needs to run in the virtual infrastructure. And to build a dataset valuable for analytics and simulations, the Observer needs to be active in as many virtual infrastructures as possible. Running an appliance that sends operational data to a third party like CloudPhysics can be a security concern. Going into detail about how CloudPhysics designed the system to handle privacy, security and data-sharing issues is outside the scope of this article. In short, the data extracted from the virtual infrastructure consists of performance statistics plus inventory and configuration settings. All environmental details are scrubbed, and no log files or contents of disk and memory are gathered.
The User Interface
The data acquired by the Observer Appliance is accessible at https://app.cloudphysics.com. Logging in gives you access to your own data. The beta product provides a user interface that allows you to dive into specific focus areas. The UI provides so-called cards that display key data points and serve as a launch point to a more detailed view. This view can contain information about the relationship with other features of the vSphere stack. An example of such a card is virtual machine-level reservations. Not only does this card present the virtual machine-level reservations in your environment in a clear and concise manner, it also displays the impact each reservation has on the High Availability slot size and therefore on the consolidation ratio of your cluster. All this information is combined in a single screen; there is no need to navigate through multiple screens and correlate particular metrics yourself.
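To illustrate the relationship this card surfaces: with vSphere 5.x HA admission control, the memory slot size is driven by the largest memory reservation plus overhead, so a single large reservation can collapse the number of slots in a cluster. Here is a minimal sketch of that effect, with hypothetical cluster numbers and the slot logic simplified to the memory component only (this is my own illustration, not CloudPhysics code):

```python
# Simplified vSphere 5.x HA memory slot-size illustration.
# Hypothetical numbers: 4 hosts with 64 GB each, ~90 MB per-VM overhead.

def slots_per_host(host_mem_mb: int, slot_mb: int) -> int:
    """Number of memory slots a host can back."""
    return host_mem_mb // slot_mb

hosts_mem_mb = [64 * 1024] * 4   # assumed 4-host cluster
overhead_mb = 90                 # assumed per-VM memory overhead

for largest_reservation_mb in (0, 8192):  # no reservation vs. one 8 GB reservation
    slot_mb = largest_reservation_mb + overhead_mb
    total_slots = sum(slots_per_host(m, slot_mb) for m in hosts_mem_mb)
    print(f"largest reservation {largest_reservation_mb:>4} MB -> "
          f"slot {slot_mb} MB, {total_slots} slots in the cluster")
```

With no reservations the cluster offers thousands of slots; a single 8 GB reservation shrinks it to a few dozen, which is exactly the kind of correlation the card makes visible at a glance.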
Correlation of metrics
Correlating particular settings and understanding the impact each setting has on a complex environment, such as a virtual infrastructure, is time consuming and above all very difficult. This correlation of metrics saves you time, but it also helps you understand the behavior of your environment. Now you might ask how you can trust that these correlations are correct, and this is one of the most interesting things about this product. It's a combination of product expertise and community-driven input.
The two pillars of knowledge
The CloudPhysics team comprises industry heavy hitters. Some of them invented core features of the vSphere stack while working for VMware, while others made their mark at other industry-leading companies. The second pillar is community involvement. In this beta program, registered users can suggest ideas for utility cards. Domain experts verify the community-provided cards for technical accuracy.
Near-future developments
One thing I'm very excited about is the upcoming High Availability and DRS simulation tools. Both HA and DRS can be a challenge to configure, as some settings impact the virtual infrastructure on multiple levels. The HA and DRS simulation analyzes the current settings and provides a platform where you can predict the effects of a change on your environment.
VMworld Challenge 2012
Now back to the current status. CloudPhysics is running a VMworld Challenge 2012. The contest lets you describe the problems you are facing, such as "I'm applying different disk shares in my environment but I cannot see the worst-case scenario allocation". The more cards you produce, the more points you score. To increase your score, download the Observer Appliance and take your environment for a test drive; the more activity you generate, the more points you accumulate. How do you benefit from this contest? First of all, if you are located in the U.S. you can win some great prizes (due to U.S. legislation, non-U.S. residents are excluded from winning prizes). But by submitting cards you also improve the system and the quality of the reporting and simulation tools.
Resource Management as a Service
I started off with a disclaimer: I am a technical advisor to CloudPhysics, and you can expect to see more articles about the development of CloudPhysics. As I'm able to work with the inventors of DRS and Storage DRS, a lot of my focus is on resource management. Together with the input of the community and the continuous analysis by domain experts, this might well turn out to be Resource Management as a Service.
VM templates and Storage DRS
Please note that Storage DRS cannot move VM templates via Storage vMotion. This can impact load-balancing operations or datastore maintenance mode operations. When initiating datastore maintenance mode, an error message is displayed indicating that the templates on the datastore cannot be migrated.
As maintenance mode is commonly used for array migrations or datastore upgrade operations (VMFS-3 to VMFS-5), remember to convert VM templates to virtual machines before initiating maintenance mode.
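For those who want to automate this, the vSphere API exposes MarkAsVirtualMachine and MarkAsTemplate to toggle a template back and forth. A hedged sketch using the pyVmomi Python bindings could look like the following; the vCenter hostname and credentials are placeholders, and error handling is omitted for brevity:

```python
# Sketch: convert all templates back to virtual machines so Storage DRS
# can migrate them, before entering datastore maintenance mode.
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com",   # placeholder
                  user="administrator", pwd="secret")
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], recursive=True)
    for vm in view.view:
        if vm.config is not None and vm.config.template:
            # Converting a template requires a resource pool; use the pool
            # of the compute resource the template is registered on.
            pool = vm.runtime.host.parent.resourcePool
            vm.MarkAsVirtualMachine(pool=pool)
            # After maintenance mode completes: vm.MarkAsTemplate()
    view.Destroy()
finally:
    Disconnect(si)
```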
My public VMworld schedule
This year will be an action-packed VMworld for me: presenting sessions, participating in two panel sessions, hosting a group discussion and being available in two "Meet the Expert" sessions.
Presenting the following sessions:
INF-STO1545 – Architecting Storage DRS Datastore Clusters
INF-VSP1683 – VMware vSphere Cluster Resource Pools Best Practices
Panel sessions:
(TAM Day) – ASK THE EXPERTS
INF-VSP1504 – Ask the Expert vBloggers
Hosting the GD22 – Resource Management (DRS/SDRS) group discussion. I invited Anne Holler (DRS lead engineer) to host this session together with me.
During Meet the Experts sessions 13 and 17 I'm available for short meetings to answer your resource management (DRS/SDRS) questions.
Here is the week schedule of the sessions/events/activities that I will be taking part in; be sure to sign up if you have not already:
Sunday (TAM Day):
14:35 – 15:35 : ASK THE EXPERTS
Monday:
14:30 – 15:30 : INF-VSP1504 – Ask the Expert vBloggers
16:00 – 17:00 : GD22 – Resource Management
Tuesday:
12:30 – 13:30 : INF-STO1545 – Architecting Storage DRS Datastore Clusters (Repeat session)
15:00 – 16:00 : INF-VSP1683 – vSphere Cluster Resource Pools Best Practices
Wednesday:
08:30 – 09:30 : INF-STO1545 – Architecting Storage DRS Datastore Clusters
12:30 – 13:30 : Expert 13
Thursday:
12:00 – 13:00 : Expert 17
DRS and memory balancing in non-overcommitted clusters
First things first: I normally do not recommend changing advanced settings. Always try to tune system behavior with the settings provided by the user interface, or try to understand system behavior and how it aligns with your design.
The "problem"
DRS load-balancing recommendations can be sub-optimal when memory overcommitment is not desired.
Some customers prefer not to use memory overcommitment. Their clusters contain (just) enough memory capacity to ensure all running virtual machines have their memory backed by physical memory. Nowadays it is not uncommon to see virtual machines with fairly high allocated (consumed) memory, and due to the use of large pages on hosts with recent CPU architectures, little to no memory is shared. A common scenario with this design is a typical host memory load of 80-85% consumed. In this situation, DRS recommendations may have a detrimental effect on performance because DRS considers active memory, not consumed memory.
DRS behavior
When analyzing the requirements of a virtual machine during load balancing operations, DRS calculates the memory demand of the virtual machine.
The main memory metric DRS uses to determine memory demand is active memory. Active memory represents the working set of the virtual machine, i.e. the number of actively used pages in RAM. By using the working-set estimation, the memory scheduler determines which of the allocated memory pages are actively used by the virtual machine and which allocated pages are idle. To accommodate a sudden, rapid increase of the working set, 25% of the idle consumed memory is included. Memory demand also includes the virtual machine's memory overhead.
Let's use an 8 GB virtual machine as an example of how DRS calculates memory demand. The guest OS running in this virtual machine has touched 50% of its memory size since it was booted, but only 20% of its memory size is active. This means that the virtual machine has consumed 4096 MB and that 1638.4 MB is active.
As mentioned, DRS accommodates a sudden increase in memory use by including a percentage of the idle consumed memory. To calculate the idle consumed memory, the active memory (1638.4 MB) is subtracted from the consumed memory (4096 MB), resulting in 2457.6 MB. By default DRS includes 25% of the idle consumed memory, i.e. 614.4 MB.
The virtual machine has a memory overhead of 90 MB. The memory demand DRS uses in its load-balancing calculation is as follows: 1638.4 MB + 614.4 MB + 90 MB = 2342.8 MB. This means that DRS will select a host that has 2342.8 MB available for this machine, provided the move to that host improves the load balance of the cluster.
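Put as a small Python sketch that reproduces the numbers above (my own illustration of the described formula, not VMware code):

```python
# Illustrative DRS memory demand estimate as described above:
# demand = active + 25% of idle consumed memory + overhead.

def drs_memory_demand_mb(active_mb, consumed_mb, overhead_mb, idle_pct=0.25):
    idle_consumed_mb = consumed_mb - active_mb
    return active_mb + idle_pct * idle_consumed_mb + overhead_mb

vm_size_mb = 8 * 1024            # 8 GB virtual machine
consumed_mb = 0.50 * vm_size_mb  # 50% touched -> 4096 MB
active_mb = 0.20 * vm_size_mb    # 20% active  -> 1638.4 MB

print(drs_memory_demand_mb(active_mb, consumed_mb, 90))  # ~2342.8 MB
```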
DRS and the cornerstone of virtualization: resource overcommitment
Resource sharing and overcommitment of resources are primary elements of virtualization. When designing a virtual infrastructure, it is a challenge to build the environment in such a way that it can handle the virtual machine workloads while improving server utilization. Because not every workload is equal, applying resource allocation settings such as shares, reservations and limits allows you to distinguish priorities.
DRS is designed with this cornerstone in mind, and that sometimes makes DRS a hard act to follow. DRS is all about solving imbalance and providing enough resources to the virtual machines, aligned with their demand. This means that DRS balances workloads based on demand and trusts in its core value that overcommitment is allowed. It then relies on the host-local scheduler to figure out the priority of the virtual machines. And this behavior is sometimes not in line with the perception of DRS.
A common perception is that DRS is about optimizing performance. This is only partially true. As mentioned before, DRS looks at the demand of the VM and tries to match the activity of the virtual machines with the available resources in the cluster. As it relies on resource allocation settings, it assumes that a priority is defined for each virtual machine and that the host-local schedulers can reclaim memory safely. For this reason the DRS memory imbalance metric is tuned to focus on VM active memory, which allows efficient sharing of host memory resources: the cluster can run with less memory than the sum of all running virtual machine memory sizes, and idle consumed memory can be reclaimed from lower-priority virtual machines to back the active workloads of others.
Unfortunately, DRS does not know when the environment is designed to avoid overcommitment. Based on its input, it can place a virtual machine on a host whose virtual machines have lots of idle consumed memory lying around, instigating memory reclamation. In most cases this reclamation is hardly noticeable due to the use of the balloon driver. However, when all hosts are highly utilized, ballooning might not be as responsive as required, forcing the kernel to compress memory and swap. This means that migrations for the sole purpose of balancing active memory are not useful in environments like these and, if the target host memory is highly consumed, can cause a performance impact both on the migrating virtual machine, as it waits to obtain memory, and on the other virtual machines on the target host, as they do the processing required to reclaim their idle memory.
The solution? You might want to change the 25% idle consumed memory setting
The solution I recommend starting with is to lower the migration threshold by moving the slider to the left. This allows the DRS cluster to tolerate a higher imbalance and makes DRS more conservative when recommending migrations.
If this is not satisfactory, then I suggest changing the DRS advanced option called IdleTax. Please note that this DRS advanced option is not the same setting as the VMkernel memory setting Mem.IdleTax.
The DRS IdleTax advanced option (default 75) controls how much consumed idle memory should be added to active memory when estimating memory demand. The percentage included is calculated as 100 - IdleTax; with the default this is 100 - 75 = 25%.
This means that the smaller the value of IdleTax, the more consumed idle memory DRS adds to the active memory for load balancing.
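Continuing the 8 GB example from earlier, a quick sketch of how different IdleTax values change the demand estimate (again my own illustration of the stated formula):

```python
# Effect of the DRS IdleTax option on the demand estimate for the 8 GB
# example VM (active 1638.4 MB, idle consumed 2457.6 MB, overhead 90 MB).
active_mb, idle_mb, overhead_mb = 1638.4, 2457.6, 90.0

for idle_tax in (75, 50, 0):
    included = (100 - idle_tax) / 100.0
    demand = active_mb + included * idle_mb + overhead_mb
    print(f"IdleTax={idle_tax:>2} -> {included:.0%} of idle counted, "
          f"demand ~ {demand:.1f} MB")

# IdleTax=75 -> 25% of idle counted,  demand ~ 2342.8 MB (default)
# IdleTax=50 -> 50% of idle counted,  demand ~ 2957.2 MB
# IdleTax= 0 -> 100% of idle counted, demand ~ 4186.0 MB (consumed + overhead)
```

Lowering IdleTax pushes the demand estimate toward the consumed footprint, which is exactly what you want when the design goal is no overcommitment.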
Be aware that the value of IdleTax is a heuristic, tuned to facilitate memory overcommitment; tuning it to a lower value is appropriate for environments not using overcommitment. Note that the option is set per cluster, and would need to be changed for all DRS clusters as appropriate.
Again, try a lower migration threshold setting first and monitor whether it provides satisfying results before setting this advanced option.