(Alternative) VM swap file locations Q&A
Lately I have received a couple of questions about swap file placement. As I mentioned in the article “Storage DRS and alternative swap file locations”, it is possible to configure the hosts in a DRS cluster to place the virtual machine swap files on an alternative datastore. Here are the questions I received and my answers:
Question 1: Will placing a swap file on a local datastore increase my vMotion time?
Yes. As the destination ESXi host cannot connect to the local datastore of the source host, the swap file has to be placed on a datastore that is available to the destination ESXi host running the incoming VM. Therefore, the destination host creates a new swap file in its own swap file destination. vMotion time will increase because a new file needs to be created on the local datastore of the destination host and swapped-out memory pages potentially need to be copied over.
Question 2: Is the swap file an empty file during creation or is it zeroed out?
When a swap file is created, it is an empty file equal in size to the virtual machine's memory configuration. The file is empty and is not zeroed out.
Please note that if the virtual machine is configured with a memory reservation, the swap file will be an empty file with the size of (virtual machine memory configuration – VM memory reservation). For example, if a 4GB virtual machine is configured with a 1024MB memory reservation, the size of the swap file will be 3072MB.
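As a quick sanity check, the calculation can be expressed in a few lines of Python (a minimal sketch; the function name and units are mine, not an official API):

```python
def swap_file_size_mb(configured_memory_mb, memory_reservation_mb=0):
    """Worst-case swap file size: configured memory minus the memory reservation."""
    return configured_memory_mb - memory_reservation_mb

# The example from the text: a 4GB VM with a 1024MB memory reservation
print(swap_file_size_mb(4096, 1024))  # 3072
```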
Question 3: What happens with the swap file placed on a non-shared datastore during vMotion?
During vMotion, the destination host creates a new swap file in its swap file destination. If the source swap file contains swapped out pages, only those pages are copied over to the destination host.
Question 4: What happens if I have an inconsistent ESXi host configuration of local swap file locations in a DRS cluster?
When selecting the option “Datastore specified by host”, an alternative swap file location has to be configured on each host separately. If one host is not configured with an alternative location, the swap file will be stored in the working directory of the virtual machine. When that virtual machine is moved to another host that is configured with an alternative swap file location, the contents of the swap file are copied over to the specified location, even if the destination host can connect to the swap file in the working directory.
Question 5: What happens if my specified alternative swap file location is full and I want to power-on a virtual machine?
If the alternative datastore does not have enough space, the VMkernel tries to store the VM swap file in the working directory of the virtual machine. You need to ensure that enough free space is available in the working directory, otherwise the VM is not allowed to power on.
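To make the placement order explicit, here is a minimal Python sketch of the decision described above (the datastore objects and field names are hypothetical, purely for illustration):

```python
from dataclasses import dataclass

@dataclass
class Datastore:
    name: str
    free_mb: int

def choose_swap_location(alternative, working_directory, swap_size_mb):
    """Try the alternative swap file location first; fall back to the VM's
    working directory; deny the power-on if neither has enough free space."""
    if alternative is not None and alternative.free_mb >= swap_size_mb:
        return alternative
    if working_directory.free_mb >= swap_size_mb:
        return working_directory
    raise RuntimeError("power-on denied: no location can hold the swap file")

# Example: the alternative datastore is full, so the working directory is used
alt = Datastore("swap-datastore", free_mb=500)
wd = Datastore("vm-working-dir", free_mb=8192)
print(choose_swap_location(alt, wd, 3072).name)  # vm-working-dir
```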
Question 6: Should I place my swap file on a replicated datastore?
It is recommended to place the swap file on a datastore that has replication disabled, as replication of files increases vMotion time. When moving the contents of a swap file to a replicated datastore, the swap file and its contents need to be replicated to the replica datastore as well. If synchronous replication is used, each block/page copied from the source datastore to the destination datastore must wait until the destination datastore receives an acknowledgement from its replication partner (the replica datastore).
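To get a feel for the impact, here is a deliberately simplified, serialized model in Python (all numbers are hypothetical, and a real vMotion pipeline copies pages concurrently):

```python
def swap_copy_seconds(swapped_pages, copy_ms_per_page, replica_ack_ms=0.0):
    """Simplified serialized model: every swapped-out page pays the copy cost
    plus, on a synchronously replicated destination, a wait for the replica's
    acknowledgement."""
    return swapped_pages * (copy_ms_per_page + replica_ack_ms) / 1000.0

pages = 262_144  # 1GB worth of swapped-out 4KB pages
print(swap_copy_seconds(pages, 0.05))       # no replication: ~13 seconds
print(swap_copy_seconds(pages, 0.05, 0.5))  # with a sync replication ack: ~144 seconds
```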
Question 7: Should I place my swap file on a datastore with snapshots enabled?
To save storage space and design for the most efficient use of storage capacity, it is recommended not to place swap files on a datastore with snapshots enabled. The VMkernel places pages in a swap file when there is memory pressure, either because the host is in an overcommitted state or because the virtual machine requires more memory than its configured memory limit. It only retrieves memory from the swap file when it requires that particular page. The VMkernel will not transfer all pages out of the swap file once the memory pressure on the host is resolved; it keeps unused swapped-out pages in the swap file, as transferring unused pages is nothing more than creating system overhead. This means that a swapped-out page could stay in the swap file until the virtual machine is powered off. Snapshotting these idle and unused pages on storage could reduce the pool capacity available for snapshotting useful data.
Question 8: Should I place my swap file on a thin-provisioned datastore (LUN)?
This is a tricky one and it all depends on the maturity of your management processes. As long as the thin-provisioned datastore is adequately monitored for utilization and free space, and controls are in place that ensure sufficient free space is available to cope with bursts of memory use, it could be a viable option.
The reason for the hesitation is the impact a thin-provisioned datastore has on the continuity of the virtual machine.
Placement of swap files by the VMkernel is done at the logical level. The VMkernel determines whether the swap file can be placed on the datastore based on its file size. That means it checks the free space of the datastore as reported by the ESXi host, not by the storage array. However, the datastore could live in a heavily over-provisioned storage pool.
Once the swap file is created, the VMkernel assumes it can store pages in the entire swap file (see Question 2 for the swap file size calculation). As the swap file is just an empty file until the VMkernel places a page in it, the swap file itself takes up little space on the thin-provisioned datastore. This can go on for a long time and nothing will happen. But what if the total reservations consumed, the memory overcommit level, and workload spikes at the ESXi host layer are not correlated with the available space in the thin-provisioned storage pool? Understand how much space the datastore could possibly claim and calculate the maximum configured size of all existing swap files on the datastore to avoid an out-of-space condition.
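One way to perform that check is to sum the worst-case swap file sizes on the datastore and compare the total against the pool's free space. A sketch with made-up inventory data:

```python
def worst_case_swap_mb(vms):
    """Maximum configured size of all swap files: per VM, configured memory
    minus the memory reservation (see Question 2)."""
    return sum(vm["memory_mb"] - vm["reservation_mb"] for vm in vms)

vms_on_datastore = [
    {"name": "vm01", "memory_mb": 4096, "reservation_mb": 1024},
    {"name": "vm02", "memory_mb": 8192, "reservation_mb": 0},
]
pool_free_mb = 10_240  # free space left in the thin-provisioned pool
if worst_case_swap_mb(vms_on_datastore) > pool_free_mb:
    print("Risk: fully used swap files would exceed the pool's free space")
```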
Get notified of these blog postings and more DRS and Storage DRS information by following me on Twitter: @frankdenneman
VMware feature request
During presentations I always stress that you should submit a feature request if you have an idea how to enhance the product or if you feel you are missing a vital product feature. VMware is very interested to hear how the products can be enhanced and improved.
Although it’s always good to talk to your local VMware rep or your favorite VMware blogger, feedback submitted that way might not reach the correct person in time. To have the feedback routed to the correct person via the shortest path available, it is best to submit a feature request via the VMware website.
Unfortunately VMware.com doesn’t have an action button on the front page, therefore I thought it might be a good idea to publish a short article with the link included. If you have any feedback, go to the feature request page and submit your comments.
Thanks!
vMotion bug fixed in vCenter server 5.1.0a
Looking for “Designing your vMotion network”? Please follow the link.
Last week VMware vCenter Server™ 5.1.0a was released, which contains a bug fix for Essentials Plus license customers. A few readers provided me feedback about being unable to initiate the new vMotion that migrates both the host and datastore state of the virtual machine. Thanks to the feedback and the filed SRs, we got to the bottom of the bug pretty quickly and got the fix into this release.
vMotion and Storage vMotion
Unable to access the cross-host Storage vMotion feature from the vSphere Web Client with an Essentials Plus license
If you start the migration wizard for a powered on virtual machine with an Essentials Plus license, the Change both host and datastore option in the migration wizard is disabled, and the following error message is displayed:
Storage vMotion is not licensed on this host.
To perform this migration without a license, power off the virtual machine.
This issue is resolved in this release.
https://www.vmware.com/support/vsphere5/doc/vsphere-vcenter-server-510a-release-notes.html
Thanks for the feedback and, above all, thanks for filing the SRs that provided us with useful data. You can download the update here.
HA admission control is not a capacity management tool.
I receive a lot of questions on why HA doesn’t work as expected when virtual machines are not configured with VM-level reservations. If no VM-level reservations are used, the cluster will indicate a failover capacity of 99%, ignoring the CPU and memory configuration of the virtual machines. Usually my reply is that HA admission control is not a capacity management tool, and I noticed I have been using this statement more and more lately. As explaining it on a per-customer basis doesn’t scale well, it might be a good idea to write a blog article about it.
The basics
Sometimes it’s better to review the basics again and understand where the perception of HA and the actual intended purpose of the product part ways.
Let’s start with what HA admission control is designed for. In the availability guide, the following two statements can be found. Quote 1:
“vCenter Server uses admission control to ensure that sufficient resources are available in a cluster to provide failover protection and to ensure that virtual machine resource reservations are respected.”
Let’s dive into the first quote, and especially the statement “to ensure that sufficient resources are available in a cluster”. This is the key element, and in particular the word sufficient (resources). What sufficient means for customer A does not mean sufficient for customer B. As HA does not have an algorithm that decodes the meaning of the word sufficient for each customer, HA relies on the customer to set vSphere resource allocation settings to indicate how important resource availability is for the virtual machine during resource contention scenarios.
As we are going back to the basics, let’s have a quick look at the resource allocation settings used in this case: reservations and shares. A reservation indicates the minimum level of resources available to the virtual machine at all times. The reservation guarantees (or protects, which might be a better word) the availability of physical resources to the virtual machine regardless of the level of contention. No matter how high the contention in the system, the reservation restricts the VMkernel from reclaiming that particular CPU cycle or memory page.
This means that when a VM with a reservation is powered on, admission control needs to verify that the host can provide these resources at all times. As the VMkernel cannot reclaim those resources, admission control makes sure that when it lets the virtual machine in, it can keep its promise of providing these resources at all times, and it also checks that admitting the VM won’t introduce problems for the VMkernel itself or for other virtual machines with a reservation. This is the reason why I like to call admission control the virtual bouncer.
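That check can be summarized in a single comparison. A minimal Python sketch (the parameter names are illustrative, not an actual VMkernel interface):

```python
def host_admits_vm(host_unreserved_mb, vm_reservation_mb, vm_overhead_mb):
    """The 'virtual bouncer': a host lets a VM in only if it can guarantee the
    VM's reservation plus its memory overhead without breaking the promises
    made to already-admitted VMs."""
    return host_unreserved_mb >= vm_reservation_mb + vm_overhead_mb

print(host_admits_vm(host_unreserved_mb=2048, vm_reservation_mb=1024, vm_overhead_mb=128))  # True
```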
Besides reservations we have shares, and shares indicate the relative priority of resource access during contention. A better term to describe this behavior is “opportunistic access”. As the virtual machine is not configured with a reservation, it allows the VMkernel a more relaxed approach to resource distribution. When resource contention occurs, the VMkernel does not need to provide the configured resources at all times, but can distribute the resources based on activity and on the relative priority indicated by the shares of the virtual machines requesting the resources. Virtual machines configured only with shares will just receive what they can get; there is no restrictive setting for the VMkernel to worry about when running out of resources. Basically, these virtual machines will just get what’s left.
In the case of shares, it’s the VMkernel that decides which VM gets how many resources, in a relaxed and very social way, whereas virtual machines configured with a reservation DEMAND to have their reservations available at all times and do not care about the needs of others.
In other words, the VMkernel MUST provide the resources to the virtual machines with reservations first and then divvy up the rest among the virtual machines that opted for opportunistic distribution (shares).
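A toy model of that distribution order in Python (a sketch, not the actual scheduler; the inventory data is made up): reservations are granted in full first, and the remainder is divided proportionally by shares.

```python
def distribute(capacity_mb, vms):
    """Grant every reservation in full first, then split what is left among
    all VMs in proportion to their shares (opportunistic access)."""
    remaining = capacity_mb - sum(vm["reservation_mb"] for vm in vms)
    total_shares = sum(vm["shares"] for vm in vms)
    return {
        vm["name"]: vm["reservation_mb"] + remaining * vm["shares"] / total_shares
        for vm in vms
    }

vms = [
    {"name": "db01", "reservation_mb": 2048, "shares": 2000},
    {"name": "web01", "reservation_mb": 0, "shares": 1000},
]
print(distribute(4096, vms))  # db01 keeps its 2048 plus two thirds of the rest
```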
How does this tie in with HA admission control?
The second quote gives us this insight:
“vSphere HA: Ensures that sufficient resources in the cluster are reserved for virtual machine recovery in the event of host failure.”
We know that admission control checks whether enough resources are available to satisfy the VM-level reservation without interfering with VMkernel operations or with the VM-level reservations of other virtual machines running on that host. As HA is designed to provide an automated method of host failure recovery, we need to make sure that once a virtual machine is up and running, it can continue to run on another host in the cluster if the current host fails. Therefore, the purpose of HA admission control is to regulate and check whether enough resources are available in the cluster to satisfy the virtual machine level reservations after a host failure occurs.
Depending on the admission control policy, it calculates the capacity required for a failover based on the available resources, while still complying with the VMkernel resource management rules. Therefore it only needs to look at VM-level reservations, as shares follow the opportunistic access method.
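This also explains the 99% failover capacity mentioned in the introduction. A sketch of the “percentage of cluster resources reserved” arithmetic in Python (the per-VM overhead figure is a hypothetical value; only reservations and memory overhead count, not configured memory):

```python
def current_failover_capacity_pct(cluster_mb, reservations_mb, overhead_mb_per_vm=50):
    """Only reservations plus a small per-VM memory overhead count against
    cluster capacity, so a cluster full of unreserved VMs still reports a
    failover capacity close to 100%."""
    used = sum(reservations_mb) + overhead_mb_per_vm * len(reservations_mb)
    return (cluster_mb - used) / cluster_mb * 100

# 100 VMs without reservations on a 512GB cluster
print(round(current_failover_capacity_pct(512 * 1024, [0] * 100)))  # 99
```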
Semantics of sufficient resources in a shares-only design
In essence, if you use shares, HA relies on you to determine whether the virtual machine will receive the resources you think are sufficient. The VMkernel is designed to allow memory overcommitment while providing performance. HA is just the virtual bouncer that counts the number of heads before it lets the virtual machine into “the club”. If you are on the list for a table, it will get you that table; if you don’t have a reservation, HA does not care if you end up sitting at a 4-person table with 10 other people fighting for your drinks and food. HA relies on the waiters (resource management) to get you (enough) food as quickly as possible. If you want good service and some room at your table, it’s up to you to reserve.
Get notified of these blog postings and more DRS and Storage DRS information by following me on Twitter: @frankdenneman