The latest update of VMware Cloud on AWS introduced a new feature called compute policies. In its initial release, compute policies provide the ability to configure affinity rules and mobility control based on declarative policies and vSphere tags.
Management of affinity rules
Historically, affinity rules have been part of the cluster configuration. Within VMware Cloud on AWS, cluster configuration is controlled by VMware, so customers cannot set affinity rules for virtual machines running within the SDDC. Instead of merely pulling the affinity rules out of the cluster configuration, we decided to improve the affinity functionality and work towards a more uniform and consistent experience across multiple clouds.
The road to declarative policies
Within a declarative system, you describe what you want to happen. This is the opposite of imperative operations where you specify actions. Declarative commands define state and to some extent affinity rules are declarative statements. Let’s take VM anti-affinity rules as an example. You want to keep VM1 and VM2 separated and keep them in different fault domains. Instead of providing imperative actions of pinning VM1 to host A and pinning VM2 to host B, you create an anti-affinity rule with VM1 and VM2 as members. You state that these two VMs should not run on the same ESXi host. vCenter (DRS) controls placement and takes the necessary actions to solve any violations of this intent. We want to apply this model to other features.
Instead of logging into vCenter to deal with configuration issues and manually correcting the situation, we want vCenter to manage these functions on your behalf. The way you interact with vCenter in this more declarative way is with policies. Instead of specifying detailed imperative actions, you declare your intent, and the only thing you need to monitor after that is whether the policy is compliant or not.
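To make the difference concrete, here is a minimal Python sketch of the VM1/VM2 example above. It is not the vCenter API: `AntiAffinityPolicy`, `violations`, and `is_compliant` are hypothetical names invented for illustration. The policy only states which VMs must stay apart, and a compliance check compares that intent with the current placement. No VM is pinned to a specific host.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AntiAffinityPolicy:
    """Declared intent (hypothetical model): member VMs must not share a host."""
    name: str
    members: frozenset


def violations(policy, placement):
    """Return {host: [vms]} for every host running more than one member VM."""
    by_host = {}
    for vm in policy.members:
        host = placement.get(vm)
        if host is not None:
            by_host.setdefault(host, []).append(vm)
    return {host: sorted(vms) for host, vms in by_host.items() if len(vms) > 1}


def is_compliant(policy, placement):
    """The only thing the operator monitors: is the intent currently met?"""
    return not violations(policy, placement)


policy = AntiAffinityPolicy("separate-vm1-vm2", frozenset({"VM1", "VM2"}))

# Both VMs happen to sit on the same host: the intent is violated.
placement = {"VM1": "host-a", "VM2": "host-a", "VM3": "host-b"}
print(is_compliant(policy, placement))  # False

# After the placement engine moves VM2, the same policy is compliant again.
placement["VM2"] = "host-b"
print(is_compliant(policy, placement))  # True
```

The policy never names a host. Which host each VM ends up on is left to the placement engine, which in the real product is DRS.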
We had to start somewhere, so we concentrated on affinity rules (VM-VM and VM-host) and anti-mobility (vMotion disabled) policies. Once this more abstract way of interacting with vCenter Server is in place, it brings further advantages. One of them is an additional level of abstraction, and that abstraction allows for a more uniform and consistent experience across multiple clouds.
With today’s on-prem setup, you configure your cluster for a particular workload, and this can inhibit the ability to move that workload to another cluster, on-prem or in the cloud. To make sure you can easily burst out to VMware Cloud environments, you want this to be seamless. The direction we are heading in is that you do not need configurations that are specific to on-prem, in-cloud, or at-edge clusters. Ideally, you express what you want, and it is the job of the cloud control plane, such as vCenter, to push this configuration to the environment the workload is presently in. That could be an on-prem cluster or an in-cloud cluster.
Compute policies are active at vCenter level
Due to this model, the rules are decoupled from the cluster level and are now managed at the vCenter level. If you configure a VM-VM anti-affinity rule and move the VMs to another cluster, the policy remains active.
At the time of writing, VMware Cloud on AWS allows the customer to create 10 clusters per SDDC. Clusters can span multiple AWS availability zones (AZs). The VM-Host affinity ruleset allows customers to tag the hosts per AZ and tag the VMs that need to remain in that availability zone. You can move the VMs between hosts in different clusters within the same AZ. The compute policy remains active, and vCenter ensures compliance with the rule.
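The tag-based matching described above can be sketched as follows. This is an illustrative model only. The function `affinity_violations`, the tag names, and the host names are assumptions and are not taken from the product. The check looks only at VM tags and host tags and never consults cluster membership, which is why a move to another cluster in the same AZ stays compliant.

```python
def affinity_violations(vm_tag, host_tag, vm_tags, host_tags, placement):
    """Return VMs carrying vm_tag that run on a host lacking host_tag.

    Hypothetical model: cluster membership is deliberately not an input.
    """
    return sorted(
        vm
        for vm, host in placement.items()
        if vm_tag in vm_tags.get(vm, set())
        and host_tag not in host_tags.get(host, set())
    )


# Two clusters (c1, c2), each with one host per availability zone.
host_tags = {
    "c1-host1": {"az-a"}, "c1-host2": {"az-b"},
    "c2-host1": {"az-a"}, "c2-host2": {"az-b"},
}
vm_tags = {"db01": {"stay-in-az-a"}, "web01": set()}

placement = {"db01": "c1-host1", "web01": "c1-host2"}
print(affinity_violations("stay-in-az-a", "az-a", vm_tags, host_tags, placement))  # []

# Move db01 to a different cluster in the same AZ: still compliant.
placement["db01"] = "c2-host1"
print(affinity_violations("stay-in-az-a", "az-a", vm_tags, host_tags, placement))  # []

# Move db01 to a host in the other AZ: the policy flags it.
placement["db01"] = "c2-host2"
print(affinity_violations("stay-in-az-a", "az-a", vm_tags, host_tags, placement))  # ['db01']
```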
Introduction of firm rules
An interesting fact is that the VM-Host rules are firm rules. Firm rules differ from the traditional soft (should run on) and hard (must run on) rules and sit in between the two. DRS cannot violate a firm rule unless the host is placed in maintenance mode. This ensures that the rules are never broken during normal operations, while still giving VMware the ability to service the SDDC. The only time a host is placed into maintenance mode in VMware Cloud on AWS is during upgrades, which are handled by VMware and communicated well before the service window. This allows the customer to plan a strategy for these virtual machines well ahead of the service window.
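One way to picture where firm rules sit between soft and hard rules is the simplified model below. It is a sketch of the behavior described in this article and not the DRS algorithm: `Strength`, `may_violate`, and `allowed_targets` are hypothetical names, and the treatment of soft rules as "may be violated whenever no compliant host is available" is an assumption made for the sake of the comparison.

```python
from enum import Enum


class Strength(Enum):
    SOFT = "should run on"
    FIRM = "firm"
    HARD = "must run on"


def may_violate(strength, host_entering_maintenance):
    """Can the placement engine break the rule in this situation? (simplified)"""
    if strength is Strength.HARD:
        return False                      # never violated
    if strength is Strength.FIRM:
        return host_entering_maintenance  # only to evacuate a host
    return True                           # SOFT: assumed violable when needed


def allowed_targets(strength, compliant_hosts, other_hosts, host_entering_maintenance):
    """Hosts a VM may be moved to, preferring rule-compliant hosts."""
    if compliant_hosts:
        return list(compliant_hosts)
    if may_violate(strength, host_entering_maintenance):
        return list(other_hosts)
    return []


# Normal operations, no compliant host free: a firm rule behaves like a hard rule.
print(allowed_targets(Strength.FIRM, [], ["host-c"], host_entering_maintenance=False))  # []

# Host enters maintenance mode: a firm rule lets the VM be evacuated.
print(allowed_targets(Strength.FIRM, [], ["host-c"], host_entering_maintenance=True))   # ['host-c']

# A hard rule would offer no target even then.
print(allowed_targets(Strength.HARD, [], ["host-c"], host_entering_maintenance=True))   # []
```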
In the next article, I will go through the steps on how to create a compute policy.