Stretched Clusters on VMware Cloud on AWS, a Really Big Thing

This week Emad published an excellent article about the stretched cluster functionality of VMware Cloud on AWS. To sum up, you can now deploy a single vSphere cluster across two AWS availability zones.

A Trip Down Memory Lane
I think the ability to stretch a vSphere cluster across two availability zones is a really big thing. Think back to the days when we had to refactor an application to make it highly available. To reduce application downtime, you typically used clustering software such as Microsoft clustering or Veritas clustering services. But not all applications were a fit for this solution.

When we introduced VMware High Availability back in 2006, we brought a big change to the industry. From that point on you could provide crash-consistent failover ability to all your workloads. No need to refactor any application, no need to build outlandish hardware solutions. Just enable a few tickboxes at the infrastructure layer, and every workload running inside a VM is protected. And to this day, HA remains the most popular functionality of vSphere.

Amazon Web Services Resiliency Strategy
Amazon urges you to design your application to be resilient to infrastructure outages. AWS is hosted in multiple locations worldwide. These locations are composed of regions and Availability Zones. Each region is a separate geographic area that has multiple, isolated locations known as Availability Zones. AWS provides the ability to place instances and data in multiple locations.

And you can take advantage of the safety and reliability of geographic redundancy by spanning your Auto Scaling group across multiple Availability Zones within a region and then attach a load balancer to distribute incoming traffic across those Availability Zones. Incoming traffic is distributed equally across all Availability Zones enabled for your load balancer.

And this works very well if you are refactoring your application or if you are building a completely new cloud-native stack. The challenge we face today is that not all applications lend themselves to being refactored, and some applications do not require the journey from monolith to full FaaS.

Hybrid-Cloud Experience
With stretched clusters in VMware Cloud on AWS, we introduce the same ease of infrastructure resiliency to workloads that run on AWS infrastructure. Merely expand your vSphere cluster to 6 hosts and select multi-AZ deployment.

After that, the workload in the Cloud SDDC is protected against AZ outages. If an AZ fails, HA detects the failed VMs and restarts them on different physical servers in the remaining AZ without any manual intervention.

The ability to stretch your vSphere cluster across AZs allows you to easily provide resiliency to your workload within the AWS infrastructure without the Herculean effort of refactoring all your applications.
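
To make the failover behavior concrete, here is a minimal toy sketch in Python. It is not the vSphere HA algorithm; it only illustrates the idea that VMs from a failed AZ get restarted on the surviving hosts. The host names, AZ labels, VM names, and the "least loaded host first" placement rule are all assumptions made for illustration.

```python
def restart_after_az_failure(placement, host_az, failed_az):
    """Return a new VM -> host placement after one AZ fails.

    placement: dict mapping VM name -> host name (hypothetical names)
    host_az:   dict mapping host name -> AZ label (hypothetical labels)
    Toy rule (an assumption, not real HA logic): restart each failed VM
    on the surviving host that currently runs the fewest VMs.
    """
    survivors = sorted(h for h, az in host_az.items() if az != failed_az)
    if not survivors:
        raise RuntimeError("no surviving hosts to restart VMs on")

    # Count the VMs already running on each surviving host.
    load = {h: 0 for h in survivors}
    for host in placement.values():
        if host in load:
            load[host] += 1

    new_placement = {}
    for vm, host in sorted(placement.items()):
        if host_az[host] != failed_az:
            new_placement[vm] = host  # VM was not affected by the outage
        else:
            target = min(survivors, key=lambda h: (load[h], h))
            load[target] += 1
            new_placement[vm] = target  # VM is restarted in the other AZ
    return new_placement


# Six hosts, three per AZ, as in a minimal stretched cluster.
host_az = {f"esx-{i}": ("az-a" if i <= 3 else "az-b") for i in range(1, 7)}
placement = {f"vm-{i}": f"esx-{(i % 6) + 1}" for i in range(12)}

after = restart_after_az_failure(placement, host_az, failed_az="az-a")
for vm, host in sorted(after.items()):
    print(vm, "->", host, f"({host_az[host]})")
```

The point of the sketch is that nothing inside the VMs changes: the restart decision is made entirely at the infrastructure layer.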

Dying Home Lab – Feedback Welcome

The servers in my home lab are dying on a daily basis. After four years of active duty, I think they have the right to retire. So I need something else. But what? I can’t rent lab space as I work with unreleased ESXi code. I’ve been waiting for the Intel Xeon D 21xx Supermicro systems, but I have the feeling that Elon will reach Mars before we see these systems widely available. The system that I have in mind is the following:

  • Intel Xeon Silver 4108 – 8 Core at 1.8 GHz (85 W TDP)
  • Supermicro X11SPM-TF (6 DIMMs, 2 x 10 GbE)
  • 4 x Kingston Premier 16GB 2133
  • Intel Optane M.2 2280 32 GB

Intel Xeon Silver 4108 8 Core. I need to have a healthy number of cores in my system to run some test workload. Primarily to understand host and cluster scheduling. I do not need to run performance tests, thus no need for screaming fast CPU cores. TDP value of 85W. I know there is a 4109T with a TDP value of 70W, but they are very hard to get in the Netherlands.

Supermicro X11SPM-TF. Rock-solid Supermicro, 2 x Intel X722 10GbE NICs onboard and IPMI.

Kingston Premier 4 x 16 GB 2133 MHz. DDR4 prices are nearing HP printer ink prices, 2133 MHz is fast enough for my testing, and I don’t need to test 6 channels of RAM at the moment. The motherboard is equipped with 6 DIMM slots, so if memory prices drop, I can expand my system.

Boot Device
Intel Optane M.2 32 GB. ESXi still needs a boot device, but there is no need to put in a 256 GB SSD for that.

This is the config I’m considering. What do you think? Any recommendations or alternate views?

Dedicated Hardware in a Public Cloud World

One of the more persistent misconceptions is that the components of VMware’s Software Defined Data Center (SDDC) on VMware Cloud on AWS are virtualized or that the deployed VMs run natively on Amazon. And to be honest, it’s not even weird that most people think this way. After all, Amazon Web Services launched in March 2006, 12 years ago. AWS, Elastic Compute Cloud (EC2), and Amazon Simple Storage Service (S3) have become synonymous with each other. And all of a sudden, you can now “run vSphere on AWS”.

To be short and sweet: VMware Cloud on AWS runs on physical hardware. It is not virtualized and it does not run inside EC2 instances!

VMware Cloud consumes the AWS infrastructure and uses a bare-metal service offered by AWS. Of course, it is not as simple as installing vSphere on a bare-metal server and calling it a fully elastic cloud service. More than that needs to happen. VMware Cloud on AWS is a partnership between the two companies, and both have done extensive R&D work to make this happen. If you want to know more, Chris Wagner, Principal Architect of the service, presented an excellent session (LHC3174BU) at VMworld on how we built it.

Back to the service offering: when you deploy an SDDC, a four-node cluster is created by default. Four physical hosts are assigned to a single customer account, and the service installs, patches and rolls out the full SDDC stack of vSphere, vSAN, and NSX. You just have to log on to vCenter and start deploying workloads.

Each ESXi host provides 36 CPU cores of 2.3 GHz (72 threads), 512 GB of RAM and 10.7 TB of raw storage capacity for the virtual machines to consume. As a result, a default vSphere cluster provides 144 CPU cores (288 threads), 2 TB of RAM and 42.8 TB of raw storage capacity. All physical resources!
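
The cluster figures follow directly from the per-host numbers. A small Python sketch of that arithmetic is shown below; the class and function names are made up for illustration, and the RAM conversion assumes 1 TB = 1024 GB, which is what the 2 TB figure implies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostSpec:
    """Per-host resources as quoted in the article."""
    cores: int = 36
    threads: int = 72
    ram_gb: int = 512
    raw_storage_tb: float = 10.7


def cluster_capacity(hosts, spec=HostSpec()):
    """Multiply the per-host resources by the number of hosts."""
    return {
        "cores": hosts * spec.cores,
        "threads": hosts * spec.threads,
        "ram_tb": hosts * spec.ram_gb / 1024,  # assumes 1 TB = 1024 GB
        "raw_storage_tb": round(hosts * spec.raw_storage_tb, 1),
    }


default_cluster = cluster_capacity(4)
ten_extra_hosts = cluster_capacity(10)

print("default 4-node cluster:", default_cluster)
print("10 additional hosts:   ", ten_extra_hosts)
```

Running the same function for 10 hosts gives 360 cores, 5 TB of RAM and 107 TB of raw storage, which is the extra capacity the console reports when you add ten hosts.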

By leveraging the scale of AWS data centers and its operational framework, the VMware Cloud on AWS fleet management service can deploy physical resources on demand! By logging into the console, you can add physical hosts to the cluster and remove them again.

This allows you to add physical hardware to the cluster whenever you need it. No more long procurement process, no more waiting for the vendor to ship the goods. No more racking and stacking in a cold, dark data center. With just a few clicks, you get fresh new hardware added to your cluster, fully installed, configured, patched and ready to go. Typically it takes about 10 minutes for VMware Cloud on AWS to add a single physical host to your vSphere cluster. I’ve been to data centers where it took me more than 10 minutes to arrive at the correct cabinet.

If one is not enough, you can add up to 28 ESXi hosts to the cluster. In my example, I added 10 additional hosts. The console lists the host type and the extra capacity added by this action (10 ESXi hosts = 360 cores, 5 TB RAM and 107 TB of storage) and sums the new cluster capacity.

If you want to isolate specific workloads and add a separate cluster, just go right ahead and select the add cluster option in the console.

In total, a VMware Cloud on AWS customer can deploy up to 10 clusters of 32 ESXi hosts each in a single SDDC, and up to two SDDCs can be deployed. That means that a customer can have 23,040 physical CPU cores, 327 TB of memory and 6.8 petabytes of storage. All physical hardware.
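
As a quick check of these maximums, the sketch below redoes the arithmetic in Python from the per-host figures quoted earlier (36 cores, 512 GB RAM, 10.7 TB raw storage). The constant names are illustrative only. One caveat: the memory figure of roughly 327 TB only comes out when 1 TB is counted as 1000 GB; with 1024 GB per TB it would be 320 TB.

```python
# Per-host figures quoted in the article.
CORES_PER_HOST = 36
RAM_GB_PER_HOST = 512
RAW_TB_PER_HOST = 10.7

# Limits quoted in the article.
HOSTS_PER_CLUSTER = 32
CLUSTERS_PER_SDDC = 10
SDDCS_PER_CUSTOMER = 2

total_hosts = HOSTS_PER_CLUSTER * CLUSTERS_PER_SDDC * SDDCS_PER_CUSTOMER
total_cores = total_hosts * CORES_PER_HOST
total_ram_gb = total_hosts * RAM_GB_PER_HOST
total_raw_tb = total_hosts * RAW_TB_PER_HOST

print(f"hosts:       {total_hosts}")
print(f"cores:       {total_cores}")
print(f"memory:      {total_ram_gb / 1000:.2f} TB (at 1000 GB per TB)")
print(f"memory:      {total_ram_gb / 1024:.2f} TB (at 1024 GB per TB)")
print(f"raw storage: {total_raw_tb / 1000:.2f} PB")
```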

As you can imagine, all of this is orchestrated by firing off a collection of API calls. The beauty of having this capacity-by-code functionality is that you can incorporate it into software features, such as vSphere HA and DRS. An upcoming new feature is Elastic DRS: in short, the ability to scale the cluster out and in with physical hardware whenever workload demand requires it. I will provide a more in-depth view once we release this new feature.

vBrownBag Techtalks VMworld Call for Papers now open

Although the selection process for the submitted VMworld 2018 sessions is still ongoing, vBrownBag has announced its call for papers.

As Duncan mentioned in his call for papers article: ‘Good luck, and remember: if you don’t end up getting selected, submit the proposal to a VMUG near you instead. They are always begging for community sessions.’

Think about signing up for vBrownBag as well. Since last year, all vBrownBag sessions are published in the content catalog, so your session is visible to all 23,000+ attendees. Go right ahead and fill out this form.

The Public Shaming of a Resource Pool-as-a-Folder User

Yesterday there was some public shaming of Anthony Spiteri. He was outed for using vSphere resource pools as folders.

A funny thread, and he truly deserved all the public shaming by the community members ;). All fun aside, using resource pools as folders is not recommended by VMware. As I described in the new vSphere 6.5 DRS white paper available at vSphere Central:

Correct use: Resource pools are an excellent construct to isolate a particular amount of resources for a group of virtual machines without having to micro-manage resource setting for each individual virtual machine. A reservation set at the resource pool level guarantees each virtual machine inside the resource pool access to these resources. Depending on the activity of these virtual machines these virtual machines can operate without any contention.

Incorrect use: Resource pools should not be used as a form of folders within the inventory view of the cluster. Resource pools consume resources from the cluster and distribute these amongst its child objects within the resource pool; this can be additional resource pools and virtual machines. Due to the isolation of resources, using resource pools as folders in a heavily utilized vSphere cluster can lead to an unintended level of performance degradation for some virtual machines inside or outside the resource pool.

Understanding this behavior allows you to design a correct resource pool structure. Currently, I’m working on a new vSphere DRS Resource Pool white paper which sheds some new light on the distribution of resources under normal conditions and under load (the Resource Pool Pie Paradox). I will keep you posted!
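
To illustrate why this matters, here is a simplified Python sketch of share-based distribution. It assumes a fully contended cluster in which every VM demands as much as it can get, and it assumes that siblings split their parent's resources in proportion to their shares. The share values (4000 for the pool, 2000 for a 2-vCPU VM) are what I understand the Normal defaults to be, but treat them, and all object names, as assumptions for illustration. This is not the DRS algorithm itself.

```python
def divide(children, capacity):
    """Split capacity among siblings in proportion to their shares.

    children: dict name -> {"shares": int, optional "children": {...}}
    A node with "children" is a resource pool; its slice is divided
    again among the objects it contains.
    Returns a flat dict of VM name -> entitlement.
    """
    total_shares = sum(node["shares"] for node in children.values())
    entitlement = {}
    for name, node in children.items():
        node_slice = capacity * node["shares"] / total_shares
        if "children" in node:
            entitlement.update(divide(node["children"], node_slice))
        else:
            entitlement[name] = node_slice
    return entitlement


# A resource pool used as a folder for eight 2-vCPU VMs,
# next to two identical VMs placed directly in the cluster root.
cluster_root = {
    "folder-pool": {
        "shares": 4000,
        "children": {f"pool-vm-{i}": {"shares": 2000} for i in range(8)},
    },
    "root-vm-1": {"shares": 2000},
    "root-vm-2": {"shares": 2000},
}

ent = divide(cluster_root, capacity=100.0)  # capacity as a percentage
for vm, pct in sorted(ent.items()):
    print(f"{vm:12s} {pct:6.2f}% of cluster resources")
```

In this model, the two VMs outside the pool are each entitled to 25% of the cluster under contention, while each of the eight identical VMs inside the "folder" gets only 6.25%. Nobody configured that difference on purpose; it falls out of the pool being a sibling of the VMs next to it.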

© 2018