frankdenneman Frank Denneman is the Machine Learning Chief Technologist at VMware. He is an author of the vSphere host and clustering deep dive series, as well as podcast host for the Unexplored Territory podcast. You can follow him on Twitter @frankdenneman

VMware Cloud on AWS – Elastic DRS preview


The VMworld Europe keynote featured the future VMware Cloud on AWS service. In short, this service gives VMware customers instant scale and global reach delivered by AWS, while they continue to use their own skill set to drive and operate VMware SDDC environments on-prem and in-cloud. It avoids the risk that comes with re-platforming and re-architecting the current application landscape to run on a different platform while providing the same service. In turn, it allows the IT organization to connect its current applications to AWS's vast service catalog and use services like RDS, Redshift, Glacier and many more.
One of the interesting features under tech preview is Elastic DRS. Elastic DRS helps to solve one of the toughest challenges an IT architect can face: capacity planning. The key inputs of capacity planning are current and future resource demand, failure recovery capacity and maintenance capacity. Finding the right balance between maintaining workload performance and limiting the CAPEX and OPEX of reserved failover capacity is difficult. By leveraging the IT-at-scale operations of AWS, Elastic DRS transforms vSphere clusters into an agility powerhouse.
 
The ability to scale rapidly allows you to add hosts to the cluster on demand. No more ordering new hardware, racking and stacking; just add the new host to the cluster with a right-click of the mouse. By using native metrics, DRS can detect that the cluster is running out of host resources and present a recommendation to add another host. Like regular DRS, you can also put Elastic DRS into automatic mode and allow it to add or remove hosts based on the observed load on the cluster.
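To make the idea of load-based recommendations a bit more concrete, here is a minimal sketch of a threshold-based scaling decision. It is not the actual Elastic DRS algorithm, which is not described in this post. The `ClusterState` type, the `recommend` function, the thresholds and the host limits are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical values for illustration only; not taken from the product.
SCALE_OUT_THRESHOLD = 0.80  # recommend adding a host at or above this utilization
SCALE_IN_THRESHOLD = 0.40   # consider removing a host at or below this utilization
MIN_HOSTS = 3
MAX_HOSTS = 16


@dataclass
class ClusterState:
    host_count: int
    cpu_utilization: float     # fraction of cluster CPU in use, 0.0 to 1.0
    memory_utilization: float  # fraction of cluster memory in use, 0.0 to 1.0


def recommend(state: ClusterState) -> str:
    """Return 'add-host', 'remove-host' or 'no-action' for the observed load."""
    # The most constrained resource drives the decision.
    busiest = max(state.cpu_utilization, state.memory_utilization)

    if busiest >= SCALE_OUT_THRESHOLD and state.host_count < MAX_HOSTS:
        return "add-host"

    if state.host_count > MIN_HOSTS and busiest <= SCALE_IN_THRESHOLD:
        # Estimate utilization if the same load runs on one host fewer,
        # so that removing a host does not immediately trigger a scale-out.
        projected = busiest * state.host_count / (state.host_count - 1)
        if projected < SCALE_OUT_THRESHOLD:
            return "remove-host"

    return "no-action"


if __name__ == "__main__":
    print(recommend(ClusterState(host_count=4, cpu_utilization=0.9, memory_utilization=0.5)))
    print(recommend(ClusterState(host_count=4, cpu_utilization=0.3, memory_utilization=0.2)))
```

In this sketch, manual mode would surface the returned value as a recommendation, while automatic mode would act on it directly. The gap between the two thresholds is a design choice that keeps the cluster from flapping between adding and removing hosts.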
Sometimes we forget how extremely complex running IT at super scale is. Automating the installation, configuration and operation of one host is interesting; doing this by the dozen is already pushing the limits for a lot of IT organizations. Now think about doing it in more than a dozen datacenters around the world at the same time, while being required to do it instantly when a customer wants it. Undeniably impressive. When joining the team, learning about Elastic DRS was exciting; understanding how this works for all the customers in all the AWS datacenters around the world is just mind-blowing! IT-at-Scale at its finest.
When you have ready-to-go ESXi hosts at your fingertips, it allows you to do so many cool things, for example allowing DRS to aid and assist vSphere HA. Since ESX 3.0, vSphere HA has ensured that workloads are restarted on the surviving hosts in the cluster. However, when a host outage is not temporary but permanent, application performance can be impacted by the longer-term reduction of available host resources. Auto remediation helps to address this challenge.
Auto remediation builds upon Elastic DRS and ensures that the available host resources remain consistent during an ESXi host outage. When a host failure is detected, auto remediation adds another host to the cluster, ensuring that workload performance will not be impacted in the long run by the failure. If a partial (hardware) failure occurs, auto remediation ensures that vSAN operations complete before ejecting the degraded host.
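The remediation flow described above can be sketched roughly as follows. This is an illustrative model only, not how the service is implemented: the `Host` type, its `state` and `pending_storage_ops` fields, the `remediate` function and the replacement host names are all hypothetical. The sketch assumes that a failed host is replaced right away, while a degraded host is ejected only after its outstanding storage operations have finished.

```python
import itertools
from dataclasses import dataclass
from typing import List

# Hypothetical counter used only to give replacement hosts unique names.
_replacement_ids = itertools.count(1)


@dataclass
class Host:
    name: str
    state: str = "healthy"        # "healthy", "failed" or "degraded"
    pending_storage_ops: int = 0  # stand-in for outstanding vSAN operations


def remediate(cluster: List[Host], target_size: int) -> List[str]:
    """Restore the cluster to target_size healthy hosts; return the actions taken."""
    actions = []

    # 1. Fully failed hosts no longer contribute resources, so drop them.
    for host in [h for h in cluster if h.state == "failed"]:
        cluster.remove(host)
        actions.append(f"remove {host.name}")

    # 2. Add replacement hosts until the healthy capacity matches the target.
    healthy = sum(1 for h in cluster if h.state == "healthy")
    for _ in range(target_size - healthy):
        new_host = Host(name=f"replacement-{next(_replacement_ids)}")
        cluster.append(new_host)
        actions.append(f"add {new_host.name}")

    # 3. Eject degraded hosts only once their storage operations are complete.
    for host in [h for h in cluster if h.state == "degraded"]:
        if host.pending_storage_ops == 0:
            cluster.remove(host)
            actions.append(f"eject {host.name}")
        else:
            actions.append(f"wait {host.name}")

    return actions
```

Calling `remediate` repeatedly, for example on a timer, lets a degraded host stay in the cluster until its storage operations drain, while the replacement capacity is already in place.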
 
Another benefit of this framework is the ability to retain similar levels of resources during maintenance. Typically, during maintenance operations, hosts are patched and temporarily unavailable to run and service applications. Many IT organizations deal with this situation by either "oversizing" the cluster or offering SLAs that provide a reduced service during maintenance hours. With Elastic DRS, the cluster size is not reduced during maintenance operations. This way, workloads are not impacted by a loss of resources and continue to perform much as they do during normal operating hours.
To emphasize: this is a technical preview of a service that is not operational yet.
For more info about VMware Cloud on AWS, take a closer look.
