BEST PRACTICES

Last week at VMworld and the VCDX defense panels I heard the term “Best Practices” a lot. The term makes me feel happy, shudder and laugh at the same time. When it comes to applying best practices I always use the analogy of crossing the road: I was born and raised in the Netherlands, where best practice is to look left first, then right, and finally check left again before crossing the road. This best practice served me well and helped me avoid being hit by cars, trucks, crazy people on bikes and even trams and trolleys. But I ask you: does this best practice still apply when I try to cross the street in London? Don’t get me wrong, best practices are useful and very valuable, but applying a best practice blindly, while not as lethal as in my analogy, can get you into a lot of trouble.

VMWORLD VCLOUD DIRECTOR LABS

Yesterday the VMworld Labs opened up to the public. If you want to take vCloud Director for a spin, I recommend doing the following labs:

Private Cloud – Management:
• Lab 13 – VMware vCloud Director Install and Config
• Lab 18 – VMware vCloud Director Networking

Private Cloud – Security:
• Lab 20 – VMware vShield

It’s best to complete Lab 18 (vCloud Director Networking) before doing the VMware vShield lab (Lab 20), because the terms and knowledge gained in Lab 18 will prepare you for Lab 20. Today the VMworld 2010 speaker sessions started and I strongly recommend Duncan’s session “BC7803 - Planning and Designing an HA Cluster that Maximizes VM Uptime” and Kit Colbert’s “TA7750 - Understanding Virtualization Memory Management Concepts”. Go check them out.

NUMA, HYPERTHREADING AND NUMA.PREFERHT

I received a lot of questions about Hyperthreading and NUMA in ESX 4.1 after writing the ESX 4.1 NUMA scheduling article. A common misconception is that Hyperthreading is ignored, and therefore not used, on a NUMA system. This is not entirely true: due to the improved Hyperthreading code on Nehalem, the CPU scheduler is programmed to use the HT feature more aggressively than in previous releases of ESX. The main reason I think this misconception exists is the way the NUMA load balancer handles vCPU placement of vSMP virtual machines. Before continuing, let’s align our CPU element nomenclature; I’ve created a diagram showing all the elements:
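To make the distinction concrete, here is a minimal Python sketch (my own model, not ESX scheduler code) of how counting physical cores versus logical HT threads changes whether a VM’s vCPUs fit inside a single NUMA node. The function name and parameters are mine; the behavior it models is the numa.preferHT idea discussed in this section:

```python
# Illustrative model of the NUMA placement decision (not ESX code).
# By default the NUMA scheduler sizes a NUMA client by the number of
# physical cores per node; with numa.preferHT the logical (HT) threads
# are counted instead, keeping more vCPUs local to a single node.

def fits_on_one_node(vcpus, cores_per_node, threads_per_core=2, prefer_ht=False):
    """Return True if all vCPUs can be placed inside one NUMA node."""
    capacity = cores_per_node * threads_per_core if prefer_ht else cores_per_node
    return vcpus <= capacity

# An 8-vCPU VM on a quad-core Nehalem node (2 HT threads per core):
print(fits_on_one_node(8, cores_per_node=4))                  # False -> treated as wide, split across nodes
print(fits_on_one_node(8, cores_per_node=4, prefer_ht=True))  # True  -> stays node-local on HT threads
```

The trade-off the sketch hints at: preferring HT keeps memory access local, at the cost of sharing physical cores between vCPUs.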

VM SETTINGS: PREFER PARTIALLY AUTOMATED OVER DISABLED

Due to requirements or constraints it might be necessary to exclude a virtual machine from automatic migration and stop DRS from moving it around. Use the “Partially automated” setting instead of “Disabled” at the individual virtual machine automation level. Partially automated blocks automated migration by DRS, but keeps the initial placement function: during startup, DRS is still able to select the most optimal host for the virtual machine. With the “Disabled” setting, the virtual machine is started on the ESX server on which it is registered, and the chances of optimal placement are lower. An exception to this recommendation might be a virtualized vCenter server; most admins like to keep track of the vCenter server in case a disaster happens. After a disaster occurs, for example a datacenter-wide power outage, they only need to power up the ESX host on which the vCenter VM is registered and manually power up the vCenter VM. An alternative is to keep track of the datastore vCenter is placed on and, after a disaster, register and power on the VM on a (random) ESX host. This is slightly more work than disabling DRS for vCenter, but probably offers better performance of the vCenter virtual machine during normal operations. Due to expanding virtual infrastructures and new additional features, vCenter is becoming more and more important for day-to-day operational management. Assuring good performance outweighs the additional effort necessary after a (hopefully) rare occasion, but both methods have merit.
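The difference between the automation levels can be summarized in a small Python sketch. This is a toy model of the behavior described above, not a vSphere API; the dictionary keys are my own labels:

```python
# Toy model of per-VM DRS automation levels (my own simplification):
# "disabled" blocks both initial placement and automated migration,
# while "partiallyAutomated" keeps the initial-placement benefit and
# only blocks automated vMotions.

LEVELS = {
    "fullyAutomated":     {"initial_placement": True,  "automated_migration": True},
    "partiallyAutomated": {"initial_placement": True,  "automated_migration": False},
    "disabled":           {"initial_placement": False, "automated_migration": False},
}

def drs_actions(level):
    """Return which DRS actions remain active for a given automation level."""
    return LEVELS[level]

print(drs_actions("partiallyAutomated"))
# {'initial_placement': True, 'automated_migration': False}
```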

VOTED NUMBER 6 OF TOP 25 VMWARE BLOGS, WOW!

Eric Siebert of vSphere-land, together with David Davis, Simon Seagrave and John Troyer, announced the results of the Top 25 blogs election this week. The vChat format is, in my opinion, very cool, and I had the feeling I was watching the Academy Awards for bloggers. Gentlemen, thank you for taking the time and effort to create this entertaining show. Next time I will be viewing it from my TV instead of my 13" laptop screen and making sure I’ll have the popcorn ready. But most of all I want to thank everyone who voted for me. According to you my blog belongs in the top 10, and I’m very proud and honored to be ranked that high. Thank you very much!

PROVIDER VDC: CLUSTER OR RESOURCE POOL?

Duncan’s article on vCloud Allocation models states that “a provider vDC can be a VMware vSphere Cluster or a Resource Pool”. Although vCloud Director offers the ability to map Provider vDCs to Clusters or Resource Pools, it might be better to choose the less complex solution. This article zooms in on the compute resource management constructs, particularly on the choice between assigning a VMware Cluster or a Resource Pool to a Provider vDC, and on the placement of Organization vDCs. I strongly suggest visiting Yellow Bricks to read all the vCloud Director posts; they explain the new environment/cloud model used by VMware very thoroughly.

RESOURCE POOLS AND SIMULTANEOUS VMOTIONS

Many organizations have the bad habit of using resource pools to create a folder structure in the Hosts and Clusters view of vCenter. Virtual machines are placed inside a resource pool to show some kind of relation or sorting order, such as operating system or type of application. This is not the reason why VMware invented resource pools. Resource pools are meant to prioritize virtual machine workloads and to guarantee and/or limit the amount of resources available to a group of virtual machines. During design workshops I always try to convince the customer that resource pools should not be used to create a folder structure. My main objection is the sibling share level of resource pools and virtual machines. Shares specify the priority of a virtual machine or resource pool relative to other resource pools and/or virtual machines with the same parent in the resource hierarchy. The key point is that share values can be compared directly only among siblings: the ratio of shares between VM6 and VM7 tells you which VM has higher priority, but the shares of VM4 and VM6 do not. Many articles have been written about this, such as “The resource pool priority-pie paradox” (Craig Risinger), “Resource pools and shares” (Duncan Epping), “Don’t add resource pools for fun” (Eric Sloof) and “Resource pools caveats” (Bouke Groenescheij).

Another reason not to use resource pools as a folder structure is the limitation resource pools inflict on vMotion operations. Depending on the network speed, vSphere 4.1 allows up to 8 simultaneous vMotion operations; however, simultaneous migrations with vMotion can only occur if the virtual machine is moving between hosts in the same cluster and is not changing its resource pool. This was recently confirmed in Knowledge Base article [1026102](http://kb.vmware.com/kb/1026102). Fortunately, simultaneous cross-resource-pool vMotions can occur if the virtual machines are migrating to different resource pools, but still only one concurrent vMotion operation per target resource pool. Because clusters are actually implicit resource pools (the root resource pool), migrations between clusters are also limited to a single concurrent vMotion operation. Using resource pools to create a folder structure can not only impact the availability of resources for the virtual machines, but can also hinder your daily (maintenance) operations when batches of virtual machines are migrated to other resource pools.
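The sibling-share issue can be illustrated with a few lines of Python. The share values and VM counts below are hypothetical, and the function is my own sketch of the math, not anything from vCenter:

```python
# Sketch of the sibling-share math behind the "priority-pie paradox".
# Shares divide resources among *siblings* first; the VMs inside a
# resource pool then split that pool's slice among themselves again.

def per_vm_share(total_mhz, siblings):
    """siblings: list of (share_value, vm_count) tuples at the same level.
    Returns the CPU entitlement per VM for each sibling, assuming the VMs
    inside a pool hold equal shares of the pool's slice."""
    total_shares = sum(shares for shares, _ in siblings)
    return [total_mhz * shares / total_shares / vms for shares, vms in siblings]

# A "High" pool (8000 shares) holding 8 VMs next to a "Normal" pool
# (4000 shares) holding a single VM, under 12000 MHz of contention:
high_vm, normal_vm = per_vm_share(12000, [(8000, 8), (4000, 1)])
print(high_vm, normal_vm)  # 1000.0 4000.0
```

Each VM in the “High” pool ends up with a quarter of what the single VM in the “Normal” pool receives, which is exactly the counterintuitive outcome the priority-pie articles warn about.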

2 DAYS LEFT TO VOTE

Eric Siebert, owner of vSphere-land.com, started the second round of the bi-annual top 25 VMware virtualization blogs voting. The last round was back in January, and this is your chance to vote for your favorite virtualization bloggers and help determine the top 25 blogs of 2010. This year my blog got nominated for the first time and entered the top 25 (no. 14), and I hope to stay in the top 25 after this voting round. My articles tend to focus primarily on resource management and cover topics such as DRS and the CPU and memory schedulers, to help you make an informed decision when designing or managing a virtual infrastructure. As noble as this may sound, I know these kinds of topics are not mainstream and I can understand that not everybody is interested in reading about them week in, week out. Fortunately, I’ve managed to get a blog post listed in the Top 5 Planet V12n blog post list at least once every month, and I am referred to on a regular basis by sites like Yellow-bricks.com (Duncan Epping), Scott Lowe, NTpro.nl (Eric Sloof) and Chad Sakac, and of course many others. So it seems I’m doing something right. This is my list of the top 10 articles I’ve created this year:

ESX 4.1 NUMA SCHEDULING

VMware has made some changes to the CPU scheduler in ESX 4.1; one of the changes is support for wide virtual machines. A wide virtual machine contains more vCPUs than the total number of available cores in one NUMA node. Wide VMs will be discussed after a quick rehash of NUMA.

NUMA stands for Non-Uniform Memory Access, which translates into a variance of memory access latencies. Both AMD Opteron and Intel Nehalem are NUMA architectures. A processor and its memory form a NUMA node. Access to memory within the same NUMA node is considered local access; access to memory belonging to another NUMA node is considered remote access.
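The wide-VM definition above can be sketched in a few lines of Python. This is my own illustration, assuming the scheduler spreads vCPUs as evenly as possible over the fewest nodes that fit; it is not the actual scheduler logic:

```python
# Minimal sketch of splitting a (possibly wide) VM into multiple NUMA
# clients: a VM with more vCPUs than cores in a single node is divided
# over the fewest nodes that can hold them (even split assumed).

import math

def numa_clients(vcpus, cores_per_node):
    """Return the per-node vCPU counts for a VM."""
    nodes = math.ceil(vcpus / cores_per_node)   # fewest nodes that can hold all vCPUs
    base, extra = divmod(vcpus, nodes)          # spread vCPUs as evenly as possible
    return [base + (1 if i < extra else 0) for i in range(nodes)]

print(numa_clients(4, 4))   # [4]    -> fits in one node, all memory access can stay local
print(numa_clients(8, 4))   # [4, 4] -> wide VM, split into two NUMA clients
```

A vCPU in the second case may still touch memory homed on the other node, which is exactly the remote-access penalty the NUMA rehash describes.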

VCLOUD DIRECTOR ARCHITECTURE

A vCloud infrastructure consists of several components. Many of you have deployed, managed, installed or designed vSphere environments; the vCloud Director architecture introduces a new, additional layer on top of the vSphere environment. I have created a diagram that depicts an architecture mainly to be used by service providers. Service providers are likely to use an internal cloud for their own application and IT department operations and an external cloud environment for their service delivery. The purpose of this diagram is to create a clear overview of the new environment and show a relational overview of all the components. In essence: which component connects with which other components? The environment consists of a management cluster, an internal resource group and an external resource group. Building blocks (VMware vSphere, SAN and networking) within a vCD environment are often referred to as “resource groups”. The minimum components required to run a VMware vCloud environment are:

• VMware vCloud Director (vCD)
• VMware vShield Manager
• VMware Chargeback
• VMware vCenter (supporting the resource groups)
• vSphere infrastructure
• Supporting infrastructure elements (networking, SAN)

The management cluster contains all the vCD components, such as the vCenters and Update Managers managing the resource groups, the vCD cells and the Chargeback clusters. Due to the Oracle database requirement, a physical Oracle cluster is recommended, as RAC clustering is not supported on VMware vSphere. No vCD is used to manage the management cluster itself; management is done by a dedicated vCenter. Within a resource pod, a separate cluster is created for each pVCD to enable the service provider to deliver the different service levels. Duncan recently wrote a great [introduction to vCD](http://www.yellow-bricks.com/2010/08/31/vmware-vcloud-director-vcd/). The diagram shows two additional vCenters, one for the internal resource pod and one for the external resource pod.
It is advised to isolate internal IT resources from customer IT resources; managing and deploying internal IT services and external IT services, such as customer VMs, from one vCD can become obscure and complex. A single vCenter is used for both pVCDs, as it is expected that customers will deploy virtual machines in different service level offerings. By using one vCenter, a single Distributed Virtual Switch can be used, spanning both clusters/service offerings. To rehash: this is an abstract, high-level diagram intended to show the elements involved in a vCloud environment and the relations and connections between all the components.
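Since the diagram itself is not reproduced here, the “which component connects with which” question can be summarized as a simple adjacency map. The connection list below is my reading of the text in this section, a hypothetical sketch rather than an official reference:

```python
# Hypothetical adjacency map of the vCloud components described above
# (my interpretation of the text, not an official VMware reference).

CONNECTS_TO = {
    "vCloud Director cell":     ["vCenter (resource group)", "vShield Manager", "Oracle database"],
    "VMware Chargeback":        ["vCenter (resource group)", "vCloud Director cell"],
    "vShield Manager":          ["vCenter (resource group)"],
    "vCenter (resource group)": ["vSphere clusters (resource group)"],
    "vCenter (management)":     ["vSphere hosts (management cluster)"],
}

for component, peers in sorted(CONNECTS_TO.items()):
    print(f"{component} -> {', '.join(peers)}")
```

Walking the map from a vCD cell downwards reproduces the layering the section describes: vCD on top, vCenter in the middle, the vSphere resource groups at the bottom, with the management cluster handled by its own dedicated vCenter.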