vSphere 5.5 Home lab

For a while I’ve been using three Dell R610 servers in my home lab. The machines’ specs are quite decent: each server is equipped with two Intel Xeon 5530 CPUs, 48GB of memory and four 1GbE NICs. With a total of 24 cores (48 HT threads) and 144GB of memory, the cluster has more than enough compute power.

However, from a bandwidth perspective they are quite limited; 3Gbit/s SATA and 1GbE networking are not exactly pushing the envelope. These limitations prevent me from properly understanding what a customer can expect when running FVP software. In addition, I don’t have proper cooling to keep the machines cool, and their power consumption is troubling.

Time for something new, but where to begin?

CPU
Looking at the current lineup of CPUs doesn’t make it easier. Within a single vendor’s product line there are multiple socket types and multiple processor series with comparable performance levels. I think I spent most of my time figuring out which processor to select. Some selection criteria were quite straightforward: I want a single-CPU system with at least 6 cores and Hyper-Threading, and the CPU must have a high clock speed, preferably above 3GHz.

Intel ARK (Automated Relational Knowledgebase) provided the answer. Two candidates stood out: the Intel Core i7-4930K and the Intel Xeon E5-1650 v2. Both are 6-core, both are HT-enabled, and both support advanced technologies such as VT-x, VT-d and EPT. http://ark.intel.com/compare/77780,75780

The difference between the two CPUs that matters most to me is the larger amount of memory supported by the Intel Xeon E5. However, the i7-4930K supports 64GB, which should be enough for a long time. In the end, the motherboard provided the answer.

Motherboard
Contrary to the variety of choices at the CPU level, there is currently one motherboard that stands out for me, and it looks almost too good to be true: the SuperMicro X9SRH-7TF. This board has it all, for an unbelievable price. The most remarkable features are the on-board Intel X540 dual-port 10GbE NIC and the LSI 2308 SAS controller. Eight DIMM slots, the Intel C602J chipset and a dedicated IPMI LAN port complete the story. And the best part is that its price is similar to that of a PCIe version of the Intel X540 dual-port 10GbE NIC alone. The motherboard only supports Intel E5 Xeons, therefore the CPU selection is narrowed down to one choice: the Intel Xeon E5-1650 v2.

CPU Cooler
The SuperMicro X9SRH-7TF has an Intel LGA2011 socket with Narrow ILM (Independent Loading Mechanism) mounting, which requires a cooler designed to fit this narrow mount. The goal is to build silent machines, and the listed maximum noise level of 17.6 dB(A) of the Noctua NH-U9DX i4 “sounds” promising.

Memory
Each server will be equipped with 64GB: four 16GB DDR3-1600 modules, leaving four of the eight DIMM slots free for a future memory upgrade. The full product name: Kingston ValueRAM KVR16R11D4/16HA.

Network
Although the two 10GbE NICs provide more than enough bandwidth, I need to test scenarios where 1GbE is used. Unfortunately, vSphere 5.5 does not support the 82571 chipset used by the Intel PRO/1000 PT Dual Port Server Adapter currently installed in my Dell servers, so I need to find an alternative 1GbE NIC; recommendations are welcome.
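As a side note, if you want to verify which driver ESXi binds to each physical NIC before buying a replacement card, that information is available through the vSphere API. Below is a minimal pyVmomi sketch, not part of the build itself; the vCenter hostname and credentials are placeholders.

```python
# Sketch only: list each host's physical NICs with the driver ESXi bound to
# them (assumes pyVmomi is installed and the placeholder vCenter/credentials
# below are replaced with real ones).
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only: skip certificate checks
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    for host in view.view:
        for pnic in host.config.network.pnic:
            speed = pnic.linkSpeed.speedMb if pnic.linkSpeed else 0
            print(f"{host.name} {pnic.device}: driver={pnic.driver}, "
                  f"link={speed} Mb/s")
finally:
    Disconnect(si)
```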

Power supply
I prefer a power supply that is low noise and fully modular, therefore I selected the Corsair RM550. Besides a noise-reducing fan, the PSU has a Zero RPM Fan Mode, which keeps the fan from spinning until the unit is under heavy load, reducing the overall noise level of my lab when I’m not stressing the environment.

Case
The case of choice is the Fractal Design Define R4: a simple but elegant design, enough space inside, and some sound-dampening features. Instead of the standard black, I decided to order the titanium grey version.

SSD
Thanks to the PernixDrive program I have access to many different SSD devices. Currently my lab contains Intel DC S3700 100GB and Kingston SSDNow E100 200GB drives. Fusion-io, currently not (yet) in the PernixDrive program, was kind enough to lend me a 3.2TB Fusion-io ioDrive; unfortunately I will have to return it someday.

Overview

Component | Type | Cost
CPU | Intel Xeon E5-1650 v2 | 540 EUR
CPU Cooler | Noctua NH-U9DX i4 | 67 EUR
Motherboard | SuperMicro X9SRH-7TF | 482 EUR
Memory | Kingston ValueRAM KVR16R11D4/16HA | 569 EUR
SSD | Intel DC S3700 100GB | 203 EUR
SSD | Kingston SSDNow E100 200GB | 579 EUR
Power Supply | Corsair RM550 | 90 EUR
Case | Fractal Design Define R4 | 95 EUR

Price per server (without disks): 1843 EUR

To start my new lab, two of these machines have been built; later this year more will be added. I would like to thank Erik Bussink for his recommendations and feedback on the component selection of my new vSphere 5.5 home lab. I’m sure he will post an article about his new lab soon.

New book project: vSphere Design Pocketbook v2 – the blog edition

Last year’s vSphere Design Pocketbook – “Tweet-Sized Design Considerations for Your Software-Defined Datacenter” – was a big hit. Over 3000 copies have been given away since last VMworld, and I don’t even know how many copies have been downloaded.

[Image: vSphere Design Pocketbook cover]

vSphere Design Pocketbook platform
In case you missed it, the vSphere Design Pocketbook is a platform for all virtualization community members to broadcast their knowledge. In the vSphere Clustering Deepdive series, Duncan and I emphasised certain design considerations by calling them out in “Basic design principles” text boxes. These basic design principles provide quick and simple, as well as deep and quintessential, information for making architectural design decisions. We knew other community members had loads of advice to share, and from that idea the vSphere Design Pocketbook was born. Now it is time for a successor!

The design considerations featured in the first book were in tweet-sized format, limited to 200 characters. This edition expands beyond that limit and allows you to convey your thoughts at up to the length of a blog article. You can either select existing content, such as a published article, or create something new.

  • Is there a maximum length? Not exactly; use as many words as necessary to describe your design consideration efficiently. If needed, we will ask you to condense your material.
  • Can I use diagrams? Absolutely! Make sure you provide diagrams and screenshots that can be printed: at least 220 DPI, preferably 300 DPI. Looking for guidelines on making great diagrams? Please read this article.
  • Will I be credited? We will use the same format as the first book. Your name, Twitter handle and, if available, your blog URL will be listed. In line with most blog sites, you are requested to provide a short bio of three sentences that will be printed alongside the article.
  • Do I need to be a blogger? You are not required to have a blog, nor to be a vExpert or VCDX. There are no prerequisites for submitting your design consideration articles.

We are looking for content in the following categories:

  • Host design
  • Cluster design
  • vCenter design
  • Networking and Security design
  • Storage design
  • Generic design considerations – “Words of Wisdom”.

To avoid saturation, we accept no more than three articles per author in total. For example, you can provide us with three design consideration articles for the Host category, or you could choose to provide one article for each of three different categories. Be aware that we would rather see one excellent design consideration article than three mediocre ones.

Project schedule

  • Announcement and Call for Entries (Today)
  • Deadline for Call for Entries (April 25th)
  • Deadline for selection of design considerations by the judges
  • Book design and print process
  • Book Availability (VMworld 2014)

Once the book is complete, we will publish the list of people featured in it; we will not share this information during the production process.

This book is free!
PernixData has generously offered to print the book. If your design consideration is included, you will receive a copy. At the PernixData booth at VMworld, a copy will be available for everyone who submitted a winning design consideration article. A limited number of books will also be available for the community; more details will follow. After VMworld, an e-book version will be made publicly available.

How to enter?
An online form will be made available soon. In the meantime you can select your top article or write a new one. Stay tuned for the announcement!

Help my DRS cluster is not load balancing!

Unfortunately I still see this cry for help appearing on the VMTN forums and on Twitter, usually accompanied by screenshots like this:

[Screenshot: DRS cluster showing seemingly unbalanced memory utilization across hosts]

This screen doesn’t really show you whether your DRS cluster is balanced or not. It just shows whether the virtual machines receive the resources they are entitled to. The reason I don’t use the word demand is that DRS calculates priority based on virtual machine and resource pool resource settings and on resource availability.

To understand whether a virtual machine received the resources it requires, hover over the bar and find the virtual machine. A new window is displayed with the metric “Entitled Resources Delivered”.

[Screenshot: per-VM “Entitled Resources Delivered” metric]
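For those who prefer pulling these numbers programmatically, the same entitlement figures are exposed in the vSphere API as virtual machine quick stats (distributedCpuEntitlement and distributedMemoryEntitlement). The snippet below is a minimal pyVmomi sketch; the vCenter hostname and credentials are placeholders.

```python
# Sketch only: print per-VM resource entitlement versus usage via the
# vSphere API (assumes pyVmomi and placeholder connection details).
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    for vm in view.view:
        if vm.runtime.powerState != "poweredOn":
            continue
        qs = vm.summary.quickStats
        # distributedCpuEntitlement (MHz) and distributedMemoryEntitlement (MB)
        # reflect what DRS entitles the VM to, compared with current usage.
        print(f"{vm.name}: CPU entitlement {qs.distributedCpuEntitlement} MHz "
              f"(usage {qs.overallCpuUsage} MHz), "
              f"memory entitlement {qs.distributedMemoryEntitlement} MB "
              f"(active {qs.guestMemoryUsage} MB)")
finally:
    Disconnect(si)
```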

DRS attempts to provide the resources requested by the virtual machine. If the current host is not able to provide them, DRS moves the virtual machine to another host that is. If the virtual machine is already receiving the resources it requires, there is no need to move it to another host. Moves by DRS consume resources as well, and you don’t want to waste resources on unnecessary migrations.

To avoid wasting resources, DRS calculates two metrics: the current host load standard deviation and the target host load standard deviation. These metrics indicate how far the current load of the hosts deviates from the ideal load. The migration threshold determines how far these two metrics may lie apart before the distribution of virtual machines needs to be reviewed. The web client contains a neat water-level image that indicates the overall cluster balance; it can be found on the cluster summary page and should be used as the default indicator of the cluster resource status.

[Screenshot: cluster summary showing the host load standard deviation balance indicator]
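The numbers behind this widget are also exposed through the vSphere API: the cluster summary carries currentBalance and targetBalance fields which, as far as I can tell from the API reference, hold the current and target host load standard deviations multiplied by 1000. Below is a rough pyVmomi sketch to read them; the connection details are placeholders.

```python
# Sketch only: read the DRS balance metrics from each cluster summary
# (assumed environment; currentBalance/targetBalance appear to be the host
# load standard deviations multiplied by 1000).
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    for cluster in view.view:
        s = cluster.summary
        current = (s.currentBalance or 0) / 1000.0
        target = (s.targetBalance or 0) / 1000.0
        state = "balanced" if current <= target else "load imbalanced"
        print(f"{cluster.name}: current host load stddev {current:.3f}, "
              f"target {target:.3f} -> {state}")
finally:
    Disconnect(si)
```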

One of the main counterarguments is that a host contains more than CPU and memory resources alone. Multiple virtual machines located on one host can stress or saturate the network and storage paths extensively, whereas a better distribution of virtual machines across the hosts would also result in a better distribution of load at the storage and network path layer. This is a very valid argument; however, DRS is designed to take care of CPU and memory resource distribution and is therefore unable to take these other resource consumption constraints into account.

In reality, DRS takes a lot of metrics into account during its load-balancing task. For more in-depth information I recommend reading the articles “DRS and memory balancing in non-overcommitted clusters” and “Disabling mingoodness and costbenefit”.

Why I love VSAN!

I receive a lot of questions about VSAN: how it relates to PernixData FVP and whether it will impact PernixData in any way.

In my opinion, VSAN corroborates the architectural shift away from monolithic storage designs: the move away from the storage array as the natural object to provide both storage performance and capacity to the virtual infrastructure. It’s another voice saying that there are different ways of providing storage services to your virtual infrastructure, and it’s great to see many people reconsidering the current design paradigm. That’s what I love about VSAN. Now let’s take a closer look at the solutions both products provide, their similarities and their differences.

Similarities
VSAN is similar to PernixData FVP from a design perspective, providing a scale-out solution in the compute layer. Both PernixData FVP and VSAN operate in the correct place: at the hypervisor kernel level. By operating at the kernel level, both solutions scale out natively at the kernel and cluster level, reduce management overhead and provide operational simplicity.

True scale out storage performance
Scale-out should be applied where it’s required, and the natural layer is the compute layer. As new workloads are introduced, architects look at the compute layer to expand their pool of compute resources: CPU and memory. The missing part up to this point was storage performance. With the success of acceleration platforms such as FVP, it is clear that storage performance should be considered an integral part of the compute layer when designing your environment. Until now, buying boatloads of capacity in order to provide performance was considered storage scale-out. Placing a clustered solution that harnesses the power of flash in the compute layer provides the correct tool for the job at the correct place. Storage arrays can still be used for what they were initially designed for: storing data and providing other data services such as replication and snapshotting.

[Diagram: FVP cluster scaling out storage performance at the compute layer]

Where both solutions differ
The reason VSAN does not impact FVP directly is that VSAN provides a persistent datastore layer to store virtual machines. Using VSAN requires you either to move existing virtual machines off your current storage array or to place new virtual machines on this new datastore.

[Diagram: VSAN versus FVP architecture]

FVP does not create a persistent storage layer. It does not require you to change the configuration of existing virtual machines, and new virtual machines are still stored on the storage array.

FVP decouples the performance requirements from the capacity requirements of storage arrays. This means the path to the performance layer is short where it matters (application to flash resources), while the longer path to the storage layer becomes transparent to the application.

FVP leverages server flash and clustered services to provide storage performance where it belongs: at the compute layer.

Investigate your application performance by using FVP monitor capabilities

In the article “Are your storage performance test methods able to test acceleration platforms accurately?” I recommend investigating the behaviour of your application when evaluating an acceleration platform. How can you do this without diving into the bowels of the application and opening up a can of command-line and trace tools? With FVP software you can use the FVP monitor capabilities.

FVP Monitor capabilities
We believe that you should be able to simply compare your current storage performance levels with the new accelerated performance levels, and FVP software is designed to present this information in one clear and coherent performance graph. The FVP monitor capabilities allow you to easily determine the performance difference between running an application natively on a storage array and running it accelerated by FVP.

[Graph: before-and-after latency comparison]

How do you use the FVP monitor capabilities? Simply create an FVP cluster but do not assign any flash resources to it. FVP software captures the I/O activity of the virtual machines that are members of the FVP cluster.

Creating a Monitor FVP cluster
The first step is to install FVP in your virtual infrastructure by installing the kernel extension module software (VIB package) on the ESXi hosts inside the vSphere cluster, and installing the PernixData Management Software on any Windows machine. Go to the FVP Cluster menu option in the vSphere web client, or, if you are using the vSphere client, select the cluster and go to the PernixData tab. Select the Create Cluster option. Assign your Monitor cluster a name, provide a description, then select the vSphere cluster that contains the ESXi hosts enriched with the FVP software.

[Screenshot: creating a monitor-mode FVP cluster]

Please note that these hosts do contain flash devices, but these will be assigned to another FVP cluster that accelerates application workloads. Once the Monitor cluster is created, the summary page indicates that the cluster does not contain any flash devices, which for this purpose is fine.

[Screenshot: summary page of the monitor-mode FVP cluster]

The next step is to select the virtual machine that runs the application you are investigating. For this exercise I’m using the virtual machine running vCenter Server.

[Screenshot: adding the VM to the monitor-mode FVP cluster]

Please note that although the UI allows you to select a Write Policy, no read or write acceleration will be performed, due to the absence of Flash resources in this cluster.

Time to run the application. FVP measures the latency, IOPS and throughput of the virtual machine. To view these performance metrics, select the virtual machine, go to the Monitor tab and select the FVP Performance option. To keep the length of this article to a minimum only the latency performance graphs are shown; an extensive white paper covering this feature will be published soon.

The latency graph shows the VM observed latency and the datastore latency. The VM observed latency is the effective latency seen by the virtual machine; due to the absence of flash, it overlays the datastore latency. I disabled the Local Flash and Network Flash metrics as they were flatlining at the bottom.

[Screenshot: VM observed latency in the monitor-mode FVP cluster]

FVP allows you to determine the native performance the storage array provides to the application. The screenshot above shows the latency observed by the virtual machine, but FVP also allows you to determine the number of IOPS and the throughput, each with a breakdown by Datastore and Flash, by Read and Write, or as a custom view.

Now that the Monitor FVP cluster has provided a native performance baseline, it is time to determine the difference in performance by moving the virtual machine to the FVP cluster that contains flash resources.

[Screenshot: overview of the flash-backed FVP cluster]

To move a virtual machine between FVP clusters, select the destination FVP cluster and select the option Add VMs…. By default the FVP user interface displays only the virtual machines that are not members of other FVP clusters; to view all virtual machines in the vSphere cluster, enable the Show all VMs option in the top left corner. Select the virtual machine (in this exercise, vCenter) and select the appropriate write policy.

[Screenshot: adding the VM to the flash-backed FVP cluster]

It is interesting to know that FVP collects performance history per cluster, host and virtual machine. When selecting the FVP Performance graphs at the virtual machine level, FVP shows all the known data regardless of FVP cluster membership.

At 1:48 the transition between FVP clusters was complete, and the performance graph shows the immediate improvement in latency: the VM observed latency is now closer to the flash latency than to the datastore latency.

[Screenshot: latency graph shortly after the transition between FVP clusters]

By hovering over the data points, a breakdown of the selected metrics is displayed. Although the datastore latency hovers between 3 and 6 milliseconds, the VM observed latency is in the sub-millisecond to low-millisecond range.

[Screenshot: breakdown of latency metrics at a data point]

The FVP Performance graphs can be displayed in different time ranges, from the most recent 10 minutes all the way up to one month.

[Screenshot: time range selection]

These time ranges allow you to run tests over longer periods and still have detailed graphs, helping you characterize workloads running in a non-accelerated environment and the performance gains the FVP platform provides.

[Screenshot: one-hour time range view]