Category: Home Lab

Dying Home Lab – Feedback Welcome

The servers in my home lab are dying on a daily basis. After four years of active duty, I think they have the right to retire. So I need something else. But what? I can’t rent lab space as I work with unreleased ESXi code. I’ve been waiting for the Intel Xeon D 21xx Supermicro systems, but I have the feeling that Elon will reach Mars before we see these systems widely available. The system that I have in mind is the following:

  • Intel Xeon Silver 4108 – 8 cores at 1.8 GHz (85 W TDP)
  • Supermicro X11SPM-TF (6 DIMMs, 2 x 10 GbE)
  • 4 x Kingston Premier 16GB 2133
  • Intel Optane M.2 2280 32 GB

CPU
Intel Xeon Silver 4108, 8 cores. I need a healthy number of cores in my system to run some test workloads, primarily to understand host and cluster scheduling. I do not need to run performance tests, so there is no need for screaming fast CPU cores. It has a TDP of 85 W. I know there is a 4109T with a TDP of 70 W, but it is very hard to get in the Netherlands.

Motherboard
Supermicro X11SPM-TF. Rock-solid Supermicro quality, with 2 x Intel X722 10 GbE NICs onboard and IPMI.

Memory
Kingston Premier 4 x 16 GB 2133 MHz. DDR4 pricing is nearing HP printer ink levels, 2133 MHz is fast enough for my testing, and I don’t need to test 6 channels of RAM at the moment. The motherboard is equipped with 6 DIMM slots, so if memory prices come down, I can expand the system.

Boot Device
Intel Optane M.2 32 GB. ESXi still needs a boot device, but there is no need to put in a 256 GB SSD.

This is the config I’m considering. What do you think? Any recommendations or alternate views?

New Home Lab Hardware – Dual Socket Xeon v4

 

A new challenge – a new system for the home lab

About one year ago my home lab was expanded with a third server and a fresh coat of networking. During that upgrade, which you can read about in “When your Home Lab turns into a Home DC”, I faced the dilemma of adding a new generation of CPU (Haswell) or expanding the hosts with another Ivy Bridge system. This year I’ve expanded my home lab with a dual Xeon system and decided to invest in the latest and greatest hardware available. Like most good tools, you buy them for the next upcoming job, but in the end you use them for countless other projects. I expect the same thing with this year’s home lab ‘augmentation’. Initially, the dual socket system will be used to test and verify the theory published in the upcoming book “vSphere 6 Host resource deep dive” and the accompanying VMworld presentation (session id 8430), but I have a feeling that it’s going to become my default test platform. Besides the dual socket system, the Intel Xeon E5-1650 v2 servers are expanded with more memory and a Micron 9100 PCIe NVMe SSD 1.2 TB flash device. Listed below is the bill of materials of the dual socket system:

Amount | Component | Type | Cost in Euro
2 | CPU | Intel Xeon E5-2630 v4 | 1484 (742 each)
2 | CPU Cooler | Noctua NH-U12DX i4 | 118 (59 each)
1 | Motherboard | Supermicro X10DRi-T | 623
8 | Memory | Kingston KVR24R17D8/16MA – 16 GB 2400 MHz CL17 | 760 (95 each)
1 | Flash Device | Micron 9100 PCIe NVMe SSD 1.2 TB | Sample
1 | Flash Device | Intel SSD DC P3700 PCIe NVMe SSD 400 GB | Sample
1 | Flash Device | Intel SSD DC S3700 100 GB | 170
1 | Ethernet Controller | HP NC365T 4-port Ethernet Server Adapter | 182
1 | Case Fan | Noctua NF-A14 FLX 140 mm | 25
1 | Case | Fractal Design Define XL R2 – Titanium | 135
Total Cost | | | 3497 EUR

Update: I received a lot of questions about the cost of this system. I’ve listed the prices in euros; with today’s EUR/USD exchange rate of 1.1369 it’s about 3976 US dollars.

 

Dual socket configuration

Building a dual socket system is still interesting; for me it brought back the feeling of the times when I built a dual Celeron system. Some might remember the fantastic Abit BP6 motherboard with the Intel 440BX chipset. Did you know today’s virtual machines still use a virtualized 440BX chipset? But I digress. Building a dual socket system is less difficult than in those days, when you had to drill a hole in the CPU to attach it to a specific voltage, but it still has some minor challenges.

Form Factor
A dual-socket design requires more motherboard real estate than a single socket system. There are some dual socket motherboards offered in the popular ATX form factor, but concessions are made by omitting components. Typically this results in a reduced number of PCIe slots or the absolute minimum number of DIMM slots supported by the CPUs. More often, you end up selecting an E-ATX motherboard or, if you really feel adventurous, EE-ATX or a proprietary format of the motherboard manufacturer. I wanted a decent number of PCIe slots as the board needs to fit both the Intel and the Micron NVMe devices as well as the quad 1 GbE NIC. On the plus side, there seems to be a decent number of PC cases available that support the E-ATX format.

One of them is the Fractal Design Define XL R2, but there are many others. I selected it because it shares the same case design as the Define R4 used by all the other servers, only slightly bigger to fit the motherboard. The build quality of this case is some of the best I’ve seen; due to all the noise-reducing material it’s quite heavy. Although the spec sheet states E-ATX support, I assume Fractal only focused on the size of the motherboard. Unfortunately, the chassis does not contain all the mounting holes needed to secure the motherboard properly. In total, 4 of the 10 E-ATX mounting points cannot be used. That should not be a big problem, except that one of the missing ones is crucial: the one in the top left corner. This might lead to problems when installing the DIMMs or the CPUs. I solved it by drilling a hole in the chassis to mount a brass grommet, but next time I would rather go for a different case. The red circles indicate the missing grommets.

Missing_Grommets

Power Supply
Dual socket systems are considered heavy-load configurations and require both 8-pin EPS 12V connectors to be populated. Make sure that your power supply kit contains these connectors. Pictured below are the 12 V 8-pin power connectors located on the motherboard.

8-pins-dual socket-cpu

When researching power supplies, I noticed that other people prefer to use 700 or 1000 watt power supplies. I don’t believe you need to go that extreme. The wattage required all depends on the configuration you want to run. In my design I’m not going to run dual SLI video cards; it will ‘only’ contain the Intel Xeon E5-2630 v4 CPUs, 3 PCIe devices and 8 DDR4 2400 MHz modules. Although that sounds like an extreme configuration already, it’s actually not that bad. Let’s do the math.

With a TDP value of 85 watts each, the CPUs consume a maximum of 170 W. The PCIe devices increase the power requirement to 207 W: according to Intel, the DC P3700 NVMe device consumes 12 W on writes and 9 W on reads, the quad-port Ethernet controller is reported to consume 5 W, and Micron states that the active power consumption of the 9100 1.2 TB PCIe device is between 7 and 21 W. DDR4 DIMM voltage is set to 1.2 V, compared to the 1.5 V DDR3 requires; the reduced voltage likely translates into lower power consumption than a similar DDR3 configuration. Unfortunately, memory vendors do not provide exact power consumption specs. Tom’s Hardware measured the power consumption of 4 DDR4 modules and found that it ranged from 6 to 12 W depending on the manufacturer. Worst case, my 8 DIMMs consume 24 W. As the motherboard is quite large, I assume it consumes a great deal of power as well; Buildcomputers.net states that high-end motherboards consume between 45 and 80 W. As the board features two X540 10 GbE NICs, I add the 12.5 W of power consumption stated by Intel to the overall motherboard consumption, so in my calculation I assume a motherboard consumption of 100 W. The Intel SSD DC S3700 100 GB acting as the boot/OS disk is rated to consume 2.9 W. This totals to a power consumption of roughly 330 W.

There are some assumptions in this math, so I’m playing it safe by using the Corsair RM550 power supply, which provides an output of 550 W at 12 V. The cooling solutions don’t move the needle that much, but for completeness’ sake I’ve included them in the table.

Component | Estimated Active Power Consumption | Vendor Spec
Intel Xeon E5-2630 v4 | 85 W * 2 = 170 W | ark.intel.com
Micron 9100 1.2 TB | 21 W | Micron Product Brief (PDF)
Kingston KVR24R17D8/16MA (8x) | ~24 W | Tom's Hardware review
Intel SSD DC P3700 400 GB | 12 W | ark.intel.com
HP NC365T 4-port Ethernet Server Adapter | 5 W | HP.com
Intel SSD DC S3700 100 GB | 2.9 W | ark.intel.com
Noctua NF-A14 FLX 140 mm fan | 0.96 W | Noctua.at
Noctua NH-U12DX i4 CPU cooler | 0.6 W * 2 = 1.2 W | Noctua.at
Supermicro X10DRi-T | ~80 W | Buildcomputers.net
Intel X540-AT2 dual 10 GbE controller | 12.5 W | ark.intel.com
Total Power Consumption | ~330 W |
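
For those who want to double-check the math, here is a quick back-of-the-envelope verification from any shell, with the fractional values rounded to whole watts:

  # Sum of the estimates in the table above (2.9, 0.96, 1.2 and 12.5 W rounded)
  echo $(( 170 + 21 + 24 + 12 + 5 + 3 + 1 + 1 + 80 + 13 ))   # prints 330
  # Roughly 330 W on a 550 W unit equals about 60% system load
  echo $(( 330 * 100 / 550 ))                                 # prints 60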

One thing you might want to consider is fan noise when buying a “Zero RPM Fan Mode” power supply. Typically these power supplies can operate without using the fan up to a certain percentage of system load and spin up the fan as the load goes higher. With my calculation I operate in the 60% system load range: above the Zero RPM Fan Mode threshold but still in the low noise range, while benefiting from the maximum efficiency of the power supply.

RM550-FAN-NOISE

Cooling
With more power consumption comes a greater need for cooling. And as we all know, power consumption is the path to the dark side: power consumption leads to heat, heat leads to active cooling, active cooling leads to noise, noise leads to suffering. Or something like that. To reduce the noise generated by the home lab, I use Noctua cooling. High quality and low noise; it costs a pretty penny, but as always, quality products cost more. In the UP (uni-processor) servers I use the Noctua NH-U9DX i4, which is an absolute behemoth. Due to the dual CPU setup, I selected the Noctua NH-U12DX i4 CPU cooler, which is specified as a slim design. In retrospect, I could have gone with the NH-U9DX i4 as well. Detailed information on both CPU coolers: https://www.quietpc.com/nh-udxi4

03-Noctua_Dual_Socket_configuration

Please ensure that your choice of cooler supports the LGA 2011-3 socket configuration. According to Wikipedia, the 2011-3 socket uses the so-called Independent Loading Mechanism (ILM) retention device that holds the CPU in place on the motherboard. Two types of ILM exist, with different shapes and heatsink mounting hole patterns: the square ILM (80×80 mm mounting pattern) and the narrow ILM (56×94 mm mounting pattern). Square ILM is the standard type, while the narrow one is alternatively available for space-constrained applications. It’s no surprise that the Supermicro X10DRi-T features the narrow ILM configuration. Noctua ships the coolers with mounting kits for both ILM configurations; check your motherboard specs and the supported ILM configuration of your cooling solution before ordering.

05-Narrow_ILM-vs-Square_ILM

 

Micron 9100 PCIe NVMe SSD 1.2 TB

Recently Micron supplied me with three engineering samples of their soon-to-be-released PCIe NVMe SSD device. According to the product brief, the 1.2 TB version provides 2.8/1.3 GB/s sequential read/write speed at steady state when using a 128 KB transfer size, impressive to say the least. The random read/write performance of the device with 4 KB blocks is 700,000/180,000 IOPS. Remember the times when you were figuring out whether you would get 150 or 180 IOPS out of a 15K spindle disk. 🙂
06-Engineering_sample
I’m planning to use the devices to create an EMC ScaleIO 2.0 architecture. Paired with DFTM and a 10 GbE network, this will be a very interesting setup. Mark Brookfield published an extensive write-up of the ScaleIO 2.0 installation. Expect a couple of blog posts about performance insights on ScaleIO soon.

RAM for the Intel Xeon E5-1650 v2 servers

I’ve purchased some additional RAM to further test the in-memory I/O acceleration of FVP (DFTM) and the impact of various DIMMs-per-channel (DPC) configurations on memory bandwidth. One UP server runs a 2 DPC configuration containing 128 GB of DDR3 1600 MHz RAM, the second UP server also runs 2 DPC but is equipped with 128 GB of DDR3 1866 MHz, and the dual Xeon system runs a 1 DPC configuration with DDR4 2400 MHz RAM. The attentive reader will notice that I’ve over-specced the memory for the Intel Xeon v4, as this CPU supports memory up to 2133 MHz. Apparently 2400 MHz memory is produced in larger volumes than the 2133 MHz equivalent, which makes the 2400 MHz modules cheaper. The motherboard simply clocks the memory down to the supported frequency.

07-Memory_posted_by_BIOS

The various memory configurations will also aid in the development of the FVP and Architect coverage. We recently released FVP 3.5 and Architect 1.1, and this release provides the long-awaited management virtual appliance. Couple that with the FVP Freedom edition (RAM acceleration) and you can speed up your home lab with just a couple of clicks. I will publish an article on this soon.

Home Lab Fundamentals: DNS Reverse Lookup Zones

unless its a time sync issue-750

When starting your home lab, all hints and tips are welcome. The community is full of wisdom, yet sometimes certain topics are taken for granted or are perceived as common knowledge. The Home Lab Fundamentals series focuses on these subjects, helping you avoid common pitfalls that cause headaches and waste incredible amounts of time.

One thing we keep learning about vSphere is that both time and DNS need to be correct. DNS resolution is important to many vSphere components. You can go a long way using only IP addresses in your lab, but at some point you will experience weird behavior, or installs will just stop without any clear explanation. In reality, vSphere is built for professional environments where a proper network structure, physical and logical, is expected to be in place. Reviewing a lot of community questions, blog posts and tweets, it appears that DNS is often only partially set up, i.e. only forward lookup zones are configured. And although that appears to be “just enough DNS” to get things going, many have experienced that their labs start to behave differently when no reverse lookup zones are present. Time-outs or delays become more frequent and the whole environment isn’t snappy anymore. Ill-configured DNS might give you the idea that the software is crap, but in reality it’s the environment that is configured poorly.

When using DNS, apply the four golden rules: forward, reverse, short and full. DNS in a lab environment isn’t difficult to set up, and if you want to simulate a properly working vSphere environment, invest the time in setting up a DNS structure. It’s worth it! Besides expanding your knowledge, your systems will feel more robust and, believe me, you will spend a lot less time waiting for systems to respond.
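
A quick way to verify those four golden rules from any machine that uses your lab DNS server is a set of lookups along these lines (vcsa.homelab.com and 192.168.1.20 are placeholder values, substitute your own names and addresses):

  # Forward lookups: both the short name and the FQDN should resolve to the IP address
  nslookup vcsa
  nslookup vcsa.homelab.com
  # Reverse lookup: the IP address should resolve back to the FQDN (requires a PTR record)
  nslookup 192.168.1.20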

vCenter and DNS
vCenter inventory and search rely heavily on DNS. And since the introduction of the vCenter Single Sign-On (SSO) service as part of the vCenter Server management infrastructure, DNS has become a crucial element. SSO is an authentication broker and security token exchange infrastructure. As described in the KB article Upgrading to vCenter Server 5.5 best practices (2053132):

With vCenter Single Sign-On, local operating system users become far less important than the users in a directory service such as Active Directory. As a result, it is not always possible, or even desirable, to keep local operating system users as authenticated users.

This means that you are somewhat pressured into using an ‘external’ identity source for user authentication, even for your lab environment. One of the most popular configurations is the use of Active Directory as an identity source. Active Directory itself uses DNS as the location mechanism for domain controllers and services. If you have configured SSO to use Microsoft Active Directory for authentication, you might have seen some weird behavior when you haven’t created a reverse DNS lookup zone.

Installation of vCenter Server (Appliance) fails if the FQDN and IP addresses used are not resolvable by the DNS server specified during the deployment process. The vSphere 6.0 Documentation Center vSphere DNS requirements state the following:

Ensure that DNS reverse lookup returns a Fully Qualified Domain Name (FQDN) when queried with the IP address of the host machine on which vCenter Server is installed. When you install or upgrade vCenter Server, the installation or upgrade of the Web server component that supports the vSphere Web Client fails if the installer cannot look up the fully qualified domain name of the vCenter Server host machine from its IP address. Reverse lookup is implemented using PTR records.

Before deploying vCenter, I recommend deploying a virtual machine running a DNS server on the first host. The ESXi Embedded Host Client allows you to deploy a virtual machine on an ESXi host without the need for an operational vCenter first. As I use Active Directory as the identity source for authentication, I deploy a Windows AD server with DNS before deploying the vCenter Server Appliance (VCSA). Tom’s IT Pro has a great article on how to configure DNS on a Windows 2012 server, but if you want to configure a lightweight DNS server running on Linux, follow the steps Brandon Lee has documented. If you want to explore the interesting world of DNS, you can also opt to use Dynamic DNS to automatically register both the VCSA and the ESXi hosts in the DNS server. Dynamic DNS registration is the process by which a DHCP client registers its DNS records with a name server. For more information, please check out William Lam’s article “Does ESXi Support DDNS (Dynamic DNS)?”. Although he published it in 2013, it’s still a valid configuration in ESXi 6.0.

Flexibility of using DNS
Interestingly enough, having a proper DNS structure in place before deploying the virtual infrastructure also provides future flexibility. One of the more annoying time wasters results from using an IP address instead of a Fully Qualified Domain Name (FQDN) during setup of the VCSA. When you use only an IP address during setup, changing the hostname or IP address later will produce this error:

IPv4 configuration for nic0 of this node cannot be edited post deployment.

KB article 2124422 states the following:

Attempting to change the IP address of the VMware vCenter Server Appliance 6.0 fails with the error: IPv4 configuration for nic0 of this node cannot be edited post deployment. (2124422)

This occurs when the VMware vCenter Server Appliance 6.0 is deployed using an IP address. During the initial configuration of the VMware vCenter Server Appliance, the system name is used as the Primary Network Identifier. If the Primary Network Identifier is an IP address, it cannot be changed after deployment.

This is an expected behavior of the VMware vCenter Server Appliance 6.0. To change the IP address for the VMware vCenter Server Appliance 6.0 that was deployed using an IP address, not a Fully Qualified Domain Name, you must redeploy the appliance with the new IP address information.

Changing the hostname will cause the Platform Services Controller (responsible for SSO) to fail. According to KB article 2130599:

Changing the IP address or host name of the vCenter Server or Platform Service controller cause services to fail (2130599)

Changing the Primary Network Identifier (PNID) of the vCenter Server or PSC is currently not supported and will cause the vSphere services to fail to start. If the vCenter Server or PSC has been deployed with an FQDN or IP as the PNID, you will not be able to change this configuration.
To resolve this issue, use one of these options:

  • Revert to a snapshot or backup prior to the IP address or hostname change.
  • Redeploy the vSphere environment.

This means that you cannot change the IP address or the hostname of the vCenter Server Appliance. Yet another reason to deploy a proper DNS structure before deploying the VCSA in your lab.

FQDN and vCenter permissions
Even when you have managed to install vCenter without a reverse lookup zone, the absence of DNS pointer records can obstruct proper permission configuration, according to KB article 2127213:

Unable to add Active Directory users or groups to vCenter Server Appliance or vRealize Automation permissions 

Attempting to browse and add users to the vCenter Server permissions (Local Permissions: Hosts and Clusters > vCenter > Manage > Permissions) (Global Permissions: Administration > Global Permissions) fails with the error:

Cannot load the users for the selected domain

A workaround for this issue is to ensure that all DNS servers have the Reverse Lookup Zone configured as well as Active Directory Domain Controller (AD DC) Pointer (PTR) records present. Please note that allowing domain authentication (assuming AD) on the ESXi host does not automatically add it to an AD-managed DNS zone. You’ll need to manually create the forward lookup record (which also gives you the option to create the reverse lookup record).

SSH session password delay
When running multiple hosts, most of you will recognize the wasted time when (quickly) wanting to log into ESXi via an SSH session. Typically this happens when you start a test and want to monitor esxtop output. You start your SSH session, type ssh root@esxi.homelab.com on the command line to save time, and then have to wait more than 30 seconds to get a password prompt back. Especially funny when you are chasing a VM and DRS decided to move it to another host while you weren’t paying attention. To get rid of this annoying time waster forever:

DNS name resolution using nslookup takes up to 40 seconds on an ESXi host (KB article 2070192)

When you do not have a reverse lookup zone configured, you may experience a delay of several seconds when logging in to hosts via SSH.

When your management machine is not using the same DNS structure, you can apply the quick hack of adding “UseDNS no” to the /etc/ssh/sshd_config file on the ESXi host to avoid the 30-second password delay.
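
As a rough sketch, assuming you have SSH or console access to the host, the change looks something like this (the exact service restart path is worth verifying on your ESXi build):

  # On the ESXi host: stop sshd from doing reverse DNS lookups on incoming sessions
  echo "UseDNS no" >> /etc/ssh/sshd_config
  # Restart the SSH service so the change takes effect
  /etc/init.d/SSH restart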

Troubleshoot DNS
BuildVirtual.net published an excellent article on how to troubleshoot ESXi host DNS and routing related issues. For more information about setting the DNS configuration from the command line, review this section of the VMware vSphere 6.0 Documentation Center.
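
For reference, checking and adjusting the DNS configuration from the ESXi shell goes through the esxcli network ip dns namespace; something along these lines (the server address and search domain below are example values for your own environment):

  # Show the DNS servers and search domains the host currently uses
  esxcli network ip dns server list
  esxcli network ip dns search list
  # Add a DNS server and a search domain (example values)
  esxcli network ip dns server add --server=192.168.1.10
  esxcli network ip dns search add --domain=homelab.com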

vSphere components moving away from DNS
As DNS is an extra dependency, a lot of newer technologies try to avoid incorporating DNS dependencies. One of those is VMware HA: HA has been redesigned and the new FDM architecture avoids DNS dependencies. Unfortunately, not all official VMware documentation has been updated accordingly: https://kb.vmware.com/kb/1003735 states that ESX 5.x also has this problem, but that is not true. Simply put, VMware HA in vSphere 5.x and above does not depend on DNS for operations or configuration.

Home Lab Fundamentals Series:

Up next in this series: vSwitch0 routing

Home Lab Fundamentals: Time Sync

First rule of Home Lab club: don’t talk about time sync! Or so it seems. When starting your home lab, all hints and tips are welcome. The community is full of wisdom, however sometimes certain topics are taken for granted or are perceived as common knowledge. The Home Lab Fundamentals series focuses on these subjects, helping you avoid the most common pitfalls that cause headaches and waste incredible amounts of time. A ‘time-consuming’ pitfall is dealing with improper time synchronization between the various components in your lab environment.

Most often, the need for time synchronization is seen as an enterprise requirement and not really necessary for lab environments, maybe because most people think time synchronization is solely useful for troubleshooting purposes. In some cases this is true, as ensuring correct time notation allows for proper correlation of events. Interestingly enough, that alone should be reason enough to maintain synchronized clocks throughout your lab, but most home labs are simply rebuilt when troubleshooting becomes too time-consuming. However, time sync is about much more than expediting troubleshooting, and ignoring time drift is a straight path down the rabbit hole. Time synchronization utilities such as NTP are necessary to correct the drift introduced by hardware clock drift and guest operating system timekeeping imprecision. When time differs too much between systems, it can lead to installation and authentication errors. Unfortunately, time issues are not always easily identifiable. To provide a great example:

“[400] An error occurred while sending an authentication request to the vCenter Single Sign-On server – An error occurred when processing the metadata during vCenter Single Sign-On setup – null.”

This particular issue occurs due to a time skew between the vCenter Server Appliance 6.0 and the external Platform Services Controller. Here are just a few other examples of what can go wrong in your lab due to time skew issues:

  • Adding a host in vCenter Server fails with the error: Failed to configure the VIM account on the host (1029863) Time skew between ESXi host hardware clock and vCenter Server system time. https://kb.vmware.com/kb/1029863
  • After joining the Virtual Center Server Appliance to a domain you cannot see domain when adding user permissions (2011965): This issue occurs when the time skew between the Virtual Center Server Appliance (VCSA) and a related Domain Controller is greater than 5 minutes. https://kb.vmware.com/kb/2011965
  • Cluster level performance graphs show the most recent value as 0: This metric is susceptible to clock skew between the vSphere Client, vCenter Server, and ESX hosts. If any of the hosts have a skewed clock, the entire cluster shows as 0. https://kb.vmware.com/kb/2009550
  • The vCenter Server Appliance installation fails when connecting to an External Platform Services Controller: This issue occurs when the system time on the system hosting the PSC does not match the time of the system where vCenter Server is installed. https://kb.vmware.com/kb/2128811
  • Configuring the NSX SSO Lookup Service fails (2102041): Connectivity issues between the NSX Manager to vCenter Server due to time skew between NSX Manager and vCenter Server. https://kb.vmware.com/kb/2102041
  • Authentication Errors are Caused by Unsynchronized Clocks: If there is too great a time difference between the KDC and a client requesting tickets, the KDC cannot determine whether the request is legitimate or a replay. Therefore, it is vital that the time on all of the computers on a network be synchronized in order for Kerberos authentication to function properly. https://technet.microsoft.com/en-us/library/cc780011(v=ws.10).aspx

Timekeeping best practices by VMware
Simply put, when weird behavior occurs during setup or authentication, check the time on the various components first. VMware has released multiple Knowledge Base articles and technical documents that contain detailed information and instructions on timekeeping within the various components of the virtual datacenter.

VMware doesn’t provide a separate timekeeping best practices document for vCenter, but it provides multiple guidelines in the vCenter Server Appliance configuration guide. When installing vCenter on a Windows machine, it’s recommended to sync to the PDC emulator within the Active Directory domain. In general, VMware recommends using native time synchronization software, such as the Network Time Protocol (NTP), with the various vSphere components. NTP is typically more accurate than VMware Tools periodic time synchronization and is therefore preferred.

Time synchronization design
There are multiple schools of thought when it comes to time sync in a virtual datacenter. One of the most common designs is to synchronize the virtual datacenter infrastructure components, such as the ESXi hosts and the VCSA, to a collection of external NTP servers, typically provided by http://www.pool.ntp.org/en/ or the US Naval Observatory: http://tycho.usno.navy.mil/NTP/. Windows virtual machines sync their time to the Active Directory domain controller running the PDC emulator FSMO role: Time Synchronization in Active Directory Forests http://social.technet.microsoft.com/wiki/contents/articles/18573.time-synchronization-in-active-directory-forests.aspx It’s recommended to point the ESXi hosts to the same time source as the PDC emulator of the Active Directory domain. When running Linux, best practice is to sync these systems with an NTP server as well.
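
On the ESXi host side, NTP can be configured via the (Web) Client or from the shell. A minimal sketch, assuming pool.ntp.org as the time source (verify the exact commands and file paths against your ESXi build):

  # Point the host at your NTP servers by adding lines like these to /etc/ntp.conf:
  #   server 0.pool.ntp.org
  #   server 1.pool.ntp.org
  # Allow outgoing NTP traffic, enable the daemon at boot and (re)start it
  esxcli network firewall ruleset set --ruleset-id ntpClient --enabled yes
  chkconfig ntpd on
  /etc/init.d/ntpd restart
  # Verify that the host is reaching its time sources
  ntpq -p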

Another widely adopted design is to sync the ESXi hosts to the Active Directory domain controller running the PDC emulator FSMO role. The VCSA timekeeping configuration provides two valid options: NTP and host. In this scenario, select the host option to ensure the time of the host and the VCSA stays in sync. If the VCSA uses a different time source than the ESXi host, a race condition can occur between time sync operations, which can lead to vpxd failing.

01-VMware vCenter Server 6.0 Update 2 Release Notes

Source: VMware vCenter Server 6.0 Update 1b Release Notes: http://pubs.vmware.com/Release_Notes/en/vsphere/60/vsphere-vcenter-server-60u1b-release-notes.html

But the most interesting thing I witnessed, one that can easily become a wild-goose chase, is VMware Tools time synchronization when the time on an ESXi host is incorrect. As described earlier, enabling VMware Tools time sync on virtual machines was a best practice for a long time. The shift towards native time synchronization software led VMware to disable the periodic time synchronization option by default. The keyword in that sentence is PERIODIC. By default, VMware Tools synchronizes time with the host during the following operations:

  • Resuming a suspended virtual machine
  • Migrating a virtual machine using vMotion
  • Taking a snapshot
  • Restoring a snapshot
  • Shrinking the virtual disk
  • Restarting the VMware tools service inside the VM
  • Rebooting the virtual machine

The time synchronization checkbox controls only whether time is periodically resynchronized while the virtual machine is running. Even if this box is unselected, by default VMware Tools synchronizes the virtual machine’s time to the ESXi host clock after the listed events. If the ESXi host time is incorrect, it is likely that “unexplainable” errors will occur. I experienced this behavior after migrating a VM with vMotion: I couldn’t log on to a Windows server as the time skew prevented me from authenticating.

You can either disable these options by adding advanced settings to the VMX file of each VM, or just ensure that the ESXi host syncs its time with a proper external time source. For more information: Disabling Time Synchronization (1189) https://kb.vmware.com/kb/1189.
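
For reference, KB 1189 describes VMX advanced settings along these lines to disable the event-based synchronization as well (check the current version of the KB before applying, as the exact parameter set has changed over VMware Tools releases, and edit the settings with the VM powered off):

  tools.syncTime = "0"
  time.synchronize.continue = "0"
  time.synchronize.restore = "0"
  time.synchronize.resume.disk = "0"
  time.synchronize.shrink = "0"
  time.synchronize.tools.startup = "0"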

No time zone for ESXi
Be aware that as of vSphere 4.1, ESXi hosts are set to Coordinated Universal Time (UTC). UTC is interesting as it’s the successor of Greenwich Mean Time (GMT), but UTC itself is not a time zone; it’s a time standard. There are plenty of articles about UTC, but the key thing to understand is that it never observes Daylight Saving Time. As UTC is not a time zone, you cannot change the time notation in ESXi itself. The vSphere Client, Web Client and HTML5 client automatically display the time in your local time zone and take the UTC setting on the host into account. This isn’t bad behavior, just be aware of it so you don’t freak out when you check the time via the command line.

02-date-esxi-command-line

03-Webclient NTP

CMOS clock
ESXi synchronizes its system time with the hardware clock (CMOS/BIOS/ACPI) of the server if the NTP service is not running on the ESXi host. Supermicro boards allow for NTP synchronization, but most home lab motherboards just provide the time as configured in the BIOS. When the NTP daemon is started on the ESXi host, it synchronizes its system time to the external time source AND it updates the hardware clock as well. I ran a test to verify this behavior. At the time of testing it was 12:37 local time (10:37 UTC); with NTP turned off, I set the time in the BIOS to 6:37 UTC. After booting the machine, the command esxcli system time get confirmed that the ESXi system time was retrieved from the hardware clock. After starting the NTP service, the system time was set to the correct time, 10:37, and the command esxcli hardware clock get demonstrated that NTP also corrected the BIOS time. A quick BIOS check confirmed that esxcli hardware clock get was displaying the BIOS configuration.

04-Insert esxcli screenshot

If your lab is not connected to the internet, confirm the BIOS time with the command esxcli hardware clock get and if necessary use the command esxcli hardware clock set -d (Day) -H (Hour) -m (Minute) -M (Month) -s (Second) -y (Year) to set the correct time.
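
For example, assuming you want to set the hardware clock to 10:37:00 UTC on 21 May 2016 (placeholder values):

  # Check the ESXi system time and the hardware (BIOS) clock
  esxcli system time get
  esxcli hardware clock get
  # If needed, set the hardware clock manually
  esxcli hardware clock set -y 2016 -M 5 -d 21 -H 10 -m 37 -s 0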

Please note that esxcli reports time with the Z (Zulu) notation; Zulu is the military name for UTC.

Raspberry Pi as a Stratum-1 NTP Server
When you have a home lab, you usually face the age-old dilemma of common sense versus ‘exciting new stuff that you might not need but would like to have’. You can update your CMOS clock manually or via a script, you can connect to an array of external NTP servers, or you can build your own Stratum-1 NTP server using a Raspberry Pi with a GPS add-on board.

Up next in this series: Home Lab Fundamentals: Reverse DNS

Monitoring power consumption of your home lab with a smart plug

Home labs are interesting beasts: on the one hand you would love to have all the compute, storage and network power available, on the other hand you do not want a power bill similar to a Google data center’s.

I have a decent setup, with 4 Xeon servers, two Cisco 1 GbE switches, a 10 GbE switch and 3 Synology systems, but I don’t keep everything on all the time. One server acts as the management server, running a Windows DC, the vCenter appliance, the PernixMS server and some other stuff. These machines are always on, not only to save time when I want to use my lab but for increased stability as well. Because of this, my network gear and storage systems are also always on, which made me wonder how much this need for availability and stability costs me on a yearly basis. The big Xeon rigs equipped with multiple PCIe devices are usually shut down after tests because I expect them to consume lots of power. Time to stop guessing and start monitoring. As always, Home Lab Sensei Erik Bussink pointed me to a simple solution: the Edimax SP-2101W Smart Plug Switch. Please leave a comment if you are using a different solution that is a better alternative to this device.

The device
Nothing much to add about the device itself; it is sleek enough that it will not eat up multiple power outlets.
Edimax_SP2101W

The device is managed via an Apple or Android app; the following screenshots are taken from an Apple device. You can monitor it with either your iPhone or iPad, and you can manage multiple smart plugs from one device. As I’ve spread my lab over two power groups, I’ve installed two smart plugs to monitor my home lab.

IMG_0469

Unfortunately, the app doesn’t allow displaying two smart plugs simultaneously; you have to open each individually. The monitor page shows the real-time power consumption registered by the plug, displayed in amps and watts. It’s quite cool to see what happens when you power on devices or even a virtual machine: the monitored server generates a spike of 30 watts when powering on a VM, though it quickly returns to a steady state. Fun to see that ESXi hosts do not consume power at a constant high level.

IMG_0470

The Now button shows the real-time power consumption and the total power consumption registered today, this week and this month. If you provide the price of energy, it also calculates the total cost. Unfortunately, I haven’t found an option to change the currency sign, so you are stuck with the dollar sign.

IMG_0471

Selecting the Usage button provides a chart of the power consumption of that day.
IMG_0472

The app allows you to analyze power consumption trends of your home lab by providing an overview based on 24 hours of data, a week, a month and a full year.

IMG_0473

Conclusion
The smart plugs are a great addition to my home lab: they provide insight into the power consumption and, for me personally, they have removed the reluctance to leave the full lab running. The answer to the question of whether you need a smart plug to run a home lab is, in my opinion, a straight and simple no. You can estimate the cost, or you can just ignore it and pay the bill when it arrives. I’m just curious about these things and it helps clear my conscience.
