
frankdenneman.nl


How to Create a Windows 11 Bootable USB on Mac OS Monterey

December 21, 2022 by frankdenneman

I need to install Windows 11 on a gaming PC, but the only machine in my house is a MacBook, as that is my primary machine for work. To make things worse, doing this on macOS Monterey is extra difficult due to the heightened security settings that prevent you from running unsigned software, which rules out most free tooling. Fortunately, most of the tooling you need is provided by macOS itself; you just have to remember the correct steps. And because this is not a process I do often, I decided to document it for easy retrieval, which might help others facing the same challenge.

As I mentioned, most of the tools are already installed on macOS. The only missing one is the open-source Windows Imaging Library (wimlib). This tool lets you split a particular file (install.wim) that is too large for the file system we use on the USB drive. To install wimlib, you need Homebrew, a package manager for macOS. Some already have it installed, and some don't, so I will include the install command for Homebrew.

Install Homebrew

Open a terminal window and run the following command:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

This can be a lengthy process. My recent-model MacBook Pro took about 6 minutes to complete.

To Install wimlib, run the following command:

brew install wimlib
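If you want an optional sanity check that the install succeeded, you can verify that the wimlib command-line tool is available on your path (the reported version will differ per release):

which wimlib-imagex
wimlib-imagex --version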

Insert a USB drive that is large enough to contain Windows 11. I use a 64GB USB drive so it can also store extra drivers. Newer motherboards are typically equipped with Intel 2.5 GbE NICs, and Windows 11 does not have that driver built in, so you may want to store those drivers on the USB drive as well to be able to continue the Windows installation process.

To discover which disk identifier macOS assigned to the USB drive, run the following command:

diskutil list external

The external option limits the output to external disks, which helps you spot your USB drive easily. In my case, macOS assigned the identifier "disk5" to it. Note the identifier, as we need it for the eraseDisk command.
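Before erasing anything, you can double-check that the identifier really points at your USB drive by inspecting it (disk5 is the identifier from my setup; use yours):

diskutil info disk5

The output lists, among other things, the device name and disk size, which should match your USB drive.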

Erase the USB Drive

The next command erases the USB drive and formats it as MS-DOS (FAT32). Use a Master Boot Record (MBR) scheme, as this is necessary to find all the files during the installation of Windows 11. The command needs a name for the new volume and the disk identifier; in this example, the name is 64GBUSB and the identifier is disk5.

diskutil eraseDisk MS-DOS "64GBUSB" MBR disk5
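To confirm the result, you can list the disk again; it should now show a single MS-DOS (FAT32) volume named 64GBUSB on an MBR partitioning scheme:

diskutil list disk5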

Mount Windows 11 ISO

The next step is to mount the Windows 11 ISO. You can use Finder for that and double-click the file, but as we are doing everything in the terminal, the command to mount the ISO is:

hdiutil mount Win11_22H2_English_x64v1.iso

I executed this command in the directory in which the ISO file is stored. You can execute it from anywhere, but make sure you include the full path to the Windows 11 ISO file.
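The ISO mounts under /Volumes with a volume name that depends on the exact ISO you downloaded; mine mounted as CCCOMA_X64FRE_EN-US_DV9, which is the name used in the commands below. You can list the mounted volumes to confirm yours:

ls /Volumes/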

The main challenge of creating the Windows 11 bootable USB drive is the size of the install.wim file in combination with the MS-DOS (FAT32) format of the drive itself. The install.wim file is larger than 4GB and thus incompatible with the file system. To solve that, you can either compress or split the file using wimlib. The Windows installation process knows how to deal with split files, so splitting is the preferred method; compressing the file increases the duration of the installation process.
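You can see the problem for yourself by checking the size of install.wim on the mounted ISO (adjust the volume name if your ISO mounts under a different name):

ls -lh /Volumes/CCCOMA_X64FRE_EN-US_DV9/sources/install.wim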

Copy files Windows ISO to USB Drive

The first step is to copy over all the files of the Windows 11 ISO that we just mounted, EXCEPT install.wim. The easiest way is using the following rsync command:

rsync -avh --progress --exclude=sources/install.wim /Volumes/CCCOMA_X64FRE_EN-US_DV9/ /Volumes/64GBUSB

The --progress option shows each file's copy progress; the first path is the source (the Windows ISO), and the second path is the destination (the USB drive). The --exclude option tells rsync to skip install.wim during the copy process.
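If you want to preview what rsync is going to copy before writing anything to the USB drive, you can add the -n (dry-run) flag to the same command:

rsync -avhn --progress --exclude=sources/install.wim /Volumes/CCCOMA_X64FRE_EN-US_DV9/ /Volumes/64GBUSB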

Split Install.wim

The last step is to split install.wim into two parts and place them in the sources folder on the USB drive. To do so, execute the following command:

wimlib-imagex split /Volumes/CCCOMA_X64FRE_EN-US_DV9/sources/install.wim /Volumes/64GBUSB/sources/install.swm 4000

The key element of this command is the option "4000", which tells wimlib-imagex to split the file into chunks with a maximum size of 4000 MB. The maximum file size on MS-DOS (FAT32) is just under 4096 MB, so keep the chunk size a little below that limit. You can lower the number further if that makes you more comfortable.
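Once the split completes, the sources folder on the USB drive should contain the split parts, named install.swm, install2.swm, and so on, depending on the size of the original file:

ls -lh /Volumes/64GBUSB/sources/install*.swm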

Once this process is complete, you’re done. You can safely unmount the USB drive and use it to install Windows 11.

diskutil unmountDisk /dev/disk5

Filed Under: Miscellaneous

Unexplored Territory Ep 34 – William Lam Talks Home Labs – Christmas Special

December 21, 2022 by frankdenneman

It's the end of the year, and everybody is winding down from a hectic year, so we wanted to give you some light material to listen to in our last episode. But William had other plans. He is on fire in this episode, dropping one gem after another and sharing a decade's worth of home lab wisdom. We asked William what his top 10 home lab gifts would be, and he came up with gift ideas ranging from stocking stuffers to full-blown systems. Listen via Spotify (https://spoti.fi/3jdOmUp), Apple (https://apple.co/3WpMB50), or online (https://unexploredterritory.tech).

William's Christmas Top 10 Wishlist

Number 10: Velcro cable management

Number 09: Smart Power meter /UPS/ APC Surge Arrest

Number 08: Kubernetes for Administrators,  VDI Design Guide Part 2,  vSAN 7.0 U3 Deep Dive

Number 07: VMUG Advantage Membership

Number 06: Memory Upgrade (64GB memory on the Intel NUC)

Number 05: Thunderbolt Storage

Number 04: USB/Thunderbolt Networking

Number 03: 10GbE Switch (Netgear/Ubiquiti)

Number 02: Intel NUC Serpent Canyon / Supermicro E302-12D

Number 01: Raspberry Pi / Dell Precision 7770 

Articles and solutions discussed during the show:

  • VMware Fling: USB Network Native Driver for ESXi
  • How to build a customizable Raspberry Pi OS Virtual Appliance (OVA)?
  • Stateless ESXi-Arm with Raspberry Pi

Follow us on Twitter for updates and news about upcoming episodes: https://twitter.com/UnexploredPod. Last but not least, make sure to hit that subscribe button, rate wherever possible, and share the episode with your friends and colleagues!

Filed Under: Home Lab, Podcast

Unexplored Territory Podcast 32 – IT giving McLaren Racing the edge

November 28, 2022 by frankdenneman

Edward Green, Head of Commercial Technology at McLaren Racing, keynoted at the VMware Explore tech conference in Barcelona. I had the honor of sitting down with him for a few minutes to talk about the role of IT in F1. Most of us watch the races with multiple screens: the main TV for the race and additional screens for the various telemetry feeds, so you know how much data flows between the cars and the teams. But sitting down with Edward and hearing him explain how data transfer speeds and disk sizes impact real training time for the driver is just candy to the ears of every tech-savvy Formula 1 fan. The episode starts with a short interview with Joe Baguley, the racing CTO of VMware, discussing his involvement with the McLaren Racing partnership and his passion for racing!

Listen to Edward and Joe at

Spotify
Apple
Website

Lando Norris, McLaren MCL36, prepares to head to the grid

Filed Under: Podcast

ML Session at CTEX VMware Explore

November 4, 2022 by frankdenneman

Next week during VMware Explore, the VMware Office of the CTO is organizing the Customer Technical Exchange. I'm presenting the session "vSphere Infrastructure for Machine Learning workloads". I will discuss how vSphere acts as a self-service platform for data science teams to easily and quickly deploy ML platforms with acceleration resources.

CTEX is happening on the 8th and 9th of November at the Fira Barcelona Gran Via in room CC4 4.2. This is an NDA event. Therefore, you will need to register via

https://via.vmw.com/CTEXExploreEurope2022-Register.

Filed Under: AI & ML

vSphere 8 CPU Topology for Large Memory Footprint VMs Exceeding NUMA Boundaries

November 3, 2022 by frankdenneman

By default, vSphere manages the vCPU configuration and vNUMA topology automatically. vSphere attempts to keep the VM within a NUMA node until the vCPU count of that VM exceeds the number of physical cores inside a single CPU socket of that particular host. For example, my lab has dual-socket ESXi hosts, and each host has 20 processor cores per socket. As a result, vSphere creates a VM with a uniform memory access (UMA) topology up to a vCPU count of 20. Once I assign 21 vCPUs, it creates a vNUMA topology with two virtual NUMA nodes and exposes this to the guest OS for further memory optimization.
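If you want to check the physical core count per socket of your own hosts, esxcli reports it from an SSH session on the ESXi host (the numbers in the comment reflect my lab configuration):

esxcli hardware cpu global get
# The output lists CPU Packages (sockets), CPU Cores, and CPU Threads.
# Cores per socket = CPU Cores / CPU Packages; in my lab, 40 / 2 = 20.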

You might have noticed that vNUMA topology sizing is structured around vCPU and physical core counts. But what happens when the virtual machine fits inside a NUMA node with its vCPU configuration, i.e., has fewer vCPUs than the CPU has physical cores, yet requires more memory than the NUMA node can provide? In other words, the VM's memory configuration exceeds the local memory capacity of the NUMA node.

As a test, I created a VM with 12 vCPUs and 384GB of memory. The vCPU configuration fits within a single NUMA node (12 < 20), but the memory configuration of 384GB exceeds the 256GB capacity of each NUMA node.

By default, vSphere creates the vCPU topology and exposes a unified memory address space to the guest OS. For scheduling purposes, it creates two separate scheduling constructs to allocate the physical memory from both NUMA nodes but does not expose that information to the guest OS. This situation leads to inconsistent performance, as the guest OS simply allocates memory from the beginning of the memory range to the end without knowing anything about its physical origins. As a test, I ran an application that allocates 380GB. As all the vCPUs run in a single NUMA node, the memory scheduler does its best to allocate the memory as close to the vCPUs as possible. As a result, the memory scheduler allocates 226GB locally (237081912 KB by vm.8823431) while having to allocate the rest (155GB) from the remote NUMA node.
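You can observe this placement on the ESXi host with esxtop, which is a quick way to see local versus remote memory per NUMA client:

esxtop
# Press 'm' for the memory view, then 'f' to add fields and enable the NUMA statistics.
# The NHN column shows the home node(s), NLMEM/NRMEM show local and remote memory,
# and N%L shows the percentage of memory that is local to the NUMA client.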

Local NUMA node latency on Intel hovers around 73ns for a Xeon v4 and 89ns for the Skylake generation, while remote memory is about 130ns for a v4 and 139ns for Skylake. On AMD EPYC, local memory latency is 129ns and remote memory is 205ns. Relative to local access, that is a latency penalty of roughly 78% on a v4 ((130-73)/73), 56% on Skylake, and 59% on EPYC. That means, in this case, the application fetches roughly 40% of its memory (155GB out of 380GB) with that remote-access latency penalty.

Another problem with this setup is that the VM is now quite an unbalanced noisy neighbor. There is little you can do about monster VMs in your environment, but this VM also has an unbalanced memory footprint: it eats up most of the memory of NUMA node 0. I would rather see a more balanced footprint, so other workloads can utilize the remaining cores on that NUMA node and still enjoy local memory access. In this scenario, new virtual machines scheduled alongside this VM from a vCPU perspective are forced to retrieve their memory from the other NUMA node.

To solve this problem, we used to set the advanced setting "numa.consolidate = FALSE" in vSphere 7 and older versions. vSphere 8 provides the option to configure the vCPU topology, and specifically the vNUMA topology, from the UI, which solves the aforementioned problem. In the VM Options of the VM configuration, vSphere 8 includes the option CPU Topology.

By default, the Cores per Socket and NUMA Nodes settings are "assigned at power on," which is the recommended setting for most workloads. In our case, we want to change it, and we first have to change the Cores per Socket setting before we can adjust the NUMA Nodes setting. It is best to distribute the vCPUs equally across the physical NUMA nodes so that each group of vCPUs can allocate an equal amount of memory capacity. In our test case, the VM configuration is set to 6 cores per socket, creating two vSockets.

The next step is to configure the virtual NUMA nodes. Aligning this with the underlying physical configuration helps to get the best predictable and consistent performance behavior for the virtual machine. In our case, the VM is configured with two NUMA nodes.

After the virtual machine is powered on, I opened an SSH session to the ESXi host and ran the following command:

vmdumper -l | cut -d \/ -f 2-5 | while read path; do egrep -oi "DICT.*(displayname.*|numa.*|cores.*|vcpu.*|memsize.*|affinity.*)= .*|numa:.*|numaHost:.*" "/$path/vmware.log"; echo -e; done

That provided me with the following output; notice these lines in particular:
cpuid.coresPerSocket = "6"
numa.vcpu.coresPerNode = "6"
numaHost 12 VCPUs 2 VPDs 2 PPDs

It shows that the VM is configured with 12 vCPUs, 6 cores per socket, and 6 vCPUs per NUMA node. At the ESXi scheduling level, the NUMA scheduler creates two scheduling constructs. A virtual proximity domain (VPD) is the construct used to expose a CPU topology to the guest OS. You can see that two VPDs were created for this VM: VPD 0 contains vCPU 0 to vCPU 5, and VPD 1 contains vCPU 6 to vCPU 11. A physical proximity domain (PPD) is the construct the NUMA scheduler uses to place a group of vCPUs on a physical NUMA node.

To check whether everything worked, I looked at the Windows Task Manager and enabled the NUMA view, which now shows two NUMA nodes.

Running the memory test again on the virtual machine shows a different behavior at the ESXi level. Memory consumption is far more balanced. The VM's NUMA client on NUMA node 0 consumes 192GB (201326592 KB), while the NUMA client on NUMA node 1 consumes 189GB (198604800 KB).

The vSphere 8 CPU Topology setting is a very nice method of managing VM configurations that are special cases. Instead of having to set advanced settings, a straightforward UI that can easily be understood by any team member who wasn't involved in configuring the VM is a big step forward. But I want to stress once more that the default setting is best for most virtual machines. Do not use this setting as the standard configuration for your virtual machine real estate. Keep it set to default and enjoy our 20 years of NUMA engineering. We've got your back most of the time. And for some of the outliers, we now have this brilliant UI.

Filed Under: NUMA, Uncategorized

