Category: CPU

NUMA Deep Dive Part 5: ESXi VMkernel NUMA Constructs

ESXi Server is optimized for NUMA systems and contains a NUMA scheduler and a CPU scheduler. When ESXi runs on a NUMA platform, the VMkernel activates the NUMA scheduler. The primary role of the NUMA scheduler is to optimize the CPU and memory allocation of virtual machines by managing the...

NUMA Deep Dive Part 4: Local Memory Optimization

If a cache miss occurs, the memory controller responsible for that memory line retrieves the data from RAM. Fetching data from local memory could take 190 cycles, while it could take the CPU a whopping 310 cycles to load the data from remote memory. Creating a NUMA architecture that provides...

NUMA Deep Dive Part 3: Cache Coherency

When people talk about NUMA, most talk about the RAM and the core count of the physical CPU. Unfortunately, the importance of cache coherency in this architecture is mostly ignored. Locating memory close to CPUs increases scalability and reduces latency if data locality occurs. However, a great deal of the...

NUMA Deep Dive Part 2: System Architecture

Reviewing the physical layers helps to understand the behavior of the VMkernel CPU scheduler, which in turn helps to select a physical configuration that is optimized for performance. This part covers the Intel Xeon microarchitecture and zooms in on the Uncore, primarily focusing on Uncore frequency management and QPI design...

NUMA Deep Dive Part 1: From UMA to NUMA

Non-uniform memory access (NUMA) is a shared memory architecture used in today's multiprocessing systems. Each CPU is assigned its own local memory and can access memory from other CPUs in the system. Local memory access provides low latency and high bandwidth. While accessing memory owned by the other CPU...