Virtual NUMA | The VMware story

NUMA - Non-Uniform Memory Access

NUMA is a term that goes hand in hand with Symmetric Multi-Processing, or SMP. In the traditional processor architecture, all memory accesses go over the same shared memory bus. This works fine with a small number of CPUs, but as the CPU count grows (beyond 8 or 12 CPUs), the CPUs compete for control of the shared bus and cause serious performance issues.

NUMA was introduced to overcome this problem. In a NUMA architecture, CPUs are grouped into nodes, each owning its own memory and I/O, much like a small SMP system; the nodes are connected through advanced memory controllers over high-speed buses. When a process runs on a node and uses memory belonging to that same node, the memory is referred to as local memory. In some scenarios, a node cannot satisfy the memory needs of all its processes, and it has to use memory from other nodes. This is referred to as remote memory. Remote access is slower, and the latency depends on the location of the remote memory.
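To see why locality matters, here is a minimal Python sketch of the effect. The latency figures are made-up illustrative numbers, not measurements from any real system; the point is only that remote accesses drag the average up:

```python
def effective_latency(local_fraction, local_ns=80.0, remote_ns=140.0):
    """Average memory access latency for a workload that satisfies
    `local_fraction` of its accesses from the local NUMA node.
    The 80 ns / 140 ns defaults are illustrative, not measured values."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns

# 100% local vs. 80% local: even 20% remote traffic raises the average.
print(effective_latency(1.0))   # 80.0
print(effective_latency(0.8))   # 92.0
```

With these sample numbers, dropping from 100% to 80% locality already costs about 15% on average access latency, which is why the N%L counter discussed below is worth watching.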

Now the VMware story. From vSphere 5.0 onwards, the NUMA topology is exposed to the VMs so that the guest OS can make effective use of it. This is called vNUMA, or virtual NUMA. By default, vNUMA comes into the picture when a VM has more than 8 vCPUs. A VM gets maximum performance when its memory comes from its home node, i.e. local memory; whenever the VM has to access remote memory, performance starts to degrade. You can check this on an ESXi host using esxtop: in the memory view, each VM has an N%L value, which corresponds to its memory locality. At 100%, the VM is using only local memory; anything below 100% means the VM is partly working from remote memory, and a value below 80% indicates serious NUMA scheduling issues. The NUMA scheduler internally migrates VMs across NUMA nodes to maintain memory locality. It is also important to size your VMs based on the NUMA node size.
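That last sizing rule is easy to sanity-check. Below is a minimal sketch, assuming you already know your host's per-node resources (cores and memory per NUMA node; the example host values are assumptions, not from any particular server). The function flags how many host NUMA nodes a given VM would span:

```python
import math

def vnuma_nodes_needed(vm_vcpus, vm_mem_gb, node_cores, node_mem_gb):
    """Number of host NUMA nodes a VM would span, based purely on its
    vCPU count and memory size. A result of 1 means the VM can be
    scheduled entirely inside one node, giving the best locality."""
    by_cpu = math.ceil(vm_vcpus / node_cores)
    by_mem = math.ceil(vm_mem_gb / node_mem_gb)
    return max(by_cpu, by_mem)

# Assumed host: 2 sockets, 10 cores and 128 GB per NUMA node.
print(vnuma_nodes_needed(8, 96, 10, 128))    # 1 -> fits in one node
print(vnuma_nodes_needed(16, 96, 10, 128))   # 2 -> spans two nodes
```

A VM that spans more than one node is not wrong, but it means some of its memory accesses will inevitably be remote, so keep VMs within a single node whenever the workload allows.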

How many NUMA nodes does my host have? You can check that using esxtop. In fact, it will typically equal the number of physical processors/sockets in the ESXi box.
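If you prefer the command line, `esxcli hardware memory get` also reports a NUMA node count. A small Python sketch that pulls the count out of that output; note the sample text below is an assumed example of the command's output format, not captured from a real host:

```python
def parse_numa_node_count(esxcli_output):
    """Extract the NUMA node count from `esxcli hardware memory get`
    output. The "NUMA Node Count: N" line format is an assumption."""
    for line in esxcli_output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "NUMA Node Count":
            return int(value.strip())
    raise ValueError("NUMA Node Count not found in output")

# Assumed sample output from a two-socket host:
sample = """\
   Physical Memory: 274726912000 Bytes
   Reliable Memory: 0 Bytes
   NUMA Node Count: 2"""
print(parse_numa_node_count(sample))  # 2
```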
