numactl
Control NUMA scheduling and memory placement policy for processes or shared memory
TLDR
Run a command on node 0 with memory allocated on node 0 and 1
Run a command on CPUs (cores) 0-4 and 8-12 of the current cpuset
Run a command with its memory interleaved across all NUMA nodes
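The corresponding invocations look roughly as follows ('command' and 'arguments' are placeholders for the program to run):
numactl --cpunodebind=0 --membind=0,1 -- command arguments
numactl --physcpubind=0-4,8-12 -- command arguments
numactl --interleave=all -- command arguments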
SYNOPSIS
numactl [options] [command [arguments...]]
numactl --hardware
numactl --show
PARAMETERS
--hardware, -H
Reports the NUMA hardware configuration of the system, including available nodes, their memory sizes, and CPU mappings.
--show, -s
Shows the default NUMA policy that would apply to a new process or the current policy for the running process.
--preferred=node, -p node
Sets a preferred node for memory allocation. The system will try to allocate memory from this node first, falling back to other nodes if necessary. Unlike the binding options, this option takes only a single node number (e.g., '0').
--cpunodebind=nodes, -N nodes
Binds the process to the CPUs of the specified NUMA nodes. The process will only run on CPUs belonging to these nodes. nodes can be 'all' or a list/range.
--membind=nodes, -m nodes
Binds memory allocation strictly to the specified NUMA nodes. Memory will only be allocated from these nodes; if they cannot satisfy the request, the allocation fails rather than falling back to other nodes.
--interleave=nodes, -i nodes
Interleaves memory allocation across the specified NUMA nodes. This is useful for workloads that benefit from spreading memory evenly across multiple nodes to maximize bandwidth.
--localalloc, -l
Always allocates memory on the NUMA node where the thread is currently executing. This is typically the default policy for new pages but can be explicitly enforced.
--physcpubind=cpus, -C cpus
Binds the process to a specific set of physical CPUs. cpus can be a comma-separated list or a range (e.g., '0', '0,2', '0-3').
--strict
Enforces strict memory binding. If a memory allocation request cannot be satisfied on the bound nodes, the allocation will fail instead of falling back to other nodes.
--all
Applies the specified NUMA policy to all pages in the current process's address space, not just new allocations. This is typically used with `set-*` options.
--offset=offset, -o offset
Specifies an offset into a shared memory segment for policy application.
--length=length, -L length
Specifies the length of a shared memory segment for policy application, in bytes.
--shmmode=shmmode, -M shmmode
Specifies the numeric permission mode (e.g., 0644) used when creating the shared memory segment to which the policy applies.
--huge
Applies the policy to huge pages. Requires system configuration for huge pages.
--set-preferred=node
Sets the preferred node policy for the current process without executing a new command.
--set-localalloc
Sets the local allocation policy for the current process without executing a new command.
--set-membind=nodes
Sets the memory bind policy for the current process without executing a new command.
--set-interleave=nodes
Sets the interleave policy for the current process without executing a new command.
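As a sketch of how the options above combine on a command line (node and CPU numbers are arbitrary examples and ./my_app is a placeholder):
# Prefer memory from node 1 but allow fallback, and restrict execution to CPUs 4-7
numactl --preferred=1 --physcpubind=4-7 ./my_app
# Run on the CPUs of node 0 and allocate memory only from node 0
numactl --cpunodebind=0 --membind=0 ./my_app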
DESCRIPTION
The numactl command provides a powerful interface for controlling NUMA (Non-Uniform Memory Access) policies for processes or shared memory segments on Linux systems. In NUMA architectures, memory access times can vary depending on which CPU accesses which memory bank. Memory physically closer to a CPU (on the same NUMA node) is faster to access than memory on a different node.
numactl allows users to fine-tune where processes execute (CPU affinity) and from which NUMA nodes memory is allocated (memory affinity). This is critical for optimizing performance in multi-socket server environments, especially for High-Performance Computing (HPC) and large database systems. By intelligently placing processes and their data on the same NUMA node, numactl helps reduce remote memory access latency, thereby improving application throughput and responsiveness.
It can be used to launch a new command under specific NUMA policies, to inspect the system's NUMA topology and the policy currently in effect, or to apply policies to shared memory segments.
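A typical first step is to inspect the machine before choosing a policy; the following commands only report information and change nothing:
# List the available nodes, their CPUs, memory sizes, and inter-node distances
numactl --hardware
# Show the NUMA policy the current process (and any command launched from it) runs under
numactl --show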
CAVEATS
Using numactl effectively requires a good understanding of the system's NUMA topology and the application's memory access patterns. Misconfigured policies can lead to degraded performance or even application failures (especially with --strict). Policies set by numactl are inherited by child processes by default. Ensure your kernel has NUMA support enabled.
NODES AND CPUS SPECIFICATION
Arguments for nodes and cpus can be specified as a comma-separated list (e.g., '0,2,4'), a range (e.g., '0-3'), or a combination thereof. The keyword 'all' selects all nodes or CPUs available in the current cpuset. A leading '+' indicates that the node or CPU numbers are relative to the set allowed in the process's current cpuset (e.g., '+0' for the first allowed node).
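For illustration, using --physcpubind ('command' is a placeholder; the same syntax applies to node arguments):
# Comma-separated list
numactl --physcpubind=0,2,4 -- command
# Range
numactl --physcpubind=0-3 -- command
# Combination of ranges and single CPUs
numactl --physcpubind=0-3,8,10-12 -- command
# All nodes in the current cpuset
numactl --cpunodebind=all -- command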
POLICY INHERITANCE
When numactl is used to launch a command, the specified NUMA policies are inherited by that command and any of its child processes. This behavior is crucial for ensuring that an entire application stack adheres to the desired NUMA placement strategies. Policies can be overridden by child processes if they explicitly call NUMA-aware functions or themselves invoke numactl.
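One way to observe this inheritance is to run numactl --show as the child command; it reports the policy it inherited from the outer invocation rather than the system default:
numactl --interleave=all sh -c 'numactl --show'
Here both the shell and the inner numactl run under the interleave policy set by the outer numactl, so the reported policy is interleave rather than default.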
MEMORY ALLOCATION POLICIES EXPLAINED
Preferred: Attempts to allocate from the specified node first, but will fall back to other nodes if the preferred node is full or unavailable.
Bind: Strictly allocates from the specified nodes. If these nodes cannot satisfy the request, the allocation fails rather than falling back to other nodes.
Interleave: Distributes memory pages evenly across the specified nodes in a round-robin fashion. Useful for applications that access large data sets uniformly across multiple nodes.
Localalloc: Always allocates memory on the NUMA node where the current thread is running. This minimizes latency for data local to the executing CPU.
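Expressed as command lines, the four policies above look like this (./app is a placeholder and the node numbers are examples):
# Preferred: try node 0 first, fall back elsewhere if needed
numactl --preferred=0 ./app
# Bind: allocate only from nodes 0 and 1, fail if they cannot satisfy the request
numactl --membind=0,1 ./app
# Interleave: distribute pages round-robin across nodes 0 and 1
numactl --interleave=0,1 ./app
# Localalloc: allocate on whichever node the allocating thread is running on
numactl --localalloc ./app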
HISTORY
The advent of multi-socket server architectures led to the development of NUMA (Non-Uniform Memory Access) systems, where processors have faster access to local memory than to remote memory. Linux kernel developers gradually introduced robust NUMA awareness and management capabilities.
The numactl utility emerged as a user-space tool to expose these kernel features. It provides a convenient way for system administrators and developers to apply CPU and memory placement policies without modifying application code. Its development paralleled the increasing need for performance optimization in data centers and HPC environments, becoming an essential tool for maximizing resource utilization on NUMA hardware.