nvidia-smi-mig
Manage NVIDIA MIG partitions
TLDR
Create a compute instance on GPU 0
List GPU instances
Display help
SYNOPSIS
nvidia-smi mig [command] [options]
nvidia-smi mig -lgi [-i GPU_ID]
nvidia-smi mig -lci [-i GPU_ID] [-gi GI_ID]
nvidia-smi mig -lgip [-i GPU_ID]
nvidia-smi mig -lgipp [-i GPU_ID]
nvidia-smi mig -lcip [-i GPU_ID] [-gi GI_ID]
nvidia-smi mig -cgi GI_PROFILE[,GI_PROFILE...] [-C] [-i GPU_ID]
nvidia-smi mig -dgi [-i GPU_ID] [-gi GI_ID]
nvidia-smi mig -cci [CI_PROFILE[,CI_PROFILE...]] [-i GPU_ID] [-gi GI_ID]
nvidia-smi mig -dci [-i GPU_ID] [-gi GI_ID] [-ci CI_ID]
nvidia-smi -mig MODE [-i GPU_ID]
PARAMETERS
-h, --help
Displays help for the `mig` subcommand.
-lgi, --list-gpu-instances
Lists all GPU Instances (GIs) on the system or on the specified GPU.
-lci, --list-compute-instances
Lists all Compute Instances (CIs) on the system, or on the specified GPU or GPU Instance.
-lgip, --list-gpu-instance-profiles
Lists the GPU Instance profiles supported by a GPU, including their profile IDs.
-lgipp, --list-gpu-instance-possible-placements
Lists the placements at which GPU Instances can be created.
-lcip, --list-compute-instance-profiles
Lists the Compute Instance profiles supported by a GPU or by a parent GPU Instance.
-cgi, --create-gpu-instance PROFILE[,PROFILE...]
Creates one GPU Instance per listed profile. Profiles can be given by name or ID; use `-lgip` to find them.
-C, --default-compute-instance
Used together with `-cgi`: also creates a Compute Instance with the default profile in each new GPU Instance.
-dgi, --destroy-gpu-instance
Destroys GPU Instances. Restrict the operation with `-i` and `-gi`; use `-lgi` to find existing GI IDs.
-cci, --create-compute-instance [PROFILE[,PROFILE...]]
Creates Compute Instances using the given profile names or IDs, or the default profile if none is given. Use `-lcip` to find available profiles.
-dci, --destroy-compute-instance
Destroys Compute Instances. Restrict the operation with `-i`, `-gi`, and `-ci`; use `-lci` to find existing CI IDs.
-i, --id GPU_ID
Specifies the target GPU by its 0-based enumeration index (e.g., 0, 1), PCI bus ID, or UUID. Separate multiple GPUs with commas.
-gi, --gpu-instance-id GI_ID
Specifies the GPU Instance to operate on, for example the parent GI for Compute Instance operations. Separate multiple IDs with commas.
-ci, --compute-instance-id CI_ID
Specifies the Compute Instance to operate on. Separate multiple IDs with commas.
-mig, --multi-instance-gpu MODE
An option of `nvidia-smi` itself, not of the `mig` subcommand. Sets the MIG mode of a GPU; MODE is 1 (ENABLED) or 0 (DISABLED). The change takes effect after a GPU reset.
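A typical partitioning session can be sketched as a few small shell functions. This is an illustrative sketch, not an official script: it uses the short option names printed by `nvidia-smi mig -h` (`-lgip`, `-cgi`, `-C`, `-lgi`, `-lci`, `-dci`, `-dgi`), and the function names and the `NVIDIA_SMI` override variable are hypothetical helpers introduced here.

```shell
#!/bin/sh
# Sketch of a MIG partitioning workflow on a single GPU.
# NVIDIA_SMI is a hypothetical override: set NVIDIA_SMI=echo to preview the
# commands on a machine without a MIG-capable GPU.
: "${NVIDIA_SMI:=nvidia-smi}"

# List the GPU Instance profiles the GPU supports.
list_gi_profiles() {        # usage: list_gi_profiles GPU_ID
    "$NVIDIA_SMI" mig -i "$1" -lgip
}

# Create a GPU Instance from a profile, plus a default Compute Instance in it.
create_gi_with_ci() {       # usage: create_gi_with_ci GPU_ID GI_PROFILE
    "$NVIDIA_SMI" mig -i "$1" -cgi "$2" -C
}

# Show the GPU Instances and Compute Instances that currently exist.
list_instances() {          # usage: list_instances GPU_ID
    "$NVIDIA_SMI" mig -i "$1" -lgi
    "$NVIDIA_SMI" mig -i "$1" -lci
}

# Tear everything down: Compute Instances first, then GPU Instances.
destroy_all() {             # usage: destroy_all GPU_ID
    "$NVIDIA_SMI" mig -i "$1" -dci && "$NVIDIA_SMI" mig -i "$1" -dgi
}
```

For example, `create_gi_with_ci 0 9` would run `nvidia-smi mig -i 0 -cgi 9 -C`. The profile ID 9 is only a placeholder; IDs differ between GPU models, so check the output of `list_gi_profiles` first.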
DESCRIPTION
The `nvidia-smi mig` command is a subcommand of `nvidia-smi` used to manage NVIDIA's Multi-Instance GPU (MIG) feature. MIG allows a single supported GPU (Ampere architecture or newer, such as the A100) to be partitioned into up to seven independent GPU Instances (GIs), each of which can be further divided into Compute Instances (CIs). Each GI has its own dedicated memory, cache, and compute cores, providing fault isolation and guaranteed quality of service (QoS). CIs within the same GI have their own compute cores but share that GI's memory.
This command facilitates the creation, deletion, and listing of these MIG objects, enabling users to optimize GPU utilization by running diverse workloads concurrently with strong isolation. It's crucial for cloud providers and AI/ML researchers who need to partition high-end GPUs for multiple users or applications.
CAVEATS
The MIG feature is only available on supported NVIDIA data center GPUs based on the Ampere architecture (e.g., A100) or newer. MIG mode must be enabled on the GPU before any instance can be created. A change of MIG mode takes effect only after a GPU reset, and on some systems a reboot is needed instead; either one disrupts workloads running on that GPU. Creating and destroying instances does not require a reset, but an instance cannot be destroyed while it is in use. Changing MIG mode and creating or destroying instances require root privileges; the listing commands generally do not.
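Whether a mode change is still waiting for a reset can be checked from a script. The sketch below assumes the `-mig` option of `nvidia-smi` and the `mig.mode.current` and `mig.mode.pending` query fields; the function names and the `NVIDIA_SMI` override are hypothetical helpers introduced for illustration.

```shell
#!/bin/sh
# Sketch: request MIG mode and detect a pending GPU reset.
# NVIDIA_SMI is a hypothetical override (e.g. NVIDIA_SMI=echo for a dry run).
: "${NVIDIA_SMI:=nvidia-smi}"

# Request a MIG mode on one GPU (needs root). MODE is 1 (enable) or 0 (disable).
set_mig_mode() {            # usage: set_mig_mode GPU_ID MODE
    "$NVIDIA_SMI" -i "$1" -mig "$2"
}

# Print one CSV line of the form "CURRENT, PENDING" for the GPU.
query_mig_mode() {          # usage: query_mig_mode GPU_ID
    "$NVIDIA_SMI" -i "$1" \
        --query-gpu=mig.mode.current,mig.mode.pending \
        --format=csv,noheader
}

# Return 0 if the pending mode differs from the current one, which means a
# GPU reset (or reboot) is still required. Takes a line from query_mig_mode.
mig_reset_pending() {       # usage: mig_reset_pending "CURRENT, PENDING"
    current=${1%%,*}
    pending=${1#*,}
    pending=${pending# }    # drop the space that follows the comma
    [ "$current" != "$pending" ]
}
```

A caller could combine them as `mig_reset_pending "$(query_mig_mode 0)" && echo "reset GPU 0 to apply MIG mode"`.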
NOTE ON COMMAND NAME
The command discussed here is `nvidia-smi mig`, a subcommand of the `nvidia-smi` utility. Some references write it as `nvidia-smi-mig`, but no executable of that name exists; the subcommand is invoked by passing `mig` as the first argument to `nvidia-smi`.
GPU INSTANCE (GI) AND COMPUTE INSTANCE (CI) PROFILES
MIG uses predefined profiles that determine the memory, compute, and video decode/encode resources allocated to each instance. GPU Instance profiles define how the physical GPU is partitioned, while Compute Instance profiles define how the compute resources inside an existing GPU Instance are divided. GPU Instance profile names such as `3g.20gb` encode the number of compute slices and the amount of memory in GB. Users select profiles based on their workload requirements and the resources still available on the GPU.
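The relationship between the two profile levels can be sketched in shell. This is an illustration under stated assumptions: the `-gi` option selects the parent GPU Instance, `-lcip` lists Compute Instance profiles, and GPU Instance profile names follow the `<slices>g.<memory>gb` pattern (for example `3g.20gb`). All function names and the `NVIDIA_SMI` override are hypothetical.

```shell
#!/bin/sh
# Sketch: working with GPU Instance and Compute Instance profiles.
# NVIDIA_SMI is a hypothetical override (e.g. NVIDIA_SMI=echo for a dry run).
: "${NVIDIA_SMI:=nvidia-smi}"

# Split a GPU Instance profile name such as "3g.20gb" into
# "<compute slices> <memory in GB>". Compute Instance names are not handled.
parse_gi_profile() {        # usage: parse_gi_profile NAME
    case $1 in
        [0-9]*g.[0-9]*gb*) ;;
        *) return 1 ;;      # not of the form <N>g.<M>gb
    esac
    slices=${1%%g.*}        # text before the first "g."
    rest=${1#*g.}           # text after the first "g."
    mem=${rest%%gb*}        # text before "gb"
    echo "$slices $mem"
}

# List the Compute Instance profiles available inside one GPU Instance.
list_ci_profiles() {        # usage: list_ci_profiles GPU_ID GI_ID
    "$NVIDIA_SMI" mig -i "$1" -gi "$2" -lcip
}

# Create a Compute Instance with the given profile inside one GPU Instance.
create_ci() {               # usage: create_ci GPU_ID GI_ID CI_PROFILE
    "$NVIDIA_SMI" mig -i "$1" -gi "$2" -cci "$3"
}
```

Here `parse_gi_profile 3g.20gb` yields three compute slices and 20 GB of memory, which is a quick way to check whether a set of requested profiles could fit on a seven-slice GPU.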
DEVICE FILES
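Once a GPU Instance contains at least one Compute Instance, CUDA applications can use it as a separate, independent device. `nvidia-smi -L` lists each such MIG device with a UUID that starts with `MIG-`, and an application is usually directed to one of them by setting `CUDA_VISIBLE_DEVICES` to that UUID. On Linux, access to MIG instances is controlled through capability device nodes under /dev/nvidia-caps/ rather than through one /dev/nvidia* file per instance. This allows applications to target specific MIG instances much as if they were distinct physical GPUs, simplifying workload deployment.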
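Targeting a specific MIG device from a script can be sketched as follows. The example assumes that `nvidia-smi -L` prints MIG devices on lines containing `(UUID: MIG-...)` and that CUDA honours a MIG UUID in `CUDA_VISIBLE_DEVICES`; the function names and the `NVIDIA_SMI` override are hypothetical helpers.

```shell
#!/bin/sh
# Sketch: pick a MIG device by UUID and run a command on it.
# NVIDIA_SMI is a hypothetical override for testing without a GPU.
: "${NVIDIA_SMI:=nvidia-smi}"

# Read `nvidia-smi -L` style text on stdin and print one MIG UUID per line.
# Lines for physical GPUs (UUID: GPU-...) are skipped.
mig_uuids() {
    sed -n 's/.*(UUID: \(MIG-[^)]*\)).*/\1/p'
}

# Run a command restricted to the first MIG device that is found.
run_on_first_mig() {        # usage: run_on_first_mig COMMAND [ARGS...]
    uuid=$("$NVIDIA_SMI" -L | mig_uuids | head -n 1)
    if [ -z "$uuid" ]; then
        echo "no MIG device found" >&2
        return 1
    fi
    CUDA_VISIBLE_DEVICES=$uuid "$@"
}
```

For instance, `run_on_first_mig ./my_cuda_app` (where `./my_cuda_app` stands for any CUDA program) would start the program with only the first MIG device visible to it.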
HISTORY
Multi-Instance GPU (MIG) technology was introduced by NVIDIA with its Ampere architecture and first shipped with the A100 GPU in 2020. The `mig` subcommand was added to the `nvidia-smi` utility to provide a command-line interface for managing GPU partitions, and it has continued to evolve with newer GPU generations.
SEE ALSO
nvidia-smi(1), nvcc(1), nvidia-persistenced(8)