accelerate
accelerate is not a standard Linux command; it is the command-line interface installed with Hugging Face's Accelerate Python library for launching PyTorch training scripts on distributed hardware.
TLDR
Print environment information:
accelerate env
Interactively create a configuration file:
accelerate config
Print the estimated GPU memory cost of running a Hugging Face model with different data types:
accelerate estimate-memory name/model
Test an Accelerate configuration file:
accelerate test --config_file path/to/config.yaml
Run a model on CPU with Accelerate:
accelerate launch --cpu path/to/script.py
Run a model on multi-GPU with Accelerate, with 2 machines:
accelerate launch --multi_gpu --num_machines 2 path/to/script.py
SYNOPSIS
accelerate <command> [options] [args]...
accelerate launch [--config_file FILE] [--num_processes N] [--num_machines N] [--machine_rank RANK] [--main_process_port PORT] [--mixed_precision {no,fp16,bf16}] [--use_deepspeed] script.py [script_args...]
PARAMETERS
launch
Launch accelerated training script with distributed setup.
config
Interactive config wizard or load/save config file.
env
Display current Accelerate environment details.
check
Validate config and environment compatibility.
--config_file
Path to Accelerate config YAML file.
--num_processes
Total number of processes to launch in parallel, typically one per GPU (default: auto-detected).
--num_machines
Number of machines in multi-node setup (default: 1).
--machine_rank
Current machine rank (default: 0).
--main_training_function
Main function name in script (default: main).
--mixed_precision
Precision mode: no, fp16, or bf16.
--use_deepspeed
Enable DeepSpeed integration.
--multi_gpu
Force a distributed multi-GPU launch (normally inferred from the config file or detected hardware).
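As an illustration, a two-machine launch combining several of these options might look like the following sketch; the address, port, process count, and script name are placeholders:
accelerate launch --num_machines 2 --machine_rank 0 --main_process_ip 10.0.0.1 --main_process_port 29500 --num_processes 8 --mixed_precision bf16 train.py
The same command is repeated on the second machine with --machine_rank 1.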
DESCRIPTION
The accelerate command is the CLI tool for Hugging Face's Accelerate library, which streamlines training PyTorch models on multiple GPUs, TPUs, CPUs, or clusters. It handles device mapping, mixed precision, gradient accumulation, and backends like DeepSpeed or FSDP without changing core training code.
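Inside a script, the integration amounts to a few calls on an Accelerator object. A minimal sketch of a training loop, where the linear model and random data are illustrative stand-ins for your own:
import torch
from accelerate import Accelerator
from torch.utils.data import DataLoader, TensorDataset

accelerator = Accelerator()                     # picks up settings from accelerate launch/config
model = torch.nn.Linear(10, 1)                  # toy model; substitute your own
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loader = DataLoader(TensorDataset(torch.randn(32, 10), torch.randn(32, 1)), batch_size=8)

# prepare() wraps each object for the configured devices and backend
model, optimizer, loader = accelerator.prepare(model, optimizer, loader)
for x, y in loader:
    loss = torch.nn.functional.mse_loss(model(x), y)
    accelerator.backward(loss)                  # replaces loss.backward()
    optimizer.step()
    optimizer.zero_grad()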
Users configure setups via accelerate config, inspect environments with accelerate env, and launch scripts using accelerate launch. This abstracts distributed training complexities, enabling seamless scaling from single-device prototyping to multi-node production. Ideal for NLP, vision, and large models, it reduces boilerplate and boosts efficiency.
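The config wizard stores its answers in a YAML file, by default under ~/.cache/huggingface/accelerate/. An illustrative, not exhaustive, single-node multi-GPU configuration might resemble:
compute_environment: LOCAL_MACHINE
distributed_type: MULTI_GPU
machine_rank: 0
main_training_function: main
mixed_precision: fp16
num_machines: 1
num_processes: 4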
Installing the package with pip install accelerate adds the command. It integrates tightly with the Transformers library but also works standalone.
CAVEATS
Requires Python 3.8+, PyTorch 1.10+, and working GPU drivers/CUDA where applicable (recent releases raise these minimums). Not a core Linux utility; install via pip. A configuration that does not match the actual hardware can fail silently. TPU support is more limited than GPU support and varies by cloud platform.
INSTALLATION
pip install accelerate
pip install -U accelerate    (upgrade an existing installation)
accelerate config            (run once after installing to generate a configuration)
BASIC EXAMPLE
accelerate launch --num_processes 4 train.py
Runs train.py across 4 processes, wrapping the model in DistributedDataParallel automatically.
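A variant that points at a saved configuration and overrides its precision; my_config.yaml is a placeholder for a file produced by accelerate config:
accelerate launch --config_file my_config.yaml --mixed_precision fp16 train.py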
HISTORY
Released by Hugging Face in 2021 (v0.1.0) to unify multi-GPU training setups. It has since evolved alongside PyTorch, adding PyTorch 2.0 support and deeper FSDP and DeepSpeed integration. Widely adopted in the Transformers ecosystem.
SEE ALSO
torchrun(1), mpirun(1), deepspeed(1)


