cloud-init
Initialize cloud instances on first boot
TLDR
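Display the status of the most recent cloud-init run:
cloud-init status
Wait for cloud-init to finish running and then report status:
cloud-init status --wait
List available top-level metadata keys to query:
cloud-init query --list-keys
Query cached instance metadata for data:
cloud-init query <dot_delimited_variable_path>
Clean logs and artifacts to allow cloud-init to rerun:
sudo cloud-init clean --logs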
SYNOPSIS
cloud-init [OPTIONS] SUBCOMMAND [SUBCOMMAND_OPTIONS]
PARAMETERS
--debug
Enable debug logging, showing more verbose output for troubleshooting.
--version
Show program's version number and exit.
--force
Force running even if no datasource is found. Use at your own risk.
status [-w|--wait] [--format=json]
Display the current status of cloud-init execution. The -w or --wait option waits for cloud-init to complete, while --format=json provides output in JSON format.
clean [-l|--logs] [-r|--reboot]
Removes cloud-init's state and artifacts so that it runs again on the next boot. Use with caution. The --logs option also removes cloud-init's log files, and --reboot reboots the system after cleaning.
analyze [show|dump|blame]
Tools to analyze cloud-init's boot performance from its logs. show prints the recorded events ordered by stage, blame lists operations from slowest to fastest, and dump outputs the collected events as JSON.
collect-logs [-t|--tarfile TARFILE]
Collects cloud-init logs and related system information into a single tar archive for debugging purposes. The -t or --tarfile option sets the path of the archive; by default it is written to cloud-init.tar.gz in the current directory.
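The status and collect-logs subcommands can be combined in a provisioning script. The following is a minimal sketch, not an official recipe: the function name wait_or_collect and the CLOUD_INIT override variable are made up for illustration, and the sketch assumes that cloud-init status --wait exits non-zero when cloud-init reports a problem.

```shell
# Hypothetical helper: wait for cloud-init, and gather logs if it reports a problem.
# CLOUD_INIT lets the binary be swapped out (for example in tests).
wait_or_collect() {
    tarfile="${1:-cloud-init.tar.gz}"
    if "${CLOUD_INIT:-cloud-init}" status --wait; then
        echo "cloud-init finished"
    else
        # Assumption: a non-zero exit status means cloud-init reported a problem.
        echo "cloud-init reported a problem; collecting logs into $tarfile" >&2
        "${CLOUD_INIT:-cloud-init}" collect-logs --tarfile "$tarfile"
        return 1
    fi
}
```

Calling wait_or_collect /tmp/ci-debug.tar.gz on an instance would block until initialization ends. Collecting logs generally requires root privileges.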
DESCRIPTION
cloud-init is a widely adopted open-source package that handles early initialization of cloud instances. It runs early in the boot of a virtual machine or container and applies most of its configuration once per instance, on first boot, so that the instance is ready for use. Its primary job is to act on the "user data" passed to the instance through the cloud provider, which can be a shell script, a cloud-config YAML file, or another supported format, in order to set up users, install packages, configure networking, inject SSH keys, and run arbitrary commands. It works in distinct stages so that configuration is applied in the correct order. Because it abstracts the underlying platform, cloud-init offers a consistent way to initialize instances across environments such as AWS, Azure, Google Cloud, and OpenStack, which simplifies automation and infrastructure-as-code deployments.
CAVEATS
cloud-init runs on every boot, but most of its configuration is applied only once per instance, on first boot. Its state can be reset with cloud-init clean so that everything runs again; this is mainly useful for testing and should be approached with caution in production, where it can cause unintended reconfiguration. Debugging can be difficult because cloud-init runs early in boot and has a complex module system. Correct user data formatting and syntax are critical for successful configuration.
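When debugging, a common first step is to scan the cloud-init log for warnings and errors. The sketch below assumes the usual default log location, /var/log/cloud-init.log, and that log lines carry a level name such as WARNING or ERROR; the function name show_cloud_init_problems is hypothetical.

```shell
# Hypothetical helper: print log lines that look like warnings, errors, or tracebacks.
show_cloud_init_problems() {
    # Assumption: the log lives at the common default path unless one is given.
    log="${1:-/var/log/cloud-init.log}"
    grep -E 'WARNING|ERROR|Traceback' "$log"
}
```

Output from user scripts and commands is typically captured separately in /var/log/cloud-init-output.log, which can be passed as the argument instead.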
USER DATA
User data is the primary input mechanism for cloud-init. Cloud providers let users pass arbitrary text to an instance at launch. cloud-init determines the format from the first line (for example, #! for a shell script or #cloud-config for declarative YAML configuration) and acts on it to perform tasks such as installing packages, creating users, injecting SSH keys, setting hostnames, and configuring network interfaces.
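As an illustration of declarative user data, the sketch below writes a small cloud-config file and validates it when possible. The file name user-data.yaml and every value in it (hostname, user name, SSH key, package) are made-up examples. The validation step assumes a recent cloud-init release that provides the schema subcommand; older releases exposed it as cloud-init devel schema.

```shell
# Write an illustrative cloud-config file; every value below is a made-up example.
cat > user-data.yaml <<'EOF'
#cloud-config
hostname: demo-host
users:
  - name: demo
    groups: sudo
    shell: /bin/bash
    ssh_authorized_keys:
      - ssh-ed25519 AAAA...replace-with-a-real-public-key
packages:
  - nginx
runcmd:
  - systemctl enable --now nginx
EOF

# Validate the file only if cloud-init is installed on this machine.
if command -v cloud-init >/dev/null 2>&1; then
    cloud-init schema --config-file user-data.yaml
fi
```

The first line, #cloud-config, is what marks the file as cloud-config rather than a script. The file would then be supplied as user data when launching an instance, using whatever mechanism the cloud provider offers.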
CONFIGURATION FILES AND MODULES
cloud-init's core configuration is typically found in /etc/cloud/cloud.cfg, with drop-in overrides in /etc/cloud/cloud.cfg.d/. It uses a modular system in which each module handles one specific task. Which modules run, and when, is set by the module lists `cloud_init_modules`, `cloud_config_modules`, and `cloud_final_modules`; these correspond to the `init`, `config`, and `final` stages, which run in that order so that dependencies are met and configuration is applied in the correct sequence during boot.
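To see which modules are scheduled for each stage on a given system, the module lists can be read straight from the configuration file. This is a rough sketch that assumes the lists are written as plain YAML block lists (one "- module" item per line) under top-level keys; the function name list_stage_modules is hypothetical.

```shell
# Hypothetical helper: print each cloud_*_modules key and the list items beneath it.
list_stage_modules() {
    cfg="${1:-/etc/cloud/cloud.cfg}"
    awk '/^cloud_(init|config|final)_modules:/ {show=1; print; next}
         /^[^ #-]/ {show=0}            # any other top-level key ends the list
         show && /^[ ]*-/ {print}' "$cfg"
}
```

Run without arguments it reads /etc/cloud/cloud.cfg; a different file, such as one under /etc/cloud/cloud.cfg.d/, can be passed as the first argument.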
HISTORY
cloud-init was originally developed by Canonical (the creators of Ubuntu) in 2008 to provide a consistent way to provision instances within Amazon EC2. Over time, it evolved to support numerous other cloud platforms and virtualization technologies, including OpenStack, Microsoft Azure, Google Cloud Platform, VMware, and others. Its adoption across major cloud providers has made it a de facto standard for initial instance configuration, significantly simplifying cloud automation and enabling infrastructure-as-code practices for dynamic environments.
SEE ALSO
systemctl(1): Used for controlling the cloud-init service (e.g., enable, start, stop).
journalctl(1): For viewing cloud-init logs, often found under the 'cloud-init' unit.
ds-identify(8): A utility that helps identify the cloud datasource, often used by cloud-init.
init(8): The parent of all processes; cloud-init runs early in the boot sequence.