gsutil
Manage Google Cloud Storage buckets and objects
TLDR
List all buckets in a project you are logged into
List the objects in a bucket
Download an object from a bucket
Upload an object to a bucket
Rename or move objects in a bucket
Create a new bucket in the project you are logged into
Delete a bucket and remove all the objects in it
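The tasks listed above correspond to the commands sketched here, one per line and in the same order. The bucket names (`gs://my-bucket`, `gs://my-scratch-bucket`) and the object name `report.txt` are placeholders, not values from this page. The commands are wrapped in a function so that nothing runs until it is called.

```shell
# Placeholder names (assumptions): replace with your own bucket and object.
BUCKET="gs://my-bucket"
SCRATCH="gs://my-scratch-bucket"

gsutil_quickstart() {
  gsutil ls                                  # list all buckets in the active project
  gsutil ls "$BUCKET"                        # list the objects in a bucket
  gsutil cp "$BUCKET/report.txt" .           # download an object to the current directory
  gsutil cp ./report.txt "$BUCKET/"          # upload a local file to a bucket
  gsutil mv "$BUCKET/report.txt" "$BUCKET/archive/report.txt"  # rename or move an object
  gsutil mb "$SCRATCH"                       # create a new bucket
  gsutil rm -r "$SCRATCH"                    # delete a bucket and every object in it
}
```

Note that the last command uses rm -r rather than rb, because rb only removes a bucket that is already empty.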
SYNOPSIS
gsutil [GLOBAL_OPTIONS] COMMAND [COMMAND_OPTIONS] [ARGUMENTS...]
Common COMMAND examples:
gsutil cp [OPTIONS] SOURCE_URL... DEST_URL
gsutil ls [OPTIONS] [URL...]
gsutil mb [OPTIONS] BUCKET_URL
gsutil rb [OPTIONS] BUCKET_URL
gsutil rsync [OPTIONS] SOURCE_URL DEST_URL
PARAMETERS
-m
Enables parallel (multi-threaded/multi-processing) operations for faster transfers of large datasets or many small files.
-q
Quiet mode; suppresses non-error output.
-d
Enables debugging output for troubleshooting.
-D
Enables detailed debugging output, including HTTP headers and payload.
cp
Copy files and objects. Supports copying local files to Cloud Storage, Cloud Storage to local, and Cloud Storage to Cloud Storage.
ls
List buckets and objects. Can list contents of a bucket or all buckets in a project.
mb
Make bucket; creates a new Cloud Storage bucket.
rb
Remove bucket; deletes an empty Cloud Storage bucket.
rm
Remove objects; deletes one or more objects from a bucket. Use with -r for recursive deletion.
rsync
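Synchronize the contents of a source and a destination, each of which can be a local directory or a bucket. Files that are missing or changed at the destination are copied; extra files at the destination are deleted only when -d is given. Use -r to recurse into subdirectories.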
acl
Manage access control lists (ACLs) for buckets and objects, allowing fine-grained permission control.
stat
Display object metadata, such as ETag, content-type, creation time, and checksums.
help
Get help on gsutil global options or specific commands. E.g., 'gsutil help cp'.
version
Display the gsutil version information.
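A minimal sketch of how global options and command options combine, assuming a hypothetical local directory `./site` and bucket `gs://my-bucket`. Global options such as -m go before the command name, while command options such as -r go after it. The functions are only defined here; call them to run the commands.

```shell
# Placeholder paths (assumptions): ./site and gs://my-bucket are illustrative.

# Upload a directory tree in parallel: -m is a global option, -r belongs to cp.
upload_tree() {
  gsutil -m cp -r ./site gs://my-bucket/
}

# Preview a mirror of ./site into the bucket:
#   -r  recurse into subdirectories
#   -d  delete destination objects that are absent from the source
#   -n  dry run: report what would change without changing anything
preview_mirror() {
  gsutil -m rsync -r -d -n ./site gs://my-bucket/site
}
```

Running the rsync with -n first is a cautious habit, since -d removes objects at the destination.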
DESCRIPTION
gsutil is a command-line tool, written in Python, for interacting with Google Cloud Storage. It is distributed with the Google Cloud SDK and lets users and scripts create and delete buckets, upload and download files, synchronize directories, manage access control lists (ACLs), list contents, and inspect object metadata. For large-scale operations it offers parallel (multi-threaded/multi-processing) transfers with the -m flag, automatic retries for transient errors, and checksum validation (CRC32C or MD5) for data integrity. These features make it well suited to automating cloud storage tasks in scripts and shell environments.
CAVEATS
gsutil requires valid Google Cloud credentials, typically managed via the gcloud command-line tool. Google Cloud Storage usage is billed for storage, network egress, and operations performed. Misconfigured permissions (ACLs) can lead to unexpected access denials or, conversely, unintended public access to data. Be mindful of object versioning and lifecycle management policies configured on buckets, which can affect object deletion or modification.
AUTHENTICATION
gsutil leverages the authentication mechanisms configured through the gcloud CLI. Users typically authenticate using gcloud auth login or configure service account keys, which gsutil then uses to authorize requests to Google Cloud Storage.
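The two authentication routes described above can be sketched as follows. The function names and their arguments (a project ID and a path to a JSON key file) are illustrative assumptions; the gcloud subcommands themselves are the standard ones.

```shell
# Interactive login for a human user, then select the project to work in.
# $1: project ID (placeholder supplied by the caller)
login_user() {
  gcloud auth login
  gcloud config set project "$1"
}

# Non-interactive login for scripts, using a service account key file.
# $1: path to the service account's JSON key (placeholder supplied by the caller)
login_service_account() {
  gcloud auth activate-service-account --key-file="$1"
}
```

For example, `login_service_account ./key.json` would make later gsutil calls in the same environment run as that service account.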
URI PATHS
Cloud Storage resources are referenced using uniform resource identifiers (URIs) with the format gs://BUCKET_NAME/OBJECT_NAME. For example, gs://my-bucket/my-file.txt or just gs://my-bucket for a bucket.
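To make the URI format concrete, here is a small sketch that splits a gs:// URI into its bucket and object parts using POSIX parameter expansion. The helper names `gs_bucket` and `gs_object` are hypothetical and are not part of gsutil.

```shell
# Print the bucket name of a gs:// URI.
gs_bucket() {
  rest="${1#gs://}"               # strip the gs:// scheme
  printf '%s\n' "${rest%%/*}"     # keep everything before the first slash
}

# Print the object name of a gs:// URI (empty for a bucket-only URI).
gs_object() {
  rest="${1#gs://}"
  case "$rest" in
    */*) printf '%s\n' "${rest#*/}" ;;  # everything after the first slash
    *)   printf '\n' ;;                 # no slash: the URI names only a bucket
  esac
}
```

Because only the first slash separates bucket from object, an object name such as `dir/a.txt` keeps its own slashes.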
WILDCARDS
gsutil supports wildcards in URI paths: '*' matches any number of characters within a single path level, '**' matches any number of characters across '/' boundaries, '?' matches a single character, and '[]' matches a character set or range. Wildcards make it possible to operate on all objects that match a pattern in one command.
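A brief sketch of wildcard patterns in use, assuming a hypothetical bucket `gs://my-bucket` with some `.txt` and log objects. The patterns are quoted so that the local shell passes them to gsutil unchanged instead of trying to expand them itself.

```shell
# Placeholder bucket and object names (assumptions).
list_by_pattern() {
  gsutil ls "gs://my-bucket/*.txt"           # .txt objects at the top level only
  gsutil ls "gs://my-bucket/logs/**"         # everything under logs/, at any depth
  gsutil ls "gs://my-bucket/log-[0-9]?.txt"  # "log-", one digit, any one character, ".txt"
}
```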
HISTORY
gsutil was developed by Google as the primary command-line interface for Google Cloud Storage. It is distributed as part of the Google Cloud SDK, a collection of tools for managing resources and applications on Google Cloud Platform. Written in Python, gsutil has evolved since its initial release to support new Cloud Storage features and to improve performance and usability.