LinuxCommandLibrary

gsutil

Manage Google Cloud Storage buckets and objects

TLDR

List all buckets in a project you are logged into

$ gsutil ls

List the objects in a bucket
$ gsutil ls -r 'gs://[bucket_name]/[prefix]**'

Download an object from a bucket
$ gsutil cp gs://[bucket_name]/[object_name] [path/to/save_location]

Upload an object to a bucket
$ gsutil cp [object_location] gs://[destination_bucket_name]/

Rename or move objects in a bucket
$ gsutil mv gs://[bucket_name]/[old_object_name] gs://[bucket_name]/[new_object_name]

Create a new bucket in the project you are logged into
$ gsutil mb gs://[bucket_name]

Delete a bucket and remove all the objects in it
$ gsutil rm -r gs://[bucket_name]

SYNOPSIS

gsutil [GLOBAL_OPTIONS] COMMAND [COMMAND_OPTIONS] [ARGUMENTS...]

Common COMMAND examples:
gsutil cp [OPTIONS] SOURCE_URL... DEST_URL
gsutil ls [OPTIONS] [URL...]
gsutil mb [OPTIONS] BUCKET_URL
gsutil rb [OPTIONS] BUCKET_URL
gsutil rsync [OPTIONS] SOURCE_URL DEST_URL

PARAMETERS

-m
    Global option, given before the command name (e.g., 'gsutil -m cp ...'). Runs the operation on multiple objects in parallel (multi-threaded/multi-processing), which speeds up transfers that involve many files.

-q
    Quiet mode; suppresses non-error output.

-d
    Enables debugging output for troubleshooting.

-D
    Enables detailed debugging output, including HTTP requests and headers. The output can include authentication credentials, so redact it before sharing.

cp
    Copy files and objects. Supports copying local files to Cloud Storage, Cloud Storage to local, and Cloud Storage to Cloud Storage.

ls
    List buckets and objects. Can list contents of a bucket or all buckets in a project.

mb
    Make bucket; creates a new Cloud Storage bucket.

rb
    Remove bucket; deletes an empty Cloud Storage bucket.

rm
    Remove objects; deletes one or more objects from a bucket. Use with -r for recursive deletion.

rsync
    Synchronize the contents of two directories, two buckets, or a local directory and a bucket by copying missing or changed files to the destination. Extra files at the destination are deleted only when -d is given.

acl
    Manage access control lists (ACLs) for buckets and objects, allowing fine-grained permission control.

stat
    Display object metadata, such as ETag, content-type, creation time, and checksums.

help
    Get help on gsutil global options or specific commands. E.g., 'gsutil help cp'.

version
    Display the gsutil version information.
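
The placement of options described above can be sketched as follows. This is a minimal dry-run sketch, not a definitive recipe: run() and DRY_RUN are hypothetical helpers introduced here (they are not part of gsutil), and my-bucket, ./photos, and ./site are placeholder names. By default the script only prints the command lines it would run.

```shell
#!/bin/sh
# Print each gsutil command line instead of executing it unless DRY_RUN=0.
# DRY_RUN and run() are hypothetical helpers for this sketch.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Global options (-m, -q) go before the command name;
# command options (-r for cp; -n, -d, -r for rsync) go after it.
run gsutil -m cp -r ./photos gs://my-bucket/photos

# rsync -n previews the changes without applying them; -d would delete
# destination files that are missing from the source.
run gsutil -m -q rsync -n -d -r ./site gs://my-bucket/site
```

Setting DRY_RUN=0 would execute the commands for real, so it is worth checking the printed lines first, especially when rsync -d is involved.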

DESCRIPTION

gsutil is a command-line tool, written in Python, for working with Google Cloud Storage. It is distributed with the Google Cloud SDK and lets users and scripts create and delete buckets, upload and download files, synchronize directories, manage access control lists (ACLs), list contents, and inspect object metadata. For large-scale operations it offers parallel (multi-threaded/multi-processing) transfers with the -m flag, automatic retries for transient errors, and checksum validation of transferred data. These features make it well suited to automating storage tasks in scripts and shell environments.

CAVEATS

gsutil requires authentication with Google Cloud, typically managed via the gcloud command-line tool. Cloud Storage usage is billed for storage, network egress, and operations, so bulk commands can incur charges. Incorrect permissions (ACLs) can cause unexpected access denials or, conversely, unintended public access to data. Object versioning and lifecycle management policies configured on a bucket also affect what happens when objects are deleted or modified.

AUTHENTICATION

gsutil leverages the authentication mechanisms configured through the gcloud CLI. Users typically authenticate using gcloud auth login or configure service account keys, which gsutil then uses to authorize requests to Google Cloud Storage.
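
Before running gsutil from a script, it can help to confirm that gcloud is installed and has an active account. The following is a minimal sketch under that assumption; active_account is a hypothetical helper name, and the check only inspects gcloud's credential list, it does not verify access to any particular bucket.

```shell
#!/bin/sh
# Print the active gcloud account, or fail with 127 if gcloud is missing.
# active_account is a hypothetical helper name for this sketch.
active_account() {
    if ! command -v gcloud >/dev/null 2>&1; then
        echo "gcloud not found on PATH" >&2
        return 127
    fi
    gcloud auth list --filter=status:ACTIVE --format='value(account)'
}

if account="$(active_account)" && [ -n "$account" ]; then
    echo "Active account: $account"
else
    echo "No active account; run: gcloud auth login"
fi
```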

URI PATHS

Cloud Storage resources are referenced using uniform resource identifiers (URIs) with the format gs://BUCKET_NAME/OBJECT_NAME. For example, gs://my-bucket/my-file.txt or just gs://my-bucket for a bucket.
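
Scripts often need the bucket and object parts of such a URI separately. A minimal sketch using POSIX parameter expansion is shown below; parse_gs_uri is a hypothetical helper name, not a gsutil feature, and it performs no validation beyond checking the gs:// prefix.

```shell
#!/bin/sh
# Split a gs:// URI into bucket and object parts.
# parse_gs_uri is a hypothetical helper name for this sketch.
parse_gs_uri() {
    uri="$1"
    case "$uri" in
        gs://?*) ;;
        *) echo "not a gs:// URI: $uri" >&2; return 1 ;;
    esac
    rest="${uri#gs://}"      # strip the scheme
    bucket="${rest%%/*}"     # everything before the first slash
    case "$rest" in
        */*) object="${rest#*/}" ;;   # everything after the first slash
        *)   object="" ;;             # bucket-only URI
    esac
    printf 'bucket=%s object=%s\n' "$bucket" "$object"
}

parse_gs_uri gs://my-bucket/my-file.txt   # prints: bucket=my-bucket object=my-file.txt
parse_gs_uri gs://my-bucket               # prints: bucket=my-bucket object=
```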

WILDCARDS

gsutil supports wildcards in URI paths: '*' matches any number of characters within a single path level, '**' matches across path levels (as in the recursive listing example above), '?' matches a single character, and '[]' matches a character set. Quote patterns so that the local shell does not expand them before gsutil sees them.
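
To get a feel for these patterns without touching a bucket, the matching can be approximated locally with the shell's own case patterns, which use the same '*', '?', and '[]' syntax. This is only a sketch: matches() is a hypothetical helper, the object names are made up, and a shell '*' also matches '/', so the approximation is fair only for names without slashes.

```shell
#!/bin/sh
# Approximate wildcard matching on a single object name with shell patterns.
# matches() is a hypothetical helper; it does not call gsutil.
matches() {
    pattern="$1"
    name="$2"
    case "$name" in
        $pattern) return 0 ;;
        *)        return 1 ;;
    esac
}

for name in report-1.csv report-12.csv report-a.csv notes.txt; do
    for pattern in 'report-*.csv' 'report-?.csv' 'report-[0-9].csv'; do
        if matches "$pattern" "$name"; then
            printf '%s matches %s\n' "$name" "$pattern"
        fi
    done
done

# When calling gsutil itself, keep the pattern in quotes so the local shell
# does not try to expand it against local files first.
printf '%s\n' "gsutil ls 'gs://my-bucket/report-*.csv'"
```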

HISTORY

gsutil was developed by Google as a command-line interface for Google Cloud Storage. It is distributed as part of the Google Cloud SDK, a collection of tools for managing resources and applications on Google Cloud Platform. Written in Python, it has been updated since its initial release to support new Cloud Storage features and to improve performance and usability.

SEE ALSO

gcloud(1), scp(1), rsync(1), curl(1), wget(1)
