LinuxCommandLibrary

bq

Interact with Google BigQuery

TLDR

Run a query against a BigQuery table using standard SQL; add the --dry_run flag to estimate the number of bytes the query would read

$ bq query --nouse_legacy_sql 'SELECT COUNT(*) FROM [DATASET_NAME].[TABLE_NAME]'

Run a parameterized query
$ bq query --use_legacy_sql=false --parameter='ts_value:TIMESTAMP:2016-12-07 08:00:00' 'SELECT TIMESTAMP_ADD(@ts_value, INTERVAL 1 HOUR)'
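Positional parameters are also supported: use ? placeholders in the query and omit the parameter name before the type. A hedged sketch (the command is printed rather than executed, since running it requires an authenticated Google Cloud setup, and the values are made up):

```shell
# Positional parameters: "?" in the query, no name before the type in --parameter.
# Printed instead of run, because executing it needs gcloud authentication.
cmd="bq query --use_legacy_sql=false --parameter=:STRING:hello --parameter=:INT64:7"
sql='SELECT ? AS word, ? AS n'
echo "$cmd '$sql'"
```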

Create a new dataset or table in the US location
$ bq mk --location=US [dataset_name].[table_name]

List all datasets in a project
$ bq ls --filter labels.[key]:[value] --max_results [integer] --format=prettyjson --project_id [project_id]
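The prettyjson output is convenient for scripting. A minimal sketch, using a hand-written sample that mimics the shape of `bq ls --format=prettyjson` dataset output (the dataset and project names are placeholders):

```shell
# sample.json imitates `bq ls --format=prettyjson` output for datasets:
# a JSON array of objects, each carrying a datasetReference.
cat > sample.json <<'EOF'
[
  {"kind": "bigquery#dataset", "datasetReference": {"datasetId": "sales", "projectId": "my-project"}},
  {"kind": "bigquery#dataset", "datasetReference": {"datasetId": "logs", "projectId": "my-project"}}
]
EOF
# Extract the dataset IDs, as a script consuming real bq output would.
python3 - <<'EOF'
import json
with open("sample.json") as f:
    for ds in json.load(f):
        print(ds["datasetReference"]["datasetId"])
EOF
```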

Batch load data from a file in CSV, newline-delimited JSON, Parquet, or Avro format into a table
$ bq load --location [location] --source_format [CSV|NEWLINE_DELIMITED_JSON|PARQUET|AVRO] [dataset].[table] [path_to_source]
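A worked sketch of a CSV load (the dataset and table names are placeholders, and the final command is echoed rather than executed, since `bq load` needs an authenticated project). The --autodetect flag asks BigQuery to infer the schema from the file:

```shell
# Prepare a small CSV to load.
cat > users.csv <<'EOF'
id,name
1,alice
2,bob
EOF
# The command a real script would run (my_dataset.users is a placeholder):
echo "bq load --autodetect --source_format CSV my_dataset.users ./users.csv"
```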

Copy one table to another
$ bq cp [dataset].[old_table] [dataset].[new_table]

Display help
$ bq help

SYNOPSIS

bq [global flags] command [command flags] [arguments]

PARAMETERS

--apilog=
    Turns on logging of BigQuery API requests and responses to the specified file. Use stdout or stderr to log to the console.

--dataset_id=
    Specifies the default dataset to use for unqualified table names.

--format=
    Specifies the output format. Options include prettyjson, json, csv, and pretty.

--location=
    Specifies the location for operations like creating datasets. This is important for regional datasets.

--project_id=
    Specifies the Google Cloud project ID to use.

--transfer_service_account=
    Specifies the service account used for data transfers.

COMMANDS

help
    Displays help information about the command.

query
    Executes a SQL query against BigQuery.

load
    Loads data into a BigQuery table from a file or other source.

extract
    Extracts data from a BigQuery table to a file or other destination.

mk
    Creates datasets, tables, or other BigQuery resources.

rm
    Deletes datasets, tables, or other BigQuery resources.

ls
    Lists datasets, tables, or other BigQuery resources.

update
    Updates BigQuery resources.

DESCRIPTION

The bq command-line tool is a Python-based command-line interface for interacting with Google's BigQuery service.
It allows users to manage BigQuery resources such as datasets, tables, and jobs directly from the terminal.
With bq, users can perform various tasks, including running queries, loading data into tables, exporting data from tables, and managing access controls.
It provides a flexible and efficient way to automate BigQuery workflows and integrate them into scripts or other applications.
The bq tool is part of the Google Cloud SDK and requires authentication with a Google Cloud account to access BigQuery resources. Configuration is generally stored in the .bigqueryrc file in the user's home directory.
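For example, a script can gate an expensive query on the byte estimate that --dry_run reports. A sketch, assuming the dry-run message keeps its usual wording (the sample line below stands in for real `bq query --dry_run` output, so the block runs without gcloud credentials):

```shell
# Stand-in for the message `bq query --dry_run ...` prints on success.
msg="Query successfully validated. Assuming the tables are not modified, running this query will process 10485760 bytes of data."
# Pull out the byte estimate so a script can compare it against a budget.
bytes=$(echo "$msg" | grep -o '[0-9][0-9]* bytes' | grep -o '[0-9][0-9]*')
echo "estimated bytes: $bytes"
```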

CAVEATS

Requires the Google Cloud SDK to be installed and configured, including authentication with a Google Cloud account.

AUTHENTICATION

The bq command uses the same authentication mechanism as the Google Cloud SDK. You typically need to authenticate using gcloud auth login before using bq.

CONFIGURATION

The bq tool's settings are often stored in a .bigqueryrc file in your home directory. You can modify settings such as the default project ID and dataset ID in this file or through command-line flags.
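An illustrative .bigqueryrc fragment (the values are placeholders, and the exact syntax shown here is an assumption; global flags are written as name = value pairs):

```
project_id = my-project
dataset_id = my_dataset
```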

SEE ALSO

gcloud(1), gsutil(1)
