kaggle-datasets
Manage Kaggle datasets
TLDR
List all datasets owned by a user or organization
Search dataset by name
Download a dataset
Create a public dataset
Download metadata of dataset
Initialize metadata for dataset
Delete a dataset
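The tasks above could map to commands along the following lines. This is a sketch, not an authoritative reference: owner, owner/dataset-slug, ./meta and ./dataset-folder are placeholders, the kg helper is a hypothetical dry-run wrapper that is not part of the Kaggle client, and the delete syntax in particular is an assumption to check against kaggle datasets delete -h.

```shell
# Hypothetical dry-run helper (not part of the Kaggle CLI): prints the
# command by default; set RUN=1 to execute it with the real client.
kg() {
  if [ "${RUN:-0}" = "1" ]; then
    kaggle "$@"
  else
    echo "kaggle $*"
  fi
}

# List all datasets owned by a user or organization
kg datasets list --user owner
# Search datasets by name
kg datasets list -s titanic
# Download a dataset
kg datasets download -d owner/dataset-slug
# Create a public dataset from a folder containing dataset-metadata.json
kg datasets create -p ./dataset-folder --public
# Download the metadata of a dataset
kg datasets metadata -p ./meta owner/dataset-slug
# Initialize a dataset-metadata.json template in a folder
kg datasets init -p ./dataset-folder
# Delete a dataset (assumed syntax)
kg datasets delete owner/dataset-slug
```

Running the block as-is only prints the commands, which makes it safe to preview before setting RUN=1.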
SYNOPSIS
kaggle datasets [OPTIONS] <COMMAND> [<ARGS>]...
Commands: list, files, download, create, version, init, metadata, status, delete
PARAMETERS
-h, --help
Show help message and exit
-p, --path PATH
Local path to dataset folder (default: current directory)
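-m, --message VERSION_NOTES
Version notes describing what changed (used with the version subcommand). Note that ~/.kaggle/kaggle.json holds API credentials, not dataset metadata; dataset metadata lives in dataset-metadata.json inside the dataset folder.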
-r, --dir-mode {skip,zip,tar}
How to handle subdirectories when uploading with create or version (default: skip)
DESCRIPTION
The kaggle datasets command is part of the official Kaggle API client, a Python-based command-line tool for working with Kaggle's dataset repository from the terminal. It lets data scientists and ML practitioners list and search datasets, download them, create new datasets or versions, and update metadata without using the web interface.
Key features include filtering datasets by keyword, owner, file type, license, tags, or size; downloading entire datasets or individual files; and versioning for iterative improvements. Authentication requires a kaggle.json API token, downloaded from your Kaggle account settings and placed in ~/.kaggle/ with permissions set to 600. The client is installed with pip install kaggle and can be scripted in CI/CD pipelines for reproducible workflows.
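As an illustration of the filtering and single-file download features, the sketch below combines several list filters and fetches one file from a dataset. The search term, owner, dataset slug and file name are placeholders, and kg is a hypothetical dry-run wrapper rather than anything shipped with the client.

```shell
# Hypothetical dry-run helper: prints the command unless RUN=1 is set.
kg() {
  if [ "${RUN:-0}" = "1" ]; then
    kaggle "$@"
  else
    echo "kaggle $*"
  fi
}

# CSV datasets matching a keyword, Creative Commons licensed, most votes first
kg datasets list -s housing --file-type csv --license cc --sort-by votes

# Datasets from one owner, no larger than 10 MiB (size is given in bytes)
kg datasets list --user owner --max-size 10485760

# Download a single file from a dataset into ./data
kg datasets download -d owner/dataset-slug -f train.csv -p ./data
```

The accepted values for --file-type, --license and --sort-by are listed by kaggle datasets list -h.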
Common use cases: bulk downloading public datasets, automating dataset creation from local files, and querying public datasets by criteria such as file size or license. Listing commands print a table by default and accept -v/--csv for CSV output that is easier to script against.
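For scripting, CSV output can be piped into standard text tools. The sketch below pulls the dataset reference (owner/slug) out of CSV text. It assumes the first line is a header, that ref is the first column, and that a ref never contains a comma. The column names and sample rows are invented stand-ins for real output, and dataset_refs is a hypothetical helper.

```shell
# Print the first CSV column (assumed to be "ref") for every row after the header.
dataset_refs() {
  tail -n +2 | cut -d, -f1
}

# Made-up sample standing in for the output of `kaggle datasets list --csv`.
sample='ref,title,size,lastUpdated,downloadCount,voteCount,usabilityRating
alice/housing-prices,Housing Prices,12KB,2024-01-01 00:00:00,100,10,1.0
bob/city-weather,"City Weather, Daily",3MB,2024-02-01 00:00:00,50,5,0.8'

printf '%s\n' "$sample" | dataset_refs

# With the real client, the same helper could drive a bulk download:
#   kaggle datasets list -s housing --csv | dataset_refs |
#     while read -r ref; do kaggle datasets download -d "$ref" -p ./data; done
```

Taking only the first column sidesteps quoted commas in titles. A proper CSV parser would be the safer choice if later columns are needed.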
CAVEATS
Requires API credentials in ~/.kaggle/kaggle.json with chmod 600; large downloads may need ample disk space and stable internet; not all datasets are downloadable due to private status or restrictions.
INSTALLATION
pip install kaggle; verify with kaggle --version
AUTHENTICATION
Download kaggle.json from Kaggle > Account > API; mkdir -p ~/.kaggle && mv kaggle.json ~/.kaggle/ && chmod 600 ~/.kaggle/kaggle.json
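The same steps can be wrapped in a small function that refuses to continue when the token file is missing. This is a sketch: install_kaggle_token is a hypothetical helper, the optional second argument exists only so the function can be tried against a scratch directory, and it copies the token rather than moving it so the downloaded file stays in place.

```shell
# install_kaggle_token SRC [HOME_DIR]
# Copies the API token at SRC to HOME_DIR/.kaggle/kaggle.json (HOME_DIR
# defaults to $HOME) and restricts it to owner read/write.
install_kaggle_token() {
  src=$1
  home_dir=${2:-$HOME}
  if [ ! -f "$src" ]; then
    echo "token file not found: $src" >&2
    return 1
  fi
  mkdir -p "$home_dir/.kaggle" &&
    cp "$src" "$home_dir/.kaggle/kaggle.json" &&
    chmod 600 "$home_dir/.kaggle/kaggle.json"
}

# Example call (path is a placeholder):
#   install_kaggle_token ~/Downloads/kaggle.json
```

The token file is JSON with a username and a key field, so anyone who can read it can act as your account. That is why the permissions are tightened immediately after copying.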
COMMON SUBCOMMANDS
list: kaggle datasets list -s "keyword"
download: kaggle datasets download -d username/dataset -p ./data
create: kaggle datasets create -p ./dataset-folder
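The create subcommand expects a dataset-metadata.json file in the folder it uploads. The sketch below writes a minimal one by hand, mirroring the template that kaggle datasets init generates, and then previews the create and version calls. The owner, slug, title and license are placeholders. write_dataset_metadata and kg are hypothetical helpers, and the title is assumed to contain no double quotes or backslashes.

```shell
# Hypothetical dry-run helper: prints the command unless RUN=1 is set.
kg() {
  if [ "${RUN:-0}" = "1" ]; then
    kaggle "$@"
  else
    echo "kaggle $*"
  fi
}

# write_dataset_metadata DIR OWNER SLUG TITLE
# Writes a minimal dataset-metadata.json into DIR.
write_dataset_metadata() {
  dir=$1 owner=$2 slug=$3 title=$4
  mkdir -p "$dir" &&
    cat > "$dir/dataset-metadata.json" <<EOF
{
  "title": "$title",
  "id": "$owner/$slug",
  "licenses": [{"name": "CC0-1.0"}]
}
EOF
}

write_dataset_metadata ./dataset-folder owner my-dataset "My Dataset"

# First upload, then a later revision with version notes
kg datasets create -p ./dataset-folder
kg datasets version -p ./dataset-folder -m "Describe what changed"
```

Place the data files in the same folder before running the upload commands with RUN=1.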
HISTORY
Introduced in 2018 with the Kaggle API client, which is maintained by Kaggle (a Google subsidiary); later releases improved versioning and metadata support, and the command has since become a common building block for automating data science workflows.


