czkawka-cli
Find and remove duplicate files
TLDR
List duplicate or similar files in specific directories
Find duplicate files in specific directories and delete them (by default, nothing is deleted)
SYNOPSIS
czkawka-cli [GLOBAL_OPTIONS] COMMAND [COMMAND_OPTIONS] [ARGS]
Example: czkawka-cli duplicate-files --path ./my_pictures --check-method hash --delete-except-oldest --dry-run
PARAMETERS
--help
Displays general help information for the command or a specific subcommand.
--version
Shows the `czkawka-cli` version.
--minimal-cache-size <SIZE_IN_BYTES>
Sets the minimum size (in bytes) the cache must reach before it is written to disk. A smaller cache is not saved.
--no-color
Disables colored output in the console.
--no-console-progress
Turns off the interactive console progress bar during scans.
--path <PATH>...
Specifies one or more directories to recursively scan. This is a common option used by most subcommands.
--excluded-path <EXCLUDED_PATH>...
Specifies one or more directories to exclude from the scan. Common across many subcommands.
--dry-run
Performs a scan and shows potential deletions without actually deleting files. Highly recommended for safety before any deletion operations.
--output <FILE>
Saves the scan results to a specified file, typically in JSON format.
--minimal-file-size <SIZE>
Sets the minimum file size (e.g., `10K`, `1M`, `1G`) for files to be considered. Applicable to file-scanning subcommands.
--maximal-file-size <SIZE>
Sets the maximum file size (e.g., `10M`, `1G`) for files to be considered. Applicable to file-scanning subcommands.
--file-extension <EXT>...
Includes only files with the specified extensions (e.g., `jpg`, `png`). Case-insensitive. Used by file-scanning subcommands.
--excluded-file-extension <EXT>...
Excludes files with the specified extensions from the scan. Used by file-scanning subcommands.
--check-method <METHOD>
(duplicate-files) Defines how duplicates are identified. Options include `size`, `hash`, `size_and_first_bytes`, `size_and_full_comparison`.
--delete-all-duplicates
(duplicate-files, same-music, similar-images, similar-videos) Deletes all found duplicate files without prompting. Use with extreme caution, and always run with --dry-run first.
--delete-except-first
(duplicate-files, same-music, similar-images, similar-videos) Deletes all duplicates except the first one encountered in each group.
--delete-except-oldest
(duplicate-files, same-music, similar-images, similar-videos) Deletes all duplicates except the oldest one in each group (based on modification date).
--delete-except-newest
(duplicate-files, same-music, similar-images, similar-videos) Deletes all duplicates except the newest one in each group (based on modification date).
--directories-to-delete <PATHS>
(duplicate-files) A comma-separated list of specific paths (files or directories) to delete without any confirmation. Highly dangerous, use with utmost care.
--similarity-threshold <PERCENT>
(similar-images, similar-videos) Sets the minimum similarity percentage (0-100) for items to be considered alike.
--delete-empty-directories
(empty-directories) Deletes all found empty directories.
--delete-empty-files
(empty-files) Deletes all found empty files.
--delete-temporary-files
(temporary-files) Deletes all found temporary files.
--delete-zero-files
(zero-files) Deletes all found files that are filled entirely with zeros.
--delete-broken-files
(broken-files) Deletes all found broken files.
--delete-invalid-symlinks
(invalid-symlinks) Deletes all found invalid symbolic links.
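To make the --delete-except-oldest rule concrete, the following shell sketch applies the same selection to one group of duplicates: it keeps the file with the oldest modification time and lists the rest. It only illustrates the idea. `keep_oldest` is a hypothetical helper, not part of czkawka-cli, and how czkawka itself breaks ties between equal timestamps is not stated on this page.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given the paths of one duplicate group, print the
# files that a "keep the oldest" rule would remove. Nothing is deleted.
keep_oldest() {
    local oldest f
    oldest="$1"
    # First pass: find the file with the oldest modification time.
    for f in "$@"; do
        if [ "$f" -ot "$oldest" ]; then
            oldest="$f"
        fi
    done
    # Second pass: every file except the oldest is a deletion candidate.
    for f in "$@"; do
        if [ "$f" != "$oldest" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

Usage: `keep_oldest copy1.jpg copy2.jpg copy3.jpg` prints the two newer copies, one per line.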
DESCRIPTION
czkawka-cli is the command-line interface for Czkawka, a cross-platform application for finding and managing redundant data on your filesystem. Written in Rust for performance, it supports a wide range of scan modes, including duplicate files, empty directories, empty files, large files, broken files, temporary files, zeroed files, invalid symbolic links, and similar images, music, or videos.
The CLI version is particularly useful for scripting, automation, and headless server environments where a graphical interface is unavailable or unwanted. It provides granular control over scan paths, exclusion rules, and file size filters, and offers several methods for identifying duplicates (e.g., by hash, size, or content comparison). Users can preview results before deleting files or directories, which makes it a practical tool for reclaiming disk space and organizing data. Deletions are irreversible, so run with --dry-run first.
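As a conceptual illustration of hash-based duplicate detection, the sketch below groups files by their SHA-256 digest using standard tools. It is not czkawka's implementation. `list_hash_duplicates` is a hypothetical helper, and it assumes GNU `sha256sum` is available and that file names contain no newlines or backslashes.

```shell
#!/usr/bin/env bash
# Conceptual sketch of hash-based duplicate detection (not czkawka code).
# Prints "<digest>  <path>" for every file under the given directory whose
# SHA-256 digest is shared with at least one other file.
list_hash_duplicates() {
    local dir
    dir="$1"
    find "$dir" -type f -exec sha256sum {} + |
        sort |
        awk '
            # sha256sum prints a 64-character digest, two spaces, then the path.
            {
                digest = substr($0, 1, 64)
                line[NR] = $0
                key[NR] = digest
                count[digest]++
            }
            # Emit only lines whose digest occurred more than once.
            END {
                for (i = 1; i <= NR; i++)
                    if (count[key[i]] > 1) print line[i]
            }
        '
}
```

Usage: `list_hash_duplicates ./my_pictures`. Files with identical content appear on adjacent lines because the output is sorted by digest. A real tool would typically compare sizes first to avoid hashing files that cannot match.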
CAVEATS
czkawka-cli is a powerful tool capable of making significant changes to your filesystem.
1. Always use the --dry-run option first to preview deletions before executing any destructive commands.
2. Be extremely cautious with options like --delete-all-duplicates or --directories-to-delete, as they bypass confirmations.
3. Scanning very large numbers of files or directories (millions) can still take considerable time and system resources, especially when hashes must be calculated for large files.
4. The similarity checks for images/videos are computationally intensive and can be slow depending on the number and size of files.
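The preview-first advice above can be built into a small wrapper, sketched here. The flag names are copied from this page and may differ in your installed release, so confirm them with `czkawka-cli --help`. `scan_duplicates` and the `CZKAWKA_BIN` variable are hypothetical names introduced for this example.

```shell
#!/usr/bin/env bash
# Sketch of a "preview first" workflow. Flag names are taken from this page;
# confirm them with `czkawka-cli --help` before relying on them.
CZKAWKA_BIN="${CZKAWKA_BIN:-czkawka-cli}"   # overridable, e.g. for testing

# Hypothetical helper.
#   $1 = directory to scan
#   $2 = "preview" (default, adds --dry-run) or "apply" (really deletes)
scan_duplicates() {
    local dir mode
    dir="$1"
    mode="${2:-preview}"
    # Rebuild the positional parameters as the argument list to pass on.
    set -- duplicate-files --path "$dir" --check-method hash --delete-except-oldest
    if [ "$mode" != "apply" ]; then
        set -- "$@" --dry-run   # anything other than "apply" stays non-destructive
    fi
    "$CZKAWKA_BIN" "$@"
}
```

Usage: run `scan_duplicates ./my_pictures` and review the output, then run `scan_duplicates ./my_pictures apply` only if the listed deletions are what you expect. Making the destructive mode an explicit opt-in means a mistyped second argument falls back to the preview.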
SUBCOMMANDS
czkawka-cli operates using various subcommands, each targeting a specific type of file redundancy or issue. These include:
- duplicate-files: Finds exact copies of files.
- empty-directories: Locates and optionally removes empty folders.
- empty-files: Identifies and optionally removes zero-byte files.
- same-music: Finds music files with identical audio content.
- similar-images: Detects visually similar images.
- similar-videos: Finds videos with similar content.
- temporary-files: Identifies and optionally removes common temporary files.
- zero-files: Finds files filled entirely with zeros.
- broken-files: Detects corrupted files (e.g., invalid images/archives).
- external-commands: Executes user-defined commands on identified files.
- invalid-symlinks: Finds symbolic links pointing to non-existent targets.
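For orientation, three of these scan types have rough find(1) equivalents, sketched below. The function names are hypothetical, the commands only list matches, and czkawka's own rules may be broader (for example, in how it treats directories that contain only other empty directories).

```shell
#!/usr/bin/env bash
# Rough find(1) equivalents for three scan types, for illustration only.
# Each function lists matches; nothing is removed.

# Zero-byte regular files (compare: empty-files).
list_empty_files() {
    find "$1" -type f -size 0
}

# Directories with no entries at all (compare: empty-directories).
# -empty is a GNU/BSD extension to find.
list_empty_dirs() {
    find "$1" -type d -empty
}

# Symbolic links whose target does not exist (compare: invalid-symlinks).
# `test -e` follows the link, so it fails for a dangling one.
list_broken_symlinks() {
    find "$1" -type l ! -exec test -e {} \; -print
}
```

Usage: `list_broken_symlinks /srv/data` prints one dangling link per line.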
HISTORY
Czkawka was initially developed as a Rust-based alternative to similar tools, aiming for better performance and a modern user interface. The command-line interface, czkawka-cli, was created to provide the same functionality without graphical overhead, making it suitable for server environments, scripting, and automated workflows. Development is ongoing, with regular updates adding features and performance improvements.