LinuxCommandLibrary

czkawka-cli

Find and remove duplicate files

TLDR

List duplicate or similar files in specific directories

$ czkawka-cli [dup|image] --directories [path/to/directory1 path/to/directory2 ...]

Find duplicate files in specific directories and delete them (the delete method defaults to NONE, which deletes nothing)

$ czkawka-cli dup --directories [path/to/directory1 path/to/directory2 ...] --delete-method [AEN|AEO|ON|OO|HARD|NONE]

SYNOPSIS

czkawka-cli <subcommand> [-d <directories>...] [options]

PARAMETERS

-d, --directories
    Directories to scan (default: current directory)

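-D, --delete-method
    How to handle the duplicates found: AEN (all except newest), AEO (all except oldest), ON (only newest), OO (only oldest), HARD (replace with hard links), or NONE (default). Deletion runs without confirmation (dangerous)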

-m, --move
    Move files to trash or specified directory

-H, --hard-link
    Replace duplicates with hard links to save space

--cache
    Use caching to speed up rescans

-j, --threads
    Number of threads for parallel processing

-o, --output
    Save results to JSON file

--search-smallest-dir-first
    Optimize by scanning smallest directories first

-M, --min-file-size
    Minimum file size in bytes to consider

-e, --excluded-directories
    Directories to exclude from the scan

-E, --excluded-items
    Items to exclude, given as wildcard patterns
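
To make the hard-link option listed above more concrete, the sketch below shows what replacing a duplicate with a hard link amounts to at the filesystem level. It uses plain ln and cmp rather than czkawka-cli itself. The helper name link_duplicate and the file names are hypothetical, and the sketch assumes both files are on the same filesystem.

```shell
#!/usr/bin/env bash
# Illustration only: what "replace a duplicate with a hard link" means.
# link_duplicate is a hypothetical helper, not part of czkawka-cli.
link_duplicate() {
    local original=$1 duplicate=$2
    # Refuse to link files whose contents differ.
    cmp -s -- "$original" "$duplicate" || return 1
    # -f replaces the duplicate with a second name for the original's data.
    ln -f -- "$original" "$duplicate"
}

demo_dir=$(mktemp -d)
printf 'same bytes\n' > "$demo_dir/a.txt"
printf 'same bytes\n' > "$demo_dir/b.txt"
link_duplicate "$demo_dir/a.txt" "$demo_dir/b.txt"
# -ef is true when both names refer to the same device and inode.
[ "$demo_dir/a.txt" -ef "$demo_dir/b.txt" ] && echo "b.txt now shares a.txt's data"
```

After linking, the data is stored once, so editing either name changes both. That is the main reason to review duplicates before choosing this action.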

DESCRIPTION

Czkawka-cli is a fast, multi-functional command-line tool written in Rust for finding and managing duplicate files, similar images/videos/binaries, empty folders/files, temporary files, orphans, broken symlinks, and more. It supports scanning multiple directories, caching results for speed, and actions like deletion, moving, or hardlinking duplicates.

Key features include precise duplicate detection using content hashing (BLAKE3, XXH3), visual similarity detection for images and videos via perceptual hashing or FFmpeg, music duplicate detection by tags or audio fingerprinting, and junk file identification (e.g., thumbnails, logs). Multi-threading and efficient algorithms make it well suited to large datasets, where it is generally faster than tools like fdupes.
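
The idea behind content-hash duplicate detection can be sketched with GNU coreutils alone. This is an illustration of the technique, not czkawka-cli's implementation: it uses SHA-256 as a stand-in for the hashes named above, hashes every file rather than narrowing candidates first, and relies on GNU-specific uniq options. The helper name list_duplicates is hypothetical.

```shell
#!/usr/bin/env bash
# Illustration of content-hash duplicate detection with GNU coreutils.
# SHA-256 is a stand-in; this is not how czkawka-cli is implemented.
# list_duplicates is a hypothetical helper name.
list_duplicates() {
    local dir=$1
    # Hash every regular file, sort by digest, then keep only the lines
    # whose first 64 characters (the hex digest) occur more than once.
    find "$dir" -type f -exec sha256sum {} + \
        | sort \
        | uniq -w 64 --all-repeated=separate
}

demo_dir=$(mktemp -d)
printf 'hello\n' > "$demo_dir/one.txt"
printf 'hello\n' > "$demo_dir/two.txt"
printf 'other\n' > "$demo_dir/three.txt"
# Prints one "digest  path" line each for one.txt and two.txt.
list_duplicates "$demo_dir"
```

Files with equal digests are treated as duplicates; three.txt has unique content and is not listed.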

Usage involves specifying a subcommand (e.g., dup, image) followed by directories and options. Results can be printed to the console, saved to JSON, or acted upon directly. The tool is well suited to server automation, cleanup scripts, and disk space optimization on Linux systems.
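
For cleanup scripts, one cautious pattern is to report first and delete only after review. The sketch below wraps the dup invocation from the TLDR examples. The helper name run_dup, the RUN variable, and the /srv/data paths are hypothetical. By default the wrapper only prints the command it would run.

```shell
#!/usr/bin/env bash
# Sketch of a report-first cleanup wrapper around czkawka-cli.
# run_dup, RUN and the /srv/data paths are hypothetical; the subcommand
# and flags are the ones shown in the TLDR examples.
run_dup() {
    local method=$1
    shift
    # Remaining arguments are the directories to scan.
    if [ "${RUN:-0}" = 1 ]; then
        czkawka-cli dup --directories "$@" --delete-method "$method"
    else
        echo "would run: czkawka-cli dup --directories $* --delete-method $method"
    fi
}

# Step 1: report only (NONE deletes nothing).
run_dup NONE /srv/data/photos /srv/data/backup

# Step 2, after reviewing the report: delete all copies except the oldest.
# RUN=1 run_dup AEO /srv/data/photos /srv/data/backup
```

Keeping the destructive call behind an explicit RUN=1 makes it harder to delete files by accident when the script is run from cron or copied between machines.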

CAVEATS

Not all subcommands support all options; check czkawka-cli <subcommand> --help. Deleting and moving files cannot be undone, so review the results before acting on them. Scans of very large directory trees can use a lot of RAM. FFmpeg is required for video similarity detection.

SUBCOMMANDS

dup, image, video, music, big, empty-files, empty-folders, temp, symlinks, broken, ext

INSTALLATION

Available via the AUR (Arch), Flatpak, or cargo (cargo install czkawka_cli). Debian/Ubuntu packages are available in the repos.

HISTORY

Developed by Qarmin since 2020 as an open-source Rust project on GitHub. It evolved from the GUI version, with the CLI added for scripting. Development remains active, with v10+ releases focusing on speed (BLAKE3 hashing) and new detectors such as video similarity.

SEE ALSO

fdupes(1), rmlint(1), du(1), find(1)
