LinuxCommandLibrary

czkawka_cli

Find and remove duplicate files

TLDR

List duplicate files in specific directories and write the results to a file

$ czkawka_cli dup [[-d|--directories]] [path/to/directory1] [[-d|--directories]] [path/to/directory2] [[-f|--file-to-save]] [path/to/results.txt]

Find duplicate files in specific directories and delete them with the chosen delete method (default: NONE, which deletes nothing)
$ czkawka_cli dup [[-d|--directories]] [path/to/directory] [[-D|--delete-method]] [AEN|AEO|ON|OO|HARD|NONE]

Find similar-looking images using a specific similarity preset (default: High)
$ czkawka_cli image [[-d|--directories]] [path/to/directory] [[-s|--similarity-preset]] [Minimal|VerySmall|Small|Medium|High|VeryHigh|Original] [[-f|--file-to-save]] [path/to/results.txt]

Display help
$ czkawka_cli [[-h|--help]]
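
The image search above accepts only a fixed set of preset names. A minimal sketch of a wrapper that rejects a misspelled preset before starting a scan; `find_similar_images` is a hypothetical helper name, and the preset list is copied from the example above rather than queried from the tool:

```shell
#!/bin/sh
# Hypothetical helper: validate the preset name, then run the image search.
# Usage: find_similar_images DIR PRESET REPORT
find_similar_images() {
    dir=$1
    preset=$2
    report=$3

    # Preset names as listed in the TLDR example above.
    case $preset in
        Minimal|VerySmall|Small|Medium|High|VeryHigh|Original) ;;
        *)
            echo "unknown similarity preset: $preset" >&2
            return 2
            ;;
    esac

    czkawka_cli image --directories "$dir" \
        --similarity-preset "$preset" \
        --file-to-save "$report"
}
```

For example, `find_similar_images ~/Pictures Medium similar.txt` would run the scan, while a misspelled preset stops before any files are read.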

SYNOPSIS

czkawka_cli <subcommand> [OPTIONS] --directories DIR1 [--directories DIR2 ...]

PARAMETERS

dup
    Find duplicate files by hash, size, or name.

image
    Find visually similar images using perceptual hashing.

video
    Find similar videos by thumbnails.

music
    Find similar audio tracks by tags/waveform.

binary
    Find duplicate binary files.

big
    Find largest files.

-d, --directories
    Directory to scan. Repeat the option for each additional directory. Required.

--size-min
    Minimum file size to consider (e.g., 10M).

--search-method
    How files are compared: Name, Size, Hash, PartialHash1/2.

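-D, --delete-method
    What to do with the duplicates found: AEN (all except newest), AEO (all except oldest), ON (only newest), OO (only oldest), HARD (replace with hard links), or NONE (default, delete nothing).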

--export-to-json
    Export results to JSON file.

--export-to-csv
    Export results to CSV file.

--threads
    Number of threads (default: CPU cores).

--print-progress
    Show scanning progress.

--help
    Show help for subcommand.

--version
    Print version.

DESCRIPTION

Czkawka_cli is the command-line interface to Czkawka, a fast, multi-threaded Rust-based utility for identifying and managing duplicate files, similar media, and various junk on Linux systems.

It supports multiple search modes, including exact duplicates (by hash, size, or name), visually similar images and videos, similar music tracks, large files, empty directories and files, temporary files, and broken symlinks. Results can be printed to the console, exported to CSV/JSON, or acted upon by deleting files permanently, moving them to the trash, or replacing duplicates with hard links.

Its main advantages are speed from parallel processing, low memory usage, a choice of hash algorithms (xxHash, Blake3), and configurable filters for size, date, and depth. It suits servers and automated scripts where a GUI is unavailable. Always preview results before deleting anything to avoid data loss.

CAVEATS

Deletion options are powerful: run the scan without a delete method first (the default, NONE, only reports) and review the output before deleting anything. Permanent deletion cannot be undone. Scans can be CPU- and memory-intensive on large drives.
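
The review-before-delete advice can be sketched as a two-step script. This is an illustration under assumptions, not a recommended tool: `dedupe_reviewed` is a hypothetical helper name, it uses only the `dup` options shown in the TLDR section, and it assumes AEO means "all except oldest" (confirm with `czkawka_cli dup --help` before relying on it).

```shell
#!/bin/sh
# Hypothetical helper: report duplicates first, delete only after confirmation.
# Usage: dedupe_reviewed DIR REPORT
dedupe_reviewed() {
    dir=$1
    report=$2

    # Step 1: report only. No --delete-method is passed, so the default (NONE) applies.
    # The exit status is deliberately ignored; only the written report is checked.
    czkawka_cli dup --directories "$dir" --file-to-save "$report" || true
    if [ ! -s "$report" ]; then
        echo "No report written to $report; nothing to review." >&2
        return 1
    fi

    # Step 2: show the report and ask before anything is removed.
    cat "$report"
    printf 'Delete all but the oldest file in each group? [y/N] '
    read -r answer || answer=
    case $answer in
        y|Y) czkawka_cli dup --directories "$dir" --delete-method AEO || true ;;
        *)   echo 'Nothing deleted.' ;;
    esac
}
```

Any answer other than `y` or `Y`, including end of input, leaves the files untouched, so the function is safe to call from a non-interactive job by accident.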

INSTALLATION

apt install czkawka (Debian/Ubuntu), pacman -S czkawka-cli (Arch), or cargo install czkawka_cli.

EXAMPLE

czkawka_cli dup --directories /home --directories /mnt/data --file-to-save results.txt
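
When the list of directories comes from a script, one `--directories` option can be generated per path, as in the TLDR example with two directories. A minimal POSIX shell sketch; `scan_dirs` is a hypothetical helper name, and it uses only the `dup`, `--directories`, and `--file-to-save` spellings from the TLDR section:

```shell
#!/bin/sh
# Hypothetical helper: scan several directories and save one report.
# Usage: scan_dirs REPORT DIR...
scan_dirs() {
    report=$1
    shift
    if [ "$#" -lt 1 ]; then
        echo "usage: scan_dirs REPORT DIR..." >&2
        return 2
    fi

    # Append "--directories DIR" for each DIR, then drop the original
    # arguments. Rebuilding "$@" keeps paths with spaces intact.
    n=$#
    for d in "$@"; do
        set -- "$@" --directories "$d"
    done
    shift "$n"

    czkawka_cli dup "$@" --file-to-save "$report"
}
```

For example, `scan_dirs results.txt /home "/mnt/my data"` expands to a single `czkawka_cli dup` call with two `--directories` options.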

HISTORY

Czkawka has been developed by qarmin since 2020 and is written in Rust. The CLI was added for headless use. The project is actively maintained on GitHub, with releases up to v10+, and is popular in the Arch/Manjaro repos.

SEE ALSO

fdupes(1), rdfind(1), dupeGuru(1), fslint(1)
