word-list-compress
Compress or decompress sorted word lists for GNU Aspell
SYNOPSIS
word-list-compress c[ompress] | d[ecompress]
PARAMETERS
c, compress
Compresses the word list read from standard input and writes the compressed result to standard output.
d, decompress
Decompresses a compressed word list read from standard input and writes the plain word list to standard output.
The command takes no input or output file arguments. Use shell redirection or pipes to supply and capture data.
DESCRIPTION
The word-list-compress command is a utility distributed with GNU Aspell, the spell checker. It compresses and decompresses sorted word lists, such as those used to build Aspell dictionaries. It uses a prefix-based delta encoding: because neighboring words in a sorted list usually share leading characters, each word is stored as the length of the prefix it shares with the previous word plus the remaining characters. This reduces the size of large word lists considerably, which helps when storing and distributing them. The compressed data is not used directly for lookups. It is decompressed first, for example by piping the output of word-list-compress d into aspell create master to build a dictionary.
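The prefix-based delta encoding can be illustrated with a short Python sketch. It shows the general technique only and is not the byte layout that word-list-compress writes; the function names front_encode and front_decode are hypothetical.

```python
import os


def front_encode(words):
    """Encode a sorted word list as (shared_prefix_length, suffix) pairs."""
    encoded = []
    previous = ""
    for word in words:
        # Number of leading characters this word shares with the previous one.
        shared = len(os.path.commonprefix([previous, word]))
        encoded.append((shared, word[shared:]))
        previous = word
    return encoded


def front_decode(pairs):
    """Rebuild the original words from (shared_prefix_length, suffix) pairs."""
    words = []
    previous = ""
    for shared, suffix in pairs:
        word = previous[:shared] + suffix
        words.append(word)
        previous = word
    return words


words = ["compress", "compressed", "compression", "compute", "computer"]
pairs = front_encode(words)
print(pairs)
# [(0, 'compress'), (8, 'ed'), (8, 'ion'), (4, 'ute'), (7, 'r')]
print(front_decode(pairs) == words)
```

Only the first word is stored in full. Every later entry stores a small number and a short suffix, which is where the size reduction comes from.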
CAVEATS
The input word list should be sorted before compression, ideally in byte order (for example with LC_COLLATE=C sort -u), so that neighboring words share long prefixes. Unsorted input compresses poorly. The compressed output is not meant to be read or edited directly and must be decompressed before other tools can use it.
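The effect of sorting can be seen by counting how many characters each word shares with the word before it, since those are the characters a prefix-based encoder does not need to store again. The following Python sketch uses a hypothetical helper, shared_prefix_total, and a made-up word list. It illustrates the idea and does not measure the tool's real output size.

```python
import os


def shared_prefix_total(words):
    """Sum the characters each word shares with the word before it."""
    total = 0
    previous = ""
    for word in words:
        total += len(os.path.commonprefix([previous, word]))
        previous = word
    return total


unsorted_words = ["computer", "apple", "compress", "apply", "compute", "applied"]
# sorted() compares code points, which for ASCII input matches byte-order sorting.
sorted_words = sorted(set(unsorted_words))

print(shared_prefix_total(unsorted_words))  # 0
print(shared_prefix_total(sorted_words))    # 19
```

In the unsorted order no word shares a prefix with its predecessor, so nothing can be saved. After sorting, 19 characters are shared.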
INPUT/OUTPUT BEHAVIOR
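word-list-compress always reads from standard input and writes to standard output, so files are supplied through shell redirection, for example word-list-compress c < words.txt > words.cwl. This also allows it to be combined with other commands in a pipeline, such as sort before compression or aspell after decompression.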
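Because the command is a plain stdin-to-stdout filter, it can also be driven from a script. The Python sketch below assumes the GNU Aspell interface, where the mode is given as c or d, and assumes the executable is on PATH. The helper names run_filter, compress_words, and decompress_words are hypothetical.

```python
import subprocess


def run_filter(argv, data):
    """Send bytes to a command on stdin and return the bytes it writes to stdout."""
    result = subprocess.run(argv, input=data, stdout=subprocess.PIPE, check=True)
    return result.stdout


def compress_words(data, exe="word-list-compress"):
    # Assumption: the executable accepts the mode argument "c" (compress).
    return run_filter([exe, "c"], data)


def decompress_words(data, exe="word-list-compress"):
    # Assumption: the executable accepts the mode argument "d" (decompress).
    return run_filter([exe, "d"], data)
```

A typical call would pass the sorted, newline-separated word list as bytes to compress_words and write the returned bytes to a file. With check=True, a non-zero exit status raises subprocess.CalledProcessError instead of being silently ignored.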
COMPRESSED FORMAT
The output of compression is a compact byte stream, not an indexed dictionary. Each entry records how many leading characters it shares with the previous word, followed by the remaining characters. The format is not intended for direct human inspection and is meant to be decompressed before use.
HISTORY
word-list-compress is distributed as part of GNU Aspell, the spell checker originally written by Kevin Atkinson. Its purpose is to keep large word lists compact for storage and distribution. The interface is minimal: a single mode argument, with all data passed through standard input and standard output.
SEE ALSO
aspell(1), aspell-import(1), run-with-aspell(1)