LinuxCommandLibrary

look

Find lines beginning with a specified string

TLDR

Search for lines beginning with a specific prefix in a specific file

$ look [prefix] [path/to/file]

Case-insensitively search only on alphanumeric characters
$ look [[-f|--ignore-case]] [[-d|--alphanum]] [prefix] [path/to/file]

Specify a string termination character
$ look [[-t|--terminate]] [,] [prefix] [path/to/file]

Search in /usr/share/dict/words (--alphanum and --ignore-case are assumed)
$ look [prefix]

SYNOPSIS

look [OPTIONS] STRING [FILE...]

PARAMETERS

-d
    Compares only alphanumeric characters (and blanks), ignoring punctuation and other special characters. Case remains significant unless -f is also given. This mimics traditional dictionary ordering, where punctuation is disregarded for sorting.

-f
    Ignores the case of letters when comparing, making the search case-insensitive. This applies to both the search STRING and the content of the FILEs.

-t char
    Specifies a string termination character. Only the characters in STRING up to and including the first occurrence of char are compared. This is useful for files where lines carry additional data after a delimiter and you only want to match the leading field.

-V
    Displays version information for the look command and exits.
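
The effect of -t can be sketched with a small delimited file. The file name fruits.csv and its contents are hypothetical sample data, and the behavior shown assumes the util-linux or BSD implementation of look:

```shell
# Hypothetical sample data: comma-separated records, already in bytewise order.
tmpdir=$(mktemp -d)
printf '%s\n' 'apple,red' 'apricot,orange' 'banana,yellow' > "$tmpdir/fruits.csv"

# With -t ',' the search string is cut after its first comma,
# so 'apple,green' is compared as 'apple,'.
t_matches=$(look -t , 'apple,green' "$tmpdir/fruits.csv")
printf '%s\n' "$t_matches"

rm -rf "$tmpdir"
```

Under those assumptions the only line printed should be apple,red, because apricot,orange does not begin with "apple,".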

DESCRIPTION

The look command is a specialized utility designed for quickly searching for words within sorted text files, typically dictionary files.

It performs a binary search on the specified file(s), which makes it efficient for large word lists. By default, it searches /usr/share/dict/words, or the file named by the WORDLIST environment variable if that is set, for lines that begin with the given string. If a file argument is provided, look searches that file instead.

Its primary use case is for quick dictionary-style lookups, where you need to find all entries that start with a particular prefix. Unlike general-purpose search tools like grep, look requires the input file to be sorted, which allows it to use a much faster binary search algorithm. It offers options to ignore case and to use dictionary-style comparisons, making it versatile for various text datasets.
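
As a minimal sketch of the prefix lookup described above, the following builds a small sorted word list and compares look with the equivalent anchored grep. The file names and words are illustrative only:

```shell
# Build a small word list and sort it bytewise so the binary search is valid.
tmpdir=$(mktemp -d)
printf '%s\n' cherry apple banana apricot application > "$tmpdir/unsorted.txt"
LC_ALL=C sort "$tmpdir/unsorted.txt" > "$tmpdir/sorted.txt"

# Both commands select the lines that begin with "app";
# look uses binary search, grep scans every line.
look_out=$(look app "$tmpdir/sorted.txt")
grep_out=$(grep '^app' "$tmpdir/sorted.txt")
printf '%s\n' "$look_out"

rm -rf "$tmpdir"
```

On this input both commands should print apple and application. The difference is only in how the lines are found, which starts to matter on large files.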

CAVEATS

  • File Must Be Sorted: look fundamentally relies on its input file(s) being sorted (lexicographically, or according to dictionary rules if -d is used). If the file is not correctly sorted, the command will produce incorrect or unpredictable results.
  • Start-of-Line Matching Only: It only finds lines where the search STRING appears at the very beginning of the line. It does not perform a general substring search.
  • Default Dictionary Path: The default dictionary file, typically /usr/share/dict/words, might not exist on all systems or may require installing a specific dictionary package.
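
One way to guard against the sorting caveat is to check the file with sort -c before running look. This is a sketch using a hypothetical words.txt; sort -c exits non-zero when the input is out of order:

```shell
tmpdir=$(mktemp -d)
printf '%s\n' pear apple mango > "$tmpdir/words.txt"

# sort -c only checks the order; it does not modify the file.
if LC_ALL=C sort -c "$tmpdir/words.txt" 2>/dev/null; then
    status_before=sorted
else
    status_before=unsorted
fi

# Sort in place, then check again.
LC_ALL=C sort -o "$tmpdir/words.txt" "$tmpdir/words.txt"
if LC_ALL=C sort -c "$tmpdir/words.txt" 2>/dev/null; then
    status_after=sorted
else
    status_after=unsorted
fi

echo "before: $status_before, after: $status_after"
look man "$tmpdir/words.txt"

rm -rf "$tmpdir"
```

LC_ALL=C is used here so that sort orders lines bytewise, matching the plain comparison look performs when neither -d nor -f is given.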

ENVIRONMENT VARIABLES

The look command respects the WORDLIST environment variable. If WORDLIST is set to the path of a readable file, look uses that file as its default dictionary instead of /usr/share/dict/words when no explicit file argument is provided.
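
A short sketch of overriding the default dictionary follows. It assumes the util-linux implementation, which documents the variable as WORDLIST. The file mywords.txt is hypothetical, and it is sorted with -df because those options are implied when no file argument is given:

```shell
tmpdir=$(mktemp -d)
printf '%s\n' apricot Application apple | LC_ALL=C sort -df > "$tmpdir/mywords.txt"

# No file argument: look falls back to the file named by WORDLIST
# and compares as if -d and -f had been given.
env_matches=$(WORDLIST="$tmpdir/mywords.txt" look app)
printf '%s\n' "$env_matches"

rm -rf "$tmpdir"
```

With util-linux this should print apple and Application. Other implementations may ignore the variable and search the system dictionary instead.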

INPUT FILE FORMAT

For look to function correctly, each word or entry in the input file(s) must be on its own line. The file must also be sorted with the same comparison rules that look will apply: if look is invoked with -d (dictionary order) and/or -f (fold case), the file should have been sorted by sort(1) with the same options.
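
The pairing of sort and look options can be sketched as follows. The entries are made-up sample data, and LC_ALL=C is an assumption used to keep the sort order predictable across systems:

```shell
tmpdir=$(mktemp -d)
printf '%s\n' Banana apricot apple-pie Apple > "$tmpdir/raw.txt"

# Sort with the same -d and -f options that look will be given.
LC_ALL=C sort -df "$tmpdir/raw.txt" > "$tmpdir/dict.txt"

# -f folds case, so "app" matches "Apple" as well as "apple-pie".
fold_matches=$(look -df app "$tmpdir/dict.txt")

# -d skips the hyphen, so "applep" matches "apple-pie".
dict_matches=$(look -df applep "$tmpdir/dict.txt")

printf '%s\n' "$fold_matches" "$dict_matches"
rm -rf "$tmpdir"
```

If the file were sorted without -d and -f but searched with them, the binary search could land on the wrong part of the file and miss matching lines.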

HISTORY

look is a long-standing Unix utility that first appeared in Version 7 AT&T Unix and was carried forward in the BSD systems. It uses binary search to provide rapid lookups in sorted data, which made it valuable wherever quick access to large word lists or dictionaries was needed. On modern Linux distributions it is typically provided by util-linux (Debian-based systems package it separately in bsdextrautils, formerly bsdmainutils), and it retains its original purpose as a fast word-search tool.

SEE ALSO

grep(1), sort(1), comm(1), dict(1), sed(1), awk(1)
