LinuxCommandLibrary

wc

Count words, lines, characters, or bytes

TLDR

Count all lines in a file

$ wc [[-l|--lines]] [path/to/file]

Count all words in a file
$ wc [[-w|--words]] [path/to/file]

Count all bytes in a file
$ wc [[-c|--bytes]] [path/to/file]

Count all characters in a file (taking multi-byte characters into account)
$ wc [[-m|--chars]] [path/to/file]

Count all lines, words and bytes from stdin
$ [find .] | wc

Print the display width of the longest line, in columns
$ wc [[-L|--max-line-length]] [path/to/file]

SYNOPSIS

wc [OPTION]... [FILE]...

PARAMETERS

-c, --bytes
    Prints the byte counts. This counts the exact number of bytes in each input file.

-m, --chars
    Prints the character counts. For multi-byte character encodings (like UTF-8), this can differ from the byte count.

-l, --lines
    Prints the newline counts. This equals the number of lines in each input file, except that a final line with no trailing newline is not counted.

-w, --words
    Prints the word counts. Words are defined as non-zero-length strings separated by whitespace.

-L, --max-line-length
    Prints the maximum display width of a line. This measures the number of columns required to display the longest line.

--files0-from=F
    Reads input from the files specified by NUL-terminated names in file F. If F is -, the names are read from standard input. Useful for processing a large number of files listed by another command (e.g., find -print0).

--help
    Displays a help message and exits.

--version
    Displays version information and exits.
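
As a rough illustration of --files0-from, the sketch below feeds NUL-terminated names from find -print0 into wc, passing - as F so the names are read from standard input. It assumes GNU wc, since the option is not part of POSIX. The scratch directory and file names are hypothetical.

```shell
# Hypothetical scratch directory; one file name contains a space on purpose.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/first file.txt"
printf 'c\n' > "$dir/second.txt"

# find emits NUL-terminated names; wc reads them from stdin because F is "-".
# Assumes GNU wc: --files0-from is not a POSIX option.
report=$(find "$dir" -name '*.txt' -print0 | wc -l --files0-from=-)
printf '%s\n' "$report"

# The last row of the report is the grand total across both files.
total=$(printf '%s\n' "$report" | tail -n 1 | awk '{print $1}')
echo "total lines: $total"

rm -r "$dir"
```

NUL-terminated names keep file names with spaces or newlines intact, which a plain whitespace-separated list would not.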

DESCRIPTION

The wc command, short for "word count", counts newlines, words, and bytes (or characters) in files or standard input. Invoked without options, it prints all three counts (lines, words, bytes, in that order) for each file, followed by a total line when more than one file is given. Typical uses include checking the size of log files, counting entries in a list, and measuring the length of text in shell scripts. It is often placed at the end of a pipeline, for instance to count the lines output by grep or ls.
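
A minimal sketch of the default output and one way a script might capture it. The variable names lines, words, and bytes are chosen for illustration only.

```shell
# With no options, wc prints lines, words, and bytes, in that order.
# Word splitting on the unquoted output also discards any padding wc adds.
set -- $(printf 'hello world\nfoo\n' | wc)
lines=$1
words=$2
bytes=$3
echo "lines=$lines words=$words bytes=$bytes"
```

The input has two newlines, three whitespace-separated words, and 16 bytes, so the script prints lines=2 words=3 bytes=16. Note that set -- overwrites the script's positional parameters.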

CAVEATS

wc defines a word as any sequence of non-whitespace characters delimited by whitespace, so punctuation and hyphens do not split words. This does not always match the linguistic notion of a word.
The -c (bytes) and -m (characters) options yield different results for input containing multi-byte characters (e.g., UTF-8 encoded text). -c counts the raw bytes, while -m counts characters as decoded in the current locale; in the C locale the two counts are the same.
Option support varies between implementations: POSIX specifies only -c, -l, -m, and -w, so -L and --files0-from may be unavailable outside GNU wc, and some older systems may lack -m.
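
The word and character caveats can be checked with a short sketch. The locale name C.UTF-8 is an assumption; substitute any UTF-8 locale installed on the system.

```shell
# Punctuation and hyphens do not split words; only whitespace does.
words=$(printf 'well-known e.g. foo_bar\n' | wc -w | tr -d '[:space:]')
echo "words: $words"

# "héllo" written as explicit UTF-8 bytes: the é takes two bytes (octal 303 251).
bytes=$(printf 'h\303\251llo' | wc -c | tr -d '[:space:]')
echo "bytes: $bytes"

# The character count depends on the active locale. C.UTF-8 is an assumption;
# if it is not installed, wc falls back to counting bytes.
chars=$(printf 'h\303\251llo' | LC_ALL=C.UTF-8 wc -m | tr -d '[:space:]')
echo "chars: $chars"
```

This reports 3 words and 6 bytes. The character count should be 5 when a UTF-8 locale is in effect and 6 otherwise.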

STANDARD INPUT USAGE

When no FILE is specified, or when FILE is given as a single hyphen (-), wc reads from standard input. This makes it highly effective in pipelines, allowing it to process the output of other commands. For example, ls | wc -l counts the entries in the current directory, because ls prints one name per line when its output is piped. Note that ls -l | wc -l is off by one for a directory listing, since ls -l prints a leading "total" line.
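
A short sketch of reading from standard input, using printf as a stand-in for any upstream command. The explicit hyphen form assumes GNU wc.

```shell
# No file name is printed when wc reads standard input.
# tr strips the padding that some wc implementations add around the number.
count=$(printf 'one\ntwo\nthree\n' | wc -l | tr -d '[:space:]')
echo "lines: $count"

# A single hyphen names standard input explicitly (GNU wc assumed);
# the count is then labelled with "-" in place of a file name.
printf 'one\ntwo\nthree\n' | wc -l -

# wc -l counts newline characters, so a final line without one is not counted.
partial=$(printf 'one\ntwo' | wc -l | tr -d '[:space:]')
echo "lines without trailing newline: $partial"
```

The first count is 3. The last is 1, even though the input visibly contains two lines of text.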

MULTIPLE FILE PROCESSING

When multiple files are provided as arguments, wc processes each file individually, displaying the counts for each one. After processing all files, it provides a final 'total' line, summarizing the counts across all input files. This is very useful for getting combined statistics from a group of related files.
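
The per-file rows and the total line can be seen with two small files. The scratch directory and the names a.txt and b.txt are hypothetical.

```shell
# Two hypothetical files in a scratch directory.
dir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$dir/a.txt"
printf 'gamma\ndelta\nepsilon\n' > "$dir/b.txt"

# One row per file, then a row labelled "total".
wc -l "$dir/a.txt" "$dir/b.txt"

# Pull only the grand total out of the final row.
total=$(wc -l "$dir/a.txt" "$dir/b.txt" | tail -n 1 | awk '{print $1}')
echo "total lines: $total"

# Concatenating first yields one combined count and no per-file rows.
combined=$(cat "$dir/a.txt" "$dir/b.txt" | wc -l | tr -d '[:space:]')
echo "combined: $combined"

rm -r "$dir"
```

Both approaches report 5 here. Piping through cat is the simpler choice when only the combined figure is needed.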

HISTORY

The wc command has been a standard utility in Unix-like operating systems since the early days of Unix. It dates back to at least Version 1 AT&T Unix. Its simplicity and utility have ensured its continued presence in modern systems, typically as part of the GNU Core Utilities, a collection of essential command-line tools.

SEE ALSO

cat(1), grep(1), awk(1), cut(1), sort(1), uniq(1)
