wc
Count words, lines, characters, or bytes
TLDR
Count all lines in a file
Count all words in a file
Count all bytes in a file
Count all characters in a file (taking multi-byte characters into account)
Count all lines, words and bytes from stdin
Count the length of the longest line, measured in display columns
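The tasks listed above map onto wc invocations roughly as follows. This is a minimal sketch, assuming a wc that supports -L (GNU coreutils does); sample.txt is a hypothetical file created only for the illustration.

```shell
# Create a small three-line sample file (hypothetical name) in a scratch directory.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf 'one two\nthree four five\nsix\n' > "$sample"

wc -l "$sample"    # lines: 3
wc -w "$sample"    # words: 6
wc -c "$sample"    # bytes: 28
wc -m "$sample"    # characters: 28 (the sample is plain ASCII)
wc < "$sample"     # lines, words and bytes read from stdin
wc -L "$sample"    # longest line: 15 columns

# Reading from stdin suppresses the file name, which is handy in scripts.
lines=$(wc -l < "$sample")
words=$(wc -w < "$sample")
bytes=$(wc -c < "$sample")
longest=$(wc -L < "$sample")

rm -rf "$tmpdir"
```

Redirecting the file into wc, rather than naming it as an argument, leaves only the number in the output, so the result can be assigned to a variable without further parsing.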
SYNOPSIS
wc [OPTION]... [FILE]...
PARAMETERS
-c, --bytes
Prints the byte counts. This counts the exact number of bytes in each input file.
-m, --chars
Prints the character counts. For multi-byte character encodings (like UTF-8), this can differ from the byte count.
-l, --lines
Prints the newline counts. This corresponds to the number of lines in each input file, except that a final line lacking a trailing newline is not counted.
-w, --words
Prints the word counts. Words are defined as non-zero-length strings separated by whitespace.
-L, --max-line-length
Prints the maximum display width of a line. This measures the number of columns required to display the longest line.
--files0-from=F
Reads input from the files whose NUL-terminated names are listed in file F. If F is a single hyphen (-), the names are read from standard input. Useful for processing a large number of files listed by another command (e.g., find -print0).
--help
Displays a help message and exits.
--version
Displays version information and exits.
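As an illustration of --files0-from, the sketch below feeds NUL-terminated names from find into wc. It assumes GNU wc and a find that supports -print0; the directory and file names are hypothetical.

```shell
# Scratch directory with two hypothetical files, one with a space in its name.
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/first file.txt"
printf 'c\n' > "$tmpdir/second.txt"

# find emits NUL-terminated names; "-" tells wc to read that list from stdin.
find "$tmpdir" -name '*.txt' -print0 | wc -l --files0-from=-

# The last output line is the combined total; keep only its count.
total=$(find "$tmpdir" -name '*.txt' -print0 | wc -l --files0-from=- | tail -n 1 | awk '{print $1}')
echo "total lines: $total"

rm -rf "$tmpdir"
```

Because the names are NUL-terminated, a file name containing spaces or newlines is passed through intact, which is not the case when names are split on whitespace.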
DESCRIPTION
The wc command, short for "word count", is a standard Unix utility that counts newlines, words, and bytes (or characters) in files or standard input. Invoked without options, wc prints the line, word, and byte counts, in that order, for each specified file, followed by a total line when more than one file is given. Typical uses include checking the size of log files, counting entries in a list, and measuring the length of text in shell scripts. It is often placed at the end of a pipeline, for instance to count the lines output by grep or ls. The options restrict the output to the counts that are needed.
CAVEATS
wc defines a 'word' as any sequence of non-whitespace characters delimited by whitespace. This does not always match the linguistic notion of a word: a hyphenated term such as "well-known" counts as one word, and so does a standalone "-".
The -c (bytes) and -m (characters) options yield different results for input containing multi-byte characters (e.g., UTF-8 encoded text). -c counts the raw bytes, while -m counts characters as decoded according to the current locale.
The -c, -l, -m, and -w options are specified by POSIX, whereas -L and --files0-from are GNU extensions that may be missing or behave differently in other implementations. On some older systems, character counting with -m might also be unavailable.
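The difference between byte and character counts can be checked with a short sketch like the one below. It assumes a UTF-8 locale named C.UTF-8 is installed; where it is not, wc -m falls back to the C locale and reports the byte count instead.

```shell
# "naïve café" written with octal escapes so the bytes are unambiguous:
# ï is \303\257 and é is \303\251 in UTF-8 (two bytes each).
bytes=$(printf 'na\303\257ve caf\303\251' | wc -c)

# Character count under a UTF-8 locale (assumption: C.UTF-8 exists on this system).
chars=$(printf 'na\303\257ve caf\303\251' | LC_ALL=C.UTF-8 wc -m)

echo "bytes: $bytes"   # 12
echo "chars: $chars"   # 10 in a UTF-8 locale
```

The sample has 10 characters, two of which occupy two bytes each, which accounts for the 12 bytes reported by -c.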
STANDARD INPUT USAGE
When no FILE is specified, or when FILE is given as a single hyphen (-), wc reads from standard input, which lets it process the output of other commands in a pipeline. For example, ls | wc -l counts the entries in the current directory. Note that ls -l | wc -l usually reports one more than the number of entries, because ls -l prints a leading "total" line when listing a directory.
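A small sketch of wc in a pipeline, using a scratch directory with three hypothetical empty files:

```shell
# Scratch directory containing three hypothetical files.
tmpdir=$(mktemp -d)
touch "$tmpdir/a" "$tmpdir/b" "$tmpdir/c"

# When its output is a pipe, ls prints one name per line.
entries=$(ls "$tmpdir" | wc -l)

# The long listing starts with a "total" line, so the count is one higher.
long_lines=$(ls -l "$tmpdir" | wc -l)

# An explicit hyphen also selects standard input.
printf 'x\ny\n' | wc -l -

echo "entries: $entries, lines in long listing: $long_lines"

rm -rf "$tmpdir"
```

File names that contain newlines would inflate the count, so this approach is best kept for quick interactive checks.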
MULTIPLE FILE PROCESSING
When multiple files are provided as arguments, wc processes each file individually and prints one line of counts per file. After the last file, it prints a final line labeled 'total' that sums the counts across all input files, which gives combined statistics for a group of related files.
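The per-file lines and the closing total can be seen in a sketch like the following; one.txt and two.txt are hypothetical files created for the illustration.

```shell
# Two hypothetical files with 2 and 3 lines respectively.
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/one.txt"
printf 'c\nd\ne\n' > "$tmpdir/two.txt"

# One line of output per file, then a final "total" line.
report=$(cd "$tmpdir" && wc -l one.txt two.txt)
printf '%s\n' "$report"

# Extract the count and the label from the last line of the report.
total=$(printf '%s\n' "$report" | awk 'END {print $1}')
label=$(printf '%s\n' "$report" | awk 'END {print $2}')

# Concatenating first yields a single bare number instead of a report.
combined=$(cat "$tmpdir/one.txt" "$tmpdir/two.txt" | wc -l)

rm -rf "$tmpdir"
```

When only the combined figure matters, piping the concatenated files into wc avoids having to parse the report.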
HISTORY
The wc command has been a standard utility in Unix-like operating systems since the early days of Unix, dating back to at least Version 1 AT&T Unix. It remains present in modern systems; on Linux it is typically provided by the GNU Core Utilities, a collection of essential command-line tools.