cloc
Count lines of code in files
TLDR
Count all the lines of code in a directory
Count all the lines of code in a directory, displaying a progress bar during the counting process
Compare 2 directory structures and count the differences between them
Ignore files that are ignored by VCS, such as files specified in .gitignore
Count all the lines of code in a directory, displaying the results for each file instead of each language
SYNOPSIS
cloc [options] [files_or_directories]...
PARAMETERS
--by-file
Reports results for every source file encountered instead of aggregating them by language. Use --by-file-by-lang to get both the per-file listing and the language summary.
--exclude-dir=DIR1,DIR2,...
Excludes the specified comma-separated list of directories from the scan. This is useful for ignoring build artifacts, dependency folders (e.g., node_modules), or documentation.
--exclude-ext=EXT1,EXT2,...
Excludes files with the specified comma-separated list of extensions from the scan. Useful for ignoring non-source files or specific types of assets.
--force-lang=<language>,<extension>
Forces files with the given <extension> to be counted as the specified <language>. This is helpful for non-standard file extensions or when cloc misidentifies a language.
--git
Forces the inputs to be interpreted as Git targets (commit hashes, branch names, and so on) when they are not first recognized as file or directory names. Can be combined with --diff to compare two revisions.
--json
Outputs the results in JavaScript Object Notation (JSON) format. This is ideal for programmatic parsing and integration into other tools or scripts.
--out=<file>
Writes the report output to the specified <file> instead of standard output (stdout).
--quiet
Suppresses all informational messages except the final report summary. Useful for cleaner output when cloc is used in scripts.
--read-lang-def=<file>
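Reads language definitions from the specified file and merges them with cloc's built-in definitions. If the file defines a language cloc already knows, the built-in definition takes precedence; use --force-lang-def=<file> to replace the built-in definitions instead.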
--vcs=<VCS>
Obtains the list of files to count by invoking the named version control system (for example, --vcs=git uses git ls-files), so files excluded by .gitignore are skipped. cloc does not consult VCS ignore files by default.
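The options above can be combined on a single command line. The following is a minimal sketch that assembles such a command in Python; build_cloc_argv is a hypothetical helper name, and the target directory and exclusion values are placeholders. It only builds and prints the argument list; running the command requires cloc to be installed.

```python
def build_cloc_argv(target, exclude_dirs=(), exclude_exts=(),
                    by_file=False, json_out=False, out_file=None):
    """Assemble a cloc argument list (hypothetical helper, not part of cloc)."""
    argv = ["cloc"]
    if by_file:
        argv.append("--by-file")
    if exclude_dirs:
        # cloc expects a comma-separated list of directory names.
        argv.append("--exclude-dir=" + ",".join(exclude_dirs))
    if exclude_exts:
        argv.append("--exclude-ext=" + ",".join(exclude_exts))
    if json_out:
        argv.append("--json")
    if out_file:
        argv.append("--out=" + out_file)
    argv.append(target)
    return argv


# Placeholder values: scan the current directory, skipping dependency and build folders.
argv = build_cloc_argv(".", exclude_dirs=["node_modules", "dist"], json_out=True)
print(" ".join(argv))  # cloc --exclude-dir=node_modules,dist --json .
```

The resulting list can be passed to subprocess.run as-is, which avoids shell quoting issues with directory names.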
DESCRIPTION
cloc (Count Lines Of Code) is a free, open-source command-line utility that counts blank lines, comment lines, and physical lines of source code. It recognizes well over 150 languages, identifying them primarily by file extension and falling back to heuristics for ambiguous cases. Inputs can be individual files, entire directories, compressed archives such as tarballs and Zip files, Git commit hashes or branch names, or content from standard input. Reports are summarized by language by default, with options to report results per file. cloc can also compare two inputs and report the lines added, removed, and modified between them. Output is available as plain text, CSV, XML, and JSON, which makes the results easy to feed into scripts that track codebase size over time.
CAVEATS
cloc primarily counts lines and does not assess code quality or complexity. Its accuracy relies on file extensions and heuristics, which might occasionally misidentify languages or line types in highly unusual codebases. Mixed-language files can also pose challenges for precise categorization.
LANGUAGE DETECTION AND PARSING
cloc classifies every line as blank (empty or whitespace-only), comment (text ignored by the compiler or interpreter), or code (everything else). For each file it identifies the language, applies that language's comment-removal filters from its built-in language definitions, and counts what remains as code. The built-in definitions cover a wide range of programming, scripting, and markup languages.
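The three-way classification can be illustrated with a deliberately simplified sketch. This is not cloc's implementation: it assumes a language with single-line comments only, ignores block comments and comment markers inside string literals, and classify_lines is a hypothetical name.

```python
def classify_lines(source, comment_prefix="#"):
    """Count blank, comment, and code lines (simplified: single-line comments only)."""
    counts = {"blank": 0, "comment": 0, "code": 0}
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            counts["blank"] += 1
        elif stripped.startswith(comment_prefix):
            counts["comment"] += 1
        else:
            # Code followed by a trailing comment still counts as a code line.
            counts["code"] += 1
    return counts


sample = """# configuration loader

import os  # standard library

def home():
    # fall back to the current directory
    return os.environ.get("HOME", ".")
"""
print(classify_lines(sample))  # {'blank': 2, 'comment': 2, 'code': 3}
```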
VERSION CONTROL SYSTEM INTEGRATION
cloc integrates directly with Git. Beyond the current working tree, it can count the code at a specific commit or branch and report the changes between two commits (e.g., --git --diff <commit1> <commit2>). This is useful for tracking project growth and the effect of refactoring efforts over time.
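As a rough sketch of a revision comparison, the snippet below assembles a command that combines --git with --diff. The helper name build_git_diff_argv and the revisions v1.0 and HEAD are placeholders; the command must be run inside a Git repository with cloc installed.

```python
def build_git_diff_argv(old_ref, new_ref, by_file=False):
    """Assemble a cloc command comparing two Git revisions (hypothetical helper)."""
    argv = ["cloc", "--git", "--diff"]
    if by_file:
        argv.append("--by-file")
    # The older revision comes first, the newer one second.
    argv += [old_ref, new_ref]
    return argv


print(" ".join(build_git_diff_argv("v1.0", "HEAD")))  # cloc --git --diff v1.0 HEAD
```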
FLEXIBLE OUTPUT FORMATS
Beyond its default human-readable text report, cloc supports output in several machine-readable formats: CSV, JSON, and XML. These formats facilitate automation, data analysis, and integration with other tools or dashboards, making cloc a versatile component in CI/CD pipelines and reporting systems.
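A short sketch of consuming the JSON report in a script follows. The embedded report is illustrative: its numbers are made up, and its layout (a header entry, one entry per language, and a SUM entry with nFiles, blank, comment, and code fields) is assumed to match what the installed cloc version emits, so verify it against real output. code_by_language is a hypothetical helper.

```python
import json

# Illustrative report in the shape of `cloc --json` output; the numbers are invented.
report_text = """
{
  "header": {"n_files": 3, "n_lines": 180},
  "Python": {"nFiles": 2, "blank": 20, "comment": 30, "code": 100},
  "Bourne Shell": {"nFiles": 1, "blank": 5, "comment": 5, "code": 20},
  "SUM": {"nFiles": 3, "blank": 25, "comment": 35, "code": 120}
}
"""


def code_by_language(report):
    """Return {language: code lines}, skipping the 'header' and 'SUM' entries."""
    return {
        name: stats["code"]
        for name, stats in report.items()
        if name not in ("header", "SUM")
    }


report = json.loads(report_text)
per_language = code_by_language(report)
# Print languages from largest to smallest code count.
for name, code in sorted(per_language.items(), key=lambda item: -item[1]):
    print(f"{name}: {code}")
```

In a real pipeline, report_text would come from the file written with --json --out=<file> or from the captured standard output of the cloc process.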
HISTORY
cloc was created by Al Danial in 2006 and is written in Perl. It gained popularity for its broad language support, robust parsing, and cross-platform compatibility. Continued development and a wide range of output formats have made it a common choice for source code analysis.
SEE ALSO
wc(1)
Basic line, word, and byte counting utility.
sloccount(1)
Another tool for counting source lines of code, though often less maintained than cloc.
find(1)
Used to locate files that can then be passed to cloc for selective analysis.
xargs(1)
Often used with find to construct command lines for cloc from a list of files.