cloc

Count lines of code in files

TLDR

Count all the lines of code in a directory

$ cloc [path/to/directory]

Count all the lines of code in a directory, displaying a progress bar during the counting process
$ cloc --progress=1 [path/to/directory]

Compare 2 directory structures and count the differences between them
$ cloc --diff [path/to/directory/one] [path/to/directory/two]

Count only the files that the VCS tracks, which skips anything it ignores (such as files listed in .gitignore)
$ cloc --vcs git [path/to/directory]

Count all the lines of code in a directory, displaying the results for each file instead of each language
$ cloc --by-file [path/to/directory]

SYNOPSIS

cloc [options] [files_or_directories]...

PARAMETERS

--by-file
    Reports results for each source file scanned instead of summarizing by language. This provides a more granular view of the codebase. To get both the per-file and the per-language report, use --by-file-by-lang.

--exclude-dir=DIR1,DIR2,...
    Excludes the specified comma-separated list of directories from the scan. This is useful for ignoring build artifacts, dependency folders (e.g., node_modules), or documentation.

--exclude-ext=EXT1,EXT2,...
    Excludes files with the specified comma-separated list of extensions from the scan. Useful for ignoring non-source files or specific types of assets.

--force-lang=<language>,<extension>
    Forces files with the given <extension> to be counted as the specified <language>. This is helpful for non-standard file extensions or when cloc misidentifies a language.

--git
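    Forces the inputs to be interpreted as Git targets (commit hashes, branch names, and so on) when they are not first recognized as file or directory names. Combine it with --diff to compare two commits; --git-diff-rel and --git-diff-all select which files that comparison covers.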

--json
    Outputs the results in JavaScript Object Notation (JSON) format. This is ideal for programmatic parsing and integration into other tools or scripts.

--out=<file>
    Writes the report output to the specified <file> instead of standard output (stdout).

--quiet
    Suppresses all informational messages except the final report summary. Useful for cleaner output when cloc is used in scripts.

--read-lang-def=<file>
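    Reads language definitions from the specified file and merges them with the built-in ones, which allows users to add custom languages. When the file defines a language cloc already knows, the built-in definition takes precedence; use --force-lang-def=<file> to override it.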

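--vcs=<VCS>
    Asks the named version control system for the list of files to count. With --vcs=git, cloc counts the files Git tracks, so anything excluded by .gitignore is skipped. Without this option, cloc does not consult VCS ignore files such as .gitignore.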
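
The filtering options above can be combined in one invocation. The following is a minimal sketch, assuming a throwaway project whose layout and file names are made up for illustration; the cloc call is guarded so the snippet still runs where cloc is not installed.

```shell
# Build a tiny throwaway project so there is something to count.
# The directory layout and file names are hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/node_modules/dep"
printf '# entry point\nprint("hello")\n\n' > "$demo/src/app.py"
printf 'console.log("vendored");\n' > "$demo/node_modules/dep/index.js"
printf '{"name": "demo"}\n' > "$demo/src/settings.json"

if command -v cloc >/dev/null 2>&1; then
    # Skip the dependency folder and JSON assets, and list results per file.
    cloc --quiet --by-file --exclude-dir=node_modules --exclude-ext=json "$demo"
else
    echo "cloc is not installed; skipping the count" >&2
fi
```

With these filters, only src/app.py should remain in the report, since the other two files are excluded by directory name and by extension.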

DESCRIPTION

cloc (Count Lines Of Code) is a free, open-source command-line utility that counts blank lines, comment lines, and physical lines of source code in many programming languages. It recognizes over 150 languages by default, identifying them from file extensions and heuristics. Input can be individual files, entire directories, compressed archives, or content from standard input. Reports are summarized by language, with an option to report results per file. With Git, cloc can analyze a repository, a specific commit, or the changes between two revisions. Output is available in several formats, including plain text, CSV, XML, and JSON, which makes cloc useful for software project analysis and for tracking codebase size over time.

CAVEATS

cloc primarily counts lines and does not assess code quality or complexity. Its accuracy relies on file extensions and heuristics, which might occasionally misidentify languages or line types in highly unusual codebases. Mixed-language files can also pose challenges for precise categorization.

LANGUAGE DETECTION AND PARSING

cloc sorts every line of a source file into one of three categories: blank (empty or whitespace-only), comment (text the compiler or interpreter ignores), and code (executable or declarative lines). Its internal database of language definitions records the comment syntax for each supported programming, scripting, and markup language, which is what lets it separate comments from code.
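
To make the three categories concrete, here is a deliberately simplified sketch of the classification for a language whose only comment syntax is a leading `#`. It is an illustration of the idea, not cloc's actual algorithm; the function name classify_lines and the sample file are hypothetical.

```shell
# Classify each line of a file as blank, comment, or code.
# Simplification: only whole-line "#" comments are recognized.
classify_lines() {
    awk '
        /^[[:space:]]*$/ { blank++; next }    # empty or whitespace-only
        /^[[:space:]]*#/ { comment++; next }  # line starts with a comment marker
                         { code++ }           # everything else
        END { printf "blank=%d comment=%d code=%d\n", blank, comment, code }
    ' "$1"
}

# Hypothetical five-line sample: two comments, one blank line, two code lines.
sample=$(mktemp)
printf '# setup\n\nx=1\n# trailing note\necho "$x"\n' > "$sample"

classify_lines "$sample"
# prints: blank=1 comment=2 code=2
```

A real counter also has to handle block comments, comment markers inside string literals, and lines that mix code with a trailing comment, which is where per-language definitions become necessary.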

VERSION CONTROL SYSTEM INTEGRATION

cloc integrates directly with version control systems, notably Git. This allows developers to analyze not just the current codebase, but also historical states, changes between commits (e.g., --git --diff <commit1> <commit2>), or even patch files. This capability is useful for tracking project growth and refactoring efforts over time.
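
As a rough sketch of the Git workflow, the snippet below creates a throwaway repository with two commits, then counts the tracked files and compares the two commits. The repository contents, the helper name demo_commit, and the commit identity are all made up for illustration; the cloc calls are guarded so the snippet still runs where cloc is missing.

```shell
# Hypothetical throwaway repository with two commits to compare.
repo=$(mktemp -d)

# Commit with a fixed identity so the sketch does not depend on global Git config.
demo_commit() {
    git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
        -c commit.gpgsign=false commit -q -m "$1"
}

if command -v git >/dev/null 2>&1; then
    git -C "$repo" init -q
    printf 'print("v1")\n' > "$repo/app.py"
    git -C "$repo" add app.py
    demo_commit "first version"
    printf 'print("v1")\nprint("v2")\n' > "$repo/app.py"
    git -C "$repo" add app.py
    demo_commit "second version"
fi

if command -v git >/dev/null 2>&1 && command -v cloc >/dev/null 2>&1; then
    (
        cd "$repo" || exit 1
        # Count only the files Git tracks.
        cloc --quiet --vcs=git .
        # Report lines added, removed, and modified between the two commits.
        cloc --quiet --git --diff HEAD~1 HEAD
    ) || echo "cloc reported an error" >&2
fi
```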

FLEXIBLE OUTPUT FORMATS

Beyond its default human-readable text report, cloc supports output in several machine-readable formats, including CSV (--csv), JSON (--json), and XML (--xml). These formats facilitate automation, data analysis, and integration with other tools or dashboards, so cloc fits well into CI/CD pipelines and reporting systems.
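
A machine-readable report is straightforward to post-process. The sketch below pulls the total code-line count out of a CSV report. It assumes the column order files,language,blank,comment,code and a final SUM row; the sample report is hand-written with made-up numbers, and the function name total_code_lines is hypothetical.

```shell
# Print the code-line total from a cloc CSV report read on standard input.
# Assumes column 2 is the language and column 5 is the code count.
total_code_lines() {
    awk -F, '$2 == "SUM" { print $5 }'
}

# Hand-written sample report with made-up numbers.
sample_report='files,language,blank,comment,code
3,Python,10,5,120
1,C,2,1,30
4,SUM,12,6,150'

printf '%s\n' "$sample_report" | total_code_lines
# prints: 150

# Against a real tree, the same filter could be fed by cloc itself:
#   cloc --csv --quiet path/to/directory | total_code_lines
```

In a CI job, the extracted number could be compared against a threshold or appended to a log to track growth between builds.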

HISTORY

cloc was created by Al Danial in 2006 and is written in Perl. It gained popularity for its broad language support, robust parsing, and cross-platform compatibility. Continued development and a wide range of output formats have made it a common choice for source code analysis.

SEE ALSO

wc(1)
    Basic line, word, and byte counting utility.

sloccount(1)
    Another tool for counting source lines of code, though often less maintained than cloc.

find(1)
    Used to locate files that can then be piped to cloc for selective analysis.

xargs(1)
    Often used with find to construct command lines for cloc from a list of files.
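
A minimal sketch of the find and xargs combination mentioned above, assuming a made-up project tree in which only Python files outside tests/ should be counted. The cloc call is guarded so the snippet still runs where cloc is not installed.

```shell
# Hypothetical project tree: one library file, one test file, one text file.
tree=$(mktemp -d)
mkdir -p "$tree/pkg" "$tree/tests"
printf 'x = 1\n' > "$tree/pkg/core.py"
printf 'assert True\n' > "$tree/tests/test_core.py"
printf 'notes\n' > "$tree/README.txt"

# Select Python files, skipping anything under a tests/ directory.
selected=$(find "$tree" -name '*.py' ! -path '*/tests/*')
echo "$selected"

if command -v cloc >/dev/null 2>&1; then
    # NUL-separated names keep paths with spaces intact on the way to cloc.
    find "$tree" -name '*.py' ! -path '*/tests/*' -print0 | xargs -0 cloc --quiet
fi
```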
