codespell
Find and fix spelling errors in code
TLDR
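Check for typos in all text files in the current directory, recursively:
    codespell
Correct all typos found in-place:
    codespell --write-changes
Skip files with names that match the specified pattern (accepts a comma-separated list of patterns using wildcards):
    codespell --skip "pattern"
Use a custom dictionary file when checking (--dictionary can be used multiple times):
    codespell --dictionary path/to/dictionary.txt
Do not check words that are listed in the specified file:
    codespell --ignore-words path/to/ignore.txt
Do not check the specified words:
    codespell --ignore-words-list word1,word2
Print 3 lines of context around, before or after each match:
    codespell --context 3
    codespell --before-context 3
    codespell --after-context 3
Check file names for typos, in addition to file contents:
    codespell --check-filenames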
SYNOPSIS
codespell [OPTIONS] [files or directories...]
PARAMETERS
-w, --write-changes
Writes the suggested corrections to the files in place.
-D DICTIONARY, --dictionary DICTIONARY
Uses a custom dictionary file of spelling corrections. May be given multiple times; pass - to keep the default dictionary as well.
--builtin BUILTIN_LIST
Selects the built-in dictionaries to use, as a comma-separated list (e.g., 'clear,rare').
-x FILE, --exclude-file FILE
Ignores whole lines that match a line in FILE.
-c, --enable-colors
Enables colored output.
-d, --disable-colors
Disables colored output.
-A LINES, -B LINES, -C LINES
Prints LINES of context after, before, or around each misspelled word.
-S SKIP_PATTERNS, --skip SKIP_PATTERNS
Skips files or directories matching the given comma-separated patterns.
-I FILE, --ignore-words FILE
Ignores the words listed in FILE, one word per line.
-L WORDS, --ignore-words-list WORDS
Ignores the words given as a comma-separated list.
-r REGEX, --regex REGEX
Sets the regular expression used to find words in the text.
-s, --summary
Prints a summary of the fixes.
-f, --check-filenames
Also checks for misspelled words in filenames.
-H, --check-hidden
Includes hidden files and directories (names starting with a dot) in the check.
--version
Shows the program's version number and exits.
DESCRIPTION
codespell is a command-line utility that scans source code, documentation, and comments for common spelling mistakes. Unlike general-purpose spell checkers, which flag every word missing from a dictionary of valid words, codespell matches text against a curated list of frequently misspelled words and their corrections. This keeps noise from legitimate variable names and technical jargon low. It can report errors and, optionally, fix them in place, which makes it well suited to continuous integration (CI) pipelines, pre-commit hooks, and routine local checks.
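As a rough illustration of the report-then-fix workflow, the following sketch creates a scratch file containing a common misspelling, runs codespell on it, and then applies the correction. The scratch directory, the file name notes.txt, and the sample sentence are made up for the example, and the block makes no assumption about the exact wording of codespell's report.

```shell
# Work in a throwaway directory so nothing in a real project is touched.
workdir=$(mktemp -d)
printf 'This comment has teh classic typo.\n' > "$workdir/notes.txt"

if command -v codespell >/dev/null 2>&1; then
    # Report only; the file is left untouched.
    # "|| true" keeps the script going, since codespell exits non-zero
    # when it reports a misspelling.
    codespell "$workdir/notes.txt" || true

    # Apply the suggested corrections in place.
    codespell --write-changes "$workdir/notes.txt" || true
else
    echo "codespell is not installed; skipping the check" >&2
fi
```

Running the report-only command first gives a chance to review the findings before any file is modified.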
CAVEATS
False Positives: A legitimate technical term, variable name, or acronym can coincide with an entry in the list of known misspellings and be flagged as an error. Such cases need manual review or an entry in an ignore list (for example with --ignore-words-list).
Limited Scope: It primarily focuses on common spelling errors and does not perform grammar checks or understand the linguistic context of the text.
In-place Modification: Using the -w option modifies files directly. It is highly recommended to use codespell with version control (e.g., Git) to easily revert changes if unintended modifications occur.
Dictionary Dependent: codespell only reports words that appear in its built-in or user-supplied dictionaries of known misspellings, so an uncommon or domain-specific typo that is not listed goes undetected.
INTEGRATION WITH CI/CD
codespell is frequently integrated into Continuous Integration/Continuous Deployment (CI/CD) pipelines. This ensures that any new or modified code is automatically checked for spelling errors before it's merged, preventing common typos from entering the codebase. Tools like GitHub Actions, GitLab CI/CD, or Jenkins can easily run codespell as part of their build or test stages.
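One way a CI job might invoke codespell is through a small wrapper whose exit status decides whether the stage passes. The sketch below is an illustrative example rather than a recommended setup: the function name run_spellcheck, the skip patterns, and the ignored words are hypothetical, and it relies on codespell exiting with a non-zero status when it reports misspellings.

```shell
# Hypothetical CI helper: returns codespell's exit status so the calling
# pipeline stage fails when misspellings are reported.
run_spellcheck() {
    if ! command -v codespell >/dev/null 2>&1; then
        echo "codespell is not installed" >&2
        return 127
    fi
    # The skip patterns and ignored words are placeholders; adjust per project.
    codespell \
        --skip ".git,*.svg,*.lock" \
        --ignore-words-list "crate,nd" \
        "$@"
}

# In a pipeline this would typically be the last command of the step:
#   run_spellcheck . || exit 1
```

Keeping the options in one function means local runs and the CI job use the same settings.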
CUSTOM DICTIONARIES
A custom dictionary file adds project-specific misspellings and their corrections to the ones codespell already knows. Project-specific terminology, acronyms, or proper nouns that are wrongly flagged belong in an ignore list instead, which tells codespell to treat them as correctly spelled. Together the two mechanisms reduce false positives and tailor the tool to a project.
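The file formats involved can be sketched as follows, assuming the conventions of recent codespell releases: a dictionary file holds one misspelling->correction pair per line, and an ignore file holds one word per line. The file names and sample entries are hypothetical.

```shell
workdir=$(mktemp -d)

# Custom dictionary: one "misspelling->correction" pair per line.
cat > "$workdir/project-dict.txt" <<'EOF'
kubernets->kubernetes
postgress->postgres
EOF

# Ignore list: one word per line that should never be reported.
cat > "$workdir/ignore-words.txt" <<'EOF'
crate
nd
EOF

printf 'Deploy to kubernets, then migrate postgress.\n' > "$workdir/README.txt"

if command -v codespell >/dev/null 2>&1; then
    # "-D -" keeps the default dictionary; the second -D adds the custom one.
    codespell -D - -D "$workdir/project-dict.txt" \
        -I "$workdir/ignore-words.txt" "$workdir/README.txt" || true
else
    echo "codespell is not installed; skipping the check" >&2
fi
```

Both files are plain text, so they can be committed alongside the code and shared between local runs and CI.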
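EXIT CODES
codespell does not group its findings into error codes that can be switched on or off individually. It signals the result through its exit status instead: 0 when no misspellings are reported and 65 when at least one is found, which lets scripts and CI jobs fail on typos. The -q (--quiet-level) option controls which warnings are printed, and the ignore and skip options control what is checked.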
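For scripting, the exit status is usually the most convenient signal. The sketch below assumes the convention of recent codespell releases, where 0 means no misspellings were reported and 65 means at least one was found. The scratch file and its contents are made up for the example, and 127 is used here only as this script's own marker for a missing installation.

```shell
workdir=$(mktemp -d)
printf 'All of these words are spelled correctly.\n' > "$workdir/clean.txt"

if command -v codespell >/dev/null 2>&1; then
    status=0
    # Capture the exit status instead of letting it abort the script.
    codespell "$workdir/clean.txt" || status=$?
else
    status=127
fi

case "$status" in
    0)   echo "no misspellings reported" ;;
    65)  echo "misspellings found" ;;
    127) echo "codespell is not installed" ;;
    *)   echo "codespell failed with status $status" ;;
esac
```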
HISTORY
codespell was originally written by Lucas De Marchi, with development beginning around 2010. It grew out of the need for a lightweight, code-centric spell checker that could find common typos in source code without the overhead and false positives that traditional spell checkers produce in programming contexts. Development has since been community-driven under the codespell-project organization on GitHub, which has expanded the dictionaries and features over time. Its focused approach and ease of integration into development workflows account for much of its popularity.