git-sed
Apply sed across Git repository revisions
TLDR
Replace the specified text in the current repository
Replace the specified text and then commit the resulting changes with a standard commit message
Replace the specified text, using regex
Replace the specified text in all files under a given directory
SYNOPSIS
git sed [-i] [-E | -r] [-f <scriptfile>] <sed-expression> [<rev-list-options>]
PARAMETERS
<sed-expression>
The sed script or command to apply to each file blob. For example, "s/old_text/new_text/g".
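As a rough illustration of what such an expression does to blob content, the sketch below pipes a made-up line through plain sed. The sample text is hypothetical and no Git objects are involved.

```shell
# Hypothetical sample content standing in for the text of one file blob.
sample='old_text appears here, and old_text appears again'

# The trailing "g" flag replaces every match on the line, not only the first.
result=$(printf '%s\n' "$sample" | sed 's/old_text/new_text/g')
echo "$result"
```

Without the "g" flag only the first occurrence on each line would be replaced.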
-i
Apply the changes 'in place' to the blobs. History-rewriting operations behave this way implicitly.
-E | -r
Use extended regular expressions in the sed command (equivalent to sed -E; -r is the older GNU sed spelling of the same option).
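To show the difference the flag makes, here is a small sketch using plain sed on a hypothetical version string. It assumes a sed that accepts -E (GNU and BSD sed both do). The same substitution is written once with basic and once with extended syntax.

```shell
# Hypothetical input line.
input='version=1.2.3'

# Basic regular expressions: groups need backslashes and there is no "+" quantifier.
bre=$(printf '%s\n' "$input" |
  sed 's/\([0-9][0-9]*\)\.\([0-9][0-9]*\)\.[0-9][0-9]*/\1.\2.x/')

# Extended regular expressions: plain parentheses and "+" work directly.
ere=$(printf '%s\n' "$input" |
  sed -E 's/([0-9]+)\.([0-9]+)\.[0-9]+/\1.\2.x/')

echo "$bre"
echo "$ere"
```

Both commands produce the same result; the extended form is simply easier to read once the pattern contains several groups.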
-f <scriptfile>
Read the sed script from the specified <scriptfile> instead of directly from the command line.
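A script file is just a list of sed commands, one per line. The sketch below builds a temporary, hypothetical script and runs it with plain sed -f to show the format; how a particular git-sed implementation forwards the file to sed is not specified here.

```shell
# Write a two-command sed script to a temporary file (hypothetical content).
script=$(mktemp)
cat > "$script" <<'EOF'
s/old_text/new_text/g
s/[[:space:]]*$//
EOF

# The first command renames the text, the second strips trailing whitespace.
out=$(printf 'old_text with trailing spaces   \n' | sed -f "$script")
echo "[$out]"

rm -f "$script"
```

Keeping the commands in a file avoids shell-quoting problems when the expressions contain quotes or slashes.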
<rev-list-options>
Options passed to git rev-list to specify the range of commits whose blobs should be processed. Examples include HEAD, master^..HEAD, --all, or specific paths like --all -- filename.txt.
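To see which commits a given set of options selects, the options can be tried with git rev-list directly. The following sketch builds a throwaway repository (all file names and commit messages are hypothetical) and counts the commits matched by three of the forms mentioned above.

```shell
# Throwaway repository; nothing here touches real data.
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.name "Example User"
git -C "$repo" config user.email "user@example.invalid"
git -C "$repo" config commit.gpgsign false

# Three commits that modify filename.txt, then one that does not.
for n in 1 2 3; do
  echo "revision $n" > "$repo/filename.txt"
  git -C "$repo" add filename.txt
  git -C "$repo" commit -q -m "Update filename.txt ($n)"
done
echo "unrelated" > "$repo/other.txt"
git -C "$repo" add other.txt
git -C "$repo" commit -q -m "Add other.txt"

all_count=$(git -C "$repo" rev-list --count --all)                   # every commit
range_count=$(git -C "$repo" rev-list --count HEAD~2..HEAD)          # last two commits
path_count=$(git -C "$repo" rev-list --count --all -- filename.txt)  # commits touching the path
echo "$all_count $range_count $path_count"   # 4 2 3 in this sketch
```

Checking the selection this way before a rewrite makes it clear how much history is about to be touched.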
DESCRIPTION
git-sed is a conceptual utility, often implemented as a standalone script or wrapper, that allows users to apply sed (stream editor) commands to the content of files (blobs) across Git history. Unlike simple file editing, git-sed operates directly on Git objects, enabling systematic modifications to past commits, branches, or the entire repository history.
It is commonly used for tasks such as renaming variables globally, updating file paths, or scrubbing sensitive information (such as passwords or tokens) that was accidentally committed. Because it rewrites history, git-sed produces new commit SHAs for every rewritten commit and for all of its descendants. Implementations usually delegate the rewrite to an existing history-rewriting tool, such as git filter-branch (shipped with Git) or git filter-repo (a separate, third-party tool), and provide a narrower interface for text-based transformations. Use it with caution, especially on history that other people have already fetched.
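One way such a wrapper could be built is around git filter-branch --tree-filter. The sketch below is an assumption about a possible implementation, not the code of any particular git-sed script. It runs in a throwaway repository with a made-up secret and shows that the tip commit gets a new SHA after the rewrite.

```shell
# Throwaway repository with a hypothetical leaked password.
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.name "Example User"
git -C "$repo" config user.email "user@example.invalid"
git -C "$repo" config commit.gpgsign false

printf 'password=hunter2\n' > "$repo/config.txt"
git -C "$repo" add config.txt
git -C "$repo" commit -q -m "Add config"
printf 'docs\n' > "$repo/README"
git -C "$repo" add README
git -C "$repo" commit -q -m "Add README"

old_head=$(git -C "$repo" rev-parse HEAD)

# --tree-filter checks out each commit and runs the command in that checkout.
# A temporary file is used instead of "sed -i", whose syntax differs between
# GNU and BSD sed.
(
  cd "$repo" &&
  FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f --tree-filter '
    if [ -f config.txt ]; then
      sed "s/hunter2/REDACTED/g" config.txt > config.txt.new &&
      mv config.txt.new config.txt
    fi
  ' -- --all >/dev/null 2>&1
)

new_head=$(git -C "$repo" rev-parse HEAD)
# Count remaining occurrences of the secret in the rewritten branch history.
leaks=$(git -C "$repo" log -p HEAD | grep -c 'hunter2')
echo "old: $old_head"
echo "new: $new_head"
echo "remaining occurrences: $leaks"
```

Note that filter-branch keeps the original commits reachable under refs/original/, so the old content is still in the object database until those refs are deleted and the repository is garbage-collected.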
CAVEATS
1. Not a Built-in Command: git-sed is typically a third-party script or a user-contributed utility, not a command officially shipped with Git itself.
2. History Rewriting: This command fundamentally alters Git history by creating new commit objects. Every rewritten commit and all of its descendants receive new SHAs, so anything that refers to the old SHAs (tags, links, branches based on them) no longer matches the rewritten history.
3. Collaboration Impact: Rewriting shared history requires coordination with collaborators, as it necessitates force-pushing (e.g., git push --force-with-lease) and may invalidate their local branches.
4. Performance: On large repositories with extensive history or many files, the operation can be very time-consuming and resource-intensive.
5. Destructive Potential: Incorrect usage can lead to data loss or a corrupted repository state. Always back up your repository before performing such operations.
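A simple way to take the backup recommended above is a mirror clone, which copies every ref of the repository. The sketch below uses a throwaway repository and hypothetical paths, and checks that the backup points at the same commit as the original.

```shell
# Throwaway repository standing in for the one about to be rewritten.
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.name "Example User"
git -C "$repo" config user.email "user@example.invalid"
git -C "$repo" config commit.gpgsign false
echo "content" > "$repo/file.txt"
git -C "$repo" add file.txt
git -C "$repo" commit -q -m "Initial commit"

# A mirror clone copies all branches, tags, and other refs.
backup_dir=$(mktemp -d)
git clone -q --mirror "$repo" "$backup_dir/repo-backup.git"

orig_head=$(git -C "$repo" rev-parse HEAD)
backup_head=$(git -C "$backup_dir/repo-backup.git" rev-parse HEAD)
echo "original: $orig_head"
echo "backup:   $backup_head"

# After a rewrite, the shared branch would be published with
#   git push --force-with-lease
# which refuses to overwrite commits pushed by others in the meantime.
```

If the rewrite goes wrong, the backup can be cloned again to restore the original history.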
HOW IT WORKS (CONCEPTUAL)
Conceptually, git-sed iterates through the specified commits. For each commit, it identifies the relevant file blobs. It then applies the provided sed expression to the content of these blobs. If the content changes, a new blob object is created, and subsequently, a new tree object and a new commit object are generated to reflect this change. This process continues through the history, effectively rebuilding the affected portion of the commit graph with the new content.
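The blob, tree, and commit steps described above can be sketched for a single commit with Git plumbing commands. This is a conceptual illustration in a throwaway repository with hypothetical file names, not how any particular git-sed implementation is written.

```shell
# Throwaway repository with one commit.
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.name "Example User"
git -C "$repo" config user.email "user@example.invalid"
git -C "$repo" config commit.gpgsign false
printf 'old_text\n' > "$repo/file.txt"
git -C "$repo" add file.txt
git -C "$repo" commit -q -m "Initial commit"

# 1. Locate the blob and run its content through sed; store the result as a new blob.
old_blob=$(git -C "$repo" rev-parse HEAD:file.txt)
new_blob=$(git -C "$repo" cat-file blob "$old_blob" |
  sed 's/old_text/new_text/g' |
  git -C "$repo" hash-object -w --stdin)

# 2. Build a new tree in a scratch index that points file.txt at the new blob.
#    The index file must not exist yet, so it is not created with mktemp.
tmp_index="$repo/.git/sketch-index"
GIT_INDEX_FILE="$tmp_index" git -C "$repo" read-tree HEAD
GIT_INDEX_FILE="$tmp_index" git -C "$repo" update-index \
  --cacheinfo 100644,"$new_blob",file.txt
new_tree=$(GIT_INDEX_FILE="$tmp_index" git -C "$repo" write-tree)
rm -f "$tmp_index"

# 3. Create a new commit object for the new tree (a root commit in this sketch).
new_commit=$(git -C "$repo" commit-tree "$new_tree" -m "Initial commit")

rewritten=$(git -C "$repo" show "$new_commit:file.txt")
echo "old blob: $old_blob"
echo "new blob: $new_blob"
echo "rewritten content: $rewritten"
```

In a full rewrite, each later commit would be recreated the same way with its rewritten predecessor passed as parent (commit-tree -p), which is why all descendant SHAs change as well.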
TYPICAL USE CASES
Removing Sensitive Data: Eradicating passwords, API keys, or other confidential information inadvertently committed to history.
Global Refactoring: Changing variable names, function names, or package paths across an entire project's history.
Standardizing Content: Enforcing consistent headers, footers, or license blocks in all relevant files.
Bulk Text Replacement: Any scenario requiring a systematic search-and-replace operation across many versions of a file.
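Before rewriting anything for one of these use cases, it can help to find out which commits actually introduced or removed the text in question. The sketch below uses git log -S for that in a throwaway repository; the key value and file names are hypothetical.

```shell
# Throwaway repository with a hypothetical API key that is added and later removed.
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.name "Example User"
git -C "$repo" config user.email "user@example.invalid"
git -C "$repo" config commit.gpgsign false

printf 'API_KEY=abc123\n' > "$repo/settings.env"
git -C "$repo" add settings.env
git -C "$repo" commit -q -m "Add settings"

printf 'notes\n' > "$repo/notes.txt"
git -C "$repo" add notes.txt
git -C "$repo" commit -q -m "Add notes"

printf 'API_KEY=\n' > "$repo/settings.env"
git -C "$repo" add settings.env
git -C "$repo" commit -q -m "Remove key"

# -S lists commits that change the number of occurrences of the string,
# here the commit that added the key and the commit that removed it.
hits=$(git -C "$repo" log --all --format=%H -S'abc123' | grep -c .)
echo "commits adding or removing the key: $hits"
```

The key is no longer in the latest revision, yet it remains readable in history, which is exactly the situation a history rewrite is meant to fix.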
HISTORY
git-sed doesn't have a single, official development history as a standalone project. Instead, it emerged as a common pattern and practical solution among Git users who needed to apply consistent text transformations across their repository's history. Various implementations exist as community-contributed scripts, often wrapping history-rewriting tools such as git filter-branch (shipped with Git) and, more recently, the third-party git filter-repo, to streamline the process of applying sed-based changes to blobs. Its widespread use reflects a common need in Git workflows for systematic content modification that goes beyond simple commit amendments.
SEE ALSO
git-filter-branch(1), git-filter-repo(1), git-rebase(1), sed(1)