py-sql-cleaner
Format and extract SQL strings embedded in Python source files
TLDR
SYNOPSIS
py-sql-cleaner command [options] file
DESCRIPTION
py-sql-cleaner is a Python CLI that finds triple-quoted SQL strings inside Python files and reformats or extracts them using SQLGlot. It is intended for ETL and data-engineering projects where SQL is embedded in `.py` files rather than kept in separate `.sql` files.

The tool is conservative by default. f-strings, Jinja templates and other runtime placeholders are detected but skipped instead of being rewritten, so formatting cannot silently change a query that depends on interpolation.

py-sql-cleaner never connects to a database and never executes SQL. The supported dialects (`generic`, `mysql`, `postgres`, `redshift`) only select SQLGlot's parser and formatter mode; they do not provide full database validation.

The `check` subcommand exits non-zero when formatting would change a file, making it suitable as a pre-commit hook or CI gate next to `black`, `ruff` and similar formatters.
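The conservative detection described above can be illustrated with a short sketch built on Python's standard `ast` module. It is not py-sql-cleaner's actual implementation: the names `find_sql_blocks`, `SqlBlock`, `SQL_KEYWORDS` and `TEMPLATE_MARKERS` are hypothetical, and the "looks like SQL" test is a deliberately crude assumption (first word is a common SQL keyword). The point is only to show how f-strings and templated strings can be found and flagged as skipped rather than rewritten.

```python
import ast
from typing import List, NamedTuple, Optional

# Hypothetical heuristics for this sketch; not py-sql-cleaner's actual rules.
SQL_KEYWORDS = {"select", "insert", "update", "delete", "with", "create"}
TEMPLATE_MARKERS = ("{{", "{%", "%(", "%s")
STRING_PREFIXES = "rRbBfFuU"


class SqlBlock(NamedTuple):
    lineno: int
    text: str
    skip_reason: Optional[str]  # None means the block is safe to reformat


def _looks_like_sql(text: str) -> bool:
    # Crude assumption: the first word is a common SQL keyword.
    words = text.split(None, 1)
    return bool(words) and words[0].lower() in SQL_KEYWORDS


def find_sql_blocks(source: str) -> List[SqlBlock]:
    tree = ast.parse(source)
    # Pieces of an f-string are reported through their parent JoinedStr only.
    nested = {
        id(child)
        for node in ast.walk(tree)
        if isinstance(node, ast.JoinedStr)
        for child in ast.walk(node)
        if child is not node
    }
    blocks = []
    for node in ast.walk(tree):
        if id(node) in nested:
            continue
        is_fstring = isinstance(node, ast.JoinedStr)
        is_plain = isinstance(node, ast.Constant) and isinstance(node.value, str)
        if not (is_fstring or is_plain):
            continue
        # The AST does not record quote style, so inspect the source text.
        segment = ast.get_source_segment(source, node) or ""
        unprefixed = segment.lstrip(STRING_PREFIXES)
        if not unprefixed.startswith(('"""', "'''")):
            continue
        body = node.value if is_plain else unprefixed[3:-3]
        if not _looks_like_sql(body):
            continue
        if is_fstring:
            reason = "f-string"
        elif any(marker in body for marker in TEMPLATE_MARKERS):
            reason = "template"
        else:
            reason = None
        blocks.append(SqlBlock(node.lineno, body.strip(), reason))
    return sorted(blocks, key=lambda block: block.lineno)


SAMPLE = '''
plain = """
    select id, name
    from users
"""
table = "users"
dynamic = f"""
    select * from {table}
"""
jinja = """
    select * from {{ ref('users') }}
"""
short = "select 1"
'''

for block in find_sql_blocks(SAMPLE):
    print(block.lineno, block.skip_reason or "format")
```

Only blocks whose `skip_reason` is `None` would be handed to a formatter in this sketch; everything else is reported and left untouched. A keyword heuristic like this will also match some docstrings, which is one reason a real tool would likely parse candidates before rewriting them.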
PARAMETERS
list file
Show every embedded SQL block found in the file.
format file
Reformat embedded SQL in place.
extract file
Write each SQL block to a separate `.sql` file.
check file
Exit non-zero if formatting would change the file.
dialects
Print supported SQL dialects.
-d, --dialect NAME
Select dialect: `generic`, `mysql`, `postgres`, `redshift`.
--dry-run
Print the formatted result instead of writing it.
--out-dir DIR
Output directory for `extract`.
--version
Print the installed version.
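The exit-status contract of `check` can be sketched in a few lines of Python. This is an illustration under assumptions, not the tool's code: `toy_formatter` is a hypothetical stand-in that only uppercases three keywords (the real tool delegates formatting to SQLGlot), the block labels are made up, and the concrete non-zero status of 1 is assumed because the documentation only promises "non-zero".

```python
import re
from typing import Callable, Dict

Formatter = Callable[[str], str]

_KEYWORDS = re.compile(r"\b(select|from|where)\b", re.IGNORECASE)


def toy_formatter(sql: str) -> str:
    """Hypothetical stand-in formatter: uppercases a few keywords only."""
    return _KEYWORDS.sub(lambda match: match.group(0).upper(), sql)


def check(blocks: Dict[str, str], formatter: Formatter) -> int:
    """Return a process-style status: 1 if any block would change, else 0."""
    changed = [label for label, sql in blocks.items() if formatter(sql) != sql]
    for label in changed:
        print(f"would reformat {label}")
    # Assumption: 1 is used as the non-zero status.
    return 1 if changed else 0


# Hypothetical labels in "file:line" form, used only for reporting.
status = check(
    {
        "etl/load.py:12": "select id from users where active",
        "etl/load.py:40": "SELECT id FROM orders",
    },
    toy_formatter,
)
```

Because nothing is written in this mode, a CI job or pre-commit hook can run it on every commit and fail the build purely on the returned status.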
INSTALLATION
CAVEATS
The project is an early MVP. f-strings and templating syntax are intentionally skipped to avoid breaking queries that build SQL at runtime. Dialect selection affects parsing only and does not guarantee a query will execute on the target database.
