csplit
Split a file into sections
TLDR
Split a file in two parts, starting the second one at line 10
Split a file in three parts, starting the latter parts in lines 7 and 23
Start a new part at every 5th line (will fail if number of lines is not divisible by 5)
Start a new part at every 5th line, ignoring exact-division error
Split a file above line 5 and use a custom prefix for the output files (default is xx)
Split a file above the first line matching a regex pattern
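As a sketch, the tasks above could be run as follows. It assumes GNU csplit; the scratch directory, the 28-line sample.txt, and the part_ and match_ prefixes are made up for illustration.

```shell
# Work in a scratch directory so the xxNN output files cannot clobber anything.
cd "$(mktemp -d)" || exit 1

# Hypothetical input: 28 lines, one number per line (not divisible by 5).
seq 1 28 > sample.txt

# Two parts, the second starting at line 10.
csplit sample.txt 10

# Three parts, the latter two starting at lines 7 and 23.
csplit sample.txt 7 23

# New part at every 5th line. On this input csplit runs out of lines,
# reports an error, and removes the files it wrote.
csplit sample.txt 5 '{*}' || echo "csplit reported an error"

# Same split, but -k keeps the files written before the error.
csplit -k sample.txt 5 '{*}' || true

# Split above line 5 with a custom prefix: writes part_00 and part_01.
csplit -f part_ sample.txt 5

# Split above the first line matching a regex (here the line "20").
csplit -f match_ sample.txt '/^20$/'
```

Each run reuses the default xx prefix unless -f is given, so later commands overwrite the pieces left by earlier ones.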
SYNOPSIS
csplit [OPTION]... FILE PATTERN...
PARAMETERS
-b, --suffix-format=FORMAT
use sprintf FORMAT (e.g., %03d) for suffixes instead of the default %02d
-f, --prefix=PREFIX
use PREFIX (default xx) for output filenames
-k, --keep-files
keep all generated files on error
-n, --digits=DIGITS
use DIGITS digits in output filename suffixes instead of the default 2
-s, --quiet, --silent
suppress printing file sizes
-z, --elide-empty-files
suppress empty output files instead of creating them
--help
display help and exit
--version
output version information and exit
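The naming options are easiest to compare side by side. The following is a minimal sketch assuming GNU csplit; notes.txt, lead.txt, and the note_, chunk_, and lead_ prefixes are hypothetical.

```shell
cd "$(mktemp -d)" || exit 1

# Hypothetical input with "---" separator lines.
printf '%s\n' 'alpha' '---' 'beta' '---' 'gamma' > notes.txt

# -s: no byte counts; -f: prefix "note_"; -n 3: three-digit suffixes.
# Writes note_000, note_001, note_002.
csplit -s -f note_ -n 3 notes.txt '/^---$/' '{*}'

# -b: full printf-style suffix, here used to add a file extension.
# Writes chunk_00.txt, chunk_01.txt, chunk_02.txt.
csplit -s -f chunk_ -b '%02d.txt' notes.txt '/^---$/' '{*}'

# -z: an input that starts with a separator would produce an empty
# first piece; with -z that piece is not created.
printf '%s\n' '---' 'alpha' '---' 'beta' > lead.txt
csplit -s -z -f lead_ lead.txt '/^---$/' '{*}'
ls
```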
DESCRIPTION
csplit is a Unix/Linux utility that splits a file into multiple pieces at contextual delimiters, such as lines matching a regular expression or specific line numbers. Unlike split, which cuts at fixed sizes (a set number of lines or bytes), csplit takes its split points from the content of the file.
For example, to split a source file at each line beginning with func, use: csplit file.c '/^func/' '{*}'. This creates xx00 (everything before the first match), xx01 (from the first match up to the second), and so on. The last file runs from the final match to the end of the input, so there is no separate remainder file.
Patterns include:
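- /REGEXP/: split before the next line matching the regex; the matching line starts the new piece.
- %REGEXP%: skip ahead to the next matching line, discarding the skipped lines instead of writing them to a file.
- LINE_NO: split before the given line number; that line starts the new piece.
- {N}: repeat the previous pattern N more times; {*} repeats it as many times as possible.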
By default, csplit prints the byte count of each output file. Output filenames consist of the prefix xx followed by a two-digit suffix (xx00, xx01, ...). It suits logs, scripts, and other structured text, though a pattern that matches often produces many small files.
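The three pattern kinds can be compared on one small input. This is a sketch assuming GNU csplit; src.txt and the skip_ and num_ prefixes are hypothetical.

```shell
cd "$(mktemp -d)" || exit 1

# Hypothetical five-line input.
printf '%s\n' 'header' 'func a' 'body a' 'func b' 'body b' > src.txt

# /REGEXP/ with {*}: xx00 holds the text before the first match,
# then one piece per matching line (xx01 and xx02).
csplit -s src.txt '/^func/' '{*}'

# %REGEXP% skips without writing: "header" is discarded, so the
# first piece (skip_00) already starts at "func a".
csplit -s -f skip_ src.txt '%^func%' '/^func/'

# LINE_NO: num_00 gets lines 1-2, num_01 starts at line 3.
csplit -s -f num_ src.txt 3

head -n 1 xx01 skip_00 num_01
```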
CAVEATS
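Output files are numbered sequentially (xx00, xx01, ...), and existing files with the same names are overwritten without warning. A {*} repeat on a frequently matching pattern can generate a very large number of files. There is no option for case-insensitive matching. On error, csplit removes the files it has written unless -k is given.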
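Two of these caveats have simple workarounds, sketched here under the assumption of GNU csplit: a dedicated directory or prefix avoids overwriting, and bracket expressions stand in for case-insensitive matching. The app.log contents and the err_ prefix are hypothetical.

```shell
# A fresh directory keeps the output away from existing xxNN files.
cd "$(mktemp -d)" || exit 1

# Hypothetical log with mixed-case markers.
printf '%s\n' 'start' 'Error: a' 'ok' 'ERROR: b' > app.log

# No case-insensitive flag exists, so each letter lists both cases.
csplit -s -f err_ app.log '/^[Ee][Rr][Rr][Oo][Rr]:/' '{*}'

head -n 1 err_01 err_02
```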
PATTERN SYNTAX
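/REGEXP/[OFFSET]: copy up to, but not including, the next matching line; an optional OFFSET of +N or -N moves the break N lines after or before the match.
%REGEXP%[OFFSET]: skip to, but not including, the next matching line without writing the skipped lines.
INTEGER: copy up to, but not including, the given line number.
{N} or {*}: repeat the previous pattern N more times, or as many times as possible.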
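Offsets and repeat counts are easier to see on a concrete input. The sketch below assumes GNU csplit; doc.txt and the two_ prefix are hypothetical.

```shell
cd "$(mktemp -d)" || exit 1

# Hypothetical input: each "start" line is preceded by a comment line.
printf '%s\n' 'intro' '# comment' 'start one' 'data' \
              '# comment' 'start two' 'data' > doc.txt

# Offset -1 breaks one line above each match, so every comment
# stays with the block that follows it.
csplit -s doc.txt '/^start/-1' '{*}'

# {1} means one additional repetition: the pattern is applied twice,
# giving two_00 (before the first match), two_01, and two_02.
csplit -s -f two_ doc.txt '/^start/' '{1}'

cat xx01
```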
EXAMPLE
csplit logfile '/^ERROR:/' '{20}' '%^Date:%'
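Splits before each of the first 21 lines starting with ERROR: (the pattern plus 20 repeats), then discards everything up to the next line starting with Date: and writes the rest of the file to the final output. csplit fails if the log contains fewer than 21 such lines.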
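A scaled-down version of the same command shows the effect of the skip pattern. This sketch assumes GNU csplit, uses {1} instead of {20}, and runs on a made-up eight-line logfile.

```shell
cd "$(mktemp -d)" || exit 1

# Hypothetical log with two ERROR lines and two Date lines.
printf '%s\n' 'Date: mon' 'info a' 'ERROR: one' 'info b' \
              'ERROR: two' 'info c' 'Date: tue' 'info d' > logfile

# Break before the first two ERROR lines, then skip (without
# writing) up to the next Date line; the rest goes to the last file.
csplit -s logfile '/^ERROR:/' '{1}' '%^Date:%'

# xx00: "Date: mon", "info a"
# xx01: "ERROR: one", "info b"
# xx02: "Date: tue", "info d"
# "ERROR: two" and "info c" fall in the skipped span and are dropped.
head xx*
```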
HISTORY
csplit is specified by POSIX, including POSIX.1-2008. The GNU implementation ships in coreutils (earlier in textutils) and adds extensions such as --suffix-format and --elide-empty-files.


