csplit
Split a file into sections
TLDR
Split a file at lines 5 and 23
Split a file every 5 lines (this will fail if the total number of lines is not divisible by 5)
Split a file every 5 lines, ignoring exact-division error
Split a file at line 5 and use a custom prefix for the output files
Split a file at a line matching a regular expression
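The tasks above might be run as follows. This is a sketch, assuming GNU csplit: `input.txt` is a placeholder 30-line file generated with `seq`, and `part_` and `match_` are arbitrary prefixes. The `{*}` repeat count (repeat the previous pattern as many times as possible) is quoted so the shell leaves the braces alone.

```shell
# Work in a scratch directory so the generated files are easy to inspect.
workdir=$(mktemp -d) && cd "$workdir"
seq 1 30 > input.txt              # placeholder input: 30 numbered lines

# Split at lines 5 and 23: xx00 = lines 1-4, xx01 = lines 5-22, xx02 = lines 23-30.
csplit input.txt 5 23

# Split every 5 lines. This may end with a "line number out of range" error,
# in which case csplit removes the files it created.
csplit input.txt 5 '{*}' || true

# Same split, but -k keeps the pieces written before any such error.
csplit -k input.txt 5 '{*}' || true

# Split at line 5 with a custom prefix: part_00 and part_01.
csplit -f part_ input.txt 5

# Split at the first line matching a regular expression (here, the line "20").
csplit -f match_ input.txt '/^20$/'
```

Note that a line-number pattern starts a new piece at that line, so the first piece of the "every 5 lines" split holds lines 1-4 and each later piece starts at a multiple of 5.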
SYNOPSIS
csplit [OPTION]... FILE PATTERN...
PARAMETERS
-f, --prefix=PREFIX
Use PREFIX instead of `xx`.
-b, --suffix-format=FORMAT
Use sprintf FORMAT instead of `%02d`.
-n, --digits=DIGITS
Use specified number of digits instead of 2.
-s, --quiet, --silent
Do not print counts of output file sizes.
-k, --keep-files
Do not remove output files on errors.
-z, --elide-empty-files
Suppress generation of zero-length output files.
--help
Display help and exit.
--version
Output version information and exit.
PATTERN
The pattern that determines where to split the file: a line number, a regular expression, or a repeat count (see PATTERN DESCRIPTION).
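As an illustration of the naming options, the sketch below combines `-f`, `-n`, and `-b`. The file name `data.txt` and the prefixes `chunk_` and `piece_` are made up for the example; `-s` only silences the byte counts.

```shell
workdir=$(mktemp -d) && cd "$workdir"
seq 1 10 > data.txt               # placeholder input: 10 numbered lines

# -f sets the prefix and -n the digit count: chunk_000, chunk_001, chunk_002.
csplit -s -f chunk_ -n 3 data.txt 4 8

# -b supplies a printf-style suffix: piece_00.txt, piece_01.txt, piece_02.txt.
csplit -s -f piece_ -b '%02d.txt' data.txt 4 8

ls
```

Using `-b` is one way to give the pieces a file extension, since the suffix format may contain literal text around the number.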
DESCRIPTION
The csplit command in Linux splits a file into sections determined by context lines. It reads the input file, writes each section to a separate output file according to the given patterns (line numbers or regular expressions), and names the output files sequentially. csplit is useful for dividing large files into smaller, more manageable chunks, especially log files or other text data in which specific delimiter lines mark the boundaries between sections. Unlike `split`, which divides by size or line count, csplit divides by content.
Sections can be defined using line numbers, regular expressions, or a combination of both. Unless `-s` is given, csplit prints the size in bytes of each output file to standard output.
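A small sketch of the byte counts csplit reports, using a hypothetical file `nums.txt` holding the numbers 1 to 10, one per line:

```shell
workdir=$(mktemp -d) && cd "$workdir"
seq 1 10 > nums.txt               # placeholder input

# Split before line 4 and before line 8, capturing the reported sizes.
sizes=$(csplit nums.txt 4 8)
echo "$sizes"
# Prints 6, 8 and 7 on separate lines:
#   xx00 = lines 1-3  ("1\n2\n3\n",    6 bytes)
#   xx01 = lines 4-7  ("4\n5\n6\n7\n", 8 bytes)
#   xx02 = lines 8-10 ("8\n9\n10\n",   7 bytes)
```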
CAVEATS
If an error occurs or a HUP, INT, or TERM signal is received, csplit removes the output files it has created, unless the `-k` option is specified.
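The cleanup behavior can be observed with a line number that lies past the end of the input. This is a sketch with a made-up 10-line file `nums.txt`; the second pattern, 20, is deliberately out of range.

```shell
workdir=$(mktemp -d) && cd "$workdir"
seq 1 10 > nums.txt               # placeholder input: only 10 lines

# Without -k: csplit reports the error and removes the pieces it wrote.
csplit -s nums.txt 4 20 || echo "csplit failed, output removed"
ls

# With -k: the same error occurs, but xx00 (lines 1-3) and xx01 (lines 4-10) remain.
csplit -s -k nums.txt 4 20 || echo "csplit failed, output kept"
ls
```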
PATTERN DESCRIPTION
PATTERN can be:
INTEGER: Copy up to but not including the specified line number.
/REGEXP/: Copy to but not including a matching line.
%REGEXP%: Skip to but not including a matching line.
{INTEGER}: Repeat the previous pattern the specified number of times.
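The pattern forms listed above can be compared on a small sample. The sketch below assumes a hypothetical file `notes.txt` in which lines starting with `## ` mark section boundaries; the prefixes `sec_`, `skip_`, and `num_` are arbitrary and only keep the three runs apart.

```shell
workdir=$(mktemp -d) && cd "$workdir"
printf '%s\n' 'preamble' '## one' 'a' '## two' 'b' '## three' 'c' > notes.txt

# /REGEXP/ plus {2}: apply the pattern three times in total.
# sec_00 = preamble, sec_01 = "## one" section, sec_02 = "## two" section,
# sec_03 = the remainder, starting at "## three".
csplit -s -f sec_ notes.txt '/^## /' '{2}'

# %REGEXP%: discard everything before the match; skip_00 starts at "## two".
csplit -s -f skip_ notes.txt '%^## two%'

# INTEGER: num_00 = lines 1-2, num_01 = lines 3-7.
csplit -s -f num_ notes.txt 3

head sec_* skip_* num_*
```

In each case the matching line (or the numbered line) begins the next piece rather than ending the previous one.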