Overview
csplit splits an input file into multiple output files according to specified patterns or line numbers. Each output file contains a consecutive section of the original file, and the filenames are composed of a specified prefix and a numeric suffix.
Key Features
- Splitting based on regular expressions or line numbers
- Ability to specify output filename prefix and suffix format
- Facilitates extraction and management of specific sections in large files
Key Options
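Output File Control
- `-f PREFIX`: use PREFIX instead of the default `xx` for output filenames
- `-n DIGITS`: use DIGITS digits in the numeric suffix instead of the default 2
Splitting Behavior Control
- `--suppress-matched`: leave the lines that match a pattern out of the output files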
Usage Examples
Splitting a file by line numbers
seq 1 35 > test.txt
csplit test.txt 10 20 30
Splits test.txt at lines 10, 20, and 30, producing four files: xx00 (lines 1-9), xx01 (lines 10-19), xx02 (lines 20-29), and xx03 (line 30 to the end).
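To confirm where the boundaries fall, the pieces can be inspected after the split. The following is a minimal sketch, assuming a csplit that supports the `-s` option (it only silences the byte counts csplit normally prints); the scratch directory is an illustrative choice, not part of the original example.

```shell
# Work in a scratch directory so the generated xx* files are easy to remove.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

seq 1 35 > test.txt
csplit -s test.txt 10 20 30

# Each piece starts at the line number given on the command line.
for f in xx00 xx01 xx02 xx03; do
    echo "$f: first line $(head -n 1 "$f"), last line $(tail -n 1 "$f")"
done
```

Each line number names the first line of the next piece, which is why line 10 ends up in xx01 rather than xx00.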
Splitting a file by regular expression
echo -e "Line 1\nLine 2\nERROR: First error\nLine 4\nLine 5\nERROR: Second error\nLine 7" > log.txt
csplit -f part_ log.txt '/^ERROR:/' '{*}'
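Splits log.txt before each line starting with 'ERROR:' and names the output files with the prefix 'part_'. '{*}' repeats the preceding pattern as many times as possible, so the file is split at every match rather than only the first. This produces part_00 (Line 1, Line 2), part_01 (the first ERROR line through Line 5), and part_02 (the second ERROR line and Line 7).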
Splitting with specified prefix and digit count
echo -e "[Section 1]\nContent A\n[Section 2]\nContent B\n[Section 3]\nContent C" > data.log
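csplit -f my_file_ -n 3 data.log '/^\[Section [0-9][0-9]*\]/' '{*}'
Splits data.log before each '[Section N]' line and creates filenames with three-digit suffixes such as 'my_file_000' and 'my_file_001'. csplit patterns are basic regular expressions, so digits are matched with '[0-9]' rather than '\d'. Because the first line of data.log already matches the pattern, 'my_file_000' is empty and the sections start at 'my_file_001'.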
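When the very first line matches the pattern, the first output file comes out empty. The sketch below assumes GNU csplit, whose `-z` (`--elide-empty-files`) option skips empty pieces and whose `-b` (`--suffix-format`) option takes a printf-style suffix; the `section_` prefix and `.txt` extension are illustrative choices.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' '[Section 1]' 'Content A' '[Section 2]' 'Content B' \
    '[Section 3]' 'Content C' > data.log

# -z drops the empty piece that would precede the first match;
# -b '%02d.txt' turns the numeric suffix into 00.txt, 01.txt, ...
csplit -s -z -f section_ -b '%02d.txt' data.log \
    '/^\[Section [0-9][0-9]*\]/' '{*}'

head -n 1 section_*.txt
```

With the empty piece elided, each remaining file begins with its own section header.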
Splitting while excluding matching lines
echo -e "Line 1\nERROR: First error\nLine 3\nERROR: Second error" > log.txt
csplit --suppress-matched -f no_error_ log.txt '/^ERROR:/' '{*}'
Splits log.txt at each line starting with 'ERROR:' and leaves those matching lines out of the output files.
Tips & Precautions
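The csplit command is powerful, but its patterns need care: they are basic regular expressions, so Perl-style shorthand such as '\d' is not recognized, and a pattern that never matches causes the command to fail.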
Useful Tips
- When using regular expressions, enclose them in quotes to prevent shell interpretation.
- By default, the line that matches the pattern becomes the first line of the next file. You can exclude this line using the `--suppress-matched` option.
- '{*}' repeats the preceding pattern as many times as possible. Without it, the pattern is applied only once, and everything after the first match goes into the last output file; nothing is discarded.
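The effect of the repeat argument is easiest to see side by side. This is a small sketch assuming GNU csplit (the `{*}` form is a GNU extension); the `once_` and `all_` prefixes and the sample log lines are made up for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'Line 1' 'ERROR: first' 'Line 3' 'ERROR: second' 'Line 5' > log.txt

# Pattern applied once: two pieces, the second holds everything from the first match on.
csplit -s -f once_ log.txt '/^ERROR:/'

# Pattern repeated until the input runs out: one more piece per additional match.
csplit -s -f all_ log.txt '/^ERROR:/' '{*}'

ls once_* all_*
```

Both runs keep every input line; the only difference is how many pieces the lines are spread across.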
Precautions
- Original file is not modified: csplit creates new split files without altering the original file.
- Error handling: csplit exits with an error if a regular expression does not match or a line number is out of range. By default it also removes any output files it has already created; use `-k` (`--keep-files`) to keep them.
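The sketch below illustrates the error case, assuming GNU csplit behavior: a pattern that never matches makes the command fail, and the pieces written so far survive only when `-k` is given. The file name nums.txt and the `drop_` and `keep_` prefixes are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

seq 1 5 > nums.txt

# No line contains 'X', so csplit fails and removes the pieces it had written.
csplit -s -f drop_ nums.txt 3 '/X/' || echo "csplit failed (exit status $?)"

# Same command with -k: the pieces written before the error are kept.
csplit -s -k -f keep_ nums.txt 3 '/X/' || echo "csplit failed (exit status $?)"

ls
```

Checking the exit status, as done here with `||`, is a simple way to notice a failed split in a script.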