Cut (Unix)

Shell command for extracting sections of text files

From Wikipedia, the free encyclopedia

cut is a shell command that extracts sections from each line of input text, usually from a file. Line segments can typically be extracted by bytes (-b), by characters (-c), or by fields (-f) separated by a delimiter (-d; the tab character by default). In each case a range must be provided, consisting of one of N, N-M, N- (N to the end of the line), or -M (the beginning of the line to M), where N and M are counted from 1 (there is no zeroth value). Since version 6, specifying a zeroth value is an error; earlier versions ignored it and treated it as 1.
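
The four range forms can be illustrated with a minimal sketch. The sample line and the variable names (line, single, closed, tail_part, head_part) are made up for illustration and are not part of cut itself:

```shell
# Hypothetical sample input: one line with five colon-separated fields.
line='a:b:c:d:e'

# N: a single field
single=$(printf '%s\n' "$line" | cut -d ':' -f 2)      # b
# N-M: fields N through M
closed=$(printf '%s\n' "$line" | cut -d ':' -f 2-4)    # b:c:d
# N-: field N through the end of the line
tail_part=$(printf '%s\n' "$line" | cut -d ':' -f 4-)  # d:e
# -M: the beginning of the line through field M
head_part=$(printf '%s\n' "$line" | cut -d ':' -f -2)  # a:b

printf '%s\n' "$single" "$closed" "$tail_part" "$head_part"
```

The same range forms apply to -b and -c, where the positions count bytes or characters instead of fields.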

History

The original Bell Labs version was written by Gottfried W. R. Luderer.[1][2] The command is part of the X/Open Portability Guide since issue 2 of 1987. It was inherited into the first version of POSIX.1 and the Single Unix Specification.[3] It first appeared in AT&T System III UNIX in 1982.[4]

The command is commonly available on Unix and Unix-like operating systems. It is part of the BSD Base System. The version in GNU coreutils was written by David M. Ihnat, David MacKenzie, and Jim Meyering.[5] The command is available for Windows via UnxUtils.[6] The command was ported to the IBM i operating system.[7]

Use

The command line consists of options and an optional file path. If no path is specified, then standard input is used.
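
As a small sketch of the two input modes, the following reads the same tab-separated line once from a file path and twice from standard input. The temporary file and the variable names are assumptions made for this example, and mktemp is assumed to be available:

```shell
# Create a throwaway file holding one tab-separated line (illustrative data).
tmp=$(mktemp)
printf 'a\tb\tc\n' > "$tmp"

# Input named by a file path; no -d is given, so the delimiter is tab.
from_file=$(cut -f 2 "$tmp")

# No path: cut reads standard input, here via redirection and via a pipe.
from_redirect=$(cut -f 2 < "$tmp")
from_pipe=$(printf 'a\tb\tc\n' | cut -f 2)

rm -f "$tmp"
printf '%s\n' "$from_file" "$from_redirect" "$from_pipe"
```

All three invocations select the second field, b.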

Options include:

-b
Bytes; a list following -b specifies a range of bytes to return, e.g. cut -b1-66 returns the first 66 bytes of each line. Note: when used in conjunction with -n, no multi-byte characters are split. Note also that in some implementations -b only works on input lines of less than 1023 bytes.
-c
Characters; a list following -c specifies a range of characters to return, e.g. cut -c1-66 returns the first 66 characters of each line.
-f
Specifies a field list, separated by a delimiter
list
A comma-separated or blank-separated list of integer-denoted fields, in increasing order. A hyphen may be used to specify ranges of fields, e.g. 4-6 for fields 4 through 6, or 5- for field 5 to the end.
-n
Used in combination with -b; suppresses splitting of multi-byte characters.
-d
Delimiter; the character immediately following the -d option is the field delimiter used with the -f option; the default delimiter is tab. Space and other characters that have a special meaning to the shell in use must be quoted or escaped as necessary.
-s
Suppresses lines that contain no field delimiter when -f is specified. Without -s, such lines are passed through unchanged.
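
A short sketch of how -d, -f and a field list combine. The sample row and the variable names are hypothetical. The second call assumes the POSIX behaviour, shared by the GNU and BSD versions, that selected fields are written in input order regardless of the order of the list:

```shell
# Hypothetical input: four space-separated names.
row='alice bob carol dave'

# The space delimiter is quoted so that the shell passes it to cut intact.
# The list 1,3-4 combines a single field with a range.
picked=$(printf '%s\n' "$row" | cut -d ' ' -f 1,3-4)

# Writing the list in a different order does not reorder the output.
reordered=$(printf '%s\n' "$row" | cut -d ' ' -f 3-4,1)

printf '%s\n' "$picked" "$reordered"
```

Both commands print alice carol dave. Rearranging fields needs a different tool, such as awk.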

Examples

Given a file named foo with content:

foo:bar:baz:qux:quux
one:two:three:four:five:six:seven
alpha:beta:gamma:delta:epsilon:zeta:eta:theta:iota:kappa:lambda:mu
the quick brown fox jumps over the lazy dog

To output the fourth through tenth characters of each line:

$ cut -c 4-10 foo
:bar:ba
:two:th
ha:beta
 quick

To output everything from the fifth field through the end of each line, using the colon character as the field delimiter:

$ cut -d ":" -f 5- foo
quux
five:six:seven
epsilon:zeta:eta:theta:iota:kappa:lambda:mu
the quick brown fox jumps over the lazy dog

Because the colon is not found in the last line, the entire line is shown.

Option -d specifies a single-character delimiter (a colon in the example above) that serves as the field separator. Option -f specifies the range of fields included in the output (here, from field five to the end). Option -d is only meaningful together with option -f.

To output the third field of each line using space as the field delimiter:

$ cut -d " " -f 3 foo
foo:bar:baz:qux:quux
one:two:three:four:five:six:seven
alpha:beta:gamma:delta:epsilon:zeta:eta:theta:iota:kappa:lambda:mu
brown

Because the space character is not found in the first three lines, these lines are shown in full.
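
If the lines without a delimiter are not wanted, the -s option drops them. A minimal sketch, using two of the sample lines held in a shell variable rather than in the file foo (the variable names are illustrative only):

```shell
# Two of the sample lines: one without any space, one with spaces.
sample='foo:bar:baz:qux:quux
the quick brown fox jumps over the lazy dog'

# Without -s, the line that has no space is passed through unchanged.
without_s=$(printf '%s\n' "$sample" | cut -d ' ' -f 3)

# With -s, only lines that contain the delimiter produce output.
with_s=$(printf '%s\n' "$sample" | cut -s -d ' ' -f 3)

printf '%s\n' "$without_s"
printf '%s\n' "$with_s"
```

The first command prints both foo:bar:baz:qux:quux and brown; the second prints only brown.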

To split a string held in a shell variable into two words at a delimiter (here a period), passing the value to cut with a here-string:

$ line=process.processid
$ cut -d "." -f1 <<< "$line"
process
$ cut -d "." -f2 <<< "$line"
processid
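
The <<< here-string is not available in every shell (dash, for example, lacks it). A sketch of an equivalent that feeds cut through a pipe instead, with first and second as illustrative variable names:

```shell
line=process.processid

# printf writes the value followed by a newline to cut's standard input.
first=$(printf '%s\n' "$line" | cut -d '.' -f 1)
second=$(printf '%s\n' "$line" | cut -d '.' -f 2)

printf '%s\n' "$first" "$second"
```

This prints process and processid on separate lines, matching the here-string version.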

See also

  • awk – Text processing programming language
  • grep – Unix command line utility for text search
  • List of POSIX commands
  • paste (Unix) – Shell command for joining files horizontally
  • sed – Standard UNIX utility for editing streams of data
