<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
|
|
<html>
|
|
<head>
|
|
<meta name="generator" content="HTML Tidy, see www.w3.org">
|
|
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
|
|
<link type="text/css" rel="stylesheet" href="style.css"><!-- Generated by The Open Group's rhtm tool v1.2.1 -->
|
|
<!-- Copyright (c) 2001 The Open Group, All Rights Reserved -->
|
|
<title>cut</title>
|
|
</head>
|
|
<body bgcolor="white">
|
|
<script type="text/javascript" language="JavaScript" src="../jscript/codes.js">
|
|
</script>
|
|
|
|
<basefont size="3"> <a name="cut"></a> <a name="tag_04_31"></a><!-- cut -->
|
|
<!--header start-->
|
|
<center><font size="2">The Open Group Base Specifications Issue 6<br>
|
|
IEEE Std 1003.1-2001<br>
|
|
Copyright © 2001 The IEEE and The Open Group, All Rights reserved.</font></center>
|
|
|
|
<!--header end-->
|
|
<hr size="2" noshade>
|
|
<h4><a name="tag_04_31_01"></a>NAME</h4>
|
|
|
|
<blockquote>cut - cut out selected fields of each line of a file</blockquote>
|
|
|
|
<h4><a name="tag_04_31_02"></a>SYNOPSIS</h4>
|
|
|
|
<blockquote class="synopsis">
|
|
<p><code><tt>cut -b</tt> <i>list</i> <b>[</b><tt>-n</tt><b>] [</b><i>file</i> <tt>...</tt><b>]</b><tt><br>
|
|
<br>
|
|
cut -c</tt> <i>list</i> <b>[</b><i>file</i> <tt>...</tt><b>]</b><tt><br>
|
|
<br>
|
|
cut -f</tt> <i>list</i> <b>[</b><tt>-d</tt> <i>delim</i><b>][</b><tt>-s</tt><b>][</b><i>file</i> <tt>...</tt><b>]</b><tt><br>
|
|
</tt></code></p>
|
|
</blockquote>
|
|
|
|
<h4><a name="tag_04_31_03"></a>DESCRIPTION</h4>
|
|
|
|
<blockquote>
|
|
<p>The <i>cut</i> utility shall cut out bytes (<b>-b</b> option), characters (<b>-c</b> option), or character-delimited fields
(<b>-f</b> option) from each line in one or more files, concatenate them, and write them to standard output.</p>
|
|
</blockquote>
|
|
|
|
<h4><a name="tag_04_31_04"></a>OPTIONS</h4>
|
|
|
|
<blockquote>
|
|
<p>The <i>cut</i> utility shall conform to the Base Definitions volume of IEEE Std 1003.1-2001, <a href=
|
|
"../basedefs/xbd_chap12.html#tag_12_02">Section 12.2, Utility Syntax Guidelines</a>.</p>
|
|
|
|
<p>The application shall ensure that the option-argument <i>list</i> (see options <b>-b</b>, <b>-c</b>, and <b>-f</b> below) is a
comma-separated list or &lt;blank&gt;-separated list of positive numbers and ranges. Ranges can be in three forms. The first is two
positive numbers separated by a hyphen (<i>low</i>-<i>high</i>), which represents all fields from the first number to the second
number. The second is a positive number preceded by a hyphen (-<i>high</i>), which represents all fields from field number 1 to
that number. The third is a positive number followed by a hyphen (<i>low</i>-), which represents that number to the last field,
inclusive. The elements in <i>list</i> can be repeated, can overlap, and can be specified in any order, but the bytes, characters,
or fields selected shall be written in the order of the input data. If an element appears in the selection list more than once, it
shall be written exactly once.</p>
|
|
|
|
<p>The following options shall be supported:</p>
|
|
|
|
<dl compact>
|
|
<dt><b>-b </b> <i>list</i></dt>
|
|
|
|
<dd>Cut based on a <i>list</i> of bytes. Each selected byte shall be output unless the <b>-n</b> option is also specified. It shall
|
|
not be an error to select bytes not present in the input line.</dd>
|
|
|
|
<dt><b>-c </b> <i>list</i></dt>
|
|
|
|
<dd>Cut based on a <i>list</i> of characters. Each selected character shall be output. It shall not be an error to select
|
|
characters not present in the input line.</dd>
|
|
|
|
<dt><b>-d </b> <i>delim</i></dt>
|
|
|
|
<dd>Set the field delimiter to the character <i>delim</i>. The default is the &lt;tab&gt;.</dd>
|
|
|
|
<dt><b>-f </b> <i>list</i></dt>
|
|
|
|
<dd>Cut based on a <i>list</i> of fields, assumed to be separated in the file by a delimiter character (see <b>-d</b>). Each
|
|
selected field shall be output. Output fields shall be separated by a single occurrence of the field delimiter character. Lines
|
|
with no field delimiters shall be passed through intact, unless <b>-s</b> is specified. It shall not be an error to select fields
|
|
not present in the input line.</dd>
|
|
|
|
<dt><b>-n</b></dt>
|
|
|
|
<dd>Do not split characters. When specified with the <b>-b</b> option, each element in <i>list</i> of the form <i>low</i>-
|
|
<i>high</i> (hyphen-separated numbers) shall be modified as follows:
|
|
|
|
<ul>
|
|
<li>
|
|
<p>If the byte selected by <i>low</i> is not the first byte of a character, <i>low</i> shall be decremented to select the first
|
|
byte of the character originally selected by <i>low</i>. If the byte selected by <i>high</i> is not the last byte of a character,
|
|
<i>high</i> shall be decremented to select the last byte of the character prior to the character originally selected by
|
|
<i>high</i>, or zero if there is no prior character. If the resulting range element has <i>high</i> equal to zero or <i>low</i>
|
|
greater than <i>high</i>, the list element shall be dropped from <i>list</i> for that input line without causing an error.</p>
|
|
</li>
|
|
</ul>
|
|
|
|
<p>Each element in <i>list</i> of the form <i>low</i>- shall be treated as above with <i>high</i> set to the number of bytes in the
current line, not including the terminating &lt;newline&gt;. Each element in <i>list</i> of the form -<i>high</i> shall be treated
as above with <i>low</i> set to 1. Each element in <i>list</i> of the form <i>num</i> (a single number) shall be treated as above
with <i>low</i> set to <i>num</i> and <i>high</i> set to <i>num</i>.</p>
|
|
</dd>
|
|
|
|
<dt><b>-s</b></dt>
|
|
|
|
<dd>Suppress lines with no delimiter characters, when used with the <b>-f</b> option. Unless specified, lines with no delimiters
|
|
shall be passed through untouched.</dd>
|
|
</dl>
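<p>As an informal illustration of the <b>-d</b>, <b>-f</b>, and <b>-s</b> descriptions above (a sketch only, not part of the
normative text; the sample input is invented for the example), a line containing no delimiter is passed through intact unless
<b>-s</b> is also given:</p>

<pre>
<tt># Two input lines: the first contains ':' delimiters, the second does not.
printf 'a:b:c\nplain\n' | cut -d : -f 2
# writes "b", then "plain" (the undelimited line passes through)

printf 'a:b:c\nplain\n' | cut -d : -f 2 -s
# writes only "b" (the undelimited line is suppressed)
</tt>
</pre>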
|
|
</blockquote>
|
|
|
|
<h4><a name="tag_04_31_05"></a>OPERANDS</h4>
|
|
|
|
<blockquote>
|
|
<p>The following operand shall be supported:</p>
|
|
|
|
<dl compact>
|
|
<dt><i>file</i></dt>
|
|
|
|
<dd>A pathname of an input file. If no <i>file</i> operands are specified, or if a <i>file</i> operand is <tt>'-'</tt> , the
|
|
standard input shall be used.</dd>
|
|
</dl>
|
|
</blockquote>
|
|
|
|
<h4><a name="tag_04_31_06"></a>STDIN</h4>
|
|
|
|
<blockquote>
|
|
<p>The standard input shall be used only if no <i>file</i> operands are specified, or if a <i>file</i> operand is <tt>'-'</tt> .
|
|
See the INPUT FILES section.</p>
|
|
</blockquote>
|
|
|
|
<h4><a name="tag_04_31_07"></a>INPUT FILES</h4>
|
|
|
|
<blockquote>
|
|
<p>The input files shall be text files, except that line lengths shall be unlimited.</p>
|
|
</blockquote>
|
|
|
|
<h4><a name="tag_04_31_08"></a>ENVIRONMENT VARIABLES</h4>
|
|
|
|
<blockquote>
|
|
<p>The following environment variables shall affect the execution of <i>cut</i>:</p>
|
|
|
|
<dl compact>
|
|
<dt><i>LANG</i></dt>
|
|
|
|
<dd>Provide a default value for the internationalization variables that are unset or null. (See the Base Definitions volume of
|
|
IEEE Std 1003.1-2001, <a href="../basedefs/xbd_chap08.html#tag_08_02">Section 8.2, Internationalization Variables</a> for
|
|
the precedence of internationalization variables used to determine the values of locale categories.)</dd>
|
|
|
|
<dt><i>LC_ALL</i></dt>
|
|
|
|
<dd>If set to a non-empty string value, override the values of all the other internationalization variables.</dd>
|
|
|
|
<dt><i>LC_CTYPE</i></dt>
|
|
|
|
<dd>Determine the locale for the interpretation of sequences of bytes of text data as characters (for example, single-byte as
|
|
opposed to multi-byte characters in arguments and input files).</dd>
|
|
|
|
<dt><i>LC_MESSAGES</i></dt>
|
|
|
|
<dd>Determine the locale that should be used to affect the format and contents of diagnostic messages written to standard
|
|
error.</dd>
|
|
|
|
<dt><i>NLSPATH</i></dt>
|
|
|
|
<dd><sup>[<a href="javascript:open_code('XSI')">XSI</a>]</sup> <img src="../images/opt-start.gif" alt="[Option Start]" border="0">
|
|
Determine the location of message catalogs for the processing of <i>LC_MESSAGES .</i> <img src="../images/opt-end.gif" alt=
|
|
"[Option End]" border="0"></dd>
|
|
</dl>
|
|
</blockquote>
|
|
|
|
<h4><a name="tag_04_31_09"></a>ASYNCHRONOUS EVENTS</h4>
|
|
|
|
<blockquote>
|
|
<p>Default.</p>
|
|
</blockquote>
|
|
|
|
<h4><a name="tag_04_31_10"></a>STDOUT</h4>
|
|
|
|
<blockquote>
|
|
<p>The <i>cut</i> utility output shall be a concatenation of the selected bytes, characters, or fields (one of the following):</p>
|
|
|
|
<pre>
<tt>"%s\n", &lt;</tt><i>concatenation of bytes</i><tt>&gt;
<br>
"%s\n", &lt;</tt><i>concatenation of characters</i><tt>&gt;
<br>
"%s\n", &lt;</tt><i>concatenation of fields and field delimiters</i><tt>&gt;
</tt>
</pre>
|
|
</blockquote>
|
|
|
|
<h4><a name="tag_04_31_11"></a>STDERR</h4>
|
|
|
|
<blockquote>
|
|
<p>The standard error shall be used only for diagnostic messages.</p>
|
|
</blockquote>
|
|
|
|
<h4><a name="tag_04_31_12"></a>OUTPUT FILES</h4>
|
|
|
|
<blockquote>
|
|
<p>None.</p>
|
|
</blockquote>
|
|
|
|
<h4><a name="tag_04_31_13"></a>EXTENDED DESCRIPTION</h4>
|
|
|
|
<blockquote>
|
|
<p>None.</p>
|
|
</blockquote>
|
|
|
|
<h4><a name="tag_04_31_14"></a>EXIT STATUS</h4>
|
|
|
|
<blockquote>
|
|
<p>The following exit values shall be returned:</p>
|
|
|
|
<dl compact>
|
|
<dt> 0</dt>
|
|
|
|
<dd>All input files were output successfully.</dd>
|
|
|
|
<dt>&gt;0</dt>
|
|
|
|
<dd>An error occurred.</dd>
|
|
</dl>
|
|
</blockquote>
|
|
|
|
<h4><a name="tag_04_31_15"></a>CONSEQUENCES OF ERRORS</h4>
|
|
|
|
<blockquote>
|
|
<p>Default.</p>
|
|
</blockquote>
|
|
|
|
<hr>
|
|
<div class="box"><em>The following sections are informative.</em></div>
|
|
|
|
<h4><a name="tag_04_31_16"></a>APPLICATION USAGE</h4>
|
|
|
|
<blockquote>
|
|
<p>Earlier versions of the <i>cut</i> utility worked in an environment where bytes and characters were considered equivalent
(modulo &lt;backspace&gt; and &lt;tab&gt; processing in some implementations). In the extended world of multi-byte characters, the
new <b>-b</b> option has been added. The <b>-n</b> option (used with <b>-b</b>) allows it to be used to act on bytes rounded to
character boundaries. The algorithm specified for <b>-n</b> guarantees that:</p>
|
|
|
|
<pre>
<tt>cut -b 1-500 -n file &gt; file1
cut -b 501- -n file &gt; file2
</tt>
</pre>
|
|
|
|
<p>ends up with all the characters in <b>file</b> appearing exactly once in <b>file1</b> or <b>file2</b>. (There is, however, a
&lt;newline&gt; in both <b>file1</b> and <b>file2</b> for each &lt;newline&gt; in <b>file</b>.)</p>
|
|
</blockquote>
|
|
|
|
<h4><a name="tag_04_31_17"></a>EXAMPLES</h4>
|
|
|
|
<blockquote>
|
|
<p>Examples of the option qualifier list:</p>
|
|
|
|
<dl compact>
|
|
<dt>1,4,7</dt>
|
|
|
|
<dd>Select the first, fourth, and seventh bytes, characters, or fields and field delimiters.</dd>
|
|
|
|
<dt>1-3,8</dt>
|
|
|
|
<dd>Equivalent to 1,2,3,8.</dd>
|
|
|
|
<dt>-5,10</dt>
|
|
|
|
<dd>Equivalent to 1,2,3,4,5,10.</dd>
|
|
|
|
<dt>3-</dt>
|
|
|
|
<dd>Equivalent to third to last, inclusive.</dd>
|
|
</dl>
|
|
|
|
<p>The <i>low</i>- <i>high</i> forms are not always equivalent when used with <b>-b</b> and <b>-n</b> and multi-byte characters;
|
|
see the description of <b>-n</b>.</p>
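<p>The list forms shown above can be tried with <b>-c</b> on a ten-character line. This is only a sketch: the sample input is
invented, and it assumes every character is a single byte:</p>

<pre>
<tt>printf 'abcdefghij\n' | cut -c 1,4,7    # writes "adg"
printf 'abcdefghij\n' | cut -c 1-3,8    # writes "abch"
printf 'abcdefghij\n' | cut -c -5,10    # writes "abcdej"
printf 'abcdefghij\n' | cut -c 3-       # writes "cdefghij"
</tt>
</pre>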
|
|
|
|
<p>The following command:</p>
|
|
|
|
<pre>
<tt>cut -d : -f 1,6 /etc/passwd
</tt>
</pre>
|
|
|
|
<p>reads the System V password file (user database) and produces lines of the form:</p>
|
|
|
|
<pre>
<tt>&lt;</tt><i>user ID</i><tt>&gt;:&lt;</tt><i>home directory</i><tt>&gt;
</tt>
</pre>
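<p>To try the same selection without reading the real user database, a single made-up entry can be piped in. The user name and
paths in this sketch are hypothetical:</p>

<pre>
<tt># Fields 1 and 6 of a colon-separated, passwd-style line.
printf 'alice:x:1000:1000:Alice Example:/home/alice:/bin/sh\n' | cut -d : -f 1,6
# writes "alice:/home/alice"
</tt>
</pre>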
|
|
|
|
<p>Most utilities in this volume of IEEE Std 1003.1-2001 work on text files. The <i>cut</i> utility can be used to turn
|
|
files with arbitrary line lengths into a set of text files containing the same data. The <a href=
|
|
"../utilities/paste.html"><i>paste</i></a> utility can be used to create (or recreate) files with arbitrary line lengths. For
|
|
example, if <b>file</b> contains long lines:</p>
|
|
|
|
<pre>
<tt>cut -b 1-500 -n file &gt; file1
cut -b 501- -n file &gt; file2
</tt>
</pre>
|
|
|
|
<p>creates <b>file1</b> (a text file) with lines no longer than 500 bytes (plus the &lt;newline&gt;) and <b>file2</b> that contains
the remainder of the data from <b>file</b>. (Note that <b>file2</b> is not a text file if there are lines in <b>file</b> that are
longer than 500 + {LINE_MAX} bytes.) The original file can be recreated from <b>file1</b> and <b>file2</b> using the command:</p>
|
|
|
|
<pre>
<tt>paste -d "\0" file1 file2 &gt; file
</tt>
</pre>
|
|
</blockquote>
|
|
|
|
<h4><a name="tag_04_31_18"></a>RATIONALE</h4>
|
|
|
|
<blockquote>
|
|
<p>Some historical implementations do not count &lt;backspace&gt;s in determining character counts with the <b>-c</b> option. This
may be useful for using <i>cut</i> for processing <i>nroff</i> output. It was deliberately decided not to have the <b>-c</b> option
treat either &lt;backspace&gt;s or &lt;tab&gt;s in any special fashion. The <a href="../utilities/fold.html"><i>fold</i></a>
utility does treat these characters specially.</p>
|
|
|
|
<p>Unlike other utilities, some historical implementations of <i>cut</i> exit after not finding an input file, rather than
|
|
continuing to process the remaining <i>file</i> operands. This behavior is prohibited by this volume of
|
|
IEEE Std 1003.1-2001, where only the exit status is affected by this problem.</p>
|
|
|
|
<p>The behavior of <i>cut</i> when provided with either mutually-exclusive options or options that do not work logically together
|
|
has been deliberately left unspecified in favor of global wording in <a href="xcu_chap01.html#tag_01_11"><i>Utility Description
|
|
Defaults</i></a> .</p>
|
|
|
|
<p>The OPTIONS section was changed in response to IEEE PASC Interpretation 1003.2 #149. The change represents historical practice
|
|
on all known systems. The original standard was ambiguous on the nature of the output.</p>
|
|
|
|
<p>The <i>list</i> option-arguments are historically used to select the portions of the line to be written, but do not affect the
|
|
order of the data. For example:</p>
|
|
|
|
<pre>
<tt>echo abcdefghi | cut -c6,2,4-7,1
</tt>
</pre>
|
|
|
|
<p>yields <tt>"abdefg"</tt> .</p>
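<p>The same holds for fields. In the following sketch (sample input invented for illustration), neither reordering nor repeating
list elements changes what is written:</p>

<pre>
<tt>printf 'one:two:three\n' | cut -d : -f 3,1      # writes "one:three"
printf 'one:two:three\n' | cut -d : -f 1,1,3    # writes "one:three"
</tt>
</pre>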
|
|
|
|
<p>A proposal to enhance <i>cut</i> with the following option:</p>
|
|
|
|
<dl compact>
|
|
<dt><b>-o</b></dt>
|
|
|
|
<dd>Preserve the selected field order. When this option is specified, each byte, character, or field (or ranges of such) shall be
|
|
written in the order specified by the <i>list</i> option-argument, even if this requires multiple outputs of the same bytes,
|
|
characters, or fields.</dd>
|
|
</dl>
|
|
|
|
<p>was rejected because this type of enhancement is outside the scope of the IEEE P1003.2b draft standard.</p>
|
|
</blockquote>
|
|
|
|
<h4><a name="tag_04_31_19"></a>FUTURE DIRECTIONS</h4>
|
|
|
|
<blockquote>
|
|
<p>None.</p>
|
|
</blockquote>
|
|
|
|
<h4><a name="tag_04_31_20"></a>SEE ALSO</h4>
|
|
|
|
<blockquote>
|
|
<p><a href="grep.html"><i>grep</i></a> , <a href="paste.html"><i>paste</i></a> , <a href="xcu_chap02.html#tag_02_05"><i>Parameters
|
|
and Variables</i></a></p>
|
|
</blockquote>
|
|
|
|
<h4><a name="tag_04_31_21"></a>CHANGE HISTORY</h4>
|
|
|
|
<blockquote>
|
|
<p>First released in Issue 2.</p>
|
|
</blockquote>
|
|
|
|
<h4><a name="tag_04_31_22"></a>Issue 6</h4>
|
|
|
|
<blockquote>
|
|
<p>The OPTIONS section is changed to align with the IEEE P1003.2b draft standard.</p>
|
|
|
|
<p>The normative text is reworded to avoid use of the term "must" for application requirements.</p>
|
|
</blockquote>
|
|
|
|
<div class="box"><em>End of informative text.</em></div>
|
|
|
|
<hr>
|
|
<hr size="2" noshade>
|
|
<center><font size="2"><!--footer start-->
|
|
UNIX ® is a registered Trademark of The Open Group.<br>
|
|
POSIX ® is a registered Trademark of The IEEE.<br>
|
|
[ <a href="../mindex.html">Main Index</a> | <a href="../basedefs/contents.html">XBD</a> | <a href=
|
|
"../utilities/contents.html">XCU</a> | <a href="../functions/contents.html">XSH</a> | <a href="../xrat/contents.html">XRAT</a>
|
|
]</font></center>
|
|
|
|
<!--footer end-->
|
|
<hr size="2" noshade>
|
|
</body>
|
|
</html>
|
|
|