<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta name="generator" content="HTML Tidy, see www.w3.org">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link type="text/css" rel="stylesheet" href="style.css"><!-- Generated by The Open Group's rhtm tool v1.2.1 -->
<!-- Copyright (c) 2001 The Open Group, All Rights Reserved -->
<title>tr</title>
</head>
<body bgcolor="white">
<script type="text/javascript" language="JavaScript" src="../jscript/codes.js">
</script>

<basefont size="3"> <a name="tr"></a> <a name="tag_04_145"></a><!-- tr -->
<!--header start-->
<center><font size="2">The Open Group Base Specifications Issue 6<br>
IEEE Std 1003.1-2001<br>
Copyright &copy; 2001 The IEEE and The Open Group, All Rights reserved.</font></center>

<!--header end-->
<hr size="2" noshade>
<h4><a name="tag_04_145_01"></a>NAME</h4>
<blockquote>tr - translate characters</blockquote>

<h4><a name="tag_04_145_02"></a>SYNOPSIS</h4>

<blockquote class="synopsis">
<p><code><tt>tr</tt> <b>[</b><tt>-c | -C</tt><b>][</b><tt>-s</tt><b>]</b> <i>string1 string2</i><tt><br>
<br>
tr -s</tt> <b>[</b><tt>-c | -C</tt><b>]</b> <i>string1</i><tt><br>
<br>
tr -d</tt> <b>[</b><tt>-c | -C</tt><b>]</b> <i>string1</i><tt><br>
<br>
tr -ds</tt> <b>[</b><tt>-c | -C</tt><b>]</b> <i>string1 string2</i><tt><br>
</tt></code></p>
</blockquote>

<h4><a name="tag_04_145_03"></a>DESCRIPTION</h4>

<blockquote>
<p>The <i>tr</i> utility shall copy the standard input to the standard output with substitution or deletion of selected characters.
The options specified and the <i>string1</i> and <i>string2</i> operands shall control translations that occur while copying
characters and single-character collating elements.</p>
</blockquote>
<h4><a name="tag_04_145_04"></a>OPTIONS</h4>
<blockquote>
<p>The <i>tr</i> utility shall conform to the Base Definitions volume of IEEE Std 1003.1-2001, <a href=
"../basedefs/xbd_chap12.html#tag_12_02">Section 12.2, Utility Syntax Guidelines</a>.</p>

<p>The following options shall be supported:</p>

<dl compact>
<dt><b>-c</b></dt>

<dd>Complement the set of values specified by <i>string1</i>. See the EXTENDED DESCRIPTION section.</dd>

<dt><b>-C</b></dt>

<dd>Complement the set of characters specified by <i>string1</i>. See the EXTENDED DESCRIPTION section.</dd>

<dt><b>-d</b></dt>

<dd>Delete all occurrences of input characters that are specified by <i>string1</i>.</dd>

<dt><b>-s</b></dt>

<dd>Replace instances of repeated characters with a single character, as described in the EXTENDED DESCRIPTION section.</dd>
</dl>
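<p>As a rough illustration of the <b>-d</b> and <b>-s</b> options, the following sketch runs <i>tr</i> on made-up sample input. The sample strings and the variable names are assumptions chosen for the illustration, and <i>LC_ALL</i> is set to the POSIX locale so that the results do not depend on the caller's locale.</p>

```shell
# Hypothetical sample input; LC_ALL=C avoids locale-dependent behavior.
LC_ALL=C
export LC_ALL

# -d: delete every input character listed in string1 (here, the digits).
deleted=$(printf 'a1b22c333\n' | tr -d '0123456789')
printf '%s\n' "$deleted"     # prints: abc

# -s: squeeze each run of a listed character ('a' or space) to one occurrence.
squeezed=$(printf 'aaa  bbb\n' | tr -s 'a ')
printf '%s\n' "$squeezed"    # prints: a bbb
```

<p>Note that the run of <tt>'b'</tt> characters is left alone in the second command because <tt>'b'</tt> is not listed in <i>string1</i>.</p>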
</blockquote>
<h4><a name="tag_04_145_05"></a>OPERANDS</h4>
<blockquote>
<p>The following operands shall be supported:</p>

<dl compact>
<dt><i>string1</i>, <i>string2</i></dt>

<dd><br>
Translation control strings. Each string shall represent a set of characters to be converted into an array of characters used for
the translation. For a detailed description of how the strings are interpreted, see the EXTENDED DESCRIPTION section.</dd>
</dl>
</blockquote>

<h4><a name="tag_04_145_06"></a>STDIN</h4>

<blockquote>
<p>The standard input can be any type of file.</p>
</blockquote>

<h4><a name="tag_04_145_07"></a>INPUT FILES</h4>

<blockquote>
<p>None.</p>
</blockquote>
<h4><a name="tag_04_145_08"></a>ENVIRONMENT VARIABLES</h4>
<blockquote>
<p>The following environment variables shall affect the execution of <i>tr</i>:</p>

<dl compact>
<dt><i>LANG</i></dt>

<dd>Provide a default value for the internationalization variables that are unset or null. (See the Base Definitions volume of
IEEE Std 1003.1-2001, <a href="../basedefs/xbd_chap08.html#tag_08_02">Section 8.2, Internationalization Variables</a> for
the precedence of internationalization variables used to determine the values of locale categories.)</dd>

<dt><i>LC_ALL</i></dt>

<dd>If set to a non-empty string value, override the values of all the other internationalization variables.</dd>

<dt><i>LC_COLLATE</i></dt>

<dd><br>
Determine the locale for the behavior of range expressions and equivalence classes.</dd>

<dt><i>LC_CTYPE</i></dt>

<dd>Determine the locale for the interpretation of sequences of bytes of text data as characters (for example, single-byte as
opposed to multi-byte characters in arguments) and the behavior of character classes.</dd>

<dt><i>LC_MESSAGES</i></dt>

<dd>Determine the locale that should be used to affect the format and contents of diagnostic messages written to standard
error.</dd>

<dt><i>NLSPATH</i></dt>

<dd><sup>[<a href="javascript:open_code('XSI')">XSI</a>]</sup> <img src="../images/opt-start.gif" alt="[Option Start]" border="0">
Determine the location of message catalogs for the processing of <i>LC_MESSAGES</i>. <img src="../images/opt-end.gif" alt=
"[Option End]" border="0"></dd>
</dl>
</blockquote>

<h4><a name="tag_04_145_09"></a>ASYNCHRONOUS EVENTS</h4>

<blockquote>
<p>Default.</p>
</blockquote>

<h4><a name="tag_04_145_10"></a>STDOUT</h4>

<blockquote>
<p>The <i>tr</i> output shall be identical to the input, with the exception of the specified transformations.</p>
</blockquote>

<h4><a name="tag_04_145_11"></a>STDERR</h4>

<blockquote>
<p>The standard error shall be used only for diagnostic messages.</p>
</blockquote>

<h4><a name="tag_04_145_12"></a>OUTPUT FILES</h4>

<blockquote>
<p>None.</p>
</blockquote>
<h4><a name="tag_04_145_13"></a>EXTENDED DESCRIPTION</h4>
<blockquote>
<p>The operands <i>string1</i> and <i>string2</i> (if specified) define two arrays of characters. The constructs in the following
list can be used to specify characters or single-character collating elements. If any of the constructs result in multi-character
collating elements, <i>tr</i> shall exclude, without a diagnostic, those multi-character elements from the resulting array.</p>

<dl compact>
<dt><i>character</i></dt>

<dd>Any character not described by one of the conventions below shall represent itself.</dd>

<dt>\<i>octal</i></dt>

<dd>Octal sequences can be used to represent characters with specific coded values. An octal sequence shall consist of a backslash
followed by the longest sequence of one, two, or three-octal-digit characters (01234567). The sequence shall cause the value whose
encoding is represented by the one, two, or three-digit octal integer to be placed into the array. If the size of a byte on the
system is greater than nine bits, the valid escape sequence used to represent a byte is implementation-defined. Multi-byte
characters require multiple, concatenated escape sequences of this type, including the leading <tt>'\'</tt> for each byte.</dd>

<dt>\<i>character</i></dt>

<dd>The backslash-escape sequences in the Base Definitions volume of IEEE Std 1003.1-2001, Table 5-1, Escape Sequences
and Associated Actions (<tt>'\\'</tt>, <tt>'\a'</tt>, <tt>'\b'</tt>, <tt>'\f'</tt>, <tt>'\n'</tt>, <tt>'\r'</tt>,
<tt>'\t'</tt>, <tt>'\v'</tt>) shall be supported. The results of using any other character, other than an octal digit, following
the backslash are unspecified.</dd>

<dt><i>c</i>-<i>c</i></dt>

<dd>In the POSIX locale, this construct shall represent the range of collating elements between the range endpoints (as long as
neither endpoint is an octal sequence of the form \<i>octal</i>), inclusive, as defined by the collation sequence. The characters
or collating elements in the range shall be placed in the array in ascending collation sequence. If the second endpoint precedes
the starting endpoint in the collation sequence, it is unspecified whether the range of collating elements is empty, or this
construct is treated as invalid. In locales other than the POSIX locale, this construct has unspecified behavior.

<p>If either or both of the range endpoints are octal sequences of the form \<i>octal</i>, this shall represent the range of
specific coded values between the two range endpoints, inclusive.</p>
</dd>

<dt>[:<i>class</i>:]</dt>

<dd>Represents all characters belonging to the defined character class, as defined by the current setting of the <i>LC_CTYPE</i>
locale category. The following character class names shall be accepted when specified in <i>string1</i>:
<blockquote>
<table cellpadding="3">
<tr valign="top">
<td align="left">
<p class="tent"><b>alnum</b></p>
</td>
<td align="left">
<p class="tent"><b>blank</b></p>
</td>
<td align="left">
<p class="tent"><b>digit</b></p>
</td>
<td align="left">
<p class="tent"><b>lower</b></p>
</td>
<td align="left">
<p class="tent"><b>punct</b></p>
</td>
<td align="left">
<p class="tent"><b>upper</b></p>
</td>
</tr>

<tr valign="top">
<td align="left">
<p class="tent"><b>alpha</b></p>
</td>
<td align="left">
<p class="tent"><b>cntrl</b></p>
</td>
<td align="left">
<p class="tent"><b>graph</b></p>
</td>
<td align="left">
<p class="tent"><b>print</b></p>
</td>
<td align="left">
<p class="tent"><b>space</b></p>
</td>
<td align="left">
<p class="tent"><b>xdigit</b></p>
</td>
</tr>
</table>
</blockquote>
<p><sup>[<a href="javascript:open_code('XSI')">XSI</a>]</sup> <img src="../images/opt-start.gif" alt="[Option Start]" border="0">
In addition, character class expressions of the form [:<i>name</i>:] shall be recognized in those locales where the <i>name</i>
keyword has been given a <b>charclass</b> definition in the <i>LC_CTYPE</i> category. <img src="../images/opt-end.gif" alt=
"[Option End]" border="0"></p>

<p>When both the <b>-d</b> and <b>-s</b> options are specified, any of the character class names shall be accepted in
<i>string2</i>. Otherwise, only character class names <b>lower</b> or <b>upper</b> are valid in <i>string2</i> and then only if the
corresponding character class (<b>upper</b> and <b>lower</b>, respectively) is specified in the same relative position in
<i>string1</i>. Such a specification shall be interpreted as a request for case conversion. When [:<i>lower</i>:] appears in
<i>string1</i> and [:<i>upper</i>:] appears in <i>string2</i>, the arrays shall contain the characters from the <b>toupper</b>
mapping in the <i>LC_CTYPE</i> category of the current locale. When [:<i>upper</i>:] appears in <i>string1</i> and
[:<i>lower</i>:] appears in <i>string2</i>, the arrays shall contain the characters from the <b>tolower</b> mapping in the
<i>LC_CTYPE</i> category of the current locale. The first character from each mapping pair shall be in the array for <i>string1</i>
and the second character from each mapping pair shall be in the array for <i>string2</i> in the same relative position.</p>

<p>Except for case conversion, the characters specified by a character class expression shall be placed in the array in an
unspecified order.</p>

<p>If the name specified for <i>class</i> does not define a valid character class in the current locale, the behavior is
undefined.</p>
</dd>

<dt>[=<i>equiv</i>=]</dt>

<dd>Represents all characters or collating elements belonging to the same equivalence class as <i>equiv</i>, as defined by the
current setting of the <i>LC_COLLATE</i> locale category. An equivalence class expression shall be allowed only in <i>string1</i>,
or in <i>string2</i> when it is being used by the combined <b>-d</b> and <b>-s</b> options. The characters belonging to the
equivalence class shall be placed in the array in an unspecified order.</dd>

<dt>[<i>x</i>*<i>n</i>]</dt>

<dd>Represents <i>n</i> repeated occurrences of the character <i>x</i>. Because this expression is used to map multiple characters
to one, it is only valid when it occurs in <i>string2</i>. If <i>n</i> is omitted or is zero, it shall be interpreted as large
enough to extend the <i>string2</i>-based sequence to the length of the <i>string1</i>-based sequence. If <i>n</i> has a leading
zero, it shall be interpreted as an octal value. Otherwise, it shall be interpreted as a decimal value.</dd>
</dl>
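<p>The constructs in the list above can be tried out with a short sketch such as the following. The sample input strings and variable names are assumptions made for the illustration, and the POSIX locale is selected explicitly because range expressions are only specified there.</p>

```shell
# Hypothetical sample input throughout; ranges are only specified in the POSIX locale.
LC_ALL=C
export LC_ALL

# \character: the escape sequence '\t' names the tab character.
escaped=$(printf 'a\tb\n' | tr '\t' ' ')
printf '%s\n' "$escaped"      # prints: a b

# \octal: octal 011 is the coded value of the tab character in ASCII.
octal=$(printf 'a\tb\n' | tr '\011' '_')
printf '%s\n' "$octal"        # prints: a_b

# c-c: a range on each side, mapped position by position.
ranged=$(printf 'hello\n' | tr 'a-z' 'A-Z')
printf '%s\n' "$ranged"       # prints: HELLO

# [:class:]: delete every punctuation character.
classed=$(printf 'Hello, World!\n' | tr -d '[:punct:]')
printf '%s\n' "$classed"      # prints: Hello World

# [x*]: repeat '#' often enough to cover all of string1.
repeated=$(printf '2024-01-15\n' | tr '0-9' '[#*]')
printf '%s\n' "$repeated"     # prints: ####-##-##
```

<p>Each operand is quoted so that the shell passes the backslashes and brackets to <i>tr</i> unchanged.</p>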
<p>When the <b>-d</b> option is not specified:</p>

<ul>
<li>
<p>Each input character found in the array specified by <i>string1</i> shall be replaced by the character in the same relative
position in the array specified by <i>string2</i>. When the array specified by <i>string2</i> is shorter than the one specified by
<i>string1</i>, the results are unspecified.</p>
</li>

<li>
<p>If the <b>-C</b> option is specified, the complements of the characters specified by <i>string1</i> (the set of all characters
in the current character set, as defined by the current setting of <i>LC_CTYPE</i>, except for those actually specified in the
<i>string1</i> operand) shall be placed in the array in ascending collation sequence, as defined by the current setting of
<i>LC_COLLATE</i>.</p>
</li>

<li>
<p>If the <b>-c</b> option is specified, the complement of the values specified by <i>string1</i> shall be placed in the array in
ascending order by binary value.</p>
</li>

<li>
<p>Because the order in which characters specified by character class expressions or equivalence class expressions are placed in
the array is undefined, such expressions should only be used if the intent is to map several characters into one. An exception is
case conversion, as described previously.</p>
</li>
</ul>
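<p>A minimal sketch of translation without <b>-d</b> follows, assuming made-up input strings. The first command shows mapping by relative position; the second combines <b>-c</b> with the repetition construct so that every value outside <i>string1</i> maps to one character.</p>

```shell
# Hypothetical sample input; the POSIX locale keeps the range well defined.
LC_ALL=C
export LC_ALL

# Position-wise mapping: a->x, b->y, c->z.
mapped=$(printf 'cab\n' | tr 'abc' 'xyz')
printf '%s\n' "$mapped"        # prints: zxy

# -c: everything except lowercase letters and newline becomes '_'.
# '[_*]' pads string2 to the length of the complemented string1.
masked=$(printf 'a1-b2\n' | tr -c 'a-z\n' '[_*]')
printf '%s\n' "$masked"        # prints: a__b_
```

<p>The newline is listed in <i>string1</i> of the second command so that it is excluded from the complement and the line terminator survives.</p>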
<p>When the <b>-d</b> option is specified:</p>

<ul>
<li>
<p>Input characters found in the array specified by <i>string1</i> shall be deleted.</p>
</li>

<li>
<p>When the <b>-C</b> option is specified with <b>-d</b>, all characters except those specified by <i>string1</i> shall be deleted.
The contents of <i>string2</i> are ignored, unless the <b>-s</b> option is also specified.</p>
</li>

<li>
<p>When the <b>-c</b> option is specified with <b>-d</b>, all values except those specified by <i>string1</i> shall be deleted. The
contents of <i>string2</i> shall be ignored, unless the <b>-s</b> option is also specified.</p>
</li>

<li>
<p>The same string cannot be used for both the <b>-d</b> and the <b>-s</b> option; when both options are specified, both
<i>string1</i> (used for deletion) and <i>string2</i> (used for squeezing) shall be required.</p>
</li>
</ul>
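<p>The deletion rules above might be exercised as follows. This is only a sketch on assumed sample input; the variable names are illustrative.</p>

```shell
# Hypothetical sample input; the POSIX locale keeps the range well defined.
LC_ALL=C
export LC_ALL

# -c with -d: delete every value except digits and newline.
digits=$(printf 'tel: 555-0100\n' | tr -cd '0-9\n')
printf '%s\n' "$digits"        # prints: 5550100

# -d with -s: string1 ('b') is deleted, then string2 ('c') is squeezed.
both=$(printf 'aabbccdd\n' | tr -ds 'b' 'c')
printf '%s\n' "$both"          # prints: aacdd
```

<p>In the second command the runs of <tt>'a'</tt> and <tt>'d'</tt> are untouched, since neither operand names them.</p>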
<p>When the <b>-s</b> option is specified, after any deletions or translations have taken place, repeated sequences of the same
character shall be replaced by one occurrence of the same character, if the character is found in the array specified by the last
operand. If the last operand contains a character class, such as the following example:</p>

<pre>
<tt>tr -s '[:space:]'
</tt>
</pre>

<p>the last operand's array shall contain all of the characters in that character class. However, in a case conversion, as
described previously, such as:</p>

<pre>
<tt>tr -s '[:upper:]' '[:lower:]'
</tt>
</pre>

<p>the last operand's array shall contain only those characters defined as the second characters in each of the <b>toupper</b> or
<b>tolower</b> character pairs, as appropriate.</p>
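<p>The two cases above can be compared on sample input. The following is a sketch under assumed input strings: in the first command the squeeze set is the whole <b>space</b> class, while in the second it is only the lowercase letters produced by the case conversion.</p>

```shell
# Hypothetical sample input; the POSIX locale fixes the class membership.
LC_ALL=C
export LC_ALL

# Squeeze runs of any white-space character.
spaces=$(printf 'a  b\t\tc\n' | tr -s '[:space:]')
expected_spaces=$(printf 'a b\tc')

# Convert to lowercase, then squeeze only the resulting lowercase letters;
# the two spaces are not in the last operand's array and are kept.
folded=$(printf 'AABBcc  DD\n' | tr -s '[:upper:]' '[:lower:]')
printf '%s\n' "$folded"        # prints: abc  d
```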
<p>An empty string used for <i>string1</i> or <i>string2</i> produces undefined results.</p>
</blockquote>

<h4><a name="tag_04_145_14"></a>EXIT STATUS</h4>

<blockquote>
<p>The following exit values shall be returned:</p>

<dl compact>
<dt> 0</dt>

<dd>All input was processed successfully.</dd>

<dt>&gt;0</dt>

<dd>An error occurred.</dd>
</dl>
</blockquote>

<h4><a name="tag_04_145_15"></a>CONSEQUENCES OF ERRORS</h4>

<blockquote>
<p>Default.</p>
</blockquote>

<hr>
<div class="box"><em>The following sections are informative.</em></div>

<h4><a name="tag_04_145_16"></a>APPLICATION USAGE</h4>

<blockquote>
<p>If necessary, <i>string1</i> and <i>string2</i> can be quoted to avoid pattern matching by the shell.</p>

<p>If an ordinary digit (representing itself) is to follow an octal sequence, the octal sequence must use the full three digits to
avoid ambiguity.</p>
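<p>To make the ambiguity concrete, the sketch below (with assumed sample input) writes a tab followed by the digit <tt>'1'</tt> as <tt>'\0111'</tt>. The shorter spelling <tt>'\111'</tt> would instead be read as the single value octal 111, which is the letter <tt>'I'</tt> in ASCII.</p>

```shell
# Hypothetical sample input; ASCII coded values are assumed.
LC_ALL=C
export LC_ALL

# '\0111' is the three-digit sequence \011 (tab) followed by the digit 1.
full=$(printf 'a\t1\n' | tr '\0111' 'XY')
printf '%s\n' "$full"          # prints: aXY

# '\111' is one octal sequence: value 111 octal, the letter I.
short=$(printf 'I\n' | tr '\111' 'x')
printf '%s\n' "$short"         # prints: x
```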
<p>When <i>string2</i> is shorter than <i>string1</i>, a difference results between historical System V and BSD systems. A BSD
system pads <i>string2</i> with the last character found in <i>string2</i>. Thus, it is possible to do the following:</p>

<pre>
<tt>tr 0123456789 d
</tt>
</pre>

<p>which would translate all digits to the letter <tt>'d'</tt>. Since this area is specifically unspecified in this volume of
IEEE Std 1003.1-2001, both the BSD and System V behaviors are allowed, but a conforming application cannot rely on
the BSD behavior. It would have to code the example in the following way:</p>

<pre>
<tt>tr 0123456789 '[d*]'
</tt>
</pre>

<p>It should be noted that, despite similarities in appearance, the string operands used by <i>tr</i> are not regular
expressions.</p>

<p>Unlike some historical implementations, this definition of the <i>tr</i> utility correctly processes NUL characters in its input
stream. NUL characters can be stripped by using:</p>

<pre>
<tt>tr -d '\000'
</tt>
</pre>
</blockquote>
<h4><a name="tag_04_145_17"></a>EXAMPLES</h4>
<blockquote>
<ol>
<li>
<p>The following example creates a list of all words in <b>file1</b> one per line in <b>file2</b>, where a word is taken to be a
maximal string of letters.</p>

<pre>
<tt>tr -cs "[:alpha:]" "[\n*]" &lt;file1 &gt;file2
</tt>
</pre>
</li>

<li>
<p>The next example translates all lowercase characters in <b>file1</b> to uppercase and writes the results to standard output.</p>

<pre>
<tt>tr "[:lower:]" "[:upper:]" &lt;file1
</tt>
</pre>
</li>

<li>
<p>This example uses an equivalence class to identify accented variants of the base character <tt>'e'</tt> in <b>file1</b>, which
are stripped of diacritical marks and written to <b>file2</b>. The repetition construct is used for <i>string2</i> because the
results are unspecified when <i>string2</i> is shorter than <i>string1</i>.</p>

<pre>
<tt>tr "[=e=]" "[e*]" &lt;file1 &gt;file2
</tt>
</pre>
</li>
</ol>
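<p>The first two examples can be checked without creating any files by feeding sample text on standard input. The input strings below are assumptions made for this sketch.</p>

```shell
# Hypothetical sample input piped in place of file1.
LC_ALL=C
export LC_ALL

# Example 1: one word per line; every run of non-letters becomes one newline.
words=$(printf 'the cat, the hat\n' | tr -cs '[:alpha:]' '[\n*]')
printf '%s\n' "$words"
# prints:
# the
# cat
# the
# hat

# Example 2: lowercase to uppercase.
upper=$(printf 'Hello, tr\n' | tr '[:lower:]' '[:upper:]')
printf '%s\n' "$upper"         # prints: HELLO, TR
```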
</blockquote>

<h4><a name="tag_04_145_18"></a>RATIONALE</h4>

<blockquote>
<p>In some early proposals, an explicit option <b>-n</b> was added to disable the historical behavior of stripping NUL characters
from the input. It was considered that automatically stripping NUL characters from the input was not correct functionality.
However, the removal of <b>-n</b> in a later proposal does not remove the requirement that <i>tr</i> correctly process NUL
characters in its input stream. NUL characters can be stripped by using <i>tr</i> <b>-d</b> '\000'.</p>

<p>Historical implementations of <i>tr</i> differ widely in syntax and behavior. For example, the BSD version does not need the
bracket characters for the repetition sequence. The <i>tr</i> utility syntax is based more closely on the System V and XPG3 model
while attempting to accommodate historical BSD implementations. In the case of the short <i>string2</i> padding, the decision was
to unspecify the behavior and preserve System V and XPG3 scripts, which might find difficulty with the BSD method. The assumption
was made that BSD users of <i>tr</i> have to make accommodations to meet the syntax defined here. Since it is possible to use the
repetition sequence to duplicate the desired behavior, whereas there is no simple way to achieve the System V method, this was the
correct, if not desirable, approach.</p>

<p>The use of octal values to specify control characters, while having historical precedents, is not portable. The introduction of
escape sequences for control characters should provide the necessary portability. It is recognized that this may cause some
historical scripts to break.</p>

<p>An early proposal included support for multi-character collating elements. It was pointed out that, while <i>tr</i> does employ
some syntactical elements from REs, the aim of <i>tr</i> is quite different; ranges, for example, do not have a similar meaning
("any of the chars in the range matches", <i>versus</i> "translate each character in the range to the output counterpart"). As
a result, the previously included support for multi-character collating elements has been removed. What remains are ranges in
current collation order (to support, for example, accented characters), character classes, and equivalence classes.</p>
<p>In XPG3 the [:<i>class</i>:] and [=<i>equiv</i>=] conventions are shown with double brackets, as in RE syntax. However,
<i>tr</i> does not implement RE principles; it just borrows part of the syntax. Consequently, [:<i>class</i>:] and
[=<i>equiv</i>=] should be regarded as syntactical elements on a par with [<i>x</i>*<i>n</i>], which is not an RE bracket
expression.</p>

<p>The standard developers will consider changes to <i>tr</i> that allow it to translate characters between different character
encodings, or they will consider providing a new utility to accomplish this.</p>

<p>On historical System V systems, a range expression requires enclosing square brackets, such as:</p>

<pre>
<tt>tr '[a-z]' '[A-Z]'
</tt>
</pre>

<p>However, BSD-based systems did not require the brackets, and this convention is used here to avoid breaking large numbers of BSD
scripts:</p>

<pre>
<tt>tr a-z A-Z
</tt>
</pre>

<p>The preceding System V script will continue to work because the brackets, treated as regular characters, are translated to
themselves. However, System V scripts that relied on <tt>"a-z"</tt> representing the three characters <tt>'a'</tt> ,
<tt>'-'</tt> , and <tt>'z'</tt> have to be rewritten as <tt>"az-"</tt> .</p>
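<p>The three spellings discussed above can be compared directly. The following sketch uses assumed sample input and the POSIX locale; it is an illustration of the rules in this section rather than a record of any particular historical implementation.</p>

```shell
# Hypothetical sample input; ranges are only specified in the POSIX locale.
LC_ALL=C
export LC_ALL

# System V style: the brackets are ordinary characters that map to themselves.
bracketed=$(printf '[abc]\n' | tr '[a-z]' '[A-Z]')
printf '%s\n' "$bracketed"     # prints: [ABC]

# BSD style: the same range without brackets.
plain=$(printf 'abc\n' | tr a-z A-Z)
printf '%s\n' "$plain"         # prints: ABC

# "az-": a trailing hyphen is a literal, so a->X, z->Y, '-'->Z.
literal=$(printf 'a-z m\n' | tr 'az-' 'XYZ')
printf '%s\n' "$literal"       # prints: XZY m
```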
<p>The ISO POSIX-2:1993 standard had a <b>-c</b> option that behaved similarly to the <b>-C</b> option, but did not supply
functionality equivalent to the <b>-c</b> option specified in IEEE Std 1003.1-2001. This meant that the historical practice
of being able to specify <i>tr</i> <b>-d</b> \200-\377 (which would delete all bytes with the top bit set) would have no effect
because, in the C locale, bytes with the values octal 200 to octal 377 are not characters.</p>

<p>The earlier version also said that octal sequences referred to collating elements and could be placed adjacent to each other to
specify multi-byte characters. However, it was noted that this caused ambiguities because <i>tr</i> would not be able to tell
whether adjacent octal sequences were intending to specify multi-byte characters or multiple single-byte characters.
IEEE Std 1003.1-2001 specifies that octal sequences always refer to single-byte binary values.</p>
</blockquote>
<h4><a name="tag_04_145_19"></a>FUTURE DIRECTIONS</h4>
<blockquote>
<p>None.</p>
</blockquote>

<h4><a name="tag_04_145_20"></a>SEE ALSO</h4>

<blockquote>
<p><a href="sed.html"><i>sed</i></a></p>
</blockquote>

<h4><a name="tag_04_145_21"></a>CHANGE HISTORY</h4>

<blockquote>
<p>First released in Issue 2.</p>
</blockquote>

<h4><a name="tag_04_145_22"></a>Issue 6</h4>

<blockquote>
<p>The <b>-C</b> option is added, and the description of the <b>-c</b> option is changed to align with the IEEE P1003.2b
draft standard.</p>

<p>The normative text is reworded to avoid use of the term "must" for application requirements.</p>
</blockquote>

<div class="box"><em>End of informative text.</em></div>

<hr>
<hr size="2" noshade>
<center><font size="2"><!--footer start-->
UNIX &reg; is a registered Trademark of The Open Group.<br>
POSIX &reg; is a registered Trademark of The IEEE.<br>
[ <a href="../mindex.html">Main Index</a> | <a href="../basedefs/contents.html">XBD</a> | <a href=
"../utilities/contents.html">XCU</a> | <a href="../functions/contents.html">XSH</a> | <a href="../xrat/contents.html">XRAT</a>
]</font></center>

<!--footer end-->
<hr size="2" noshade>
</body>
</html>