<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta name="generator" content="HTML Tidy, see www.w3.org">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link type="text/css" rel="stylesheet" href="style.css"><!-- Generated by The Open Group's rhtm tool v1.2.1 -->
<!-- Copyright (c) 2001 The Open Group, All Rights Reserved -->
<title>join</title>
</head>
<body bgcolor="white">
<script type="text/javascript" language="JavaScript" src="../jscript/codes.js">
</script>

<basefont size="3"> <a name="join"></a> <a name="tag_04_71"></a><!-- join -->
<!--header start-->
<center><font size="2">The Open Group Base Specifications Issue 6<br>
IEEE Std 1003.1-2001<br>
Copyright &copy; 2001 The IEEE and The Open Group, All Rights reserved.</font></center>

<!--header end-->
<hr size="2" noshade>
<h4><a name="tag_04_71_01"></a>NAME</h4>

<blockquote>join - relational database operator</blockquote>

<h4><a name="tag_04_71_02"></a>SYNOPSIS</h4>

<blockquote class="synopsis">
<p><code><tt>join</tt> <b>[</b><tt>-a</tt> <i>file_number</i> <tt>| -v</tt> <i>file_number</i><b>][</b><tt>-e</tt>
<i>string</i><b>][</b><tt>-o</tt> <i>list</i><b>][</b><tt>-t</tt> <i>char</i><b>]<br>
</b> <tt> </tt> <b>[</b><tt>-1</tt> <i>field</i><b>][</b><tt>-2</tt> <i>field</i><b>]</b>
<i>file1 file2</i></code></p>
</blockquote>

<h4><a name="tag_04_71_03"></a>DESCRIPTION</h4>

<blockquote>
<p>The <i>join</i> utility shall perform an equality join on the files <i>file1</i> and <i>file2</i>. The joined files shall be
written to the standard output.</p>

<p>The join field is a field in each file on which the files are compared. The <i>join</i> utility shall write one line in the
output for each pair of lines in <i>file1</i> and <i>file2</i> that have identical join fields. The output line by default shall
consist of the join field, then the remaining fields from <i>file1</i>, then the remaining fields from <i>file2</i>. This format
can be changed by using the <b>-o</b> option (see below). The <b>-a</b> option can be used to add unmatched lines to the output.
The <b>-v</b> option can be used to output only unmatched lines.</p>

<p>The files <i>file1</i> and <i>file2</i> shall be ordered in the collating sequence of <a href=
"../utilities/sort.html"><i>sort</i></a> <b>-b</b> on the fields on which they shall be joined, by default the first in each line.
All selected output shall be written in the same collating sequence.</p>

<p>The default input field separators shall be &lt;blank&gt;s. In this case, multiple separators shall count as one field
separator, and leading separators shall be ignored. The default output field separator shall be a &lt;space&gt;.</p>

<p>The field separator and collating sequence can be changed by using the <b>-t</b> option (see below).</p>

<p>If the same key appears more than once in either file, all combinations of the set of remaining fields in <i>file1</i> and the
set of remaining fields in <i>file2</i> are output in the order of the lines encountered.</p>

<p>If the input files are not in the appropriate collating sequence, the results are unspecified.</p>
</blockquote>

<h4><a name="tag_04_71_04"></a>OPTIONS</h4>

<blockquote>
<p>The <i>join</i> utility shall conform to the Base Definitions volume of IEEE Std 1003.1-2001, <a href=
"../basedefs/xbd_chap12.html#tag_12_02">Section 12.2, Utility Syntax Guidelines</a>.</p>

<p>The following options shall be supported:</p>

<dl compact>
<dt><b>-a </b> <i>file_number</i></dt>

<dd><br>
Produce a line for each unpairable line in file <i>file_number</i>, where <i>file_number</i> is 1 or 2, in addition to the default
output. If both <b>-a</b>1 and <b>-a</b>2 are specified, all unpairable lines shall be output.</dd>

<dt><b>-e </b> <i>string</i></dt>

<dd>Replace empty output fields in the list selected by <b>-o</b> with the string <i>string</i>.</dd>

<dt><b>-o </b> <i>list</i></dt>

<dd>Construct the output line to comprise the fields specified in <i>list</i>, each element of which shall have one of the
following two forms:

<ol>
<li>
<p><i>file_number.field</i>, where <i>file_number</i> is a file number and <i>field</i> is a decimal integer field number</p>
</li>

<li>
<p>0 (zero), representing the join field</p>
</li>
</ol>

<p>The elements of <i>list</i> shall be either comma-separated or &lt;blank&gt;-separated, as specified in Guideline 8 of the Base
Definitions volume of IEEE Std 1003.1-2001, <a href="../basedefs/xbd_chap12.html#tag_12_02">Section 12.2, Utility Syntax
Guidelines</a>. The fields specified by <i>list</i> shall be written for all selected output lines. Fields selected by <i>list</i>
that do not appear in the input shall be treated as empty output fields. (See the <b>-e</b> option.) Only specifically requested
fields shall be written. The application shall ensure that <i>list</i> is a single command line argument.</p>
</dd>

<dt><b>-t </b> <i>char</i></dt>

<dd>Use character <i>char</i> as a separator, for both input and output. Every appearance of <i>char</i> in a line shall be
significant. When this option is specified, the collating sequence shall be the same as <a href=
"../utilities/sort.html"><i>sort</i></a> without the <b>-b</b> option.</dd>

<dt><b>-v </b> <i>file_number</i></dt>

<dd><br>
Instead of the default output, produce a line only for each unpairable line in <i>file_number</i>, where <i>file_number</i> is 1 or
2. If both <b>-v</b>1 and <b>-v</b>2 are specified, all unpairable lines shall be output.</dd>

<dt><b>-1 </b> <i>field</i></dt>

<dd>Join on the <i>field</i>th field of file 1. Fields are decimal integers starting with 1.</dd>

<dt><b>-2 </b> <i>field</i></dt>

<dd>Join on the <i>field</i>th field of file 2. Fields are decimal integers starting with 1.</dd>
</dl>
</blockquote>

<h4><a name="tag_04_71_05"></a>OPERANDS</h4>

<blockquote>
<p>The following operands shall be supported:</p>

<dl compact>
<dt><i>file1</i>, <i>file2</i></dt>

<dd>A pathname of a file to be joined. If either of the <i>file1</i> or <i>file2</i> operands is <tt>'-'</tt> , the standard input
shall be used in its place.</dd>
</dl>
</blockquote>

<h4><a name="tag_04_71_06"></a>STDIN</h4>

<blockquote>
<p>The standard input shall be used only if the <i>file1</i> or <i>file2</i> operand is <tt>'-'</tt> . See the INPUT FILES
section.</p>
</blockquote>

<h4><a name="tag_04_71_07"></a>INPUT FILES</h4>

<blockquote>
<p>The input files shall be text files.</p>
</blockquote>

<h4><a name="tag_04_71_08"></a>ENVIRONMENT VARIABLES</h4>

<blockquote>
<p>The following environment variables shall affect the execution of <i>join</i>:</p>

<dl compact>
<dt><i>LANG</i></dt>

<dd>Provide a default value for the internationalization variables that are unset or null. (See the Base Definitions volume of
IEEE Std 1003.1-2001, <a href="../basedefs/xbd_chap08.html#tag_08_02">Section 8.2, Internationalization Variables</a> for
the precedence of internationalization variables used to determine the values of locale categories.)</dd>

<dt><i>LC_ALL</i></dt>

<dd>If set to a non-empty string value, override the values of all the other internationalization variables.</dd>

<dt><i>LC_COLLATE</i></dt>

<dd><br>
Determine the locale of the collating sequence <i>join</i> expects to have been used when the input files were sorted.</dd>

<dt><i>LC_CTYPE</i></dt>

<dd>Determine the locale for the interpretation of sequences of bytes of text data as characters (for example, single-byte as
opposed to multi-byte characters in arguments and input files).</dd>

<dt><i>LC_MESSAGES</i></dt>

<dd>Determine the locale that should be used to affect the format and contents of diagnostic messages written to standard
error.</dd>

<dt><i>NLSPATH</i></dt>

<dd><sup>[<a href="javascript:open_code('XSI')">XSI</a>]</sup> <img src="../images/opt-start.gif" alt="[Option Start]" border="0">
Determine the location of message catalogs for the processing of <i>LC_MESSAGES</i>. <img src="../images/opt-end.gif" alt=
"[Option End]" border="0"></dd>
</dl>
</blockquote>

<h4><a name="tag_04_71_09"></a>ASYNCHRONOUS EVENTS</h4>

<blockquote>
<p>Default.</p>
</blockquote>

<h4><a name="tag_04_71_10"></a>STDOUT</h4>

<blockquote>
<p>The <i>join</i> utility output shall be a concatenation of selected character fields. When the <b>-o</b> option is not
specified, the output shall be:</p>

<pre>
<tt>"%s%s%s\n", &lt;</tt><i>join field</i><tt>&gt;, &lt;</tt><i>other file1 fields</i><tt>&gt;,
    &lt;</tt><i>other file2 fields</i><tt>&gt;
</tt>
</pre>

<p>If the join field is not the first field in a file, the &lt;<i>other file fields</i>&gt; for that file shall be:</p>

<pre>
<tt>&lt;</tt><i>fields preceding join field</i><tt>&gt;, &lt;</tt><i>fields following join field</i><tt>&gt;
</tt>
</pre>

<p>When the <b>-o</b> option is specified, the output format shall be:</p>

<pre>
<tt>"%s\n", &lt;</tt><i>concatenation of fields</i><tt>&gt;
</tt>
</pre>

<p>where the concatenation of fields is described by the <b>-o</b> option, above.</p>

<p>For either format, each field (except the last) shall be written with its trailing separator character. If the separator is the
default (&lt;blank&gt;s), a single &lt;space&gt; shall be written after each field (except the last).</p>
</blockquote>

<h4><a name="tag_04_71_11"></a>STDERR</h4>

<blockquote>
<p>The standard error shall be used only for diagnostic messages.</p>
</blockquote>

<h4><a name="tag_04_71_12"></a>OUTPUT FILES</h4>

<blockquote>
<p>None.</p>
</blockquote>

<h4><a name="tag_04_71_13"></a>EXTENDED DESCRIPTION</h4>

<blockquote>
<p>None.</p>
</blockquote>

<h4><a name="tag_04_71_14"></a>EXIT STATUS</h4>

<blockquote>
<p>The following exit values shall be returned:</p>

<dl compact>
<dt> 0</dt>

<dd>All input files were output successfully.</dd>

<dt>&gt;0</dt>

<dd>An error occurred.</dd>
</dl>
</blockquote>

<h4><a name="tag_04_71_15"></a>CONSEQUENCES OF ERRORS</h4>

<blockquote>
<p>Default.</p>
</blockquote>

<hr>
<div class="box"><em>The following sections are informative.</em></div>

<h4><a name="tag_04_71_16"></a>APPLICATION USAGE</h4>

<blockquote>
<p>Pathnames consisting of numeric digits or of the form <i>string.string</i> should not be specified directly following the
<b>-o</b> list.</p>
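
<p>Because the results are unspecified when the input files are not in the expected collating sequence, applications commonly
sort both files on the join field first. The following is an informal sketch only: the file names and sample data are hypothetical,
and the POSIX locale is an assumption chosen so that <i>sort</i> and <i>join</i> agree on the collating sequence.</p>

<pre>
<tt># Sort each input on field 1, ignoring leading blanks (the default ordering join expects).
printf '%s\n' 'b 2' 'a 1' 'c 3' | LC_ALL=C sort -b -k 1,1 &gt; file1
printf '%s\n' 'c z' 'a x' | LC_ALL=C sort -b -k 1,1 &gt; file2
# Join in the same locale that was used for sorting.
LC_ALL=C join file1 file2
</tt>
</pre>

<p>With this data, the lines for keys <b>a</b> and <b>c</b> are paired and the unmatched key <b>b</b> is omitted.</p>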
</blockquote>

<h4><a name="tag_04_71_17"></a>EXAMPLES</h4>

<blockquote>
<p>The <b>-o</b> 0 field essentially selects the union of the join fields. For example, given file <b>phone</b>:</p>

<pre>
<tt>!Name           Phone Number
Don             +1 123-456-7890
Hal             +1 234-567-8901
Yasushi         +2 345-678-9012
</tt>
</pre>

<p>and file <b>fax</b>:</p>

<pre>
<tt>!Name           Fax Number
Don             +1 123-456-7899
Keith           +1 456-789-0122
Yasushi         +2 345-678-9011
</tt>
</pre>

<p>(where the large expanses of white space are meant to each represent a single &lt;tab&gt;), the command:</p>

<pre>
<tt>join -t "&lt;tab&gt;" -a 1 -a 2 -e '(unknown)' -o 0,1.2,2.2 phone fax
</tt>
</pre>

<p>would produce:</p>

<pre>
<tt>!Name           Phone Number            Fax Number
Don             +1 123-456-7890         +1 123-456-7899
Hal             +1 234-567-8901         (unknown)
Keith           (unknown)               +1 456-789-0122
Yasushi         +2 345-678-9012         +2 345-678-9011
</tt>
</pre>
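
<p>The &lt;tab&gt; in the command above stands for a literal tab character and cannot be typed as shown. One possible way to run the
example from a shell is sketched below. The variable name <tt>tab</tt> is an assumption made for illustration, and the POSIX locale
is selected so that the header line beginning with <tt>'!'</tt> collates before the names.</p>

<pre>
<tt># Hold a literal tab character for use as the field separator.
tab=$(printf '\t')
# Build the two tab-separated input files; printf reuses the format for each pair of arguments.
printf '%s\t%s\n' '!Name' 'Phone Number' 'Don' '+1 123-456-7890' \
    'Hal' '+1 234-567-8901' 'Yasushi' '+2 345-678-9012' &gt; phone
printf '%s\t%s\n' '!Name' 'Fax Number' 'Don' '+1 123-456-7899' \
    'Keith' '+1 456-789-0122' 'Yasushi' '+2 345-678-9011' &gt; fax
# Full outer join: unpairable lines from both files, with empty fields replaced.
LC_ALL=C join -t "$tab" -a 1 -a 2 -e '(unknown)' -o 0,1.2,2.2 phone fax
</tt>
</pre>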

<p>Multiple instances of the same key will produce combinatorial results. The following:</p>

<pre>
<tt>fa:
    a x
    a y
    a z
fb:
    a p
</tt>
</pre>

<p>will produce:</p>

<pre>
<tt>a x p
a y p
a z p
</tt>
</pre>

<p>And the following:</p>

<pre>
<tt>fa:
    a b c
    a d e
fb:
    a w x
    a y z
    a o p
</tt>
</pre>

<p>will produce:</p>

<pre>
<tt>a b c w x
a b c y z
a b c o p
a d e w x
a d e y z
a d e o p
</tt>
</pre>
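
<p>The join field need not be the first field of either file. As a further sketch, using hypothetical files <b>users</b> and
<b>perms</b> invented for illustration, the <b>-1</b> and <b>-2</b> options select the join field in each file, and <b>-o</b>
restricts the output to the requested fields:</p>

<pre>
<tt># users: id name; perms: tag id permission (sorted on its second field).
printf '%s\n' '101 alice' '102 bob' &gt; users
printf '%s\n' 'x 101 read' 'y 102 write' &gt; perms
# Default format: join field, other users fields, other perms fields.
join -1 1 -2 2 users perms
# Only the name from users and the permission from perms.
join -1 1 -2 2 -o 1.2,2.3 users perms
</tt>
</pre>

<p>The first command writes <tt>101 alice x read</tt> and <tt>102 bob y write</tt>; the second writes <tt>alice read</tt> and
<tt>bob write</tt>.</p>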
</blockquote>

<h4><a name="tag_04_71_18"></a>RATIONALE</h4>

<blockquote>
<p>The <b>-e</b> option is only effective when used with <b>-o</b> because, unless specific fields are identified using <b>-o</b>,
<i>join</i> is not aware of what fields might be empty. The exception to this is the join field, but identifying an empty join
field with the <b>-e</b> string is not historical practice and some scripts might break if this were changed.</p>

<p>The 0 field in the <b>-o</b> list was adopted from the Tenth Edition version of <i>join</i> to satisfy international objections
that the <i>join</i> in the base documents does not support the "full join" or "outer join" described in relational database
literature. Although it has been possible to include a join field in the output (by default, or by field number using <b>-o</b>),
the join field could not be included for an unpaired line selected by <b>-a</b>. The <b>-o</b> 0 field essentially selects the
union of the join fields.</p>

<p>This sort of outer join was not possible with the <i>join</i> commands in the base documents. The <b>-o</b> 0 field was chosen
because it is an upwards-compatible change for applications. An alternative was considered: have the join field represent the union
of the fields in the files (where they are identical for matched lines, and one or both are null for unmatched lines). This was not
adopted because it would break some historical applications.</p>

<p>The ability to specify <i>file2</i> as <b>-</b> is not historical practice; it was added for completeness.</p>

<p>The <b>-v</b> option is not historical practice, but was considered necessary because it permitted the writing of <i>only</i>
those lines that do not match on the join field, as opposed to the <b>-a</b> option, which prints both lines that do and do not
match. This additional facility is parallel with the <b>-v</b> option of <a href="../utilities/grep.html"><i>grep</i></a>.</p>
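
<p>The difference between <b>-v</b> and <b>-a</b> can be illustrated informally as follows; the files <b>fa</b> and <b>fb</b> and
their contents are hypothetical sample data:</p>

<pre>
<tt>printf '%s\n' 'a 1' 'b 2' 'c 3' &gt; fa
printf '%s\n' 'a x' 'c z' 'd w' &gt; fb
# Only the unpairable line of fa: "b 2".
join -v 1 fa fb
# Only the unpairable line of fb: "d w".
join -v 2 fa fb
# Paired lines plus the unpairable line of fa: "a 1 x", "b 2", "c 3 z".
join -a 1 fa fb
</tt>
</pre>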

<p>Some historical implementations have been encountered where a blank line in one of the input files was considered to be the end
of the file; the description in this volume of IEEE Std 1003.1-2001 does not cite this as an allowable case.</p>
</blockquote>

<h4><a name="tag_04_71_19"></a>FUTURE DIRECTIONS</h4>

<blockquote>
<p>None.</p>
</blockquote>

<h4><a name="tag_04_71_20"></a>SEE ALSO</h4>

<blockquote>
<p><a href="awk.html"><i>awk</i></a> , <a href="comm.html"><i>comm</i></a> , <a href="sort.html"><i>sort</i></a> , <a href=
"uniq.html"><i>uniq</i></a></p>
</blockquote>

<h4><a name="tag_04_71_21"></a>CHANGE HISTORY</h4>

<blockquote>
<p>First released in Issue 2.</p>
</blockquote>

<h4><a name="tag_04_71_22"></a>Issue 6</h4>

<blockquote>
<p>The obsolescent <b>-j</b> options and the multi-argument <b>-o</b> option are withdrawn in this issue.</p>

<p>The normative text is reworded to avoid use of the term "must" for application requirements.</p>
</blockquote>

<div class="box"><em>End of informative text.</em></div>

<hr>
<hr size="2" noshade>
<center><font size="2"><!--footer start-->
UNIX &reg; is a registered Trademark of The Open Group.<br>
POSIX &reg; is a registered Trademark of The IEEE.<br>
[ <a href="../mindex.html">Main Index</a> | <a href="../basedefs/contents.html">XBD</a> | <a href=
"../utilities/contents.html">XCU</a> | <a href="../functions/contents.html">XSH</a> | <a href="../xrat/contents.html">XRAT</a>
]</font></center>

<!--footer end-->
<hr size="2" noshade>
</body>
</html>