Files
oldlinux-files/Ref-docs/POSIX/susv3/xrat/xbd_chap04.html
2024-02-19 00:21:47 -05:00

498 lines
26 KiB
HTML

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta name="generator" content="HTML Tidy, see www.w3.org">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link type="text/css" rel="stylesheet" href="style.css"><!-- Generated by The Open Group's rhtm tool v1.2.1 -->
<!-- Copyright (c) 2001 The Open Group, All Rights Reserved -->
<title>Rationale</title>
</head>
<body>
<basefont size="3">
<center><font size="2">The Open Group Base Specifications Issue 6<br>
IEEE Std 1003.1-2001<br>
Copyright &copy; 2001 The IEEE and The Open Group</font></center>
<hr size="2" noshade>
<h3><a name="tag_01_04"></a>General Concepts</h3>
<h4><a name="tag_01_04_01"></a>Concurrent Execution</h4>
<p>There is no additional rationale provided for this section.</p>
<h4><a name="tag_01_04_02"></a>Directory Protection</h4>
<p>There is no additional rationale provided for this section.</p>
<h4><a name="tag_01_04_03"></a>Extended Security Controls</h4>
<p>Allowing an implementation to define extended security controls enables the use of IEEE&nbsp;Std&nbsp;1003.1-2001 in
environments that require different or more rigorous security than that provided in POSIX.1. Extensions are allowed in two areas:
privilege and file access permissions. The semantics of these areas have been defined to permit extensions with reasonable, but not
exact, compatibility with all existing practices. For example, the elimination of the superuser definition precludes identifying a
process as privileged or not by virtue of its effective user ID.</p>
<h4><a name="tag_01_04_04"></a>File Access Permissions</h4>
<p>A process should not try to anticipate the result of an attempt to access data by <i>a priori</i> use of these rules. Rather, it
should make the attempt to access data and examine the return value (and possibly <i>errno</i> as well), or use <a href=
"../functions/access.html"><i>access</i>()</a>. An implementation may include other security mechanisms in addition to those
specified in POSIX.1, and an access attempt may fail because of those additional mechanisms, even though it would succeed according
to the rules given in this section. (For example, the user's security level might be lower than that of the object of the access
attempt.) The supplementary group IDs provide another reason for a process to not attempt to anticipate the result of an access
attempt.</p>
<h4><a name="tag_01_04_05"></a>File Hierarchy</h4>
<p>Though the file hierarchy is commonly regarded to be a tree, POSIX.1 does not define it as such for three reasons:</p>
<ol>
<li>
<p>Links may join branches.</p>
</li>
<li>
<p>In some network implementations, there may be no single absolute root directory; see <i>pathname resolution</i>.</p>
</li>
<li>
<p>With symbolic links, the file system need not be a tree or even a directed acyclic graph.</p>
</li>
</ol>
<h4><a name="tag_01_04_06"></a>Filenames</h4>
<p>Historically, certain filenames have been reserved. This list includes <b>core</b>, <b>/etc/passwd</b>, and so on. Conforming
applications should avoid these.</p>
<p>Most historical implementations prohibit case folding in filenames; that is, treating uppercase and lowercase alphabetic
characters as identical. However, some consider case folding desirable:</p>
<ul>
<li>
<p>For user convenience</p>
</li>
<li>
<p>For ease-of-implementation of the POSIX.1 interface as a hosted system on some popular operating systems</p>
</li>
</ul>
<p>Variants, such as maintaining case distinctions in filenames, but ignoring them in comparisons, have been suggested. Methods of
allowing escaped characters of the case opposite the default have been proposed.</p>
<p>Many reasons have been expressed for not allowing case folding, including:</p>
<ul>
<li>
<p>No solid evidence has been produced as to whether case-sensitivity or case-insensitivity is more convenient for users.</p>
</li>
<li>
<p>Making case-insensitivity a POSIX.1 implementation option would be worse than either having it or not having it, because:</p>
<ul>
<li>
<p>More confusion would be caused among users.</p>
</li>
<li>
<p>Application developers would have to account for both cases in their code.</p>
</li>
<li>
<p>POSIX.1 implementors would still have other problems with native file systems, such as short or otherwise constrained filenames
or pathnames, and the lack of hierarchical directory structure.</p>
</li>
</ul>
</li>
<li>
<p>Case folding is not easily defined in many European languages, both because many of them use characters outside the US ASCII
alphabetic set, and because:</p>
<ul>
<li>
<p>In Spanish, the digraph <tt>"ll"</tt> is considered to be a single letter, the capitalized form of which may be either
<tt>"Ll"</tt> or <tt>"LL"</tt> , depending on context.</p>
</li>
<li>
<p>In French, the capitalized form of a letter with an accent may or may not retain the accent, depending on the country in which
it is written.</p>
</li>
<li>
<p>In German, the sharp ess may be represented as a single character resembling a Greek beta (&szlig;) in lowercase, but as the
digraph <tt>"SS"</tt> in uppercase.</p>
</li>
<li>
<p>In Greek, there are several lowercase forms of some letters; the one to use depends on its position in the word. Arabic has
similar rules.</p>
</li>
</ul>
</li>
<li>
<p>Many East Asian languages, including Japanese, Chinese, and Korean, do not distinguish case and are sometimes encoded in
character sets that use more than one byte per character.</p>
</li>
<li>
<p>Multiple character codes may be used on the same machine simultaneously. There are several ISO character sets for European
alphabets. In Japan, several Japanese character codes are commonly used together, sometimes even in filenames; this is evidently
also the case in China. To handle case insensitivity, the kernel would have to at least be able to distinguish for which character
sets the concept made sense.</p>
</li>
<li>
<p>The file system implementation historically deals only with bytes, not with characters, except for slash and the null byte.</p>
</li>
<li>
<p>The purpose of POSIX.1 is to standardize the common, existing definition, not to change it. Mandating case-insensitivity would
make all historical implementations non-standard.</p>
</li>
<li>
<p>Not only the interface, but also application programs would need to change, counter to the purpose of having minimal changes to
existing application code.</p>
</li>
<li>
<p>At least one of the original developers of the UNIX system has expressed objection in the strongest terms to either requiring
case-insensitivity or making it an option, mostly on the basis that POSIX.1 should not hinder portability of application programs
across related implementations in order to allow compatibility with unrelated operating systems.</p>
</li>
</ul>
<p>Two proposals were entertained regarding case folding in filenames:</p>
<ol>
<li>
<p>Remove all wording that previously permitted case folding.</p>
<dl compact>
<dt>Rationale</dt>
<dd>Case folding is inconsistent with portable filename character set definition and filename definition (all characters except
slash and null). No known implementations allowing all characters except slash and null also do case folding.</dd>
</dl>
</li>
<li>
<p>Change &quot;though this practice is not recommended:&quot; to &quot;although this practice is strongly discouraged.&quot;</p>
<dl compact>
<dt>Rationale</dt>
<dd>If case folding must be included in POSIX.1, the wording should be stronger to discourage the practice.</dd>
</dl>
</li>
</ol>
<p>The consensus selected the first proposal. Otherwise, a conforming application would have to assume that case folding would
occur when it was not wanted, but that it would not occur when it was wanted.</p>
<h4><a name="tag_01_04_07"></a>File Times Update</h4>
<p>This section reflects the actions of historical implementations. The times are not updated immediately, but are only marked for
update by the functions. An implementation may update these times immediately.</p>
<p>The accuracy of the time update values is intentionally left unspecified so that systems can control the bandwidth of a possible
covert channel.</p>
<p>The wording was carefully chosen to make it clear that there is no requirement that the conformance document contain information
that might incidentally affect file update times. Any function that performs pathname resolution might update several
<i>st_atime</i> fields. Functions such as <a href="../functions/getpwnam.html"><i>getpwnam</i>()</a> and <a href=
"../functions/getgrnam.html"><i>getgrnam</i>()</a> might update the <i>st_atime</i> field of some specific file or files. It is
intended that these are not required to be documented in the conformance document, but they should appear in the system
documentation.</p>
<h4><a name="tag_01_04_08"></a>Host and Network Byte Order</h4>
<p>There is no additional rationale provided for this section.</p>
<h4><a name="tag_01_04_09"></a>Measurement of Execution Time</h4>
<p>The methods used to measure the execution time of processes and threads, and the precision of these measurements, may vary
considerably depending on the software architecture of the implementation, and on the underlying hardware. Implementations can also
make tradeoffs between the scheduling overhead and the precision of the execution time measurements. IEEE&nbsp;Std&nbsp;1003.1-2001
does not impose any requirement on the accuracy of the execution time; it instead specifies that the measurement mechanism and its
precision are implementation-defined.</p>
<h4><a name="tag_01_04_10"></a>Memory Synchronization</h4>
<p>In older multi-processors, access to memory by the processors was strictly multiplexed. This meant that a processor executing
program code interrogates or modifies memory in the order specified by the code and that all the memory operation of all the
processors in the system appear to happen in some global order, though the operation histories of different processors are
interleaved arbitrarily. The memory operations of such machines are said to be sequentially consistent. In this environment,
threads can synchronize using ordinary memory operations. For example, a producer thread and a consumer thread can synchronize
access to a circular data buffer as follows:</p>
<blockquote>
<pre>
<tt>int rdptr = 0;
int wrptr = 0;
data_t buf[BUFSIZE];
<br>
Thread 1:
while (work_to_do) {
int next;
<br>
buf[wrptr] = produce();
next = (wrptr + 1) % BUFSIZE;
while (rdptr == next)
;
wrptr = next;
}
<br>
Thread 2:
while (work_to_do) {
while (rdptr == wrptr)
;
consume(buf[rdptr]);
rdptr = (rdptr + 1) % BUFSIZE;
}
</tt>
</pre>
</blockquote>
<p>In modern multi-processors, these conditions are relaxed to achieve greater performance. If one processor stores values in
location A and then location B, then other processors loading data from location B and then location A may see the new value of B
but the old value of A. The memory operations of such machines are said to be weakly ordered. On these machines, the circular
buffer technique shown in the example will fail because the consumer may see the new value of <i>wrptr</i> but the old value of the
data in the buffer. In such machines, synchronization can only be achieved through the use of special instructions that enforce an
order on memory operations. Most high-level language compilers only generate ordinary memory operations to take advantage of the
increased performance. They usually cannot determine when memory operation order is important and generate the special ordering
instructions. Instead, they rely on the programmer to use synchronization primitives correctly to ensure that modifications to a
location in memory are ordered with respect to modifications and/or access to the same location in other threads. Access to
read-only data need not be synchronized. The resulting program is said to be data race-free.</p>
<p>Synchronization is still important even when accessing a single primitive variable (for example, an integer). On machines where
the integer may not be aligned to the bus data width or be larger than the data width, a single memory load may require multiple
memory cycles. This means that it may be possible for some parts of the integer to have an old value while other parts have a newer
value. On some processor architectures this cannot happen, but portable programs cannot rely on this.</p>
<p>In summary, a portable multi-threaded program, or a multi-process program that shares writable memory between processes, has to
use the synchronization primitives to synchronize data access. It cannot rely on modifications to memory being observed by other
threads in the order written in the application or even on modification of a single variable being seen atomically.</p>
<p>Conforming applications may only use the functions listed to synchronize threads of control with respect to memory access. There
are many other candidates for functions that might also be used. Examples are: signal sending and reception, or pipe writing and
reading. In general, any function that allows one thread of control to wait for an action caused by another thread of control is a
candidate. IEEE&nbsp;Std&nbsp;1003.1-2001 does not require these additional functions to synchronize memory access since this would
imply the following:</p>
<ul>
<li>
<p>All these functions would have to be recognized by advanced compilation systems so that memory operations and calls to these
functions are not reordered by optimization.</p>
</li>
<li>
<p>All these functions would potentially have to have memory synchronization instructions added, depending on the particular
machine.</p>
</li>
<li>
<p>The additional functions complicate the model of how memory is synchronized and make automatic data race detection techniques
impractical.</p>
</li>
</ul>
<p>Formal definitions of the memory model were rejected as unreadable by the vast majority of programmers. In addition, most of the
formal work in the literature has concentrated on the memory as provided by the hardware as opposed to the application programmer
through the compiler and runtime system. It was believed that a simple statement intuitive to most programmers would be most
effective. IEEE&nbsp;Std&nbsp;1003.1-2001 defines functions that can be used to synchronize access to memory, but it leaves open
exactly how one relates those functions to the semantics of each function as specified elsewhere in IEEE&nbsp;Std&nbsp;1003.1-2001.
IEEE&nbsp;Std&nbsp;1003.1-2001 also does not make a formal specification of the partial ordering in time that the functions can
impose, as that is implied in the description of the semantics of each function. It simply states that the programmer has to ensure
that modifications do not occur &quot;simultaneously&quot; with other access to a memory location.</p>
<h4><a name="tag_01_04_11"></a>Pathname Resolution</h4>
<p>It is necessary to differentiate between the definition of pathname and the concept of pathname resolution with respect to the
handling of trailing slashes. By specifying the behavior here, it is not possible to provide an implementation that is conforming
but extends all interfaces that handle pathnames to also handle strings that are not legal pathnames (because they have trailing
slashes).</p>
<p>Pathnames that end with one or more trailing slash characters must refer to directory paths. Previous versions of
IEEE&nbsp;Std&nbsp;1003.1-2001 were not specific about the distinction between trailing slashes on files and directories, and both
were permitted.</p>
<p>Two types of implementation have been prevalent; those that ignored trailing slash characters on all pathnames regardless, and
those that permitted them only on existing directories.</p>
<p>IEEE&nbsp;Std&nbsp;1003.1-2001 requires that a pathname with a trailing slash character be treated as if it had a trailing
<tt>"/."</tt> everywhere.</p>
<p>Note that this change does not break any conforming applications; since there were two different types of implementation, no
application could have portably depended on either behavior. This change does however require some implementations to be altered to
remain compliant. Substantial discussion over a three-year period has shown that the benefits to application developers outweighs
the disadvantages for some vendors.</p>
<p>On a historical note, some early applications automatically appended a <tt>'/'</tt> to every path. Rather than fix the
applications, the system implementation was modified to accept this behavior by ignoring any trailing slash.</p>
<p>Each directory has exactly one parent directory which is represented by the name <b>dot-dot</b> in the first directory. No other
directory, regardless of linkages established by symbolic links, is considered the parent directory by
IEEE&nbsp;Std&nbsp;1003.1-2001.</p>
<p>There are two general categories of interfaces involving pathname resolution: those that follow the symbolic link, and those
that do not. There are several exceptions to this rule; for example, <i>open</i>( <i>path</i>,O_CREAT|O_EXCL) will fail when
<i>path</i> names a symbolic link. However, in all other situations, the <a href="../functions/open.html"><i>open</i>()</a>
function will follow the link.</p>
<p>What the filename <b>dot-dot</b> refers to relative to the root directory is implementation-defined. In Version&nbsp;7 it refers
to the root directory itself; this is the behavior mentioned in IEEE&nbsp;Std&nbsp;1003.1-2001. In some networked systems the
construction <b>/../hostname/</b> is used to refer to the root directory of another host, and POSIX.1 permits this behavior.</p>
<p>Other networked systems use the construct <b>//hostname</b> for the same purpose; that is, a double initial slash is used. There
is a potential problem with existing applications that create full pathnames by taking a trunk and a relative pathname and making
them into a single string separated by <tt>'/'</tt> , because they can accidentally create networked pathnames when the trunk is
<tt>'/'</tt> . This practice is not prohibited because such applications can be made to conform by simply changing to use
<tt>"//"</tt> as a separator instead of <tt>'/'</tt> :</p>
<ul>
<li>
<p>If the trunk is <tt>'/'</tt> , the full pathname will begin with <tt>"///"</tt> (the initial <tt>'/'</tt> and the separator
<tt>"//"</tt> ). This is the same as <tt>'/'</tt> , which is what is desired. (This is the general case of making a relative
pathname into an absolute one by prefixing with <tt>"///"</tt> instead of <tt>'/'</tt> .)</p>
</li>
<li>
<p>If the trunk is <tt>"/A"</tt> , the result is <tt>"/A//..."</tt> ; since non-leading sequences of two or more slashes are
treated as a single slash, this is equivalent to the desired <tt>"/A/..."</tt> .</p>
</li>
<li>
<p>If the trunk is <tt>"//A"</tt> , the implementation-defined semantics will apply. (The multiple slash rule would apply.)</p>
</li>
</ul>
<p>Application developers should avoid generating pathnames that start with <tt>"//"</tt> . Implementations are strongly encouraged
to avoid using this special interpretation since a number of applications currently do not follow this practice and may
inadvertently generate <tt>"//..."</tt> .</p>
<p>The term &quot;root directory&quot; is only defined in POSIX.1 relative to the process. In some implementations, there may be no
absolute root directory. The initialization of the root directory of a process is implementation-defined.</p>
<h4><a name="tag_01_04_12"></a>Process ID Reuse</h4>
<p>There is no additional rationale provided for this section.</p>
<h4><a name="tag_01_04_13"></a>Scheduling Policy</h4>
<p>There is no additional rationale provided for this section.</p>
<h4><a name="tag_01_04_14"></a>Seconds Since the Epoch</h4>
<p>Coordinated Universal Time (UTC) includes leap seconds. However, in POSIX time (seconds since the Epoch), leap seconds are
ignored (not applied) to provide an easy and compatible method of computing time differences. Broken-down POSIX time is therefore
not necessarily UTC, despite its appearance.</p>
<p>As of September 2000, 24 leap seconds had been added to UTC since the Epoch, 1 January, 1970. Historically, one leap second is
added every 15 months on average, so this offset can be expected to grow steadily with time.</p>
<p>Most systems' notion of &quot;time&quot; is that of a continuously increasing value, so this value should increase even during leap
seconds. However, not only do most systems not keep track of leap seconds, but most systems are probably not synchronized to any
standard time reference. Therefore, it is inappropriate to require that a time represented as seconds since the Epoch precisely
represent the number of seconds between the referenced time and the Epoch.</p>
<p>It is sufficient to require that applications be allowed to treat this time as if it represented the number of seconds between
the referenced time and the Epoch. It is the responsibility of the vendor of the system, and the administrator of the system, to
ensure that this value represents the number of seconds between the referenced time and the Epoch as closely as necessary for the
application being run on that system.</p>
<p>It is important that the interpretation of time names and seconds since the Epoch values be consistent across conforming
systems; that is, it is important that all conforming systems interpret &quot;536457599 seconds since the Epoch&quot; as 59 seconds, 59
minutes, 23 hours 31 December 1986, regardless of the accuracy of the system's idea of the current time. The expression is given to
ensure a consistent interpretation, not to attempt to specify the calendar. The relationship between <i>tm_yday</i> and the day of
week, day of month, and month is in accordance with the Gregorian calendar, and so is not specified in POSIX.1.</p>
<p>Consistent interpretation of seconds since the Epoch can be critical to certain types of distributed applications that rely on
such timestamps to synchronize events. The accrual of leap seconds in a time standard is not predictable. The number of leap
seconds since the Epoch will likely increase. POSIX.1 is more concerned about the synchronization of time between applications of
astronomically short duration.</p>
<p>Note that <i>tm_yday</i> is zero-based, not one-based, so the day number in the example above is 364. Note also that the
division is an integer division (discarding remainder) as in the C language.</p>
<p>Note also that the meaning of <a href="../functions/gmtime.html"><i>gmtime</i>()</a>, <a href=
"../functions/localtime.html"><i>localtime</i>()</a>, and <a href="../functions/mktime.html"><i>mktime</i>()</a> is specified in
terms of this expression. However, the ISO&nbsp;C standard computes <i>tm_yday</i> from <i>tm_mday</i>, <i>tm_mon</i>, and
<i>tm_year</i> in <a href="../functions/mktime.html"><i>mktime</i>()</a>. Because it is stated as a (bidirectional) relationship,
not a function, and because the conversion between month-day-year and day-of-year dates is presumed well known and is also a
relationship, this is not a problem.</p>
<p>Implementations that implement <b>time_t</b> as a signed 32-bit integer will overflow in 2038. The data size for <b>time_t</b>
is as per the ISO&nbsp;C standard definition, which is implementation-defined.</p>
<p>See also <a href="xbd_chap03.html#tag_01_03_00_15"><i>Epoch</i></a> .</p>
<p>The topic of whether seconds since the Epoch should account for leap seconds has been debated on a number of occasions, and each
time consensus was reached (with acknowledged dissent each time) that the majority of users are best served by treating all days
identically. (That is, the majority of applications were judged to assume a single length-as measured in seconds since the
Epoch-for all days. Thus, leap seconds are not applied to seconds since the Epoch.) Those applications which do care about leap
seconds can determine how to handle them in whatever way those applications feel is best. This was particularly emphasized because
there was disagreement about what the best way of handling leap seconds might be. It is a practical impossibility to mandate that a
conforming implementation must have a fixed relationship to any particular official clock (consider isolated systems, or systems
performing &quot;reruns&quot; by setting the clock to some arbitrary time).</p>
<p>Note that as a practical consequence of this, the length of a second as measured by some external standard is not specified.
This unspecified second is nominally equal to an International System (SI) second in duration. Applications must be matched to a
system that provides the particular handling of external time in the way required by the application.</p>
<h4><a name="tag_01_04_15"></a>Semaphore</h4>
<p>There is no additional rationale provided for this section.</p>
<h4><a name="tag_01_04_16"></a>Thread-Safety</h4>
<p>Where the interface of a function required by IEEE&nbsp;Std&nbsp;1003.1-2001 precludes thread-safety, an alternate thread-safe
form is provided. The names of these thread-safe forms are the same as the non-thread-safe forms with the addition of the suffix
&quot;_r&quot;. The suffix &quot;_r&quot; is historical, where the <tt>'r'</tt> stood for &quot;reentrant&quot;.</p>
<p>In some cases, thread-safety is provided by restricting the arguments to an existing function.</p>
<p>See also <a href="xsh_chap02.html#tag_03_02_09_08"><i>Thread-Safety</i></a> .</p>
<h4><a name="tag_01_04_17"></a>Tracing</h4>
<p>Refer to <a href="xsh_chap02.html#tag_03_02_11"><i>Tracing</i></a> .</p>
<h4><a name="tag_01_04_18"></a>Treatment of Error Conditions for Mathematical Functions</h4>
<p>There is no additional rationale provided for this section.</p>
<h4><a name="tag_01_04_19"></a>Treatment of NaN Arguments for Mathematical Functions</h4>
<p>There is no additional rationale provided for this section.</p>
<h4><a name="tag_01_04_20"></a>Utility</h4>
<p>There is no additional rationale provided for this section.</p>
<h4><a name="tag_01_04_21"></a>Variable Assignment</h4>
<p>There is no additional rationale provided for this section.</p>
<hr size="2" noshade>
<center><font size="2"><!--footer start-->
UNIX &reg; is a registered Trademark of The Open Group.<br>
POSIX &reg; is a registered Trademark of The IEEE.<br>
[ <a href="../mindex.html">Main Index</a> | <a href="../basedefs/contents.html">XBD</a> | <a href=
"../utilities/contents.html">XCU</a> | <a href="../functions/contents.html">XSH</a> | <a href="../xrat/contents.html">XRAT</a>
]</font></center>
<!--footer end-->
<hr size="2" noshade>
</body>
</html>