<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta name="generator" content="HTML Tidy, see www.w3.org">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link type="text/css" rel="stylesheet" href="style.css"><!-- Generated by The Open Group's rhtm tool v1.2.1 -->
<!-- Copyright (c) 2001 The Open Group, All Rights Reserved -->
<title>Rationale</title>
</head>
<body>
<basefont size="3">
<center><font size="2">The Open Group Base Specifications Issue 6<br>
IEEE Std 1003.1-2001<br>
Copyright &copy; 2001 The IEEE and The Open Group</font></center>
<hr size="2" noshade>
<h3><a name="tag_02_03"></a>Batch Environment Services and Utilities</h3>
<h5><a name="tag_02_03_00_01"></a>Scope of the Batch Environment Option</h5>
<p>This section summarizes the deliberations of the IEEE P1003.15 (Batch Environment) working group in the development of the Batch
Environment option, which covers a set of services and utilities defining a batch processing system.</p>
<p>This informative section contains historical information concerning the contents of the amendment and describes why features
were included or discarded by the working group.</p>
<h5><a name="tag_02_03_00_02"></a>History of Batch Systems</h5>
<p>The supercomputing technical committee began as a "Birds Of a Feather" (BOF) at the January 1987 Usenix meeting. There was
enough general interest to form a supercomputing attachment to the /usr/group working groups. Several subgroups rapidly formed. Of
those subgroups, the batch group was the most ambitious. The early meetings were spent evaluating user needs and existing
batch implementations.</p>
<p>To evaluate user needs, individuals from the supercomputing community presented their requirements. Common requests were
flexibility, interoperability, control of resources, and ease of use. Backward compatibility was not an issue. The working group
then evaluated the following existing systems:</p>
<ul>
<li>
<p>PROD</p>
</li>
<li>
<p>Convex Distributed Batch</p>
</li>
<li>
<p>NQS</p>
</li>
<li>
<p>CTSS</p>
</li>
<li>
<p>MDQS from Ballistics Research Laboratory (BRL)</p>
</li>
</ul>
<p>Finally, NQS was chosen as a model, not only because it satisfied the most user requirements, but also because it was in the
public domain, already implemented on a variety of hardware platforms, and network-based.</p>
<h5><a name="tag_02_03_00_03"></a>Historical Implementations of Batch Systems</h5>
<p>Deferred processing of work under the control of a scheduler has been a feature of most proprietary operating systems from the
earliest days of multi-user systems in order to maximize utilization of the computer.</p>
<p>The arrival of UNIX systems posed a dilemma for many hardware providers and users because UNIX did not include the
sophisticated batch facilities offered by the proprietary systems. This omission was rectified in 1986 by NASA Ames Research Center,
which developed the Network Queuing System (NQS) as a portable UNIX application that allowed the routing and processing of batch
"jobs" in a network. To encourage its usage, the product was later put into the public domain. It was promptly picked up by UNIX
hardware providers, and ported and developed for their respective hardware and UNIX implementations.</p>
<p>Many major vendors, who traditionally offer a batch-dominated environment, ported the public-domain product to their systems,
customized it to support the capabilities of their systems, and added many customer-requested features.</p>
<p>Due to the strong hardware provider and customer acceptance of NQS, it was decided in 1987 to use NQS as the basis for the POSIX
Batch Environment amendment. Other batch systems considered at the time included CTSS, MDQS (a forerunner of NQS from the
Ballistics Research Laboratory), and PROD (a Los Alamos Labs development). None were thought to have both the functionality and
acceptability of NQS.</p>
<h5><a name="tag_02_03_00_04"></a>NQS Differences from the at utility</h5>
<p>The base standard <a href="../utilities/at.html"><i>at</i></a> and <a href="../utilities/batch.html"><i>batch</i></a> utilities
are not sufficient to meet the batch processing needs of a supercomputing environment; additional functionality is required in the
areas of resource management, job scheduling, system management, and control of output.</p>
<h5><a name="tag_02_03_00_05"></a>Batch Environment Option Definitions</h5>
<p>The concept of a batch job is closely related to a session with a session leader. The main difference is that a batch job does
not have a controlling terminal. There was much debate over whether to use the term "request" or "job". "Job" was the final
choice because of the historical use of this term in the batch environment.</p>
<p>The current definition for job identifiers is not sufficient under the model of destinations. The current definition is:</p>
<blockquote>
<pre>
<tt>sequence_number.originating_host
</tt>
</pre>
</blockquote>
<p>Using the model of destination, a host may include multiple batch nodes, the location of which is identified uniquely by a name
or directory service. If the current definition is used, batch nodes running on the same host would have to coordinate their use of
sequence numbers, as sequence numbers are assigned by the originating host. The alternative is to use the originating batch node
name instead of the originating host name.</p>
<p>The reasons for wishing to run more than one batch system per host could be the following.</p>
<p>A test and production batch system are maintained on a single host. This is most likely in a development facility, but could
also arise when a site is moving from one version to another. The new batch system could be installed as a test version that is
completely separate from the production batch system, so that problems can be isolated to the test system. Requiring the batch
nodes to coordinate their use of sequence numbers creates a dependency between the two nodes, and that defeats the purpose of
running two nodes.</p>
<p>A site has multiple departments using a single host, with different management policies. An example of contention might be in
job selection algorithms. One group might want a FIFO type of selection, while another group wishes to use a more complex algorithm
based on resource availability. Again, requiring the batch nodes to coordinate is an unnecessary binding.</p>
<p>The proposal eventually accepted was to replace originating host with originating batch node. This supplies sufficient
granularity to ensure unique job identifiers. If more than one batch node is on a particular host, they each have their own unique
name.</p>
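<p>The effect of naming the originating batch node rather than the host can be sketched in Python. This is a minimal
illustration, not part of the option: the <tt>JobId</tt> and <tt>BatchNode</tt> classes, the helper functions, and the node names
are all hypothetical, and the only thing taken from the text is the identifier form
<tt>sequence_number.originating_batch_node</tt>.</p>

```python
from typing import NamedTuple


class JobId(NamedTuple):
    """Hypothetical holder for sequence_number.originating_batch_node."""
    sequence_number: int
    batch_node: str


def format_job_id(job_id):
    return f"{job_id.sequence_number}.{job_id.batch_node}"


def parse_job_id(text):
    # Split at the first period only: the sequence number never contains
    # one, but a batch node name (for example a dotted name) may.
    sequence, separator, node = text.partition(".")
    if not separator or not node:
        raise ValueError(f"not a job identifier: {text!r}")
    if not (sequence.isascii() and sequence.isdigit()):
        raise ValueError(f"bad sequence number in: {text!r}")
    return JobId(int(sequence), node)


class BatchNode:
    """Toy batch node that numbers jobs from its own private counter."""

    def __init__(self, name):
        self.name = name
        self._next_sequence = 1

    def new_job_id(self):
        job_id = JobId(self._next_sequence, self.name)
        self._next_sequence += 1
        return job_id


# Two nodes on one host reuse sequence numbers without coordinating,
# yet the identifiers stay distinct because the node names differ.
production = BatchNode("prod_node")
test = BatchNode("test_node")
first, second = production.new_job_id(), test.new_job_id()
```

<p>Because each node keeps a private counter, the test and production systems in the example need no shared state, which is the
independence the working group was after.</p>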
<p>The queue portion of a destination is not part of the job identifier because queue names are not required to be unique between
batch nodes. For instance, two batch nodes may both have queues called small, medium, and large. It is only the batch node name
that is uniquely identifiable throughout the batch system. The queue name has no additional function in this context.</p>
<p>Assume there are three batch nodes, each of which has its own name server. On batch node one, there are no queues. On batch node
two, there are fifty queues. On batch node three, there are forty queues. The system administrator for batch node one does not have
to configure queues, because there are none implemented. However, if a user wishes to send a job to either batch node two or three,
the system administrator for batch node one must configure a destination that maps to the appropriate batch node and queue. If
every queue is to be made accessible from batch node one, the system administrator has to configure ninety destinations.</p>
<p>To avoid requiring this, there should be a mechanism to allow a user to separate the destination into a batch node name and a
queue name. Then, an implementation that is configured to get to all the batch nodes does not need any more configuration to allow
a user to get to all of the queues on all of the batch nodes. The node name is used to locate the batch node, while the queue name
is sent unchanged to that batch node.</p>
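<p>The separation of a destination into a node name and a queue name can be sketched as follows. The <tt>queue@node</tt> spelling
is assumed here for illustration; the function names, the node names, and the placeholder addresses are hypothetical. The queue
counts are those of the three-node example above.</p>

```python
def split_destination(destination):
    """Split 'queue@node' into (queue, node); either part may be empty."""
    queue, _, node = destination.partition("@")
    return queue, node


# Hypothetical table on the submitting node: one entry per reachable
# batch node and none per queue. The addresses are placeholders.
NODE_ADDRESSES = {"node_two": "192.0.2.2", "node_three": "192.0.2.3"}


def route(destination):
    queue, node = split_destination(destination)
    # The node name locates the batch node; the queue name is passed
    # through unchanged and is interpreted only by that node.
    return NODE_ADDRESSES[node], queue


# Without the split, node one needs one destination per remote queue.
destinations_without_split = 50 + 40
# With the split, it needs only one entry per remote batch node.
entries_with_split = len(NODE_ADDRESSES)
```

<p>In this sketch the submitting node configures two entries instead of ninety, and a queue added later on node two or three
becomes reachable without any change on node one.</p>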
<p>The following are requirements that a destination identifier must be capable of providing:</p>
<ul>
<li>
<p>The ability to direct a job to a queue in a particular batch node.</p>
</li>
<li>
<p>The ability to direct a job to a particular batch node.</p>
</li>
<li>
<p>The ability to group at a higher level than just one queue. This includes grouping similar queues across multiple batch nodes
(this is a pipe queue).</p>
</li>
<li>
<p>The ability to group batch nodes. This allows a user to submit a job to a group name with no knowledge of the batch node
configuration. This also provides aliasing as a special case. Aliasing is a group containing only one batch node name. The group
name is the alias.</p>
</li>
</ul>
<p>In addition, the administrator has the following requirements:</p>
<ul>
<li>
<p>The ability to control access to the queues.</p>
</li>
<li>
<p>The ability to control access to the batch nodes.</p>
</li>
<li>
<p>The ability to control access to groups of queues (pipe queues).</p>
</li>
<li>
<p>The ability to configure retry time intervals and durations.</p>
</li>
</ul>
<p>The requirements of the user are met by destination as explained in the following.</p>
<p>The user has the ability to specify a queue name, which is known only to the batch node specified. There is no configuration of
these queues required on the submitting node.</p>
<p>The user has the ability to specify a batch node whose name is network-unique. The configuration required is that the batch node
be defined as an application, just as other applications such as FTP are configured.</p>
<p>Once a job reaches a queue, it can again become a user of the batch system. The batch node can choose to send the job to another
batch node or queue or both. In other words, the routing is at an application level, and it is up to the batch system to choose
where the job will be sent. Configuration is up to the batch node where the queue resides. This provides grouping of queues across
batch nodes or within a batch node. The user submits the job to a queue, which by definition routes the job to other queues or
nodes or both.</p>
<p>A node name may be given to a naming service, which returns multiple addresses as opposed to just one. This provides grouping at
a batch node level. This is a local issue, meaning that the batch node must choose only one of these addresses. The list of
addresses is not sent with the job, and once the job is accepted on another node, there is no connection between the list and the
job.</p>
<p>The requirements of the administrator are met by destination as explained in the following.</p>
<p>The control of queues is a batch system issue, and will be done using the batch administrative utilities.</p>
<p>The control of nodes is a network issue, and will be done through whatever network facilities are available.</p>
<p>The control of access to groups of queues (pipe queues) is covered by the control of any other queue. The fact that the job may
then be sent to another destination is not relevant.</p>
<p>The propagation of a job across more than one point-to-point connection was dropped because of its complexity and because all of
the issues arising from this capability could not be resolved. It could be provided as additional functionality at some time in the
future.</p>
<p>The addition of <i>network</i> as a defined term was done to clarify the difference between a network of batch nodes as opposed
to a network of hosts. A network of batch nodes is referred to as a batch system. The network refers to the actual host
configuration. A single host may have multiple batch nodes.</p>
<p>In the absence of a standard network naming convention, this option establishes its own convention for the sake of consistency
and expediency. This is subject to change, should a future working group develop a standard naming convention for network
pathnames.</p>
<h4><a name="tag_02_03_01"></a>Batch General Concepts</h4>
<p>During the development of the Batch Environment option, a number of topics were discussed at length which influenced the wording
of the normative text but could not be included in the final text. The following items are some of the most significant terms and
concepts of those discussed:</p>
<ul>
<li>
<p>Small and Consistent Command Set</p>
<p>Often, conventional utilities from UNIX systems have a very complicated utility syntax and usage. This can often result in
confusion and errors when trying to use them. The Batch Environment option utility set, on the other hand, has been pared to a
small set of robust utilities with an orthogonal calling sequence.</p>
</li>
<li>
<p>Checkpoint/Restart</p>
<p>This feature permits an already executing process to checkpoint or save its contents. Some implementations permit this both at
the batch utility level (for example, checkpointing this job upon its abnormal termination) and from within the job itself via a
system call. Support of checkpoint/restart is optional. A conscious, careful effort was made to make the <a href=
"../utilities/qsub.html"><i>qsub</i></a> utility consistently refer to checkpoint/restart as optional functionality.</p>
</li>
<li>
<p>Rerunnability</p>
<p>When a user submits a job for batch processing, they can designate it "rerunnable", meaning that it will automatically restart
from the beginning of the job if the machine on which it was executing crashes for some reason. The decision on whether the
job will be rerun or not is entirely up to the submitter of the job, and no such decision is made within the batch system. A job
that is rerunnable and has been submitted with the proper checkpoint/restart switch will first be checkpointed, and execution
will resume from that checkpoint. Furthermore, use of the implementation-defined checkpoint/restart feature is not defined in
this context.</p>
</li>
<li>
<p>Error Codes</p>
<p>All utilities exit with status zero (0) if successful, one (1) if a user error occurred, and two (2) for an internal Batch
Environment option error.</p>
</li>
<li>
<p>Level of Portability</p>
<p>Portability is specified at the user, operator, and administrator levels. A conforming batch implementation presents
identical functionality and behavior at all these levels. Additionally, portable batch shell scripts with embedded Batch
Environment option utilities provide a further level of portability.</p>
</li>
<li>
<p>Resource Specification</p>
<p>A small set of globally understood resources, such as memory and CPU time, is specified. All conforming batch implementations
are able to process them in a manner consistent with the yet-to-be-developed resource management model. Resources not in this
amendment set are ignored and passed along as part of the argument stream of the utility.</p>
</li>
<li>
<p>Queue Position</p>
<p>Queue position is the place a job occupies in a queue. It is dependent on a variety of factors such as submission time and
priority. Since priority may be affected by the implementation of fair share scheduling, the definition of queue position is
implementation-defined.</p>
</li>
<li>
<p>Queue ID</p>
<p>A numerical queue ID is an external requirement for purposes of accounting. The identification number was chosen over queue name
for processing convenience.</p>
</li>
<li>
<p>Job ID</p>
<p>A common notion of a "job" is a collection of processes whose process group cannot be altered and which is used for resource
management and accounting. This concept is implementation-defined and, as such, has been omitted from the batch amendment.</p>
</li>
<li>
<p>Bytes <i>versus</i> Words</p>
<p>Except for one case, bytes are used as the standard unit for memory size. Furthermore, the definition of a word varies from
machine to machine. Therefore, bytes will be the default unit of memory size.</p>
</li>
<li>
<p>Regular Expressions</p>
<p>The standard definition of regular expressions is much too broad to be used in the batch utility syntax. All that is needed is a
simple concept of "all"; for example, delete all my jobs from the named queue. For this reason, regular expressions have been
eliminated from the batch amendment.</p>
</li>
<li>
<p>Display Privacy</p>
<p>How much data should be displayed locally through functions? Local policy dictates the amount of privacy. Library functions must
be used to create and enforce local policy. Network and local <a href="../utilities/qstat.html"><i>qstat</i></a>s must reflect the
policy of the server machine.</p>
</li>
<li>
<p>Remote Host Naming Convention</p>
<p>It was decided that host names would be a maximum of 255 characters in length, with at most 15 characters being shown in
displays. The 255-character limit was chosen because it is consistent with BSD. The 15-character limit was an arbitrary
decision.</p>
</li>
<li>
<p>Network Administration</p>
<p>Network administration is important, but is outside the scope of the batch amendment. Network administration could be done with
<i>rsh</i>. However, authentication becomes two-sided.</p>
</li>
<li>
<p>Network Administration Philosophy</p>
<p>Keep it simple. Centralized management should be possible. For example, Los Alamos needs a dumb set of CPUs to be managed by a
central system <i>versus</i> several independently-managed systems as is the general case for the Batch Environment option.</p>
</li>
<li>
<p>Operator Utility Defaults (that is, Default Host, User, Account, and so on)</p>
<p>It was decided that usability would override orthogonality and syntactic consistency.</p>
</li>
<li>
<p>The Batch System Manager and Operator Distinction</p>
<p>The distinction between manager and operator is that operators can only control the flow of jobs. A manager can alter the batch
system configuration in addition to job flow. POSIX makes a distinction between user and system administrator but goes no further.
The concepts of manager and operator privileges fall under local policy. The distinction between manager and operator is historical
in batch environments, and the Batch Environment option has continued that distinction.</p>
</li>
<li>
<p>The Batch System Administrator</p>
<p>An administrator is equivalent to a batch system manager.</p>
</li>
</ul>
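<p>The exit status convention described under Error Codes in the list above can be illustrated with a small Python sketch. It
is not taken from any batch implementation: the exception classes, the constant names, and the <tt>run_utility</tt> wrapper are
hypothetical, and only the three status values come from the text.</p>

```python
import sys

EXIT_OK = 0           # successful completion
EXIT_USER_ERROR = 1   # the caller supplied a bad argument or request
EXIT_BATCH_ERROR = 2  # internal Batch Environment option error


class UserError(Exception):
    """Hypothetical: raised for invalid arguments or requests."""


class BatchInternalError(Exception):
    """Hypothetical: raised when the batch system itself fails."""


def run_utility(action):
    """Run action() and translate its outcome into an exit status."""
    try:
        action()
    except UserError as error:
        print(f"user error: {error}", file=sys.stderr)
        return EXIT_USER_ERROR
    except BatchInternalError as error:
        print(f"batch error: {error}", file=sys.stderr)
        return EXIT_BATCH_ERROR
    return EXIT_OK


def reject_request():
    raise UserError("unknown queue")


def fail_internally():
    raise BatchInternalError("server unreachable")
```

<p>A real utility would pass the returned value to <tt>sys.exit()</tt>, so that a calling shell script can tell a mistake by the
user apart from a failure of the batch system.</p>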
<h4><a name="tag_02_03_02"></a>Batch Services</h4>
<p>This rationale is provided as informative rather than normative text, to avoid placing requirements on implementors regarding
the use of symbolic constants, but at the same time to give implementors a preferred practice for assigning values to these
constants to promote interoperability.</p>
<p>The <i>Checkpoint</i> and <i>Minimum_Cpu_Interval</i> attributes induce a variety of behavior depending upon their values. Some
jobs cannot or should not be checkpointed. Some users simply need to ensure job continuation across planned downtimes; for
example, scheduled preventive maintenance. For users consuming expensive resources, or for jobs that run longer than the mean time
between failures, periodic checkpointing may be essential. System administrators must nevertheless be able to set minimum
checkpoint intervals on a queue-by-queue basis to guard against, for example, naive users specifying interval values that are too
small for memory-intensive jobs; otherwise, system overhead would adversely affect performance.</p>
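<p>The interaction between a requested checkpoint interval and a per-queue minimum can be sketched as follows. This assumes that
a server raises a too-small request to the queue minimum rather than rejecting the job, which is one possible policy and is not
stated in the text above; the function and parameter names are hypothetical.</p>

```python
def effective_checkpoint_interval(requested_minutes, queue_minimum_minutes):
    """Return the interval actually used, never below the queue minimum."""
    if requested_minutes is None:
        # No periodic checkpointing was requested; nothing to clamp.
        return None
    return max(requested_minutes, queue_minimum_minutes)
```

<p>With a queue minimum of 30 minutes, a request for a 1 minute interval would be served at 30 minutes, while a request for 120
minutes would be left unchanged.</p>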
<p>The use of symbolic constants, such as NO_CHECKPOINT, was introduced to lend a degree of formalism and portability to this
option.</p>
<p>Support for checkpointing is optional for servers. However, clients must provide for the <b>-c</b> option, since in a
distributed environment the job may run on a server that does provide such support, even if the host of the client does not support
the checkpoint feature.</p>
<p>If the user does not specify the <b>-c</b> option, the default action is left unspecified by this option. Some implementations
may wish to do checkpointing by default; others may wish to checkpoint only under an explicit request from the user.</p>
<p>The <i>Priority</i> attribute has been made non-optional. All clients had already been required to support the <b>-p</b> option.
The concept of prioritization is common in historical implementations. The default priority is left to the server to establish.</p>
<p>The <i>Hold_Types</i> attribute has been modified to allow for implementation-defined hold types to be passed to a batch
server.</p>
<p>It was the intent of the IEEE P1003.15 working group to mandate the support for the <i>Resource_List</i> attribute in this
option by referring to another amendment, specifically the IEEE P1003.1a draft standard. However, this was excluded during the
development of the IEEE P1003.1a draft standard. As such, this requirement has been removed from the normative text.</p>
<p>The <i>Shell_Path</i> attribute has been modified to accept a list of shell paths that are associated with a host. The name of
the attribute has been changed to <i>Shell_Path_List</i>.</p>
<h4><a name="tag_02_03_03"></a>Common Behavior for Batch Environment Utilities</h4>
<p>This section was defined to meet the goal of a "Small and Consistent Command Set" for this option.</p>
<hr size="2" noshade>
<center><font size="2"><!--footer start-->
UNIX &reg; is a registered Trademark of The Open Group.<br>
POSIX &reg; is a registered Trademark of The IEEE.<br>
[ <a href="../mindex.html">Main Index</a> | <a href="../basedefs/contents.html">XBD</a> | <a href=
"../utilities/contents.html">XCU</a> | <a href="../functions/contents.html">XSH</a> | <a href="../xrat/contents.html">XRAT</a>
]</font></center>
<!--footer end-->
<hr size="2" noshade>
</body>
</html>