Next rev of shell project
@@ -39,130 +39,147 @@ should be `wish`:
|
||||
|
||||
```
|
||||
prompt> ./wish
|
||||
wish>
|
||||
```
|
||||
|
||||
At this point, `wish` is running, and ready to accept commands. Type away!
|
||||
|
||||
The mode above is called *interactive* mode, and allows the user to type
|
||||
commands directly. The shell also supports a *batch mode*, which instead reads
|
||||
input from a batch file and executes commands from therein. Here is how you
|
||||
run the shell with a batch file named `batch.txt`:
|
||||
|
||||
```
|
||||
prompt> ./wish batch.txt
|
||||
```
|
||||
|
||||
You should structure your shell such that it creates a new process for each
|
||||
new command (note that there are a few exceptions to this, which we discuss
|
||||
below). Your basic shell should be able to parse a command and run the
|
||||
program corresponding to the command. For example, if the user types `ls
|
||||
-la /tmp`, your shell should run the program `/bin/ls` with the given
|
||||
arguments `-la` and `/tmp`.
|
||||
new command (the exceptions are *built-in commands*, discussed below). Your
|
||||
basic shell should be able to parse a command and run the program
|
||||
corresponding to the command. For example, if the user types `ls -la /tmp`,
|
||||
your shell should run the program `/bin/ls` with the given arguments `-la` and
|
||||
`/tmp` (how does the shell know to run `/bin/ls`? It's something called the
|
||||
shell **path**; more on this below).
|
||||
|
||||
You might be wondering how the shell knows to run `/bin/ls` (which means the
|
||||
program binary `ls` is found in the directory `/bin`) when you type `ls`. The
|
||||
shell knows this thanks to a **path** variable that the user sets. The path
|
||||
variable contains the list of all directories to search, in order, when the
|
||||
user types a command. We'll learn more about how to deal with the path below.
|
||||
## Structure
|
||||
|
||||
**Important:** Note that the shell itself does not *implement* `ls` or really
|
||||
many other commands at all (it does implement a few, called *built-ins*,
|
||||
described further below). All it does is find those executables in one of the
|
||||
directories specified by `path` and create a new process to run them. More on
|
||||
this below.
|
||||
### Basic Shell
|
||||
|
||||
The shell is very simple (conceptually): it runs in a while loop, repeatedly
|
||||
asking for input to tell it what command to execute. It then executes that
|
||||
command. The loop continues indefinitely, until the user types the built-in
|
||||
command `exit`, at which point it exits. That's it!
|
||||
|
||||
For reading lines of input, you should use `getline()`. This allows you to
|
||||
obtain arbitrarily long input lines with ease. Generally, the shell will be
|
||||
run in *interactive mode*, where the user types a command (one at a time) and
|
||||
the shell acts on it. However, your shell will also support *batch mode*, in
|
||||
which the shell is given an input file of commands; in this case, the shell
|
||||
should not read user input (from `stdin`) but rather from this file to get the
|
||||
commands to execute.
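
The read loop described above might be sketched as follows. This is a minimal illustration rather than a reference solution: the function name `read_commands` and its `interactive` flag are hypothetical, and the sketch only counts lines instead of executing them. The same function serves both modes, since the caller passes either `stdin` or a `FILE *` opened on the batch file.

```c
#define _POSIX_C_SOURCE 200809L  /* makes getline() visible on strict compilers */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read lines from `in` until EOF or a line that is
 * exactly "exit". Prints the prompt only when `interactive` is nonzero.
 * Returns how many lines were read before stopping. */
int read_commands(FILE *in, int interactive) {
    char *line = NULL;   /* getline() allocates and grows this buffer */
    size_t cap = 0;
    int count = 0;

    for (;;) {
        if (interactive) {
            printf("wish> ");
            fflush(stdout);
        }
        if (getline(&line, &cap, in) == -1)
            break;                          /* EOF or read error */
        line[strcspn(line, "\n")] = '\0';   /* strip the trailing newline */
        if (strcmp(line, "exit") == 0)      /* no whitespace handling here */
            break;
        /* A real shell would parse and execute `line` at this point. */
        count++;
    }
    free(line);
    return count;
}
```

Because `getline()` resizes the buffer itself, the loop never needs a fixed line-length limit; the single `free()` at the end releases whatever it allocated.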
|
||||
|
||||
To parse the input line into constituent pieces, you might want to use
|
||||
`strtok()`. Read the man page (carefully) for more details.
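
As a rough sketch of that parsing step, the hypothetical helper `split_line` below breaks a line into whitespace-separated tokens with `strtok()`. The name, the `max_args` limit, and the choice of delimiters (space, tab, newline) are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split `line` in place on spaces, tabs, and newlines.
 * Stores at most max_args - 1 tokens in argv, terminates the array with
 * NULL, and returns the number of tokens stored. max_args must be >= 1. */
int split_line(char *line, char **argv, int max_args) {
    int argc = 0;
    char *tok = strtok(line, " \t\n");
    while (tok != NULL && argc < max_args - 1) {
        argv[argc++] = tok;
        tok = strtok(NULL, " \t\n");   /* NULL continues the same string */
    }
    argv[argc] = NULL;                 /* execv() expects this terminator */
    return argc;
}
```

Note that `strtok()` modifies the buffer it is given (it writes `'\0'` over delimiters) and keeps hidden state between calls, which is one reason the man page deserves a careful read.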
|
||||
|
||||
To execute commands, look into `fork()`, `exec()`, and `wait()/waitpid()`.
|
||||
See the man pages for these functions, and also read the relevant [book
|
||||
chapter](http://www.ostep.org/cpu-api.pdf) for a brief overview.
|
||||
|
||||
## Built-in Commands
|
||||
You will note that there are a variety of commands in the `exec` family; for
|
||||
this project, you must use `execv`. You should **not** use the `system()`
|
||||
library function call to run a command. Remember that if `execv()` is
|
||||
successful, it will not return; if it does return, there was an error (e.g.,
|
||||
the command does not exist). The most challenging part is getting the
|
||||
arguments correctly specified.
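
One possible shape for this step is sketched below. The helper name `run_program` and the exit code 127 for a failed `execv()` are assumptions chosen for the example, not requirements of the project; `path` is expected to be the full path to the binary, and `argv` must end with a NULL pointer.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at `path` in a child process and
 * wait for it. Returns the child's exit status, or -1 on failure. */
int run_program(const char *path, char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;                 /* fork failed */

    if (pid == 0) {
        execv(path, argv);         /* only returns if something went wrong */
        _exit(127);                /* assumption: 127 marks an exec failure */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child calls `_exit()` rather than `exit()` after a failed `execv()` so that it does not flush stdio buffers inherited from the parent.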
|
||||
|
||||
### Paths
|
||||
|
||||
In our example above, the user typed `ls` but the shell knew to execute the
|
||||
program `/bin/ls`. How does your shell know this?
|
||||
|
||||
It turns out that the user must specify a **path** variable to describe the
|
||||
set of directories to search for executables; the set of directories that
|
||||
comprise the path are sometimes called the *search path* of the shell. The
|
||||
path variable contains the list of all directories to search, in order, when
|
||||
the user types a command.
|
||||
|
||||
**Important:** Note that the shell itself does not *implement* `ls` or other
|
||||
commands (except built-ins). All it does is find those executables in one of
|
||||
the directories specified by `path` and create a new process to run them.
|
||||
|
||||
To check if a particular file exists in a directory and is executable,
|
||||
consider the `access()` system call. For example, when the user types `ls`,
|
||||
and path is set to include both `/bin` and `/usr/bin`, try `access("/bin/ls",
|
||||
X_OK)`. If that fails, try "/usr/bin/ls". If that fails too, it is an error.
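
The search could be sketched roughly like this. The helper `find_executable` and its parameters are hypothetical; it assumes the shell keeps its path as an array of directory strings.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: try each directory in `dirs`, in order, and write the
 * first executable candidate "dir/cmd" into `out`.
 * Returns 0 if one was found, -1 otherwise. */
int find_executable(const char *cmd, char *const dirs[], int ndirs,
                    char *out, size_t outlen) {
    for (int i = 0; i < ndirs; i++) {
        int n = snprintf(out, outlen, "%s/%s", dirs[i], cmd);
        if (n < 0 || (size_t)n >= outlen)
            continue;                  /* candidate did not fit in `out` */
        if (access(out, X_OK) == 0)
            return 0;                  /* exists and is executable */
    }
    return -1;                         /* not found anywhere on the path */
}
```

With an empty path (`ndirs == 0`) the loop never runs and the function reports failure, which matches the rule that no programs can be run when the path is empty.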
|
||||
|
||||
### Built-in Commands
|
||||
|
||||
Whenever your shell accepts a command, it should check whether the command is
|
||||
a **built-in command** or not. If it is, it should not be executed like other
|
||||
programs. Instead, your shell will invoke your implementation of the built-in
|
||||
command. For example, to implement the `exit` built-in command, you simply
|
||||
call `exit(0);` in your C program.
|
||||
call `exit(0);` in your wish source code, which then will exit the shell.
|
||||
|
||||
So far, you have added your own `exit` built-in command. Most Unix shells have
|
||||
many others such as `cd`, `pwd`, etc. In this project, you should implement
|
||||
`exit`, `cd`, `pwd`, and `path`.
|
||||
In this project, you should implement `exit`, `cd`, `pwd`, and `path` as
|
||||
built-in commands.
|
||||
|
||||
The formats for `exit`, `cd`, and `pwd` are:
|
||||
* `exit`: When the user types `exit`, your shell should simply call the `exit`
|
||||
system call with 0 as a parameter. It is an error to pass any arguments to
|
||||
`exit`.
|
||||
|
||||
```
|
||||
[optional-space]exit[optional-space]
|
||||
[optional-space]pwd[optional-space]
|
||||
[optional-space]cd[optional-space]
|
||||
[optional-space]cd[oneOrMoreSpace]dir[optional-space]
|
||||
```
|
||||
* `cd`: `cd` always takes one argument (0 or >1 args should be signaled as an
|
||||
error). To change directories, use the `chdir()` system call with the argument
|
||||
supplied by the user; if `chdir` fails, that is also an error.
|
||||
|
||||
When you run `cd` (without arguments), your shell should change the working
|
||||
directory to the path stored in the $HOME environment variable. Use the call
|
||||
`getenv("HOME")` in your `wish` source code to obtain this value.
|
||||
* `pwd`: When a user types `pwd`, your shell should call getcwd() and show the
|
||||
result. It is an error to pass any arguments to `pwd`.
|
||||
|
||||
You do not have to support tilde (~). Although in a typical Unix shell you
|
||||
could go to a user's directory by typing `cd ~username`, in this project you
|
||||
do not have to deal with tilde. You should treat it like a common character,
|
||||
i.e., you should just pass the whole word (e.g. "~username") to chdir(), and
|
||||
chdir will return an error.
|
||||
* `path`: The `path` command takes 0 or more arguments, with each argument
|
||||
separated by whitespace from the others. A typical usage would be like this:
|
||||
`wish> path /bin /usr/bin`, which would add `/bin` and `/usr/bin` to the
|
||||
search path of the shell. If the user sets path to be empty, then the shell
|
||||
should not be able to run any programs (except built-in commands).
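
To make the argument checks concrete, here is a small sketch of two of these built-ins under the rules in the bullets above (`cd` takes exactly one argument, `pwd` takes none). The helper names and the convention of returning -1 when the shell should print its error message are assumptions for illustration.

```c
#include <unistd.h>

/* Hypothetical helper for `cd`; argv[0] is "cd" and argc counts it.
 * Returns 0 on success, -1 if the shell should report an error. */
int builtin_cd(int argc, char *argv[]) {
    if (argc != 2)
        return -1;                     /* 0 or >1 arguments is an error */
    if (chdir(argv[1]) != 0)
        return -1;                     /* chdir() failure is also an error */
    return 0;
}

/* Hypothetical helper for `pwd`; writes the working directory into `out`.
 * Returns 0 on success, -1 if the shell should report an error. */
int builtin_pwd(int argc, char *out, size_t outlen) {
    if (argc != 1)
        return -1;                     /* pwd accepts no arguments */
    return getcwd(out, outlen) != NULL ? 0 : -1;
}
```

Both helpers run inside the shell process itself; calling `chdir()` in a forked child would change only the child's working directory, not the shell's.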
|
||||
|
||||
Basically, when a user types `pwd`, you simply call getcwd(), and show the
|
||||
result. When a user changes the current working directory (e.g. "cd
|
||||
somepath"), you simply call chdir(). Hence, if you run your shell, and then
|
||||
run pwd, it should look like this:
|
||||
### Redirection
|
||||
|
||||
```
|
||||
% cd
|
||||
% pwd
|
||||
/afs/cs.wisc.edu/u/m/j/username
|
||||
% echo $PWD
|
||||
/u/m/j/username
|
||||
% ./wish
|
||||
wish> pwd
|
||||
/afs/cs.wisc.edu/u/m/j/username
|
||||
```
|
||||
|
||||
The format of the `path` built-in command is:
|
||||
```
|
||||
[optionalSpace]path[oneOrMoreSpace]dir[optionalSpace] (and possibly more directories, space separated)
|
||||
```
|
||||
|
||||
A typical usage would be like this:
|
||||
|
||||
```
|
||||
wish> path /bin /usr/bin
|
||||
```
|
||||
|
||||
By doing this, your shell will know to look in `/bin` and `/usr/bin`
|
||||
when a user types a command, to see if it can find the proper binary to
|
||||
execute. If the user sets path to be empty, then the shell should not be able
|
||||
to run any programs unless XXX (but built-in commands, such as path, should
|
||||
still work).
|
||||
|
||||
## Redirection
|
||||
|
||||
Many times, a shell user prefers to send the output of his/her program to a
|
||||
file rather than to the screen. Usually, a shell provides this nice feature
|
||||
with the `>` character. Formally, this is called redirection of standard
|
||||
Many times, a shell user prefers to send the output of a program to a file
|
||||
rather than to the screen. Usually, a shell provides this nice feature with
|
||||
the `>` character. Formally, this is called redirection of standard
|
||||
output. To make your shell users happy, your shell should also include this
|
||||
feature, but with a slight twist (explained below).
|
||||
|
||||
For example, if a user types `ls -la /tmp > output`, nothing should be printed
|
||||
on the screen. Instead, the standard output of the `ls` program should be
|
||||
rerouted to the `output.out` file. In addition, the standard error output of
|
||||
the program should be rerouted to the file `output.err` (the twist is that this
|
||||
rerouted to the file `output`. In addition, the standard error output of
|
||||
the program should be rerouted to the file `output` (the twist is that this
|
||||
is a little different than standard redirection).
|
||||
|
||||
If the `output.out` or `output.err` files already exist before you run your
|
||||
program, you should simply overwrite them (after truncating). If the output
|
||||
file is not specified (e.g., the user types `ls >` without a file), you should
|
||||
print an error message and not run the program `ls`.
|
||||
If the `output` file exists before you run your program, you should simply
|
||||
overwrite it (after truncating it).
|
||||
|
||||
Here are some redirections that should **not** work:
|
||||
```
|
||||
ls > out1 out2
|
||||
ls > out1 out2 out3
|
||||
ls > out1 > out2
|
||||
```
|
||||
The exact format of redirection is a command (and possibly some arguments)
|
||||
followed by the redirection symbol followed by a filename. Multiple
|
||||
redirection operators or multiple files to the right of the redirection sign
|
||||
are errors.
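
A possible sketch of running one redirected command is shown below. It uses `dup2()` to point both standard output and standard error at the file, which is one common technique and not necessarily the approach the project intends. The helper name `run_redirected`, the file mode `0644`, and the exit codes 126 and 127 are assumptions.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `path` with stdout and stderr both sent to
 * `outfile`, truncating the file if it already exists.
 * Returns the child's exit status, or -1 on failure. */
int run_redirected(const char *path, char *const argv[], const char *outfile) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Redirect in the child only, so the shell keeps its own stdout. */
        int fd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            _exit(126);
        if (dup2(fd, STDOUT_FILENO) < 0 || dup2(fd, STDERR_FILENO) < 0)
            _exit(126);
        if (fd > STDERR_FILENO)
            close(fd);             /* the duplicates are enough */
        execv(path, argv);
        _exit(127);                /* execv() returned, so it failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Checks such as "exactly one filename after `>`" belong in the parser, before a function like this is ever called.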
|
||||
|
||||
Note: don't worry about redirection for built-in commands (e.g., we will
|
||||
not test what happens when you type `path /bin > file`).
|
||||
|
||||
## Parallel Commands
|
||||
### Parallel Commands
|
||||
|
||||
Your shell will also allow the user to launch parallel commands.
|
||||
Your shell will also allow the user to launch parallel commands. This is
|
||||
accomplished with the ampersand operator as follows:
|
||||
|
||||
```
|
||||
wish> cmd1 & cmd2 args1 args2 & cmd3 args1
|
||||
```
|
||||
|
||||
In this case, instead of running `cmd1` and then waiting for it to finish,
|
||||
your shell should run `cmd1`, `cmd2`, and `cmd3` (each with whatever arguments
|
||||
the user has passed to it).
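
One way to picture this is to start every command first and only then collect the children, as in the sketch below. The helper `run_parallel` is hypothetical, it assumes each `cmds[i][0]` already holds the full path to the binary, and waiting for all children before returning is an assumption of the sketch rather than something stated above.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: launch n commands without waiting in between, then
 * wait for all of them. cmds[i] is a NULL-terminated argv array whose first
 * element is the full path of the program.
 * Returns the number of children that were started and reaped. */
int run_parallel(char **cmds[], int n) {
    int started = 0;

    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0) {
            execv(cmds[i][0], cmds[i]);
            _exit(127);            /* exec failed in this child */
        }
        if (pid > 0)
            started++;             /* do not wait yet: keep launching */
    }

    int reaped = 0;
    for (int i = 0; i < started; i++) {
        if (wait(NULL) > 0)        /* collect children in any order */
            reaped++;
    }
    return reaped;
}
```

The key difference from the single-command case is that the `wait()` calls sit in a second loop, after every `fork()` has already happened.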
|
||||
|
||||
|
||||
## Program Errors
|
||||
### Program Errors
|
||||
|
||||
**The one and only error message.** You should print this one and only error
|
||||
message whenever you encounter an error of any type:
|
||||
@@ -172,8 +189,8 @@ message whenever you encounter an error of any type:
|
||||
write(STDERR_FILENO, error_message, strlen(error_message));
|
||||
```
|
||||
|
||||
The error message should be printed to stderr (standard error). Also,
|
||||
do not add whitespaces or tabs or extra error messages.
|
||||
The error message should be printed to stderr (standard error), as shown
|
||||
above.
|
||||
|
||||
There is a difference between errors that your shell catches and those that
|
||||
the program catches. Your shell should catch all the syntax errors specified
|
||||
@@ -183,178 +200,32 @@ invalid arguments to `ls` when you run it, for example), let the program
|
||||
print its specific error messages in any manner it desires (e.g., could be
|
||||
stdout or stderr).
|
||||
|
||||
## White Spaces
|
||||
|
||||
The `>` operator will be separated by spaces. Valid input may include the
|
||||
following:
|
||||
|
||||
```
|
||||
wish> ls
|
||||
wish> ls > a
|
||||
wish> ls > a
|
||||
```
|
||||
|
||||
But not this (it is ok if this works, it just doesn't have to):
|
||||
|
||||
```
|
||||
wish> ls>a
|
||||
```
|
||||
|
||||
|
||||
## Defensive Programming and Error Messages
|
||||
|
||||
Defensive programming is good for you, so do it! It is also required. Your
|
||||
program should check all parameters, error-codes, etc. before it trusts
|
||||
them. In general, there should be no circumstances in which your C program
|
||||
will core dump, hang indefinitely, or prematurely terminate. Therefore, your
|
||||
program must respond to all input in a reasonable manner; by "reasonable",
|
||||
we mean print the error message (as specified in the next paragraph) and
|
||||
either continue processing or exit, depending upon the situation.
|
||||
|
||||
Since your code will be graded with automated testing, you should print this
|
||||
*one and only error message* whenever you encounter an error of any type:
|
||||
|
||||
```
|
||||
char error_message[30] = "An error has occurred\n";
|
||||
write(STDERR_FILENO, error_message, strlen(error_message));
|
||||
```
|
||||
|
||||
For this project, the error message should be printed to **stderr**. Also, do
|
||||
not attempt to add whitespaces or tabs or extra error messages.
|
||||
|
||||
You should consider the following situations as errors; in each case, your
|
||||
shell should print the error message to stderr and exit gracefully:
|
||||
|
||||
* An incorrect number of command line arguments to your shell program.
|
||||
|
||||
For the following situation, you should print the error message to
|
||||
stderr and continue processing:
|
||||
|
||||
* A command does not exist or cannot be executed.
|
||||
* A very long command line (over 128 bytes).
|
||||
|
||||
Your shell should also be able to handle the following scenarios below, which
|
||||
are *not errors.*
|
||||
|
||||
* An empty command line.
|
||||
* Multiple white spaces on a command line.
|
||||
|
||||
## Hints
|
||||
|
||||
Writing your shell in a simple manner is a matter of finding the relevant
|
||||
library routines and calling them properly. To simplify things for you in
|
||||
this assignment, we will suggest a few library routines you may want to use to
|
||||
make your coding easier. You are free to use these routines if you want or to
|
||||
disregard our suggestions. To find information on these library routines, look
|
||||
at the manual pages.
|
||||
|
||||
### Basic Shell
|
||||
|
||||
**Parsing:** For reading lines of input, once again check out `getline()`. To
|
||||
open a file and get a handle with type `FILE *`, look into `fopen()`. Be sure
|
||||
to check the return code of these routines for errors! You may find the
|
||||
`strtok()` routine useful for parsing the command line (i.e., for extracting
|
||||
the arguments within a command separated by whitespaces).
|
||||
|
||||
**Executing Commands:** Look into `fork`, `exec`, and `wait/waitpid`. See the
|
||||
man pages for these functions, and also read the [book chapter](http://www.ostep.org/cpu-api.pdf).
|
||||
|
||||
You will note that there are a variety of commands in the `exec` family; for
|
||||
this project, you must use `execv`. You should **not** use the `system()`
|
||||
library function call to run a command. Remember that if `execv()` is
|
||||
successful, it will not return; if it does return, there was an error (e.g.,
|
||||
the command does not exist). The most challenging part is getting the
|
||||
arguments correctly specified. The first argument specifies the program that
|
||||
should be executed, with the full path specified; this is
|
||||
straight-forward. The second argument, `char *argv[]` matches those
|
||||
that the program sees in its function prototype:
|
||||
|
||||
```c
|
||||
int main(int argc, char *argv[]);
|
||||
```
|
||||
|
||||
Note that this argument is an array of strings, or an array of
|
||||
pointers to characters. For example, if you invoke a program with:
|
||||
|
||||
```
|
||||
foo 205 535
|
||||
```
|
||||
|
||||
Assuming that you find `foo` in directory `/bin` (or elsewhere in the defined
|
||||
path), then argv[0] = "/bin/foo", argv[1] = "205" and argv[2] = "535".
|
||||
|
||||
Important: the list of arguments must be terminated with a NULL pointer; in
|
||||
our example, this means argv[3] = NULL. We strongly recommend that you
|
||||
carefully check that you are constructing this array correctly!
|
||||
|
||||
### Built-in Commands
|
||||
|
||||
For the `exit` built-in command, you should simply call `exit()` from within
|
||||
your source code. The corresponding shell process will exit, and the parent
|
||||
(i.e. your shell) will be notified.
|
||||
|
||||
For managing the current working directory, you should use `getenv()`,
|
||||
`chdir()`, and `getcwd()`. The `getenv()` call is useful when you want to go
|
||||
to your HOME directory. The `getcwd()` call is useful to know the current
|
||||
working directory, i.e., if a user types `pwd`, you simply call `getcwd()` and
|
||||
use those results. Finally, `chdir` is useful for moving to different
|
||||
directories. For more information on these topics, read the man pages or the
|
||||
Advanced Unix Programming book (Chapters 4 and 7) or look around online.
|
||||
|
||||
### Redirection
|
||||
|
||||
Redirection is relatively easy to implement. For example, to redirect standard
|
||||
output to a file, just use `close()` on stdout, and then `open()` on a
|
||||
file. More on this below.
|
||||
|
||||
With a file descriptor, you can perform read and write to a file. Maybe in
|
||||
your life so far, you have only used `fopen()`, `fread()`, and `fwrite()` for
|
||||
reading and writing to a file. Unfortunately, these functions work on `FILE
|
||||
*`, which is more of a C library support; the file descriptors are hidden.
|
||||
|
||||
To work on a file descriptor, you should use `open()`, `read()`, and `write()`
|
||||
system calls. These functions perform their work by using file descriptors.
|
||||
To understand more about file I/O and file descriptors you can read the
|
||||
Advanced Unix Programming book (Chapter 3) (specifically, 3.2 to 3.5, 3.7,
|
||||
3.8, and 3.12), or just read the man pages. Before reading forward, at this
|
||||
point, you should become more familiar with file descriptors.
|
||||
|
||||
The idea of redirection is to make the stdout descriptor point to your output
|
||||
file descriptor. First of all, let's understand the STDOUT_FILENO file
|
||||
descriptor. When a command `ls -la /tmp` runs, the `ls` program prints its
|
||||
output to the screen. But obviously, the ls program does not know what a
|
||||
screen is. All it knows is that the screen is basically pointed to by the
|
||||
STDOUT_FILENO file descriptor. In other words, you could rewrite
|
||||
`printf("hi");` in this way: `write(STDOUT_FILENO, "hi", 2);`.
|
||||
|
||||
To check if a particular file exists in a directory, use the `stat()` system
|
||||
call. For example, when the user types `ls`, and path is set to include both
|
||||
`/bin` and `/usr/bin`, try `stat("/bin/ls")`. If that fails, try
|
||||
`stat("/usr/bin/ls")`. If that fails too, print the **only error message**.
|
||||
|
||||
### Miscellaneous Hints
|
||||
|
||||
Remember to get the **basic functionality** of your shell working before
|
||||
worrying about all of the error conditions and end cases. For example, first
|
||||
get a single command running (probably first a command with no arguments, such
|
||||
as `ls`). Then try adding more arguments.
|
||||
as `ls`).
|
||||
|
||||
Next, try working on multiple commands. Make sure that you are correctly
|
||||
handling all of the cases where there is miscellaneous white space around
|
||||
commands or missing commands. Next, add built-in commands. Finally, add
|
||||
redirection support.
|
||||
Next, add built-in commands. Then, try working on redirection. Finally, think
|
||||
about parallel commands. Each of these requires a little more effort on
|
||||
parsing, but each should not be too hard to implement.
|
||||
|
||||
We strongly recommend that you check the return codes of all system
|
||||
calls from the very beginning of your work. This will often catch
|
||||
errors in how you are invoking these new system calls. And, it's just good
|
||||
programming sense.
|
||||
At some point, you should make sure your code is robust to white space of
|
||||
various kinds, including spaces (` `) and tabs (`\t`). In general, the user
|
||||
should be able to put variable amounts of white space before and after
|
||||
commands, arguments, and various operators; however, the operators
|
||||
(redirection and parallel commands) do not require whitespace.
|
||||
|
||||
Beat up your own code! You are the best (and in this case, the
|
||||
only) tester of this code. Throw lots of junk at it and make sure the
|
||||
shell behaves well. Good code comes through testing -- you must run
|
||||
all sorts of different tests to make sure things work as
|
||||
desired. Don't be gentle -- other users certainly won't be. Break it
|
||||
now so we don't have to break it later.
|
||||
Check the return codes of all system calls from the very beginning of your
|
||||
work. This will often catch errors in how you are invoking these new system
|
||||
calls. It's also just good programming sense.
|
||||
|
||||
Beat up your own code! You are the best (and in this case, the only) tester of
|
||||
this code. Throw lots of junk at it and make sure the shell behaves well. Good
|
||||
code comes through testing -- you must run all sorts of different tests to
|
||||
make sure things work as desired. Don't be gentle -- other users certainly
|
||||
won't be. Break it now so we don't have to break it later.
|
||||
|
||||
Keep versions of your code. More advanced programmers will use a source
|
||||
control system such as git. Minimally, when you get a piece of functionality
|
||||
|
||||