wcat not my-cat
@@ -4,7 +4,7 @@
In this project, you'll build a few different UNIX utilities, simple versions
of commonly used commands like **cat**, **ls**, etc. We'll call each of them a
slightly different name to avoid confusion; for example, instead of **cat**,
you'll be implementing **my-cat**.
you'll be implementing **wcat** (i.e., "wisconsin" cat).

Objectives:
* Re-familiarize yourself with the C programming language
@@ -20,44 +20,44 @@ and of course a basic understanding of C programming. If you **do not** have
these skills already, this is not the right place to start.

Summary of what gets turned in:
* A bunch of single .c files for each of the utilities below: **my-cat.c**,
**my-grep.c**, **my-zip.c**, and **my-unzip.c**.
* A bunch of single .c files for each of the utilities below: **wcat.c**,
**wgrep.c**, **wzip.c**, and **wunzip.c**.
* Each should compile successfully when compiled with the **-Wall** and
**-Werror** flags.
* Each should (hopefully) pass the tests we supply to you.

## my-cat
## wcat

The program **my-cat** is a simple program. Generally, it reads a file as
The program **wcat** is a simple program. Generally, it reads a file as
specified by the user and prints its contents. A typical usage is as follows,
in which the user wants to see the contents of main.c, and thus types:

```
prompt> ./my-cat main.c
prompt> ./wcat main.c
#include <stdio.h>
...
```

As shown, **my-cat** reads the file **main.c** and prints out its contents.
The "**./**" before the **my-cat** above is a UNIX thing; it just tells the
system which directory to find **my-cat** in (in this case, in the "." (dot)
As shown, **wcat** reads the file **main.c** and prints out its contents.
The "**./**" before the **wcat** above is a UNIX thing; it just tells the
system which directory to find **wcat** in (in this case, in the "." (dot)
directory, which means the current working directory).

To create the **my-cat** binary, you'll be creating a single source file,
**my-cat.c**, and writing a little C code to implement this simplified version
To create the **wcat** binary, you'll be creating a single source file,
**wcat.c**, and writing a little C code to implement this simplified version
of **cat**. To compile this program, you will do the following:

```
prompt> gcc -o my-cat my-cat.c -Wall -Werror
prompt> gcc -o wcat wcat.c -Wall -Werror
prompt>
```

This will make a single *executable binary* called **my-cat** which you can
This will make a single *executable binary* called **wcat** which you can
then run as above.

You'll need to learn how to use a few library routines from the C standard
library (often called **libc**) to implement the source code for this program,
which we'll assume is in a file called **my-cat.c**. All C code is
which we'll assume is in a file called **wcat.c**. All C code is
automatically linked with the C library, which is full of useful functions you
can call to implement your program. Learn more about the C library
[here](https://en.wikipedia.org/wiki/C_standard_library) and perhaps
@@ -152,24 +152,24 @@ file (thus indicating you no longer need to read from it).

**Details**

* Your program **my-cat** can be invoked with one or more files on the command
* Your program **wcat** can be invoked with one or more files on the command
line; it should just print out each file in turn.
* In all non-error cases, **my-cat** should exit with status code 0, usually by
* In all non-error cases, **wcat** should exit with status code 0, usually by
returning a 0 from **main()** (or by calling **exit(0)**).
* If *no files* are specified on the command line, **my-cat** should just exit
* If *no files* are specified on the command line, **wcat** should just exit
and return 0. Note that this is slightly different than the behavior of
normal UNIX **cat** (if you'd like to, figure out the difference).
* If the program tries to **fopen()** a file and fails, it should print the
exact message "my-cat: cannot open file" (followed by a newline) and exit
exact message "wcat: cannot open file" (followed by a newline) and exit
with status code 1. If multiple files are specified on the command line,
the files should be printed out in order until the end of the file list is
reached or an error opening a file is reached (at which point the error
message is printed and **wcat** exits).
message is printed and **my-cat** exits).

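The **wcat** details above can be sketched roughly as follows. This is only one possible shape, not the reference solution: the helper names **copy_stream()** and **wcat_files()** are hypothetical, the 4096-byte buffer size is an arbitrary assumption, and the output stream is passed in as a parameter purely so the logic is easy to exercise.

```c
#include <stdio.h>

/* Copy everything from `in` to `out` in fixed-size chunks.
 * The buffer size is an arbitrary choice for this sketch. */
static void copy_stream(FILE *in, FILE *out) {
    char buf[4096];
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        fwrite(buf, 1, n, out);
    }
}

/* Hypothetical helper: print each named file to `out`, in order.
 * Returns 0 on success (including when nfiles is 0), or 1 after
 * printing the error message for the first file that cannot be opened. */
int wcat_files(int nfiles, char *files[], FILE *out) {
    for (int i = 0; i < nfiles; i++) {
        FILE *fp = fopen(files[i], "r");
        if (fp == NULL) {
            fprintf(out, "wcat: cannot open file\n");
            return 1;
        }
        copy_stream(fp, out);
        fclose(fp);
    }
    return 0;
}
```

With a helper like this, **main()** could reduce to `return wcat_files(argc - 1, argv + 1, stdout);`, which covers the no-files case (status 0) without any special handling.
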
## my-grep
## wgrep

The second utility you will build is called **my-grep**, a variant of the UNIX
The second utility you will build is called **wgrep**, a variant of the UNIX
tool **grep**. This tool looks through a file, line by line, trying to find a
user-specified search term in the line. If a line has the word within it, the
line is printed out, otherwise it is not.
@@ -177,7 +177,7 @@ line is printed out, otherwise it is not.
Here is how a user would look for the term **foo** in the file **bar.txt**:

```
prompt> ./my-grep foo bar.txt
prompt> ./wgrep foo bar.txt
this line has foo in it
so does this foolish line; do you see where?
even this line, which has barfood in it, will be printed.
@@ -185,40 +185,40 @@ even this line, which has barfood in it, will be printed.

**Details**

* Your program **my-grep** is always passed a search term and zero or
* Your program **wgrep** is always passed a search term and zero or
more files to grep through (thus, more than one is possible). It should go
through each line and see if the search term is in it; if so, the line
should be printed, and if not, the line should be skipped.
* The matching is case sensitive. Thus, if searching for **foo**, lines
with **Foo** will *not* match.
* Lines can be arbitrarily long (that is, you may see many many characters
before you encounter a newline character, \\n). **my-grep** should work
before you encounter a newline character, \\n). **wgrep** should work
as expected even with very long lines. For this, you might want to look
into the **getline()** library call (instead of **fgets()**), or roll your
own.
* If **my-grep** is passed no command-line arguments, it should print
"my-grep: searchterm [file ...]" (followed by a newline) and exit with
* If **wgrep** is passed no command-line arguments, it should print
"wgrep: searchterm [file ...]" (followed by a newline) and exit with
status 1.
* If **my-grep** encounters a file that it cannot open, it should print
"my-grep: cannot open file" (followed by a newline) and exit with status 1.
* In all other cases, **my-grep** should exit with return code 0.
* If a search term, but no file, is specified, **my-grep** should work,
but instead of reading from a file, **my-grep** should read from
* If **wgrep** encounters a file that it cannot open, it should print
"wgrep: cannot open file" (followed by a newline) and exit with status 1.
* In all other cases, **wgrep** should exit with return code 0.
* If a search term, but no file, is specified, **wgrep** should work,
but instead of reading from a file, **wgrep** should read from
*standard input*. Doing so is easy, because the file stream **stdin**
is already open; you can use **fgets()** (or similar routines) to
read from it.
* For simplicity, if passed the empty string as a search string, **my-grep**
* For simplicity, if passed the empty string as a search string, **wgrep**
can either match NO lines or match ALL lines; both are acceptable.

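One way the **wgrep** details might fit together is sketched below, assuming a POSIX system where **getline()** is available. The helper names **wgrep_stream()** and **wgrep_run()** are hypothetical, and passing the output stream as a parameter is just a convenience for this sketch. Because **strstr()** finds the empty string in every line, this version happens to match ALL lines for an empty search term.

```c
#define _POSIX_C_SOURCE 200809L  /* asks for the getline() declaration */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: write every line of `in` that contains `term`
 * to `out`. getline() grows its buffer as needed, so very long lines
 * are handled without a fixed limit. */
void wgrep_stream(const char *term, FILE *in, FILE *out) {
    char *line = NULL;  /* allocated and resized by getline() */
    size_t cap = 0;
    while (getline(&line, &cap, in) != -1) {
        if (strstr(line, term) != NULL) {  /* case-sensitive substring match */
            fputs(line, out);
        }
    }
    free(line);
}

/* Hypothetical driver that takes main()'s arguments and returns the
 * exit status described in the details above. */
int wgrep_run(int argc, char *argv[], FILE *out) {
    if (argc < 2) {
        fprintf(out, "wgrep: searchterm [file ...]\n");
        return 1;
    }
    if (argc == 2) {  /* search term but no files: read standard input */
        wgrep_stream(argv[1], stdin, out);
        return 0;
    }
    for (int i = 2; i < argc; i++) {
        FILE *fp = fopen(argv[i], "r");
        if (fp == NULL) {
            fprintf(out, "wgrep: cannot open file\n");
            return 1;
        }
        wgrep_stream(argv[1], fp, out);
        fclose(fp);
    }
    return 0;
}
```

With this split, **main()** would only need `return wgrep_run(argc, argv, stdout);`.
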
## my-zip and my-unzip
## wzip and wunzip

The next tools you will build come in a pair, because one (**my-zip**) is a
file compression tool, and the other (**my-unzip**) is a file decompression
The next tools you will build come in a pair, because one (**wzip**) is a
file compression tool, and the other (**wunzip**) is a file decompression
tool.

The type of compression used here is a simple form of compression called
*run-length encoding* (*RLE*). RLE is quite simple: when you encounter **n**
characters of the same type in a row, the compression tool (**my-zip**) will
characters of the same type in a row, the compression tool (**wzip**) will
turn that into the number **n** and a single instance of the character.

Thus, if we had a file with the following contents:
@@ -237,49 +237,49 @@ character in ASCII. Thus, a compressed file will consist of some number of
length) and the single character.

To write out an integer in binary format (not ASCII), you should use
**fwrite()**. Read the man page for more details. For **my-zip**, all
**fwrite()**. Read the man page for more details. For **wzip**, all
output should be written to standard output (the **stdout** file stream,
which, as with **stdin**, is already open when the program starts running).

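To make the binary output concrete, here is a minimal sketch of writing runs with **fwrite()**. The names **write_run()** and **rle_encode()** are hypothetical, and using **uint32_t** for the 4-byte count (written in the machine's native byte order) is an assumption of this sketch rather than something the project text mandates. It encodes an in-memory buffer to keep the example short.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: emit one run as a 4-byte binary count followed
 * by the single character. */
void write_run(uint32_t count, char c, FILE *out) {
    fwrite(&count, sizeof count, 1, out);  /* 4 raw bytes, not ASCII digits */
    fwrite(&c, sizeof c, 1, out);          /* 1 byte */
}

/* Run-length encode `len` bytes of `data` onto `out`. */
void rle_encode(const char *data, size_t len, FILE *out) {
    size_t i = 0;
    while (i < len) {
        uint32_t count = 1;
        /* Extend the run while the next byte matches the current one. */
        while (i + count < len && data[i + count] == data[i]) {
            count++;
        }
        write_run(count, data[i], out);
        i += count;
    }
}
```

For instance, `rle_encode("aaabb", 5, stdout)` would emit two 5-byte records: the count 3 with `a`, then the count 2 with `b`.
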
Note that typical usage of the **my-zip** tool would thus use shell
Note that typical usage of the **wzip** tool would thus use shell
redirection in order to write the compressed output to a file. For example,
to compress the file **file.txt** into a (hopefully smaller) **file.z**,
you would type:

```
prompt> ./my-zip file.txt > file.z
prompt> ./wzip file.txt > file.z
```

The "greater than" sign is a UNIX shell redirection; in this case, it ensures
that the output from **my-zip** is written to the file **file.z** (instead of
that the output from **wzip** is written to the file **file.z** (instead of
being printed to the screen). You'll learn more about how this works a little
later in the course.

The **my-unzip** tool simply does the reverse of the **my-zip** tool, taking
The **wunzip** tool simply does the reverse of the **wzip** tool, taking
in a compressed file and writing (to standard output again) the uncompressed
results. For example, to see the contents of **file.txt**, you would type:

```
prompt> ./my-unzip file.z
prompt> ./wunzip file.z
```

**my-unzip** should read in the compressed file (likely using **fread()**)
**wunzip** should read in the compressed file (likely using **fread()**)
and print out the uncompressed output to standard output using **printf()**.

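The decompression loop might look something like the sketch below. The name **rle_decode()** is hypothetical, the **uint32_t** count is again an assumption about how the 4-byte integer is represented, and **fputc()** on a stream parameter stands in for **printf()** on standard output so that the function can be pointed at any stream.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: expand (4-byte count, character) records read
 * from `in`, writing the uncompressed bytes to `out`. */
void rle_decode(FILE *in, FILE *out) {
    uint32_t count;
    char c;
    /* Stop at end of file, or if the final record is truncated. */
    while (fread(&count, sizeof count, 1, in) == 1 &&
           fread(&c, sizeof c, 1, in) == 1) {
        for (uint32_t i = 0; i < count; i++) {
            fputc(c, out);
        }
    }
}
```

Calling `rle_decode(fp, stdout)` for each opened input file would give the behavior described above.
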
**Details**

* Correct invocation should pass one or more files via the command line to the
program; if no files are specified, the program should exit with return code
1 and print "my-zip: file1 [file2 ...]" (followed by a newline) or
"my-unzip: file1 [file2 ...]" (followed by a newline) for **my-zip** and
**my-unzip** respectively.
1 and print "wzip: file1 [file2 ...]" (followed by a newline) or
"wunzip: file1 [file2 ...]" (followed by a newline) for **wzip** and
**wunzip** respectively.
* The format of the compressed file must match the description above exactly
(a 4-byte integer followed by a character for each run).
* Do note that if multiple files are passed to **my-zip**, they are compressed
* Do note that if multiple files are passed to **wzip**, they are compressed
into a single compressed output, and when unzipped, will turn into a single
uncompressed stream of text (thus, the information that multiple files were
originally input into **my-zip** is lost). The same thing holds for
**my-unzip**.
originally input into **wzip** is lost). The same thing holds for
**wunzip**.

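One reading of the multiple-file rule is that the inputs behave like a single concatenated stream, so a run that ends one file may continue into the next. Under that assumption (which you should check against the supplied tests), the encoder has to carry its pending run across files. The sketch below shows the idea; **struct rle_state**, **rle_feed()**, and **rle_finish()** are hypothetical names.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical encoder state: the run currently being accumulated. */
struct rle_state {
    uint32_t count;  /* 0 means no run is pending */
    char c;
};

/* Write out the pending run, if any, as a 4-byte count plus character. */
static void flush_run(struct rle_state *st, FILE *out) {
    if (st->count > 0) {
        fwrite(&st->count, sizeof st->count, 1, out);
        fwrite(&st->c, sizeof st->c, 1, out);
        st->count = 0;
    }
}

/* Feed one input stream into the encoder. The pending run is NOT
 * flushed at end of file, so it can continue into the next input. */
void rle_feed(struct rle_state *st, FILE *in, FILE *out) {
    int ch;
    while ((ch = fgetc(in)) != EOF) {
        if (st->count > 0 && (char)ch == st->c) {
            st->count++;
        } else {
            flush_run(st, out);
            st->c = (char)ch;
            st->count = 1;
        }
    }
}

/* Call once after the last input to emit the final pending run. */
void rle_finish(struct rle_state *st, FILE *out) {
    flush_run(st, out);
}
```

A driver would start from `struct rle_state st = {0, 0};`, call **rle_feed()** once per file named on the command line, and call **rle_finish()** after the last one.
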
### Footnotes