wcat not my-cat
@@ -4,7 +4,7 @@
|
|||||||
In this project, you'll build a few different UNIX utilities, simple versions
|
In this project, you'll build a few different UNIX utilities, simple versions
|
||||||
of commonly used commands like **cat**, **ls**, etc. We'll call each of them a
|
of commonly used commands like **cat**, **ls**, etc. We'll call each of them a
|
||||||
slightly different name to avoid confusion; for example, instead of **cat**,
|
slightly different name to avoid confusion; for example, instead of **cat**,
|
||||||
you'll be implementing **my-cat**.
|
you'll be implementing **wcat** (i.e., "wisconsin" cat).
|
||||||
|
|
||||||
Objectives:
|
Objectives:
|
||||||
* Re-familiarize yourself with the C programming language
|
* Re-familiarize yourself with the C programming language
|
||||||
@@ -20,44 +20,44 @@ and of course a basic understanding of C programming. If you **do not** have
|
|||||||
these skills already, this is not the right place to start.
|
these skills already, this is not the right place to start.
|
||||||
|
|
||||||
Summary of what gets turned in:
|
Summary of what gets turned in:
|
||||||
* A bunch of single .c files for each of the utilities below: **my-cat.c**,
|
* A bunch of single .c files for each of the utilities below: **wcat.c**,
|
||||||
**my-grep.c**, **my-zip.c**, and **my-unzip.c**.
|
**wgrep.c**, **wzip.c**, and **wunzip.c**.
|
||||||
* Each should compile successfully when compiled with the **-Wall** and
|
* Each should compile successfully when compiled with the **-Wall** and
|
||||||
**-Werror** flags.
|
**-Werror** flags.
|
||||||
* Each should (hopefully) pass the tests we supply to you.
|
* Each should (hopefully) pass the tests we supply to you.
|
||||||
|
|
||||||
## my-cat
|
## wcat
|
||||||
|
|
||||||
The program **my-cat** is a simple program. Generally, it reads a file as
|
The program **wcat** is a simple program. Generally, it reads a file as
|
||||||
specified by the user and prints its contents. A typical usage is as follows,
|
specified by the user and prints its contents. A typical usage is as follows,
|
||||||
in which the user wants to see the contents of main.c, and thus types:
|
in which the user wants to see the contents of main.c, and thus types:
|
||||||
|
|
||||||
```
|
```
|
||||||
prompt> ./my-cat main.c
|
prompt> ./wcat main.c
|
||||||
#include <stdio.h>
|
#include <stdio.h>
|
||||||
...
|
...
|
||||||
```
|
```
|
||||||
|
|
||||||
As shown, **my-cat** reads the file **main.c** and prints out its contents.
|
As shown, **wcat** reads the file **main.c** and prints out its contents.
|
||||||
The "**./**" before the **my-cat** above is a UNIX thing; it just tells the
|
The "**./**" before the **wcat** above is a UNIX thing; it just tells the
|
||||||
system which directory to find **my-cat** in (in this case, in the "." (dot)
|
system which directory to find **wcat** in (in this case, in the "." (dot)
|
||||||
directory, which means the current working directory).
|
directory, which means the current working directory).
|
||||||
|
|
||||||
To create the **my-cat** binary, you'll be creating a single source file,
|
To create the **wcat** binary, you'll be creating a single source file,
|
||||||
**my-cat.c**, and writing a little C code to implement this simplified version
|
**wcat.c**, and writing a little C code to implement this simplified version
|
||||||
of **cat**. To compile this program, you will do the following:
|
of **cat**. To compile this program, you will do the following:
|
||||||
|
|
||||||
```
|
```
|
||||||
prompt> gcc -o my-cat my-cat.c -Wall -Werror
|
prompt> gcc -o wcat wcat.c -Wall -Werror
|
||||||
prompt>
|
prompt>
|
||||||
```
|
```
|
||||||
|
|
||||||
This will make a single *executable binary* called **my-cat** which you can
|
This will make a single *executable binary* called **wcat** which you can
|
||||||
then run as above.
|
then run as above.
|
||||||
|
|
||||||
You'll need to learn how to use a few library routines from the C standard
|
You'll need to learn how to use a few library routines from the C standard
|
||||||
library (often called **libc**) to implement the source code for this program,
|
library (often called **libc**) to implement the source code for this program,
|
||||||
which we'll assume is in a file called **my-cat.c**. All C code is
|
which we'll assume is in a file called **wcat.c**. All C code is
|
||||||
automatically linked with the C library, which is full of useful functions you
|
automatically linked with the C library, which is full of useful functions you
|
||||||
can call to implement your program. Learn more about the C library
|
can call to implement your program. Learn more about the C library
|
||||||
[here](https://en.wikipedia.org/wiki/C_standard_library) and perhaps
|
[here](https://en.wikipedia.org/wiki/C_standard_library) and perhaps
|
||||||
@@ -152,24 +152,24 @@ file (thus indicating you no longer need to read from it).
|
|||||||
|
|
||||||
**Details**
|
**Details**
|
||||||
|
|
||||||
* Your program **my-cat** can be invoked with one or more files on the command
|
* Your program **wcat** can be invoked with one or more files on the command
|
||||||
line; it should just print out each file in turn.
|
line; it should just print out each file in turn.
|
||||||
* In all non-error cases, **my-cat** should exit with status code 0, usually by
|
* In all non-error cases, **wcat** should exit with status code 0, usually by
|
||||||
returning a 0 from **main()** (or by calling **exit(0)**).
|
returning a 0 from **main()** (or by calling **exit(0)**).
|
||||||
* If *no files* are specified on the command line, **my-cat** should just exit
|
* If *no files* are specified on the command line, **wcat** should just exit
|
||||||
and return 0. Note that this is slightly different than the behavior of
|
and return 0. Note that this is slightly different than the behavior of
|
||||||
normal UNIX **cat** (if you'd like to, figure out the difference).
|
normal UNIX **cat** (if you'd like to, figure out the difference).
|
||||||
* If the program tries to **fopen()** a file and fails, it should print the
|
* If the program tries to **fopen()** a file and fails, it should print the
|
||||||
exact message "my-cat: cannot open file" (followed by a newline) and exit
|
exact message "wcat: cannot open file" (followed by a newline) and exit
|
||||||
with status code 1. If multiple files are specified on the command line,
|
with status code 1. If multiple files are specified on the command line,
|
||||||
the files should be printed out in order until the end of the file list is
|
the files should be printed out in order until the end of the file list is
|
||||||
reached or an error opening a file is reached (at which point the error
|
reached or an error opening a file is reached (at which point the error
|
||||||
message is printed and **my-cat** exits).
|
message is printed and **wcat** exits).
|
||||||
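The **wcat** details above could be sketched roughly as follows. This is a minimal illustration rather than a reference solution: the helper names `print_file` and `wcat_files` and the 4096-byte buffer are assumptions made for this example; only **fopen()**, **fgets()**, **printf()**, **fclose()**, the error message, and the exit codes come from the text.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not named in the project text): copies one file to
 * stdout. Returns 0 on success, or 1 after printing the required message
 * when the file cannot be opened. */
int print_file(const char *path) {
    assert(path != NULL);
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        printf("wcat: cannot open file\n");
        return 1;
    }
    char buffer[4096];  /* buffer size is an arbitrary choice */
    while (fgets(buffer, sizeof(buffer), fp) != NULL) {
        printf("%s", buffer);
    }
    fclose(fp);
    return 0;
}

/* Hypothetical driver: prints each file in turn and stops at the first
 * failure. With no files the loop never runs, so the result is 0. */
int wcat_files(int count, char *paths[]) {
    for (int i = 0; i < count; i++) {
        if (print_file(paths[i]) != 0) {
            return 1;
        }
    }
    return 0;
}
```

Under these assumptions, a `main(int argc, char *argv[])` could simply `return wcat_files(argc - 1, argv + 1);`.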
|
|
||||||
|
|
||||||
## my-grep
|
## wgrep
|
||||||
|
|
||||||
The second utility you will build is called **my-grep**, a variant of the UNIX
|
The second utility you will build is called **wgrep**, a variant of the UNIX
|
||||||
tool **grep**. This tool looks through a file, line by line, trying to find a
|
tool **grep**. This tool looks through a file, line by line, trying to find a
|
||||||
user-specified search term in the line. If a line has the word within it, the
|
user-specified search term in the line. If a line has the word within it, the
|
||||||
line is printed out, otherwise it is not.
|
line is printed out, otherwise it is not.
|
||||||
@@ -177,7 +177,7 @@ line is printed out, otherwise it is not.
|
|||||||
Here is how a user would look for the term **foo** in the file **bar.txt**:
|
Here is how a user would look for the term **foo** in the file **bar.txt**:
|
||||||
|
|
||||||
```
|
```
|
||||||
prompt> ./my-grep foo bar.txt
|
prompt> ./wgrep foo bar.txt
|
||||||
this line has foo in it
|
this line has foo in it
|
||||||
so does this foolish line; do you see where?
|
so does this foolish line; do you see where?
|
||||||
even this line, which has barfood in it, will be printed.
|
even this line, which has barfood in it, will be printed.
|
||||||
@@ -185,40 +185,40 @@ even this line, which has barfood in it, will be printed.
|
|||||||
|
|
||||||
**Details**
|
**Details**
|
||||||
|
|
||||||
* Your program **my-grep** is always passed a search term and zero or
|
* Your program **wgrep** is always passed a search term and zero or
|
||||||
more files to grep through (thus, more than one is possible). It should go
|
more files to grep through (thus, more than one is possible). It should go
|
||||||
through each line and see if the search term is in it; if so, the line
|
through each line and see if the search term is in it; if so, the line
|
||||||
should be printed, and if not, the line should be skipped.
|
should be printed, and if not, the line should be skipped.
|
||||||
* The matching is case sensitive. Thus, if searching for **foo**, lines
|
* The matching is case sensitive. Thus, if searching for **foo**, lines
|
||||||
with **Foo** will *not* match.
|
with **Foo** will *not* match.
|
||||||
* Lines can be arbitrarily long (that is, you may see many many characters
|
* Lines can be arbitrarily long (that is, you may see many many characters
|
||||||
before you encounter a newline character, \\n). **my-grep** should work
|
before you encounter a newline character, \\n). **wgrep** should work
|
||||||
as expected even with very long lines. For this, you might want to look
|
as expected even with very long lines. For this, you might want to look
|
||||||
into the **getline()** library call (instead of **fgets()**), or roll your
|
into the **getline()** library call (instead of **fgets()**), or roll your
|
||||||
own.
|
own.
|
||||||
* If **my-grep** is passed no command-line arguments, it should print
|
* If **wgrep** is passed no command-line arguments, it should print
|
||||||
"my-grep: searchterm [file ...]" (followed by a newline) and exit with
|
"wgrep: searchterm [file ...]" (followed by a newline) and exit with
|
||||||
status 1.
|
status 1.
|
||||||
* If **my-grep** encounters a file that it cannot open, it should print
|
* If **wgrep** encounters a file that it cannot open, it should print
|
||||||
"my-grep: cannot open file" (followed by a newline) and exit with status 1.
|
"wgrep: cannot open file" (followed by a newline) and exit with status 1.
|
||||||
* In all other cases, **my-grep** should exit with return code 0.
|
* In all other cases, **wgrep** should exit with return code 0.
|
||||||
* If a search term, but no file, is specified, **my-grep** should work,
|
* If a search term, but no file, is specified, **wgrep** should work,
|
||||||
but instead of reading from a file, **my-grep** should read from
|
but instead of reading from a file, **wgrep** should read from
|
||||||
*standard input*. Doing so is easy, because the file stream **stdin**
|
*standard input*. Doing so is easy, because the file stream **stdin**
|
||||||
is already open; you can use **fgets()** (or similar routines) to
|
is already open; you can use **fgets()** (or similar routines) to
|
||||||
read from it.
|
read from it.
|
||||||
* For simplicity, if passed the empty string as a search string, **my-grep**
|
* For simplicity, if passed the empty string as a search string, **wgrep**
|
||||||
can either match NO lines or match ALL lines, both are acceptable.
|
can either match NO lines or match ALL lines, both are acceptable.
|
||||||
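The line-matching loop described in these details might look something like the sketch below. It assumes a POSIX system (for **getline()**) and uses **strstr()** for the case-sensitive substring test; the helper name `grep_stream` and its match-count return value are inventions for this example, not part of the project specification.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes getline() on POSIX systems */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: prints every line of `in` that contains `term`
 * and returns how many lines matched. */
long grep_stream(const char *term, FILE *in) {
    assert(term != NULL && in != NULL);
    char *line = NULL;    /* getline() allocates and grows this buffer */
    size_t capacity = 0;
    long matches = 0;
    while (getline(&line, &capacity, in) != -1) {
        if (strstr(line, term) != NULL) {  /* case-sensitive substring test */
            printf("%s", line);
            matches++;
        }
    }
    free(line);
    return matches;
}
```

Because the helper takes a `FILE *`, the same code can serve both cases in the list: pass the result of **fopen()** for each named file, or pass **stdin** when only a search term is given. With **strstr()**, an empty search term matches every line, which is one of the two behaviors the text allows.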
|
|
||||||
## my-zip and my-unzip
|
## wzip and wunzip
|
||||||
|
|
||||||
The next tools you will build come in a pair, because one (**my-zip**) is a
|
The next tools you will build come in a pair, because one (**wzip**) is a
|
||||||
file compression tool, and the other (**my-unzip**) is a file decompression
|
file compression tool, and the other (**wunzip**) is a file decompression
|
||||||
tool.
|
tool.
|
||||||
|
|
||||||
The type of compression used here is a simple form of compression called
|
The type of compression used here is a simple form of compression called
|
||||||
*run-length encoding* (*RLE*). RLE is quite simple: when you encounter **n**
|
*run-length encoding* (*RLE*). RLE is quite simple: when you encounter **n**
|
||||||
characters of the same type in a row, the compression tool (**my-zip**) will
|
characters of the same type in a row, the compression tool (**wzip**) will
|
||||||
turn that into the number **n** and a single instance of the character.
|
turn that into the number **n** and a single instance of the character.
|
||||||
|
|
||||||
Thus, if we had a file with the following contents:
|
Thus, if we had a file with the following contents:
|
||||||
@@ -237,49 +237,49 @@ character in ASCII. Thus, a compressed file will consist of some number of
|
|||||||
length) and the single character.
|
length) and the single character.
|
||||||
|
|
||||||
To write out an integer in binary format (not ASCII), you should use
|
To write out an integer in binary format (not ASCII), you should use
|
||||||
**fwrite()**. Read the man page for more details. For **my-zip**, all
|
**fwrite()**. Read the man page for more details. For **wzip**, all
|
||||||
output should be written to standard output (the **stdout** file stream,
|
output should be written to standard output (the **stdout** file stream,
|
||||||
which, as with **stdin**, is already open when the program starts running).
|
which, as with **stdin**, is already open when the program starts running).
|
||||||
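As a rough illustration of writing a run in binary with **fwrite()**, consider the sketch below. The helper name `write_run` is hypothetical, and the code assumes `int` is 4 bytes, which is true on common platforms but not guaranteed by the C standard.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes one run as a binary count followed by the
 * character. Assumes sizeof(int) == 4 so the output matches the 4-byte
 * format. Returns 0 on success, -1 on a short write. */
int write_run(FILE *out, int count, char c) {
    assert(out != NULL && count > 0);
    if (fwrite(&count, sizeof(count), 1, out) != 1) {
        return -1;
    }
    if (fwrite(&c, sizeof(c), 1, out) != 1) {
        return -1;
    }
    return 0;
}
```

Calling `write_run(stdout, 10, 'a')` would emit five bytes: the integer 10 in the machine's native byte order, then the character `a`.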
|
|
||||||
Note that typical usage of the **my-zip** tool would thus use shell
|
Note that typical usage of the **wzip** tool would thus use shell
|
||||||
redirection in order to write the compressed output to a file. For example,
|
redirection in order to write the compressed output to a file. For example,
|
||||||
to compress the file **file.txt** into a (hopefully smaller) **file.z**,
|
to compress the file **file.txt** into a (hopefully smaller) **file.z**,
|
||||||
you would type:
|
you would type:
|
||||||
|
|
||||||
```
|
```
|
||||||
prompt> ./my-zip file.txt > file.z
|
prompt> ./wzip file.txt > file.z
|
||||||
```
|
```
|
||||||
|
|
||||||
The "greater than" sign is a UNIX shell redirection; in this case, it ensures
|
The "greater than" sign is a UNIX shell redirection; in this case, it ensures
|
||||||
that the output from **my-zip** is written to the file **file.z** (instead of
|
that the output from **wzip** is written to the file **file.z** (instead of
|
||||||
being printed to the screen). You'll learn more about how this works a little
|
being printed to the screen). You'll learn more about how this works a little
|
||||||
later in the course.
|
later in the course.
|
||||||
|
|
||||||
The **my-unzip** tool simply does the reverse of the **my-zip** tool, taking
|
The **wunzip** tool simply does the reverse of the **wzip** tool, taking
|
||||||
in a compressed file and writing (to standard output again) the uncompressed
|
in a compressed file and writing (to standard output again) the uncompressed
|
||||||
results. For example, to see the contents of **file.txt**, you would type:
|
results. For example, to see the contents of **file.txt**, you would type:
|
||||||
|
|
||||||
```
|
```
|
||||||
prompt> ./my-unzip file.z
|
prompt> ./wunzip file.z
|
||||||
```
|
```
|
||||||
|
|
||||||
**my-unzip** should read in the compressed file (likely using **fread()**)
|
**wunzip** should read in the compressed file (likely using **fread()**)
|
||||||
and print out the uncompressed output to standard output using **printf()**.
|
and print out the uncompressed output to standard output using **printf()**.
|
||||||
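The decompression loop could be sketched along these lines. Note the deliberate differences from the text: this version uses **fputc()** on a caller-supplied stream instead of **printf()**, so it can be exercised without touching standard output; passing **stdout** gives the behavior described above. The helper name `expand_runs` and its return value are assumptions for this example, as is the 4-byte `int`.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads (count, character) pairs from `in` and writes
 * the expanded text to `out`. Returns the number of characters written.
 * Assumes the count was stored as a native 4-byte int. */
long expand_runs(FILE *in, FILE *out) {
    assert(in != NULL && out != NULL);
    int count;
    long written = 0;
    while (fread(&count, sizeof(count), 1, in) == 1) {
        int c = fgetc(in);
        if (c == EOF || count < 0) {
            break;  /* malformed input: missing character or bad count */
        }
        for (int i = 0; i < count; i++) {
            fputc(c, out);
        }
        written += count;
    }
    return written;
}
```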
|
|
||||||
**Details**
|
**Details**
|
||||||
|
|
||||||
* Correct invocation should pass one or more files via the command line to the
|
* Correct invocation should pass one or more files via the command line to the
|
||||||
program; if no files are specified, the program should exit with return code
|
program; if no files are specified, the program should exit with return code
|
||||||
1 and print "my-zip: file1 [file2 ...]" (followed by a newline) or
|
1 and print "wzip: file1 [file2 ...]" (followed by a newline) or
|
||||||
"my-unzip: file1 [file2 ...]" (followed by a newline) for **my-zip** and
|
"wunzip: file1 [file2 ...]" (followed by a newline) for **wzip** and
|
||||||
**my-unzip** respectively.
|
**wunzip** respectively.
|
||||||
* The format of the compressed file must match the description above exactly
|
* The format of the compressed file must match the description above exactly
|
||||||
(a 4-byte integer followed by a character for each run).
|
(a 4-byte integer followed by a character for each run).
|
||||||
* Do note that if multiple files are passed to **my-zip*, they are compressed
|
* Do note that if multiple files are passed to **wzip**, they are compressed
|
||||||
into a single compressed output, and when unzipped, will turn into a single
|
into a single compressed output, and when unzipped, will turn into a single
|
||||||
uncompressed stream of text (thus, the information that multiple files were
|
uncompressed stream of text (thus, the information that multiple files were
|
||||||
originally input into **my-zip** is lost). The same thing holds for
|
originally input into **wzip** is lost). The same thing holds for
|
||||||
**my-unzip**.
|
**wunzip**.
|
||||||
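One way to read the multiple-file rule is that a run may continue from the end of one input file into the start of the next, since the output is a single compressed stream. That interpretation is an assumption here, and the sketch below illustrates it by carrying the pending run between files. The names `rle_state`, `rle_feed`, and `rle_flush` are hypothetical, and `int` is again assumed to be 4 bytes.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical state carried between input files so that a run can span
 * a file boundary. */
struct rle_state {
    int current;  /* character of the pending run, or EOF if none */
    int count;    /* length of the pending run */
};

/* Writes the pending run, if any, as a binary count plus one character.
 * Returns 0 on success, -1 on a short write. */
int rle_flush(struct rle_state *st, FILE *out) {
    assert(st != NULL && out != NULL);
    if (st->current != EOF) {
        char c = (char)st->current;
        if (fwrite(&st->count, sizeof(st->count), 1, out) != 1 ||
            fwrite(&c, sizeof(c), 1, out) != 1) {
            return -1;
        }
        st->current = EOF;
        st->count = 0;
    }
    return 0;
}

/* Feeds one input stream into the encoder. The final run is left pending
 * so the next file can extend it. Returns 0 on success, -1 on error. */
int rle_feed(struct rle_state *st, FILE *in, FILE *out) {
    assert(st != NULL && in != NULL && out != NULL);
    int c;
    while ((c = fgetc(in)) != EOF) {
        if (c == st->current) {
            st->count++;
        } else {
            if (rle_flush(st, out) != 0) {
                return -1;
            }
            st->current = c;
            st->count = 1;
        }
    }
    return 0;
}
```

A caller would initialize the state as `{ EOF, 0 }`, call `rle_feed()` once per input file, and call `rle_flush()` once after the last file to emit the final run.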
|
|
||||||
|
|
||||||
### Footnotes
|
### Footnotes
|
||||||
|
|||||||