First cut at file system checker project

2018-04-19 10:39:10 -05:00
parent 774531059b
commit 12906fe6d8
2 changed files with 167 additions and 0 deletions
--- a/filesystems-checker/CONTEST.md
+++ b/filesystems-checker/CONTEST.md
@@ -0,0 +1,42 @@
+## Contest: A Better Checker
+
+For this project, there is a contest, which will compare checkers that can
+handle these more challenging condition checks:
+
+- Each `..` entry in a directory refers to the proper parent inode, and the parent inode
+points back to it. If not, print `ERROR: parent directory mismatch.`
+
+- Every directory traces back to the root directory (i.e., there are no loops in the
+directory tree). If not, print `ERROR: inaccessible directory exists.`
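The reachability check above can be sketched as a walk up the `..` links. This is a minimal sketch, not the reference solution: it assumes a hypothetical `parent[]` table that you have already filled in from each directory's `..` entry while scanning the image, and the helper name `reaches_root` is made up for illustration. Only the root inode number (1) comes from the project description.

```c
#include <assert.h>
#include <stddef.h>

#define ROOTINO 1  /* the project states that the root inode number is 1 */

/*
 * Hypothetical helper. parent[i] holds the inode number named by the `..`
 * entry of directory i, or 0 if inode i is not a directory.
 * Returns 1 if following `..` from inum reaches the root, and 0 if the
 * walk leaves the set of directories or never terminates (a loop).
 */
int reaches_root(const unsigned short *parent, size_t ninodes, unsigned short inum)
{
    assert(parent != NULL);
    /* A walk longer than ninodes steps must be revisiting some inode. */
    for (size_t steps = 0; steps <= ninodes; steps++) {
        if (inum == ROOTINO)
            return 1;
        if (inum == 0 || inum >= ninodes)
            return 0;
        inum = parent[inum];
    }
    return 0;
}
```

Bounding the walk by the number of inodes avoids keeping a separate "visited" set; a checker that wants to report which directories form the loop would need to track more state.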
+
+This better checker will also have to do something new: actually repair the
+image, in one specific case. Specifically, your task will be to repair the
+"inode marked use but not found in a directory" error. 
+
+We will provide you with an xv6 image that has a number of in-use inodes that
+are not linked by any directory. Your job is to collect these inodes and put
+them into the `lost_found` directory (which is already in the provided image
+under the root directory). Real checkers do this in order to preserve files
+that may be useful but for some reason are not linked into a directory.
+
+To do so, you will need to obtain write access to the file system image in
+order to modify it. Your checker should perform this repair only when the
+`-r` flag is specified:
+
+```
+prompt> xcheck -r image_to_be_repaired
+```
+
+In this repair mode, your program should **not** exit when an error is
+encountered, but rather continue processing. For simplicity, you can also
+assume there are no other types of errors in the provided image. Your program
+should exit only after it has created an entry under the `lost_found`
+directory for every lost inode.
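One piece of the repair, creating the directory entry, might look like the sketch below. It is an illustration under assumptions: `DIRSIZ` and `struct dirent` are written out here to mirror what `include/fs.h` is expected to contain (check your copy), the helper `add_entry` is hypothetical, and the project text does not say what name each recovered entry should get.

```c
#include <assert.h>
#include <string.h>

#define DIRSIZ 14  /* assumed to match include/fs.h */

struct dirent {    /* assumed to match include/fs.h */
    unsigned short inum;
    char name[DIRSIZ];
};

/*
 * Hypothetical helper: store (inum, name) in the first unused slot
 * (inum == 0) of a directory's entry array.
 * Returns the slot index, or -1 if every slot is taken.
 */
int add_entry(struct dirent *entries, int nentries, unsigned short inum, const char *name)
{
    assert(entries != NULL && name != NULL);
    for (int i = 0; i < nentries; i++) {
        if (entries[i].inum != 0)
            continue;
        entries[i].inum = inum;
        /* strncpy zero-pads short names; a DIRSIZ-byte name has no terminator */
        strncpy(entries[i].name, name, DIRSIZ);
        return i;
    }
    return -1;
}
```

For the change to reach the image file, `entries` would have to point into a mapping created with `PROT_READ | PROT_WRITE` and `MAP_SHARED` on a descriptor opened `O_RDWR`. The sketch does not handle a `lost_found` directory with no free slot.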
+
+The contest will be judged first on whether all extra tests pass. Among the
+checkers that pass them, the most readable implementation wins.
+
+
+
+
+
--- a/filesystems-checker/README.md
+++ b/filesystems-checker/README.md
@@ -0,0 +1,125 @@
+
+# File System Checking
+
+In this assignment, you will be developing a working file system checker. A
+checker reads in a file system image and makes sure that it is
+consistent. When it isn't, a checker normally takes steps to repair the
+problems it sees; however, to keep this project a little simpler, you won't
+be doing any repairs.
+
+## Background
+
+Some basic background about file system consistency is found here:
+
+- [Crash Consistency: FSCK and Journaling](http://pages.cs.wisc.edu/~remzi/OSTEP/file-journaling.pdf)
+
+For those of you who are really interested, some research papers also make for
+a fun read:
+
+- [The Original FSCK Paper](https://docs.freebsd.org/44doc/smm/03.fsck/paper.pdf)
+- [Our Work on a SQL-Based Checker](https://www.usenix.org/legacy/event/osdi08/tech/full_papers/gunawi/gunawi.pdf)
+- [Our Work on a Faster File System Checker](http://research.cs.wisc.edu/wind/Publications/ffsck-fast13.pdf) and subsequent follow-up by McKusick on [the BSD Implementation](https://www.usenix.org/system/files/login/articles/05a_mckusick_020-023_online.pdf)
+
+
+## A Basic Checker
+
+For this project, you will use the xv6 file system image as the basic image
+that you will be reading and checking. The file `include/fs.h` includes the
+basic structures you need to understand, including the superblock, on-disk
+inode format (`struct dinode`), and directory entry format
+(`struct dirent`). The tool `tools/mkfs.c` will also be useful to look at, in order to
+see how an empty file-system image is created.
+
+Much of this project will be puzzling out the exact on-disk format xv6 uses
+for its simple file system, and then writing checks to see if various parts of
+that structure are consistent. Thus, reading through `mkfs.c` and the file
+system code itself will help you understand how xv6 uses the bits in the image
+to record persistent information.
+
+Your checker should read through the file system image and determine the
+consistency of a number of things, including the following. When a problem is
+detected, print the error message (shown below) to **standard error** and
+exit immediately with **exit code 1** (i.e., call `exit(1)`). 
+
+- Each inode is either unallocated or one of the valid types (`T_FILE`, `T_DIR`,
+`T_DEV`). If not, print `ERROR: bad inode.`
+
+- For in-use inodes, each address that is used by the inode is valid (points to a
+valid data block address within the image). If a direct block is used and is
+invalid, print `ERROR: bad direct address in inode.`; if the indirect block is
+in use and is invalid, print `ERROR: bad indirect address in inode.`
+
+- The root directory exists, its inode number is 1, and the parent of the root
+directory is itself. If not, print `ERROR: root directory does not exist.`
+
+- Each directory contains `.` and `..` entries, and the `.` entry points to the
+directory itself. If not, print `ERROR: directory not properly formatted.`
+
+- For in-use inodes, each address in use is also marked in use in the
+  bitmap. If not, print `ERROR: address used by inode but marked free in bitmap.`
+
+- Each block marked in use in the bitmap is actually in use in
+an inode or indirect block somewhere. If not, print `ERROR: bitmap marks block in use but it is not in use.`
+
+- For in-use inodes, each direct address in use is only used once. If not,
+  print `ERROR: direct address used more than once.`
+
+- For in-use inodes, each indirect address in use is only used once. If not,
+  print `ERROR: indirect address used more than once.`
+
+- For all inodes marked in use, each must be referred to in at least one directory. 
+  If not, print `ERROR: inode marked use but not found in a directory.`
+
+- Each inode number that is referred to in a valid directory is actually
+  marked in use. If not, print `ERROR: inode referred to in directory but marked free.`
+
+- Reference counts (number of links) for regular files match the number of times
+  the file is referred to in directories (i.e., hard links work correctly).
+  If not, print `ERROR: bad reference count for file.`
+
+- No extra links are allowed for directories (each directory appears in only one
+  other directory). If not, print `ERROR: directory appears more than once in file system.`
+
+
+## Other Specifications
+
+Your checker program, called `xcheck`, must be invoked exactly as follows:
+
+```
+prompt> xcheck file_system_image
+```
+
+The image file is a file that contains the file system image. If no image file
+is provided, you should print the usage error shown below:
+```
+prompt> xcheck 
+Usage: xcheck <file_system_image> 
+```
+The usage message must be printed to standard error, and the program must exit with an exit code of 1.
+
+If the file system image does not exist, you should print the error
+`image not found.` to standard error and exit with the error code of 1.
+
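+If the checker detects one of the problems listed above, it should print the
+corresponding error message to standard error and exit with exit code 1.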
+
+If the checker detects none of the problems listed above, it should exit with
+a return code of 0 and not print anything.
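The argument handling described in this section could be sketched as follows. The helper name `open_image` and its return convention are hypothetical; the two messages are copied from the text above, though the exact whitespace the tests expect should be checked against the tests themselves.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>

/*
 * Hypothetical helper: returns a read-only descriptor for the image, or -1
 * after printing the required message to standard error. The caller is
 * expected to call exit(1) when it gets -1.
 */
int open_image(int argc, char **argv)
{
    assert(argv != NULL);
    if (argc != 2) {
        fprintf(stderr, "Usage: xcheck <file_system_image>\n");
        return -1;
    }
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0)
        fprintf(stderr, "image not found.\n");
    return fd;
}
```

Keeping the `exit(1)` in the caller rather than in the helper makes the helper easy to exercise on its own.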
+
+## Hints
+
+It may be worth looking into using `mmap()` for the project. Like, seriously,
+use `mmap()` to access the file-system image, it will make your life so much
+better. 
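As a rough illustration of the `mmap()` hint, the sketch below maps a whole image read-only. The helper name `map_image` is hypothetical and error handling is deliberately minimal; it is one way to set this up, not a required structure.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: map the whole image read-only and return a pointer
 * to its first byte, or NULL on failure. *size receives the image length.
 */
void *map_image(const char *path, size_t *size)
{
    assert(path != NULL && size != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }
    void *img = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (img == MAP_FAILED)
        return NULL;
    *size = (size_t)st.st_size;
    return img;
}
```

Once the image is mapped, on-disk structures can be reached with pointer arithmetic: block `b` starts at `(char *)img + b * BSIZE`, where `BSIZE` is the block size defined in `include/fs.h`.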
+
+It should be very helpful to read Chapter 6 of the xv6 book
+[here](https://pdos.csail.mit.edu/6.828/2014/xv6/book-rev8.pdf). Note 
+that the version of xv6 we're using does not include the logging feature
+described in the book; you can safely ignore the parts that pertain to that.
+
+Make sure to look at `fs.img`, the file system image that the tool `mkfs`
+(found in the `tools/` directory of xv6) creates when you make xv6. This
+image is consistent. The
+tests, of course, will put inconsistencies into this image, but your tool
+should work over a consistent image as well. Study `mkfs` and its output to
+begin to make progress on this project.
+