148 lines
7.3 KiB
HTML
148 lines
7.3 KiB
HTML
<html><head>
|
|
<title>HPFS: Summary</title>
|
|
</head>
|
|
|
|
<body>
|
|
<center>
|
|
<h1>Summary</h1>
|
|
</center>
|
|
|
|
The HPFS solves all of the historical problems of the FAT file system. it
|
|
achieves excellent throughput even in extreme casses--many very small files
|
|
or a few very large files--by means of advanced data structures and techniques
|
|
such as intelligent caching read-ahead and write-behind. Disk space is used
|
|
economically because it is managed on a sector basis. Existing application
|
|
programs will need modification to take advantage of the HPF'S's support for
|
|
extended attributes and long filenames but these changes will not be difficult.
|
|
All application programs will benefit from the HPFS's improved performance and
|
|
decreased CPU use whether they are modified or not. This article is based on a
|
|
prerelease version of the HPFS that was still undergoing modification and
|
|
tuning. Therefore the final release of the HPFS may differ in some details
|
|
from the description given here.
|
|
<p>
|
|
Most programmers are at at least passingly famiIiar with the data structure
|
|
known as a binary tree. Binary trees are a technique for imposing a logical
|
|
ordering on a collection of data items by means of pointers, without regard
|
|
to the physical order of the data. In a simple binary tree, each node contains
|
|
some data, including a key value that determines the node's logical position
|
|
in the tree, as well as pointers to the node's left and right sub trees. The
|
|
node that begins the tree is known as the root: the nodes that sit at the
|
|
ends of the tree's branches are sometimes called the leaves. To find a
|
|
particular piece of data, the binary tree is traversed from the root. At each
|
|
node, the desired key is compared with the node's key: if they don't match,
|
|
one branch of the node's sub tree or another is selected based on whether the
|
|
desired key is less than or greater than the node's key. This process
|
|
continues until a match is found or an empty sub tree is encountered
|
|
(see <a href="#figa">Figure A</a>).
|
|
Such simple binary trees, although easy to understand and implement, have
|
|
disadvantages in practice. If keys are not well distributed or are added to
|
|
the tree in a non-random fashion, the tree can become quite asymmetric,
|
|
leading to wide variations in tree traversal times. In order to make access
|
|
times uniform, many programmers prefer a particular type of balanced tree
|
|
known as a B- Tree. For the purposes of this discussion, the important
|
|
points about a B-Tree are that data is stored in all nodes, more than one
|
|
data item might be stored in a node, and all of the branches of the tree
|
|
are of identical length (see <a href="#figb">Figure B</a>).
|
|
The worst-case behavior of a B-Tree is predictable and much better than that
|
|
of a simple binary tree, but the maintenance of a B-Tree is correspondingly
|
|
more complex. Adding a new data item, changing a key value, or deleting a
|
|
data item may result in the splitting or merging of a node, which in turm
|
|
forces a cascade of other operations on the tree to rebalance it. A B+ Tree
|
|
is a specialized form of B-Tree that has two types of nodes: internal, which
|
|
only point to other nodes, and external, which contain the actual data
|
|
(see <a href="#figc">Figure C</a>).
|
|
The advantage of a B+ Tree over a B- Tree is that the internal nodes of the
|
|
B+Tree can hold many more decision values than the intenmediate-level nodes
|
|
of a B-Tree, so the fan out of the tree is faster and the average length of
|
|
a branch is shorter. This makes up for the fact that yell must always follow
|
|
a B+ Tree branch to its end to get the data for which you are looking, whereas
|
|
in a B-Tree you may discover the data at an interme-diate code or even at the
|
|
root.
|
|
|
|
<table><tr>
|
|
<th> </th><th align=left>FAT File System </th><th align=left>High Performance File System</th>
|
|
</tr><tr>
|
|
<td>Maximum filename length </td><td>11(in 8.3 format) </td><td>254</td>
|
|
</tr><tr>
|
|
<td>Number of dot (.) delimeters allowed </td><td>One </td><td>Multiple</td>
|
|
</tr><tr>
|
|
<td>File Attributes </td><td>Bit flags </td><td>Bit flags plus up to 64Kb of free-form ASCll of binary information</td>
|
|
</tr><tr>
|
|
<td>Maximium Path Length </td><td>64 </td><td>260</td>
|
|
</tr><tr>
|
|
<td>Miniumum disk space overhead per file </td><td>Directory entry (32 bytes) </td><td>Directory entry (length varies) + Fnode (512 bytes)</td>
|
|
</tr><tr>
|
|
<td>Average wasted space per file </td><td>1/2 cluster (typically 2Kb or more) </td><td>1/2 sector (256 bytes)</td>
|
|
</tr><tr>
|
|
<td>Minimum alocation unit </td><td>Cluster (typically 4Kb or more) </td><td>Sector (512 bytes)</td>
|
|
</tr><tr>
|
|
<td>Allocation info for files </td><td>Centralized in FAT on home track </td><td>Located nearby each file in its Fnode</td>
|
|
</tr><tr>
|
|
<td>Free disk space info </td><td>Centralized in FAT on home track </td><td>Located near free space in bitmaps</td>
|
|
</tr><tr>
|
|
<td>Free disk space described per byte </td><td>2Kb ( 1/2 cluster at 8 sectors /clustor)
|
|
</td><td>4Kb (8 sectors)</td>
|
|
</tr><tr>
|
|
<td>Directory structure </td><td>Unsorted linear list, must be searched exhaustivily
|
|
</td><td>Sorted B-Tree</td>
|
|
</tr><tr>
|
|
<td>Directory Location </td><td>Root directory on home track, others scattered
|
|
</td><td>Localized near seek center of volume</td>
|
|
</tr><tr>
|
|
<td>Cache replacement strategy </td><td>Simple LRU
|
|
</td><td>Modified LRU, sensitive to data type and usage history</td>
|
|
</tr><tr>
|
|
<td>Read ahead </td><td>None in MS-DOS 3.3 or earlier, primitive read-ahead optional in MS-DOS 4
|
|
</td><td>Always present, sensitive to data type and usage history</td>
|
|
</tr><tr>
|
|
<td>Write behind </td><td>Not available </td><td>Used by default, but can be defeated on per-handle basis</td>
|
|
</tr></table>
|
|
<p>
|
|
|
|
|
|
<center>
|
|
<a href="figa.gif" name="figa">
|
|
<img src="figa.gif" alt="[Fig. A]" border=0></a>
|
|
</center>
|
|
<p>
|
|
<b>FIGURE A</b>:
|
|
To find a piece of data, the binary tree is traversed from the root untill
|
|
the data is found or an empty subtree is encountered.
|
|
<p>
|
|
|
|
|
|
<center>
|
|
<a href="figb.gif" name="figb">
|
|
<img src="figb.gif" alt="[Fig. B]" border=0></a>
|
|
</center>
|
|
<p>
|
|
<b>FIGURE B</b>:
|
|
In a balanced B-Tree, data is stored in nodes, more than one data item can
|
|
be stored in a node, and all branches of the tree are the same length.
|
|
<p>
|
|
|
|
|
|
<center>
|
|
<a href="figc.gif" name="figc">
|
|
<img src="figc.gif" alt="[Fig. C]" border=0></a>
|
|
</center>
|
|
<p>
|
|
<b>FIGURE C</b>:
|
|
A B+ Tree has internal nodes that point to other nodes and external nodes
|
|
that contain actual data.
|
|
|
|
|
|
<p>
|
|
<hr>
|
|
|
|
< <a href="app_hpfs.html">[Application Programs and HPFS]</a> |
|
|
<a href="hpfs.html">[HPFS Home]</a>
|
|
|
|
<hr>
|
|
|
|
<font size=-1>
|
|
Html'ed by <a href="http://www.seds.org/~spider/">Hartmut Frommert</a>
|
|
</font>
|
|
|
|
</body></html>
|