Files
2024-02-19 00:25:23 -05:00

1007 lines
39 KiB
HTML

<H1>Inside the High Performance File System</H1>
<H2>Part 4: B-Trees, DIRBLKs, and DIRENTs</H2>
Written by Dan Bridges
<H2>Introduction</H2>
This article originally appeared in the July 1996 issue of Significant
Bits, the monthly magazine of the Brisbug PC User Group Inc.
<P>
The most basic structures in the HPFS are DIRBLKs, DIRENTs and FNODEs.
This month we examine DIRBLKs and DIRENTs, learn about the differences
between binary trees and B-trees and see how DIRBLKs are interconnected to
facilitate quick access in a large directory (one of HPFS' strengths). To
assist in this investigation, this month's program, ShowBtree.cmd, will
help you to visualise the layout of directory and file entries in a
partition.
<H2>Binary Trees</H2>
A linearly-linked list, such as used by FAT, is a very simple
structure. It's fine for storing a small list. Typically, there is no
sorting. If there are N records in the list, on average, N/2 entries
will need to be searched before the required record is found.
<P>
An ordered tree is a more efficient method of storing a large number
of records. Some terminology. All trees have nodes. Nodes have keys
(elements). Nodes can have 1 or more links (pointers) to other nodes
and different tree structures have different arrangement of links. The
source of the link is the parent node and the destination is the child
node. Assuming every key in an intermediate node has a child link,
there can be one more link from a parent node to a child than the
number of keys in the parent. Say a node has space for a maximum of 8
keys and currently holds 5 keys (entries). This node can currently
have up to 6 links to its children. The maximum number of descendent
links in a node is also known as a tree's order. In this example the
tree has an order of 9. At the top of the tree is the root
node. (Computer trees are arranged upside-down compared to trees in
nature.) At the bottom of the tree are leaf nodes which have no
descendent links. Leaf nodes contain a special sentinel value in their
pointers to indicate the end of this particular branch.
<P>
The simplest type of tree structure is a binary tree. Each node
contains a single key and up to two links. The links go to child nodes
that contain keys that are less than or greater than the key in their
parent. Figure 1a shows a perfectly balanced binary tree while Figure
1b shows a degenerate structure. A binary tree is perfectly balanced
when the maximum length from the root to any leaf node is the
same. The number of levels in 1a is 3, and the order is 2 because it
has a maximum of two descendent links. The maximum number of records a
binary tree can hold is (2^levels) - 1 and the maximum number of levels
required is Log2(N+1) rounded up. In our perfect example we have a
maximum number of records of (2^3) - 1 = 7, and 7 records require
Log2(7+1) = 3 levels.
<P>
<IMG WIDTH=399 HEIGHT=293 SRC="fig1_4.gif">
<P>
<FONT SIZE=2>
Figure 1: Examples of perfectly balanced (a) and degenerate (b) binary trees.
</FONT>
<P>
Searching for an entry in the perfectly balanced tree requires a
maximum of 3 node checks (the same as the number of levels). The
required entry may be located in an intermediate node or the root node
so sometimes the search will end more quickly.
<P>
The degenerate tree in Figure 1b is no better than a linked linear
list. The difference becomes more apparent when you have many
entries. For example, say you have 1,023 keys. The maximum search
length for a balanced binary tree would be Log2(1023+1) = 10 whereas
the maximum for a degenerate tree would be 1,023 with an average of
about 512.
<P>
As you add records to a binary tree it grows downwards with subsequent
entries going to the left or the right based on whether or not they
are then are less than or greater than the record in the current
node. By convention, greater than values go to the right. In effect
the entries are sorted due to the way the search is performed for a
suitable node for their storage.
<P>
The first record in a sequence plays a crucial role because it ends up
as the key in the root node. If it is a bad choice then a lopsided
tree will develop. The optimum choice for the initial record is the
median value for the sequence. In a series of 7 values the median
value has 3 values less than it and 3 values greater than it. The
basic binary tree is unsuitable for use in a file system where there
is no prior knowledge of a suitable median value.
<H2>B-Trees</H2>
In a way, B-trees are the reverse of a binary tree. Whereas a binary
tree grows downwards, as you will see shortly a B-tree grows
upwards. This allows suitable node median values, which will be
promoted to a higher level, to emerge naturally when storage space in
a lower-level node runs out. Also B-trees have a greater order than 2
(more than 2 descendent links per node) which means than you can have
a maximum of 2 or more keys per node. So, a B-tree is flatter and has
a faster fan-out than a binary tree with the same number of keys.
<P>
We'll deal with an increasing number series. Figure 2 shows a B-tree
with an order of 10 (maximum of 10 descendent links and 9 keys per
node) just before and just after budding. Note how the median value in
the sequence is promoted to become the sole entry in the root
node. Figure 3 shows how adding more records will eventually cause
another promotion to the root.
<P>
<IMG WIDTH=352 HEIGHT=162 SRC="fig2_4.gif">
<P>
<FONT SIZE=2>
Figure 2: Adding another record to a B-tree node which is already full
produces promotion and node splitting. This B-tree has an order of 10
(max. 9 keys per node).
</FONT>
<P>
<IMG WIDTH=362 HEIGHT=194 SRC="fig3_4.gif">
<P>
<FONT SIZE=2>
Figure 3: Adding more records eventually produces further promotion
and node splitting.
</FONT>
<P>
Before we get down to rules I want to present an example with 3
levels. Figure 4 shows the level 2 -> level 3 transition in a
order-of-5 structure.
<P>
<IMG WIDTH=372 HEIGHT=267 SRC="fig4_4.gif">
<P>
<FONT SIZE=2>
Figure 4: The transition in an order-of-5 B-tree from a 2-level to a 3-level
structure.
</FONT>
<P>
Here are the rules that a B-tree structure obeys:
<UL>
<LI>It has nodes that have a maximum of order links and order - 1 keys.
<LI>Except for the leaf nodes, a node has one more link then it has keys.
<LI>The keys are stored in sequential lists within a node.
<LI>Leaf nodes are all the same distance from the root.
<LI>The root can't be empty and unless it's the sole node (and thus
also a leaf node) it will have at least two links.
<LI>Intermediate and leaf nodes, but not the root, have at least
order/2 - 1 keys.
<LI>New keys are only added to the leaf nodes.
</UL>
Source: <U>C++ Components and Algorithms</U>, Scott Robert Ladd, M&T Books
San Mateo CA, 1992, p.431. Ladd lists 8 rules but I've combined two of
them.
<P>
You can see from Figure 2 and 3 and Rule 3 that a B-tree combines
sequential lists linked in a multi-descendent tree structure. This
means that, while it only takes a few disk reads to get to the node
containing the required value, the node itself must be searched in a
sequential fashion. However, the whole of a node will typically be
read into memory in one gulp so the actual node search doesn't take
long either. Also the sorted ordering of Rule 3 ensures that it is a
straightforward matter to determine whether or not an entry exists.
<H2>The HPFS Implementation of a B-tree</H2>
Theory is fine but it's now time for some real-world examples. In the
HPFS implementation the role of the node is played by DIRBLKs, while
DIRENTs correspond to keys within the node. A DIRBLK is made up of 4
sectors which are considered as one 2K data structure. Within this
block there is a 20-byte header followed by around 44 DIRENTs, each
describing the properties of a file/directory and its location on the
disk. File/directory names are variable length so the size of a DIRENT
varies as well, but its size will always be a multiple of 4 to ensure
32-bit alignment. The name size includes "." and the extension, if
present. A name without an extension won't include the dot.
<P>
The first DIRENT in any directory, including the root directory, is a
".." entry. The last entry in any DIRBLK is an end marker record. The
number of entries in the first DIRBLK in a directory (not counting the
".." and the end marker records) compared to name size is:
<PRE>
Name Length Max. DIRENTs
2-5 54
6-9 49
10-13 44
14-17 40
</PRE>
The order for a DIRBLK will be one more than the maximum number of
DIRENTs (keys). Other DIRBLKs for a directory may have an order of one
more again due to not having a ".." entry.
<P>
Figure 5 shows the layout of a DIRBLK and its DIRENTs. We will use
ShowBtree.cmd (to be described later) to investigate the arrangement
of DIRENTs and DIRBLKs. We will also need a method of creating
multiple test files. The method presented here assumes the files will
be put in an empty partition, otherwise you have to find them
first. If you have the flexibility make the partition 1 MB (the
smallest possible HPFS volume) to speed things up. If you have a
larger empty partition then this will also be suitable and I suggest
you use default reformating. With WARP, the default will be "quick"
reformating (described in Part 1). I prefer to use the complete
reformat method ("/L") which means that you can take a binary look at
the disk without seeing artifacts from its previous incarnation. This
take longer unless you have a very small partition. (Due to the small
partition size the whole operation was only taking about 4 secs.) In
this test situation it's not going to matter if the artifacts
exist. Note: the disposition of HPFS structures changes if you have a
very small partition with the directory band, which is normally near
the centre, being placed near the beginning.
<PRE>
Offset Data Size Comment
hex (dec) bytes
Header
00h (1) Signature 4 0x77E40AAE
04h (5) Off. 1st Free 4 Offset within DIRBLK to 1st free entry.
08h (9) Change 4 Low bit indicates whether this is
topmost DIRBLK in B-tree.
0Ch (13) Parent LSN 4 If topmost, LSN of this dir's FNODE.
Otherwise, LSN of parent DIRBLK.
10h (17) DIRBLK's LSN 4 Self-pointer.
DIRENT #0
14h (21) Entry's Size 2 Always a multiple of 4.
16h (23) Flags 1 0x01 (0) Special ".." entry
0x02 (1) Has an ACL
0x04 (2) Has a B-tree down-pointer
0x08 (3) Entry is a dummy end record
0x10 (4) Has an EA list
0x20 (5) Has an extended permission list
0x40 (6) Has an explicit ACL
0x80 (7) Has "needed" EAs
17h (24) Attributes 1 0x01 (0) Read-Only
0x02 (1) Hidden
0x04 (2) System
0x08 (3) Not used. Would be Vol attr.
0x10 (4) Directory
0x20 (5) Archive
0x40 (6) Long Name (bigger than 8.3)
0x80 (7) Reserved
18h (25) FNODE LSN 4 FNODE of this entry.
1Ch (29) Time Last Mod. 4 Secs since 00:00 1-1-70
20h (33) File Size 4
24h (37) Time Last Access 4 Secs since 00:00 1-1-70
28h (41) Time When Created 4 Secs since 00:00 1-1-70
2Ch (45) EA Size 4 Size of internal or external EAs.
30h (49) Flex 1 3 LSBits: # ACLs. 5 MSBits: reserved
31h (50) Code Page Index 1 7 LSBits: CP index. MSBit: DCBS present
32h (51) Name Size 1 Length of following name
33h (52) Name n Variable length: 0-254
33h+n ACL ? Access Control List info, if present.
Padding 0-3 bytes to reestablish dword align.
x Down Ptr LSN 4 If present, down ptr to next B-tree node
DIRENT #1
Start offset: DIRBLKHdrLen (20 bytes) + DIRENT #0 size
...
Last "END" DIRENT in DIRBLK contains information of limited validity
Entry's Size 2 32 (dec) or 36 if it has a Down Ptr.
Flags 1 Bit 3 (counting from 0) set.
...
Don't trust any reported FNODE LSN. End marker doesn't have one.
Down Ptr LSN 4 If present.
Note 1: Decimal offset is shown with 1-based numbering to suit REXX work.
Note 2: x = DIRBLKHdrLen + DIRENT size - 4. Always computed from DIRENT end.
</PRE>
<FONT SIZE=2>
Figure 5: Layout of a HPFS DIRBLK and its DIRENTs. DIRBLKs are 2K in size and the
maximum number of DIRENTs they can contain varies due to the variable length of a
DIRENT's name field (0-254 bytes).
</FONT>
<P>
Figure 6 show the test program, TEST.CMD. You would issue "TEST 10" if
you want to produce 10 sequential zero-length file DIRENTs in the
N:\directoryname1 directory. The filenames are all 241 characters
long. When you add on the 18 characters of "N:\directoryname1\" you
arrive at the 259 character full pathname limit. I've used very long
filenames to reduce the number of DIRENTs that will fit in a DIRBLK to
a minimum so that structure listings are more compact.
<PRE>
/* Wipes a test partition (here N:), creates 3 directories &
places the requested number of files with very long names
in one of the directories. THIS PROGRAM WILL AUTOMATICALLY
REFORMAT THE TEST PARTION SO BE SURE THAT'S WHAT YOU WANT.
*/
ARG numOfFiles .
'@echo off'
'setlocal' /* Preserve orig drv & dir. Exiting returns there */
'echo y | format n: /l /fs:hpfs' /* Requires no keystrokes */
'md n:\directoryname1'
'md n:\directoryname3'
'md n:\directoryname2' /*"Out of sequence" directory creation*/
'n:'
'cd\directoryname1'
DO x = 1 TO numOfFiles
CALL Stream Pad(x), 'c', 'open' /* Create file */
CALL Stream Pad(x), 'c', 'close'
END
/* Deduce the B-tree layout with ShowBtree.cmd and send its
output to a LIST-type program running in STDIN mode */
'd:\hpfs4\showbtree n: | list /s'
EXIT /***********************************************/
Pad: /* Generate a very long constant-length filename */
RETURN Copies('a',228)||'Testfile'||Right('0000'||x,5)
</PRE>
<FONT SIZE=2>
Figure 6: TEST.CMD produces a sequential file sequence. Ensure that you
have a comment starting in row 1, col 1.
</FONT>
<P>
Figure 7 shows the result of "TEST 7". Each section, separated by a
space, is a DIRBLK with its start-end LSNs listed. First look at the
letters down the left. "S" indicates a special ".." entry; "E" is a
dummy end record; "D" is a directory. The first two of these come from
the Flags byte while the third one is determined from the Attributes
byte.
<PRE>
Root Directory:
1016-1019 Next Byte Free: 233 Topmost DirBlk
This directory's FNODE: 1032 (\ [level 1]) 1016->1032
**************************************************
SD 21 #00: .. FNODE:1032
D 57 #01: directoryname1 FNODE:1033
D 105 #02: directoryname2 FNODE:1035
D 153 #03: directoryname3 FNODE:1034
E 201 #04:
36-39 Next Byte Free: 1993 Topmost DirBlk
This directory's FNODE: 1033 (directoryname1 [level 1]) 36->1033
**************************************************
SD 21 #00: .. FNODE:1033
57 #01: aaTestfile00001 FNODE:316
329 #02: aaTestfile00002 FNODE:317
601 #03: aaTestfile00003 FNODE:318
873 #04: aaTestfile00004 FNODE:319
1145 #05: aaTestfile00005 FNODE:320
1417 #06: aaTestfile00006 FNODE:321
1689 #07: aaTestfile00007 FNODE:322
E 1961 #08:
68-71 Next Byte Free: 89 Topmost DirBlk
This directory's FNODE: 1034 (directoryname3 [level 1]) 68->1034
**************************************************
SD 21 #00: .. FNODE:1034
E 57 #01:
100-103 Next Byte Free: 89 Topmost DirBlk
This directory's FNODE: 1035 (directoryname2 [level 1])
100->1035
**************************************************
SD 21 #00: .. FNODE:1035
E 57 #01:
</PRE>
<FONT SIZE=2>
Figure 7: Output from the ShowBtree.cmd program. Test partition only had
3 subdirectories and 7 very long filename zero-length files in one of the
subdirectories.
</FONT>
<P>
The numbers to the right are the decimal offsets (counting from 1) to
these DIRENTs. (One-based numbering assists you when you are tracing
in REXX.) The "Next Byte Free" value at the top is also one-based. If
you add 32 bytes to the starting offset of the dummy end record you
will also arrive at this value.
<P>
The program is designed to truncate DIRENT names to the last 15
characters. Showing the additional 226 "a"s at the beginning of each
of these entries would only clutter the display.
<P>
Look now at the directoryname entries and their associated FNODEs
(every file and directory has an FNODE). Notice how the FNODE LSN
order is 1033, 1035 and 1034 due to the order in which they were
created but the DIRENTs in the Root Directory DIRBLK are sorted. This
indicates that sorting is done by rearranging DIRENTs, not by moving
either the files or their FNODEs on the disk.
<P>
There are multiple "level 1" entries in this display because the main
B-tree currently contains three other B-tree structures arising from
the presence of three subdirectories. The LSN pointer display on the
far right shows the back pointer relationship between the DIRBLK and
its parent FNODE. Notice how HPFS spaces the root DIRBLK of each
B-tree (36-39, 68-71 and 100-103). This allows space for each B-tree
to grow in complexity without its descendent DIRBLKs (nodes) being
situated far apart.
<P>
Each of the mammoth test file DIRENTs requires 272 bytes so, looking
at the DIRBLK starting at LSN 36 in Figure 7, we can see that another
one won't fit it this DIRBLK. Figure 8 omits the directoryname DIRENT
and DIRBLKs and concentrates on the changes that occur when we issue
"TEST 8". (You may now wish to alter TEST.CMD so it won't create the
last two directories but leave directoryname1 and create the files in
it.)
<PRE>
36-39 Next Byte Free: 905
Parent DirBlk: 44 (directoryname1 [level 2]) 36->44->1033
**************************************************
SD 21 #00: .. FNODE:1033
57 #01: aaTestfile00001 FNODE:316
329 #02: aaTestfile00002 FNODE:317
601 #03: aaTestfile00003 FNODE:318
E 873 #04:
40-43 Next Byte Free: 1141
Parent DirBlk: 44 (directoryname1 [level 2]) 40->44->1033
**************************************************
21 #00: aaTestfile00005 FNODE:320
293 #01: aaTestfile00006 FNODE:321
565 #02: aaTestfile00007 FNODE:322
837 #03: aaTestfile00008 FNODE:323
E 1109 #04:
44-47 Next Byte Free: 333 Topmost DirBlk
This directory's FNODE: 1033 (directoryname1 [level 1]) 44->1033
**************************************************
P 21 #00: aaTestfile00004 FNODE:319 DownPtr:36
PE 297 #01: DownPtr:40
</PRE>
<FONT SIZE=2>
Figure 8: Promotion and node splitting occurs when we put 8 very long
filename entries in a directory. Note the down pointers in the DIRENTs
for File 4 and for the end marker in the topmost DIRBLK in directoryname1's
B-tree. (The other two directories are not shown.)
</FONT>
<P>
We now see the presence of down-pointer LSNs in the new root node
(don't confuse this with the root directory - it's just the top of
this particular B-tree). There is one root DIRENT so there has to be
two descendent links (Rule 2). The dummy end marker's down-pointer is
best considered as a "greater than" pointer, while all other
down-pointers lead to "less than" values. Due to the large DIRENT
size, it is not immediately obvious that a median value was
promoted. To see this more clearly, temporarily comment out the
current Pad() RETURN line in TEST.CMD and instead generate a
4-character file sequence with:
<PRE>
RETURN Right('000'||x,4)
</PRE>
Now issue "TEST 55" (the first budding point) and you will see that
the distribution is: 1-26 (+ ".."), 27, 28-55.
<P>
We'll now look at the transition from a 2-level to a 3-level
structure. Figure 9a shows the structure with 35 very long
filenames. Due to the sequential (non-random) nature of our file
series all the action has been occurring on the far right node of
level 2. Each time this node fills up, it splits, promotes the median
up to the root node and moves the top half of the overfull node's
DIRENTs into a new level 2 node which then becomes the new rightmost
DIRBLK. Since no files lower in the sequence occur afterwards we end
up with partially filled nodes. In practice, files in a directory will
be added in a more random fashion and DIRBLK occupancy will be
greater. This time, when another file is added, there will be no space
in the root for another promoted median. So it's time for the root to
split and send up a new root node. See Figure 9b.
<P>
<A HREF="fig9a.gif">
<IMG WIDTH=99 HEIGHT=32 SRC="fig9a_sm.gif"></A>
<P>
<FONT SIZE=2>
Figure 9a: Simplified B-tree diagram for <B>directoryname1</B>. This B-tree currently
holds 35 files. The "Next Byte Free" for the DIRBLK situated at LSN 72-75 is offset
1956. Each "F" entry is 272 bytes long. If another one is added to this DIRBLK, then
promotion will occur. FNODEs are not shown here, but it should not be forgotten that a
DIRBLK is not an abstract data structure, but a means of storing the FNODE locations of
files.
</FONT>
<P>
<A HREF="fig9b.gif">
<IMG WIDTH=98 HEIGHT=43 SRC="fig9b_sm.gif"></A>
<P>
<FONT SIZE=2>
Figure 9b: The change to a 3-level structure.
</FONT>
<P>
What about deletions? What causes demotions? Rule 6 states that any
DIRBLK, except the root, must have at least order/2 - 1 DIRENTs. Since
order is 8 in this example you will notice that no lower level node
has less than three entries. Delete ...aTestfile00010. Thanks to
CMD.EXE's more powerful wildcard capabilities compared to COMMAND.COM,
you can issue:
<PRE>
DEL N:\DIRECTORYNAME1\*00010
</PRE>
This level 3 node now contains two entries excluding the end
marker. (It's not a proper DIRENT but rather a means of holding a
"greater than" down-pointer.) You should check that no demotion has
occurred yet. HPFS appears to break Rule 6 in the interests of
reducing disk activity. Keep deleting files from this node and
checking. After you delete the final file in this node HPFS will act
because a leaf node can not be empty. Figure 10 shows what
happens. File 12 is demoted from level 2 to become the sole occupant
of the leaf. But this causes a problem: we need to have 4 descendent
links from the level 2 node, but now only have 2 entries. So File 13
is promoted to fix this.
<P>
<IMG WIDTH=346 HEIGHT=444 SRC="fig10_4.gif">
<P>
<FONT SIZE=2>
Figure 10: The deletion of files in the DIRBLK situated at LSN 48-51.
This is a subset of the structure shown in Figure 9b.<BR>
Figure 10a: At this stage F10 and F11 have been deleted.<BR>
Figure 10b: When the final leaf entry (F9) is deleted, some DIRENT
shuffling occurs.
</FONT>
<P>
This rule breaking is only possible in a leaf node because leaves have
no descendents. To check this out we'll remove an entry from a level 2
directory. Re-establish the 36 file structure of Figure 9b and then
delete File 8. You'll find that File 9 gets promoted from level 3 to
plug the gap.
<H2>The ShowBtree.cmd Program</H2>
There are two ways to attack the problem of displaying DIRENTs on a
volume. The standard method would be to start at the root directory
and, using recursion, follow each directory branch to its end taking,
say, all lower branchings to the left. When the end is reached you
then come back up until you find a branch to the right and then you go
down it, and so on until you've traversed the whole occupied
structure. This approach did not appeal to me for this application
because I wanted to keep DIRENTs together and physically show how both
the total volume's and individual DIRBLK's B-trees evolve. Instead I
used the method of linearly traversing the Directory Band looking for
used entries. In Part 3 you saw how Data Band bitmaps chart the
occupancy of sectors in 8 MB data bands. Well a similar method is used
to show which sectors in the Directory Band are in use. The difference
is that, instead of 1 bit mapping 1 sector, 1 bit maps 1 DIRBLK (4
sectors). The File System needs to know about DIRBLK occupancy when it
goes to create a new DIRBLK. It is the job of the DIRBLK header itself
to keep track of DIRENT usage within the DIRBLK. Node entries are keep
in a sorted order without gaps between the entries so a "Next Byte
Free" offset pointer suffices when it comes time to determine where
the next DIRENT goes.
<P>
In a typically situation, on a empty partition of reasonable size (>=
15 MB), the root directory FNODE will be located either right before
or right after the Directory Band. The root directory DIRBLK will be
outside the Directory Band and situated next to its FNODE. The design
presented here (see Figure 11) finds the root directory FNODE from the
dword at offset 13 (decimal, counting from 1) in the SuperBlock. It
then consults the physical LSN value (offset 73) in the sole extent
entry in this FNODE. (FNODE layouts will be covered in Part 5.) This
LSN points to the start of the topmost DIRBLK in this B-tree i.e. the
start of the root directory DIRBLK. The ShowDirBlk procedure is then
used to display the Btree-related information of this DIRBLK.
<PRE>
/* Shows B-tree structures. Note: only the Root DIRBLK and
DIRBLKs in the Main Directory Band are examined. */
'@cls'
SAY
ARG drive . /* Drive parameter missing */
IF drive = '' THEN CALL Help
IF WordPos(drive,'? /? /H HELP A: B:') \= 0 THEN CALL Help
CALL RxFuncAdd 'ReadSect','Sector','ReadSect'
CALL RxFuncAdd 'SysLoadFuncs', 'RexxUtil', 'SysLoadFuncs'
CALL SysLoadFuncs
/* Initialise Lookup Table for later bitwise ANDing */
DO exponent = 0 TO 7
bitValue.exponent = D2C(2**exponent)
END exponent
sectorString = ReadSect(drive, 16) /*SuperBlock is LSN 16*/
rootFnodeLSN = Bytes2Dec(13,4)
dirBandSecs = Bytes2Dec(49,4)
dirBandStart = Bytes2Dec(53,4)
dirBandEnd = Bytes2Dec(57,4)
dirBandBitmapLSN = Bytes2Dec(61,4)
numDirBlks = dirBandSecs%4 /*Each DirBlk comprises 4 secs*/
dirBlk = 0
element. = 0 /*Initialise contents of this compound var to 0*/
dirBlkHdrLen = 20
/* Read in the root FNODE */
sectorString = ReadSect(drive, rootFnodeLSN)
rootDirBlkStart = Bytes2Dec(73,4)
i = 1 /*The ShowDirBlk() func expects i to hold the count*/
element.0 = 1
element.i.dirBlkLSN = rootDirBlkStart
SAY 'Root Directory:'
CALL Charout 'STDOUT:',rootDirBlkStart"-"rootDirBlkStart+3" "
Call ShowDirBlk /* Present the contents of the root DIRBLK */
i = 0
element. = 0
bytes = numDirBlks%8 /*Byte maps usage of 8 DIRBLKs (32 secs)*/
IF numDirBlks//8 > 0 THEN bytes = bytes+1 /* Round up */
/* Read in Dir Band Bitmap */
sector = ReadSect(drive,dirBandBitmapLSN)
/* Reduce length to valid data */
dirBandBitmapBytes = Left(sector,bytes)
DO bitmapByte = 1 TO bytes /*Examine occupancy of Dir Band*/
char = Substr(dirBandBitmapBytes,bitmapByte,1)
DO bitPos = 0 TO 7
IF BitAnd(char, bitValue.bitPos) = '0'x THEN
DO /* DIRBLK is occupied */
tempLSN = dirBandStart + dirBlk*4
/* Root DIRBLK covered earlier */
IF tempLSN \= rootDirBlkStart THEN
DO
i = i + 1
element.0 = i /*Inc count of occupied DIRBLKs*/
element.i.dirBlkLSN=tempLSN/*Store DIRBLK start*/
END
END
/* Have we reached the Directory Band end? */
IF dirBlk+1 < numDirBlks THEN
dirBlk = dirBlk+1 /* Not yet */
ELSE
LEAVE bitmapByte /* Exit outer DO loop */
END bitPos
END bitmapByte
IF i = 0 THEN EXIT
/* Dir Band empty (all entries must have been in root DIRBLK) */
/* Process contents of the compound variable */
DO i = 1 TO element.0
span = element.i.dirBlkLSN'-'element.i.dirBlkLSN+3
CALL Charout 'STDOUT:',span' '
CALL ShowDirBlk /* Present the contents of a DIRBLK */
END i
EXIT /************************************************/
Bytes2Dec:
ARG startPos,numOfChars
RETURN C2D(Reverse(Substr(sectorString,startPos,numOfChars)))
Byte2Char:
ARG startPos
RETURN Substr(sectorString,startPos,1)
ShowDirBlk:
sectorString = ''
DO k = 0 TO 3 /* Read in the 4 secs of the DIRBLK */
sectorString = sectorString||ReadSect(drive,element.i.dirBlkLSN+k)
END k
/* Make the 0-based NextFreeByte 1-based
to match DIRENT offset numbering */
nextFreeByte = Bytes2Dec(5,4)+1
CALL Charout 'STDOUT:','Next Byte Free:' nextFreeByte' '
IF BitAnd(Byte2Char(9),bitValue.0) \= '0'x THEN
DO /*Bit0 of the Change byte indicates this block is topmost*/
SAY 'Topmost DirBlk'
CALL CharOut 'STDOUT:',"This directory's FNODE:"
END
ELSE
DO
SAY
CALL Charout 'STDOUT:', 'Parent DirBlk:'
END
CALL Charout 'STDOUT:',' 'Bytes2Dec(13,4)' '
level = 1
parent = ''
CALL Charout 'STDOUT:',' ('||FindTheRoot(Bytes2Dec(13,4))||') '
PARSE VALUE SysCurPos() WITH row . /* Get current cursor pos*/
CALL SysCurPos row,60 /* Set cursor to column 60 */
SAY parent /* parent was determined by FindTheRoot() */
SAY Copies("*",50)
filenameStr = ''
dirEntOffset = dirBlkHdrLen+1 /* Convert to 1-based */
dirEnt. = ''
DO n = 0 TO 999 /* This loop will be exited by LEAVE */
IF dirEntOffset=nextFreeByte THEN LEAVE /*Reached end*/
dirEntLen = Bytes2Dec(dirEntOffset,2)
nameLen = Bytes2Dec(dirEntOffset+30,1)
flags = Byte2Char(dirEntOffset+2,1)
IF BitAnd(flags,bitValue.0) \= '0'x THEN
/* Is a ".." entry */
dirEnt.n.special = dirEnt.n.special||"S"
IF BitAnd(flags,bitValue.2) \= '0'x THEN
DO /* Has a down-pointer */
dirEnt.n.special = dirEnt.n.special||"P"
dirEnt.n.downPtr=Bytes2Dec(dirEntOffset+dirEntLen-4,4)
END
IF BitAnd(flags,bitValue.3) \= '0'x THEN /*Dummy end rec*/
dirEnt.n.special = dirEnt.n.special||"E"
ELSE
DO
/* Only bother showing a set directory attribute
if this isn't a dummy record */
attrib = Byte2Char(dirEntOffset+3)
IF BitAnd(attrib,bitValue.4) \= '0'x THEN
/* Directory attribute set */
dirEnt.n.special = dirEnt.n.special||"D"
dirEnt.n.fnode = Bytes2Dec(dirEntOffset+4,4)
END
dirEnt.n.name = Substr(sectorString,dirEntOffset+31,nameLen)
/* Translate non-printable "parent" chars */
IF dirEnt.n.name = '101'x THEN
dirEnt.n.name = '..'
dirEnt.n.offset = dirEntOffset /* Store offset */
/* Adjust for next DIRENT */
dirEntOffset = dirEntOffset+dirEntLen
END n
DO m = 0 to n-1
IF Pos('E',dirEnt.m.special) = 0 THEN
/* Not a dummy end record */
CALL Charout 'STDOUT:',Left(dirEnt.m.special,3) Format(dirEnt.m.offset,5),
'#'||Right(0||m,2)':' Strip(Right(dirEnt.m.name,15)),
' FNODE:'||dirEnt.m.fnode
ELSE
CALL Charout 'STDOUT:',Left(dirEnt.m.special,3) Format(dirEnt.m.offset,5),
'#'||Right(0||m,2)':' Strip(Right(dirEnt.m.name,15))
IF Pos('P',dirEnt.m.special) > 0 THEN
/* This DIRENT has a down-pointer */
SAY " DownPtr:"dirEnt.m.downPtr
ELSE
SAY
END m
SAY
RETURN
FindTheRoot:
ARG fnodeLSN
secString = ReadSect(drive,fnodeLSN)
/* Is this a DIRBLK or is it an FNODE? */
IF Substr(secString,1,4) = 'AE0AE477'x THEN
DO /* Is a DIRBLK because of Signature dword*/
parentLSN = C2D(Reverse(Substr(secString,13,4)))
IF level = 1 THEN
parent = element.i.dirBlkLSN"->"||fnodeLSN"->"
ELSE
parent = parent||fnodeLSN"->"
level = level+1
CALL FindTheRoot parentLSN /*Recurse up the FNODE chain*/
END
ELSE
DO /* Is an FNODE */
nameLen = C2D(Substr(secString,13,1))
filename = Substr(secString,14,nameLen)
IF Length(filename) > 15 THEN
/* Cut back name to max of 15 chars */
filename = Left(filename,15)
parentLSN = C2D(Reverse(Substr(secString,29,4)))
/* Add a root dir indicator */
IF filename = '' THEN
filename = '\'
filename = filename "[level" level"]"
IF level = 1 THEN
parent = element.i.dirBlkLSN"->"||fnodeLSN
ELSE
parent = parent||fnodeLSN
END
RETURN filename
Help:
SAY 'ShowBtree displays the layout of any Btree structures'
SAY 'on a HPFS partition. Due to limitations in the design'
SAY 'it will not show Btree branches that are outside the'
SAY 'Directory Band, except for the Root Directory DIRBLK'
SAY 'itself.'
SAY
SAY 'Usage Example: ShowBtree C:'
EXIT
</PRE>
<FONT SIZE=2>
Figure 11: The ShowBtree.cmd REXX program.
</FONT>
<P>
Once the root directory DIRBLK has been handled, the Directory Band is
sequentially accessed using information from its usage bitmap to
determine where other DIRBLKs are situated.
<P>
One limitation of ShowBtree.cmd becomes apparent when we add a long
name subdirectories under an existing directory. Figure 12 shows a run
where we only have N:\directoryname1 and
N:\directoryname1\subdirname1234567890. As expected, the name in the
DIRENT in the DIRBLK at LSN 36 has been truncated to the last 15
characters. However the topmost name as determined by the
FindTheRoot() function and shown in the DIRBLK at LSN 68 has also been
truncated to show the first 15 characters. What gives? This name was
determined by recursion, but in an upward direction. When we reach the
top we end with the FNODE (LSN 35) rather than the DIRENT (LSN 36 +
offset 57 dec). Now the DIRENT contains the full name whereas the
FNODE only has space for the first 15 characters of it. (This extra
infomation comes in handy when undeleting or for disaster
recovery). This appears to be an unavoidable outcome of using upwards
recursion in HPFS B-trees.
<PRE>
Root Directory:
1016-1019 Next Byte Free: 137 Topmost DirBlk
This directory's FNODE: 1032 (\ [level 1]) 1016->1032
**************************************************
SD 21 #00: .. FNODE:1032
D 57 #01: directoryname1 FNODE:1033
E 105 #02:
36-39 Next Byte Free: 141 Topmost DirBlk
This directory's FNODE: 1033 (directoryname1 [level 1]) 36->1033
**************************************************
SD 21 #00: .. FNODE:1033
D 57 #01: rname1234567890 FNODE:35
E 109 #02:
68-71 Next Byte Free: 89 Topmost DirBlk
This directory's FNODE: 35 (subdirname12345 [level 1]) 68->35
**************************************************
SD 21 #00: .. FNODE:35
E 57 #01:
</PRE>
<FONT SIZE=2>
Figure 12: An example of the limitation of finding the B-tree directory name
through following up the FNODE chain. The actual directory name was
subdirname1234567890.
</FONT>
<P>
Notice how each directory FNODE is a root entry in a new B-tree structure.
<P>
There is another limitation that is much more significant. Figure 13
shows the result of issuing "TEST 8" when TEST.CMD has been altered to
create no subdirectories and to place all files in the root
directory. This may be difficult for longtime FAT users to comprehend
but the root DIRBLK has budded and the topmost DIRBLK (LSN 40-43) of
the root directory B-tree structure is now situated inside the
Directory Band (LSN 36-235) whereas, before budding, the topmost block
was situated at LSN 1016-1019, close to the root directory FNODE at
LSN 1032. With a bottom-up search approach there is no indication that
a branch of the tree is now situated outside the Directory Band so it
is not found by ShowBtree.cmd. Of course, a top-down search method
would find it by following the topmost DIRBLK's end marker
down-pointer, but that approach was not conducive to producing a
sequential layout display. So this limitation rules out using this
program for general-purpose partition work. One solution would have
been to check the root directory DIRBLK and recursively go down any
branches whose initial down-pointer is outside the Directory
Band. I've not done that here due to space reasons. However, the
program, as is, is still useful for the examination of most B-tree
changes. (See next paragraph.)
<PRE>
Root Directory:
40-43 Next Byte Free: 333 Topmost DirBlk
This directory's FNODE: 1032 (\ [level 1]) 40->1032
**************************************************
P 21 #00: aaTestfile00004 FNODE:319 DownPtr:1016
PE 297 #01: DownPtr:36
36-39 Next Byte Free: 1141
Parent DirBlk: 40 (\ [level 2]) 36->40->1032
**************************************************
21 #00: aaTestfile00005 FNODE:320
293 #01: aaTestfile00006 FNODE:321
565 #02: aaTestfile00007 FNODE:322
837 #03: aaTestfile00008 FNODE:323
E 1109 #04:
</PRE>
<FONT SIZE=2>
Figure 13: How a B-tree branch is missed if it lies in a DIRBLK outside
both the Directory Band and the root DIRBLK.
</FONT>
<P>
While every HPFS partition initially has a sole DIRBLK outside the
Directory Band, it is possible for extra DIRBLKs to occur outside this
band. You can look for this occurrence by running the freeware FST
(File System Tool) program in Check mode with the summary option
e.g. "FST -n check -s C:". Figure 13 shows that this situation has
occurred on my C: drive. (FST reports the total number of used DIRBLKs
and how many of these are outside the directory band. In Figure 14
I've subtracted these to show Inside/Outside.) This partition has
2,400 sectors in its directory band i.e. it has space for 600
DIRBLKs. The reason why C: has a large number of external DIRBLKs is
due to the relatively large number of subdirectories in relation to
its directory band size. Remember that HPFS leaves some space between
each directory DIRBLK so that entries within it are more likely to be
situated close by. These external DIRBLKs are splattered all over the
partition. For search speed reasons (say when using a WHEREIS-type
program) it is better to keep all the entries within the directory
band so that head seeking will be minimised.
<PRE>
C: 120 600 230/250 406/2612
D: 400 1995 466/1 318/2612
E: 200 995 158/1 131/1978
F: 100 495 138/1 89/2286
G: 1282 4000 1501/1 1108/14619
</PRE>
<FONT SIZE=2>
Figure 14: DIRBLKs are usually, but not always, situated inside the Directory
Band. Due to the large number of directories on C: in relation to its size, it
has the majority of its DIRBLKs placed outside the Directory Band.
</FONT>
<P>
As a general rule of thumb you can expect at least as many DIRBLKs as
there are directories. The presence of any more DIRBLKs will depend on
the number of files in a directory. On my system the extra DIRBLKs
figure is around 40%.
<H2>Conclusion</H2>
This month we've taken a major step in understanding the structure of
HPFS. Next month we take another one when we examine FNODEs. Every
file/directory has an FNODE which maps the extent usage of the data
object. We'll also look at ALSECs (Allocation Sectors) which use a
B+tree structure. These are an interesting variation of what we've
covered here.