1029 lines
63 KiB
HTML
1029 lines
63 KiB
HTML
<!DOCTYPE HTML SYSTEM "html.dtd">
|
||
|
||
<HTML>
|
||
<HEAD><TITLE>Secure Deletion of Data from Magnetic and Solid-State Memory</TITLE></HEAD>
|
||
|
||
<CENTER>
|
||
<H1>Secure Deletion of Data from Magnetic and Solid-State Memory</H1>
|
||
<BR>
|
||
Peter Gutmann<BR>
|
||
<I>Department of Computer Science</I><BR>
|
||
<I>University of Auckland</I><BR>
|
||
<A HREF="mailto:pgut001@cs.auckland.ac.nz">pgut001@cs.auckland.ac.nz</A><P>
|
||
This paper was first published in the Sixth USENIX Security Symposium
|
||
Proceedings, San Jose, California, July 22-25, 1996<P>
|
||
<I>I am currently working on an update to this paper which is about twice as
|
||
long as this one, goes into a lot more detail (including background
|
||
information and explanations which were too long for this paper), and covers
|
||
new technology which has appeared since 1996, especially in terms of
|
||
recovering data from memory. I expect to have it available in the second
|
||
quarter of 1999.</I>
|
||
</CENTER>
|
||
|
||
<H2>Abstract</H2>
|
||
|
||
With the use of increasingly sophisticated encryption systems, an attacker
|
||
wishing to gain access to sensitive data is forced to look elsewhere for
|
||
information. One avenue of attack is the recovery of supposedly erased data
|
||
from magnetic media or random-access memory. This paper covers some of the
|
||
methods available to recover erased data and presents schemes to make this
|
||
recovery significantly more difficult.<P>
|
||
|
||
<H2>1. Introduction</H2>
|
||
|
||
Much research has gone into the design of highly secure encryption systems
|
||
intended to protect sensitive information. However work on methods of securing
|
||
(or at least safely deleting) the original plaintext form of the encrypted data
|
||
against sophisticated new analysis techniques seems difficult to find. In the
|
||
1980's some work was done on the recovery of erased data from magnetic media
|
||
[<A HREF="#1">1</A>] [<A HREF="#2">2</A>] [<A HREF="#3">3</A>], but to date the
|
||
main source of information is government standards covering the destruction of
|
||
data. There are two main problems with these official guidelines for
|
||
sanitizing media. The first is that they are often somewhat old and may
|
||
predate newer techniques for both recording data on the media and for
|
||
recovering the recorded data. For example most of the current guidelines on
|
||
sanitizing magnetic media predate the early-90's jump in recording densities,
|
||
the adoption of sophisticated channel coding techniques such as PRML, the use
|
||
of magnetic force microscopy for the analysis of magnetic media, and recent
|
||
studies of certain properties of magnetic media recording such as the behaviour
|
||
of erase bands. The second problem with official data destruction standards is
|
||
that the information in them may be partially inaccurate in an attempt to fool
|
||
opposing intelligence agencies (which is probably why a great many guidelines
|
||
on sanitizing media are classified). By deliberately under-stating the
|
||
requirements for media sanitization in publicly-available guides, intelligence
|
||
agencies can preserve their information-gathering capabilities while at the
|
||
same time protecting their own data using classified techniques.<P>
|
||
|
||
This paper represents an attempt to analyse the problems inherent in trying to
|
||
erase data from magnetic disk media and random-access memory without access to
|
||
specialised equipment, and suggests methods for ensuring that the recovery of
|
||
data from these media can be made as difficult as possible for an attacker.<P>
|
||
|
||
<H2>2. Methods of Recovery for Data stored on Magnetic Media</H2>
|
||
|
||
Magnetic force microscopy (MFM) is a recent technique for imaging magnetization
|
||
patterns with high resolution and minimal sample preparation. The technique is
|
||
derived from scanning probe microscopy (SPM) and uses a sharp magnetic tip
|
||
attached to a flexible cantilever placed close to the surface to be analysed,
|
||
where it interacts with the stray field emanating from the sample. An image of
|
||
the field at the surface is formed by moving the tip across the surface and
|
||
measuring the force (or force gradient) as a function of position. The
|
||
strength of the interaction is measured by monitoring the position of the
|
||
cantilever using an optical interferometer or tunnelling sensor.<P>
|
||
|
||
Magnetic force scanning tunneling microscopy (STM) is a more recent variant of
|
||
this technique which uses a probe tip typically made by plating pure nickel
|
||
onto a prepatterned surface, peeling the resulting thin film from the substrate
|
||
it was plated onto and plating it with a thin layer of gold to minimise
|
||
corrosion, and mounting it in a probe where it is placed at some small bias
|
||
potential (typically a few tenths of a nanoamp at a few volts DC) so that
|
||
electrons from the surface under test can tunnel across the gap to the probe
|
||
tip (or vice versa). The probe is scanned across the surface to be analysed as
|
||
a feedback system continuously adjusts the vertical position to maintain a
|
||
constant current. The image is then generated in the same way as for MFM [<A
|
||
HREF="#4">4</A>] [<A HREF="#5">5</A>]. Other techniques which have been used in
|
||
the past to analyse magnetic media are the use of ferrofluid in combination
|
||
with optical microscopes (which, with gigabit/square inch recording density is
|
||
no longer feasible as the magnetic features are smaller than the wavelength of
|
||
visible light) and a number of exotic techniques which require significant
|
||
sample preparation and expensive equipment. In comparison, MFM can be
|
||
performed through the protective overcoat applied to magnetic media, requires
|
||
little or no sample preparation, and can produce results in a very short
|
||
time.<P>
|
||
|
||
Even for a relatively inexperienced user the time to start getting images of
|
||
the data on a drive platter is about 5 minutes. To start getting useful images
|
||
of a particular track requires more than a passing knowledge of disk formats,
|
||
but these are well-documented, and once the correct location on the platter is
|
||
found a single image would take approximately 2-10 minutes depending on the
|
||
skill of the operator and the resolution required. With one of the more
|
||
expensive MFM's it is possible to automate a collection sequence and
|
||
theoretically possible to collect an image of the entire disk by changing the
|
||
MFM controller software.<P>
|
||
|
||
There are, from manufacturers sales figures, several thousand SPM's in use in
|
||
the field today, some of which have special features for analysing disk drive
|
||
platters, such as the vacuum chucks for standard disk drive platters along with
|
||
specialised modes of operation for magnetic media analysis. These SPM's can be
|
||
used with sophisticated programmable controllers and analysis software to allow
|
||
automation of the data recovery process. If commercially-available SPM's are
|
||
considered too expensive, it is possible to build a reasonably capable SPM for
|
||
about US$1400, using a PC as a controller [<A HREF="#6">6</A>].<P>
|
||
|
||
Faced with techniques such as MFM, truly deleting data from magnetic media is
|
||
very difficult. The problem lies in the fact that when data is written to the
|
||
medium, the write head sets the polarity of most, but not all, of the magnetic
|
||
domains. This is partially due to the inability of the writing device to write
|
||
in exactly the same location each time, and partially due to the variations in
|
||
media sensitivity and field strength over time and among devices.<P>
|
||
|
||
In conventional terms, when a one is written to disk the media records a one,
|
||
and when a zero is written the media records a zero. However the actual effect
|
||
is closer to obtaining a 0.95 when a zero is overwritten with a one, and a 1.05
|
||
when a one is overwritten with a one. Normal disk circuitry is set up so that
|
||
both these values are read as ones, but using specialised circuitry it is
|
||
possible to work out what previous "layers" contained. The recovery of at least
|
||
one or two layers of overwritten data isn't too hard to perform by reading the
|
||
signal from the analog head electronics with a high-quality digital sampling
|
||
oscilloscope, downloading the sampled waveform to a PC, and analysing it in
|
||
software to recover the previously recorded signal. What the software does is
|
||
generate an "ideal" read signal and subtract it from what was actually read,
|
||
leaving as the difference the remnant of the previous signal. Since the analog
|
||
circuitry in a commercial hard drive is nowhere near the quality of the
|
||
circuitry in the oscilloscope used to sample the signal, the ability exists to
|
||
recover a lot of extra information which isn't exploited by the hard drive
|
||
electronics (although with newer channel coding techniques such as PRML
|
||
(explained further on) which require extensive amounts of signal processing,
|
||
the use of simple tools such as an oscilloscope to directly recover the data is
|
||
no longer possible).<P>
|
||
|
||
Using MFM, we can go even further than this. During normal readback, a
|
||
conventional head averages the signal over the track, and any remnant
|
||
magnetization at the track edges simply contributes a small percentage of noise
|
||
to the total signal. The sampling region is too broad to distinctly detect the
|
||
remnant magnetization at the track edges, so that the overwritten data which is
|
||
still present beside the new data cannot be recovered without the use of
|
||
specialised techniques such as MFM or STM (in fact one of the "official" uses
|
||
of MFM or STM is to evaluate the effectiveness of disk drive servo-positioning
|
||
mechanisms) [<A HREF="#7">7</A>]. Most drives are capable of microstepping the
|
||
heads for internal diagnostic and error recovery purposes (typical error
|
||
recovery strategies consist of rereading tracks with slightly changed data
|
||
threshold and window offsets and varying the head positioning by a few percent
|
||
to either side of the track), but writing to the media while the head is
|
||
off-track in order to erase the remnant signal carries too much risk of making
|
||
neighbouring tracks unreadable to be useful (for this reason the microstepping
|
||
capability is made very difficult to access by external means).<P>
|
||
|
||
These specialised techniques also allow data to be recovered from magnetic
|
||
media long after the read/write head of the drive is incapable of reading
|
||
anything useful. For example one experiment in AC erasure involved driving the
|
||
write head with a 40 MHz square wave with an initial current of 12 mA which was
|
||
dropped in 2 mA steps to a final level of 2 mA in successive passes, an order
|
||
of magnitude more than the usual write current which ranges from high microamps
|
||
to low milliamps. Any remnant bit patterns left by this erasing process were
|
||
far too faint to be detected by the read head, but could still be observed
|
||
using MFM [<A HREF="#8">8</A>].<P>
|
||
|
||
Even with a DC erasure process, traces of the previously recorded signal may
|
||
persist until the applied DC field is several times the media coercivity [<A
|
||
HREF="#9">9</A>].<P>
|
||
|
||
Deviations in the position of the drive head from the original track may leave
|
||
significant portions of the previous data along the track edge relatively
|
||
untouched. Newly written data, present as wide alternating light and dark bands
|
||
in MFM and STM images, are often superimposed over previously recorded data
|
||
which persists at the track edges. Regions where the old and new data coincide
|
||
create continuous magnetization between the two. However, if the new
|
||
transition is out of phase with the previous one, a few microns of erase band
|
||
with no definite magnetization are created at the juncture of the old and new
|
||
tracks. The write field in the erase band is above the coercivity of the media
|
||
and would change the magnetization in these areas, but its magnitude is not
|
||
high enough to create new well- defined transitions. One experiment involved
|
||
writing a fixed pattern of all 1's with a bit interval of 2.5 µm, moving the
|
||
write head off-track by approximately half a track width, and then writing the
|
||
pattern again with a frequency slightly higher than that of the previously
|
||
recorded track for a bit interval of 2.45 µm to create all possible phase
|
||
differences between the transitions in the old and new tracks. Using a 4.2 µm
|
||
wide head produced an erase band of approximately 1 µm in width when the old
|
||
and new tracks were 180° out of phase, dropping to almost nothing when the two
|
||
tracks were in-phase. Writing data at a higher frequency with the original
|
||
tracks bit interval at 0.5 µm and the new tracks bit interval at 0.49 µm allows
|
||
a single MFM image to contain all possible phase differences, showing a
|
||
dramatic increase in the width of the erase band as the two tracks move from
|
||
in-phase to 180° out of phase [<A HREF="#10">10</A>].<P>
|
||
|
||
In addition, the new track width can exhibit modulation which depends on the
|
||
phase relationship between the old and new patterns, allowing the previous data
|
||
to be recovered even if the old data patterns themselves are no longer
|
||
distinct. The overwrite performance also depends on the position of the write
|
||
head relative to the originally written track. If the head is directly aligned
|
||
with the track, overwrite performance is relatively good; as the head moves
|
||
offtrack, the performance drops markedly as the remnant components of the
|
||
original data are read back along with the newly-written signal. This effect is
|
||
less noticeable as the write frequency increases due to the greater attenuation
|
||
of the field with distance [<A HREF="#11">11</A>].<P>
|
||
|
||
When all the above factors are combined it turns out that each track contains
|
||
an image of everything ever written to it, but that the contribution from each
|
||
"layer" gets progressively smaller the further back it was made. Intelligence
|
||
organisations have a lot of expertise in recovering these palimpsestuous
|
||
images.<P>
|
||
|
||
<H2>3. Erasure of Data stored on Magnetic Media</H2>
|
||
|
||
The general concept behind an overwriting scheme is to flip each magnetic
|
||
domain on the disk back and forth as much as possible (this is the basic idea
|
||
behind degaussing) without writing the same pattern twice in a row. If the
|
||
data was encoded directly, we could simply choose the desired overwrite pattern
|
||
of ones and zeroes and write it repeatedly. However, disks generally use some
|
||
form of run-length limited (RLL) encoding, so that the adjacent ones won't be
|
||
written. This encoding is used to ensure that transitions aren't placed too
|
||
closely together, or too far apart, which would mean the drive would lose track
|
||
of where it was in the data.<P>
|
||
|
||
To erase magnetic media, we need to overwrite it many times with alternating
|
||
patterns in order to expose it to a magnetic field oscillating fast enough that
|
||
it does the desired flipping of the magnetic domains in a reasonable amount of
|
||
time. Unfortunately, there is a complication in that we need to saturate the
|
||
disk surface to the greatest depth possible, and very high frequency signals
|
||
only "scratch the surface" of the magnetic medium. Disk drive manufacturers,
|
||
in trying to achieve ever-higher densities, use the highest possible
|
||
frequencies, whereas we really require the lowest frequency a disk drive can
|
||
produce. Even this is still rather high. The best we can do is to use the
|
||
lowest frequency possible for overwrites, to penetrate as deeply as possible
|
||
into the recording medium.<P>
|
||
|
||
The write frequency also determines how effectively previous data can be
|
||
overwritten due to the dependence of the field needed to cause magnetic
|
||
switching on the length of time the field is applied. Tests on a number of
|
||
typical disk drive heads have shown a difference of up to 20 dB in overwrite
|
||
performance when data recorded at 40 kFCI (flux changes per inch), typical of
|
||
recent disk drives, is overwritten with a signal varying from 0 to 100 kFCI.
|
||
The best average performance for the various heads appears to be with an
|
||
overwrite signal of around 10 kFCI, with the worst performance being at 100
|
||
kFCI [<A HREF="#12">12</A>]. The track write width is also affected by the
|
||
write frequency - as the frequency increases, the write width decreases for
|
||
both MR and TFI heads. In [<A HREF="#13">13</A>] there was a decrease in write
|
||
width of around 20% as the write frequency was increased from 1 to 40 kFCI,
|
||
with the decrease being most marked at the high end of the frequency range.
|
||
However, the decrease in write width is balanced by a corresponding increase in
|
||
the two side- erase bands so that the sum of the two remains nearly constant
|
||
with frequency and equal to the DC erase width for the head. The media
|
||
coercivity also affects the width of the write and erase bands, with their
|
||
width dropping as the coercivity increases (this is one of the explanations for
|
||
the ever-increasing coercivity of newer, higher-density drives).<P>
|
||
|
||
To try to write the lowest possible frequency we must determine what decoded
|
||
data to write to produce a low-frequency encoded signal.<P>
|
||
|
||
In order to understand the theory behind the choice of data patterns to write,
|
||
it is necessary to take a brief look at the recording methods used in disk
|
||
drives. The main limit on recording density is that as the bit density is
|
||
increased, the peaks in the analog signal recorded on the media are read at a
|
||
rate which may cause them to appear to overlap, creating intersymbol
|
||
interference which leads to data errors. Traditional peak detector read
|
||
channels try to reduce the possibility of intersymbol interference by coding
|
||
data in such a way that the analog signal peaks are separated as far as
|
||
possible. The read circuitry can then accurately detect the peaks (actually
|
||
the head itself only detects transitions in magnetisation, so the simplest
|
||
recording code uses a transition to encode a 1 and the absence of a transition
|
||
to encode a 0. The transition causes a positive/negative peak in the head
|
||
output voltage (thus the name "peak detector read channel"). To recover the
|
||
data, we differentiate the output and look for the zero crossings). Since a
|
||
long string of 0's will make clocking difficult, we need to set a limit on the
|
||
maximum consecutive number of 0's. The separation of peaks is implemented as
|
||
some form of run-length-limited, or RLL, coding.<P>
|
||
|
||
The RLL encoding used in most current drives is described by pairs of
|
||
run-length limits (<I>d, k</I>), where <I>d</I> is the minimum number of 0
|
||
symbols which must occur between each 1 symbol in the encoded data, and
|
||
<I>k</I> is the maximum. The parameters (<I>d, k</I>) are chosen to place
|
||
adjacent 1's far enough apart to avoid problems with intersymbol interference,
|
||
but not so far apart that we lose synchronisation.<P>
|
||
|
||
The grandfather of all RLL codes was FM, which wrote one user data bit followed
|
||
by one clock bit, so that a 1 bit was encoded as two transitions (1 wavelength)
|
||
while a 0 bit was encoded as one transition (<28> wavelength). A different
|
||
approach was taken in modified FM (MFM), which suppresses the clock bit except
|
||
between adjacent 0's (the ambiguity in the use of the term MFM is unfortunate.
|
||
From here on it will be used to refer to modified FM rather than magnetic force
|
||
microscopy). Taking three example sequences 0000, 1111, and 1010, these will be
|
||
encoded as 0(1)0(1)0(1)0, 1(0)1(0)1(0)1, and 1(0)0(0)1(0)0 (where the ()s are
|
||
the clock bits inserted by the encoding process). The maximum time between 1
|
||
bits is now three 0 bits (so that the peaks are no more than four encoded time
|
||
periods apart), and there is always at least one 0 bit (so that the peaks in
|
||
the analog signal are at least two encoded time periods apart), resulting in a
|
||
(1,3) RLL code. (1,3) RLL/MFM is the oldest code still in general use today,
|
||
but is only really used in floppy drives which need to remain
|
||
backwards-compatible.<P>
|
||
|
||
These constraints help avoid intersymbol interference, but the need to separate
|
||
the peaks reduces the recording density and therefore the amount of data which
|
||
can be stored on a disk. To increase the recording density, MFM was gradually
|
||
replaced by (2,7) RLL (the original "RLL" format), and that in turn by (1,7)
|
||
RLL, each of which placed less constraints on the recorded signal.<P>
|
||
|
||
Using our knowledge of how the data is encoded, we can now choose which decoded
|
||
data patterns to write in order to obtain the desired encoded signal. The
|
||
three encoding methods described above cover the vast majority of magnetic disk
|
||
drives. However, each of these has several possible variants. With MFM, only
|
||
one is used with any frequency, but the newest (1,7) RLL code has at least half
|
||
a dozen variants in use. For MFM with at most four bit times between
|
||
transitions, the lowest write frequency possible is attained by writing the
|
||
repeating decoded data patterns 1010 and 0101. These have a 1 bit every other
|
||
"data" bit, and the intervening "clock" bits are all 0. We would also like
|
||
patterns with every other clock bit set to 1 and all others set to 0, but these
|
||
are not possible in the MFM encoding (such "violations" are used to generate
|
||
special marks on the disk to identify sector boundaries). The best we can do
|
||
here is three bit times between transitions, which is generated by repeating
|
||
the decoded patterns 100100, 010010 and 001001. We should use several passes
|
||
with these patterns, as MFM drives are the oldest, lowest-density drives around
|
||
(this is especially true for the very-low-density floppy drives). As such,
|
||
they are the easiest to recover data from with modern equipment and we need to
|
||
take the most care with them.<P>
|
||
|
||
From MFM we jump to the next simplest case, which is (1,7) RLL. Although there
|
||
can be as many as 8 bit times between transitions, the lowest sustained
|
||
frequency we can have in practice is 6 bit times between transitions. This is a
|
||
desirable property from the point of view of the clock-recovery circuitry, and
|
||
all (1,7) RLL codes seem to have this property. We now need to find a way to
|
||
write the desired pattern without knowing the particular (1,7) RLL code used.
|
||
We can do this by looking at the way the drives error-correction system works.
|
||
The error- correction is applied to the decoded data, even though errors
|
||
generally occur in the encoded data. In order to make this work well, the data
|
||
encoding should have limited error amplification, so that an erroneous encoded
|
||
bit should affect only a small, finite number of decoded bits.<P>
|
||
|
||
Decoded bits therefore depend only on nearby encoded bits, so that a repeating
|
||
pattern of encoded bits will correspond to a repeating pattern of decoded bits.
|
||
The repeating pattern of encoded bits is 6 bits long. Since the rate of the
|
||
code is 2/3, this corresponds to a repeating pattern of 4 decoded bits. There
|
||
are only 16 possibilities for this pattern, making it feasible to write all of
|
||
them during the erase process. So to achieve good overwriting of (1,7) RLL
|
||
disks, we write the patterns 0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111,
|
||
1000, 1001, 1010, 1011, 1100, 1101, 1110, and 1111. These patterns also
|
||
conveniently cover two of the ones needed for MFM overwrites, although we
|
||
should add a few more iterations of the MFM-specific patterns for the reasons
|
||
given above.<P>
|
||
|
||
Finally, we have (2,7) RLL drives. These are similar to MFM in that an
|
||
eight-bit-time signal can be written in some phases, but not all. A
|
||
six-bit-time signal will fill in the remaining cracks. Using a <20> encoding
|
||
rate, an eight-bit-time signal corresponds to a repeating pattern of 4 data
|
||
bits. The most common (2,7) RLL code is shown below:<P>
|
||
|
||
<CENTER>
|
||
<TABLE ALIGN=center BORDER=1 CELLPADDING=3 CELLSPACING=2>
|
||
<TR><TH COLSPAN=2>The most common (2,7) RLL Code</TH></TR>
|
||
<TR><TH>Decoded Data</TH><TH>(2,7) RLL Encoded Data</TH></TR>
|
||
<TR><TD>00 </TD><TD>1000 </TD></TR>
|
||
<TR><TD>01 </TD><TD>0100 </TD></TR>
|
||
<TR><TD>100 </TD><TD>001000 </TD></TR>
|
||
<TR><TD>101 </TD><TD>100100 </TD></TR>
|
||
<TR><TD>111 </TD><TD>000100 </TD></TR>
|
||
<TR><TD>1100</TD><TD>00001000</TD></TR>
|
||
<TR><TD>1101</TD><TD>00100100</TD></TR>
|
||
</TABLE>
|
||
</CENTER>
|
||
<P>
|
||
|
||
The second most common (2,7) RLL code is the same but with the "decoded data"
|
||
complemented, which doesn't alter these patterns. Writing the required encoded
|
||
data can be achieved for every other phase using patterns of 0x33, 0x66, 0xCC
|
||
and 0x99, which are already written for (1,7) RLL drives.<P>
|
||
|
||
Six-bit-time patterns can be written using 3-bit repeating patterns. The
|
||
all-zero and all-one patterns overlap with the (1,7) RLL patterns, leaving six
|
||
others:<P>
|
||
|
||
<PRE>
|
||
001001001001001001001001
|
||
2 4 9 2 4 9
|
||
</PRE>
|
||
|
||
in binary or 0x24 0x92 0x49, 0x92 0x49 0x24 and 0x49 0x24 0x92 in hex, and<P>
|
||
|
||
<PRE>
|
||
011011011011011011011011
|
||
6 D B 6 D B
|
||
</PRE>
|
||
|
||
in binary or 0x6D 0xB6 0xDB, 0xB6 0xDB 0x6D and 0xDB 0x6D 0xB6 in hex. The
|
||
first three are the same as the MFM patterns, so we need only three extra
|
||
patterns to cover (2,7) RLL drives.<P>
|
||
|
||
Although (1,7) is more popular in recent (post-1990) drives, some older hard
|
||
drives do still use (2,7) RLL, and with the ever-increasing reliability of
|
||
newer drives it is likely that they will remain in use for some time to come,
|
||
often being passed down from one machine to another. The above three patterns
|
||
also cover any problems with endianness issues, which weren't a concern in the
|
||
previous two cases, but would be in this case (actually, thanks to the strong
|
||
influence of IBM mainframe drives, everything seems to be uniformly big-endian
|
||
within bytes, with the most significant bit being written to the disk first).<P>
|
||
|
||
The latest high-density drives use methods like Partial-Response
|
||
Maximum-Likelihood (PRML) encoding, which may be roughly equated to the trellis
|
||
encoding done by V.32 modems in that it is effective but computationally
|
||
expensive. PRML codes are still RLL codes, but with somewhat different
|
||
constraints. A typical code might have (0,4,4) constraints in which the 0
|
||
means that 1's in a data stream can occur right next to 0's (so that peaks in
|
||
the analog readback signal are not separated), the first 4 means that there can
|
||
be no more than four 0's between 1's in a data stream, and the second 4
|
||
specifies the maximum number of 0's between 1's in certain symbol subsequences.
|
||
PRML codes avoid intersymbol influence errors by using digital filtering
|
||
techniques to shape the read signal to exhibit desired frequency and timing
|
||
characteristics (this is the "partial response" part of PRML) followed by
|
||
maximum- likelihood digital data detection to determine the most likely
|
||
sequence of data bits that was written to the disk (this is the "maximum
|
||
likelihood" part of PRML). PRML channels achieve the same low bit error rate as
|
||
standard peak-detection methods, but with much higher recording densities,
|
||
while using the same heads and media. Several manufacturers are currently
|
||
engaged in moving their peak-detection-based product lines across to PRML,
|
||
giving a 30-40% density increase over standard RLL channels [<A
|
||
HREF="#14">14</A>].<P>
|
||
|
||
Since PRML codes don't try to separate peaks in the same way that non-PRML RLL
|
||
codes do, all we can do is to write a variety of random patterns because the
|
||
processing inside the drive is too complex to second- guess. Fortunately,
|
||
these drives push the limits of the magnetic media much more than older drives
|
||
ever did by encoding data with much smaller magnetic domains, closer to the
|
||
physical capacity of the magnetic media (the current state of the art in PRML
|
||
drives has a track density of around 6700 TPI (tracks per inch) and a data
|
||
recording density of 170 kFCI, nearly double that of the nearest (1,7) RLL
|
||
equivalent. A convenient side-effect of these very high recording densities is
|
||
that a written transition may experience the write field cycles for successive
|
||
transitions, especially at the track edges where the field distribution is much
|
||
broader [<A HREF="#15">15</A>]. Since this is also where remnant data is most
|
||
likely to be found, this can only help in reducing the recoverability of the
|
||
data). If these drives require sophisticated signal processing just to read
|
||
the most recently written data, reading overwritten layers is also
|
||
correspondingly more difficult. A good scrubbing with random data will do
|
||
about as well as can be expected.<P>
|
||
|
||
We now have a set of 22 overwrite patterns which should erase everything,
|
||
regardless of the raw encoding. The basic disk eraser can be improved slightly
|
||
by adding random passes before and after the erase process, and by performing
|
||
the deterministic passes in random order to make it more difficult to guess
|
||
which of the known data passes were made at which point. To deal with all this
|
||
in the overwrite process, we use the sequence of 35 consecutive writes shown
|
||
below:<P>
|
||
|
||
<CENTER>
|
||
<TABLE ALIGN=center BORDER=1 CELLPADDING=3 CELLSPACING=2>
|
||
<TR><TH COLSPAN=5>Overwrite Data</TH></TR>
|
||
<TR><TH>Pass No.</TH><TH>Data Written</TH><TH COLSPAN=3>Encoding Scheme Targeted</TH></TR>
|
||
<TR><TD>1 </TD><TD>Random </TD><TD> </TD><TD> </TD><TD> </TD></TR>
|
||
<TR><TD>2 </TD><TD>Random </TD><TD> </TD><TD> </TD><TD> </TD></TR>
|
||
<TR><TD>3 </TD><TD>Random </TD><TD> </TD><TD> </TD><TD> </TD></TR>
|
||
<TR><TD>4 </TD><TD>Random </TD><TD> </TD><TD> </TD><TD> </TD></TR>
|
||
<TR><TD>5 </TD><TD>01010101 01010101 01010101 0x55 </TD><TD>(1,7) RLL</TD><TD> </TD><TD>MFM </TD></TR>
|
||
<TR><TD>6 </TD><TD>10101010 10101010 10101010 0xAA </TD><TD>(1,7) RLL</TD><TD> </TD><TD>MFM </TD></TR>
|
||
<TR><TD>7 </TD><TD>10010010 01001001 00100100 0x92 0x49 0x24</TD><TD> </TD><TD>(2,7) RLL</TD><TD>MFM </TD></TR>
|
||
<TR><TD>8 </TD><TD>01001001 00100100 10010010 0x49 0x24 0x92</TD><TD> </TD><TD>(2,7) RLL</TD><TD>MFM </TD></TR>
|
||
<TR><TD>9 </TD><TD>00100100 10010010 01001001 0x24 0x92 0x49</TD><TD> </TD><TD>(2,7) RLL</TD><TD>MFM </TD></TR>
|
||
<TR><TD>10</TD><TD>00000000 00000000 00000000 0x00 </TD><TD>(1,7) RLL</TD><TD>(2,7) RLL</TD><TD> </TD></TR>
|
||
<TR><TD>11</TD><TD>00010001 00010001 00010001 0x11 </TD><TD>(1,7) RLL</TD><TD> </TD><TD> </TD></TR>
|
||
<TR><TD>12</TD><TD>00100010 00100010 00100010 0x22 </TD><TD>(1,7) RLL</TD><TD> </TD><TD> </TD></TR>
|
||
<TR><TD>13</TD><TD>00110011 00110011 00110011 0x33 </TD><TD>(1,7) RLL</TD><TD>(2,7) RLL</TD><TD> </TD></TR>
|
||
<TR><TD>14</TD><TD>01000100 01000100 01000100 0x44 </TD><TD>(1,7) RLL</TD><TD> </TD><TD> </TD></TR>
|
||
<TR><TD>15</TD><TD>01010101 01010101 01010101 0x55 </TD><TD>(1,7) RLL</TD><TD> </TD><TD>MFM </TD></TR>
|
||
<TR><TD>16</TD><TD>01100110 01100110 01100110 0x66 </TD><TD>(1,7) RLL</TD><TD>(2,7) RLL</TD><TD> </TD></TR>
|
||
<TR><TD>17</TD><TD>01110111 01110111 01110111 0x77 </TD><TD>(1,7) RLL</TD><TD> </TD><TD> </TD></TR>
|
||
<TR><TD>18</TD><TD>10001000 10001000 10001000 0x88 </TD><TD>(1,7) RLL</TD><TD> </TD><TD> </TD></TR>
|
||
<TR><TD>19</TD><TD>10011001 10011001 10011001 0x99 </TD><TD>(1,7) RLL</TD><TD>(2,7) RLL</TD><TD> </TD></TR>
|
||
<TR><TD>20</TD><TD>10101010 10101010 10101010 0xAA </TD><TD>(1,7) RLL</TD><TD> MFM </TD><TD> </TD></TR>
|
||
<TR><TD>21</TD><TD>10111011 10111011 10111011 0xBB </TD><TD>(1,7) RLL</TD><TD> </TD><TD> </TD></TR>
|
||
<TR><TD>22</TD><TD>11001100 11001100 11001100 0xCC </TD><TD>(1,7) RLL</TD><TD>(2,7) RLL</TD><TD> </TD></TR>
|
||
<TR><TD>23</TD><TD>11011101 11011101 11011101 0xDD </TD><TD>(1,7) RLL</TD><TD> </TD><TD> </TD></TR>
|
||
<TR><TD>24</TD><TD>11101110 11101110 11101110 0xEE </TD><TD>(1,7) RLL</TD><TD> </TD><TD> </TD></TR>
|
||
<TR><TD>25</TD><TD>11111111 11111111 11111111 0xFF </TD><TD>(1,7) RLL</TD><TD>(2,7) RLL</TD><TD> </TD></TR>
|
||
<TR><TD>26</TD><TD>10010010 01001001 00100100 0x92 0x49 0x24</TD><TD> </TD><TD>(2,7) RLL</TD><TD>MFM </TD></TR>
|
||
<TR><TD>27</TD><TD>01001001 00100100 10010010 0x49 0x24 0x92</TD><TD> </TD><TD>(2,7) RLL</TD><TD>MFM </TD></TR>
|
||
<TR><TD>28</TD><TD>00100100 10010010 01001001 0x24 0x92 0x49</TD><TD> </TD><TD>(2,7) RLL</TD><TD>MFM </TD></TR>
|
||
<TR><TD>29</TD><TD>01101101 10110110 11011011 0x6D 0xB6 0xDB</TD><TD> </TD><TD>(2,7) RLL</TD><TD> </TD></TR>
|
||
<TR><TD>30</TD><TD>10110110 11011011 01101101 0xB6 0xDB 0x6D</TD><TD> </TD><TD>(2,7) RLL</TD><TD> </TD></TR>
|
||
<TR><TD>31</TD><TD>11011011 01101101 10110110 0xDB 0x6D 0xB6</TD><TD> </TD><TD>(2,7) RLL</TD><TD> </TD></TR>
|
||
<TR><TD>32</TD><TD>Random </TD><TD> </TD><TD> </TD><TD> </TD></TR>
|
||
<TR><TD>33</TD><TD>Random </TD><TD> </TD><TD> </TD><TD> </TD></TR>
|
||
<TR><TD>34</TD><TD>Random </TD><TD> </TD><TD> </TD><TD> </TD></TR>
|
||
<TR><TD>35</TD><TD>Random </TD><TD> </TD><TD> </TD><TD> </TD></TR>
|
||
</TABLE>
|
||
</CENTER>
|
||
<P>
|
||
|
||
The MFM-specific patterns are repeated twice because MFM drives have the lowest
|
||
density and are thus particularly easy to examine. The deterministic patterns
|
||
between the random writes are permuted before the write is performed, to make
|
||
it more difficult for an opponent to use knowledge of the erasure data written
|
||
to attempt to recover overwritten data (in fact we need to use a
|
||
cryptographically strong random number generator to perform the permutations to
|
||
avoid the problem of an opponent who can read the last overwrite pass being
|
||
able to predict the previous passes and "echo cancel" passes by subtracting the
|
||
known overwrite data).<P>
|
||
|
||
If the device being written to supports caching or buffering of data, this
|
||
should be disabled to ensure that physical disk writes are performed for each
|
||
pass instead of everything but the last pass being lost in the buffering. For
|
||
example physical disk access can be forced during SCSI-2 Group 1 write commands
|
||
by setting the Force Unit Access bit in the SCSI command block (although at
|
||
least one popular drive has a bug which causes all writes to be ignored when
|
||
this bit is set - remember to test your overwrite scheme before you deploy it).
|
||
Another consideration which needs to be taken into account when trying to erase
|
||
data through software is that drives conforming to some of the higher-level
|
||
protocols such as the various SCSI standards are relatively free to interpret
|
||
commands sent to them in whichever way they choose (as long as they still
|
||
conform to the SCSI specification). Thus some drives, if sent a FORMAT UNIT
|
||
command may return immediately without performing any action, may simply
|
||
perform a read test on the entire disk (the most common option), or may
|
||
actually write data to the disk (the SCSI- 2 standard includes an
|
||
initialization pattern (IP) option for the FORMAT UNIT command, however this is
|
||
not necessarily supported by existing drives).<P>
|
||
|
||
If the data is very sensitive and is stored on floppy disk, it can best be
|
||
destroyed by removing the media from the disk liner and burning it, or by
|
||
burning the entire disk, liner and all (most floppy disks burn remarkably well
|
||
- albeit with quantities of oily smoke - and leave very little residue).<P>
|
||
|
||
<H2>4. Other Methods of Erasing Magnetic Media</H2>
|
||
|
||
The previous section has concentrated on erasure methods which require no
|
||
specialised equipment to perform the erasure. Alternative means of erasing
|
||
media which do require specialised equipment are degaussing (a process in which
|
||
the recording media is returned to its initial state) and physical destruction.
|
||
Degaussing is a reasonably effective means of purging data from magnetic disk
|
||
media, and will even work through most drive cases (research has shown that the
|
||
aluminium housings of most disk drives attenuate the degaussing field by only
|
||
about 2 dB [<A HREF="#16">16</A>]).<P>
|
||
|
||
The switching of a single-domain magnetic particle from one magnetization
|
||
direction to another requires the overcoming of an energy barrier, with an
|
||
external magnetic field helping to lower this barrier. The switching depends
|
||
not only on the magnitude of the external field, but also on the length of time
|
||
for which it is applied. For typical disk drive media, the short-term field
|
||
needed to flip enough of the magnetic domains to be useful in recording a
|
||
signal is about 1/3 higher than the coercivity of the media (the exact figure
|
||
varies with different media types) [<A HREF="#17">17</A>].<P>
|
||
|
||
However, to effectively erase a medium to the extent that recovery of data from
|
||
it becomes uneconomical requires a magnetic force of about five times the
|
||
coercivity of the medium [<A HREF="#18">18</A>], although even small external
|
||
magnetic fields are sufficient to upset the normal operation of a hard disk
|
||
(typically a few gauss at DC, dropping to a few milligauss at 1 MHz).
|
||
Coercivity (measured in Oersteds, Oe) is a property of magnetic material and is
|
||
defined as the amount of magnetic field necessary to reduce the magnetic
|
||
induction in the material to zero - the higher the coercivity, the harder it is
|
||
to erase data from a medium. Typical figures for various types of magnetic
|
||
media are given below:<P>
|
||
|
||
<CENTER>
|
||
<TABLE ALIGN=center BORDER=1 CELLPADDING=3 CELLSPACING=2>
|
||
<TR><TH COLSPAN=2>Typical Media Coercivity Figures</TH></TR>
|
||
<TR><TH>Medium</TH><TH>Coercivity</TH></TR>
|
||
<TR><TD>5.25" 360K floppy disk </TD><TD>300 Oe </TD></TR>
|
||
<TR><TD>5.25" 1.2M floppy disk </TD><TD>675 Oe </TD></TR>
|
||
<TR><TD>3.5" 720K floppy disk </TD><TD>300 Oe </TD></TR>
|
||
<TR><TD>3.5" 1.44M floppy disk </TD><TD>700 Oe </TD></TR>
|
||
<TR><TD>3.5" 2.88M floppy disk </TD><TD>750 Oe </TD></TR>
|
||
<TR><TD>3.5" 21M floptical disk </TD><TD>750 Oe </TD></TR>
|
||
<TR><TD>Older (1980's) hard disks </TD><TD>900-1400 Oe </TD></TR>
|
||
<TR><TD>Newer (1990's) hard disks </TD><TD>1400-2200 Oe</TD></TR>
|
||
<TR><TD>1/2" magnetic tape </TD><TD>300 Oe </TD></TR>
|
||
<TR><TD>1/4" QIC tape </TD><TD>550 Oe </TD></TR>
|
||
<TR><TD>8 mm metallic particle tape</TD><TD>1500 Oe </TD></TR>
|
||
<TR><TD>DAT metallic particle tape </TD><TD>1500 Oe </TD></TR>
|
||
</TABLE>
|
||
</CENTER>
|
||
<P>
|
||
|
||
US Government guidelines class tapes of 350 Oe coercivity or less as low-energy
|
||
or Class I tapes and tapes of 350-750 Oe coercivity as high-energy or Class II
|
||
tapes. Degaussers are available for both types of tapes. Tapes of over 750 Oe
|
||
coercivity are referred to as Class III, with no known degaussers capable of
|
||
fully erasing them being known [<A HREF="#19">19</A>], since even the most
|
||
powerful commercial AC degausser cannot generate the recommended 7,500 Oe
|
||
needed for full erasure of a typical DAT tape currently used for data
|
||
backups.<P>
|
||
|
||
Degaussing of disk media is somewhat more difficult - even older hard disks
|
||
generally have a coercivity equivalent to Class III tapes, making them fairly
|
||
difficult to erase at the outset. Since manufacturers rate their degaussers in
|
||
peak gauss and measure the field at a certain orientation which may not be
|
||
correct for the type of medium being erased, and since degaussers tend to be
|
||
rated by whether they erase sufficiently for clean rerecording rather than
|
||
whether they make the information impossible to recover, it may be necessary to
|
||
resort to physical destruction of the media to completely sanitise it (in fact
|
||
since degaussing destroys the sync bytes, ID fields, error correction
|
||
information, and other paraphernalia needed to identify sectors on the media,
|
||
thus rendering the drive unusable, it makes the degaussing process mostly
|
||
equivalent to physical destruction). In addition, like physical destruction,
|
||
it requires highly specialised equipment which is expensive and difficult to
|
||
obtain (one example of an adequate degausser was the 2.5 MW Navy research
|
||
magnet used by a former Pentagon site manager to degauss a 14" hard drive for
|
||
1<EFBFBD> minutes. It bent the platters on the drive and probably succeeded in
|
||
erasing it beyond the capabilities of any data recovery attempts [<A
|
||
HREF="#20">20</A>]).<P>
|
||
|
||
<H2>5. Further Problems with Magnetic Media</H2>
|
||
|
||
A major issue which cannot be easily addressed using any standard
|
||
software-based overwrite technique is the problem of defective sector handling.
|
||
When the drive is manufactured, the surface is scanned for defects which are
|
||
added to a defect list or flaw map. If further defects, called grown defects,
|
||
occur during the life of the drive, they are added to the defect list by the
|
||
drive or by drive management software. There are several techniques which are
|
||
used to mask the defects in the defect list. The first, alternate tracks, moves
|
||
data from tracks with defects to known good tracks. This scheme is the
|
||
simplest, but carries a high access cost, as each read from a track with
|
||
defects requires seeking to the alternate track and a rotational latency delay
|
||
while waiting for the data location to appear under the head, performing the
|
||
read or write, and, if the transfer is to continue onto a neighbouring track,
|
||
seeking back to the original position. Alternate tracks may be interspersed
|
||
among data tracks to minimise the seek time to access them.<P>
|
||
|
||
A second technique, alternate sectors, allocates alternate sectors at the end
|
||
of the track to minimise seeks caused by defective sectors. This eliminates
|
||
the seek delay, but still carries some overhead due to rotational latency. In
|
||
addition it reduces the usable storage capacity by 1-3%.<P>
|
||
|
||
A third technique, inline sector sparing, again allocates a spare sector at the
|
||
end of each track, but resequences the sector ID's to skip the defective sector
|
||
and include the spare sector at the end of the track, in effect pushing the
|
||
sectors past the defective one towards the end of the track. The associated
|
||
cost is the lowest of the three, being one sector time to skip the defective
|
||
sector [<A HREF="#21">21</A>].<P>
|
||
|
||
The handling of mapped-out sectors and tracks is an issue which can't be easily
|
||
resolved without the cooperation of hard drive manufacturers. Although some
|
||
SCSI and IDE hard drives may allow access to defect lists and even to
|
||
mapped-out areas, this must be done in a highly manufacturer- and
|
||
drive-specific manner. For example the SCSI-2 READ DEFECT DATA command can be
|
||
used to obtain a list of all defective areas on the drive. Since SCSI logical
|
||
block numbers may be mapped to arbitrary locations on the disk, the defect list
|
||
is recorded in terms of heads, tracks, and sectors. As all SCSI device
|
||
addressing is performed in terms of logical block numbers, mapped-out sectors
|
||
or tracks cannot be addressed. The only reasonably portable possibility is to
|
||
clear various automatic correction flags in the read-write error recovery mode
|
||
page to force the SCSI device to report read/write errors to the user instead
|
||
of transparently remapping the defective areas. The user can then use the READ
|
||
LONG and WRITE LONG commands (which allow access to sectors and extra data even
|
||
in the presence of read/write errors), to perform any necessary operations on
|
||
the defective areas, and then use the REASSIGN BLOCKS command to reassign the
|
||
defective sections. However this operation requires an in-depth knowledge of
|
||
the operation of the SCSI device and extensive changes to disk drivers, and
|
||
more or less defeats the purpose of having an intelligent peripheral.<P>
|
||
|
||
The ANSI X3T-10 and X3T-13 subcommittees are currently looking at creating new
|
||
standards for a Universal Security Reformat command for IDE and SCSI hard disks
|
||
which will address these issues. This will involve a multiple-pass overwrite
|
||
process which covers mapped-out disk areas with deliberate off-track writing.
|
||
Many drives available today can be modified for secure erasure through a
|
||
firmware upgrade, and once the new firmware is in place the erase procedure is
|
||
handled by the drive itself, making unnecessary any interaction with the host
|
||
system beyond the sending of the command which begins the erase process.<P>
|
||
|
||
Long-term ageing can also have a marked effect on the erasability of magnetic
|
||
media. For example, some types of magnetic tape become increasingly difficult
|
||
to erase after being stored at an elevated temperature or having contained the
|
||
same magnetization pattern for a considerable period of time [<A
|
||
HREF="#22">22</A>]. The same applies for magnetic disk media, with decreases
|
||
in erasability of several dB being recorded [<A HREF="#23">23</A>]. The
|
||
erasability of the data depends on the amount of time it has been stored on the
|
||
media, not on the age of the media itself (so that, for example, a
|
||
five-year-old freshly-written disk is no less erasable than a new
|
||
freshly-written disk).<P>
|
||
|
||
The dependence of media coercivity on temperature can affect overwrite
|
||
capability if the data was initially recorded at a temperature where the
|
||
coercivity was low (so that the recorded pattern penetrated deep into the
|
||
media), but must be overwritten at a temperature where the coercivity is
|
||
relatively high. This is important in hard disk drives, where the temperature
|
||
varies depending on how long the unit has been used and, in the case of drives
|
||
with power-saving features enabled, how recently and frequently it has been
|
||
used. However the overwrite performance depends not only on
|
||
temperature-dependent changes in the media, but also on temperature-dependent
|
||
changes in the read/write head. Thankfully the combination of the most common
|
||
media used in current drives with various common types of read/write heads
|
||
produce a change in overwrite performance of only a few hundredths of a decibel
|
||
per degree over the temperature range -40°C to + 40°C, as changes in
|
||
the head compensate for changes in the media [<A HREF="#24">24</A>].<P>
|
||
|
||
Another issue which needs to be taken into account is the ability of most newer
|
||
storage devices to recover from having a remarkable amount of damage inflicted
|
||
on them through the use of various error-correction schemes. As increasing
|
||
storage densities began to lead to multiple-bit errors, manufacturers started
|
||
using sophisticated error-correction codes (ECC's) capable of correcting
|
||
multiple error bursts. A typical drive might have 512 bytes of data, 4 bytes
|
||
of CRC, and 11 bytes of ECC per sector. This ECC would be capable of
|
||
correcting single burst errors of up to 22 bits or double burst errors of up to
|
||
11 bits, and can detect a single burst error of up to 51 bits or three burst
|
||
errors of up to 11 bits in length [<A HREF="#25">25</A>]. Another drive
|
||
manufacturer quotes the ability to correct up to 120 bits, or up to 32 bits on
|
||
the fly, using 198-bit Reed-Solomon ECC [<A HREF="#26">26</A>]. Therefore even
|
||
if some data is reliably erased, it may be possible to recover it using the
|
||
built-in error-correction capabilities of the drive. Conversely, any erasure
|
||
scheme which manages to destroy the ECC information (for example through the
|
||
use of the SCSI-2 WRITE LONG command which can be used to write to areas of a
|
||
disk sector outside the normal data areas) stands a greater chance of making
|
||
the data unrecoverable.<P>
|
||
|
||
<H2>6. Sidestepping the Problem</H2>
|
||
|
||
The easiest way to solve the problem of erasing sensitive information from
|
||
magnetic media is to ensure that it never gets to the media in the first place.
|
||
Although not practical for general data, it is often worthwhile to take steps
|
||
to keep particularly important information such as encryption keys from ever
|
||
being written to disk. This would typically happen when the memory containing
|
||
the keys is paged out to disk by the operating system, where they can then be
|
||
recovered at a later date, either manually or using software which is aware of
|
||
the in-memory data format and can locate it automatically in the swap file (for
|
||
example there exists software which will search the Windows swap file for keys
|
||
from certain DOS encryption programs). An even worse situation occurs when the
|
||
data is paged over a network, allowing anyone with a packet sniffer or similar
|
||
tool on the same subnet to observe the information (for example there exists
|
||
software which will monitor and even alter NFS traffic on the fly which could
|
||
be modified to look for known in-memory data patterns moving to and from a
|
||
networked swap disk [<A HREF="#27">27</A>]).<P>
|
||
|
||
To solve these problems the memory pages containing the information can be
|
||
locked to prevent them from being paged to disk or transmitted over a network.
|
||
This approach is taken by at least one encryption library, which allocates all
|
||
keying information inside protected memory blocks visible to the user only as
|
||
opaque handles, and then optionally locks the memory (provided the underlying
|
||
OS allows it) to prevent it from being paged [<A HREF="#28">28</A>]. The exact
|
||
details of locking pages in memory depend on the operating system being used.
|
||
Many Unix systems now support the <TT>mlock()</TT>/<TT>munlock()</TT> calls or have some
|
||
alternative mechanism hidden among the <TT>mmap()</TT>-related functions which can be
|
||
used to lock pages in memory. Unfortunately these operations require superuser
|
||
privileges because of their potential impact on system performance if large
|
||
ranges of memory are locked. Other systems such as Microsoft Windows NT allow
|
||
user processes to lock memory with the <TT>VirtualLock()</TT>/<TT>VirtualUnlock()</TT> calls, but
|
||
limit the total number of regions which can be locked.<P>
|
||
|
||
Most paging algorithms are relatively insensitive to having sections of memory
|
||
locked, and can even relocate the locked pages (since the logical to physical
|
||
mapping is invisible to the user), or can move the pages to a "safe" location
|
||
when the memory is first locked. The main effect of locking pages in memory is
|
||
to increase the minimum working set size which, taken in moderation, has little
|
||
noticeable effect on performance. The overall effects depend on the operating
|
||
system and/or hardware implementations of virtual memory. Most Unix systems
|
||
have a global page replacement policy in which a page fault may be satisfied by
|
||
any page frame. A smaller number of operating systems use a local page
|
||
replacement policy in which pages are allocated from a fixed (or occasionally
|
||
dynamically variable) number of page frames allocated on a per- process basis.
|
||
This makes them much more sensitive to the effects of locking pages, since
|
||
every locked page decreases the (finite) number of pages available to the
|
||
process. On the other hand it makes the system as a whole less sensitive to
|
||
the effects of one process locking a large number of pages. The main effective
|
||
difference between the two is that under a local replacement policy a process
|
||
can only lock a small fixed number of pages without affecting other processes,
|
||
whereas under a global replacement policy the number of pages a process can
|
||
lock is determined on a system-wide basis and may be affected by other
|
||
processes.<P>
|
||
|
||
In practice neither of these allocation strategies seem to cause any real
|
||
problems. Although any practical measurements are very difficult to perform
|
||
since they vary wildly depending on the amount of physical memory present,
|
||
paging strategy, operating system, and system load, in practice locking a dozen
|
||
1K regions of memory (which might be typical of a system on which a number of
|
||
users are running programs such as mail encryption software) produced no
|
||
noticeable performance degradation observable by system- monitoring tools. On
|
||
machines such as network servers handling large numbers of secure connections
|
||
(for example an HTTP server using SSL), the effects of locking large numbers of
|
||
pages may be more noticeable.<P>
|
||
|
||
<H2>7. Methods of Recovery for Data stored in Random-Access Memory</H2>
|
||
|
||
Contrary to conventional wisdom, "volatile" semiconductor memory does not
|
||
entirely lose its contents when power is removed. Both static (SRAM) and
|
||
dynamic (DRAM) memory retains some information on the data stored in it while
|
||
power was still applied. SRAM is particularly susceptible to this problem, as
|
||
storing the same data in it over a long period of time has the effect of
|
||
altering the preferred power-up state to the state which was stored when power
|
||
was removed. Older SRAM chips could often "remember" the previously held state
|
||
for several days. In fact, it is possible to manufacture SRAM's which always
|
||
have a certain state on power-up, but which can be overwritten later on - a
|
||
kind of "writeable ROM".<P>
|
||
|
||
DRAM can also "remember" the last stored state, but in a slightly different
|
||
way. It isn't so much that the charge (in the sense of a voltage appearing
|
||
across a capacitance) is retained by the RAM cells, but that the thin oxide
|
||
which forms the storage capacitor dielectric is highly stressed by the applied
|
||
field, or is not stressed by the field, so that the properties of the oxide
|
||
change slightly depending on the state of the data. One thing that can cause a
|
||
threshold shift in the RAM cells is ionic contamination of the cell(s) of
|
||
interest, although such contamination is rarer now than it used to be because
|
||
of robotic handling of the materials and because the purity of the chemicals
|
||
used is greatly improved. However, even a perfect oxide is subject to having
|
||
its properties changed by an applied field. When it comes to contaminants,
|
||
sodium is the most common offender - it is found virtually everywhere, and is a
|
||
fairly small (and therefore mobile) atom with a positive charge. In the
|
||
presence of an electric field, it migrates towards the negative pole with a
|
||
velocity which depends on temperature, the concentration of the sodium, the
|
||
oxide quality, and the other impurities in the oxide such as dopants from the
|
||
processing. If the electric field is zero and given enough time, this stress
|
||
tends to dissipate eventually.<P>

The stress on the cell is a cumulative effect, much like charging an RC
circuit. If the data is applied for only a few milliseconds then there is very
little "learning" of the cell, but if it is applied for hours then the cell
will acquire a strong (relatively speaking) change in its threshold. The
effects of the stress on the RAM cells can be measured using the built-in
self-test capabilities of the cells, which provide the ability to impress a
weak voltage on a storage cell in order to measure its margin. Cells will show
different margins depending on how much oxide stress has been present. Many
DRAM's have undocumented test modes which allow some normal I/O pin to become
the power supply for the RAM core when the special mode is active. These test
modes are typically activated by running the RAM in a nonstandard
configuration, so that a certain set of states which would not occur in a
normally-functioning system has to be traversed to activate the mode.
Manufacturers won't admit to such capabilities in their products because they
don't want their customers using them and potentially rejecting devices which
comply with their spec sheets but have little margin beyond that.<P>
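
To make the RC-circuit analogy above slightly more concrete (this is purely an
illustrative model, not a measured characteristic of any particular device),
the threshold shift can be pictured as approaching its limiting value
exponentially:<P>

<PRE>
    dV( t ) = dVmax * ( 1 - exp( -t / tau ) )
</PRE>

where tau is a time constant which, for the behaviour described above, would
lie somewhere in the range of minutes to hours. A few milliseconds of exposure
then moves the cell only a tiny fraction of the way towards its fully
"learned" state, while several hours of exposure takes it most of the way
there.<P>
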

A simple but somewhat destructive method to speed up the annihilation of stored
bits in semiconductor memory is to heat it. Both DRAM's and SRAM's will lose
their contents a lot more quickly at Tjunction = 140°C than they will at
room temperature. Several hours at this temperature with no power applied will
clear their contents sufficiently to make recovery difficult. Conversely, to
extend the life of stored bits with the power removed, the temperature should
be dropped below -60°C. Such cooling should lead to weeks, instead of hours
or days, of data retention.<P>

<H2>8. Erasure of Data stored in Random-Access Memory</H2>

Simply overwriting the data held in DRAM repeatedly with new data isn't nearly
as effective as it is for magnetic media. The new data will begin stressing or
relaxing the oxide as soon as it is written, and the oxide will immediately
begin to take a "set" which will either reinforce the previous "set" or weaken
it. The greater the amount of time that the new data has existed in the cell,
the more the old stress is "diluted" and the less reliable the information
extraction will be. Generally, the rates of change due to stress and
relaxation are of the same order of magnitude. Thus a few microseconds of
storing the opposite data to the currently stored value will have little
effect on the oxide. Ideally, the oxide should be exposed to as much stress as
possible, at the highest feasible temperature and for as long as possible, to
get the greatest "erasure" of the old data. Unfortunately, if carried too far
this has a rather detrimental effect on the life expectancy of the RAM.<P>

The goal to aim for when sanitising memory is therefore to store the new
(overwriting) data for as long as possible, rather than trying to change it as
often as possible. Conversely, keeping sensitive data in a cell for as short a
time as possible reduces the chances of it being "remembered" by the cell.
Based on tests on DRAM cells, a storage time of one second causes such a small
change in threshold that it probably isn't detectable. On the other hand, one
minute is probably detectable, and 10 minutes is certainly detectable.<P>

The most practical solution to the problem of DRAM data retention is therefore
to constantly flip the bits in memory to ensure that a memory cell never holds
a charge long enough for it to be "remembered". While not practical for
general use, it is possible to do this for small amounts of very sensitive data
such as encryption keys. This is particularly advisable where keys are stored
in the same memory location for long periods of time and control access to
large amounts of information, such as keys used for transparent encryption of
files on disk drives. The bit-flipping also has the convenient side-effect of
keeping the page containing the encryption keys at the top of the queue
maintained by the system's paging mechanism, greatly reducing the chances of it
being paged to disk at some point.<P>
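
The following is a minimal sketch of one way in which this kind of
bit-flipping might be implemented for a small key buffer (an illustration
only, not the method used by any particular product; all names are
hypothetical). The key is held either in true or in complemented form, with a
flag recording the current polarity, and a timer calls flipKey() at regular
intervals, for example once a second, so that no cell holds the same value for
long. Real code would also need to prevent a flip from occurring halfway
through a read of the key.<P>

<PRE>
/* Illustrative sketch of periodic bit-flipping of an in-memory key.  The key
   is stored either in true or in complemented form; flipKey() inverts every
   bit and toggles the polarity flag */

#define KEY_SIZE    16

static unsigned char keyData[ KEY_SIZE ];   /* Key, possibly complemented */
static int keyInverted = 0;                 /* Nonzero if currently inverted */

/* Invert the stored key and record its new polarity.  Called regularly, for
   example once a second from a timer */
void flipKey( void )
    {
    int i;

    for( i = 0; i &lt; KEY_SIZE; i++ )
        keyData[ i ] ^= 0xFF;
    keyInverted = !keyInverted;
    }

/* Return the true value of a key byte regardless of the current polarity */
unsigned char getKeyByte( const int index )
    {
    return( keyInverted ? keyData[ index ] ^ 0xFF : keyData[ index ] );
    }
</PRE>

Since each cell alternates between holding a key bit and its complement, over
time it spends roughly equal periods in each state and no long-term "set"
corresponding to the key can build up; the constant activity also keeps the
page containing the key near the front of the paging queue, as noted above.<P>
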
<H2>9. Conclusion</H2>

Data which has been overwritten once or twice may be recovered by subtracting
what is expected to be read from a storage location from what is actually
read. Data which has been overwritten an arbitrarily large number of times can
still be recovered provided that the new data isn't written to the same
location as the original data (for magnetic media), or that the recovery
attempt is carried out fairly soon after the new data was written (for RAM).
For this reason it is effectively impossible to sanitise storage locations by
simply overwriting them, no matter how many overwrite passes are made or what
data patterns are written. However, by using the relatively simple methods
presented in this paper the task of an attacker can be made significantly more
difficult, if not prohibitively expensive.<P>

<H2>Acknowledgments</H2>

The author would like to thank Nigel Bree, Peter Fenwick, Andy Hospodor, Kevin
Martinez, Colin Plumb, and Charles Preston for their advice and input during
the preparation of this paper.<P>

<H2>References</H2>

<A NAME="1">
[1] "Emergency Destruction of Information Storing Media", M.Slusarczuk et al.,
Institute for Defense Analyses, December 1987.<P>

<A NAME="2">
[2] "A Guide to Understanding Data Remanence in Automated Information Systems",
National Computer Security Centre, September 1991.<P>

<A NAME="3">
[3] "Detection of Digital Information from Erased Magnetic Disks", Venugopal
Veeravalli, Master's thesis, Carnegie-Mellon University, 1987.<P>

<A NAME="4">
[4] "Magnetic force microscopy: General principles and application to
longitudinal recording media", D.Rugar, H.Mamin, P.Guenther, S.Lambert,
J.Stern, I.McFadyen, and T.Yogi, <I>Journal of Applied Physics</I>, <B>Vol.68,
No.3</B> (August 1990), p.1169.<P>

<A NAME="5">
[5] "Tunneling-stabilized Magnetic Force Microscopy of Bit Tracks on a Hard
Disk", Paul Rice and John Moreland, <I>IEEE Trans. on Magnetics</I>,
<B>Vol.27, No.3</B> (May 1991), p.3452.<P>

<A NAME="6">
[6] "NanoTools: The Homebrew STM Page", Jim Rice,
<A HREF="http://www.skypoint.com/members/jrice/STMWebPage.html">NanoTools: The
Homebrew STM Page</A>.<P>

<A NAME="7">
[7] "Magnetic Force Scanning Tunnelling Microscope Imaging of Overwritten
Data", Romel Gomez, Amr Adly, Isaak Mayergoyz, and Edward Burke, <I>IEEE
Trans. on Magnetics</I>, <B>Vol.28, No.5</B> (September 1992), p.3141.<P>

<A NAME="8">
[8] "Comparison of Magnetic Fields of Thin-Film Heads and Their Corresponding
Patterns Using Magnetic Force Microscopy", Paul Rice, Bill Hallett, and John
Moreland, <I>IEEE Trans. on Magnetics</I>, <B>Vol.30, No.6</B> (November 1994),
p.4248.<P>

<A NAME="9">
[9] "Computation of Magnetic Fields in Hysteretic Media", Amr Adly, Isaak
Mayergoyz, and Edward Burke, <I>IEEE Trans. on Magnetics</I>, <B>Vol.29,
No.6</B> (November 1993), p.2380.<P>

<A NAME="10">
[10] "Magnetic Force Microscopy Study of Edge Overwrite Characteristics in Thin
Film Media", Jian-Gang Zhu, Yansheng Luo, and Juren Ding, <I>IEEE Trans. on
Magnetics</I>, <B>Vol.30, No.6</B> (November 1994), p.4242.<P>

<A NAME="11">
[11] "Microscopic Investigations of Overwritten Data", Romel Gomez, Edward
Burke, Amr Adly, Isaak Mayergoyz, and J.Gorczyca, <I>Journal of Applied
Physics</I>, <B>Vol.73, No.10</B> (May 1993), p.6001.<P>

<A NAME="12">
[12] "Relationship between Overwrite and Transition Shift in Perpendicular
Magnetic Recording", Hiroaki Muraoka, Satoshi Ohki, and Yoshihisa Nakamura,
<I>IEEE Trans. on Magnetics</I>, <B>Vol.30, No.6</B> (November 1994),
p.4272.<P>

<A NAME="13">
[13] "Effects of Current and Frequency on Write, Read, and Erase Widths for
Thin-Film Inductive and Magnetoresistive Heads", Tsann Lin, Jodie Christner,
Terry Mitchell, Jing-Sheng Gau, and Peter George, <I>IEEE Trans. on
Magnetics</I>, <B>Vol.25, No.1</B> (January 1989), p.710.<P>

<A NAME="14">
[14] "PRML Read Channels: Bringing Higher Densities and Performance to
New-Generation Hard Drives", Quantum Corporation, 1995.<P>

<A NAME="15">
[15] "Density and Phase Dependence of Edge Erase Band in MR/Thin Film Head
Recording", Yansheng Luo, Terence Lam, and Jian-Gang Zhu, <I>IEEE Trans. on
Magnetics</I>, <B>Vol.31, No.6</B> (November 1995), p.3105.<P>

<A NAME="16">
[16] "A Guide to Understanding Data Remanence in Automated Information
Systems", National Computer Security Centre, September 1991.<P>

<A NAME="17">
[17] "Time-dependent Magnetic Phenomena and Particle-size Effects in Recording
Media", <I>IEEE Trans. on Magnetics</I>, <B>Vol.26, No.1</B> (January 1990),
p.193.<P>

<A NAME="18">
[18] "The Data Dilemma", Charles Preston, <I>Security Management Journal</I>,
February 1995.<P>

<A NAME="19">
[19] "Magnetic Tape Degausser", NSA/CSS Specification L14-4-A, 31 October
1985.<P>

<A NAME="20">
[20] "How many times erased does DoD want?", David Hayes, posting to the
comp.periphs.scsi newsgroup, 24 July 1991, message-ID
1991Jul24.050701.16005@sulaco.lonestar.org.<P>

<A NAME="21">
[21] "The Changing Nature of Disk Controllers", Andrew Hospodor and Albert
Hoagland, <I>Proceedings of the IEEE</I>, <B>Vol.81, No.4</B> (April 1993),
p.586.<P>

<A NAME="22">
[22] "Annealing Study of the Erasability of High Energy Tapes", L.Lekawat,
G.Spratt, and M.Kryder, <I>IEEE Trans. on Magnetics</I>, <B>Vol.29, No.6</B>
(November 1993), p.3628.<P>

<A NAME="23">
[23] "The Effect of Aging on Erasure in Particulate Disk Media", K.Mountfield
and M.Kryder, <I>IEEE Trans. on Magnetics</I>, <B>Vol.25, No.5</B> (September
1989), p.3638.<P>

<A NAME="24">
[24] "Overwrite Temperature Dependence for Magnetic Recording", Takayuki
Takeda, Katsumichi Tagami, and Takaaki Watanabe, <I>Journal of Applied
Physics</I>, <B>Vol.63, No.8</B> (April 1988), p.3438.<P>

<A NAME="25">
[25] Conner 3.5" hard drive data sheets, 1994, 1995.<P>

<A NAME="26">
[26] "Technology and Time-to-Market: The Two Go Hand-in-Hand", Quantum
Corporation, 1995.<P>

<A NAME="27">
[27] "Basic Flaws in Internet Security and Commerce", Paul Gauthier, posting to
the comp.security.unix newsgroup, 9 October 1995, message-ID
gauthier.813274073@espresso.cs.berkeley.edu.<P>

<A NAME="28">
[28] "cryptlib Free Encryption Library", Peter Gutmann,
<A HREF="http://www.cs.auckland.ac.nz/~pgut001/cryptlib.html">
cryptlib</A>.<P>

<HR>
<ADDRESS>
Secure Deletion of Data from Magnetic and Solid-State Memory / Peter Gutmann /
pgut001@cs.auckland.ac.nz
</ADDRESS>
</BODY>
</HTML>