add directory study
533
study/sabre/os/files/MemoryManagement/LEA.html
Normal file
@@ -0,0 +1,533 @@
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html> <head> <title>A Memory Allocator</title> </head>
<body bgcolor="#ffffee" vlink="#0000aa" link="#cc0000">

<h1>A Memory Allocator</h1>

<p> by <a href="http://g.oswego.edu">Doug Lea</a>
<p>
[A German adaptation and translation of this article appears
in <b>unix/mail</b> December, 1996.]

<h2>Introduction</h2>
<p>
Memory allocators form interesting case studies in the engineering
of infrastructure software. I started writing one in 1987, and
have maintained and evolved it (with the help of many volunteer
contributors) ever since. This allocator provides implementations
of the standard C routines <code>malloc()</code>,
<code>free()</code>, and <code>realloc()</code>, as well as a few
auxiliary utility routines. The allocator has never been given a
specific name. Most people just call it <em>Doug Lea's
Malloc</em>, or <em>dlmalloc</em> for short.

<p>
The code for this allocator
has been placed in the public domain (available from
<a href="ftp://g.oswego.edu/pub/misc/malloc.c">
ftp://g.oswego.edu/pub/misc/malloc.c</a>), and is apparently
widely used: It serves as the default native version of malloc in
some versions of Linux; it is compiled into several commonly
available software packages (overriding the native malloc), and
has been used in various PC environments as well as in embedded
systems, and surely many other places I don't even know about.

<p>
I wrote the first version of the allocator after writing some C++
programs that almost exclusively relied on allocating dynamic
memory. I found that they ran much more slowly and/or with much
more total memory consumption than I expected them to. This was
due to characteristics of the memory allocators on the systems I
was running on (mainly the then-current versions of SunOS and BSD).
To counter this, at first I wrote a number of special-purpose
allocators in C++, normally by overloading <code>operator
new</code> for various classes. Some of these are described in a
paper on C++ allocation techniques that was adapted into the 1989
<em>C++ Report</em> article <a
href="ftp://g.oswego.edu/pub/papers/C++Report89.txt"> <em>Some
storage allocation techniques for container classes</em></a>.

<p>
However, I soon realized that building a special allocator for
each new class that tended to be dynamically allocated and heavily
used was not a good strategy when building the kinds of
general-purpose programming support classes I was writing at the
time. (From 1986 to 1991, I was the primary author of <A
HREF="http://g.oswego.edu/dl/libg++paper/libg++/libg++.html">
libg++ </A>, the GNU C++ library.) A broader solution was needed --
to write an allocator that was good enough under normal C++ and C
loads so that programmers would not be tempted to write
special-purpose allocators except under very special conditions.
<p>
This article presents a description of some of the main design
goals, algorithms, and implementation considerations for this
allocator. More detailed documentation can be found with the code
distribution.


<h2>Goals</h2>

A good memory allocator needs to balance a number of goals:

<dl>
<dt>Maximizing Compatibility
<dd>An allocator should be plug-compatible with others; in particular
it should obey ANSI/POSIX conventions.

<dt> Maximizing Portability
<dd> Reliance on as few system-dependent features (such as system calls)
as possible, while still providing optional support for other useful
features found only on some systems; conformance
to all known system constraints on alignment and addressing rules.

<dt> Minimizing Space
<dd> The allocator should not waste space: It should obtain as little
memory from the system as possible, and should maintain memory in ways
that minimize <em>fragmentation</em> -- ``holes'' in contiguous chunks
of memory that are not used by the program.

<dt> Minimizing Time
<dd> The <code>malloc()</code>, <code>free()</code> and <code>realloc()</code>
routines should be as fast as possible in the average case.

<dt> Maximizing Tunability
<dd> Optional features and behavior should be controllable by users
either statically (via <code>#define</code> and the like) or
dynamically (via control commands such as <code>mallopt</code>).

<dt> Maximizing Locality
<dd> Allocating chunks of memory that are typically
used together near each other. This helps minimize page and cache misses
during program execution.

<dt> Maximizing Error Detection
<dd> It does not seem possible for a general-purpose allocator to
also serve as a general-purpose memory error testing tool
such as <em>Purify</em>. However,
allocators should provide some means for detecting corruption due
to overwriting memory, multiple frees, and so on.

<dt>Minimizing Anomalies
<dd>An allocator configured using default settings should perform well
across a wide range of real loads that depend heavily on
dynamic allocation -- windowing toolkits, GUI applications, compilers,
interpreters, development tools, network (packet)-intensive programs,
graphics-intensive packages, web browsers,
string-processing applications, and so on.
</dl>

<p>
Paul Wilson and colleagues have written an excellent survey
paper on allocation techniques that discusses some of these goals
in more detail. See Paul R. Wilson, Mark S. Johnstone, Michael
Neely, and David Boles, ``Dynamic Storage Allocation: A Survey
and Critical Review'' in <em>International Workshop on Memory
Management</em>, September 1995 (also
available via <a href=
"ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps"> ftp</a>).
(Note, however, that the version of my allocator they describe is
<em>not</em> the most current one.)
<p>
As they discuss,
minimizing space by minimizing wastage (generally due to
fragmentation) must be the primary goal in any allocator.

<p>
For an extreme example, among the fastest possible versions of
<code>malloc()</code> is one that always allocates the next
sequential memory location available on the system, and the
corresponding fastest version of <code>free()</code> is a no-op.
However, such an implementation is hardly ever acceptable: it will
cause a program to run out of memory quickly since it never
reclaims unused space. Wastages seen in some allocators used in
practice can be almost this extreme under some loads. As Wilson
also notes, wastage can be measured monetarily: Considered
globally, poor allocation schemes cost people perhaps even
billions of dollars in memory chips.
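<p>
To make the extreme case concrete, here is a minimal sketch of such a
``bump'' allocator. It is an illustration only, not code from dlmalloc:
the names <code>bump_malloc</code> and <code>bump_free</code>, the fixed
4096-byte arena standing in for system memory, and the 8-byte rounding
are all assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical fixed arena standing in for "the next sequential memory
   location available on the system". */
#define ARENA_SIZE 4096
static union { unsigned char bytes[ARENA_SIZE]; double align; } arena;
static size_t arena_used = 0;

/* Hand out the next sequential bytes, rounded up to a multiple of 8. */
void *bump_malloc(size_t n) {
    size_t padded = (n + 7u) & ~(size_t)7u;
    if (padded < n || padded > ARENA_SIZE - arena_used)
        return NULL;                  /* overflow, or arena exhausted */
    void *p = arena.bytes + arena_used;
    arena_used += padded;
    return p;
}

/* The "fastest possible" free: a no-op, so space is never reclaimed. */
void bump_free(void *p) { (void)p; }
```

<p>
Both routines run in a handful of instructions, but every byte ever
requested stays consumed, which is exactly the wastage described above.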

<p>
While time-space issues dominate, the set of trade-offs and compromises
is nearly endless. Here are just a few of the many examples:

<ul>
<li> Accommodating worst-case alignment requirements increases
wastage by forcing the allocator to skip over bytes in order
to align chunks.

<li> Most provisions for dynamic tunability (such as setting
a <em>debug</em> mode) can seriously impact time efficiency
by adding levels of indirection and increasing numbers of branches.

<li> Some provisions designed to catch errors limit the range of
applicability. For example, regardless of platform, the
current malloc internally handles allocation size arguments as if they
were signed 32-bit integers, and treats nonpositive arguments
as if they were requests for a size of zero. This is considered
by nearly all users as a feature rather than a bug: A negative
32-bit argument or a huge 64-bit argument is essentially always
a programming mistake. Returning a minimally-sized chunk will
help catch this error.

<li> Accommodating the oddities of other allocators to remain
plug-compatible with them can reduce flexibility and performance.
For the oddest example, some early versions of Unix allocators
allowed programmers to <code>realloc</code>
memory that had already been <code>freed</code>. Until 1993,
I allowed this for the sake of compatibility.
(However, no one at all complained when this ``feature'' was dropped.)

<li> Some (but by no means all) heuristics that improve time and/or
space for small programs cause unacceptably
worse time and/or space characteristics for larger programs that
dominate the load on typical systems these days.

</ul>

<p>
No set of compromises along these lines can be
perfect. However, over the years, the allocator has
evolved to make trade-offs that the majority of users find to
be acceptable. The driving forces that continue to impact the
evolution of this malloc include:

<ol>
<li> Empirical studies of malloc performance by others
(including the above-mentioned paper by Wilson et al., as well
as others that it in turn cites). These papers find that
versions of this malloc increasingly rank as simultaneously
among the most time- and space-efficient memory allocators
available. However, each reveals weaknesses or opportunities
for further improvements.

<li> Changes in target workloads. The nature of the kinds of
programs that are most sensitive to malloc implementations
continually changes. For perhaps the primary example, the
memory characteristics of <em>X</em> and other windowing
systems increasingly dominate.

<li> Changes in systems and processors. Implementation details
and fine-tunings that try to make code readily optimizable for
typical processors change across time. Additionally, operating
systems (including Linux and Solaris) have themselves evolved,
for example to make memory mapping an occasionally-wise choice
for system-level allocation.

<li> Suggestions, experience reports, and code from users and
contributors. The code has evolved with the help of
several regular volunteer contributors.
The majority of recent changes were instigated
by people using the version supplied in Linux, and were
implemented in large part by Wolfram Gloger for the Linux
version and then integrated by me.
</ol>


<h2>Algorithms</h2>


The two core elements of the malloc algorithm have remained
unchanged since the earliest versions:
<p>

<dl>
<dt> Boundary Tags
<dd> Chunks of memory carry around with them size information
fields both before and after the chunk. This allows for
two important capabilities:
<ul>

<li> Two bordering unused chunks can be coalesced into
one larger chunk. This minimizes the number of unusable
small chunks.

<li> All chunks can be traversed starting from any known
chunk in either a forward or backward direction.
</ul>
<p>
<img src="malloc1.gif">

<p>
The original versions implemented boundary tags exactly in
this fashion. More recent versions omit trailer
fields on chunks that are in use by the program. This
is itself a minor trade-off: The fields are not ever used
while chunks are active so need not be present. Eliminating them decreases
overhead and wastage. However,
lack of these fields weakens error detection a bit by
making it impossible to check if users mistakenly overwrite
fields that should have known values.

<dt>Binning
<dd> Available chunks are maintained in bins, grouped by size.
There are a surprisingly large number (128) of fixed-width
bins, approximately logarithmically spaced in size. Bins for
sizes less than 512 bytes each hold only exactly one size
(spaced 8 bytes apart, simplifying enforcement of 8-byte alignment).
Searches for available chunks are processed in smallest-first,
<em>best-fit</em> order. As shown by Wilson et al., best-fit
schemes (of various kinds and approximations) tend to produce
the least fragmentation on real loads
compared to other general approaches such as first-fit.
<p>
<img src="malloc2.gif">
<p>

Until the versions released in 1995, chunks were left unsorted
within bins, so that the best-fit strategy was only approximate.
More recent versions instead sort chunks by size within bins, with
ties broken by an oldest-first rule. (This was done after finding that
the minor time investment was worth it to avoid observed bad cases.)


</dl>
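<p>
The boundary-tag idea can be sketched in a few lines of C. This is a
simplified illustration of the original scheme (a size field at both
ends of every chunk), not the allocator's actual layout; the names
<code>set_chunk</code>, <code>next_chunk</code>, <code>prev_chunk</code>
and <code>free_and_coalesce</code>, and the use of the low bit of the
leading tag as an ``in use'' flag, are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Each chunk: [size|inuse header][payload ...][size footer].
   Sizes are multiples of 8, so bit 0 of the header can mark "in use". */
typedef size_t tag_t;
#define INUSE ((tag_t)1)

tag_t read_tag(const unsigned char *p) { tag_t t; memcpy(&t, p, sizeof t); return t; }
void write_tag(unsigned char *p, tag_t t) { memcpy(p, &t, sizeof t); }

size_t chunk_size(const unsigned char *c) { return read_tag(c) & ~INUSE; }
int chunk_inuse(const unsigned char *c) { return (int)(read_tag(c) & INUSE); }

/* Write matching header and footer tags for a chunk of `size` bytes. */
void set_chunk(unsigned char *c, size_t size, int inuse) {
    write_tag(c, size | (inuse ? INUSE : 0));
    write_tag(c + size - sizeof(tag_t), size);
}

/* Forward traversal uses this chunk's header... */
unsigned char *next_chunk(unsigned char *c) { return c + chunk_size(c); }
/* ...backward traversal uses the previous chunk's footer. */
unsigned char *prev_chunk(unsigned char *c) { return c - read_tag(c - sizeof(tag_t)); }

/* Mark a chunk free and merge it with free neighbors; returns the merged
   chunk. `lo` and `hi` bound the region so we never step outside it. */
unsigned char *free_and_coalesce(unsigned char *c, unsigned char *lo, unsigned char *hi) {
    size_t size = chunk_size(c);
    unsigned char *n = next_chunk(c);
    if (n < hi && !chunk_inuse(n)) size += chunk_size(n);
    if (c > lo) {
        unsigned char *p = prev_chunk(c);
        if (!chunk_inuse(p)) { size += chunk_size(p); c = p; }
    }
    set_chunk(c, size, 0);
    return c;
}
```

<p>
Because both neighbors can be found from the tags alone, freeing a chunk
that sits between two free chunks yields one merged chunk in constant time.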
<p>

Thus, the general categorization of this algorithm is
<em>best-first with coalescing</em>: Freed chunks are
coalesced with neighboring ones, and held in bins that are
searched in size order.
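<p>
As an illustration of how a size might be mapped to one of the 128 bins,
consider the sketch below. The exact-size bins under 512 bytes follow the
description above; the spacing used for larger sizes (four bins per power
of two, with everything beyond the last bin lumped together) is purely an
assumption for the example and is not the allocator's real table.

```c
#include <stddef.h>

#define NBINS 128
#define SMALL_LIMIT 512   /* below this, one bin per 8-byte size class */

/* Map a chunk size (assumed already a multiple of 8) to a bin index. */
unsigned bin_index(size_t size) {
    if (size < SMALL_LIMIT)
        return (unsigned)(size >> 3);          /* bins 0..63, exact sizes */

    /* Hypothetical spacing for larger sizes: four bins per power of two. */
    unsigned log2 = 0;
    for (size_t s = size; s > 1; s >>= 1) log2++;
    unsigned quarter = (unsigned)((size >> (log2 - 2)) & 3u);
    unsigned idx = 64u + 4u * (log2 - 9u) + quarter;
    return idx < NBINS ? idx : NBINS - 1;      /* last bin catches the rest */
}
```

<p>
A best-fit search then starts at <code>bin_index(request)</code> and moves
toward larger bins until it finds a non-empty one.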
<p>

This approach leads to fixed
bookkeeping overhead per chunk. Because both size information
and bin links must be held in each available chunk, the
smallest allocatable chunk is 16 bytes in systems with 32-bit
pointers and 24 bytes in systems with 64-bit pointers. These
minimum sizes are larger than most people would like to see --
they can lead to significant wastage, for example in
applications allocating many tiny linked-list nodes. However,
the 16-byte minimum at least is characteristic of
<em>any</em> system requiring 8-byte alignment in which there
is <em>any</em> malloc bookkeeping overhead.
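<p>
The arithmetic behind these minimums can be worked through in code. The
sketch below assumes 4-byte size fields, two bin links per free chunk
(a doubly linked list), and a single leading size field on in-use chunks;
those assumptions reproduce the 16- and 24-byte figures quoted above but
are not taken from the source code. The function names are hypothetical.

```c
#include <stddef.h>

/* Round n up to the next multiple of 8 (the alignment quoted in the text). */
size_t align8(size_t n) { return (n + 7u) & ~(size_t)7u; }

/* Smallest free chunk: leading and trailing size fields plus two bin links
   (assuming a doubly linked bin list). */
size_t min_chunk(size_t size_field, size_t pointer) {
    return align8(2 * size_field + 2 * pointer);
}

/* Chunk size consumed by a request: payload plus one size field (in-use
   chunks carry no trailer), aligned, and never below the minimum. */
size_t chunk_for_request(size_t request, size_t size_field, size_t pointer) {
    size_t need = align8(request + size_field);
    size_t minimum = min_chunk(size_field, pointer);
    return need < minimum ? minimum : need;
}
```

<p>
Under these assumptions, <code>min_chunk(4, 4)</code> is 16 and
<code>min_chunk(4, 8)</code> is 24, and an 8-byte request on a 32-bit
system occupies a 16-byte chunk.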

<p>
This basic algorithm can be made to be very fast. Even though
it rests upon a search mechanism to find best fits, the use
of indexing techniques, exploitation of special cases, and
careful coding lead to average cases requiring only a few
dozen instructions, depending of course on the machine and the
allocation pattern.

<p>
While coalescing via boundary tags and best-fit via binning
represent the main ideas of the algorithm, further
considerations lead to a number of heuristic
improvements. They include locality preservation, wilderness
preservation, memory mapping, and caching.

<h3>Locality preservation</h3>

Chunks allocated at about the same time by a program tend to have
similar reference patterns and coexistent lifetimes. Maintaining
locality minimizes page faults and cache misses, which can have
a dramatic effect on performance on modern processors.
If locality
were the <em>only</em> goal, an allocator might always allocate
each successive chunk as close to the previous one as possible.
However, this <em>nearest-fit</em> (often approximated by <em>next-fit</em>)
strategy can lead to very bad fragmentation. In the current
version of malloc, a version of next-fit is used only in a
restricted context that maintains locality in those cases where
it conflicts the least with other goals: If a chunk of the
exact desired size is not available, the most recently split-off
space is used (and resplit) if it is big enough; otherwise
best-fit is used. This restricted use eliminates cases where
a perfectly usable existing chunk fails to be allocated, thus
eliminating at least this form of fragmentation. And, because this form
of next-fit is faster than best-fit bin-search, it speeds up
the average <code>malloc</code>.
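<p>
The order of preference just described can be written as a small decision
function. This is a sketch of the policy only; the enum and function names
are hypothetical, and the caller is assumed to have already determined
whether an exact-size chunk exists.

```c
#include <stddef.h>

typedef enum { USE_EXACT_BIN, USE_LAST_REMAINDER, USE_BEST_FIT } source_t;

/* exact_available: nonzero if a free chunk of exactly `need` bytes exists.
   last_remainder: size of the most recently split-off space (0 if none). */
source_t choose_source(size_t need, int exact_available, size_t last_remainder) {
    if (exact_available)
        return USE_EXACT_BIN;          /* never pass over a perfect fit */
    if (last_remainder >= need)
        return USE_LAST_REMAINDER;     /* restricted next-fit: keeps locality */
    return USE_BEST_FIT;               /* fall back to the bin search */
}
```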

<h3>Wilderness Preservation</h3>

The ``wilderness'' (so named by Kiem-Phong Vo) chunk represents
the space bordering the topmost address allocated from the
system. Because it is at the border, it is the only chunk that
can be arbitrarily extended
(via <code>sbrk</code> in Unix) to be bigger than it is (unless
of course <code>sbrk</code> fails because all memory has been
exhausted).
<p>

One way to deal with the wilderness chunk is to
handle it about the same way as any other chunk. (This
technique was used in most versions of this malloc until 1994).
While this simplifies and speeds up implementation, without care
it can lead to some very bad worst-case space characteristics:
Among other problems, if the wilderness chunk is used when
another available chunk exists, you increase the chances that a
later request will cause an otherwise preventable
<code>sbrk</code>.

<p>
A better strategy is currently used: treat the wilderness
chunk as ``bigger'' than all others, since it can be made so
(up to system limitations) and use it as such in a best-first
scan. This results in the wilderness chunk always being used
only if no other chunk exists, further avoiding preventable
fragmentation.
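<p>
A minimal sketch of this rule follows, using a plain array of free chunk
sizes in place of the real bins. The function name and the
<code>USE_WILDERNESS</code> sentinel are hypothetical; the point is only
that the wilderness is chosen last, as if it were larger than every
ordinary chunk.

```c
#include <stddef.h>

#define USE_WILDERNESS (-1)

/* Return the index of the smallest free chunk that fits `need`, or
   USE_WILDERNESS when no ordinary chunk is large enough. */
int pick_chunk(const size_t *free_sizes, int count, size_t need) {
    int best = USE_WILDERNESS;
    for (int i = 0; i < count; i++) {
        if (free_sizes[i] < need)
            continue;                              /* too small */
        if (best == USE_WILDERNESS || free_sizes[i] < free_sizes[best])
            best = i;                              /* tighter fit found */
    }
    return best;
}
```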

<h3>Memory Mapping</h3>

<p>
In addition to extending general-purpose allocation regions
via <code>sbrk</code>, most versions of Unix support system
calls such as <code>mmap</code> that allocate a separate
non-contiguous region of memory for use by a program. This
provides a second option within <code>malloc</code> for
satisfying a memory request. Requesting and returning a
<code>mmap</code>ed chunk can further reduce downstream
fragmentation, since a released memory map does not create a
``hole'' that would need to be managed. However, because of
built-in limitations and overheads associated with
<code>mmap</code>, it is only worth doing this in very
restricted situations. For example, in all current systems,
mapped regions must be page-aligned. Also, invoking
<code>mmap</code> and <code>munmap</code> is much slower than
carving out an existing chunk of memory. For these reasons,
the current version of malloc relies on <code>mmap</code> only
if (1) the request is greater than a (dynamically adjustable)
threshold size (currently by default 1MB) and (2) the space
requested is not already available in the existing arena so
would have to be obtained via <code>sbrk</code>.
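<p>
The two conditions can be expressed as a simple predicate. This is a
sketch under stated assumptions: the function name is hypothetical, and
``available in the existing arena'' is approximated here by the size of
the largest free chunk the caller already knows about.

```c
#include <stddef.h>

#define MMAP_THRESHOLD_DEFAULT ((size_t)1024 * 1024)   /* 1MB, per the text */

/* Nonzero when a request should be served by a separate mapping. */
int should_mmap(size_t request, size_t threshold, size_t largest_free_in_arena) {
    int big_enough = request > threshold;                       /* condition (1) */
    int arena_cannot_serve = largest_free_in_arena < request;   /* condition (2) */
    return big_enough && arena_cannot_serve;
}
```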

<p>
In part because <code>mmap</code> is not always applicable in most
programs, the current version of malloc also supports
<em>trimming</em> of the main arena, which achieves one of the effects
of memory mapping -- releasing unused space back to the system. When
long-lived programs contain brief peaks where they allocate large
amounts of memory, followed by longer valleys where they have more
modest requirements, system performance as a whole can be improved
by releasing unused parts of the <em>wilderness</em> chunk back to
the system. (In nearly all versions of Unix, <code>sbrk</code> can
be used with negative arguments to achieve this effect.) Releasing
space allows the underlying operating system to cut down on swap
space requirements and reuse memory mapping tables. However, as with
<code>mmap</code>, the call itself can be expensive, so is only attempted
if trailing unused memory exceeds a tunable threshold.
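<p>
A rough sketch of the trimming decision is shown below. The function name,
the <code>pad</code> parameter (space deliberately kept at the top so the
next few requests do not immediately need <code>sbrk</code>), and the
rounding down to whole pages are assumptions for the example, not details
given in this article.

```c
#include <stddef.h>

/* How many bytes to hand back to the system, given the unused bytes at the
   top of the arena. Returns 0 when trimming is not worthwhile.
   page_size must be nonzero. */
size_t trim_amount(size_t top_unused, size_t trim_threshold,
                   size_t pad, size_t page_size) {
    if (top_unused <= trim_threshold || top_unused <= pad)
        return 0;                                   /* below threshold: keep it */
    return ((top_unused - pad) / page_size) * page_size;   /* whole pages only */
}
```

<p>
The returned amount would then be released with a negative
<code>sbrk</code> argument, as described above.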


<h3>Caching</h3>

<p>
In the most straightforward version of the basic algorithm,
each freed chunk is immediately coalesced with neighbors to
form the largest possible unused chunk. Similarly, chunks
are created (by splitting larger chunks) only when
explicitly requested.

<p>
Operations to split and to coalesce chunks take time. This time
overhead can sometimes be avoided by using either or both of
two <em>caching</em> strategies:

<dl>
<dt> Deferred Coalescing
<dd> Rather than coalescing freed chunks, leave them at their
current sizes in hopes that another request for the same size
will come along soon. This saves a coalesce, a later split,
and the time it would take to find a non-exactly-matching chunk
to split.

<dt> Preallocation
<dd> Rather than splitting out new chunks one-by-one, pre-split
many at once. This is normally faster than doing it one-at-a-time.
</dl>

Because the basic data structures in the allocator permit
coalescing at any time, in any of <code>malloc</code>,
<code>free</code>, or <code>realloc</code>, corresponding caching
heuristics are easy to apply.

<p>
The effectiveness of caching obviously depends on the costs of
splitting, coalescing, and searching relative to the work
needed to track cached chunks. Additionally, effectiveness
less obviously depends on the policy used in deciding when
to cache chunks versus coalesce them.

<p>
Caching can be a good idea in programs that continuously
allocate and release chunks of only a few sizes.
For example, if you write a program that
allocates and frees many tree nodes, you might decide that it is
worth it to cache some nodes, assuming you know of a fast way
to do this. However, without knowledge of the program,
<code>malloc</code> cannot know whether it would be a good
idea to coalesce cached small chunks in order to satisfy a
larger request, or whether that larger request should be taken
from somewhere else. And it is difficult for the allocator to
make more informed guesses about this matter. For example, it
is just as costly for an allocator to determine how much total
contiguous space would be gained by coalescing chunks as it
would be to just coalesce them and then resplit them.

<p>
Previous versions of the allocator used a few
search-ordering heuristics that made adequate guesses about
caching, although with occasionally bad worst-case
results. But across time, these heuristics appear to be
decreasingly effective under real loads. This is probably because
actual programs that rely heavily on malloc increasingly tend
to use a larger variety of chunk sizes. For example, in C++
programs, this probably corresponds to a trend for programs to
use an increasing number of classes. Different classes tend to
have different sizes.

<p>
As a consequence, the current version <em>never</em> caches
chunks. It appears to be more effective to concentrate
efforts on further reducing the costs of handling non-cached
chunks than to rely on policies and heuristics that are of
decreasing utility. However, the issue is still open for further
experimentation.


<h3>Lookasides</h3>

<p>
There remains one kind of caching that is highly desirable in
some applications but not implemented in this allocator --
lookasides for very small chunks. As mentioned above, the
basic algorithm imposes a minimum chunk size that can be
very wasteful for very small requests. For example, a linked
list on a system with 4-byte pointers might allocate nodes
holding only, say, two pointers, requiring only 8 bytes.
Since the minimum chunk size is 16 bytes, user programs
allocating only list nodes suffer 100% overhead.

<p>
Eliminating this problem while still maintaining portable
alignment would require that the allocator not impose
<em>any</em> overhead. Techniques for carrying this out
exist. For example, chunks could be checked to see if they
belong to a larger aggregated space via address
comparisons. However, doing so can impose significant costs;
in fact the cost would be unacceptable in this allocator.
Chunks are not otherwise tracked by address, so unless
arbitrarily limited, checking might lead to random searches
through memory. Additionally, support requires the adoption of
one or more policies controlling whether and how to ever
coalesce small chunks.

<p>
Such issues and limitations lead to one of the very few kinds
of situations in which programmers should routinely write their
own special purpose memory management routines (for example,
in C++, by overloading <code>operator new()</code>). Programs relying
on large but approximately known numbers of very small chunks
may find it profitable to build very simple allocators. For
example, chunks can be allocated out of a fixed array with
an embedded freelist, along with a provision to rely on
<code>malloc</code> as a backup if the array becomes exhausted.
Somewhat more flexibly, these can be based on the C or C++
versions of <em>obstack</em> available with GNU gcc and libg++.
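<p>
A minimal sketch of the fixed-array approach might look like the
following. It is one possible shape for such a special-purpose allocator,
not code from any library: the names <code>node_alloc</code> and
<code>node_free</code>, the pool size of 64, and the two-pointer node
payload are all assumptions chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define POOL_NODES 64

/* A slot is either live payload or, while free, a link to the next free slot. */
typedef union node {
    union node *next_free;
    void *payload[2];            /* e.g. a list cell holding two pointers */
} node_t;

static node_t pool[POOL_NODES];
static node_t *free_head = NULL;     /* embedded freelist of released slots */
static size_t next_unused = 0;       /* slots never handed out yet */

/* Address comparison: does p point into the fixed array? */
static int from_pool(const void *p) {
    uintptr_t a = (uintptr_t)p;
    return a >= (uintptr_t)pool && a < (uintptr_t)(pool + POOL_NODES);
}

void *node_alloc(void) {
    if (free_head) {                         /* reuse a released slot */
        node_t *n = free_head;
        free_head = n->next_free;
        return n;
    }
    if (next_unused < POOL_NODES)            /* carve a fresh slot */
        return &pool[next_unused++];
    return malloc(sizeof(node_t));           /* array exhausted: back up to malloc */
}

void node_free(void *p) {
    if (from_pool(p)) {
        node_t *n = p;
        n->next_free = free_head;            /* push onto the freelist */
        free_head = n;
    } else {
        free(p);                             /* came from the malloc backup */
    }
}
```

<p>
Nodes served from the array carry no per-chunk header at all, which is
what removes the overhead discussed above; only the overflow nodes pay
the usual <code>malloc</code> cost.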

<hr>
<address><a href="mailto:dl@gee.cs.oswego.edu">Doug Lea</a></address>
<!-- Created: Fri Oct 25 19:07:46 EDT 1996 -->
<!-- hhmts start -->
Last modified: Wed Dec 4 12:20:31 EST
<!-- hhmts end -->
</body>
</html>
13381
study/sabre/os/files/MemoryManagement/LIMEMS41.txt
Normal file
File diff suppressed because it is too large
BIN
study/sabre/os/files/MemoryManagement/SlabAllocator.pdf
Normal file
Binary file not shown.
828
study/sabre/os/files/MemoryManagement/XMS20.txt
Normal file
@@ -0,0 +1,828 @@
eXtended Memory Specification (XMS), ver 2.0


July 19, 1988


Copyright (c) 1988, Microsoft Corporation, Lotus Development
Corporation, Intel Corporation, and AST Research, Inc.

Microsoft Corporation
Box 97017
16011 NE 36th Way
Redmond, WA 98073

LOTUS (r)
INTEL (r)
MICROSOFT (r)
AST (r) Research

This specification was jointly developed by Microsoft Corporation,
Lotus Development Corporation, Intel Corporation, and AST Research,
Inc. Although it has been released into the public domain and is not
confidential or proprietary, the specification is still the copyright
and property of Microsoft Corporation, Lotus Development Corporation,
Intel Corporation, and AST Research, Inc.

Disclaimer of Warranty

MICROSOFT CORPORATION, LOTUS DEVELOPMENT CORPORATION, INTEL
CORPORATION, AND AST RESEARCH, INC., EXCLUDE ANY AND ALL IMPLIED
WARRANTIES, INCLUDING WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE. NEITHER MICROSOFT NOR LOTUS NOR INTEL NOR AST
RESEARCH MAKE ANY WARRANTY OR REPRESENTATION, EITHER EXPRESS OR
IMPLIED, WITH RESPECT TO THIS SPECIFICATION, ITS QUALITY,
PERFORMANCE, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE.
NEITHER MICROSOFT NOR LOTUS NOR INTEL NOR AST RESEARCH SHALL HAVE ANY
LIABILITY FOR SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING
OUT OF OR RESULTING FROM THE USE OR MODIFICATION OF THIS
SPECIFICATION.

This specification uses the following trademarks:

Intel is a registered trademark of Intel Corporation, Microsoft is a
registered trademark of Microsoft Corporation, Lotus is a registered
trademark of Lotus Development Corporation, and AST is a registered
trademark of AST Research, Inc.


Extended Memory Specification
=============================

The purpose of this document is to define the Extended Memory Specification
(XMS) version 2.00 for MS-DOS. XMS allows DOS programs to utilize
additional memory found in Intel's 80286 and 80386 based machines in
a consistent, machine independent manner. With some restrictions, XMS adds
almost 64K to the 640K which DOS programs can access directly. Depending on
available hardware, XMS may provide even more memory to DOS programs. XMS
also provides DOS programs with a standard method of storing data in extended
memory.

DEFINITIONS:
------------

Extended
Memory       - Memory in 80286 and 80386 based machines which is located
               above the 1MB address boundary.

High Memory
Area (HMA)   - The first 64K of extended memory. The High Memory
               Area is unique because code can be executed in it while
               in real mode. The HMA officially starts at FFFF:10h
               and ends at FFFF:FFFFh making it 64K-16 bytes in length.

Upper Memory
Blocks (UMBs)- Blocks of memory available on some 80x86 based machines
               which are located between DOS's 640K limit and the
               1MB address boundary. The number, size, and location
               of these blocks vary widely depending upon the types
               of hardware adapter cards installed in the machine.

Extended Memory
Blocks (EMBs)- Blocks of extended memory located above the HMA which
               can only be used for data storage.

A20 Line     - The 21st address line of 80x86 CPUs. Enabling the A20
               line allows access to the HMA.

XMM          - An Extended Memory Manager. A DOS device driver which
               implements XMS. XMMs are machine specific but allow
               programs to use extended memory in a machine-independent
               manner.

HIMEM.SYS    - The Extended Memory Manager currently being distributed
               by Microsoft.
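
The HMA bounds quoted in the definitions above can be checked with a little
real-mode address arithmetic. The C fragment below is an illustrative sketch
only and is not part of the specification; the function names are
hypothetical. It relies on the standard real-mode rule that a segment:offset
pair maps to the linear address segment * 16 + offset, which exceeds 1MB
only when the A20 line is enabled.

```c
#include <stdint.h>

/* Real-mode linear address: segment * 16 + offset (needs A20 above 1MB). */
uint32_t linear_address(uint16_t segment, uint16_t offset) {
    return ((uint32_t)segment << 4) + offset;
}

/* Number of bytes in the inclusive address range first..last. */
uint32_t span_bytes(uint32_t first, uint32_t last) {
    return last - first + 1;
}
```

FFFF:0010h maps to linear address 100000h (exactly 1MB) and FFFF:FFFFh maps
to 10FFEFh, so the range holds FFF0h = 65520 bytes, i.e. 64K-16.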

Helpful Diagram:

|                                                       | Top of Memory
|                                                       |
|                                                       |
|                          /\                           |
|                         /||\                          |
|                          ||                           |
|                          ||                           |
|.......................................................|
|                                                       |
|                                                       |
|            Possible Extended Memory Block             |
|                                                       |
|                                                       |
|.......................................................|
|                          ||                           |
|                          ||                           |
|                         \||/                          |
|                          \/                           |
|                                                       |
|                                                       |
|     Other EMBs could exist above 1088K (1MB+64K)      |
|                                                       |
|                                                       |
|-------------------------------------------------------| 1088K
|                                                       |
|                                                       |
|                 The High Memory Area                  |
|                                                       |
|                                                       |
|=======================================================| 1024K or 1MB
|                                                       |
|                          /\                           |
|                         /||\                          |
|                          ||                           |
|                          ||                           |
|.......................................................|
|                                                       |
|              Possible Upper Memory Block              |
|.......................................................|
|                          ||                           |
|                          ||                           |
|                         \||/                          |
|                          \/                           |
|                                                       |
|      Other UMBs could exist between 640K and 1MB      |
|                                                       |
|-------------------------------------------------------| 640K
|                                                       |
|                                                       |
|                                                       |
|              Conventional or DOS Memory               |
|                                                       |
|                                                       |
|                                                       |
|                                                       |
|                                                       |
+-------------------------------------------------------+ 0K
|
||||
DRIVER INSTALLATION:
|
||||
--------------------
|
||||
|
||||
An XMS driver is installed by including a DEVICE= statement in the
|
||||
machine's CONFIG.SYS file. It must be installed prior to any other
|
||||
devices or TSRs which use it. An optional parameter after the driver's
|
||||
name (suggested name "/HMAMIN=") indicates the minimum amount of space in
|
||||
the HMA a program can use. Programs which use less than the minimum will
|
||||
not be placed in the HMA. See "Prioritizing HMA Usage" below for more
|
||||
information. A second optional parameter (suggested name "/NUMHANDLES=")
|
||||
allows users to specify the maximum number of extended memory blocks which
|
||||
may be allocated at any time.
|
||||
|
||||
NOTE: XMS requires DOS 3.00 or above.
|
||||
|
||||
|
||||
THE PROGRAMMING API:
|
||||
--------------------
|
||||
|
||||
The XMS API Functions are accessed via the XMS driver's Control Function.
|
||||
The address of the Control Function is determined via INT 2Fh. First, a
|
||||
program should determine if an XMS driver is installed. Next, it should
|
||||
retrieve the address of the driver's Control Function. It can then use any
|
||||
of the available XMS functions. The functions are divided into several
|
||||
groups:
|
||||
|
||||
1. Driver Information Functions (0h)
|
||||
2. HMA Management Functions (1h-2h)
|
||||
3. A20 Management Functions (3h-7h)
|
||||
4. Extended Memory Management Functions (8h-Fh)
|
||||
5. Upper Memory Management Functions (10h-11h)
|
||||
|
||||
|
||||
DETERMINING IF AN XMS DRIVER IS INSTALLED:
|
||||
------------------------------------------
|
||||
|
||||
The recommended way of determining if an XMS driver is installed is to
|
||||
set AH=43h and AL=00h and then execute INT 2Fh. If an XMS driver is available,
|
||||
80h will be returned in AL.
|
||||
|
||||
Example:
|
||||
; Is an XMS driver installed?
|
||||
mov ax,4300h
|
||||
int 2Fh
|
||||
cmp al,80h
|
||||
jne NoXMSDriver
|
||||
|
||||
|
||||
CALLING THE API FUNCTIONS:
|
||||
--------------------------
|
||||
|
||||
Programs can execute INT 2Fh with AH=43h and AL=10h to obtain the address
|
||||
of the driver's control function. The address is returned in ES:BX. This
|
||||
function is called to access all of the XMS functions. It should be called
|
||||
with AH set to the number of the API function requested. The API function
|
||||
will put a success code of 0001h or 0000h in AX. If the function succeeded
|
||||
(AX=0001h), additional information may be passed back in BX and DX. If the
|
||||
function failed (AX=0000h), an error code may be returned in BL. Valid
|
||||
error codes have their high bit set. Developers should keep in mind that
|
||||
some of the XMS API functions may not be implemented by all drivers and will
|
||||
return failure in all cases.
|
||||
|
||||
Example:
|
||||
; Get the address of the driver's control function
|
||||
mov ax,4310h
|
||||
int 2Fh
|
||||
mov word ptr [XMSControl],bx ; XMSControl is a DWORD
|
||||
mov word ptr [XMSControl+2],es
|
||||
|
||||
; Get the XMS driver's version number
|
||||
mov ah,00h
|
||||
call [XMSControl] ; Get XMS Version Number
|
||||
|
||||
NOTE: Programs should make sure that at least 256 bytes of stack space
|
||||
is available before calling XMS API functions.
|
||||
|
||||
|
||||
API FUNCTION DESCRIPTIONS:
|
||||
--------------------------
|
||||
|
||||
The following XMS API functions are available:
|
||||
|
||||
0h) Get XMS Version Number
|
||||
1h) Request High Memory Area
|
||||
2h) Release High Memory Area
|
||||
3h) Global Enable A20
|
||||
4h) Global Disable A20
|
||||
5h) Local Enable A20
|
||||
6h) Local Disable A20
|
||||
7h) Query A20
|
||||
8h) Query Free Extended Memory
|
||||
9h) Allocate Extended Memory Block
|
||||
Ah) Free Extended Memory Block
|
||||
Bh) Move Extended Memory Block
|
||||
Ch) Lock Extended Memory Block
|
||||
Dh) Unlock Extended Memory Block
|
||||
Eh) Get EMB Handle Information
|
||||
Fh) Reallocate Extended Memory Block
|
||||
10h) Request Upper Memory Block
|
||||
11h) Release Upper Memory Block
|
||||
|
||||
Each is described below.
|
||||
|
||||
|
||||
Get XMS Version Number (Function 00h):
|
||||
--------------------------------------
|
||||
|
||||
ARGS: AH = 00h
|
||||
RETS: AX = XMS version number
|
||||
BX = Driver internal revision number
|
||||
DX = 0001h if the HMA exists, 0000h otherwise
|
||||
ERRS: None
|
||||
|
||||
This function returns with AX equal to a 16-bit BCD number representing
|
||||
the revision of the DOS Extended Memory Specification which the driver
|
||||
implements (e.g. AX=0235h would mean that the driver implemented XMS version
|
||||
2.35). BX is set equal to the driver's internal revision number mainly for
|
||||
debugging purposes. DX indicates the existence of the HMA (not its
|
||||
availability) and is intended mainly for installation programs.
|
||||
|
||||
NOTE: This document defines version 2.00 of the specification.
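
The BCD encoding of the version number can be unpacked as in the following C sketch. It is illustrative only; the structure and function names are hypothetical, and the value passed in is whatever Function 00h returned in AX.

```c
#include <stdint.h>

/* Decoded form of the 16-bit BCD value returned in AX by Function 00h. */
struct xms_version {
    unsigned major;   /* high byte, two BCD digits */
    unsigned minor;   /* low byte, two BCD digits  */
};

/* Each 4-bit nibble holds one decimal digit, so 0235h means 2.35. */
struct xms_version decode_xms_version(uint16_t ax)
{
    struct xms_version v;
    v.major = ((ax >> 12) & 0xF) * 10 + ((ax >> 8) & 0xF);
    v.minor = ((ax >> 4) & 0xF) * 10 + (ax & 0xF);
    return v;
}
```

A driver implementing this document's revision would report 0200h, which decodes to major 2, minor 0.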
|
||||
|
||||
|
||||
Request High Memory Area (Function 01h):
|
||||
----------------------------------------
|
||||
|
||||
ARGS: AH = 01h
|
||||
If the caller is a TSR or device driver,
|
||||
DX = Space needed in the HMA by the caller in bytes
|
||||
If the caller is an application program,
|
||||
DX = FFFFh
|
||||
RETS: AX = 0001h if the HMA is assigned to the caller, 0000h otherwise
|
||||
ERRS: BL = 80h if the function is not implemented
|
||||
BL = 81h if a VDISK device is detected
|
||||
BL = 90h if the HMA does not exist
|
||||
BL = 91h if the HMA is already in use
|
||||
BL = 92h if DX is less than the /HMAMIN= parameter
|
||||
|
||||
This function attempts to reserve the 64K-16 byte high memory area for
|
||||
the caller. If the HMA is currently unused, the caller's size parameter is
|
||||
compared to the /HMAMIN= parameter on the driver's command line. If the
|
||||
value passed by the caller is greater than or equal to the amount specified
|
||||
by the driver's parameter, the request succeeds. This provides the ability
|
||||
to ensure that programs which use the HMA efficiently have priority over
|
||||
those which do not.
|
||||
|
||||
NOTE: See the sections "Prioritizing HMA Usage" and "High Memory Area
|
||||
Restrictions" below for more information.
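
The decision described above can be modelled as in the following C sketch. It is a simplified model under stated assumptions, not driver code: the state structure and function name are hypothetical, a return value of 00h stands in for AX=0001h, and the order in which a real driver tests the conditions (or checks for VDISK) is not specified here.

```c
#include <stdint.h>

#define XMS_OK             0x00  /* model-only success marker (assumption) */
#define XMS_ERR_NO_HMA     0x90  /* the HMA does not exist                 */
#define XMS_ERR_HMA_IN_USE 0x91  /* the HMA is already in use              */
#define XMS_ERR_HMAMIN     0x92  /* DX is less than the /HMAMIN= parameter */

/* Hypothetical driver-side state for the HMA. */
struct hma_state {
    int      exists;     /* nonzero if the machine has an HMA */
    int      in_use;     /* nonzero once the HMA is assigned  */
    unsigned hmamin_kb;  /* /HMAMIN= value, in K-bytes        */
};

/* Returns the BL error code, or XMS_OK if the HMA is assigned to the
   caller. dx_bytes is the caller's DX: bytes needed, or FFFFh for an
   application program. */
uint8_t request_hma(struct hma_state *s, uint16_t dx_bytes)
{
    if (!s->exists)
        return XMS_ERR_NO_HMA;
    if (s->in_use)
        return XMS_ERR_HMA_IN_USE;
    if ((uint32_t)dx_bytes < (uint32_t)s->hmamin_kb * 1024u)
        return XMS_ERR_HMAMIN;
    s->in_use = 1;
    return XMS_OK;
}
```

Because FFFFh is larger than any /HMAMIN= value expressed in bytes, an application passing FFFFh is never rejected with error 92h.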
|
||||
|
||||
|
||||
Release High Memory Area (Function 02h):
|
||||
----------------------------------------
|
||||
|
||||
ARGS: AH = 02h
|
||||
RETS: AX = 0001h if the HMA is successfully released, 0000h otherwise
|
||||
ERRS: BL = 80h if the function is not implemented
|
||||
BL = 81h if a VDISK device is detected
|
||||
BL = 90h if the HMA does not exist
|
||||
BL = 93h if the HMA was not allocated
|
||||
|
||||
This function releases the high memory area and allows other programs to
|
||||
use it. Programs which allocate the HMA must release it before exiting.
|
||||
When the HMA has been released, any code or data stored in it becomes invalid
|
||||
and should not be accessed.
|
||||
|
||||
|
||||
Global Enable A20 (Function 03h):
|
||||
---------------------------------
|
||||
|
||||
ARGS: AH = 03h
|
||||
RETS: AX = 0001h if the A20 line is enabled, 0000h otherwise
|
||||
ERRS: BL = 80h if the function is not implemented
|
||||
BL = 81h if a VDISK device is detected
|
||||
BL = 82h if an A20 error occurs
|
||||
|
||||
This function attempts to enable the A20 line. It should only be used
|
||||
by programs which have control of the HMA. The A20 line should be turned
|
||||
off via Function 04h (Global Disable A20) before a program releases control
|
||||
of the system.
|
||||
|
||||
NOTE: On many machines, toggling the A20 line is a relatively slow
|
||||
operation.
|
||||
|
||||
|
||||
Global Disable A20 (Function 04h):
|
||||
----------------------------------
|
||||
|
||||
ARGS: AH = 04h
|
||||
RETS: AX = 0001h if the A20 line is disabled, 0000h otherwise
|
||||
ERRS: BL = 80h if the function is not implemented
|
||||
BL = 81h if a VDISK device is detected
|
||||
BL = 82h if an A20 error occurs
|
||||
BL = 94h if the A20 line is still enabled
|
||||
|
||||
This function attempts to disable the A20 line. It should only be used
|
||||
by programs which have control of the HMA. The A20 line should be disabled
|
||||
before a program releases control of the system.
|
||||
|
||||
NOTE: On many machines, toggling the A20 line is a relatively slow
|
||||
operation.
|
||||
|
||||
|
||||
Local Enable A20 (Function 05h):
|
||||
--------------------------------
|
||||
|
||||
ARGS: AH = 05h
|
||||
RETS: AX = 0001h if the A20 line is enabled, 0000h otherwise
|
||||
ERRS: BL = 80h if the function is not implemented
|
||||
BL = 81h if a VDISK device is detected
|
||||
BL = 82h if an A20 error occurs
|
||||
|
||||
This function attempts to enable the A20 line. It should only be used
|
||||
by programs which need direct access to extended memory. Programs which use
|
||||
this function should call Function 06h (Local Disable A20) before releasing
|
||||
control of the system.
|
||||
|
||||
NOTE: On many machines, toggling the A20 line is a relatively slow
|
||||
operation.
|
||||
|
||||
|
||||
Local Disable A20 (Function 06h):
|
||||
---------------------------------
|
||||
|
||||
ARGS: AH = 06h
|
||||
RETS: AX = 0001h if the function succeeds, 0000h otherwise
|
||||
ERRS: BL = 80h if the function is not implemented
|
||||
BL = 81h if a VDISK device is detected
|
||||
BL = 82h if an A20 error occurs
|
||||
BL = 94h if the A20 line is still enabled
|
||||
|
||||
This function cancels a previous call to Function 05h (Local Enable
|
||||
A20). It should only be used by programs which need direct access to
|
||||
extended memory. Previous calls to Function 05h must be canceled before
|
||||
releasing control of the system.
|
||||
|
||||
NOTE: On many machines, toggling the A20 line is a relatively slow
|
||||
operation.
|
||||
|
||||
|
||||
Query A20 (Function 07h):
|
||||
-------------------------
|
||||
|
||||
ARGS: AH = 07h
|
||||
RETS: AX = 0001h if the A20 line is physically enabled, 0000h otherwise
|
||||
ERRS: BL = 00h if the function succeeds
|
||||
BL = 80h if the function is not implemented
|
||||
BL = 81h if a VDISK device is detected
|
||||
|
||||
This function checks to see if the A20 line is physically enabled. It
|
||||
does this in a hardware independent manner by seeing if "memory wrap" occurs.
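
"Memory wrap" can be pictured with the following C sketch. It only models how addresses behave when the 21st address line is forced low; how a driver actually performs the test is not specified in this document, and the function names are hypothetical.

```c
#include <stdint.h>

/* Address seen by memory for a real-mode segment:offset pair. With A20
   disabled, bit 20 is forced to zero, so addresses at or above 1MB wrap
   around to the bottom of memory. */
uint32_t physical_address(uint16_t segment, uint16_t offset, int a20_enabled)
{
    uint32_t linear = ((uint32_t)segment << 4) + offset;
    return a20_enabled ? linear : (linear & 0xFFFFFu);
}

/* Nonzero if FFFF:0010h (the first HMA byte) aliases 0000:0000h,
   which is the situation referred to as memory wrap. */
int memory_wraps(int a20_enabled)
{
    return physical_address(0xFFFF, 0x0010, a20_enabled)
        == physical_address(0x0000, 0x0000, a20_enabled);
}
```

A check of this kind needs no knowledge of how a particular machine switches A20, which is what makes it hardware independent.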
|
||||
|
||||
|
||||
Query Free Extended Memory (Function 08h):
|
||||
------------------------------------------
|
||||
|
||||
ARGS: AH = 08h
|
||||
RETS: AX = Size of the largest free extended memory block in K-bytes
|
||||
DX = Total amount of free extended memory in K-bytes
|
||||
ERRS: BL = 80h if the function is not implemented
|
||||
BL = 81h if a VDISK device is detected
|
||||
BL = A0h if all extended memory is allocated
|
||||
|
||||
This function returns the size of the largest available extended memory
|
||||
block in the system in AX, and the total amount of free extended memory in DX.
|
||||
|
||||
NOTE: The 64K HMA is not included in the returned value even if it is
|
||||
not in use.
|
||||
|
||||
|
||||
Allocate Extended Memory Block (Function 09h):
|
||||
----------------------------------------------
|
||||
|
||||
ARGS: AH = 09h
|
||||
DX = Amount of extended memory being requested in K-bytes
|
||||
RETS: AX = 0001h if the block is allocated, 0000h otherwise
|
||||
DX = 16-bit handle to the allocated block
|
||||
ERRS: BL = 80h if the function is not implemented
|
||||
BL = 81h if a VDISK device is detected
|
||||
BL = A0h if all available extended memory is allocated
|
||||
BL = A1h if all available extended memory handles are in use
|
||||
|
||||
This function attempts to allocate a block of the given size out of the
|
||||
pool of free extended memory. If a block is available, it is reserved
|
||||
for the caller and a 16-bit handle to that block is returned. The handle
|
||||
should be used in all subsequent extended memory calls. If no memory was
|
||||
allocated, the returned handle is null.
|
||||
|
||||
NOTE: Extended memory handles are scarce resources. Programs should
|
||||
try to allocate as few as possible at any one time. When all
|
||||
of a driver's handles are in use, any free extended memory is
|
||||
unavailable.
|
||||
|
||||
|
||||
Free Extended Memory Block (Function 0Ah):
|
||||
------------------------------------------
|
||||
|
||||
ARGS: AH = 0Ah
|
||||
DX = Handle to the allocated block which should be freed
|
||||
RETS: AX = 0001h if the block is successfully freed, 0000h otherwise
|
||||
ERRS: BL = 80h if the function is not implemented
|
||||
BL = 81h if a VDISK device is detected
|
||||
BL = A2h if the handle is invalid
|
||||
BL = ABh if the handle is locked
|
||||
|
||||
This function frees a block of extended memory which was previously
|
||||
allocated using Function 09h (Allocate Extended Memory Block). Programs
|
||||
which allocate extended memory should free their memory blocks before
|
||||
exiting. When an extended memory buffer is freed, its handle and all data
|
||||
stored in it become invalid and should not be accessed.
|
||||
|
||||
|
||||
Move Extended Memory Block (Function 0Bh):
|
||||
------------------------------------------
|
||||
|
||||
ARGS: AH = 0Bh
|
||||
DS:SI = Pointer to an Extended Memory Move Structure (see below)
|
||||
RETS: AX = 0001h if the move is successful, 0000h otherwise
|
||||
ERRS: BL = 80h if the function is not implemented
|
||||
BL = 81h if a VDISK device is detected
|
||||
BL = 82h if an A20 error occurs
|
||||
BL = A3h if the SourceHandle is invalid
|
||||
BL = A4h if the SourceOffset is invalid
|
||||
BL = A5h if the DestHandle is invalid
|
||||
BL = A6h if the DestOffset is invalid
|
||||
BL = A7h if the Length is invalid
|
||||
BL = A8h if the move has an invalid overlap
|
||||
BL = A9h if a parity error occurs
|
||||
|
||||
Extended Memory Move Structure Definition:
|
||||
|
||||
ExtMemMoveStruct struc
|
||||
Length dd ? ; 32-bit number of bytes to transfer
|
||||
SourceHandle dw ? ; Handle of source block
|
||||
SourceOffset dd ? ; 32-bit offset into source
|
||||
DestHandle dw ? ; Handle of destination block
|
||||
DestOffset dd ? ; 32-bit offset into destination block
|
||||
ExtMemMoveStruct ends
|
||||
|
||||
This function attempts to transfer a block of data from one location to
|
||||
another. It is primarily intended for moving blocks of data between
|
||||
conventional memory and extended memory; however, it can also be used for moving
|
||||
blocks within conventional memory and within extended memory.
|
||||
|
||||
NOTE: If SourceHandle is set to 0000h, the SourceOffset is interpreted
|
||||
as a standard segment:offset pair which refers to memory that is
|
||||
directly accessible by the processor. The segment:offset pair
|
||||
is stored in Intel DWORD notation. The same is true for DestHandle
|
||||
and DestOffset.
|
||||
|
||||
SourceHandle and DestHandle do not have to refer to locked memory
|
||||
blocks.
|
||||
|
||||
Length must be even. Although not required, WORD-aligned moves
|
||||
can be significantly faster on most machines. DWORD-aligned moves
|
||||
can be even faster on 80386 machines.
|
||||
|
||||
If the source and destination blocks overlap, only forward moves
|
||||
(i.e. where the source base is less than the destination base) are
|
||||
guaranteed to work properly.
|
||||
|
||||
Programs should not enable the A20 line before calling this
|
||||
function. The state of the A20 line is preserved.
|
||||
|
||||
This function is guaranteed to provide a reasonable number of
|
||||
interrupt windows during long transfers.
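
For callers written in C, the 16-byte Extended Memory Move Structure can be built byte by byte, which avoids any dependence on compiler structure packing. The following sketch is illustrative only; all names are hypothetical, and it assumes that "Intel DWORD notation" means the offset in the low word and the segment in the high word.

```c
#include <stdint.h>

#define XMS_MOVE_STRUCT_SIZE 16   /* dd + dw + dd + dw + dd */

/* Store little-endian values, as the 80x86 does. */
static void put16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

static void put32(uint8_t *p, uint32_t v)
{
    put16(p, (uint16_t)(v & 0xFFFF));
    put16(p + 2, (uint16_t)(v >> 16));
}

/* segment:offset pair for use as an offset field when the handle is 0000h. */
uint32_t far_pointer_dword(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 16) | offset;
}

/* Length must be even; returns nonzero if this requirement is met. */
int move_length_is_even(uint32_t length)
{
    return (length & 1u) == 0;
}

/* Fill out[] in the field order of ExtMemMoveStruct. */
void build_move_struct(uint8_t out[XMS_MOVE_STRUCT_SIZE], uint32_t length,
                       uint16_t source_handle, uint32_t source_offset,
                       uint16_t dest_handle, uint32_t dest_offset)
{
    put32(out + 0, length);          /* Length       */
    put16(out + 4, source_handle);   /* SourceHandle */
    put32(out + 6, source_offset);   /* SourceOffset */
    put16(out + 10, dest_handle);    /* DestHandle   */
    put32(out + 12, dest_offset);    /* DestOffset   */
}
```

To copy from conventional memory into an EMB, the caller would pass 0000h as the source handle, far_pointer_dword(segment, offset) as the source offset, and the EMB handle with a byte offset as the destination.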
|
||||
|
||||
|
||||
Lock Extended Memory Block (Function 0Ch):
|
||||
------------------------------------------
|
||||
|
||||
ARGS: AH = 0Ch
|
||||
DX = Extended memory block handle to lock
|
||||
RETS: AX = 0001h if the block is locked, 0000h otherwise
|
||||
DX:BX = 32-bit linear address of the locked block
|
||||
ERRS: BL = 80h if the function is not implemented
|
||||
BL = 81h if a VDISK device is detected
|
||||
BL = A2h if the handle is invalid
|
||||
BL = ACh if the block's lock count overflows
|
||||
BL = ADh if the lock fails
|
||||
|
||||
This function locks an extended memory block and returns its base
|
||||
address as a 32-bit linear address. Locked memory blocks are guaranteed not
|
||||
to move. The 32-bit pointer is only valid while the block is locked.
|
||||
Locked blocks should be unlocked as soon as possible.
|
||||
|
||||
NOTE: A block does not have to be locked before using Function 0Bh (Move
|
||||
Extended Memory Block).
|
||||
|
||||
"Lock counts" are maintained for EMBs.
|
||||
|
||||
|
||||
Unlock Extended Memory Block (Function 0Dh):
|
||||
--------------------------------------------
|
||||
|
||||
ARGS: AH = 0Dh
|
||||
DX = Extended memory block handle to unlock
|
||||
RETS: AX = 0001h if the block is unlocked, 0000h otherwise
|
||||
ERRS: BL = 80h if the function is not implemented
|
||||
BL = 81h if a VDISK device is detected
|
||||
BL = A2h if the handle is invalid
|
||||
BL = AAh if the block is not locked
|
||||
|
||||
This function unlocks a locked extended memory block. Any 32-bit
|
||||
pointers into the block become invalid and should no longer be used.
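
The relationship between Functions 0Ch and 0Dh can be modelled with a per-block lock count, as in the following C sketch. It is a simplified model, not driver code: the names are hypothetical, 00h stands in for AX=0001h, and the 8-bit width of the count is an assumption based on the count being reported in BH by Function 0Eh.

```c
#include <stdint.h>

#define XMS_OK                0x00  /* model-only success marker (assumption) */
#define XMS_ERR_NOT_LOCKED    0xAA  /* the block is not locked                */
#define XMS_ERR_LOCK_OVERFLOW 0xAC  /* the block's lock count overflows       */

/* Hypothetical bookkeeping for one extended memory block. */
struct emb {
    uint32_t base;        /* 32-bit linear base address   */
    uint8_t  lock_count;  /* 8-bit count (assumption)     */
};

/* Lock: bump the count and hand back the base address split as DX:BX. */
uint8_t emb_lock(struct emb *b, uint16_t *dx, uint16_t *bx)
{
    if (b->lock_count == UINT8_MAX)
        return XMS_ERR_LOCK_OVERFLOW;
    b->lock_count++;
    *dx = (uint16_t)(b->base >> 16);      /* high word of the address */
    *bx = (uint16_t)(b->base & 0xFFFF);   /* low word of the address  */
    return XMS_OK;
}

/* Unlock: fails if the block is not currently locked. */
uint8_t emb_unlock(struct emb *b)
{
    if (b->lock_count == 0)
        return XMS_ERR_NOT_LOCKED;
    b->lock_count--;
    return XMS_OK;
}
```

In this model the block may be moved again only once its count has returned to zero, so every successful lock needs a matching unlock.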
|
||||
|
||||
|
||||
Get EMB Handle Information (Function 0Eh):
|
||||
------------------------------------------
|
||||
|
||||
ARGS: AH = 0Eh
|
||||
DX = Extended memory block handle
|
||||
RETS: AX = 0001h if the block's information is found, 0000h otherwise
|
||||
BH = The block's lock count
|
||||
BL = Number of free EMB handles in the system
|
||||
DX = The block's length in K-bytes
|
||||
ERRS: BL = 80h if the function is not implemented
|
||||
BL = 81h if a VDISK device is detected
|
||||
BL = A2h if the handle is invalid
|
||||
|
||||
This function returns additional information about an extended memory
|
||||
block to the caller.
|
||||
|
||||
NOTE: To get the block's base address, use Function 0Ch (Lock Extended
|
||||
Memory Block).
|
||||
|
||||
|
||||
Reallocate Extended Memory Block (Function 0Fh):
|
||||
------------------------------------------------
|
||||
|
||||
ARGS: AH = 0Fh
|
||||
BX = New size for the extended memory block in K-bytes
|
||||
DX = Unlocked extended memory block handle to reallocate
|
||||
RETS: AX = 0001h if the block is reallocated, 0000h otherwise
|
||||
ERRS: BL = 80h if the function is not implemented
|
||||
BL = 81h if a VDISK device is detected
|
||||
BL = A0h if all available extended memory is allocated
|
||||
BL = A1h if all available extended memory handles are in use
|
||||
BL = A2h if the handle is invalid
|
||||
BL = ABh if the block is locked
|
||||
|
||||
This function attempts to reallocate an unlocked extended memory block
|
||||
so that it becomes the newly specified size. If the new size is smaller
|
||||
than the old block's size, all data at the upper end of the old block is
|
||||
lost.
|
||||
|
||||
|
||||
Request Upper Memory Block (Function 10h):
|
||||
------------------------------------------
|
||||
|
||||
ARGS: AH = 10h
|
||||
DX = Size of requested memory block in paragraphs
|
||||
RETS: AX = 0001h if the request is granted, 0000h otherwise
|
||||
BX = Segment number of the upper memory block
|
||||
If the request is granted,
|
||||
DX = Actual size of the allocated block in paragraphs
|
||||
otherwise,
|
||||
DX = Size of the largest available UMB in paragraphs
|
||||
ERRS: BL = 80h if the function is not implemented
|
||||
BL = B0h if a smaller UMB is available
|
||||
BL = B1h if no UMBs are available
|
||||
|
||||
This function attempts to allocate an upper memory block to the caller.
|
||||
If the function fails, the size of the largest free UMB is returned in DX.
|
||||
|
||||
NOTE: By definition UMBs are located below the 1MB address boundary.
|
||||
The A20 Line does not need to be enabled before accessing an
|
||||
allocated UMB.
|
||||
|
||||
UMBs are paragraph aligned.
|
||||
|
||||
To determine the size of the largest available UMB, attempt to
|
||||
allocate one with a size of FFFFh.
|
||||
|
||||
UMBs are unaffected by EMS calls.
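
UMB sizes are expressed in paragraphs of 16 bytes, and a UMB is identified by its segment number. The following C sketch shows the conversions a caller typically needs; it is illustrative only and the function names are hypothetical.

```c
#include <stdint.h>

#define BYTES_PER_PARAGRAPH 16u

/* Size in bytes of a block of the given number of paragraphs. */
uint32_t paragraphs_to_bytes(uint16_t paragraphs)
{
    return (uint32_t)paragraphs * BYTES_PER_PARAGRAPH;
}

/* Paragraphs needed to hold the given number of bytes, rounded up.
   The caller must keep bytes at or below FFFF0h so the result fits in DX. */
uint16_t bytes_to_paragraphs(uint32_t bytes)
{
    return (uint16_t)((bytes + BYTES_PER_PARAGRAPH - 1) / BYTES_PER_PARAGRAPH);
}

/* Linear address of the first byte of a UMB, from the segment in BX. */
uint32_t umb_linear_base(uint16_t segment)
{
    return (uint32_t)segment << 4;
}
```

A request for FFFFh paragraphs asks for nearly 1MB, more than the whole region between 640K and 1MB, so it cannot be granted; this is why it serves as a query for the largest available UMB.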
|
||||
|
||||
|
||||
Release Upper Memory Block (Function 11h):
|
||||
------------------------------------------
|
||||
|
||||
ARGS: AH = 11h
|
||||
DX = Segment number of the upper memory block
|
||||
RETS: AX = 0001h if the block was released, 0000h otherwise
|
||||
ERRS: BL = 80h if the function is not implemented
|
||||
BL = B2h if the UMB segment number is invalid
|
||||
|
||||
This function frees a previously allocated upper memory block. When an
|
||||
UMB has been released, any code or data stored in it becomes invalid and
|
||||
should not be accessed.
|
||||
|
||||
|
||||
PRIORITIZING HMA USAGE:
|
||||
-----------------------
|
||||
|
||||
For DOS users to receive the maximum benefit from the High Memory Area,
|
||||
programs which use the HMA must store as much of their resident code in it as
|
||||
is possible. It is very important that developers realize that the HMA is
|
||||
allocated as a single unit.
|
||||
|
||||
For example, a TSR program which grabs the HMA and puts 10K of code into
|
||||
it may prevent a later TSR from putting 62K into the HMA. Obviously, regular
|
||||
DOS programs would have more memory available to them below the 640K line if
|
||||
the 62K TSR was moved into the HMA instead of the 10K one.
|
||||
|
||||
The first method for dealing with conflicts such as this is to require
|
||||
programs which use the HMA to provide a command line option for disabling
|
||||
this feature. It is crucial that TSRs which do not make full use of the HMA
|
||||
provide such a switch on their own command line (suggested name "/NOHMA").
|
||||
|
||||
The second method for optimizing HMA usage is through the /HMAMIN=
|
||||
parameter on the XMS device driver line. The number after the parameter
|
||||
is defined to be the minimum amount of HMA space (in K-bytes) used by any
|
||||
driver or TSR. For example, if "DEVICE=HIMEM.SYS /HMAMIN=48" is in a
|
||||
user's CONFIG.SYS file, only programs which request at least 48K would be
|
||||
allowed to allocate the HMA. This number can be adjusted either by
|
||||
installation programs or by the user. If this parameter is not
|
||||
specified, the default value of 0 is used causing the HMA to be allocated
|
||||
on a first come, first served basis.
|
||||
|
||||
Note that this problem does not impact application programs. If the HMA
|
||||
is available when an application program starts, the application is free to
|
||||
use as much or as little of the HMA as it wants. For this reason,
|
||||
applications should pass FFFFh in DX when calling Function 01h.
|
||||
|
||||
|
||||
HIGH MEMORY AREA RESTRICTIONS:
|
||||
------------------------------
|
||||
|
||||
- Far pointers to data located in the HMA cannot be passed to DOS. DOS
|
||||
normalizes any pointer which is passed into it. This will cause data
|
||||
addresses in the HMA to be invalidated.
|
||||
|
||||
- Disk I/O directly into the HMA (via DOS, INT 13h, or otherwise) is not
|
||||
recommended.
|
||||
|
||||
- Programs, especially drivers and TSRs, which use the HMA *MUST* use
|
||||
as much of it as possible. If a driver or TSR is unable to use at
|
||||
least 90% of the available HMA (typically ~58K), it must provide
|
||||
a command line switch for overriding HMA usage. This will allow
|
||||
the user to configure the machine for optimum use of the HMA.
|
||||
|
||||
- Device drivers and TSRs cannot leave the A20 line permanently turned
|
||||
on. Several applications rely on 1MB memory wrap and will overwrite the
|
||||
HMA if the A20 line is left enabled potentially causing a system crash.
|
||||
|
||||
- Interrupt vectors must not point into the HMA. This is a result of
|
||||
the previous restriction. Note that interrupt vectors can point into
|
||||
any allocated upper memory blocks however.
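
The first restriction can be illustrated with the following C sketch of pointer normalization. The exact normalization DOS performs is not described in this document; the sketch assumes the common form in which the offset is reduced to 0-15 and the remainder is added to the segment in 16-bit arithmetic. The names are hypothetical.

```c
#include <stdint.h>

/* A real-mode far pointer. */
struct far_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* Normalize so that the offset is in the range 0..15. The segment is
   only 16 bits wide, so any carry out of it is lost. */
struct far_ptr normalize(struct far_ptr p)
{
    struct far_ptr n;
    n.segment = (uint16_t)(p.segment + (p.offset >> 4));
    n.offset  = (uint16_t)(p.offset & 0x000F);
    return n;
}
```

Under this assumption FFFF:0010h, the first byte of the HMA, normalizes to 0000:0000h, so the pointer DOS ends up using refers to the bottom of conventional memory rather than to the caller's data.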
|
||||
|
||||
ERROR CODE INDEX:
|
||||
-----------------
|
||||
|
||||
If AX=0000h when a function returns and the high bit of BL is set,
|
||||
|
||||
BL=80h if the function is not implemented
|
||||
81h if a VDISK device is detected
|
||||
82h if an A20 error occurs
|
||||
8Eh if a general driver error occurs
|
||||
8Fh if an unrecoverable driver error occurs
|
||||
90h if the HMA does not exist
|
||||
91h if the HMA is already in use
|
||||
92h if DX is less than the /HMAMIN= parameter
|
||||
93h if the HMA is not allocated
|
||||
94h if the A20 line is still enabled
|
||||
A0h if all extended memory is allocated
|
||||
A1h if all available extended memory handles are in use
|
||||
A2h if the handle is invalid
|
||||
A3h if the SourceHandle is invalid
|
||||
A4h if the SourceOffset is invalid
|
||||
A5h if the DestHandle is invalid
|
||||
A6h if the DestOffset is invalid
|
||||
A7h if the Length is invalid
|
||||
A8h if the move has an invalid overlap
|
||||
A9h if a parity error occurs
|
||||
AAh if the block is not locked
|
||||
ABh if the block is locked
|
||||
ACh if the block's lock count overflows
|
||||
ADh if the lock fails
|
||||
B0h if a smaller UMB is available
|
||||
B1h if no UMBs are available
|
||||
B2h if the UMB segment number is invalid
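
The index above translates directly into a lookup table. The following C sketch is illustrative only; the function and table names are hypothetical, and the message strings are taken from the index.

```c
#include <stddef.h>
#include <stdint.h>

struct xms_error {
    uint8_t     code;
    const char *text;
};

static const struct xms_error xms_errors[] = {
    {0x80, "the function is not implemented"},
    {0x81, "a VDISK device is detected"},
    {0x82, "an A20 error occurs"},
    {0x8E, "a general driver error occurs"},
    {0x8F, "an unrecoverable driver error occurs"},
    {0x90, "the HMA does not exist"},
    {0x91, "the HMA is already in use"},
    {0x92, "DX is less than the /HMAMIN= parameter"},
    {0x93, "the HMA is not allocated"},
    {0x94, "the A20 line is still enabled"},
    {0xA0, "all extended memory is allocated"},
    {0xA1, "all available extended memory handles are in use"},
    {0xA2, "the handle is invalid"},
    {0xA3, "the SourceHandle is invalid"},
    {0xA4, "the SourceOffset is invalid"},
    {0xA5, "the DestHandle is invalid"},
    {0xA6, "the DestOffset is invalid"},
    {0xA7, "the Length is invalid"},
    {0xA8, "the move has an invalid overlap"},
    {0xA9, "a parity error occurs"},
    {0xAA, "the block is not locked"},
    {0xAB, "the block is locked"},
    {0xAC, "the block's lock count overflows"},
    {0xAD, "the lock fails"},
    {0xB0, "a smaller UMB is available"},
    {0xB1, "no UMBs are available"},
    {0xB2, "the UMB segment number is invalid"},
};

/* Returns a description of the failure, or NULL if the call did not fail
   with an error code (AX nonzero, or high bit of BL clear). */
const char *xms_error_text(uint16_t ax, uint8_t bl)
{
    size_t i;

    if (ax != 0x0000 || (bl & 0x80) == 0)
        return NULL;
    for (i = 0; i < sizeof xms_errors / sizeof xms_errors[0]; i++) {
        if (xms_errors[i].code == bl)
            return xms_errors[i].text;
    }
    return "unknown error code";
}
```

The BL test matters for functions such as Query A20 (Function 07h), where AX=0000h with BL=00h is a successful result rather than a failure.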
|
||||
|
||||
IMPLEMENTATION NOTES FOR DOS XMS DRIVERS:
|
||||
-----------------------------------------
|
||||
|
||||
- A DOS XMS driver's control function must begin with code similar to the
|
||||
following:
|
||||
|
||||
XMMControl proc far
|
||||
|
||||
jmp short XCControlEntry ; For "hookability"
|
||||
nop ; NOTE: The jump must be a short
|
||||
nop ; jump to indicate the end of
|
||||
nop ; any hook chain. The nop's
|
||||
; allow a far jump to be
|
||||
; patched in.
|
||||
XCControlEntry:
|
||||
|
||||
|
||||
- XMS drivers must preserve all registers except those containing
|
||||
returned values across any function call.
|
||||
|
||||
- XMS drivers are required to hook INT 15h and watch for calls to
|
||||
functions 87h (Block Move) and 88h (Extended Memory Available). The
|
||||
INT 15h Block Move function must be hooked so that the state of the A20
|
||||
line is preserved across the call. The INT 15h Extended Memory
|
||||
Available function must be hooked to return 0h to protect the HMA.
|
||||
|
||||
- In order to maintain compatibility with existing device drivers, DOS XMS
|
||||
drivers must not hook INT 15h until the first non-Version Number call
|
||||
to the control function is made.
|
||||
|
||||
- XMS drivers are required to check for the presence of drivers which
|
||||
use the IBM VDISK allocation scheme. Note that it is not sufficient to
|
||||
check for VDISK users only at installation time; the check must be repeated when the HMA
|
||||
is first allocated. If a VDISK user is detected, the HMA must not be
|
||||
allocated. Microsoft will publish a standard method for detecting
|
||||
drivers which use the VDISK allocation scheme.
|
||||
|
||||
- XMS drivers which have a fixed number of extended memory handles (most
|
||||
do) should implement a command line parameter for adjusting that number
|
||||
(suggested name "/NUMHANDLES=").
|
||||
|
||||
- XMS drivers should make sure that the major DOS version number is
|
||||
greater than or equal to 3 before installing themselves.
|
||||
|
||||
- UMBs cannot occupy memory addresses that can be banked by EMS 4.0.
|
||||
EMS 4.0 takes precedence over UMBs for physically addressable memory.
|
||||
|
||||
- All driver functions must be re-entrant. Care should be taken to not
|
||||
leave interrupts disabled for long periods of time.
|
||||
|
||||
- Allocation of a zero length extended memory buffer is allowed. Programs
|
||||
which hook XMS drivers may need to reserve a handle for private use via
|
||||
this method. Programs which hook an XMS driver should pass all requests
|
||||
for zero length EMBs to the next driver in the chain.
|
||||
|
||||
- Drivers should control the A20 line via an "enable count." Local
|
||||
Enable only enables the A20 line if the count is zero. It then increments
|
||||
the count. Local Disable only disables A20 if the count is one. It
|
||||
then decrements the count. Global Enable/Disable keeps a flag which
|
||||
indicates the state of A20. They use Local Enable/Disable to actually
|
||||
change the state.
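
The enable-count scheme in the last note can be sketched in C as follows. This is a simplified model under stated assumptions, not driver code: the names are hypothetical, the hardware switch is reduced to a flag, and the handling of a Local Disable with a zero count is an assumption.

```c
/* Hypothetical driver-side A20 bookkeeping. */
struct a20_state {
    unsigned enable_count;  /* outstanding Local Enable calls         */
    int      global_flag;   /* state requested by Global Enable       */
    int      line_on;       /* stands in for the physical A20 line    */
};

/* Local Enable: switch the line on only if the count is zero,
   then increment the count. */
void local_enable(struct a20_state *s)
{
    if (s->enable_count == 0)
        s->line_on = 1;
    s->enable_count++;
}

/* Local Disable: switch the line off only if the count is one, then
   decrement the count. Returns 0 if there was nothing to cancel
   (assumption). */
int local_disable(struct a20_state *s)
{
    if (s->enable_count == 0)
        return 0;
    if (s->enable_count == 1)
        s->line_on = 0;
    s->enable_count--;
    return 1;
}

/* Global Enable/Disable keep a flag and use the local calls to
   actually change the state. */
void global_enable(struct a20_state *s)
{
    if (!s->global_flag) {
        local_enable(s);
        s->global_flag = 1;
    }
}

void global_disable(struct a20_state *s)
{
    if (s->global_flag) {
        local_disable(s);
        s->global_flag = 0;
    }
}
```

In this model a Global Disable issued while a Local Enable is still outstanding leaves the line on, which corresponds to the "A20 line is still enabled" (94h) case.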
|
||||
|
||||
IMPLEMENTATION NOTES FOR HIMEM.SYS:
|
||||
-----------------------------------
|
||||
|
||||
- HIMEM.SYS currently supports true AT-compatibles, 386 AT machines, IBM
|
||||
PS/2s, AT&T 6300 Plus systems and Hewlett Packard Vectras.
|
||||
|
||||
- If HIMEM finds that it cannot properly control the A20 line or if there
|
||||
is no extended memory available when HIMEM.SYS is invoked, the driver
|
||||
does not install itself. HIMEM.SYS displays the message "High Memory
|
||||
Area Unavailable" when this situation occurs.
|
||||
|
||||
- If HIMEM finds that the A20 line is already enabled when it is invoked,
|
||||
it will NOT change the A20 line's state. The assumption is that whoever
|
||||
enabled it knew what they were doing. HIMEM.SYS displays the message "A20
|
||||
Line Permanently Enabled" when this situation occurs.
|
||||
|
||||
- HIMEM.SYS is incompatible with IBM's VDISK.SYS driver and other drivers
|
||||
which use the VDISK scheme for allocating extended memory. However,
|
||||
HIMEM does attempt to detect these drivers and will not allocate the
|
||||
HMA if one is found.
|
||||
|
||||
- HIMEM.SYS supports the optional "/HMAMIN=" parameter. The valid values
|
||||
are decimal numbers between 0 and 63.
|
||||
|
||||
- By default, HIMEM.SYS has 32 extended memory handles available for use.
|
||||
This number may be adjusted with the "/NUMHANDLES=" parameter. The
|
||||
maximum value for this parameter is 128 and the minimum is 0. Each
|
||||
handle currently requires 6 bytes of resident space.
|
||||
|
||||
|
||||
Copyright (c) 1988, Microsoft Corporation
|
||||
1079
study/sabre/os/files/MemoryManagement/XMS30.txt
Normal file
File diff suppressed because it is too large
7
study/sabre/os/files/MemoryManagement/index.html
Normal file
@@ -0,0 +1,7 @@
|
||||
<html>
|
||||
<head>
|
||||
<meta http-equiv="refresh" content="0;url=/Linux.old/sabre/os/articles">
|
||||
</head>
|
||||
<body lang="zh-CN">
|
||||
</body>
|
||||
</html>
|
||||
BIN
study/sabre/os/files/MemoryManagement/malloc1.gif
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 4.1 KiB |
BIN
study/sabre/os/files/MemoryManagement/malloc2.gif
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 3.0 KiB |