310 lines
15 KiB
HTML
310 lines
15 KiB
HTML
<html><head><title>NASM Manual</title></head>
|
|
<body><h1 align=center>The Netwide Assembler: NASM</h1>
|
|
|
|
<p align=center><a href="nasmdoc6.html">Next Chapter</a> |
|
|
<a href="nasmdoc4.html">Previous Chapter</a> |
|
|
<a href="nasmdoc0.html">Contents</a> |
|
|
<a href="nasmdoci.html">Index</a>
|
|
<h2><a name="chapter-5">Chapter 5: Assembler Directives</a></h2>
|
|
<p>NASM, though it attempts to avoid the bureaucracy of assemblers like
|
|
MASM and TASM, is nevertheless forced to support a <em>few</em> directives.
|
|
These are described in this chapter.
|
|
<p>NASM's directives come in two types: <em>user-level</em> directives and
|
|
<em>primitive</em> directives. Typically, each directive has a user-level
|
|
form and a primitive form. In almost all cases, we recommend that users use
|
|
the user-level forms of the directives, which are implemented as macros
|
|
which call the primitive forms.
|
|
<p>Primitive directives are enclosed in square brackets; user-level
|
|
directives are not.
|
|
<p>In addition to the universal directives described in this chapter, each
|
|
object file format can optionally supply extra directives in order to
|
|
control particular features of that file format. These
|
|
<em>format-specific</em> directives are documented along with the formats
|
|
that implement them, in <a href="nasmdoc6.html">chapter 6</a>.
|
|
<h3><a name="section-5.1">5.1 <code><nobr>BITS</nobr></code>: Specifying Target Processor Mode</a></h3>
|
|
<p>The <code><nobr>BITS</nobr></code> directive specifies whether NASM
|
|
should generate code designed to run on a processor operating in 16-bit
|
|
mode, or code designed to run on a processor operating in 32-bit mode. The
|
|
syntax is <code><nobr>BITS 16</nobr></code> or
|
|
<code><nobr>BITS 32</nobr></code>.
|
|
<p>In most cases, you should not need to use <code><nobr>BITS</nobr></code>
|
|
explicitly. The <code><nobr>aout</nobr></code>,
|
|
<code><nobr>coff</nobr></code>, <code><nobr>elf</nobr></code> and
|
|
<code><nobr>win32</nobr></code> object formats, which are designed for use
|
|
in 32-bit operating systems, all cause NASM to select 32-bit mode by
|
|
default. The <code><nobr>obj</nobr></code> object format allows you to
|
|
specify each segment you define as either <code><nobr>USE16</nobr></code>
|
|
or <code><nobr>USE32</nobr></code>, and NASM will set its operating mode
|
|
accordingly, so the use of the <code><nobr>BITS</nobr></code> directive is
|
|
once again unnecessary.
|
|
<p>The most likely reason for using the <code><nobr>BITS</nobr></code>
|
|
directive is to write 32-bit code in a flat binary file; this is because
|
|
the <code><nobr>bin</nobr></code> output format defaults to 16-bit mode in
|
|
anticipation of it being used most frequently to write DOS
|
|
<code><nobr>.COM</nobr></code> programs, DOS <code><nobr>.SYS</nobr></code>
|
|
device drivers and boot loader software.
|
|
<p>You do <em>not</em> need to specify <code><nobr>BITS 32</nobr></code>
|
|
merely in order to use 32-bit instructions in a 16-bit DOS program; if you
|
|
do, the assembler will generate incorrect code because it will be writing
|
|
code targeted at a 32-bit platform, to be run on a 16-bit one.
|
|
<p>When NASM is in <code><nobr>BITS 16</nobr></code> state, instructions
|
|
which use 32-bit data are prefixed with an 0x66 byte, and those referring
|
|
to 32-bit addresses have an 0x67 prefix. In
|
|
<code><nobr>BITS 32</nobr></code> state, the reverse is true: 32-bit
|
|
instructions require no prefixes, whereas instructions using 16-bit data
|
|
need an 0x66 and those working on 16-bit addresses need an 0x67.
|
|
<p>The <code><nobr>BITS</nobr></code> directive has an exactly equivalent
|
|
primitive form, <code><nobr>[BITS 16]</nobr></code> and
|
|
<code><nobr>[BITS 32]</nobr></code>. The user-level form is a macro which
|
|
has no function other than to call the primitive form.
|
|
<p>Note that the space is neccessary, <code><nobr>BITS32</nobr></code> will
|
|
<em>not</em> work!
|
|
<h4><a name="section-5.1.1">5.1.1 <code><nobr>USE16</nobr></code> & <code><nobr>USE32</nobr></code>: Aliases for BITS</a></h4>
|
|
<p>The `<code><nobr>USE16</nobr></code>' and
|
|
`<code><nobr>USE32</nobr></code>' directives can be used in place of
|
|
`<code><nobr>BITS 16</nobr></code>' and
|
|
`<code><nobr>BITS 32</nobr></code>', for compatibility with other
|
|
assemblers.
|
|
<h3><a name="section-5.2">5.2 <code><nobr>SECTION</nobr></code> or <code><nobr>SEGMENT</nobr></code>: Changing and Defining Sections</a></h3>
|
|
<p>The <code><nobr>SECTION</nobr></code> directive
|
|
(<code><nobr>SEGMENT</nobr></code> is an exactly equivalent synonym)
|
|
changes which section of the output file the code you write will be
|
|
assembled into. In some object file formats, the number and names of
|
|
sections are fixed; in others, the user may make up as many as they wish.
|
|
Hence <code><nobr>SECTION</nobr></code> may sometimes give an error
|
|
message, or may define a new section, if you try to switch to a section
|
|
that does not (yet) exist.
|
|
<p>The Unix object formats, and the <code><nobr>bin</nobr></code> object
|
|
format (but see <a href="nasmdoc6.html#section-6.1.3">section 6.1.3</a>,
|
|
all support the standardised section names <code><nobr>.text</nobr></code>,
|
|
<code><nobr>.data</nobr></code> and <code><nobr>.bss</nobr></code> for the
|
|
code, data and uninitialised-data sections. The
|
|
<code><nobr>obj</nobr></code> format, by contrast, does not recognise these
|
|
section names as being special, and indeed will strip off the leading
|
|
period of any section name that has one.
|
|
<h4><a name="section-5.2.1">5.2.1 The <code><nobr>__SECT__</nobr></code> Macro</a></h4>
|
|
<p>The <code><nobr>SECTION</nobr></code> directive is unusual in that its
|
|
user-level form functions differently from its primitive form. The
|
|
primitive form, <code><nobr>[SECTION xyz]</nobr></code>, simply switches
|
|
the current target section to the one given. The user-level form,
|
|
<code><nobr>SECTION xyz</nobr></code>, however, first defines the
|
|
single-line macro <code><nobr>__SECT__</nobr></code> to be the primitive
|
|
<code><nobr>[SECTION]</nobr></code> directive which it is about to issue,
|
|
and then issues it. So the user-level directive
|
|
<p><pre>
|
|
SECTION .text
|
|
</pre>
|
|
<p>expands to the two lines
|
|
<p><pre>
|
|
%define __SECT__ [SECTION .text]
|
|
[SECTION .text]
|
|
</pre>
|
|
<p>Users may find it useful to make use of this in their own macros. For
|
|
example, the <code><nobr>writefile</nobr></code> macro defined in
|
|
<a href="nasmdoc4.html#section-4.3.3">section 4.3.3</a> can be usefully
|
|
rewritten in the following more sophisticated form:
|
|
<p><pre>
|
|
%macro writefile 2+
|
|
|
|
[section .data]
|
|
|
|
%%str: db %2
|
|
%%endstr:
|
|
|
|
__SECT__
|
|
|
|
mov dx,%%str
|
|
mov cx,%%endstr-%%str
|
|
mov bx,%1
|
|
mov ah,0x40
|
|
int 0x21
|
|
|
|
%endmacro
|
|
</pre>
|
|
<p>This form of the macro, once passed a string to output, first switches
|
|
temporarily to the data section of the file, using the primitive form of
|
|
the <code><nobr>SECTION</nobr></code> directive so as not to modify
|
|
<code><nobr>__SECT__</nobr></code>. It then declares its string in the data
|
|
section, and then invokes <code><nobr>__SECT__</nobr></code> to switch back
|
|
to <em>whichever</em> section the user was previously working in. It thus
|
|
avoids the need, in the previous version of the macro, to include a
|
|
<code><nobr>JMP</nobr></code> instruction to jump over the data, and also
|
|
does not fail if, in a complicated <code><nobr>OBJ</nobr></code> format
|
|
module, the user could potentially be assembling the code in any of several
|
|
separate code sections.
|
|
<h3><a name="section-5.3">5.3 <code><nobr>ABSOLUTE</nobr></code>: Defining Absolute Labels</a></h3>
|
|
<p>The <code><nobr>ABSOLUTE</nobr></code> directive can be thought of as an
|
|
alternative form of <code><nobr>SECTION</nobr></code>: it causes the
|
|
subsequent code to be directed at no physical section, but at the
|
|
hypothetical section starting at the given absolute address. The only
|
|
instructions you can use in this mode are the
|
|
<code><nobr>RESB</nobr></code> family.
|
|
<p><code><nobr>ABSOLUTE</nobr></code> is used as follows:
|
|
<p><pre>
|
|
absolute 0x1A
|
|
|
|
kbuf_chr resw 1
|
|
kbuf_free resw 1
|
|
kbuf resw 16
|
|
</pre>
|
|
<p>This example describes a section of the PC BIOS data area, at segment
|
|
address 0x40: the above code defines <code><nobr>kbuf_chr</nobr></code> to
|
|
be 0x1A, <code><nobr>kbuf_free</nobr></code> to be 0x1C, and
|
|
<code><nobr>kbuf</nobr></code> to be 0x1E.
|
|
<p>The user-level form of <code><nobr>ABSOLUTE</nobr></code>, like that of
|
|
<code><nobr>SECTION</nobr></code>, redefines the
|
|
<code><nobr>__SECT__</nobr></code> macro when it is invoked.
|
|
<p><code><nobr>STRUC</nobr></code> and <code><nobr>ENDSTRUC</nobr></code>
|
|
are defined as macros which use <code><nobr>ABSOLUTE</nobr></code> (and
|
|
also <code><nobr>__SECT__</nobr></code>).
|
|
<p><code><nobr>ABSOLUTE</nobr></code> doesn't have to take an absolute
|
|
constant as an argument: it can take an expression (actually, a critical
|
|
expression: see <a href="nasmdoc3.html#section-3.8">section 3.8</a>) and it
|
|
can be a value in a segment. For example, a TSR can re-use its setup code
|
|
as run-time BSS like this:
|
|
<p><pre>
|
|
org 100h ; it's a .COM program
|
|
|
|
jmp setup ; setup code comes last
|
|
|
|
; the resident part of the TSR goes here
|
|
setup:
|
|
; now write the code that installs the TSR here
|
|
|
|
absolute setup
|
|
|
|
runtimevar1 resw 1
|
|
runtimevar2 resd 20
|
|
|
|
tsr_end:
|
|
</pre>
|
|
<p>This defines some variables `on top of' the setup code, so that after
|
|
the setup has finished running, the space it took up can be re-used as data
|
|
storage for the running TSR. The symbol `tsr_end' can be used to calculate
|
|
the total size of the part of the TSR that needs to be made resident.
|
|
<h3><a name="section-5.4">5.4 <code><nobr>EXTERN</nobr></code>: Importing Symbols from Other Modules</a></h3>
|
|
<p><code><nobr>EXTERN</nobr></code> is similar to the MASM directive
|
|
<code><nobr>EXTRN</nobr></code> and the C keyword
|
|
<code><nobr>extern</nobr></code>: it is used to declare a symbol which is
|
|
not defined anywhere in the module being assembled, but is assumed to be
|
|
defined in some other module and needs to be referred to by this one. Not
|
|
every object-file format can support external variables: the
|
|
<code><nobr>bin</nobr></code> format cannot.
|
|
<p>The <code><nobr>EXTERN</nobr></code> directive takes as many arguments
|
|
as you like. Each argument is the name of a symbol:
|
|
<p><pre>
|
|
extern _printf
|
|
extern _sscanf,_fscanf
|
|
</pre>
|
|
<p>Some object-file formats provide extra features to the
|
|
<code><nobr>EXTERN</nobr></code> directive. In all cases, the extra
|
|
features are used by suffixing a colon to the symbol name followed by
|
|
object-format specific text. For example, the <code><nobr>obj</nobr></code>
|
|
format allows you to declare that the default segment base of an external
|
|
should be the group <code><nobr>dgroup</nobr></code> by means of the
|
|
directive
|
|
<p><pre>
|
|
extern _variable:wrt dgroup
|
|
</pre>
|
|
<p>The primitive form of <code><nobr>EXTERN</nobr></code> differs from the
|
|
user-level form only in that it can take only one argument at a time: the
|
|
support for multiple arguments is implemented at the preprocessor level.
|
|
<p>You can declare the same variable as <code><nobr>EXTERN</nobr></code>
|
|
more than once: NASM will quietly ignore the second and later
|
|
redeclarations. You can't declare a variable as
|
|
<code><nobr>EXTERN</nobr></code> as well as something else, though.
|
|
<h3><a name="section-5.5">5.5 <code><nobr>GLOBAL</nobr></code>: Exporting Symbols to Other Modules</a></h3>
|
|
<p><code><nobr>GLOBAL</nobr></code> is the other end of
|
|
<code><nobr>EXTERN</nobr></code>: if one module declares a symbol as
|
|
<code><nobr>EXTERN</nobr></code> and refers to it, then in order to prevent
|
|
linker errors, some other module must actually <em>define</em> the symbol
|
|
and declare it as <code><nobr>GLOBAL</nobr></code>. Some assemblers use the
|
|
name <code><nobr>PUBLIC</nobr></code> for this purpose.
|
|
<p>The <code><nobr>GLOBAL</nobr></code> directive applying to a symbol must
|
|
appear <em>before</em> the definition of the symbol.
|
|
<p><code><nobr>GLOBAL</nobr></code> uses the same syntax as
|
|
<code><nobr>EXTERN</nobr></code>, except that it must refer to symbols
|
|
which <em>are</em> defined in the same module as the
|
|
<code><nobr>GLOBAL</nobr></code> directive. For example:
|
|
<p><pre>
|
|
global _main
|
|
_main:
|
|
; some code
|
|
</pre>
|
|
<p><code><nobr>GLOBAL</nobr></code>, like <code><nobr>EXTERN</nobr></code>,
|
|
allows object formats to define private extensions by means of a colon. The
|
|
<code><nobr>elf</nobr></code> object format, for example, lets you specify
|
|
whether global data items are functions or data:
|
|
<p><pre>
|
|
global hashlookup:function, hashtable:data
|
|
</pre>
|
|
<p>Like <code><nobr>EXTERN</nobr></code>, the primitive form of
|
|
<code><nobr>GLOBAL</nobr></code> differs from the user-level form only in
|
|
that it can take only one argument at a time.
|
|
<h3><a name="section-5.6">5.6 <code><nobr>COMMON</nobr></code>: Defining Common Data Areas</a></h3>
|
|
<p>The <code><nobr>COMMON</nobr></code> directive is used to declare
|
|
<em>common variables</em>. A common variable is much like a global variable
|
|
declared in the uninitialised data section, so that
|
|
<p><pre>
|
|
common intvar 4
|
|
</pre>
|
|
<p>is similar in function to
|
|
<p><pre>
|
|
global intvar
|
|
section .bss
|
|
|
|
intvar resd 1
|
|
</pre>
|
|
<p>The difference is that if more than one module defines the same common
|
|
variable, then at link time those variables will be <em>merged</em>, and
|
|
references to <code><nobr>intvar</nobr></code> in all modules will point at
|
|
the same piece of memory.
|
|
<p>Like <code><nobr>GLOBAL</nobr></code> and
|
|
<code><nobr>EXTERN</nobr></code>, <code><nobr>COMMON</nobr></code> supports
|
|
object-format specific extensions. For example, the
|
|
<code><nobr>obj</nobr></code> format allows common variables to be NEAR or
|
|
FAR, and the <code><nobr>elf</nobr></code> format allows you to specify the
|
|
alignment requirements of a common variable:
|
|
<p><pre>
|
|
common commvar 4:near ; works in OBJ
|
|
common intarray 100:4 ; works in ELF: 4 byte aligned
|
|
</pre>
|
|
<p>Once again, like <code><nobr>EXTERN</nobr></code> and
|
|
<code><nobr>GLOBAL</nobr></code>, the primitive form of
|
|
<code><nobr>COMMON</nobr></code> differs from the user-level form only in
|
|
that it can take only one argument at a time.
|
|
<h3><a name="section-5.7">5.7 <code><nobr>CPU</nobr></code>: Defining CPU Dependencies</a></h3>
|
|
<p>The <code><nobr>CPU</nobr></code> directive restricts assembly to those
|
|
instructions which are available on the specified CPU.
|
|
<p>Options are:
|
|
<ul>
|
|
<li><code><nobr>CPU 8086</nobr></code> Assemble only 8086 instruction set
|
|
<li><code><nobr>CPU 186</nobr></code> Assemble instructions up to the 80186
|
|
instruction set
|
|
<li><code><nobr>CPU 286</nobr></code> Assemble instructions up to the 286
|
|
instruction set
|
|
<li><code><nobr>CPU 386</nobr></code> Assemble instructions up to the 386
|
|
instruction set
|
|
<li><code><nobr>CPU 486</nobr></code> 486 instruction set
|
|
<li><code><nobr>CPU 586</nobr></code> Pentium instruction set
|
|
<li><code><nobr>CPU PENTIUM</nobr></code> Same as 586
|
|
<li><code><nobr>CPU 686</nobr></code> P6 instruction set
|
|
<li><code><nobr>CPU PPRO</nobr></code> Same as 686
|
|
<li><code><nobr>CPU P2</nobr></code> Same as 686
|
|
<li><code><nobr>CPU P3</nobr></code> Pentium III (Katmai) instruction sets
|
|
<li><code><nobr>CPU KATMAI</nobr></code> Same as P3
|
|
<li><code><nobr>CPU P4</nobr></code> Pentium 4 (Willamette) instruction set
|
|
<li><code><nobr>CPU WILLAMETTE</nobr></code> Same as P4
|
|
<li><code><nobr>CPU PRESCOTT</nobr></code> Prescott instruction set
|
|
<li><code><nobr>CPU IA64</nobr></code> IA64 CPU (in x86 mode) instruction
|
|
set
|
|
</ul>
|
|
<p>All options are case insensitive. All instructions will be selected only
|
|
if they apply to the selected CPU or lower. By default, all instructions
|
|
are available.
|
|
<p align=center><a href="nasmdoc6.html">Next Chapter</a> |
|
|
<a href="nasmdoc4.html">Previous Chapter</a> |
|
|
<a href="nasmdoc0.html">Contents</a> |
|
|
<a href="nasmdoci.html">Index</a>
|
|
</body></html>
|