412 lines
15 KiB
HTML
412 lines
15 KiB
HTML
<HTML>
|
|
<HEAD>
|
|
<TITLE>as(9)</TITLE>
|
|
</HEAD>
|
|
<BODY>
|
|
<H1>as(9)</H1>
|
|
<HR>
|
|
<PRE>
|
|
<STRONG>Command:</STRONG> <STRONG>as</STRONG> <STRONG>-</STRONG> <STRONG>assembler</STRONG>
|
|
|
|
<STRONG>AS</STRONG>--<STRONG>--ASSEMBLER</STRONG> <STRONG>[IBM]</STRONG>
|
|
|
|
|
|
|
|
This document describes the language accepted by the 80386
|
|
assembler that is part of the Amsterdam Compiler Kit. Note that only
|
|
the syntax is described, only a few 386 instructions are shown as
|
|
examples.
|
|
|
|
<STRONG>Tokens,</STRONG> <STRONG>Numbers,</STRONG> <STRONG>Character</STRONG> <STRONG>Constants,</STRONG> <STRONG>and</STRONG> <STRONG>Strings</STRONG>
|
|
|
|
The syntax of numbers is the same as in C. The constants 32, 040,
|
|
and 0x20 all represent the same number, but are written in decimal,
|
|
octal, and hex, respectively. The rules for character constants and
|
|
strings are also the same as in C. For example, 'a' is a character
|
|
constant. A typical string is "string". Expressions may be formed with
|
|
C operators, but must use [ and ] for parentheses. (Normal parentheses
|
|
are claimed by the operand syntax.)
|
|
|
|
<STRONG>Symbols</STRONG>
|
|
|
|
Symbols contain letters and digits, as well as three special
|
|
characters: dot, tilde, and underscore. The first character may not be
|
|
a digit or tilde.
|
|
|
|
The names of the 80386 registers are reserved. These are:
|
|
|
|
al, bl, cl, dl
|
|
ah, bh, ch, dh
|
|
ax, bx, cx, dx, eax, ebx, ecx, edx
|
|
si, di, bp, sp, esi, edi, ebp, esp
|
|
cs, ds, ss, es, fs, gs
|
|
|
|
The xx and exx variants of the eight general registers are treated as
|
|
synonyms by the assembler. Normally "ax" is the 16-bit low half of the
|
|
32-bit "eax" register. The assembler determines if a 16 or 32 bit
|
|
operation is meant solely by looking at the instruction or the
|
|
instruction prefixes. It is however best to use the proper registers
|
|
when writing assembly to not confuse those who read the code.
|
|
|
|
The last group of 6 segment registers are used for selector + offset
|
|
mode addressing, in which the effective address is at a given offset in
|
|
one of the 6 segments.
|
|
|
|
Names of instructions and pseudo-ops are not reserved. Alphabetic
|
|
characters in opcodes and pseudo-ops must be in lower case.
|
|
|
|
<STRONG>Separators</STRONG>
|
|
Commas, blanks, and tabs are separators and can be interspersed
|
|
freely between tokens, but not within tokens. Commas are only legal
|
|
between operands.
|
|
|
|
<STRONG>Comments</STRONG>
|
|
|
|
The comment character is '!'. The rest of the line is ignored.
|
|
|
|
<STRONG>Opcodes</STRONG>
|
|
|
|
The opcodes are listed below. Notes: (1) Different names for the
|
|
same instruction are separated by '/'. (2) Square brackets ([])
|
|
indicate that 0 or 1 of the enclosed characters can be included. (3)
|
|
Curly brackets ({}) work similarly, except that one of the enclosed
|
|
characters <EM>must</EM> be included. Thus square brackets indicate an option,
|
|
whereas curly brackets indicate that a choice must be made.
|
|
|
|
<STRONG>Data</STRONG> <STRONG>Transfer</STRONG>
|
|
|
|
mov[b] dest, source ! Move word/byte from source to dest
|
|
pop dest ! Pop stack
|
|
push source ! Push stack
|
|
xchg[b] op1, op2 ! Exchange word/byte
|
|
xlat ! Translate
|
|
o16 ! Operate on a 16 bit object instead of 32 bit
|
|
|
|
<STRONG>Input/Output</STRONG>
|
|
|
|
in[b] source ! Input from source I/O port
|
|
in[b] ! Input from DX I/O port
|
|
out[b] dest ! Output to dest I/O port
|
|
out[b] ! Output to DX I/O port
|
|
|
|
<STRONG>Address</STRONG> <STRONG>Object</STRONG>
|
|
|
|
lds reg,source ! Load reg and DS from source
|
|
les reg,source ! Load reg and ES from source
|
|
lea reg,source ! Load effect address of source to reg and DS
|
|
{cdsefg}seg ! Specify seg register for next instruction
|
|
a16 ! Use 16 bit addressing mode instead of 32 bit
|
|
|
|
<STRONG>Flag</STRONG> <STRONG>Transfer</STRONG>
|
|
|
|
lahf ! Load AH from flag register
|
|
popf ! Pop flags
|
|
pushf ! Push flags
|
|
sahf ! Store AH in flag register
|
|
|
|
<STRONG>Addition</STRONG>
|
|
|
|
aaa ! Adjust result of BCD addition
|
|
add[b] dest,source ! Add
|
|
adc[b] dest,source ! Add with carry
|
|
daa ! Decimal Adjust after addition
|
|
inc[b] dest ! Increment by 1
|
|
|
|
<STRONG>Subtraction</STRONG>
|
|
|
|
aas ! Adjust result of BCD subtraction
|
|
sub[b] dest,source ! Subtract
|
|
sbb[b] dest,source ! Subtract with borrow from dest
|
|
das ! Decimal adjust after subtraction
|
|
dec[b] dest ! Decrement by one
|
|
neg[b] dest ! Negate
|
|
cmp[b] dest,source ! Compare
|
|
|
|
<STRONG>Multiplication</STRONG>
|
|
|
|
aam ! Adjust result of BCD multiply
|
|
imul[b] source ! Signed multiply
|
|
mul[b] source ! Unsigned multiply
|
|
|
|
<STRONG>Division</STRONG>
|
|
|
|
aad ! Adjust AX for BCD division
|
|
o16 cbw ! Sign extend AL into AH
|
|
o16 cwd ! Sign extend AX into DX
|
|
cwde ! Sign extend AX into EAX
|
|
cdq ! Sign extend EAX into EDX
|
|
idiv[b] source ! Signed divide
|
|
div[b] source ! Unsigned divide
|
|
|
|
<STRONG>Logical</STRONG>
|
|
|
|
and[b] dest,source ! Logical and
|
|
not[b] dest ! Logical not
|
|
or[b] dest,source ! Logical inclusive or
|
|
test[b] dest,source ! Logical test
|
|
xor[b] dest,source ! Logical exclusive or
|
|
|
|
<STRONG>Shift</STRONG>
|
|
|
|
sal[b]/shl[b] dest,CL ! Shift logical left
|
|
sar[b] dest,CL ! Shift arithmetic right
|
|
shr[b] dest,CL ! Shift logical right
|
|
|
|
<STRONG>Rotate</STRONG>
|
|
|
|
rcl[b] dest,CL ! Rotate left, with carry
|
|
rcr[b] dest,CL ! Rotate right, with carry
|
|
rol[b] dest,CL ! Rotate left
|
|
ror[b] dest,CL ! Rotate right
|
|
|
|
<STRONG>String</STRONG> <STRONG>Manipulation</STRONG>
|
|
|
|
cmps[b] ! Compare string element ds:esi with es:edi
|
|
lods[b] ! Load from ds:esi into AL, AX, or EAX
|
|
movs[b] ! Move from ds:esi to es:edi
|
|
rep ! Repeat next instruction until ECX=0
|
|
repe/repz ! Repeat next instruction until ECX=0 and ZF=1
|
|
repne/repnz ! Repeat next instruction until ECX!=0 and ZF=0
|
|
scas[b] ! Compare ds:esi with AL/AX/EAX
|
|
stos[b] ! Store AL/AX/EAX in es:edi
|
|
|
|
<STRONG>Control</STRONG> <STRONG>Transfer</STRONG>
|
|
|
|
<EM>As</EM> accepts a number of special jump opcodes that can assemble to
|
|
instructions with either a byte displacement, which can only reach to
|
|
targets within -126 to +129 bytes of the branch, or an instruction with
|
|
a 32-bit displacement. The assembler automatically chooses a byte or
|
|
word displacement instruction.
|
|
|
|
The English translation of the opcodes should be obvious, with
|
|
'l(ess)' and 'g(reater)' for signed comparisions, and 'b(elow)' and
|
|
'a(bove)*(CQ for unsigned comparisions. There are lots of synonyms to
|
|
allow you to write "jump if not that" instead of "jump if this".
|
|
|
|
The 'call', 'jmp', and 'ret' instructions can be either
|
|
intrasegment or intersegment. The intersegment versions are indicated
|
|
with the suffix 'f'.
|
|
|
|
<STRONG>Unconditional</STRONG>
|
|
|
|
jmp[f] dest ! jump to dest (8 or 32-bit displacement)
|
|
call[f] dest ! call procedure
|
|
ret[f] ! return from procedure
|
|
|
|
<STRONG>Conditional</STRONG>
|
|
|
|
ja/jnbe ! if above/not below or equal (unsigned)
|
|
jae/jnb/jnc ! if above or equal/not below/not carry (uns.)
|
|
jb/jnae/jc ! if not above nor equal/below/carry (unsigned)
|
|
jbe/jna ! if below or equal/not above (unsigned)
|
|
jg/jnle ! if greater/not less nor equal (signed)
|
|
jge/jnl ! if greater or equal/not less (signed)
|
|
jl/jnqe ! if less/not greater nor equal (signed)
|
|
jle/jgl ! if less or equal/not greater (signed)
|
|
je/jz ! if equal/zero
|
|
jne/jnz ! if not equal/not zero
|
|
jno ! if overflow not set
|
|
jo ! if overflow set
|
|
jnp/jpo ! if parity not set/parity odd
|
|
jp/jpe ! if parity set/parity even
|
|
jns ! if sign not set
|
|
js ! if sign set
|
|
|
|
<STRONG>Iteration</STRONG> <STRONG>Control</STRONG>
|
|
|
|
jcxz dest ! jump if ECX = 0
|
|
loop dest ! Decrement ECX and jump if CX != 0
|
|
loope/loopz dest ! Decrement ECX and jump if ECX = 0 and ZF = 1
|
|
loopne/loopnz dest ! Decrement ECX and jump if ECX != 0 and ZF = 0
|
|
|
|
<STRONG>Interrupt</STRONG>
|
|
|
|
int n ! Software interrupt n
|
|
into ! Interrupt if overflow set
|
|
iretd ! Return from interrupt
|
|
|
|
<STRONG>Flag</STRONG> <STRONG>Operations</STRONG>
|
|
|
|
clc ! Clear carry flag
|
|
cld ! Clear direction flag
|
|
cli ! Clear interrupt enable flag
|
|
cmc ! Complement carry flag
|
|
stc ! Set carry flag
|
|
std ! Set direction flag
|
|
sti ! Set interrupt enable flag
|
|
|
|
|
|
<STRONG>Location</STRONG> <STRONG>Counter</STRONG>
|
|
|
|
The special symbol '.' is the location counter and its value is the
|
|
address of the first byte of the instruction in which the symbol appears
|
|
and can be used in expressions.
|
|
|
|
<STRONG>Segments</STRONG>
|
|
|
|
There are four different assembly segments: text, rom, data and
|
|
bss. Segments are declared and selected by the .<EM>sect</EM> pseudo-op. It is
|
|
customary to declare all segments at the top of an assembly file like
|
|
this:
|
|
|
|
.sect .text; .sect .rom; .sect .data; .sect .bss
|
|
|
|
The assembler accepts up to 16 different segments, but MINIX expects
|
|
only four to be used. Anything can in principle be assembled into any
|
|
segment, but the MINIX bss segment may only contain uninitialized data.
|
|
Note that the '.' symbol refers to the location in the current segment.
|
|
|
|
<STRONG>Labels</STRONG>
|
|
|
|
There are two types: name and numeric. Name labels consist of a
|
|
name followed by a colon (:).
|
|
|
|
The numeric labels are single digits. The nearest 0: label may be
|
|
referenced as 0f in the forward direction, or 0b backwards.
|
|
|
|
<STRONG>Statement</STRONG> <STRONG>Syntax</STRONG>
|
|
|
|
Each line consists of a single statement. Blank or comment lines
|
|
are allowed.
|
|
|
|
<STRONG>Instruction</STRONG> <STRONG>Statements</STRONG>
|
|
|
|
The most general form of an instruction is
|
|
|
|
label: opcode operand1, operand2 ! comment
|
|
|
|
|
|
<STRONG>Expression</STRONG> <STRONG>Semantics</STRONG>
|
|
|
|
The following operators can be used: + - * / & | ^ ~ << (shift
|
|
left) >> (shift right) - (unary minus). 32-bit integer arithmetic is
|
|
used. Division produces a truncated quotient.
|
|
|
|
<STRONG>Addressing</STRONG> <STRONG>Modes</STRONG>
|
|
|
|
Below is a list of the addressing modes supported. Each one is
|
|
followed by an example.
|
|
|
|
constant mov eax, 123456
|
|
direct access mov eax, (counter)
|
|
register mov eax, esi
|
|
indirect mov eax, (esi)
|
|
base + disp. mov eax, 6(ebp)
|
|
scaled index mov eax, (4*esi)
|
|
base + index mov eax, (ebp)(2*esi)
|
|
base + index + disp. mov eax, 10(edi)(1*esi)
|
|
|
|
Any of the constants or symbols may be replacement by expressions.
|
|
Direct access, constants and displacements may be any type of
|
|
expression. A scaled index with scale 1 may be written without the
|
|
'1*'.
|
|
|
|
<STRONG>Call</STRONG> <STRONG>and</STRONG> <STRONG>Jmp</STRONG>
|
|
|
|
The 'call' and 'jmp' instructions can be interpreted as a load into
|
|
the instruction pointer.
|
|
|
|
call _routine ! Direct, intrasegment
|
|
call (subloc) ! Indirect, intrasegment
|
|
call 6(ebp) ! Indirect, intrasegment
|
|
call ebx ! Direct, intrasegment
|
|
call (ebx) ! Indirect, intrasegment
|
|
callf (subloc) ! Indirect, intersegment
|
|
callf seg:offs ! Direct, intersegment
|
|
|
|
|
|
|
|
<STRONG>Symbol</STRONG> <STRONG>Assigment</STRONG>
|
|
|
|
|
|
Symbols can acquire values in one of two ways. Using a symbol as a
|
|
label sets it to '.' for the current segment with type relocatable.
|
|
Alternative, a symbol may be given a name via an assignment of the form
|
|
|
|
symbol = expression
|
|
|
|
in which the symbol is assigned the value and type of its arguments.
|
|
|
|
|
|
<STRONG>Storage</STRONG> <STRONG>Allocation</STRONG>
|
|
|
|
|
|
Space can be reserved for bytes, words, and longs using pseudo-ops.
|
|
They take one or more operands, and for each generate a value whose size
|
|
is a byte, word (2 bytes) or long (4 bytes). For example:
|
|
|
|
.data1 2, 6 ! allocate 2 bytes initialized to 2 and 6
|
|
.data2 3, 0x10 ! allocate 2 words initialized to 3 and 16
|
|
.data4 010 ! allocate a longword initialized to 8
|
|
.space 40 ! allocates 40 bytes of zeros
|
|
|
|
allocates 50 (decimal) bytes of storage, initializing the first two
|
|
bytes to 2 and 6, the next two words to 3 and 16, then one longword with
|
|
value 8 (010 octal), last 40 bytes of zeros.
|
|
|
|
<STRONG>String</STRONG> <STRONG>Allocation</STRONG>
|
|
|
|
The pseudo-ops .<EM>ascii</EM> and .<EM>asciz</EM> take one string argument and
|
|
generate the ASCII character codes for the letters in the string. The
|
|
latter automatically terminates the string with a null (0) byte. For
|
|
example,
|
|
|
|
.ascii "hello"
|
|
.asciz "world\n"
|
|
|
|
|
|
<STRONG>Alignment</STRONG>
|
|
Sometimes it is necessary to force the next item to begin at a
|
|
word, longword or even a 16 byte address boundary. The .<EM>align</EM> pseudo-op
|
|
zero or more null byte if the current location is a multiple of the
|
|
argument of .align.
|
|
|
|
<STRONG>Segment</STRONG> <STRONG>Control</STRONG>
|
|
|
|
Every item assembled goes in one of the four segments: text, rom,
|
|
data, or bss. By using the .<EM>sect</EM> pseudo-op with argument .<EM>text</EM>, .<EM>rom</EM>,
|
|
.<EM>data</EM> or .<EM>bss</EM>, the programmer can force the next items to go in a
|
|
particular segment.
|
|
|
|
<STRONG>External</STRONG> <STRONG>Names</STRONG>
|
|
|
|
A symbol can be given global scope by including it in a .<EM>define</EM>
|
|
pseudo-op. Multiple names may be listed, separate by commas. It must
|
|
be used to export symbols defined in the current program. Names not
|
|
defined in the current program are treated as "undefined external"
|
|
automatically, although it is customary to make this explicit with the
|
|
.<EM>extern</EM> pseudo-op.
|
|
|
|
<STRONG>Common</STRONG>
|
|
|
|
The .<EM>comm</EM> pseudo-op declares storage that can be common to more
|
|
than one module. There are two arguments: a name and an absolute
|
|
expression giving the size in bytes of the area named by the symbol. The
|
|
type of the symbol becomes external. The statement can appear in any
|
|
segment. If you think this has something to do with FORTRAN, you are
|
|
right.
|
|
|
|
<STRONG>Examples</STRONG>
|
|
|
|
In the kernel directory, there are several assembly code files that
|
|
are worth inspecting as examples. However, note that these files, are
|
|
designed to first be run through the C preprocessor. (The very first
|
|
character is a # to signal this.) Thus they contain numerous constructs
|
|
that are not pure assembler. For true assembler examples, compile any C
|
|
program provided with MINIX using the <STRONG>-S</STRONG> flag. This will result in an
|
|
assembly language file with a suffix with the same name as the C source
|
|
file, but ending with the .s suffix.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
</PRE>
|
|
</BODY>
|
|
</HTML>
|