473 lines
16 KiB
Groff
473 lines
16 KiB
Groff
|
||
|
||
|
||
|
||
|
||
Command: as - assembler
|
||
|
||
AS----ASSEMBLER [IBM]
|
||
|
||
|
||
|
||
This document describes the language accepted by the 80386
|
||
assembler that is part of the Amsterdam Compiler Kit. Note that only
|
||
the syntax is described, only a few 386 instructions are shown as
|
||
examples.
|
||
|
||
Tokens, Numbers, Character Constants, and Strings
|
||
|
||
The syntax of numbers is the same as in C. The constants 32, 040,
|
||
and 0x20 all represent the same number, but are written in decimal,
|
||
octal, and hex, respectively. The rules for character constants and
|
||
strings are also the same as in C. For example, 'a' is a character
|
||
constant. A typical string is "string". Expressions may be formed with
|
||
C operators, but must use [ and ] for parentheses. (Normal parentheses
|
||
are claimed by the operand syntax.)
|
||
|
||
Symbols
|
||
|
||
Symbols contain letters and digits, as well as three special
|
||
characters: dot, tilde, and underscore. The first character may not be
|
||
a digit or tilde.
|
||
|
||
The names of the 80386 registers are reserved. These are:
|
||
|
||
al, bl, cl, dl
|
||
ah, bh, ch, dh
|
||
ax, bx, cx, dx, eax, ebx, ecx, edx
|
||
si, di, bp, sp, esi, edi, ebp, esp
|
||
cs, ds, ss, es, fs, gs
|
||
|
||
The xx and exx variants of the eight general registers are treated as
|
||
synonyms by the assembler. Normally "ax" is the 16-bit low half of the
|
||
32-bit "eax" register. The assembler determines if a 16 or 32 bit
|
||
operation is meant solely by looking at the instruction or the
|
||
instruction prefixes. It is however best to use the proper registers
|
||
when writing assembly to not confuse those who read the code.
|
||
|
||
The last group of 6 segment registers are used for selector + offset
|
||
mode addressing, in which the effective address is at a given offset in
|
||
one of the 6 segments.
|
||
|
||
Names of instructions and pseudo-ops are not reserved. Alphabetic
|
||
characters in opcodes and pseudo-ops must be in lower case.
|
||
|
||
Separators
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
Commas, blanks, and tabs are separators and can be interspersed
|
||
freely between tokens, but not within tokens. Commas are only legal
|
||
between operands.
|
||
|
||
Comments
|
||
|
||
The comment character is '!'. The rest of the line is ignored.
|
||
|
||
Opcodes
|
||
|
||
The opcodes are listed below. Notes: (1) Different names for the
|
||
same instruction are separated by '/'. (2) Square brackets ([])
|
||
indicate that 0 or 1 of the enclosed characters can be included. (3)
|
||
Curly brackets ({}) work similarly, except that one of the enclosed
|
||
characters must be included. Thus square brackets indicate an option,
|
||
whereas curly brackets indicate that a choice must be made.
|
||
|
||
Data Transfer
|
||
|
||
mov[b] dest, source ! Move word/byte from source to dest
|
||
pop dest ! Pop stack
|
||
push source ! Push stack
|
||
xchg[b] op1, op2 ! Exchange word/byte
|
||
xlat ! Translate
|
||
o16 ! Operate on a 16 bit object instead of 32 bit
|
||
|
||
Input/Output
|
||
|
||
in[b] source ! Input from source I/O port
|
||
in[b] ! Input from DX I/O port
|
||
out[b] dest ! Output to dest I/O port
|
||
out[b] ! Output to DX I/O port
|
||
|
||
Address Object
|
||
|
||
lds reg,source ! Load reg and DS from source
|
||
les reg,source ! Load reg and ES from source
|
||
lea reg,source ! Load effect address of source to reg and DS
|
||
{cdsefg}seg ! Specify seg register for next instruction
|
||
a16 ! Use 16 bit addressing mode instead of 32 bit
|
||
|
||
Flag Transfer
|
||
|
||
lahf ! Load AH from flag register
|
||
popf ! Pop flags
|
||
pushf ! Push flags
|
||
sahf ! Store AH in flag register
|
||
|
||
Addition
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
aaa ! Adjust result of BCD addition
|
||
add[b] dest,source ! Add
|
||
adc[b] dest,source ! Add with carry
|
||
daa ! Decimal Adjust after addition
|
||
inc[b] dest ! Increment by 1
|
||
|
||
Subtraction
|
||
|
||
aas ! Adjust result of BCD subtraction
|
||
sub[b] dest,source ! Subtract
|
||
sbb[b] dest,source ! Subtract with borrow from dest
|
||
das ! Decimal adjust after subtraction
|
||
dec[b] dest ! Decrement by one
|
||
neg[b] dest ! Negate
|
||
cmp[b] dest,source ! Compare
|
||
|
||
Multiplication
|
||
|
||
aam ! Adjust result of BCD multiply
|
||
imul[b] source ! Signed multiply
|
||
mul[b] source ! Unsigned multiply
|
||
|
||
Division
|
||
|
||
aad ! Adjust AX for BCD division
|
||
o16 cbw ! Sign extend AL into AH
|
||
o16 cwd ! Sign extend AX into DX
|
||
cwde ! Sign extend AX into EAX
|
||
cdq ! Sign extend EAX into EDX
|
||
idiv[b] source ! Signed divide
|
||
div[b] source ! Unsigned divide
|
||
|
||
Logical
|
||
|
||
and[b] dest,source ! Logical and
|
||
not[b] dest ! Logical not
|
||
or[b] dest,source ! Logical inclusive or
|
||
test[b] dest,source ! Logical test
|
||
xor[b] dest,source ! Logical exclusive or
|
||
|
||
Shift
|
||
|
||
sal[b]/shl[b] dest,CL ! Shift logical left
|
||
sar[b] dest,CL ! Shift arithmetic right
|
||
shr[b] dest,CL ! Shift logical right
|
||
|
||
Rotate
|
||
|
||
rcl[b] dest,CL ! Rotate left, with carry
|
||
rcr[b] dest,CL ! Rotate right, with carry
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
rol[b] dest,CL ! Rotate left
|
||
ror[b] dest,CL ! Rotate right
|
||
|
||
String Manipulation
|
||
|
||
cmps[b] ! Compare string element ds:esi with es:edi
|
||
lods[b] ! Load from ds:esi into AL, AX, or EAX
|
||
movs[b] ! Move from ds:esi to es:edi
|
||
rep ! Repeat next instruction until ECX=0
|
||
repe/repz ! Repeat next instruction until ECX=0 and ZF=1
|
||
repne/repnz ! Repeat next instruction until ECX!=0 and ZF=0
|
||
scas[b] ! Compare ds:esi with AL/AX/EAX
|
||
stos[b] ! Store AL/AX/EAX in es:edi
|
||
|
||
Control Transfer
|
||
|
||
As accepts a number of special jump opcodes that can assemble to
|
||
instructions with either a byte displacement, which can only reach to
|
||
targets within -126 to +129 bytes of the branch, or an instruction with
|
||
a 32-bit displacement. The assembler automatically chooses a byte or
|
||
word displacement instruction.
|
||
|
||
The English translation of the opcodes should be obvious, with
|
||
'l(ess)' and 'g(reater)' for signed comparisions, and 'b(elow)' and
|
||
'a(bove)*(CQ for unsigned comparisions. There are lots of synonyms to
|
||
allow you to write "jump if not that" instead of "jump if this".
|
||
|
||
The 'call', 'jmp', and 'ret' instructions can be either
|
||
intrasegment or intersegment. The intersegment versions are indicated
|
||
with the suffix 'f'.
|
||
|
||
Unconditional
|
||
|
||
jmp[f] dest ! jump to dest (8 or 32-bit displacement)
|
||
call[f] dest ! call procedure
|
||
ret[f] ! return from procedure
|
||
|
||
Conditional
|
||
|
||
ja/jnbe ! if above/not below or equal (unsigned)
|
||
jae/jnb/jnc ! if above or equal/not below/not carry (uns.)
|
||
jb/jnae/jc ! if not above nor equal/below/carry (unsigned)
|
||
jbe/jna ! if below or equal/not above (unsigned)
|
||
jg/jnle ! if greater/not less nor equal (signed)
|
||
jge/jnl ! if greater or equal/not less (signed)
|
||
jl/jnqe ! if less/not greater nor equal (signed)
|
||
jle/jgl ! if less or equal/not greater (signed)
|
||
je/jz ! if equal/zero
|
||
jne/jnz ! if not equal/not zero
|
||
jno ! if overflow not set
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
jo ! if overflow set
|
||
jnp/jpo ! if parity not set/parity odd
|
||
jp/jpe ! if parity set/parity even
|
||
jns ! if sign not set
|
||
js ! if sign set
|
||
|
||
Iteration Control
|
||
|
||
jcxz dest ! jump if ECX = 0
|
||
loop dest ! Decrement ECX and jump if CX != 0
|
||
loope/loopz dest ! Decrement ECX and jump if ECX = 0 and ZF = 1
|
||
loopne/loopnz dest ! Decrement ECX and jump if ECX != 0 and ZF = 0
|
||
|
||
Interrupt
|
||
|
||
int n ! Software interrupt n
|
||
into ! Interrupt if overflow set
|
||
iretd ! Return from interrupt
|
||
|
||
Flag Operations
|
||
|
||
clc ! Clear carry flag
|
||
cld ! Clear direction flag
|
||
cli ! Clear interrupt enable flag
|
||
cmc ! Complement carry flag
|
||
stc ! Set carry flag
|
||
std ! Set direction flag
|
||
sti ! Set interrupt enable flag
|
||
|
||
|
||
Location Counter
|
||
|
||
The special symbol '.' is the location counter and its value is the
|
||
address of the first byte of the instruction in which the symbol appears
|
||
and can be used in expressions.
|
||
|
||
Segments
|
||
|
||
There are four different assembly segments: text, rom, data and
|
||
bss. Segments are declared and selected by the .sect pseudo-op. It is
|
||
customary to declare all segments at the top of an assembly file like
|
||
this:
|
||
|
||
.sect .text; .sect .rom; .sect .data; .sect .bss
|
||
|
||
The assembler accepts up to 16 different segments, but MINIX expects
|
||
only four to be used. Anything can in principle be assembled into any
|
||
segment, but the MINIX bss segment may only contain uninitialized data.
|
||
Note that the '.' symbol refers to the location in the current segment.
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
Labels
|
||
|
||
There are two types: name and numeric. Name labels consist of a
|
||
name followed by a colon (:).
|
||
|
||
The numeric labels are single digits. The nearest 0: label may be
|
||
referenced as 0f in the forward direction, or 0b backwards.
|
||
|
||
Statement Syntax
|
||
|
||
Each line consists of a single statement. Blank or comment lines
|
||
are allowed.
|
||
|
||
Instruction Statements
|
||
|
||
The most general form of an instruction is
|
||
|
||
label: opcode operand1, operand2 ! comment
|
||
|
||
|
||
Expression Semantics
|
||
|
||
The following operators can be used: + - * / & | ^ ~ << (shift
|
||
left) >> (shift right) - (unary minus). 32-bit integer arithmetic is
|
||
used. Division produces a truncated quotient.
|
||
|
||
Addressing Modes
|
||
|
||
Below is a list of the addressing modes supported. Each one is
|
||
followed by an example.
|
||
|
||
constant mov eax, 123456
|
||
direct access mov eax, (counter)
|
||
register mov eax, esi
|
||
indirect mov eax, (esi)
|
||
base + disp. mov eax, 6(ebp)
|
||
scaled index mov eax, (4*esi)
|
||
base + index mov eax, (ebp)(2*esi)
|
||
base + index + disp. mov eax, 10(edi)(1*esi)
|
||
|
||
Any of the constants or symbols may be replacement by expressions.
|
||
Direct access, constants and displacements may be any type of
|
||
expression. A scaled index with scale 1 may be written without the
|
||
'1*'.
|
||
|
||
Call and Jmp
|
||
|
||
The 'call' and 'jmp' instructions can be interpreted as a load into
|
||
the instruction pointer.
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
call _routine ! Direct, intrasegment
|
||
call (subloc) ! Indirect, intrasegment
|
||
call 6(ebp) ! Indirect, intrasegment
|
||
call ebx ! Direct, intrasegment
|
||
call (ebx) ! Indirect, intrasegment
|
||
callf (subloc) ! Indirect, intersegment
|
||
callf seg:offs ! Direct, intersegment
|
||
|
||
|
||
|
||
Symbol Assigment
|
||
|
||
|
||
Symbols can acquire values in one of two ways. Using a symbol as a
|
||
label sets it to '.' for the current segment with type relocatable.
|
||
Alternative, a symbol may be given a name via an assignment of the form
|
||
|
||
symbol = expression
|
||
|
||
in which the symbol is assigned the value and type of its arguments.
|
||
|
||
|
||
Storage Allocation
|
||
|
||
|
||
Space can be reserved for bytes, words, and longs using pseudo-ops.
|
||
They take one or more operands, and for each generate a value whose size
|
||
is a byte, word (2 bytes) or long (4 bytes). For example:
|
||
|
||
.data1 2, 6 ! allocate 2 bytes initialized to 2 and 6
|
||
.data2 3, 0x10 ! allocate 2 words initialized to 3 and 16
|
||
.data4 010 ! allocate a longword initialized to 8
|
||
.space 40 ! allocates 40 bytes of zeros
|
||
|
||
allocates 50 (decimal) bytes of storage, initializing the first two
|
||
bytes to 2 and 6, the next two words to 3 and 16, then one longword with
|
||
value 8 (010 octal), last 40 bytes of zeros.
|
||
|
||
String Allocation
|
||
|
||
The pseudo-ops .ascii and .asciz take one string argument and
|
||
generate the ASCII character codes for the letters in the string. The
|
||
latter automatically terminates the string with a null (0) byte. For
|
||
example,
|
||
|
||
.ascii "hello"
|
||
.asciz "world\n"
|
||
|
||
|
||
Alignment
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
Sometimes it is necessary to force the next item to begin at a
|
||
word, longword or even a 16 byte address boundary. The .align pseudo-op
|
||
zero or more null byte if the current location is a multiple of the
|
||
argument of .align.
|
||
|
||
Segment Control
|
||
|
||
Every item assembled goes in one of the four segments: text, rom,
|
||
data, or bss. By using the .sect pseudo-op with argument .text, .rom,
|
||
.data or .bss, the programmer can force the next items to go in a
|
||
particular segment.
|
||
|
||
External Names
|
||
|
||
A symbol can be given global scope by including it in a .define
|
||
pseudo-op. Multiple names may be listed, separate by commas. It must
|
||
be used to export symbols defined in the current program. Names not
|
||
defined in the current program are treated as "undefined external"
|
||
automatically, although it is customary to make this explicit with the
|
||
.extern pseudo-op.
|
||
|
||
Common
|
||
|
||
The .comm pseudo-op declares storage that can be common to more
|
||
than one module. There are two arguments: a name and an absolute
|
||
expression giving the size in bytes of the area named by the symbol. The
|
||
type of the symbol becomes external. The statement can appear in any
|
||
segment. If you think this has something to do with FORTRAN, you are
|
||
right.
|
||
|
||
Examples
|
||
|
||
In the kernel directory, there are several assembly code files that
|
||
are worth inspecting as examples. However, note that these files, are
|
||
designed to first be run through the C preprocessor. (The very first
|
||
character is a # to signal this.) Thus they contain numerous constructs
|
||
that are not pure assembler. For true assembler examples, compile any C
|
||
program provided with MINIX using the -S flag. This will result in an
|
||
assembly language file with a suffix with the same name as the C source
|
||
file, but ending with the .s suffix.
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|