321 lines
15 KiB
Plaintext
321 lines
15 KiB
Plaintext
A.OUT(5) OpenBSD Programmer's Manual A.OUT(5)
|
|
|
|
NAME
|
|
a.out - format of executable binary files
|
|
|
|
SYNOPSIS
|
|
#include <a.out.h>
|
|
|
|
DESCRIPTION
|
|
The include file <a.out.h> declares three structures and several macros.
|
|
The structures describe the format of executable machine code files
|
|
(``binaries'') on the system.
|
|
|
|
A binary file consists of up to 7 sections. In order, these sections
|
|
are:
|
|
|
|
exec header Contains parameters used by the kernel to load a binary
|
|
file into memory and execute it, and by the link editor
|
|
ld(1) to combine a binary file with other binary files.
|
|
This section is the only mandatory one.
|
|
|
|
text segment Contains machine code and related data that are loaded
|
|
into memory when a program executes. May be loaded
|
|
read-only.
|
|
|
|
data segment Contains initialized data; always loaded into writable
|
|
memory.
|
|
|
|
text relocations Contains records used by the link editor to update
|
|
pointers in the text segment when combining binary
|
|
files.
|
|
|
|
data relocations Like the text relocation section, but for data segment
|
|
pointers.
|
|
|
|
symbol table Contains records used by the link editor to cross ref-
|
|
erence the addresses of named variables and functions
|
|
(``symbols'') between binary files.
|
|
|
|
string table Contains the character strings corresponding to the
|
|
symbol names.
|
|
|
|
Every binary file begins with an exec structure:
|
|
|
|
struct exec {
|
|
u_int32_t a_midmag;
|
|
u_int32_t a_text;
|
|
u_int32_t a_data;
|
|
u_int32_t a_bss;
|
|
u_int32_t a_syms;
|
|
u_int32_t a_entry;
|
|
u_int32_t a_trsize;
|
|
u_int32_t a_drsize;
|
|
};
|
|
|
|
The fields have the following functions:
|
|
|
|
a_midmag This field is stored in network byte-order so that binaries for
|
|
machines with alternate byte orders can be distinguished. It
|
|
has a number of sub-components accessed by the macros
|
|
N_GETFLAG(), N_GETMID(),and N_GETMAGIC(), and set by the macro
|
|
N_SETMAGIC().
|
|
|
|
|
|
|
|
The macro N_GETFLAG()() returns a few flags:
|
|
|
|
EX_DYNAMIC Indicates that the executable requires the services
|
|
of the run-time link editor.
|
|
|
|
EX_PIC Indicates that the object contains position inde-
|
|
pendent code. This flag is set by as(1) when given
|
|
the -k flag and is preserved by ld(1) if necessary.
|
|
|
|
If both EX_DYNAMIC and EX_PIC are set, the object file is a po-
|
|
sition independent executable image (e.g., a shared library),
|
|
which is to be loaded into the process address space by the
|
|
run-time link editor.
|
|
|
|
The macro N_GETMID() returns the machine-id. This indicates
|
|
which machine(s) the binary is intended to run on.
|
|
|
|
N_GETMAGIC() specifies the magic number, which uniquely identi-
|
|
fies binary files and distinguishes different loading conven-
|
|
tions. The field must contain one of the following values:
|
|
|
|
OMAGIC The text and data segments immediately follow the head-
|
|
er and are contiguous. The kernel loads both text and
|
|
data segments into writable memory.
|
|
|
|
NMAGIC As with OMAGIC, text and data segments immediately fol-
|
|
low the header and are contiguous. However, the kernel
|
|
loads the text into read-only memory and loads the data
|
|
into writable memory at the next page boundary after
|
|
the text.
|
|
|
|
ZMAGIC The kernel loads individual pages on demand from the
|
|
binary. The header, text segment and data segment are
|
|
all padded by the link editor to a multiple of the page
|
|
size. Pages that the kernel loads from the text seg-
|
|
ment are read-only, while pages from the data segment
|
|
are writable.
|
|
|
|
a_text Contains the size of the text segment in bytes.
|
|
|
|
a_data Contains the size of the data segment in bytes.
|
|
|
|
a_bss Contains the number of bytes in the ``bss segment'' and is used
|
|
by the kernel to set the initial break (brk(2)) after the data
|
|
segment. The kernel loads the program so that this amount of
|
|
writable memory appears to follow the data segment and initial-
|
|
ly reads as zeroes.
|
|
|
|
a_syms Contains the size in bytes of the symbol table section.
|
|
|
|
a_entry Contains the address in memory of the entry point of the pro-
|
|
gram after the kernel has loaded it; the kernel starts the exe-
|
|
cution of the program from the machine instruction at this ad-
|
|
dress.
|
|
|
|
a_trsize Contains the size in bytes of the text relocation table.
|
|
|
|
a_drsize Contains the size in bytes of the data relocation table.
|
|
|
|
The a.out.h include file defines several macros which use an exec struc-
|
|
ture to test consistency or to locate section offsets in the binary file.
|
|
|
|
N_BADMAG(exec) Non-zero if the a_magic field does not contain a recog-
|
|
nized value.
|
|
|
|
N_TXTOFF(exec) The byte offset in the binary file of the beginning of
|
|
|
|
the text segment.
|
|
|
|
N_SYMOFF(exec) The byte offset of the beginning of the symbol table.
|
|
|
|
N_STROFF(exec) The byte offset of the beginning of the string table.
|
|
|
|
Relocation records have a standard format which is described by the
|
|
relocation_info structure:
|
|
|
|
struct relocation_info {
|
|
int r_address;
|
|
unsigned int r_symbolnum : 24,
|
|
r_pcrel : 1,
|
|
r_length : 2,
|
|
r_extern : 1,
|
|
r_baserel : 1,
|
|
r_jmptable : 1,
|
|
r_relative : 1,
|
|
r_copy : 1;
|
|
};
|
|
|
|
The relocation_info fields are used as follows:
|
|
|
|
r_address Contains the byte offset of a pointer that needs to be link-
|
|
edited. Text relocation offsets are reckoned from the start
|
|
of the text segment, and data relocation offsets from the
|
|
start of the data segment. The link editor adds the value
|
|
that is already stored at this offset into the new value
|
|
that it computes using this relocation record.
|
|
|
|
r_symbolnum Contains the ordinal number of a symbol structure in the
|
|
symbol table (it is not a byte offset). After the link edi-
|
|
tor resolves the absolute address for this symbol, it adds
|
|
that address to the pointer that is undergoing relocation.
|
|
(If the r_extern bit is clear, the situation is different;
|
|
see below.)
|
|
|
|
r_pcrel If this is set, the link editor assumes that it is updating
|
|
a pointer that is part of a machine code instruction using
|
|
pc-relative addressing. The address of the relocated point-
|
|
er is implicitly added to its value when the running program
|
|
uses it.
|
|
|
|
r_length Contains the log base 2 of the length of the pointer in
|
|
bytes; 0 for 1-byte displacements, 1 for 2-byte displace-
|
|
ments, 2 for 4-byte displacements.
|
|
|
|
r_extern Set if this relocation requires an external reference; the
|
|
link editor must use a symbol address to update the pointer.
|
|
When the r_extern bit is clear, the relocation is ``local'';
|
|
the link editor updates the pointer to reflect changes in
|
|
the load addresses of the various segments, rather than
|
|
changes in the value of a symbol (except when r_baserel is
|
|
also set, see below). In this case, the content of the
|
|
r_symbolnum field is an n_type value (see below); this type
|
|
field tells the link editor what segment the relocated
|
|
pointer points into.
|
|
|
|
r_baserel If set, the symbol, as identified by the r_symbolnum field,
|
|
is to be relocated to an offset into the Global Offset
|
|
Table. At run-time, the entry in the Global Offset Table at
|
|
this offset is set to be the address of the symbol.
|
|
|
|
r_jmptable If set, the symbol, as identified by the r_symbolnum field,
|
|
is to be relocated to an offset into the Procedure Linkage
|
|
|
|
Table.
|
|
|
|
r_relative If set, this relocation is relative to the (run-time) load
|
|
address of the image this object file is going to be a part
|
|
of. This type of relocation only occurs in shared objects.
|
|
|
|
r_copy If set, this relocation record identifies a symbol whose
|
|
contents should be copied to the location given in
|
|
r_address. The copying is done by the run-time link editor
|
|
from a suitable data item in a shared object.
|
|
|
|
Symbols map names to addresses (or more generally, strings to values).
|
|
Since the link editor adjusts addresses, a symbol's name must be used to
|
|
stand for its address until an absolute value has been assigned. Symbols
|
|
consist of a fixed-length record in the symbol table and a variable-
|
|
length name in the string table. The symbol table is an array of nlist
|
|
structures:
|
|
|
|
struct nlist {
|
|
union {
|
|
char *n_name;
|
|
long n_strx;
|
|
} n_un;
|
|
unsigned char n_type;
|
|
char n_other;
|
|
short n_desc;
|
|
unsigned long n_value;
|
|
};
|
|
|
|
The fields are used as follows:
|
|
|
|
n_un.n_strx Contains a byte offset into the string table for the name of
|
|
this symbol. When a program accesses a symbol table with
|
|
the nlist(3) function, this field is replaced with the
|
|
n_un.n_name field, which is a pointer to the string in memo-
|
|
ry.
|
|
|
|
n_type Used by the link editor to determine how to update the sym-
|
|
bol's value. The n_type field is broken down into three
|
|
sub-fields using bitmasks. The link editor treats symbols
|
|
with the N_EXT type bit set as ``external'' symbols and per-
|
|
mits references to them from other binary files. The N_TYPE
|
|
mask selects bits of interest to the link editor:
|
|
|
|
N_UNDF An undefined symbol. The link editor must locate an
|
|
external symbol with the same name in another binary
|
|
file to determine the absolute value of this symbol.
|
|
As a special case, if the n_value field is non-zero
|
|
and no binary file in the link-edit defines this
|
|
symbol, the link editor will resolve this symbol to
|
|
an address in the bss segment, reserving an amount
|
|
of bytes equal to n_value. If this symbol is unde-
|
|
fined in more than one binary file and the binary
|
|
files do not agree on the size, the link editor
|
|
chooses the greatest size found across all binaries.
|
|
|
|
N_ABS An absolute symbol. The link editor does not update
|
|
an absolute symbol.
|
|
|
|
N_TEXT A text symbol. This symbol's value is a text ad-
|
|
dress and the link editor will update it when it
|
|
merges binary files.
|
|
|
|
N_DATA A data symbol; similar to N_TEXT but for data ad-
|
|
dresses. The values for text and data symbols are
|
|
not file offsets but addresses; to recover the file
|
|
offsets, it is necessary to identify the loaded ad-
|
|
dress of the beginning of the corresponding section
|
|
and subtract it, then add the offset of the section.
|
|
|
|
N_BSS A bss symbol; like text or data symbols but has no
|
|
corresponding offset in the binary file.
|
|
|
|
N_FN A filename symbol. The link editor inserts this
|
|
symbol before the other symbols from a binary file
|
|
when merging binary files. The name of the symbol
|
|
is the filename given to the link editor, and its
|
|
value is the first text address from that binary
|
|
file. Filename symbols are not needed for link
|
|
editing or loading, but are useful for debuggers.
|
|
|
|
The N_STAB mask selects bits of interest to symbolic debug-
|
|
gers such as gdb(1); the values are described in stab(5).
|
|
|
|
n_other This field provides information on the nature of the symbol
|
|
independent of the symbol's location in terms of segments as
|
|
determined by the n_type field. Currently, the lower 4 bits
|
|
of the n_other field hold one of two values: AUX_FUNC and
|
|
AUX_OBJECT (see <link.h> for their definitions). AUX_FUNC
|
|
associates the symbol with a callable function, while
|
|
AUX_OBJECT associates the symbol with data, irrespective of
|
|
their locations in either the text or the data segment.
|
|
This field is intended to be used by ld(1) for the construc-
|
|
tion of dynamic executables.
|
|
|
|
n_desc Reserved for use by debuggers; passed untouched by the link
|
|
editor. Different debuggers use this field for different
|
|
purposes.
|
|
|
|
n_value Contains the value of the symbol. For text, data and bss
|
|
symbols, this is an address; for other symbols (such as de-
|
|
bugger symbols), the value may be arbitrary.
|
|
|
|
The string table consists of an u_int32_t length followed by null-termi-
|
|
nated symbol strings. The length represents the size of the entire table
|
|
in bytes, so its minimum value (or the offset of the first string) is al-
|
|
ways 4 on 32-bit machines.
|
|
|
|
SEE ALSO
|
|
as(1), gdb(1), ld(1), brk(2), execve(2), nlist(3), core(5),
|
|
link(5), stab(5)
|
|
|
|
HISTORY
|
|
The a.out.h include file appeared in Version 7 AT&T UNIX.
|
|
|
|
BUGS
|
|
Nobody seems to agree on what bss stands for.
|
|
|
|
New binary file formats may be supported in the future, and they probably
|
|
will not be compatible at any level with this ancient format.
|
|
|
|
OpenBSD 2.6 June 5, 1993 5
|