<HTML>
|
|
<HEAD>
|
|
<!-- This HTML file has been created by texi2html 1.52
|
|
from ../texi/as.texinfo on 24 April 1999 -->
|
|
|
|
<TITLE>Using as - Sections and Relocation</TITLE>
|
|
</HEAD>
|
|
<BODY>
|
|
Go to the <A HREF="as_1.html">first</A>, <A HREF="as_3.html">previous</A>, <A HREF="as_5.html">next</A>, <A HREF="as_27.html">last</A> section, <A HREF="as_toc.html">table of contents</A>.
|
|
<P><HR><P>
|
|
|
|
|
|
<H1><A NAME="SEC39" HREF="as_toc.html#TOC39">Sections and Relocation</A></H1>
|
|
<P>
|
|
<A NAME="IDX167"></A>
|
|
<A NAME="IDX168"></A>
|
|
|
|
</P>
|
|
|
|
|
|
|
|
<H2><A NAME="SEC40" HREF="as_toc.html#TOC40">Background</A></H2>
|
|
|
|
<P>
|
|
Roughly, a section is a range of addresses, with no gaps; all data
|
|
"in" those addresses is treated the same for some particular purpose.
|
|
For example there may be a "read only" section.
|
|
|
|
</P>
|
|
<P>
|
|
<A NAME="IDX169"></A>
|
|
<A NAME="IDX170"></A>
|
|
The linker <CODE>ld</CODE> reads many object files (partial programs) and
|
|
combines their contents to form a runnable program. When <CODE>as</CODE>
|
|
emits an object file, the partial program is assumed to start at address 0.
|
|
<CODE>ld</CODE> assigns the final addresses for the partial program, so that
|
|
different partial programs do not overlap. This is actually an
|
|
oversimplification, but it suffices to explain how <CODE>as</CODE> uses
|
|
sections.
|
|
|
|
</P>
|
|
<P>
|
|
<CODE>ld</CODE> moves blocks of bytes of your program to their run-time
|
|
addresses. These blocks slide to their run-time addresses as rigid
|
|
units; their length does not change and neither does the order of bytes
|
|
within them. Such a rigid unit is called a <EM>section</EM>. Assigning
|
|
run-time addresses to sections is called <EM>relocation</EM>. It includes
|
|
the task of adjusting mentions of object-file addresses so they refer to
|
|
the proper run-time addresses.
|
|
For the H8/300 and H8/500,
|
|
and for the Hitachi SH,
|
|
<CODE>as</CODE> pads sections if needed to
|
|
ensure they end on a word (sixteen bit) boundary.
|
|
|
|
</P>
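<P>
As a rough illustration of sections sliding to their run-time addresses as
rigid units, the following Python sketch models address assignment. The
names <CODE>assign_addresses</CODE> and <CODE>relocate</CODE>, the section
labels, the lengths and the base address are all hypothetical; the real
<CODE>ld</CODE> also deals with alignment and many other details that are
ignored here.
</P>

```python
# Hypothetical model, not ld's actual algorithm: sections are placed one
# after another, and their lengths never change.
def assign_addresses(sections, base=0):
    """Return {name: run-time start address} for (name, length) pairs."""
    addresses = {}
    next_free = base
    for name, length in sections:
        addresses[name] = next_free
        next_free += length          # the next section starts where this one ends
    return addresses


def relocate(address_in_object, section_start):
    """Adjust an object-file address (section assumed to start at 0)."""
    return section_start + address_in_object


# Two partial programs, each contributing a text section (lengths assumed).
layout = assign_addresses([("a.o text", 0x40), ("b.o text", 0x20)], base=0x1000)
print({name: hex(start) for name, start in layout.items()})
print(hex(relocate(0x8, layout["b.o text"])))
```

<P>
In this model a mention of address 8 in the second partial program ends up
referring to its run-time start address plus 8, which is the adjustment that
relocation has to make.
</P>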
|
|
<P>
|
|
<A NAME="IDX171"></A>
|
|
An object file written by <CODE>as</CODE> has at least three sections, any
|
|
of which may be empty. These are named <EM>text</EM>, <EM>data</EM> and
|
|
<EM>bss</EM> sections.
|
|
|
|
</P>
|
|
<P>
|
|
When it generates COFF output,
|
|
<CODE>as</CODE> can also generate whatever other named sections you specify
|
|
using the <SAMP>`.section'</SAMP> directive (see section <A HREF="as_7.html#SEC119"><CODE>.section <VAR>name</VAR></CODE></A>).
|
|
If you do not use any directives that place output in the <SAMP>`.text'</SAMP>
|
|
or <SAMP>`.data'</SAMP> sections, these sections still exist, but are empty.
|
|
|
|
</P>
|
|
<P>
|
|
When <CODE>as</CODE> generates SOM or ELF output for the HPPA,
|
|
<CODE>as</CODE> can also generate whatever other named sections you
|
|
specify using the <SAMP>`.space'</SAMP> and <SAMP>`.subspace'</SAMP> directives. See
|
|
<CITE>HP9000 Series 800 Assembly Language Reference Manual</CITE>
|
|
(HP 92432-90001) for details on the <SAMP>`.space'</SAMP> and <SAMP>`.subspace'</SAMP>
|
|
assembler directives.
|
|
|
|
</P>
|
|
<P>
|
|
Additionally, <CODE>as</CODE> uses different names for the standard
|
|
text, data, and bss sections when generating SOM output. Program text
|
|
is placed into the <SAMP>`$CODE$'</SAMP> section, data into <SAMP>`$DATA$'</SAMP>, and
|
|
BSS into <SAMP>`$BSS$'</SAMP>.
|
|
|
|
</P>
|
|
<P>
|
|
Within the object file, the text section starts at address <CODE>0</CODE>, the
|
|
data section follows, and the bss section follows the data section.
|
|
|
|
</P>
|
|
<P>
|
|
When generating either SOM or ELF output files on the HPPA, the text
|
|
section starts at address <CODE>0</CODE>, the data section at address
|
|
<CODE>0x4000000</CODE>, and the bss section follows the data section.
|
|
|
|
</P>
|
|
<P>
|
|
To let <CODE>ld</CODE> know which data changes when the sections are
|
|
relocated, and how to change that data, <CODE>as</CODE> also writes to the
|
|
object file details of the relocation needed. To perform relocation
|
|
<CODE>ld</CODE> must know, each time an address in the object
|
|
file is mentioned:
|
|
|
|
<UL>
|
|
<LI>
|
|
|
|
Where in the object file is the beginning of this reference to
|
|
an address?
|
|
<LI>
|
|
|
|
How long (in bytes) is this reference?
|
|
<LI>
|
|
|
|
Which section does the address refer to? What is the numeric value of
|
|
|
|
<PRE>
(<VAR>address</VAR>) - (<VAR>start-address of section</VAR>)?
</PRE>
|
|
|
|
<LI>
|
|
|
|
Is the reference to an address "Program-Counter relative"?
|
|
</UL>
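<P>
The four questions above can be pictured as the fields of a relocation
record. The Python sketch below is only a model: the class
<CODE>Reloc</CODE>, the function <CODE>apply_reloc</CODE>, the little-endian
byte order and the treatment of program-counter relative references are
assumptions made for illustration, and real object formats differ in all of
these details.
</P>

```python
from dataclasses import dataclass


@dataclass
class Reloc:
    """One hypothetical relocation record, mirroring the four questions."""
    where: int          # offset in the object file where the reference begins
    length: int         # length of the reference in bytes
    section: str        # section the referenced address belongs to
    pc_relative: bool   # True if the reference is program-counter relative


def apply_reloc(image, reloc, section_starts, image_start):
    """Patch one reference in `image` (a bytearray) in place.

    The bytes at `reloc.where` are assumed to hold
    (address) - (start-address of section), stored little-endian.
    """
    end = reloc.where + reloc.length
    offset = int.from_bytes(image[reloc.where:end], "little")
    value = section_starts[reloc.section] + offset
    if reloc.pc_relative:
        # Simplification: measure from the run-time address of the reference.
        value -= image_start + reloc.where
    value %= 256 ** reloc.length     # wrap to the width of the reference
    image[reloc.where:end] = value.to_bytes(reloc.length, "little")


# A 4-byte reference to {data 0x10}, stored at offset 4 of some section.
image = bytearray(8)
image[4:8] = (0x10).to_bytes(4, "little")
apply_reloc(image, Reloc(where=4, length=4, section="data", pc_relative=False),
            {"data": 0x2000}, image_start=0x1000)
print(hex(int.from_bytes(image[4:8], "little")))
```

<P>
Under these assumptions, a program-counter relative reference to an address
in the same section keeps its stored value, because the reference and its
target move by the same amount.
</P>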
|
|
|
|
<P>
|
|
<A NAME="IDX172"></A>
|
|
<A NAME="IDX173"></A>
|
|
In fact, every address <CODE>as</CODE> ever uses is expressed as
|
|
|
|
<PRE>
(<VAR>section</VAR>) + (<VAR>offset into section</VAR>)
</PRE>
|
|
|
|
<P>
|
|
Further, most expressions <CODE>as</CODE> computes have this section-relative
|
|
nature.
|
|
(For some object formats, such as SOM for the HPPA, some expressions are
|
|
symbol-relative instead.)
|
|
|
|
</P>
|
|
<P>
|
|
In this manual we use the notation {<VAR>secname</VAR> <VAR>N</VAR>} to mean "offset
|
|
<VAR>N</VAR> into section <VAR>secname</VAR>."
|
|
|
|
</P>
|
|
<P>
|
|
Apart from text, data and bss sections you need to know about the
|
|
<EM>absolute</EM> section. When <CODE>ld</CODE> mixes partial programs,
|
|
addresses in the absolute section remain unchanged. For example, address
|
|
<CODE>{absolute 0}</CODE> is "relocated" to run-time address 0 by
|
|
<CODE>ld</CODE>. Although the linker never arranges two partial programs'
|
|
data sections with overlapping addresses after linking, <EM>by definition</EM>
|
|
their absolute sections must overlap. Address <CODE>{absolute 239}</CODE> in one
|
|
part of a program is always the same address when the program is running as
|
|
address <CODE>{absolute 239}</CODE> in any other part of the program.
|
|
|
|
</P>
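<P>
The notation {<VAR>secname</VAR> <VAR>N</VAR>} and the special behavior of
the absolute section can be sketched as follows. The function
<CODE>run_time_address</CODE> and the two layouts are hypothetical and exist
only to make the point that absolute addresses do not depend on where
<CODE>ld</CODE> places the other sections.
</P>

```python
def run_time_address(secname, offset, section_starts):
    """Resolve {secname N}: offset N into section secname (a model only)."""
    if secname == "absolute":
        # Absolute addresses are "relocated" to themselves.
        return offset
    return section_starts[secname] + offset


# Two hypothetical link layouts that place the data section differently.
layout_a = {"text": 0x1000, "data": 0x2000}
layout_b = {"text": 0x1000, "data": 0x5000}

print(hex(run_time_address("data", 8, layout_a)))      # depends on the layout
print(hex(run_time_address("data", 8, layout_b)))
print(run_time_address("absolute", 239, layout_a))     # same in every layout
print(run_time_address("absolute", 239, layout_b))
```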
|
|
<P>
|
|
The idea of sections is extended to the <EM>undefined</EM> section. Any
|
|
address whose section is unknown at assembly time is by definition
|
|
rendered {undefined <VAR>U</VAR>}, where <VAR>U</VAR> is filled in later.
|
|
Since numbers are always defined, the only way to generate an undefined
|
|
address is to mention an undefined symbol. A reference to a named
|
|
common block would be such a symbol: its value is unknown at assembly
|
|
time so it has section <EM>undefined</EM>.
|
|
|
|
</P>
|
|
<P>
|
|
By analogy the word <EM>section</EM> is used to describe groups of sections in
|
|
the linked program. <CODE>ld</CODE> puts all partial programs' text
|
|
sections in contiguous addresses in the linked program. It is
|
|
customary to refer to the <EM>text section</EM> of a program, meaning all
|
|
the addresses of all partial programs' text sections. Likewise for
|
|
data and bss sections.
|
|
|
|
</P>
|
|
<P>
|
|
Some sections are manipulated by <CODE>ld</CODE>; others are invented for
|
|
use of <CODE>as</CODE> and have no meaning except during assembly.
|
|
|
|
</P>
|
|
|
|
|
|
<H2><A NAME="SEC41" HREF="as_toc.html#TOC41">Linker Sections</A></H2>
|
|
<P>
|
|
<CODE>ld</CODE> deals with just four kinds of sections, summarized below.
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT><STRONG>named sections</STRONG>
|
|
<DD>
|
|
<A NAME="IDX174"></A>
|
|
<A NAME="IDX175"></A>
|
|
|
|
<A NAME="IDX176"></A>
|
|
<A NAME="IDX177"></A>
|
|
<DT><STRONG>text section</STRONG>
|
|
<DD>
|
|
<DT><STRONG>data section</STRONG>
|
|
<DD>
|
|
These sections hold your program. <CODE>as</CODE> and <CODE>ld</CODE> treat them as
|
|
separate but equal sections. Anything you can say of one section is
|
|
true of another.
|
|
When the program is running, however, it is
|
|
customary for the text section to be unalterable. The
|
|
text section is often shared among processes: it contains
|
|
instructions, constants and the like. The data section of a running
|
|
program is usually alterable: for example, C variables would be stored
|
|
in the data section.
|
|
|
|
<A NAME="IDX178"></A>
|
|
<DT><STRONG>bss section</STRONG>
|
|
<DD>
|
|
This section contains zeroed bytes when your program begins running. It
|
|
is used to hold uninitialized variables or common storage. The length of
|
|
each partial program's bss section is important, but because it starts
|
|
out containing zeroed bytes there is no need to store explicit zero
|
|
bytes in the object file. The bss section was invented to eliminate
|
|
those explicit zeros from object files.
|
|
|
|
<A NAME="IDX179"></A>
|
|
<DT><STRONG>absolute section</STRONG>
|
|
<DD>
|
|
Address 0 of this section is always "relocated" to runtime address 0.
|
|
This is useful if you want to refer to an address that <CODE>ld</CODE> must
|
|
not change when relocating. In this sense we speak of absolute
|
|
addresses being "unrelocatable": they do not change during relocation.
|
|
|
|
<A NAME="IDX180"></A>
|
|
<DT><STRONG>undefined section</STRONG>
|
|
<DD>
|
|
This "section" is a catch-all for address references to objects not in
|
|
the preceding sections.
|
|
</DL>
|
|
|
|
<P>
|
|
<A NAME="IDX181"></A>
|
|
An idealized example of three relocatable sections follows.
|
|
The example uses the traditional section names <SAMP>`.text'</SAMP> and <SAMP>`.data'</SAMP>.
|
|
Memory addresses are on the horizontal axis.
|
|
|
|
</P>
|
|
|
|
|
|
|
|
<H2><A NAME="SEC42" HREF="as_toc.html#TOC42">Assembler Internal Sections</A></H2>
|
|
|
|
<P>
|
|
<A NAME="IDX182"></A>
|
|
<A NAME="IDX183"></A>
|
|
These sections are meant only for the internal use of <CODE>as</CODE>. They
|
|
have no meaning at run-time. You do not really need to know about these
|
|
sections for most purposes; but they can be mentioned in <CODE>as</CODE>
|
|
warning messages, so it might be helpful to have an idea of their
|
|
meanings to <CODE>as</CODE>. These sections are used to permit the
|
|
value of every expression in your assembly language program to be a
|
|
section-relative address.
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT><B>ASSEMBLER-INTERNAL-LOGIC-ERROR!</B>
|
|
<DD>
|
|
<A NAME="IDX184"></A>
|
|
|
|
An internal assembler logic error has been found. This means there is a
|
|
bug in the assembler.
|
|
|
|
<A NAME="IDX185"></A>
|
|
<DT><B>expr section</B>
|
|
<DD>
|
|
The assembler stores complex expressions internally as combinations of
|
|
symbols. When it needs to represent an expression as a symbol, it puts
|
|
it in the expr section.
|
|
</DL>
|
|
|
|
|
|
|
|
<H2><A NAME="SEC43" HREF="as_toc.html#TOC43">Sub-Sections</A></H2>
|
|
|
|
<P>
|
|
<A NAME="IDX186"></A>
|
|
<A NAME="IDX187"></A>
|
|
Assembled bytes conventionally fall into two sections: text and data.
You may have separate groups of text or data, or of data in named sections,
that you want to end up near each other in the object file, even though they
|
|
are not contiguous in the assembler source. <CODE>as</CODE> allows you to
|
|
use <EM>subsections</EM> for this purpose. Within each section, there can be
|
|
numbered subsections with values from 0 to 8192. Objects assembled into the
|
|
same subsection go into the object file together with other objects in the same
|
|
subsection. For example, a compiler might want to store constants in the text
|
|
section, but might not want to have them interspersed with the program being
|
|
assembled. In this case, the compiler could issue a <SAMP>`.text 0'</SAMP> before each
|
|
section of code being output, and a <SAMP>`.text 1'</SAMP> before each group of
|
|
constants being output.
|
|
|
|
</P>
|
|
<P>
|
|
Subsections are optional. If you do not use subsections, everything
|
|
goes in subsection number zero.
|
|
|
|
</P>
|
|
<P>
|
|
Each subsection is zero-padded up to a multiple of four bytes.
|
|
(Subsections may be padded a different amount on different flavors
|
|
of <CODE>as</CODE>.)
|
|
|
|
</P>
|
|
<P>
|
|
Subsections appear in your object file in numeric order, lowest numbered
|
|
to highest. (All this to be compatible with other people's assemblers.)
|
|
The object file contains no representation of subsections; <CODE>ld</CODE> and
|
|
other programs that manipulate object files see no trace of them.
|
|
They just see all your text subsections as a text section, and all your
|
|
data subsections as a data section.
|
|
|
|
</P>
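<P>
The ordering and padding rules described above can be modeled in a few
lines. This is a sketch under stated assumptions, not the assembler's
implementation: the function <CODE>emit_section</CODE> and its
<CODE>pad_to</CODE> parameter are hypothetical, and the padding of four bytes
is simply the figure quoted above, which may differ between flavors of
<CODE>as</CODE>.
</P>

```python
def emit_section(chunks, pad_to=4):
    """Concatenate subsection contents in numeric order.

    `chunks` is a list of (subsection_number, data) pairs in source order.
    Bytes assembled into the same subsection stay together; each subsection
    is zero-padded to a multiple of `pad_to` bytes.
    """
    subsections = {}
    for number, data in chunks:
        subsections.setdefault(number, bytearray()).extend(data)
    out = bytearray()
    for number in sorted(subsections):           # lowest numbered first
        body = subsections[number]
        body.extend(bytes(-len(body) % pad_to))  # zero padding
        out.extend(body)
    return bytes(out)


# Code goes to subsection 0 and constants to subsection 1, interleaved
# in the source the way a compiler might emit them.
source_order = [(0, b"code1"), (1, b"const"), (0, b"code2")]
print(emit_section(source_order))
```

<P>
The result carries no record of the subsection numbers, which matches the
statement that other programs see only a single section.
</P>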
|
|
<P>
|
|
To specify which subsection you want subsequent statements assembled
|
|
into, use a numeric argument to specify it, in a <SAMP>`.text
|
|
<VAR>expression</VAR>'</SAMP> or a <SAMP>`.data <VAR>expression</VAR>'</SAMP> statement.
|
|
When generating COFF output, you
|
|
can also use an extra subsection
|
|
argument with arbitrary named sections: <SAMP>`.section <VAR>name</VAR>,
|
|
<VAR>expression</VAR>'</SAMP>.
|
|
<VAR>Expression</VAR> should be an absolute expression.
|
|
(See section <A HREF="as_6.html#SEC60">Expressions</A>.) If you just say <SAMP>`.text'</SAMP> then <SAMP>`.text 0'</SAMP>
|
|
is assumed. Likewise <SAMP>`.data'</SAMP> means <SAMP>`.data 0'</SAMP>. Assembly
|
|
begins in <CODE>text 0</CODE>. For instance:
|
|
|
|
<PRE>
.text 0     # The default subsection is text 0 anyway.
.ascii "This lives in the first text subsection. *"
.text 1
.ascii "But this lives in the second text subsection."
.data 0
.ascii "This lives in the data section,"
.ascii "in the first data subsection."
.text 0
.ascii "This lives in the first text section,"
.ascii "immediately following the asterisk (*)."
</PRE>
|
|
|
|
<P>
|
|
Each section has a <EM>location counter</EM> incremented by one for every byte
|
|
assembled into that section. Because subsections are merely a convenience
|
|
restricted to <CODE>as</CODE> there is no concept of a subsection location
|
|
counter. There is no way to directly manipulate a location counter--but the
|
|
<CODE>.align</CODE> directive changes it, and any label definition captures its
|
|
current value. The location counter of the section where statements are being
|
|
assembled is said to be the <EM>active</EM> location counter.
|
|
|
|
</P>
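<P>
A location counter can be pictured as a running byte count per section. The
class below is a hypothetical model; in particular, its <CODE>align</CODE>
method takes a boundary in bytes, which is an assumption, since the meaning
of the argument to <CODE>.align</CODE> varies between targets.
</P>

```python
class Section:
    """Hypothetical model of one section and its location counter."""

    def __init__(self):
        self.counter = 0     # location counter: bytes assembled so far
        self.labels = {}

    def emit(self, data):
        """Assemble bytes: the counter grows by one per byte."""
        self.counter += len(data)

    def label(self, name):
        """A label definition captures the counter's current value."""
        self.labels[name] = self.counter

    def align(self, boundary):
        """Advance the counter to the next multiple of `boundary` bytes."""
        self.counter += -self.counter % boundary


text = Section()
text.label("start")
text.emit(b"abc")
text.align(4)
text.label("aligned")
print(text.labels)
```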
|
|
|
|
|
|
<H2><A NAME="SEC44" HREF="as_toc.html#TOC44">bss Section</A></H2>
|
|
|
|
<P>
|
|
<A NAME="IDX188"></A>
|
|
<A NAME="IDX189"></A>
|
|
The bss section is used for local common variable storage.
|
|
You may allocate address space in the bss section, but you may
|
|
not dictate data to load into it before your program executes. When
|
|
your program starts running, all the contents of the bss
|
|
section are zeroed bytes.
|
|
|
|
</P>
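<P>
The space saving that the bss section provides can be sketched by comparing
what is stored with what the running program sees. The function names and
byte values below are made up for illustration, and headers, symbol tables
and relocation data are ignored.
</P>

```python
def stored_bytes(text, data, bss_length):
    """Section contents written to a (hypothetical) object file."""
    # The bss section contributes no bytes here; only its length is recorded.
    return text + data


def memory_image(text, data, bss_length):
    """Section contents as the program sees them when it starts running."""
    return text + data + bytes(bss_length)   # bss starts out as zeroed bytes


text, data, bss_length = b"\x90\x90", b"\x01\x02\x03\x04", 1024
print(len(stored_bytes(text, data, bss_length)))
print(len(memory_image(text, data, bss_length)))
```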
|
|
<P>
|
|
The <CODE>.lcomm</CODE> pseudo-op defines a symbol in the bss section; see
|
|
section <A HREF="as_7.html#SEC101"><CODE>.lcomm <VAR>symbol</VAR> , <VAR>length</VAR></CODE></A>.
|
|
|
|
</P>
|
|
<P>
|
|
The <CODE>.comm</CODE> pseudo-op may be used to declare a common symbol, which is
|
|
another form of uninitialized symbol; see section <A HREF="as_7.html#SEC76"><CODE>.comm <VAR>symbol</VAR> , <VAR>length</VAR></CODE></A>.
|
|
|
|
</P>
|
|
<P>
|
|
When assembling for a target which supports multiple sections, such as ELF or
|
|
COFF, you may switch into the <CODE>.bss</CODE> section and define symbols as usual;
|
|
see section <A HREF="as_7.html#SEC119"><CODE>.section <VAR>name</VAR></CODE></A>. You may only assemble zero values into the
|
|
section. Typically the section will only contain symbol definitions and
|
|
<CODE>.skip</CODE> directives (see section <A HREF="as_7.html#SEC125"><CODE>.skip <VAR>size</VAR> , <VAR>fill</VAR></CODE></A>).
|
|
|
|
</P>
|
|
</BODY>
|
|
</HTML>
|