This is Info file gcc.info, produced by Makeinfo-1.47 from the input
file gcc.texi.

This file documents the use and the internals of the GNU compiler.

Copyright (C) 1988, 1989, 1992 Free Software Foundation, Inc.

Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.

Permission is granted to copy and distribute modified versions of
this manual under the conditions for verbatim copying, provided also
that the sections entitled "GNU General Public License" and "Boycott"
are included exactly as in the original, and provided that the entire
resulting derived work is distributed under the terms of a permission
notice identical to this one.

Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for modified
versions, except that the sections entitled "GNU General Public
License" and "Boycott", and this permission notice, may be included in
translations approved by the Free Software Foundation instead of in the
original English.

File: gcc.info, Node: Standard Names, Next: Pattern Ordering, Prev: Constraints, Up: Machine Desc

Standard Names for Patterns Used in Generation
==============================================

Here is a table of the instruction names that are meaningful in the
RTL generation pass of the compiler. Giving one of these names to an
instruction pattern tells the RTL generation pass that it can use the
pattern to accomplish a certain task.

`movM'
Here M stands for a two-letter machine mode name, in lower case.
This instruction pattern moves data with that machine mode from
operand 1 to operand 0. For example, `movsi' moves full-word data.

If operand 0 is a `subreg' with mode M of a register whose own
mode is wider than M, the effect of this instruction is to store
the specified value in the part of the register that corresponds
to mode M. The effect on the rest of the register is undefined.

This class of patterns is special in several ways. First of all,
each of these names *must* be defined, because there is no other
way to copy a datum from one place to another.

Second, these patterns are not used solely in the RTL generation
pass. Even the reload pass can generate move insns to copy values
from stack slots into temporary registers. When it does so, one
of the operands is a hard register and the other is an operand
that may need to be reloaded into a register.

Therefore, when given such a pair of operands, the pattern must
generate RTL which needs no reloading and needs no temporary
registers--no registers other than the operands. For example, if
you support the pattern with a `define_expand', then in such a
case the `define_expand' must not call `force_reg' or any other such
function which might generate new pseudo registers.

This requirement exists even for subword modes on a RISC machine
where fetching those modes from memory normally requires several
insns and some temporary registers. Look in `spur.md' to see how
the requirement can be satisfied.

During reload a memory reference with an invalid address may be
passed as an operand. Such an address will be replaced with a
valid address later in the reload pass. In this case, nothing may
be done with the address except to use it as it stands. If it is
copied, it will not be replaced with a valid address. No attempt
should be made to make such an address into a valid address and no
routine (such as `change_address') that will do so may be called.
Note that `general_operand' will fail when applied to such an
address.

The global variable `reload_in_progress' (which must be explicitly
declared if required) can be used to determine whether such special
handling is required.

The variety of operands that have reloads depends on the rest of
the machine description, but typically on a RISC machine these can
only be pseudo registers that did not get hard registers, while on
other machines explicit memory references will get optional
reloads.

If a scratch register is required to move an object to or from
memory, it can be allocated using `gen_reg_rtx' prior to reload.
But this is impossible during and after reload. If there are
cases needing scratch registers after reload, you must define
`SECONDARY_INPUT_RELOAD_CLASS' and/or
`SECONDARY_OUTPUT_RELOAD_CLASS' to detect them, and provide
patterns `reload_inM' or `reload_outM' to handle them. *Note
Register Classes::.

The constraints on a `movM' must permit moving any hard register
to any other hard register provided that `HARD_REGNO_MODE_OK'
permits mode M in both registers and `REGISTER_MOVE_COST' applied
to their classes returns a value of 2.

It is obligatory to support floating point `movM' instructions
into and out of any registers that can hold fixed point values,
because unions and structures (which have modes `SImode' or
`DImode') can be in those registers and they may have floating
point members.

There may also be a need to support fixed point `movM'
instructions in and out of floating point registers.
Unfortunately, I have forgotten why this was so, and I don't know
whether it is still true. If `HARD_REGNO_MODE_OK' rejects fixed
point values in floating point registers, then the constraints of
the fixed point `movM' instructions must be designed to avoid
ever trying to reload into a floating point register.

`reload_inM'
`reload_outM'
Like `movM', but used when a scratch register is required to move
between operand 0 and operand 1. Operand 2 describes the scratch
register. See the discussion of the `SECONDARY_RELOAD_CLASS'
macro in *note Register Classes::..

`movstrictM'
Like `movM' except that if operand 0 is a `subreg' with mode M of
a register whose natural mode is wider, the `movstrictM'
instruction is guaranteed not to alter any of the register except
the part which belongs to mode M.

`load_multiple'
Load several consecutive memory locations into consecutive
registers. Operand 0 is the first of the consecutive registers,
operand 1 is the first memory location, and operand 2 is a
constant: the number of consecutive registers.

Define this only if the target machine really has such an
instruction; do not define this if the most efficient way of
loading consecutive registers from memory is to do them one at a
time.

On some machines, there are restrictions as to which consecutive
registers can be stored into memory, such as particular starting or
ending register numbers or only a range of valid counts. For those
machines, use a `define_expand' (*note Expander Definitions::.)
and make the pattern fail if the restrictions are not met.

Write the generated insn as a `parallel' with elements being a
`set' of one register from the appropriate memory location (you may
also need `use' or `clobber' elements). Use a `match_parallel'
(*note RTL Template::.) to recognize the insn. See `a29k.md' and
`rs6000.md' for examples of the use of this insn pattern.

`store_multiple'
Similar to `load_multiple', but store several consecutive registers
into consecutive memory locations. Operand 0 is the first of the
consecutive memory locations, operand 1 is the first register, and
operand 2 is a constant: the number of consecutive registers.

`addM3'
Add operand 2 and operand 1, storing the result in operand 0. All
operands must have mode M. This can be used even on two-address
machines, by means of constraints requiring operands 1 and 0 to be
the same location.

`subM3', `mulM3'
`divM3', `udivM3', `modM3', `umodM3'
`sminM3', `smaxM3', `uminM3', `umaxM3'
`andM3', `iorM3', `xorM3'
Similar, for other arithmetic operations.

`mulhisi3'
Multiply operands 1 and 2, which have mode `HImode', and store a
`SImode' product in operand 0.

`mulqihi3', `mulsidi3'
Similar widening-multiplication instructions of other widths.

`umulqihi3', `umulhisi3', `umulsidi3'
Similar widening-multiplication instructions that do unsigned
multiplication.

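The arithmetic performed by the widening multiplications above can be
sketched in C. This is only a model, assuming `HImode' is 16 bits wide
and `SImode' is 32 bits wide; the function names are hypothetical and
are not part of GNU CC.

```c
#include <stdint.h>

/* Model of `mulhisi3': both inputs are HImode (assumed 16 bits) and
   the product is SImode (assumed 32 bits), so it cannot overflow.  */
int32_t model_mulhisi3(int16_t op1, int16_t op2)
{
    return (int32_t)op1 * (int32_t)op2;
}

/* Model of `umulhisi3': same widths, unsigned interpretation.  */
uint32_t model_umulhisi3(uint16_t op1, uint16_t op2)
{
    return (uint32_t)op1 * (uint32_t)op2;
}
```

The point of the widening forms is that the inputs are extended before
the multiplication, so the full product is kept.
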
`divmodM4'
Signed division that produces both a quotient and a remainder.
Operand 1 is divided by operand 2 to produce a quotient stored in
operand 0 and a remainder stored in operand 3.

For machines with an instruction that produces both a quotient and
a remainder, provide a pattern for `divmodM4' but do not provide
patterns for `divM3' and `modM3'. This allows optimization in the
relatively common case when both the quotient and remainder are
computed.

If an instruction that just produces a quotient or just a remainder
exists and is more efficient than the instruction that produces
both, write the output routine of `divmodM4' to call
`find_reg_note' and look for a `REG_UNUSED' note on the quotient
or remainder and generate the appropriate instruction.

`udivmodM4'
Similar, but does unsigned division.

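As a rough illustration of the operand roles in `divmodM4', here is a
C model for an `int' operand. The structure and function names are
hypothetical. The rounding shown (truncation toward zero, remainder
taking the sign of the dividend) is what C99 division does and is an
assumption of this sketch, not something the text above specifies.

```c
#include <assert.h>

/* Operand numbering follows `divmodM4': operand 1 is the dividend,
   operand 2 the divisor, operand 0 receives the quotient and
   operand 3 the remainder.  */
struct divmod_result {
    int quotient;   /* operand 0 */
    int remainder;  /* operand 3 */
};

struct divmod_result model_divmodsi4(int op1, int op2)
{
    struct divmod_result r;
    assert(op2 != 0);           /* division by zero is not modelled */
    r.quotient = op1 / op2;     /* C99: truncates toward zero */
    r.remainder = op1 % op2;    /* C99: sign follows the dividend */
    return r;
}
```
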
`ashlM3'
Arithmetic-shift operand 1 left by a number of bits specified by
operand 2, and store the result in operand 0. Here M is the mode
of operand 0 and operand 1; operand 2's mode is specified by the
instruction pattern, and the compiler will convert the operand to
that mode before generating the instruction.

`ashrM3', `lshlM3', `lshrM3', `rotlM3', `rotrM3'
Other shift and rotate instructions, analogous to the `ashlM3'
instructions.

Logical and arithmetic left shift are the same. Machines that do
not allow negative shift counts often have only one instruction for
shifting left. On such machines, you should define a pattern named
`ashlM3' and leave `lshlM3' undefined.

`negM2'
Negate operand 1 and store the result in operand 0.

`absM2'
Store the absolute value of operand 1 into operand 0.

`sqrtM2'
Store the square root of operand 1 into operand 0.

The `sqrt' built-in function of C always uses the mode which
corresponds to the C data type `double'.

`ffsM2'
Store into operand 0 one plus the index of the least significant
1-bit of operand 1. If operand 1 is zero, store zero. M is the
mode of operand 0; operand 1's mode is specified by the instruction
pattern, and the compiler will convert the operand to that mode
before generating the instruction.

The `ffs' built-in function of C always uses the mode which
corresponds to the C data type `int'.

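The value that `ffsM2' must produce can be written out as a short C
loop. This is a sketch of the semantics only, for an `int' operand;
the name `model_ffs' is hypothetical.

```c
/* Model of `ffsM2' for an `int' operand: one plus the index of the
   least significant 1-bit, or zero when the operand is zero.  */
int model_ffs(int op1)
{
    unsigned int bits = (unsigned int)op1;
    int index;

    if (bits == 0)
        return 0;
    /* bits is nonzero here, so the loop terminates.  */
    for (index = 0; (bits & 1u) == 0; index++)
        bits >>= 1;
    return index + 1;
}
```
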
`one_cmplM2'
Store the bitwise-complement of operand 1 into operand 0.

`cmpM'
Compare operand 0 and operand 1, and set the condition codes. The
RTL pattern should look like this:

     (set (cc0) (compare (match_operand:M 0 ...)
                         (match_operand:M 1 ...)))

`tstM'
Compare operand 0 against zero, and set the condition codes. The
RTL pattern should look like this:

     (set (cc0) (match_operand:M 0 ...))

`tstM' patterns should not be defined for machines that do not use
`(cc0)'. Doing so would confuse the optimizer since it would no
longer be clear which `set' operations were comparisons. The
`cmpM' patterns should be used instead.

`movstrM'
Block move instruction. The addresses of the destination and
source strings are the first two operands, and both are in mode
`Pmode'. The number of bytes to move is the third operand, in mode
M.

The fourth operand is the known shared alignment of the source and
destination, in the form of a `const_int' rtx. Thus, if the
compiler knows that both source and destination are word-aligned,
it may provide the value 4 for this operand.

These patterns need not give special consideration to the
possibility that the source and destination strings might overlap.

`cmpstrM'
Block compare instruction, with five operands. Operand 0 is the
output; it has mode M. The remaining four operands are like the
operands of `movstrM'. The two memory blocks specified are
compared byte by byte in lexicographic order. The effect of the
instruction is to store a value in operand 0 whose sign indicates
the result of the comparison.

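The result expected from `cmpstrM' can be modelled in C as follows.
This sketch assumes bytes are compared as unsigned values, as the C
library function `memcmp' does; the text above does not say so
explicitly. The alignment operand is left out because it affects only
how the comparison may be carried out, not its result. The name
`model_cmpstr' is hypothetical.

```c
#include <stddef.h>

/* Model of `cmpstrM': compare two blocks byte by byte and return a
   value whose sign gives the result of the comparison.  */
int model_cmpstr(const unsigned char *a, const unsigned char *b, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;   /* first difference decides */
    }
    return 0;                              /* blocks are equal */
}
```
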
`floatMN2'
Convert signed integer operand 1 (valid for fixed point mode M) to
floating point mode N and store in operand 0 (which has mode N).

`floatunsMN2'
Convert unsigned integer operand 1 (valid for fixed point mode M)
to floating point mode N and store in operand 0 (which has mode N).

`fixMN2'
Convert operand 1 (valid for floating point mode M) to fixed point
mode N as a signed number and store in operand 0 (which has mode
N). This instruction's result is defined only when the value of
operand 1 is an integer.

`fixunsMN2'
Convert operand 1 (valid for floating point mode M) to fixed point
mode N as an unsigned number and store in operand 0 (which has
mode N). This instruction's result is defined only when the value
of operand 1 is an integer.

`ftruncM2'
Convert operand 1 (valid for floating point mode M) to an integer
value, still represented in floating point mode M, and store it in
operand 0 (valid for floating point mode M).

`fix_truncMN2'
Like `fixMN2' but works for any floating point value of mode M by
converting the value to an integer.

`fixuns_truncMN2'
Like `fixunsMN2' but works for any floating point value of mode M
by converting the value to an integer.

`truncMN'
Truncate operand 1 (valid for mode M) to mode N and store in
operand 0 (which has mode N). Both modes must be fixed point or
both floating point.

`extendMN'
Sign-extend operand 1 (valid for mode M) to mode N and store in
operand 0 (which has mode N). Both modes must be fixed point or
both floating point.

`zero_extendMN'
Zero-extend operand 1 (valid for mode M) to mode N and store in
operand 0 (which has mode N). Both modes must be fixed point.

`extv'
Extract a bit field from operand 1 (a register or memory operand),
where operand 2 specifies the width in bits and operand 3 the
starting bit, and store it in operand 0. Operand 0 must have mode
`word_mode'. Operand 1 may have mode `byte_mode' or `word_mode';
often `word_mode' is allowed only for registers. Operands 2 and 3
must be valid for `word_mode'.

The RTL generation pass generates this instruction only with
constants for operands 2 and 3.

The bit-field value is sign-extended to a full word integer before
it is stored in operand 0.

`extzv'
Like `extv' except that the bit-field value is zero-extended.

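The difference between `extv' and `extzv' is easiest to see in a small
C model. The sketch below makes two assumptions that the text does not
state: a word is 32 bits, and the starting bit is counted from the
least significant end (on a real target the numbering depends on the
machine description). The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extending extract, as `extzv' does.  Bit 0 is taken to be the
   least significant bit (an assumption of this sketch).  */
uint32_t model_extzv(uint32_t op1, unsigned width, unsigned start)
{
    uint32_t mask;

    assert(width >= 1 && width <= 32 && start <= 32 - width);
    mask = (width == 32) ? UINT32_MAX : ((UINT32_C(1) << width) - 1);
    return (op1 >> start) & mask;
}

/* Sign-extending extract, as `extv' does: the top bit of the field is
   copied into all higher bits of the full-word result.  */
int32_t model_extv(uint32_t op1, unsigned width, unsigned start)
{
    uint32_t field = model_extzv(op1, width, start);
    uint32_t sign = UINT32_C(1) << (width - 1);

    /* (field ^ sign) - sign propagates the sign bit using unsigned
       wraparound; the final cast assumes the usual two's-complement
       conversion.  */
    return (int32_t)((field ^ sign) - sign);
}
```

For a 4-bit field holding binary 1111, `model_extzv' yields 15 while
`model_extv' yields -1.
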
`insv'
Store operand 3 (which must be valid for `word_mode') into a bit
field in operand 0, where operand 1 specifies the width in bits and
operand 2 the starting bit. Operand 0 may have mode `byte_mode' or
`word_mode'; often `word_mode' is allowed only for registers.
Operands 1 and 2 must be valid for `word_mode'.

The RTL generation pass generates this instruction only with
constants for operands 1 and 2.

`sCOND'
Store zero or nonzero in the operand according to the condition
codes. Value stored is nonzero iff the condition COND is true.
COND is the name of a comparison operation expression code, such
as `eq', `lt' or `leu'.

You specify the mode that the operand must have when you write the
`match_operand' expression. The compiler automatically sees which
mode you have used and supplies an operand of that mode.

The value stored for a true condition must have 1 as its low bit,
or else must be negative. Otherwise the instruction is not
suitable and you should omit it from the machine description. You
describe to the compiler exactly which value is stored by defining
the macro `STORE_FLAG_VALUE' (*note Misc::.). If a description
cannot be found that can be used for all the `sCOND' patterns, you
should omit those operations from the machine description.

These operations may fail, but should do so only in relatively
uncommon cases; if they would fail for common cases involving
integer comparisons, it is best to omit these patterns.

If these operations are omitted, the compiler will usually
generate code that copies the constant one to the target and
branches around an assignment of zero to the target. If this code
is more efficient than the potential instructions used for the
`sCOND' pattern followed by those required to convert the result
into a 1 or a zero in `SImode', you should omit the `sCOND'
operations from the machine description.

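The two ways of computing a comparison result described above can be
compared side by side in C. This is a sketch only: the names are
hypothetical, and `MODEL_STORE_FLAG_VALUE' merely stands in for
whatever a real target defines as `STORE_FLAG_VALUE'.

```c
/* Hypothetical value for the sketch; a real target defines
   `STORE_FLAG_VALUE' in its own machine description headers.  */
#define MODEL_STORE_FLAG_VALUE 1

/* What an `slt' pattern computes: the flag value when the condition
   holds, zero otherwise.  */
int model_slt(int a, int b)
{
    return (a < b) ? MODEL_STORE_FLAG_VALUE : 0;
}

/* The fallback used when no `sCOND' pattern exists: copy the constant
   one to the target, then branch around an assignment of zero.  */
int model_lt_by_branch(int a, int b)
{
    int target = 1;

    if (a < b)
        goto done;      /* condition true: skip the store of zero */
    target = 0;
done:
    return target;
}
```

Both functions give the same answer when the flag value is 1; which
one is cheaper on a given machine is what decides whether the `sCOND'
patterns are worth defining.
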
`bCOND'
Conditional branch instruction. Operand 0 is a `label_ref' that
refers to the label to jump to. Jump if the condition codes meet
condition COND.

Some machines do not follow the model assumed here where a
comparison instruction is followed by a conditional branch
instruction. In that case, the `cmpM' (and `tstM') patterns should
simply store the operands away and generate all the required insns
in a `define_expand' (*note Expander Definitions::.) for the
conditional branch operations. All calls to expand `bCOND'
patterns are immediately preceded by calls to expand either a
`cmpM' pattern or a `tstM' pattern.

Machines that use a pseudo register for the condition code value,
or where the mode used for the comparison depends on the condition
being tested, should also use the above mechanism. *Note Jump
Patterns::.

The above discussion also applies to `sCOND' patterns.

`call'
Subroutine call instruction returning no value. Operand 0 is the
function to call; operand 1 is the number of bytes of arguments
pushed (in mode `SImode', except it is normally a `const_int');
operand 2 is the number of registers used as operands.

On most machines, operand 2 is not actually stored into the RTL
pattern. It is supplied for the sake of some RISC machines which
need to put this information into the assembler code; they can put
it in the RTL instead of operand 1.

Operand 0 should be a `mem' RTX whose address is the address of the
function. Note, however, that this address can be a `symbol_ref'
expression even if it would not be a legitimate memory address on
the target machine. If it is also not a valid argument for a call
instruction, the pattern for this operation should be a
`define_expand' (*note Expander Definitions::.) that places the
address into a register and uses that register in the call
instruction.

`call_value'
Subroutine call instruction returning a value. Operand 0 is the
hard register in which the value is returned. There are three more
operands, the same as the three operands of the `call' instruction
(but with numbers increased by one).

Subroutines that return `BLKmode' objects use the `call' insn.

`call_pop', `call_value_pop'
Similar to `call' and `call_value', except used if defined and if
`RETURN_POPS_ARGS' is non-zero. They should emit a `parallel'
that contains both the function call and a `set' to indicate the
adjustment made to the frame pointer.

For machines where `RETURN_POPS_ARGS' can be non-zero, the use of
these patterns increases the number of functions for which the
frame pointer can be eliminated, if desired.

`return'
Subroutine return instruction. This instruction pattern name
should be defined only if a single instruction can do all the work
of returning from a function.

Like the `movM' patterns, this pattern is also used after the RTL
generation phase. In this case it is to support machines where
multiple instructions are usually needed to return from a
function, but some class of functions only requires one
instruction to implement a return. Normally, the applicable
functions are those which do not need to save any registers or
allocate stack space.

For such machines, the condition specified in this pattern should
only be true when `reload_completed' is non-zero and the function's
epilogue would only be a single instruction. For machines with
register windows, the routine `leaf_function_p' may be used to
determine if a register window push is required.

Machines that have conditional return instructions should define
patterns such as

     (define_insn ""
       [(set (pc)
             (if_then_else (match_operator 0 "comparison_operator"
                                           [(cc0) (const_int 0)])
                           (return)
                           (pc)))]
       "CONDITION"
       "...")

where CONDITION would normally be the same condition specified on
the named `return' pattern.

`nop'
No-op instruction. This instruction pattern name should always be
defined to output a no-op in assembler code. `(const_int 0)' will
do as an RTL pattern.

`indirect_jump'
An instruction to jump to an address which is operand zero. This
pattern name is mandatory on all machines.

`casesi'
Instruction to jump through a dispatch table, including bounds
checking. This instruction takes five operands:

1. The index to dispatch on, which has mode `SImode'.

2. The lower bound for indices in the table, an integer constant.

3. The total range of indices in the table--the largest index
   minus the smallest one (both inclusive).

4. A label that precedes the table itself.

5. A label to jump to if the index has a value outside the
   bounds. (If the machine-description macro
   `CASE_DROPS_THROUGH' is defined, then an out-of-bounds index
   drops through to the code following the jump table instead of
   jumping to this label. In that case, this label is not
   actually used by the `casesi' instruction, but it is always
   provided as an operand.)

The table is an `addr_vec' or `addr_diff_vec' inside of a
`jump_insn'. The number of elements in the table is one plus the
difference between the upper bound and the lower bound.

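The bounds check that `casesi' performs can be modelled in C. The
sketch below uses one common technique, a single unsigned comparison
after subtracting the lower bound; this is an assumption about how a
machine might implement the check, not something the text above
requires. The function name is hypothetical, and -1 stands in for
"jump to the out-of-bounds label".

```c
/* Pick the table slot for `index', or -1 for the out-of-bounds label.
   `lower_bound' is the second operand in the list above and `range'
   the third (largest index minus smallest index), so the table has
   range + 1 entries.  */
int model_casesi_slot(int index, int lower_bound, unsigned int range)
{
    /* Subtracting first and comparing unsigned checks both bounds with
       one comparison: a too-small index wraps to a huge value.  */
    unsigned int offset = (unsigned int)index - (unsigned int)lower_bound;

    if (offset > range)
        return -1;          /* fifth operand: the out-of-bounds label */
    return (int)offset;     /* slot in the table after the fourth operand */
}
```
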
`tablejump'
Instruction to jump to a variable address. This is a low-level
capability which can be used to implement a dispatch table when
there is no `casesi' pattern.

This pattern requires two operands: the address or offset, and a
label which should immediately precede the jump table. If the
macro `CASE_VECTOR_PC_RELATIVE' is defined then the first operand
is an offset which counts from the address of the table;
otherwise, it is an absolute address to jump to. In either case,
the first operand has mode `Pmode'.

The `tablejump' insn is always the last insn before the jump table
it uses. Its assembler code normally has no need to use the
second operand, but you should incorporate it in the RTL pattern so
that the jump optimizer will not delete the table as unreachable
code.

`save_stack_block'
`save_stack_function'
`save_stack_nonlocal'
`restore_stack_block'
`restore_stack_function'
`restore_stack_nonlocal'
Most machines save and restore the stack pointer by copying it to
or from an object of mode `Pmode'. Do not define these patterns on
such machines.

Some machines require special handling for stack pointer saves and
restores. On those machines, define the patterns corresponding to
the non-standard cases by using a `define_expand' (*note Expander
Definitions::.) that produces the required insns. The three types
of saves and restores are:

1. `save_stack_block' saves the stack pointer at the start of a
   block that allocates a variable-sized object and
   `restore_stack_block' restores the stack pointer when the
   block is exited.

2. `save_stack_function' and `restore_stack_function' operate
   similarly for the outermost block of a function and are used
   when the function allocates variable-sized objects or calls
   `alloca'. Only the epilogue uses the restored stack pointer,
   allowing a simpler save or restore sequence on some machines.

3. `save_stack_nonlocal' is used in functions that contain labels
   branched to by nested functions. It saves the stack pointer
   in such a way that the inner function can use
   `restore_stack_nonlocal' to restore the stack pointer. The
   compiler generates code to restore the frame and argument
   pointer registers, but some machines require saving and
   restoring additional data such as register window information
   or stack backchains. Place insns in these patterns to save
   and restore any such required data.

When saving the stack pointer, operand 0 is the save area and
operand 1 is the stack pointer. The mode used to allocate the
save area is the mode of operand 0. You must specify an integral
mode, or `VOIDmode' if no save area is needed for a particular
type of save (either because no save is needed or because a
machine-specific save area can be used). Operand 0 is the stack
pointer and operand 1 is the save area for restore operations. If
`save_stack_block' is defined, operand 0 must not be `VOIDmode'
since these saves can be arbitrarily nested.

A save area is a `mem' that is at a constant offset from
`virtual_stack_vars_rtx' when the stack pointer is saved for use by
nonlocal gotos and a `reg' in the other two cases.

`allocate_stack'
Subtract operand 0 from the stack pointer to create space for
dynamically allocated data.

Do not define this pattern if all that must be done is the
subtraction. Some machines require other operations such as
stack probes or maintaining the back chain. Define this pattern
to emit those operations in addition to updating the stack pointer.

File: gcc.info, Node: Pattern Ordering, Next: Dependent Patterns, Prev: Standard Names, Up: Machine Desc

When the Order of Patterns Matters
==================================

Sometimes an insn can match more than one instruction pattern. Then
the pattern that appears first in the machine description is the one
used. Therefore, more specific patterns (patterns that will match fewer
things) and faster instructions (those that will produce better code
when they do match) should usually go first in the description.

In some cases the effect of ordering the patterns can be used to hide
a pattern when it is not valid. For example, the 68000 has an
instruction for converting a fullword to floating point and another for
converting a byte to floating point. An instruction converting an
integer to floating point could match either one. We put the pattern
to convert the fullword first to make sure that one will be used rather
than the other. (Otherwise a large integer might be generated as a
single-byte immediate quantity, which would not work.) Instead of using
this pattern ordering it would be possible to make the pattern for
convert-a-byte smart enough to deal properly with any constant value.

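The first-match rule and the two remedies mentioned above can be
illustrated with a toy model in C. Nothing here is GNU CC code: the
structure, the pattern names and the predicates are all hypothetical,
and the signed byte range used by `accepts_byte' is an assumption made
for the sake of the example.

```c
#include <stddef.h>

/* A toy "pattern": a name plus a predicate on an integer constant.  */
struct toy_pattern {
    const char *name;
    int (*matches)(long value);
};

static int accepts_any(long value) { (void)value; return 1; }
static int accepts_byte(long value) { return value >= -128 && value <= 127; }

/* Remedy by ordering: both patterns accept any constant, so the
   fullword pattern is listed first.  */
const struct toy_pattern ordered[2] = {
    { "float_fullword", accepts_any },
    { "float_byte",     accepts_any },
};

/* Remedy by predicate: the byte pattern comes first but rejects
   constants that do not fit in a byte.  */
const struct toy_pattern guarded[2] = {
    { "float_byte",     accepts_byte },
    { "float_fullword", accepts_any },
};

/* Return the name of the first pattern that matches, mirroring the
   rule that the earliest matching pattern is the one used.  */
const char *first_match(const struct toy_pattern *pats, size_t n, long value)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (pats[i].matches(value))
            return pats[i].name;
    return NULL;
}
```

With either table the constant 1000 ends up at the fullword pattern;
only the guarded table lets a small constant such as 5 use the byte
pattern.
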
File: gcc.info, Node: Dependent Patterns, Next: Jump Patterns, Prev: Pattern Ordering, Up: Machine Desc

Interdependence of Patterns
===========================

Every machine description must have a named pattern for each of the
conditional branch names `bCOND'. The recognition template must always
have the form

     (set (pc)
          (if_then_else (COND (cc0) (const_int 0))
                        (label_ref (match_operand 0 "" ""))
                        (pc)))

In addition, every machine description must have an anonymous pattern
for each of the possible reverse-conditional branches. Their templates
look like

     (set (pc)
          (if_then_else (COND (cc0) (const_int 0))
                        (pc)
                        (label_ref (match_operand 0 "" ""))))

They are necessary because jump optimization can turn direct-conditional
branches into reverse-conditional branches.

It is often convenient to use the `match_operator' construct to
reduce the number of patterns that must be specified for branches. For
example,

     (define_insn ""
       [(set (pc)
             (if_then_else (match_operator 0 "comparison_operator"
                                           [(cc0) (const_int 0)])
                           (pc)
                           (label_ref (match_operand 1 "" ""))))]
       "CONDITION"
       "...")

In some cases machines support instructions identical except for the
machine mode of one or more operands. For example, there may be
"sign-extend halfword" and "sign-extend byte" instructions whose
patterns are

     (set (match_operand:SI 0 ...)
          (extend:SI (match_operand:HI 1 ...)))

     (set (match_operand:SI 0 ...)
          (extend:SI (match_operand:QI 1 ...)))

Constant integers do not specify a machine mode, so an instruction to
extend a constant value could match either pattern. The pattern it
actually will match is the one that appears first in the file. For
correct results, this must be the one for the widest possible mode
(`HImode', here). If the pattern matches the `QImode' instruction, the
results will be incorrect if the constant value does not actually fit
that mode.

Such instructions to extend constants are rarely generated because
they are optimized away, but they do occasionally happen in nonoptimized
compilations.

If a constraint in a pattern allows a constant, the reload pass may
replace a register with a constant permitted by the constraint in some
cases. Similarly for memory references. You must ensure that the
predicate permits all objects allowed by the constraints to prevent the
compiler from crashing.

Because of this substitution, you should not provide separate
patterns for increment and decrement instructions. Instead, they
should be generated from the same pattern that supports
register-register add insns by examining the operands and generating
the appropriate machine instruction.

File: gcc.info, Node: Jump Patterns, Next: Insn Canonicalizations, Prev: Dependent Patterns, Up: Machine Desc
|
||
|
||
Defining Jump Instruction Patterns
|
||
==================================
|
||
|
||
For most machines, GNU CC assumes that the machine has a condition
code. A comparison insn sets the condition code, recording the results
of both signed and unsigned comparison of the given operands. A
separate branch insn tests the condition code and branches or not
according to its value. The branch insns come in distinct signed and
unsigned flavors. Many common machines, such as the Vax, the 68000 and
the 32000, work this way.

Some machines have distinct signed and unsigned compare
instructions, and only one set of conditional branch instructions. The
easiest way to handle these machines is to treat them just like the
others until the final stage where assembly code is written. At this
time, when outputting code for the compare instruction, peek ahead at
the following branch using `next_cc0_user (insn)'. (The variable
`insn' refers to the insn being output, in the output-writing code in
an instruction pattern.) If the RTL says that it is an unsigned branch,
output an unsigned compare; otherwise output a signed compare. When
the branch itself is output, you can treat signed and unsigned branches
identically.

The reason you can do this is that GNU CC always generates a pair of
consecutive RTL insns, possibly separated by `note' insns, one to set
the condition code and one to test it, and keeps the pair inviolate
until the end.

To go with this technique, you must define the machine-description
macro `NOTICE_UPDATE_CC' to do `CC_STATUS_INIT'; in other words, no
compare instruction is superfluous.

Some machines have compare-and-branch instructions and no condition
code. A similar technique works for them. When it is time to "output" a
compare instruction, record its operands in two static variables. When
outputting the branch-on-condition-code instruction that follows,
actually output a compare-and-branch instruction that uses the
remembered operands.

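The bookkeeping for this technique can be sketched in C as follows.
This is only an illustration under stated assumptions: the function
names `output_compare' and `output_branch', the use of strings for
operands, and the `cb' mnemonic are all hypothetical, and a real
machine description would keep `rtx' operands and write through the
usual output routines.

```c
#include <stdio.h>

/* Operands remembered from the most recent compare insn.  */
static const char *saved_cmp_op0;
static const char *saved_cmp_op1;

/* "Output" a compare insn: write nothing, only record the operands.  */
static void
output_compare (const char *op0, const char *op1)
{
  saved_cmp_op0 = op0;
  saved_cmp_op1 = op1;
}

/* Output the branch that follows: format one compare-and-branch
   instruction into BUF using the remembered operands.  The "cb"
   mnemonic is invented for this sketch.  Returns what snprintf
   returns.  */
static int
output_branch (char *buf, size_t size, const char *cond, const char *label)
{
  return snprintf (buf, size, "cb%s %s,%s,%s",
                   cond, saved_cmp_op0, saved_cmp_op1, label);
}
```

Calling `output_compare ("r1", "r2")' and then `output_branch' with
condition `eq' and label `L5' leaves `cbeq r1,r2,L5' in the buffer.
This relies on the compare and its branch staying adjacent, as
described above.
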
It also works to define patterns for compare-and-branch instructions.
In optimizing compilation, the pair of compare and branch instructions
will be combined according to these patterns. But this does not happen
if optimization is not requested. So you must use one of the solutions
above in addition to any special patterns you define.

In many RISC machines, most instructions do not affect the condition
code and there may not even be a separate condition code register. On
these machines, the restriction that the definition and use of the
condition code be adjacent insns is not necessary and can prevent
important optimizations. For example, on the IBM RS/6000, there is a
delay for taken branches unless the condition code register is set three
instructions earlier than the conditional branch. The instruction
scheduler cannot perform this optimization if it is not permitted to
separate the definition and use of the condition code register.

On these machines, do not use `(cc0)', but instead use a register to
represent the condition code. If there is a specific condition code
register in the machine, use a hard register. If the condition code or
comparison result can be placed in any general register, or if there are
multiple condition registers, use a pseudo register.

On some machines, the type of branch instruction generated may
depend on the way the condition code was produced; for example, on the
68k and Sparc, setting the condition code directly from an add or
subtract instruction does not clear the overflow bit the way that a test
instruction does, so a different branch instruction must be used for
some conditional branches. For machines that use `(cc0)', the set and
use of the condition code must be adjacent (separated only by `note'
insns) allowing flags in `cc_status' to be used. (*Note Condition
Code::.) Also, the comparison and branch insns can be located from
each other by using the functions `prev_cc0_setter' and `next_cc0_user'.

However, this is not true on machines that do not use `(cc0)'. On
those machines, no assumptions can be made about the adjacency of the
compare and branch insns and the above methods cannot be used. Instead,
we use the machine mode of the condition code register to record
different formats of the condition code register.

Registers used to store the condition code value should have a mode
that is in class `MODE_CC'. Normally, it will be `CCmode'. If
additional modes are required (as for the add example mentioned above in
the Sparc), define the macro `EXTRA_CC_MODES' to list the additional
modes required (*note Condition Code::.). Also define `EXTRA_CC_NAMES'
to list the names of those modes and `SELECT_CC_MODE' to choose a mode
given an operand of a compare.

If it is known during RTL generation that a different mode will be
required (for example, if the machine has separate compare instructions
for signed and unsigned quantities, like most IBM processors), that mode
can be specified at that time.

If the cases that require different modes would be made by
instruction combination, the macro `SELECT_CC_MODE' determines which
machine mode should be used for the comparison result. The patterns
should be written using that mode. To support the case of the add on
the Sparc discussed above, we have the pattern

     (define_insn ""
       [(set (reg:CC_NOOV 0)
             (compare:CC_NOOV (plus:SI (match_operand:SI 0 "register_operand" "%r")
                                       (match_operand:SI 1 "arith_operand" "rI"))
                              (const_int 0)))]
       ""
       "...")

The `SELECT_CC_MODE' macro on the Sparc returns `CC_NOOVmode' for
comparisons whose argument is a `plus'.

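The shape of such a macro can be sketched in C. This is a toy model and
not the Sparc definition: the enums, `struct rtx_def' and `GET_CODE'
below are simplified stand-ins so that the fragment is self-contained,
and the two-argument form `SELECT_CC_MODE (OP, X)' is an assumption made
for illustration.

```c
/* Simplified stand-ins for the compiler's own codes and modes.  */
enum rtx_code { REG, CONST_INT, PLUS, MINUS, EQ };
enum machine_mode { CCmode, CC_NOOVmode };

struct rtx_def
{
  enum rtx_code code;
};

#define GET_CODE(X) ((X)->code)

/* Choose the condition-code mode for a comparison whose first operand
   is X.  OP, the comparison code, is not examined in this sketch.  A
   `plus' gets the mode that records "overflow bit not valid".  */
#define SELECT_CC_MODE(OP, X) \
  (GET_CODE (X) == PLUS ? CC_NOOVmode : CCmode)
```

Given an operand whose code is `PLUS', the sketch selects
`CC_NOOVmode'; for a plain register it selects `CCmode'. A pattern
written with `CC_NOOV' then matches only the former case.
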
File: gcc.info, Node: Insn Canonicalizations, Next: Peephole Definitions, Prev: Jump Patterns, Up: Machine Desc

Canonicalization of Instructions
================================

There are often cases where multiple RTL expressions could represent
an operation performed by a single machine instruction. This situation
is most commonly encountered with logical, branch, and
multiply-accumulate instructions. In such cases, the compiler attempts
to convert these multiple RTL expressions into a single canonical form
to reduce the number of insn patterns required.

In addition to algebraic simplifications, the following
canonicalizations are performed:

   * For commutative and comparison operators, a constant is always
     made the second operand. If a machine only supports a constant as
     the second operand, only patterns that match a constant in the
     second operand need be supplied.

     For these operators, if only one operand is a `neg', `not',
     `mult', `plus', or `minus' expression, it will be the first
     operand.

   * For the `compare' operator, a constant is always the second operand
     on machines where `cc0' is used (*note Jump Patterns::.). On other
     machines, there are rare cases where the compiler might want to
     construct a `compare' with a constant as the first operand.
     However, these cases are not common enough for it to be worthwhile
     to provide a pattern matching a constant as the first operand
     unless the machine actually has such an instruction.

     An operand of `neg', `not', `mult', `plus', or `minus' is made the
     first operand under the same conditions as above.

   * `(minus X (const_int N))' is converted to `(plus X (const_int
     -N))'.

   * Within address computations (i.e., inside `mem'), a left shift is
     converted into the appropriate multiplication by a power of two.

   * De Morgan's Law is used to move bitwise negation inside a bitwise
     logical-and or logical-or operation. If this results in only one
     operand being a `not' expression, it will be the first one.

     A machine that has an instruction that performs a bitwise
     logical-and of one operand with the bitwise negation of the other
     should specify the pattern for that instruction as

          (define_insn ""
            [(set (match_operand:M 0 ...)
                  (and:M (not:M (match_operand:M 1 ...))
                         (match_operand:M 2 ...)))]
            "..."
            "...")

     Similarly, a pattern for a "NAND" instruction should be written

          (define_insn ""
            [(set (match_operand:M 0 ...)
                  (ior:M (not:M (match_operand:M 1 ...))
                         (not:M (match_operand:M 2 ...))))]
            "..."
            "...")

     In both cases, it is not necessary to include patterns for the many
     logically equivalent RTL expressions.

   * The only possible RTL expressions involving both bitwise
     exclusive-or and bitwise negation are `(xor:M X Y)' and `(not:M
     (xor:M X Y))'.

   * The sum of three items, one of which is a constant, will only
     appear in the form

          (plus:M (plus:M X Y) CONSTANT)

   * On machines that do not use `cc0', `(compare X (const_int 0))'
     will be converted to X.

   * Equality comparisons of a group of bits (usually a single bit)
     with zero will be written using `zero_extract' rather than the
     equivalent `and' or `sign_extract' operations.


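Two of the rules above, placing a constant second in a commutative
operation and rewriting `minus' of a constant as `plus', can be modeled
with a small C sketch. It operates on a toy expression tree and not on
real RTL; `struct expr', the `X_' codes and `canonicalize' are
hypothetical names chosen for the illustration, and only `plus' is
treated as commutative here.

```c
#include <stddef.h>

/* Hypothetical stand-in for the compiler's expression codes.  */
enum expr_code { X_REG, X_CONST_INT, X_PLUS, X_MINUS };

struct expr
{
  enum expr_code code;
  long value;               /* register number or constant value */
  struct expr *op0, *op1;   /* operands; NULL for leaves */
};

/* Canonicalize the single node E in place.  */
static void
canonicalize (struct expr *e)
{
  /* (minus X (const_int N)) becomes (plus X (const_int -N)).  */
  if (e->code == X_MINUS && e->op1->code == X_CONST_INT)
    {
      e->code = X_PLUS;
      e->op1->value = -e->op1->value;
    }

  /* For a commutative operator, a constant becomes the second operand.  */
  if (e->code == X_PLUS
      && e->op0->code == X_CONST_INT
      && e->op1->code != X_CONST_INT)
    {
      struct expr *tmp = e->op0;
      e->op0 = e->op1;
      e->op1 = tmp;
    }
}
```

A machine description written against the canonical forms then needs
only one add pattern with the constant in the second operand, rather
than separate patterns for each equivalent spelling.
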
File: gcc.info, Node: Peephole Definitions, Next: Expander Definitions, Prev: Insn Canonicalizations, Up: Machine Desc

Defining Machine-Specific Peephole Optimizers
=============================================

In addition to instruction patterns, the `md' file may contain
definitions of machine-specific peephole optimizations.

The combiner does not notice certain peephole optimizations when the
data flow in the program does not suggest that it should try them. For
example, sometimes two consecutive insns related in purpose can be
combined even though the second one does not appear to use a register
computed in the first one. A machine-specific peephole optimizer can
detect such opportunities.

A definition looks like this:

     (define_peephole
       [INSN-PATTERN-1
        INSN-PATTERN-2
        ...]
       "CONDITION"
       "TEMPLATE"
       "OPTIONAL INSN-ATTRIBUTES")

The last string operand may be omitted if you are not using any
machine-specific information in this machine description. If present,
it must obey the same rules as in a `define_insn'.

In this skeleton, INSN-PATTERN-1 and so on are patterns to match
consecutive insns. The optimization applies to a sequence of insns when
INSN-PATTERN-1 matches the first one, INSN-PATTERN-2 matches the next,
and so on.

Each of the insns matched by a peephole must also match a
`define_insn'. Peepholes are checked only at the last stage just
before code generation, and only optionally. Therefore, any insn which
would match a peephole but no `define_insn' will cause a crash in code
generation in an unoptimized compilation, or at various optimization
stages.

The operands of the insns are matched with `match_operand',
`match_operator', and `match_dup', as usual. What is not usual is that
the operand numbers apply to all the insn patterns in the definition.
So, you can check for identical operands in two insns by using
`match_operand' in one insn and `match_dup' in the other.

The operand constraints used in `match_operand' patterns do not have
any direct effect on the applicability of the peephole, but they will
be validated afterward, so make sure your constraints are general enough
to apply whenever the peephole matches. If the peephole matches but
the constraints are not satisfied, the compiler will crash.

It is safe to omit constraints in all the operands of the peephole;
or you can write constraints which serve as a double-check on the
criteria previously tested.

Once a sequence of insns matches the patterns, the CONDITION is
checked. This is a C expression which makes the final decision whether
to perform the optimization (we do so if the expression is nonzero). If
CONDITION is omitted (in other words, the string is empty) then the
optimization is applied to every sequence of insns that matches the
patterns.

The defined peephole optimizations are applied after register
allocation is complete. Therefore, the peephole definition can check
which operands have ended up in which kinds of registers, just by
looking at the operands.

The way to refer to the operands in CONDITION is to write
`operands[I]' for operand number I (as matched by `(match_operand I
...)'). Use the variable `insn' to refer to the last of the insns
being matched; use `prev_nonnote_insn' to find the preceding insns.

When optimizing computations with intermediate results, you can use
CONDITION to match only when the intermediate results are not used
elsewhere. Use the C expression `dead_or_set_p (INSN, OP)', where INSN
is the insn in which you expect the value to be used for the last time
(from the value of `insn', together with use of `prev_nonnote_insn'),
and OP is the intermediate value (from `operands[I]').

Applying the optimization means replacing the sequence of insns with
one new insn. The TEMPLATE controls ultimate output of assembler code
for this combined insn. It works exactly like the template of a
`define_insn'. Operand numbers in this template are the same ones used
in matching the original sequence of insns.

The result of a defined peephole optimizer does not need to match
any of the insn patterns in the machine description; it does not even
have an opportunity to match them. The peephole optimizer definition
itself serves as the insn pattern to control how the insn is output.

Defined peephole optimizers are run as assembler code is being
output, so the insns they produce are never combined or rearranged in
any way.

Here is an example, taken from the 68000 machine description:

     (define_peephole
       [(set (reg:SI 15) (plus:SI (reg:SI 15) (const_int 4)))
        (set (match_operand:DF 0 "register_operand" "=f")
             (match_operand:DF 1 "register_operand" "ad"))]
       "FP_REG_P (operands[0]) && ! FP_REG_P (operands[1])"
       "*
     {
       rtx xoperands[2];
       xoperands[1] = gen_rtx (REG, SImode, REGNO (operands[1]) + 1);
     #ifdef MOTOROLA
       output_asm_insn (\"move.l %1,(sp)\", xoperands);
       output_asm_insn (\"move.l %1,-(sp)\", operands);
       return \"fmove.d (sp)+,%0\";
     #else
       output_asm_insn (\"movel %1,sp@\", xoperands);
       output_asm_insn (\"movel %1,sp@-\", operands);
       return \"fmoved sp@+,%0\";
     #endif
     }
     ")

The effect of this optimization is to change

     jbsr _foobar
     addql #4,sp
     movel d1,sp@-
     movel d0,sp@-
     fmoved sp@+,fp0

into

     jbsr _foobar
     movel d1,sp@
     movel d0,sp@-
     fmoved sp@+,fp0

INSN-PATTERN-1 and so on look *almost* like the second operand of
`define_insn'. There is one important difference: the second operand
of `define_insn' consists of one or more RTX's enclosed in square
brackets. Usually, there is only one: then the same action can be
written as an element of a `define_peephole'. But when there are
multiple actions in a `define_insn', they are implicitly enclosed in a
`parallel'. Then you must explicitly write the `parallel', and the
square brackets within it, in the `define_peephole'. Thus, if an insn
pattern looks like this,

     (define_insn "divmodsi4"
       [(set (match_operand:SI 0 "general_operand" "=d")
             (div:SI (match_operand:SI 1 "general_operand" "0")
                     (match_operand:SI 2 "general_operand" "dmsK")))
        (set (match_operand:SI 3 "general_operand" "=d")
             (mod:SI (match_dup 1) (match_dup 2)))]
       "TARGET_68020"
       "divsl%.l %2,%3:%0")

then the way to mention this insn in a peephole is as follows:

     (define_peephole
       [...
        (parallel
         [(set (match_operand:SI 0 "general_operand" "=d")
               (div:SI (match_operand:SI 1 "general_operand" "0")
                       (match_operand:SI 2 "general_operand" "dmsK")))
          (set (match_operand:SI 3 "general_operand" "=d")
               (mod:SI (match_dup 1) (match_dup 2)))])
        ...]
       ...)
