<html>
<head><title>Using Assembly Language in Linux</title></head>
<body>
<center>
<h1>Using Assembly Language in Linux.</h1>
<h2>by Phillip</h2>
<h3>phillip@ussrback.com</h3>
</center>
Last updated: Monday 8th January 2001<br>
<h2>Contents:</h2>
<ul>
<li> <a href=#Intro>Introduction</a><br>
<li> <a href=#Syntax>Intel and AT&T Syntax</a><br>
<ul>
<li><a href=#Prefixes>Prefixes</a><br>
<li><a href=#Direction>Direction of Operands</a><br>
<li><a href=#Memory>Memory Operands</a><br>
<li><a href=#Suffixes>Suffixes</a><br>
</ul>
<li><a href=#Syscalls>Syscalls</a><br>
<ul>
<li><a href=#Syscall5>Syscalls with < 6 args</a><br>
<li><a href=#Syscall6>Syscalls with > 5 args</a><br>
<li><a href=#Sockets>Socket syscalls</a><br>
</ul>
<li><a href=#Command>Command Line Arguments</a><br>
<li><a href=#InlineASM>GCC Inline ASM</a><br>
<li><a href=#Compiling>Compiling</a><br>
<li><a href=#Links>Further reference/Links</a><br>
<li><a href=#k0de>Example Code.</a><br>
</ul>

<hr>

<a name=Intro><h2>Introduction.</h2></a>

<p> This article describes assembly language programming under Linux. It
contains a comparison between Intel and AT&T syntax asm, a guide to using
syscalls and an introductory guide to using inline asm in gcc.</p>
<p> This article was written due to the lack of (good) info on this field
of programming (the inline asm section in particular). It is not a
shellcode writing tutorial, because there is no lack of info in that field.
</p>
<p> Various parts of this text I have learnt about through experimentation,
and hence they may be prone to error. Should you find any of these errors
on my part, do not hesitate to notify me via email and enlighten me on the
given issue.</p>
<p> There is only one prerequisite for reading this article, and that is a
basic knowledge of x86 assembly language and C. </p>

<hr>

<a name=Syntax><h2>Intel and AT&T Syntax.</h2></a>

<p> Intel and AT&T syntax assembly language look very different from each
other, and this leads to confusion when one first comes across AT&T syntax
after having learnt Intel syntax, or vice versa. So let's start with the
basics. </p>

<a name=Prefixes><h3>Prefixes.</h3></a>

<p> In Intel syntax there are no register prefixes or immediate prefixes.
In AT&T syntax, however, registers are prefixed with a '%' and immediates
are prefixed with a '$'. In Intel syntax, hexadecimal and binary immediate
data are suffixed with 'h' and 'b' respectively, and if the first
hexadecimal digit is a letter then the value is prefixed with a '0'. AT&T
syntax writes hexadecimal values with a C-style '0x' prefix instead.</p>

Example:<br>
<table border=1>
<tr><td>
Intel Syntax<br>
<pre>
mov	eax,1
mov	ebx,0ffh
int	80h
</pre></td><td>AT&T Syntax<br>
<pre>
movl	$1,%eax
movl	$0xff,%ebx
int	$0x80
</pre></td></tr>
</table>

<a name=Direction><h3>Direction of Operands.</h3></a>

<p>The direction of the operands in Intel syntax is opposite from that
of AT&T syntax. In Intel syntax the first operand is the destination, and
the second operand is the source, whereas in AT&T syntax the first operand
is the source and the second operand is the destination. The advantage of
AT&T syntax in this situation is obvious. We read from left to right, we
write from left to right, so this way is only natural.</p>

Example:<br>
<table border=1>
<tr><td>Intel Syntax<br>
<pre>
instr	dest,source
mov	eax,[ecx]
</pre></td><td>AT&T Syntax<br>
<pre>
instr	source,dest
movl	(%ecx),%eax
</pre></td></tr>
</table>

<a name=Memory><h3>Memory Operands.</h3></a>

<p> Memory operands, as seen above, are different also. In Intel syntax
the base register is enclosed in '[' and ']' whereas in AT&T syntax it is
enclosed in '(' and ')'. </p>

Example:<br>
<table border=1>
<tr><td>Intel Syntax<br>
<pre>
mov	eax,[ebx]
mov	eax,[ebx+3]
</pre></td><td>AT&T Syntax<br>
<pre>
movl	(%ebx),%eax
movl	3(%ebx),%eax
</pre></td></tr>
</table>

<p> The AT&T form for instructions involving complex operations is very
obscure compared to Intel syntax. The Intel syntax form of these is
segreg:[base+index*scale+disp]. The AT&T syntax form is
%segreg:disp(base,index,scale). </p>
<p> Index/scale/disp/segreg are all optional and can simply be left out.
Scale, if not specified when index is specified, defaults to 1. Segreg
depends on the instruction and on whether the app is being run in real
mode or pmode: in real mode it depends on the instruction, whereas in pmode
it is unnecessary. Immediate data should not be '$' prefixed in AT&T syntax
when it is used for scale/disp.</p>

Example:<br>
<table border=1>
<tr><td>Intel Syntax<br>
<pre>
instr	foo,segreg:[base+index*scale+disp]
mov	eax,[ebx+20h]
add	eax,[ebx+ecx*2h]
lea	eax,[ebx+ecx]
sub	eax,[ebx+ecx*4h-20h]
</pre></td><td>AT&T Syntax<br>
<pre>
instr	%segreg:disp(base,index,scale),foo
movl	0x20(%ebx),%eax
addl	(%ebx,%ecx,0x2),%eax
leal	(%ebx,%ecx),%eax
subl	-0x20(%ebx,%ecx,0x4),%eax
</pre></td></tr>
</table>

<p> As you can see, AT&T is very obscure. [base+index*scale+disp] makes
more sense at a glance than disp(base,index,scale).</p>

<a name=Suffixes><h3>Suffixes.</h3></a>

<p> As you may have noticed, the AT&T syntax mnemonics have a suffix. The
suffix gives the operand size: 'l' is for long, 'w' is for word, and 'b'
is for byte. Intel syntax has similar directives for use with memory
operands, i.e. byte ptr, word ptr, dword ptr, with "dword" of course
corresponding to "long". This is similar to type casting in C, but it
doesn't seem to be necessary when a register operand is present, since the
size of the register used is the assumed datatype.</p>

Example:<br>
<table border=1>
<tr><td>Intel Syntax<br>
<pre>
mov	al,bl
mov	ax,bx
mov	eax,ebx
mov	eax, dword ptr [ebx]
</pre></td><td>AT&T Syntax<br>
<pre>
movb	%bl,%al
movw	%bx,%ax
movl	%ebx,%eax
movl	(%ebx),%eax
</pre></td></tr>
</table>

**NOTE: ALL EXAMPLES FROM HERE WILL BE IN AT&T SYNTAX**<br>

<hr>

<a name=Syscalls><h2>Syscalls.</h2></a>

<p> This section will outline the use of Linux syscalls in assembly
language. Syscalls consist of all the functions in the second section of
the manual pages, located in /usr/man/man2. They are also listed in
/usr/include/sys/syscall.h. A great list is at
<a href=http://www.linuxassembly.org/syscall.html>http://www.linuxassembly.org/syscall.html</a>.
These functions can be executed via the Linux interrupt service: int
$0x80. </p>

<a name=Syscall5><h3>Syscalls with < 6 args.</h3></a>

<p> For all syscalls, the syscall number goes in %eax. For syscalls that
have fewer than six args, the args go in %ebx, %ecx, %edx, %esi, %edi in
order. The return value of the syscall is stored in %eax.</p>
<p> The syscall number can be found in /usr/include/sys/syscall.h. The
macros are defined as SYS_<syscall name>, i.e. SYS_exit, SYS_close, etc. </p>

Example:<br>
(Hello world program - it had to be done)

<p> According to the write(2) man page, write is declared as: ssize_t
write(int fd, const void *buf, size_t count); </p>
<p> Hence fd goes in %ebx, buf goes in %ecx, count goes in %edx and
SYS_write goes in %eax. This is followed by an int $0x80 which executes the
syscall. The return value of the syscall is stored in %eax.</p>

<pre>
$ cat write.s
.include "defines.h"
.data
hello:
	.string "hello world\n"

.text
.globl	main
main:
	movl	$SYS_write,%eax
	movl	$STDOUT,%ebx
	movl	$hello,%ecx
	movl	$12,%edx
	int	$0x80

	ret
$
</pre>

<p> The same process applies to syscalls which have fewer than five args.
Just leave the unused registers unchanged. Syscalls such as open or fcntl,
which have an optional extra arg, will know what to use. </p>

<a name=Syscall6><h3>Syscalls with > 5 args.</h3></a>

<p> Syscalls whose number of args is greater than five still expect the
syscall number to be in %eax, but the args are arranged in memory and the
pointer to the first arg is stored in %ebx.</p>
<p> If you are using the stack, args must be pushed onto it backwards, i.e.
from the last arg to the first arg. Then the stack pointer should be copied
to %ebx. Otherwise copy the args to an allocated area of memory and store
the address of the first arg in %ebx.</p>

Example: <br>
(mmap being the example syscall).

Using mmap() in C:<br>
<pre>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>

#define STDOUT	1

int main(void) {
	char file[]="mmap.s";
	char *mappedptr;
	int fd,filelen;

	fd=open(file, O_RDONLY);	/* open(2), since a file descriptor is needed */
	filelen=lseek(fd,0,SEEK_END);
	mappedptr=mmap(NULL,filelen,PROT_READ,MAP_SHARED,fd,0);
	write(STDOUT, mappedptr, filelen);
	munmap(mappedptr, filelen);
	close(fd);
	return 0;
}
</pre>

Arrangement of mmap() args in memory:
<table border=1>
<tr><td>%esp</td><td>%esp+4</td><td>%esp+8</td><td>%esp+12</td>
<td>%esp+16</td><td>%esp+20</td></tr>
<tr><td>00000000</td><td>filelen</td><td>00000001</td>
<td>00000001</td><td>fd</td><td>00000000</td></tr>
</table>

ASM Equivalent:<br>
<pre>
$ cat mmap.s
.include "defines.h"

.data
file:
	.string "mmap.s"
fd:
	.long	0
filelen:
	.long	0
mappedptr:
	.long	0

.text
.globl main
main:
	push	%ebp
	movl	%esp,%ebp
	subl	$24,%esp

//	open($file, $O_RDONLY);

	movl	$fd,%ebx	// save fd
	movl	%eax,(%ebx)

//	lseek($fd,0,$SEEK_END);

	movl	$filelen,%ebx	// save file length
	movl	%eax,(%ebx)

	xorl	%edx,%edx

//	mmap(NULL,$filelen,PROT_READ,MAP_SHARED,$fd,0);
	movl	%edx,(%esp)
	movl	%eax,4(%esp)	// file length still in %eax
	movl	$PROT_READ,8(%esp)
	movl	$MAP_SHARED,12(%esp)
	movl	$fd,%ebx	// load file descriptor
	movl	(%ebx),%eax
	movl	%eax,16(%esp)
	movl	%edx,20(%esp)
	movl	$SYS_mmap,%eax
	movl	%esp,%ebx
	int	$0x80

	movl	$mappedptr,%ebx	// save ptr
	movl	%eax,(%ebx)

//	write($stdout, $mappedptr, $filelen);
//	munmap($mappedptr, $filelen);
//	close($fd);

	movl	%ebp,%esp
	popl	%ebp

	ret
$
</pre>

**NOTE: The above source listing differs from the example source code
found at the end of the article. The code listed above does not show the other
syscalls, as they are not the focus of this section. The source above also
only opens mmap.s, whereas the example source reads the command line
arguments. The mmap example also uses lseek to get the filesize.**

<a name=Sockets><h3>Socket Syscalls.</h3></a>

<p> Socket syscalls make use of only one syscall number: SYS_socketcall,
which goes in %eax. The socket functions are identified via subfunction
numbers, which are located in /usr/include/linux/net.h and are stored in
%ebx. A pointer to the syscall args is stored in %ecx. Socket syscalls are
also executed with int $0x80.</p>

<pre>
$ cat socket.s
.include "defines.h"

.globl	_start
_start:
	pushl	%ebp
	movl	%esp,%ebp
	sub	$12,%esp

//	socket(AF_INET,SOCK_STREAM,IPPROTO_TCP);
	movl	$AF_INET,(%esp)
	movl	$SOCK_STREAM,4(%esp)
	movl	$IPPROTO_TCP,8(%esp)

	movl	$SYS_socketcall,%eax
	movl	$SYS_socketcall_socket,%ebx
	movl	%esp,%ecx
	int	$0x80

	movl	$SYS_exit,%eax
	xorl	%ebx,%ebx
	int	$0x80

	movl	%ebp,%esp
	popl	%ebp
	ret
$
</pre>

<hr>

<a name=Command><h2>Command Line Arguments.</h2></a>

<p> Command line arguments in Linux executables are arranged on the stack.
argc comes first, followed by an array of pointers (**argv) to the strings
on the command line, followed by a NULL pointer. Next comes an array of
pointers to the environment (**envp). These are very simply obtained in
asm, and this is demonstrated in the example code (args.s).</p>

<hr>

<a name=InlineASM><h2>GCC Inline ASM.</h2></a>

<p> This section on GCC inline asm will only cover x86 applications.
Operand constraints will differ on other processors. A link to the full
listing of operand constraints is at the <a href=#Links>end</a> of this
article.</p>
<p> Basic inline assembly in gcc is very straightforward. In its basic form
it looks like this:</p>

<pre>
__asm__("movl %esp,%eax"); // look familiar ?
</pre>
or
<pre>
__asm__(
	"movl $1,%eax\n\t"	// SYS_exit
	"xor %ebx,%ebx\n\t"
	"int $0x80"
);
</pre>

<p> It is possible to use it more effectively by specifying the data that
will be used as input and output for the asm, as well as which registers
will be modified. None of the input/output/modify fields is compulsory. It
is of the format:</p>

<pre>
__asm__("<asm routine>" : output : input : modify);
</pre>

<p> The output and input fields must consist of an operand constraint
string followed by a C expression enclosed in parentheses. The output
operand constraints must be preceded by an '=' which indicates that it is
an output. There may be multiple outputs, inputs, and modified registers.
Each "entry" should be separated by commas (',') and, with the gcc of this
era, there should be no more than 10 entries total (later gcc releases
raise this limit). The operand constraint string is normally one of the
abbreviations below.</p>

<table border=1>
<tr><td colspan=2>Abbrev Table</td></tr>
<tr><td>Abbrev</td><td>Register</td></tr>
<tr><td>a</td><td>%eax/%ax/%al</td></tr>
<tr><td>b</td><td>%ebx/%bx/%bl</td></tr>
<tr><td>c</td><td>%ecx/%cx/%cl</td></tr>
<tr><td>d</td><td>%edx/%dx/%dl</td></tr>
<tr><td>S</td><td>%esi/%si</td></tr>
<tr><td>D</td><td>%edi/%di</td></tr>
<tr><td>m</td><td>memory</td></tr>
</table>

Example:<br>
<pre>
__asm__("test %%eax,%%eax" : /* no output */ : "a"(foo));
</pre>

<p> Older gcc releases also appear to let the full register name through as
a constraint, as in "eax"(foo). gcc reads a constraint string as a series of
one-letter codes, however, so "eax" is not guaranteed to mean %eax. Prefer
the abbreviation and keep full register names for the modify field.</p>

<p> You can also use the keyword __volatile__ after __asm__: "You can
prevent an `asm' instruction from being deleted, moved significantly, or
combined, by writing the keyword `volatile' after the `asm'."</p>

(Quoted from the "Assembler Instructions with C Expression Operands" section
in the gcc info files.)

<pre>
$ cat inline1.c
#include <stdio.h>

int main(void) {
	int foo=10,bar=15;

	__asm__ __volatile__ ("addl %%ebx,%%eax"
		: "=a"(foo)		// output
		: "a"(foo), "b"(bar)	// input
		: "cc"			// modify (flags; %eax is already an output)
	);
	printf("foo+bar=%d\n", foo);
	return 0;
}
$
</pre>

<p>You may have noticed that registers are now prefixed with "%%" rather
than '%'. This is necessary when using the output/input/modify fields
because a single '%' now introduces a register alias (%0, %1, ...) that
refers to an entry in those fields. I will discuss these shortly.</p>
<p>A constraint such as "a" does not force the use of one particular
register name such as "eax", "ax" or "al"; it only selects the register
family, and the same goes for the other general purpose registers (as shown
in the Abbrev table). This seems useless when the actual code names
specific registers, and hence gcc provides you with register aliases. In
the gcc of this era there is a max of 10 (%0-%9), which is also the reason
why only 10 inputs/outputs are allowed.</p>

<pre>
$ cat inline2.c
int main(void) {
	long eax;	// left uninitialised: only the generated code matters
	short bx;
	char cl;

	__asm__("nop;nop;nop"); // to separate inline asm from the rest of
				// the code
	__asm__ __volatile__ (
		"test %0,%0\n\t"
		"test %1,%1\n\t"
		"test %2,%2"
		: /* no outputs */
		: "a"((long)eax), "b"((short)bx), "c"((char)cl)
	);
	__asm__("nop;nop;nop");
	return 0;
}
$ gcc -o inline2 inline2.c
$ gdb ./inline2
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnulibc1"...
(no debugging symbols found)...
(gdb) disassemble main
Dump of assembler code for function main:
... start: inline asm ...
0x8048427 <main+7>: nop
0x8048428 <main+8>: nop
0x8048429 <main+9>: nop
0x804842a <main+10>: mov 0xfffffffc(%ebp),%eax
0x804842d <main+13>: mov 0xfffffffa(%ebp),%bx
0x8048431 <main+17>: mov 0xfffffff9(%ebp),%cl
0x8048434 <main+20>: test %eax,%eax
0x8048436 <main+22>: test %bx,%bx
0x8048439 <main+25>: test %cl,%cl
0x804843b <main+27>: nop
0x804843c <main+28>: nop
0x804843d <main+29>: nop
... end: inline asm ...
End of assembler dump.
$
</pre>

<p>As you can see, the code that was generated from the inline asm loads
the values of the variables into the registers they were assigned to in
the input field and then proceeds to carry out the actual code. The
compiler auto detects operand size from the size of the variables and so
the corresponding registers are represented by the aliases %0, %1 and %2.
(Specifying the operand size in the mnemonic when using the register
aliases may cause errors while compiling). </p>
<p> The aliases may also be used in the operand constraints: an input whose
constraint is a number, such as "0", is placed in the same register as that
entry. This does not allow you to specify more than 10 entries in the
input/output fields. The only use for this I can think of is when you
specify the operand constraint as "q", which allows the compiler to choose
between the a, b, c, d registers. When this register is modified we will not
know which register has been chosen and consequently cannot name it in the
modify field. Current gcc releases do not accept an alias number in the
modify field either, so declare the register as an output and tie the input
to it with "<number>".</p>

Example:<br>
<pre>
$ cat inline3.c
#include <stdio.h>

int main(void) {
	long eax=1,ebx=2;

	__asm__ __volatile__ ("add %1,%0"
		: "=q"(ebx)		// %0: any of a,b,c,d
		: "a"(eax), "0"(ebx)	// %1: %eax, %2: same register as %0
		: "cc"
	);
	printf("ebx=%lx\n", ebx);
	return 0;
}
$
</pre>

<hr>

<a name=Compiling><h2>Compiling</h2></a>

<p>Compiling assembly language programs is much like compiling normal C
programs. If your program looks like Listing 1, then you would compile it
like you would a C app. If you use _start instead of main, like in Listing
2, you would compile the app slightly differently:</p>

<table border=1>
<tr><td valign=top>
<ul><li>Listing 1<br></ul>
<pre>
$ cat write.s
.include "defines.h"
.data
hw:
	.string "hello world\n"
.text
.globl main
main:
	movl	$SYS_write,%eax
	movl	$1,%ebx
	movl	$hw,%ecx
	movl	$12,%edx
	int	$0x80
	movl	$SYS_exit,%eax
	xorl	%ebx,%ebx
	int	$0x80
	ret
$ gcc -o write write.s
$ wc -c ./write
4790 ./write
$ strip ./write
$ wc -c ./write
2556 ./write
</pre></td><td valign=top>
<ul><li>Listing 2<br></ul>
<pre>
$ cat write.s
.include "defines.h"
.data
hw:
	.string "hello world\n"
.text
.globl _start
_start:
	movl	$SYS_write,%eax
	movl	$1,%ebx
	movl	$hw,%ecx
	movl	$12,%edx
	int	$0x80
	movl	$SYS_exit,%eax
	xorl	%ebx,%ebx
	int	$0x80

$ gcc -c write.s
$ ld -s -o write write.o
$ wc -c ./write
408 ./write
</pre></td></tr>
</table>

<p>The -s switch is optional; it just creates a stripped ELF executable,
which is smaller than a non-stripped one. This method (Listing 2) also
creates smaller executables, since the compiler isn't adding extra entry
and exit routines as would normally be the case. </p>

<hr>

<a name=Links><h2>Links.</h2></a>

<h3>Further reference.</h3>
<a href=http://www.linuxassembly.org>
http://www.linuxassembly.org</a><br>
<a href=http://www.gnu.org/manual/gas/>
GNU Assembler Manual</a><br>
<a href=http://gcc.gnu.org/onlinedocs/gcc_toc.html>
GNU C Compiler Manual</a><br>
<a href=http://www.gnu.org/manual/gdb-4.17/gdb.html>
GNU Debugger Manual</a><br>
<a href=http://gcc.gnu.org/onlinedocs/gcc_16.html#SEC181>
Operand Constraint Reference</a><br>
<a href=http://www.gnu.org/manual/gas/html_chapter/as_16.html#SEC196>
AT&T Syntax Reference</a><br>

<a name=k0de><h3>Example Code</h3></a>

<table>
<tr><td><a href=linasm-src.html#args>args.s</a></td>
<td>Reads command line arguments passed to the prog</td></tr>
<tr><td><a href=linasm-src.html#daemon>daemon.s</a></td>
<td>Binds a shell to a port (backdoor style)</td></tr>
<tr><td><a href=linasm-src.html#mmap>mmap.s</a></td>
<td>Maps a file to memory, and dumps its contents</td></tr>
<tr><td><a href=linasm-src.html#socket>socket.s</a></td>
<td>Creates a socket</td></tr>
<tr><td><a href=linasm-src.html#write>write.s</a></td>
<td>Hello world !</td></tr>
<tr><td><a href=linasm-src.tgz>linasm-src.tgz</a></td>
<td>Makefile defines.h args.s daemon.s socket.s write.s</td></tr>
</table>

</body>
</html>