376 lines
15 KiB
Plaintext
376 lines
15 KiB
Plaintext
E L V I S
|
|
|
|
Elvis is a clone of vi/ex. It boasts about 97% of the vi commands and about
|
|
92% of the ex commands. It is generally quite fast. It can edit files that
|
|
are larger than a single process' data space. Elvis also has a few features
|
|
that the real vi lacks. Several related programs are included, too.
|
|
|
|
Elvis runs under BSD/SysV UNIX, MINIX-ST, and MINIX-PC.
|
|
Elvis should be fairly easy to port to any OS that has the termcap functions.
|
|
|
|
It cannot be compiled on MINIX using the standard ACK compiler because asld
|
|
runs out of memory while linking. If you must recompile it, you will have to
|
|
use some other C compiler and cross compiler on MS-DOS. This problem will
|
|
be remedied in MINIX 2.0 when a new ACK compiler becomes available.
|
|
|
|
|
|
------------------------------- COPYING -----------------------------
|
|
The copyright of Elvis is controlled by the author, Steve Kirkendall.
|
|
(That's me. Hello!)
|
|
|
|
This document describes the restrictions on how Elvis can be distributed.
|
|
The restrictions are, basically, that anybody can make & distribute copies,
|
|
and that nobody except me can say otherwise.
|
|
|
|
You can make any number of copies of Elvis, and distribute it in either
|
|
source code form or executable form, either alone or as part of some
|
|
other package, to anybody you wish, provided that the following conditions
|
|
are met:
|
|
|
|
* If you are distributing Elvis as part of a package, then you must
|
|
either distribute the whole package for free, or you must be willing
|
|
to distribute Elvis separately for free to anybody who wants it.
|
|
|
|
You can charge up to $6 for disks and labor, and still say the
|
|
software is free.
|
|
|
|
* If you are distributing Elvis via a BBS or a network, then you
|
|
may charge for connection time & subscription fees, but there must
|
|
not be any surcharge for downloading Elvis.
|
|
|
|
* This document must be included with each copy that you distribute.
|
|
|
|
* The name "Elvis" cannot be changed -- not even in advertisements.
|
|
In particular, I don't want anybody claiming that Elvis is the
|
|
real "vi" from BSD. It's okay to call it "Elvis - a clone of vi",
|
|
though.
|
|
|
|
* This agreement cannot be modified without my permission.
|
|
|
|
My address is: Steve Kirkendall
|
|
16820 SW Tallac Way
|
|
Beaverton, OR 97006
|
|
|
|
My phone# is: (503) 642-9905
|
|
|
|
Email: kirkenda@cs.pdx.edu
|
|
...uunet!tektronix!psueea!eecs!kirkenda
|
|
|
|
|
|
------------------------------- Termcap -----------------------------
|
|
REQUIRED NUMERIC CAPABILITIES
|
|
:co#: number of columns on the screen (characters per line)
|
|
:li#: number of lines on the screen
|
|
|
|
REQUIRED STRING CAPABILITIES
|
|
:ce=: clear to end-of-line
|
|
:cl=: home the cursor & clear the screen
|
|
:cm=: move the cursor to a given row/column
|
|
:up=: move the cursor up one line
|
|
|
|
BOOLEAN CAPABILITIES
|
|
:am: auto margins - wrap when a char is written to the last column?
|
|
:pt: physical tabs?
|
|
|
|
OPTIONAL STRING CAPABILITIES
|
|
:al=: insert a blank row on the screen
|
|
:dl=: delete a row from the screen
|
|
:cd=: clear to end of display
|
|
:ei=: end insert mode
|
|
:ic=: insert a blank character
|
|
:im=: start insert mode
|
|
:dc=: delete a character
|
|
:sr=: scroll reverse (insert a row at the top of the screen)
|
|
:vb=: visible bell
|
|
|
|
OPTIONAL STRINGS RECEIVED FROM THE KEYBOARD
|
|
:kd=: sequence sent by the <down arrow> key
|
|
:kl=: sequence sent by the <left arrow> key
|
|
:kr=: sequence sent by the <right arrow> key
|
|
:ku=: sequence sent by the <up arrow> key
|
|
:PU=: sequence sent by the <PgUp> key
|
|
:PD=: sequence sent by the <PgDn> key
|
|
:HM=: sequence sent by the <Home> key
|
|
:EN=: sequence sent by the <End> key
|
|
|
|
OPTIONAL CAPABILITIES THAT DESCRIBE CHARACTER ATTRIBUTES
|
|
:so=: :se=: start/end standout mode (We don't care about :sg#:)
|
|
:us=: :ue=: start/end underlined mode
|
|
:VB=: :Vb=: start/end boldface mode
|
|
:as=: :ae=: start/end alternate character set (italics)
|
|
:ug#: visible gap left by :us=:, :ue=:, :VB=:, or :Vb=:
|
|
|
|
|
|
------------------------------- Cflags -----------------------------
|
|
Elvis uses many preprocessor symbols to control compilation...
|
|
|
|
-DM_SYSV
|
|
If defined, then Elvis uses SysV ioctl() calls to control the tty;
|
|
normally it uses V7/BSD/MINIX ioctl() calls.
|
|
|
|
-DDATE=\'\"`date`\"\'
|
|
DATE should be defined to be a string constant. It is printed by the
|
|
:version command as the compilation date of the program.
|
|
|
|
It is only used in cmd1.c, and even there you may leave it undefined
|
|
without causing an urp.
|
|
|
|
The form shown above only works if you use "eval". See the Makefile.
|
|
|
|
-DTMPNAME=\"/tmp/vi%04x%04x\"
|
|
This allows you to use a different name for Elvis' temporary files.
|
|
The default value is defined near the top of vi.h, so you only need
|
|
to use this on the commandline if the default name is wrong.
|
|
|
|
It should contain two "%d" or "%x" formats, which are replaced by
|
|
the inode number and device major/minor number.
|
|
|
|
-DCUTNAME=\"/tmp/cut%04x%04x\"
|
|
This is similar to TMPNAME, but is used to generate names for old
|
|
temp files which are being kept around because they are refered to
|
|
by cut buffers.
|
|
|
|
It should contain two "%d" or "%x" formats, which are replaced by the
|
|
Elvis' getpid() and file-descriptor numbers.
|
|
|
|
-DCRUNCH
|
|
This option causes some large & often-used macros to be replaced by
|
|
equivelent functions. It reduces the size of the ".text" segment by
|
|
about 4K, and you don't sacrifice any features -- just a little speed.
|
|
|
|
-DSET_NOCHARATTR
|
|
Permanently disables the charattr option. This reduces the size of
|
|
your ".text" segment by about 850 bytes.
|
|
|
|
-DNO_RECYCLE
|
|
Normally, Elvis will recycle space from the tmp file which contains
|
|
totally obsolete text. This flag disables this recycling. Without
|
|
recycling, the ".text" segment is about 1K smaller that it would
|
|
otherwise be, but the tmp file grows much faster. If you have a lot
|
|
of free space on your harddisk, but Elvis is too bulky to run with
|
|
recycling, then try it without recycling.
|
|
|
|
-DDEBUG
|
|
This adds the ":debug" and ":validate" commands, and also adds many
|
|
internal consistency checks. It increases the size of the ".text"
|
|
segment by about 5K.
|
|
|
|
|
|
------------------------------- Mods -----------------------------
|
|
A few ideas for modifications...
|
|
|
|
MODE INDICATORS
|
|
Elvis always reads keystrokes via the getkey() function. This function
|
|
is called with an argument which describes the context in which the
|
|
character will be processed:
|
|
|
|
WHEN_EX - called from the vgets() function to read
|
|
a single line of text. Either EX command
|
|
mode, EX text entry mode, or VI while reading
|
|
a search string.
|
|
WHEN_VICMD - VI mode, getting a command character.
|
|
WHEN_VIINP - VI's input mode.
|
|
WHEN_VIREP - VI's replace mode (the R command).
|
|
0 - misc times, e.g. "HIT A KEY TO CONTINUE"
|
|
|
|
So, the getkey() function would be a good place to add some kind of
|
|
mode indicator. Like, you could change the shape of the cursor for
|
|
input mode vs. VI command mode.
|
|
|
|
ARROW KEYS IN INPUT MODE
|
|
The arrow keys are not normally mapped during input mode. It might
|
|
be fun, though, to map them to ESC + [hjkl] + a. This way, if you
|
|
hit an arrow key while in input mode, elvis would take you out of input
|
|
mode momentarily, move the cursor, and drop you back into input mode.
|
|
|
|
Neat, huh?
|
|
|
|
Something similar could be done with replace mode.
|
|
|
|
WRAP LONG LINES (INSTEAD OF SCROLLING SIDEWAYS)
|
|
This would mostly require changes to redraw(), mark2phys(), and
|
|
drawtext(). All of these are in the file "redraw.c".
|
|
|
|
ADD MORE SUPPORT FOR NON-ASCII CHARACTER SETS
|
|
Elvis displays 8-bit character sets just fine, but is a bit weak in
|
|
the input and search departments.
|
|
|
|
For input, something similar to :map would be nice. Actually, :abbr
|
|
is a little closer. How about ":digraph" to map a specified pair of
|
|
ASCII characters into a single non-ASCII character?
|
|
|
|
REWRITE THE REGULAR EXPRESSION PARSER AND THE SEARCHING CODE
|
|
The current doesn't allow you to search for non-ASCI characters.
|
|
It could probably be made smaller & faster.
|
|
|
|
Suggestions are welcome.
|
|
|
|
|
|
------------------------------- internal -----------------------------
|
|
INTERNAL TEXT REPRESENTATION
|
|
When elvis starts up, the file is copied into a temporary file. Small
|
|
amounts of extra space are inserted into the temporary file to insure
|
|
that no text lines cross block boundaries; this speeds up processing.
|
|
The "extra space" is filled with NUL charcters; the input file must
|
|
not contain any NULs, to avoid confusion.
|
|
|
|
The first block of the temporary file is an array of shorts which
|
|
describe the order of the blocks; i.e. header[1] is the block number
|
|
of the first block, and so on. This limits the temporary file to
|
|
512 active blocks, so the largest file you can edit is about 400K
|
|
bytes long. I hope that's enough!
|
|
|
|
When blocks are altered, they are rewritten to a *different* block
|
|
in the file, and the in-core version of the header block is updated
|
|
accordingly. The in-core header block will be copied to the temp
|
|
file immediately before the next change... or, to undo this change,
|
|
swap the old header (from the temp file) with the new (in-core)
|
|
header.
|
|
|
|
Elvis maintains another in-core array which contains the line-number
|
|
of the last line in every block. This allows you to go directly to a
|
|
line, given its line number.
|
|
|
|
IMPLEMENTATION OF EDITING
|
|
There are three basic operations which affect text:
|
|
|
|
* delete text - delete(from, to)
|
|
* add text - add(at, text)
|
|
* yank text - cut(from, to)
|
|
|
|
To yank text, all text between two text positions is copied into
|
|
a cut buffer. The original text is not changed. To copy the text
|
|
into a cut buffer, you need only remember which physical blocks that
|
|
contain the cut text, the offset into the first block of the start of
|
|
the cut, the offset into the last block of the end of the cut, and
|
|
what kind of cut it was. (Cuts may be either character cuts or line
|
|
cuts; the kind of a cut affects the way it is later "put".) This is
|
|
implemented in the function cut().
|
|
|
|
To delete text, you must modify the first and last blocks, and remove
|
|
any reference to the intervening blocks in the header's list. The
|
|
text to be deleted is specified by two marks. This is implemented in
|
|
the function delete();
|
|
|
|
To add text, you must specify the text to insert (as a NUL-terminated
|
|
string) and the place to insert it (as a mark). The block into which
|
|
the text is to be inserted may need to be split into as many as four
|
|
blocks, with new intervening blocks needed as well... or it could be
|
|
as simple as modifying the block. This is implemented in the function
|
|
add().
|
|
|
|
Other interesting functions are paste() (to copy text from a cut buffer
|
|
into the file), modify() (for an efficient way to implement a combined
|
|
delete/add sequence), and input() (to get text from the user & insert
|
|
it into the file).
|
|
|
|
When text is modified, an internal file-revision counter, called
|
|
"changes", is incremented. This counter is used to detect when
|
|
certain caches are out of date. (The "changes" counter is also
|
|
incremented when we switch to a different file, and also in one
|
|
or two similar situations -- all related to invalidating caches.)
|
|
|
|
MARKS AND THE CURSOR
|
|
Marks are places within the text. They are represented internally
|
|
as a long variable which is split into two bitfields: a line number
|
|
and a character index. Line numbers start with 1, and character
|
|
indexes start with 0.
|
|
|
|
Since line numbers start with 1, it is impossible for a set mark to
|
|
have a value of 0L. 0L is therefore used to represent unset marks.
|
|
|
|
When you do the "delete text" change, any marks that were part of
|
|
the deleted text are unset, and any marks that were set to points
|
|
after it are adjusted. Similarly, marks are adjusted after new text
|
|
is inserted.
|
|
|
|
The cursor is represented as a mark.
|
|
|
|
EX COMMAND INTERPRETATION
|
|
EX commands are parsed, and the command name is looked up in an array
|
|
of structures which also contain a pointer to the function that
|
|
implements the command, and a description of the arguments that the
|
|
command can take. If the command is recognized and its arguments
|
|
are legal, then the function is called.
|
|
|
|
Each function performs its task; this may cause the cursor to be moved
|
|
to a different line, or whatever.
|
|
|
|
SCREEN CONTROL
|
|
The screen is updated via a package which looks like the "curses"
|
|
library, but which is actually implemented in a simpler, faster way.
|
|
Most curses operations are implemented as macros which copy characters
|
|
into a large I/O buffer, which is then written with a single large
|
|
write() call as part of the refresh() operation.
|
|
|
|
The functions which modify text remember where text has been modified;
|
|
the screen redrawing function needs this information to help it reduce
|
|
the amount of text that is redrawn each time.
|
|
|
|
|
|
------------------------------- regexp -----------------------------
|
|
Regular Expressions
|
|
|
|
Syntax
|
|
|
|
The code for handling regular expressions is derived from Henry Spencer's
|
|
regexp package. However, I have modified the syntax to resemble that of
|
|
the real vi.
|
|
|
|
ELVIS' regexp package treats the following one- or two-character strings
|
|
(called meta-characters) in special ways:
|
|
|
|
\( \) Used to control grouping
|
|
^ Matches the beginning of a line
|
|
$ Matches the end of a line
|
|
\< Matches the beginning of a word
|
|
\> Matches the end of a word
|
|
. Matches any single character
|
|
[ ] Matches any single character inside the brackets
|
|
* The preceding may be repeated 0 or more times
|
|
+ The preceding may be repeated 1 or more times
|
|
? The preceding is optional
|
|
\| Separates two alternatives
|
|
|
|
Anything else is treated as a normal character which must match exactly.
|
|
The special strings may all be preceded by a backslash to force them to
|
|
be treated normally.
|
|
|
|
For example, "\(for\|back\)ward" will find "forward" or "backward", and
|
|
"\<text\>" will find "text" but not "context".
|
|
|
|
|
|
Options
|
|
|
|
ELVIS has two options which affect the way regular expressions are used.
|
|
These options may be examined or set via the :set command.
|
|
|
|
The first option is called "[no]magic". This is a boolean option, and it is
|
|
"magic" (TRUE) by default. While in magic mode, all of the meta-characters
|
|
behave as described above. In nomagic mode, only ^ and $ retain their
|
|
special meaning.
|
|
|
|
The second option is called "[no]ignorecase". This is a boolean option, and
|
|
it is "noignorecase" (FALSE) by default. While in ignorecase mode, the
|
|
searching mechanism will not distinguish between an uppercase letter and its
|
|
lowercase form. In noignorecase mode, uppercase and lowercase are treated
|
|
as being different.
|
|
|
|
Also, the "[no]wrapscan" option affects searches.
|
|
|
|
Substitutions
|
|
|
|
The :s command has at least two arguments: a regular expression, and a
|
|
substitution string. The text that matched the regular expression is
|
|
replaced by text which is derived from the substitution string.
|
|
|
|
Most characters in the substitution string are copied into the text literally
|
|
but a few have special meaning:
|
|
|
|
& Causes a copy of the original text to be inserted
|
|
\1 Inserts a copy of that portion of the original text which
|
|
matched the first set of \( \) parentheses.
|
|
\2 - \9 Does the same for the second (etc.) pair of \( \).
|
|
|
|
These may be preceded by a backslash to force them to be treated normally.
|