add directory Minix
107
Minix/1.7.5/MANUALS/CAT3/REGCMP.3
Normal file
@@ -0,0 +1,107 @@
NAME
regcmp, regex - compile and execute regular expression

SYNTAX
char *regcmp (string1 [, string2, ...], (char *)0)
char *string1, *string2, ...;

char *regex (re, subject[, ret0, ...])
char *re, *subject, *ret0, ...;

extern char *__loc1;

DESCRIPTION
Regcmp compiles a regular expression and returns a pointer
to the compiled form. Malloc(3C) is used to create space
for the compiled vector. It is the user's responsibility to
free that space once it is no longer needed. A NULL return
from regcmp indicates an incorrect argument. The regcmp(1)
command precompiles expressions and generally removes the
need to call this routine at execution time.

Regex executes a compiled pattern against the subject
string. Additional arguments are passed to receive matched
values back [see ( ... )$n below]. Regex returns NULL on
failure or a pointer to the next unmatched character on
success. A global character pointer, __loc1, points to where
the match began. Regcmp and regex were mostly borrowed from
the editor, ed(1); however, the syntax and semantics have
been changed slightly. The following are the valid symbols
and their associated meanings.

[]*.^     These symbols retain the meaning they have in ed(1).

$         Matches the end of the string; \n matches a new-line.

-         Within brackets the minus means through. For
          example, [a-z] is equivalent to [abcd...xyz]. The
          - can appear as itself only if used as the first
          or last character. For example, the character
          class expression []-] matches the characters
          ] and -.

+         A regular expression followed by + means one or
          more times. For example, [0-9]+ is equivalent to
          [0-9][0-9]*.

{m} {m,} {m,u}
          Integer values enclosed in {} indicate the number
          of times the preceding regular expression is to be
          applied. The value m is the minimum number and u
          is the maximum, which must be less than 256.

          If only m is present (e.g., {m}), it indicates the
          exact number of times the regular expression is to
          be applied. The form {m,} is analogous to
          {m,infinity}. The plus (+) and star (*)
          operations are equivalent to {1,} and {0,}
          respectively.
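
The equivalences above can be checked with a short sketch. It assumes POSIX regcomp/regexec from <regex.h> with REG_EXTENDED, whose + and {m,u} operators behave the same way for these cases; it does not call regcmp or regex. The helper name match_length is hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: length of the leftmost match of an extended
 * regular expression in subject, or -1 if the pattern does not
 * compile or does not match.  Uses POSIX <regex.h>, not regcmp/regex.
 */
long match_length(const char *pattern, const char *subject)
{
    regex_t re;
    regmatch_t m[1];
    long len = -1;

    assert(pattern != NULL && subject != NULL);

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;                      /* pattern rejected */
    if (regexec(&re, subject, 1, m, 0) == 0)
        len = (long)(m[0].rm_eo - m[0].rm_so);
    regfree(&re);
    return len;
}
```

Under these assumptions, [0-9]+, [0-9]{1,} and [0-9][0-9]* all report the same match length for a given subject, as do ^[0-9]* and ^[0-9]{0,}.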

( ... )$n The value of the enclosed regular expression is to
          be returned. The value will be stored in the
          (n+1)th argument following the subject argument.
          At most ten enclosed regular expressions are
          allowed. Regex makes its assignments
          unconditionally.

( ... )   Parentheses are used for grouping. An operator,
          e.g., *, +, {}, can work on a single character or
          a regular expression enclosed in parentheses. For
          example, (a*(cb+)*)$0.

All of the symbols defined above are special. Each must
therefore be escaped to be used as itself.

EXAMPLES
Example 1:
     char *cursor, *newcursor, *ptr;
     ...
     newcursor = regex((ptr = regcmp("^\n", (char *)0)), cursor);
     free(ptr);
This example will match a leading new-line in the subject
string pointed at by cursor.
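
The SYNTAX line ends the regcmp argument list with (char *)0. A variable-argument function has no other way to find the end of its list, and a bare 0 is passed as an int, which need not have the size or representation of a null pointer. The sketch below shows how a callee can walk such a list. The function join_args is hypothetical and only illustrates the terminator convention; it is not the regcmp implementation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: joins its string arguments into one malloc'ed
 * string.  The argument list must end with (char *)0.  The caller
 * frees the result.
 */
char *join_args(const char *first, ...)
{
    va_list ap;
    const char *s;
    size_t total = 0, pos = 0;
    char *out;

    /* First pass: add up the lengths until the null terminator. */
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *))
        total += strlen(s);
    va_end(ap);

    out = malloc(total + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: copy each piece into place. */
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *)) {
        size_t n = strlen(s);
        memcpy(out + pos, s, n);
        pos += n;
    }
    va_end(ap);

    assert(pos == total);
    out[pos] = '\0';
    return out;
}
```

A call such as join_args("^", "[0-9]", "+", (char *)0) stops reading at the null pointer; without it, the loop would read past the supplied arguments.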

Example 2:
     char ret0[9];
     char *newcursor, *name;
     ...
     name = regcmp("([A-Za-z][A-Za-z0-9_]{0,7})$0", (char *)0);
     newcursor = regex(name, "123Testing321", ret0);
This example will match through the string ``Testing3'' and
will return the address of the character after the last
matched character, 11 characters past the start of the
subject string. The string ``Testing3'' will be copied to
the character array ret0.
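
The offsets in Example 2 can be reproduced with a sketch that assumes POSIX regcomp/regexec rather than regcmp/regex. The ( ... )$0 wrapper is regcmp syntax, so the sketch drops it and copies the whole match instead. The function match_copy and its start parameter (which plays the role of __loc1) are hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

/*
 * Hypothetical helper: finds the leftmost match of an extended regular
 * expression in subject, copies the matched text into out (capacity
 * outsz) and returns a pointer to the first character after the match.
 * *start is set to where the match began.  Returns NULL on failure.
 */
const char *match_copy(const char *pattern, const char *subject,
                       char *out, size_t outsz, const char **start)
{
    regex_t re;
    regmatch_t m[1];
    size_t len;
    int rc;

    assert(out != NULL && outsz > 0);

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return NULL;
    rc = regexec(&re, subject, 1, m, 0);
    regfree(&re);
    if (rc != 0)
        return NULL;

    len = (size_t)(m[0].rm_eo - m[0].rm_so);
    if (len >= outsz)
        return NULL;                    /* no room for the match plus NUL */

    memcpy(out, subject + m[0].rm_so, len);
    out[len] = '\0';
    if (start != NULL)
        *start = subject + m[0].rm_so;
    return subject + m[0].rm_eo;
}
```

With the pattern [A-Za-z][A-Za-z0-9_]{0,7} and the subject "123Testing321", the match begins 3 characters into the subject and ends 11 characters in, so a 9-byte ret0 holds the 8 matched characters and the terminating NUL.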

Example 3:
     #include "file.i"
     char *string, *newcursor;
     ...
     newcursor = regex(name, string);
This example applies a precompiled regular expression in
file.i [see regcmp(1)] against string.

This routine is kept in /lib/<module>/libPW.a, where module
is either small or large.

SEE ALSO
malloc(3C).
ed(1), regcmp(1) in the UNIX System V User Reference Manual.

BUGS
The user program may run out of memory if regcmp is called
iteratively without freeing the vectors no longer required.
The following user-supplied replacement for malloc(3C)
reuses the same vector, saving time and space:

     /* user's program */
     ...
     char *
     malloc(n)
     unsigned n;
     {
          static char rebuf[512];
          return (n <= sizeof rebuf) ? rebuf : NULL;
     }
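
The behavior of this replacement can be seen in isolation with the sketch below. It gives the allocator a hypothetical name, rebuf_alloc, so that it does not shadow the real malloc, and it uses an ANSI prototype; REBUF_SIZE and rebuf_alloc_demo are also illustrative names.

```c
#include <assert.h>
#include <stddef.h>

#define REBUF_SIZE 512

/*
 * Hypothetical stand-in for the malloc replacement above: every
 * request that fits is answered with the same static buffer.
 */
char *rebuf_alloc(size_t n)
{
    static char rebuf[REBUF_SIZE];

    return (n <= sizeof rebuf) ? rebuf : NULL;
}

/* Usage sketch: successive requests share one vector. */
void rebuf_alloc_demo(void)
{
    char *a = rebuf_alloc(100);
    char *b = rebuf_alloc(200);

    assert(a != NULL && a == b);                  /* same storage reused */
    assert(rebuf_alloc(REBUF_SIZE + 1) == NULL);  /* too large: refused */
}
```

Because each call hands back the same storage, a newly compiled expression overwrites the previous one, so this approach only suits programs that keep one compiled pattern alive at a time. The buffer is static and must not be passed to the library free.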