these mamajammas are fun for all. why they don't teach this stuff in kindergarten
i'll never know ;^) regular expressions are used to search strings, find matches,
replace them, and fun stuff like that. they are available in most programming
languages including perl, php, javascript, jscript, vbscript, and java. for a
programmer, regular expressions are a powerful tool that will assist you in
many projects. here is my spiel on regular expressions, i hope that it helps
you out...whoever you are ;^)
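to make the "search, find matches, replace" bit concrete, here is a minimal javascript sketch. the sample sentence is made up for illustration, and other languages spell these calls differently.

```javascript
// hypothetical sample text, made up for this example
const text = 'bofo likes soco, bofoo likes cocoa';

// search: test() reports whether the pattern occurs anywhere in the string
const hasBofo = /bofo/.test(text);

// find matches: with the g flag, match() returns every match as an array
const allMatches = text.match(/bofo+/g);

// replace: swap every match for another string
const replaced = text.replace(/bofo+/g, 'bob');

console.log(hasBofo);    // true
console.log(allMatches); // [ 'bofo', 'bofoo' ]
console.log(replaced);   // bob likes soco, bob likes cocoa
```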
here is a list of regular expression characters and what they match. this list is still
being updated...
| character |
description |
| \ |
the escape character, this is used to match special characters.
'\\' matches '\', '\n' matches a newline character, '\\n' matches '\n' |
| ^ |
matches the beginning of the input string. if the multiline
property is set, it will also match the position following a '\n' or '\r' |
| $ |
matches the position at the end of an input string. if the
multiline property is set, it will also match the position preceding a '\n'
or '\r' |
| * |
matches the previous expression 0 or more times. 'bofo*' will
match 'bofo', 'bofoo', and 'bof'. '*' is equivalent to {0,} |
| + |
matches the previous expression 1 or more times. 'bofo+' will
match 'bofo' and 'bofoo', but not 'bof'. '+' is equivalent to {1,} |
| ? |
matches the previous expression zero or one time. 'bofo?'
will match 'bofo' and 'bof'; in 'bofoo' it matches only the 'bofo'. '?' is equivalent to {0,1}.
also, if this character follows any other quantifier (*,+,?,{n},{n,},{n,m})
the expression will become non-greedy. |
| {n} |
where n is a non-negative integer. matches the previous
subexpression exactly n times. 'bofo{2}' will match 'bofoo', but
not 'bofo' |
| {n,} |
where n is a non-negative integer. matches the
previous subexpression at least n times. |
| {n,m} |
where n and m are non-negative integers and
n <= m. matches the previous subexpression at least
n times, but no more than m times. in the string 'bofoooooo'
the expression 'bofo{1,3}' will match 'bofooo' |
| . |
matches any single character except '\n' |
| (pattern) |
matches (pattern) and captures the match. the captured
match can be retrieved from the resulting Matches collection using the SubMatches
collection of VBScript or the $0 - $9 properties in jscript |
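| (?:pattern) |
matches pattern but does not capture the match. useful for grouping
with '|'. 'bof(?:o|u)' will match 'bofo' and 'bofu' |
| (?=pattern) |
positive lookahead. matches at a position where pattern follows, without
consuming or capturing it. 'bofo(?=o)' matches the 'bofo' in 'bofoo', but
not the 'bofo' in 'bofou' |
| (?!pattern) |
negative lookahead. matches at a position where pattern does not follow.
'bofo(?!o)' matches the 'bofo' in 'bofou', but not the 'bofo' in 'bofoo' |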
| x|y |
'|' is the logical or, 'x|y' matches either x
or y. '(b|f)o' will match 'bo' and 'fo'. 'b|fo' will match 'b' and
'fo' |
| [xyz] |
defines a character set and matches any character in the
set. [bad] will match the 'b' in 'bofo' |
| [^xyz] |
defines a negative character set and matches anything except
the characters in the set. [^xyz] is the opposite of [xyz] |
| \b |
matches a word boundary, the position between a word and a
space. 'co\b' matches the 'co' in 'soco', but not the 'co' in 'computer' |
| \B |
matches a non-word boundary. 'co\B' matches the 'co' in 'computer',
but not the 'co' in 'soco'. this is the opposite of '\b' |
| \cx |
where x is a letter naming a control character. '\cM' matches Control-M,
or a carriage return. |
| \d |
matches a digit character, equivalent to [0-9], the opposite
of \D |
| \D |
matches a non-digit character, equivalent to [^0-9], the
opposite of \d |
| \f |
matches a form-feed. equivalent to \x0c and \cL |
| \n |
matches a newline. equivalent to \x0a and \cJ |
| \r |
matches a carriage return. equivalent to \x0d and \cM |
| \s |
matches any white space character, including a plain space. equivalent to [ \f\n\r\t\v],
the opposite of \S |
| \S |
matches any non-white space character. equivalent to [^ \f\n\r\t\v],
the opposite of \s |
| \t |
matches a tab. equivalent to \x09 and \cI |
| \v |
matches a vertical tab character. equivalent to \x0b and
\cK |
| \w |
matches any word character including an underscore. equivalent
to [A-Za-z0-9_], the opposite of \W |
| \W |
matches any non-word character, that is anything other than a letter, digit, or underscore. equivalent
to [^A-Za-z0-9_], the opposite of \w |
| \xn |
where n is a hexadecimal escape value representing
an ASCII code. hex escape values must be 2 digits long. '\x41' matches 'A',
'\x041' will be translated as '\x04' and '1' |
| \num |
where num is a positive integer, this will reference
captured matches. '(.)\1' matches 2 consecutive identical characters |
| \n |
either an octal escape or a backreference. if \n is preceded
by at least n captured subexpressions, n is a backreference.
otherwise it is an octal escape value if n is an octal digit (0-7) |
| \nm |
either an octal escape or a backreference. if \nm is
preceded by at least nm captured subexpressions, nm is a backreference.
if \nm is preceded by at least n captures, n is a backreference
followed by the literal m. if neither preceding match exists, \nm
matches octal escape value nm where n and m are octal digits
(0-7). |
| \nml |
matches octal escape value nml where n is an
octal digit (0-3) and m and l are octal digits (0-7). |
| \un |
matches n where n is a unicode character expressed
as four hexadecimal digits. \u00A9 matches the copyright symbol - © |
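here is a minimal javascript sketch that tries a handful of entries from the list above. the variable names and sample strings are made up for illustration, and other engines (perl, php, java, vbscript) may differ in small ways, so treat it as a starting point rather than the last word.

```javascript
// quantifiers: *, + and {n,m}
const star = /^bofo*$/.test('bof');                 // true, zero trailing 'o's is fine
const plus = /^bofo+$/.test('bof');                 // false, + needs at least one more 'o'
const bounded = 'bofoooooo'.match(/bofo{1,3}/)[0];  // 'bofooo'

// greedy vs non-greedy: a ? after a quantifier makes it take as little as possible
const greedy = 'bofoooooo'.match(/bofo+/)[0];       // 'bofoooooo'
const lazy = 'bofoooooo'.match(/bofo+?/)[0];        // 'bofo'

// capturing group plus backreference: (.)\1 finds a doubled character
const doubled = 'bofoo'.match(/(.)\1/);             // doubled[0] is 'oo', doubled[1] is 'o'

// alternation: grouping changes what the | applies to
const grouped = 'bo fo'.match(/(b|f)o/g);           // [ 'bo', 'fo' ]
const ungrouped = 'bo fo'.match(/b|fo/g);           // [ 'b', 'fo' ]

// word boundaries
const endOfWord = /co\b/.test('soco');              // true
const startOfWord = /co\b/.test('computer');        // false
const insideWord = /co\B/.test('computer');         // true

// character sets and classes
const fromSet = 'bofo'.match(/[bad]/)[0];           // 'b'
const digits = 'room 42'.match(/\d+/)[0];           // '42'
const underscoreIsWord = /\w/.test('_');            // true, so \W skips '_'

// hex and unicode escapes
const hexA = /\x41/.test('A');                      // true
const copyright = /\u00A9/.test('©');               // true

// multiline: with the m flag, ^ also matches right after a newline
const lineStarts = 'bofo\nsoco'.match(/^\w/gm);     // [ 'b', 's' ]

console.log({ star, plus, bounded, greedy, lazy, grouped, ungrouped, lineStarts });
```

the ^ and $ anchors in the first two checks force the whole string to match. without them, test() and match() are happy with a match anywhere inside the string.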