How to use regular expressions


A regular expression (often referred to as a regex or a pattern) is an expression that describes a set of strings. Regular expressions are typically used to give a concise description of a set without having to list all of its elements. For example, the set containing the three strings "Handel", "Händel", and "Haendel" can be described by the pattern H(ä|ae?)ndel; equivalently, the pattern is said to match each of the three strings. In most formalisms, if any regular expression matches a particular set, then infinitely many such expressions do. Most formalisms provide the following operations to construct regular expressions.
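
As a minimal sketch of the example above, the following uses Python's built-in re module (an assumption; the article does not name a language, and other regex engines may differ in detail). The list of candidate names and the variable names are made up for illustration. The second pattern shows a different expression that describes the same set of strings.

```python
import re

# The pattern from the text: "ae?" means "a" followed by an optional "e".
pattern = re.compile(r"H(ä|ae?)ndel")

# A different expression that describes the same three strings.
equivalent = re.compile(r"H(a|ae|ä)ndel")

# Hypothetical candidates: three spellings that should match, two that should not.
names = ["Handel", "Händel", "Haendel", "Hndel", "Haeendel"]

# fullmatch() succeeds only if the whole string is matched by the pattern.
matches = [name for name in names if pattern.fullmatch(name)]
equivalent_matches = [name for name in names if equivalent.fullmatch(name)]

print(matches)             # ['Handel', 'Händel', 'Haendel']
print(equivalent_matches)  # ['Handel', 'Händel', 'Haendel']
```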

Description of the characters:

. : matches any single character. Within square bracket expressions, the dot character matches a literal dot. For example, a.c matches "abc", etc., but [a.c] matches only "a", ".", or "c".
[ ] : a bracket expression matches a single character that is contained within the brackets. For example, [abc] matches "a", "b", or "c". [a-z] specifies a range which matches any lowercase letter from "a" to "z". These forms can be mixed: [abcx-z] matches "a", "b", "c", "x", "y", or "z", as does [a-cx-z]. The - character is treated as a literal character if it is the first or the last character within the brackets, or if it is escaped with a backslash: [abc-], [-abc], or [a\-bc].
[^ ] : matches a single character that is not contained within the brackets. For example, [^abc] matches any character other than "a", "b", or "c". [^a-z] matches any single character that is not a lowercase letter from "a" to "z". As above, literal characters and ranges can be mixed.
^ : matches the starting position within the string.
$ : matches the ending position of the string.
\n : matches what the nth marked subexpression (a group enclosed in parentheses) matched, where n is a digit from 1 to 9.
* : matches the preceding element zero or more times. For example, ab*c matches "ac", "abc", "abbbc", etc. [xyz]* matches "", "x", "y", "z", "zx", "zyx", "xyzzy", and so on. (ab)* matches "", "ab", "abab", "ababab", and so on.
? : matches the preceding element zero or one time. For example, ba? matches "b" or "ba".
+ : matches the preceding element one or more times. For example, ba+ matches "ba", "baa", "baaa", and so on.
| : the choice operator matches either the expression before or the expression after the operator. For example, abc|def matches "abc" or "def".
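
The operators above can be tried out with a short script. This is a sketch that assumes Python's re module, whose syntax agrees with the descriptions above for these simple cases; the helper name matches_whole is made up for illustration. Each assertion restates one claim from the list.

```python
import re

def matches_whole(pattern: str, text: str) -> bool:
    """Return True if the pattern matches the entire text (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

# . matches any single character; inside brackets it is a literal dot.
assert matches_whole(r"a.c", "abc")
assert matches_whole(r"[a.c]", ".")
assert not matches_whole(r"[a.c]", "b")

# Bracket expressions: lists, ranges, and a literal hyphen.
assert matches_whole(r"[abcx-z]", "y")
assert matches_whole(r"[abc-]", "-")
assert matches_whole(r"[a\-bc]", "-")

# Negated bracket expression.
assert matches_whole(r"[^abc]", "d")
assert not matches_whole(r"[^abc]", "a")

# ^ and $ anchor a search to the start and end of the string.
assert re.search(r"^ab", "abc") is not None
assert re.search(r"^bc", "abc") is None
assert re.search(r"bc$", "abc") is not None

# \1 repeats whatever the first marked subexpression matched.
assert matches_whole(r"(a|b)\1", "aa")
assert not matches_whole(r"(a|b)\1", "ab")

# Quantifiers: * (zero or more), ? (zero or one), + (one or more).
assert matches_whole(r"ab*c", "ac") and matches_whole(r"ab*c", "abbbc")
assert matches_whole(r"[xyz]*", "") and matches_whole(r"[xyz]*", "xyzzy")
assert matches_whole(r"(ab)*", "abab") and not matches_whole(r"(ab)*", "aba")
assert matches_whole(r"ba?", "b") and not matches_whole(r"ba?", "baa")
assert matches_whole(r"ba+", "baa") and not matches_whole(r"ba+", "b")

# | chooses between the expression before and the expression after it.
assert matches_whole(r"abc|def", "abc") and matches_whole(r"abc|def", "def")

print("all operator checks passed")
```

The patterns are written as raw strings (r"...") so that Python passes backslashes such as \1 and \- to the regex engine unchanged.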


Examples:

.at : matches any three-character string ending with "at", including "hat", "cat", and "bat".
[hc]at : matches "hat" and "cat".
[^b]at : matches all strings matched by .at except "bat".
^[hc]at : matches "hat" and "cat", but only at the beginning of the string or line.
[hc]at$ : matches "hat" and "cat", but only at the end of the string or line.
[hc]+at : matches "hat", "cat", "hhat", "chat", "hcat", "ccchat", and so on, but not "at".
[hc]?at : matches "hat", "cat", and "at".
cat|dog : matches "cat" or "dog".
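
The examples above can be checked against a small word list. This is again a sketch that assumes Python's re module; the word list, the sample sentences, and the helper name whole_matches are made up for illustration. Note that in Python, ^ and $ refer to the whole string by default and to individual lines only when the re.MULTILINE flag is given.

```python
import re

# Hypothetical sample words to test each pattern against.
words = ["hat", "cat", "bat", "at", "chat", "that"]

def whole_matches(pattern: str) -> list:
    """Return the sample words that the pattern matches in full."""
    return [w for w in words if re.fullmatch(pattern, w)]

print(whole_matches(r".at"))      # ['hat', 'cat', 'bat']
print(whole_matches(r"[hc]at"))   # ['hat', 'cat']
print(whole_matches(r"[^b]at"))   # ['hat', 'cat']
print(whole_matches(r"[hc]+at"))  # ['hat', 'cat', 'chat']
print(whole_matches(r"[hc]?at"))  # ['hat', 'cat', 'at']

# Anchors restrict where in the string a match may occur.
print(bool(re.search(r"^[hc]at", "hat trick")))  # True
print(bool(re.search(r"^[hc]at", "top hat")))    # False
print(bool(re.search(r"[hc]at$", "top hat")))    # True

# With re.MULTILINE, ^ also matches at the start of each line.
print(re.findall(r"^[hc]at", "hat\ncat\nbat", flags=re.MULTILINE))  # ['hat', 'cat']

# The choice operator finds either alternative.
print(re.findall(r"cat|dog", "cat and dog and cow"))  # ['cat', 'dog']
```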


More information on this subject can be found online from the following links:

Wikipedia.com
Regular-Expressions.info



Article ID: 483
Created On: Tue, Jan 14, 2014 at 10:53 PM
Last Updated On: Tue, Jan 14, 2014 at 10:53 PM
Authored by: Administrator

Online URL: https://kb.quikbox.com/article.php?id=483