How to use regular expressions
A regular expression, often called a pattern (or a regex), is an expression that describes a set of strings. Regular expressions are typically used to give a concise description of a set without having to list all of its elements. For example, the set containing the three strings "Handel", "Händel", and "Haendel" can be described by the pattern H(ä|ae?)ndel (alternatively, the pattern is said to match each of the three strings). In most formalisms, if any regular expression matches a particular set, then infinitely many such expressions do. Most formalisms provide the following operations to construct regular expressions.
Description of the characters:
. | : | matches any single character. Within square bracket expressions, the dot character matches a literal dot. For example, a.c matches "abc", etc., but [a.c] matches only "a", ".", or "c". |
[ ] | : | a bracket expression matches a single character that is contained within the brackets. For example, [abc] matches "a", "b", or "c". [a-z] specifies a range which matches any lowercase letter from "a" to "z". These forms can be mixed: [abcx-z] matches "a", "b", "c", "x", "y", or "z", as does [a-cx-z]. The - character is treated as a literal character if it is the last or the first character within the brackets, or if it is escaped with a backslash: [abc-], [-abc] or [a\-bc]. |
[^ ] | : | matches a single character that is not contained within the brackets. For example, [^abc] matches any character other than "a", "b", or "c". [^a-z] matches any single character that is not a lowercase letter from "a" to "z". As above, literal characters and ranges can be mixed. |
^ | : | matches the starting position within the string. |
$ | : | matches the ending position of the string. |
\n | : | matches what the nth marked subexpression matched, where n is a digit from 1 to 9. A marked subexpression is one enclosed in parentheses, as in (ab)*. |
* | : | matches the preceding element zero or more times. For example, ab*c matches "ac", "abc", "abbbc", etc. [xyz]* matches "", "x", "y", "z", "zx", "zyx", "xyzzy", and so on. (ab)* matches "", "ab", "abab", "ababab", and so on. |
? | : | matches the preceding element zero or one time. For example, ba? matches "b" or "ba". |
+ | : | matches the preceding element one or more times. For example, ba+ matches "ba", "baa", "baaa", and so on. |
| | : | the choice operator matches either the expression before or the expression after the operator. For example, abc|def matches "abc" or "def". |
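The descriptions above can be checked with a short script. This is a minimal sketch that assumes Python's built-in re module as the formalism; other regex flavors may differ in details such as escaping rules. The helper name matches and the CASES list are illustrative only. The helper uses re.fullmatch so that "matches" means the pattern describes the whole string, as in the table.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Return True when `pattern` matches the whole of `text`."""
    return re.fullmatch(pattern, text) is not None


# (pattern, string, expected result), taken from the descriptions above.
CASES = [
    (r"H(ä|ae?)ndel", "Handel", True),
    (r"H(ä|ae?)ndel", "Händel", True),
    (r"H(ä|ae?)ndel", "Haendel", True),
    (r"a.c", "abc", True),          # dot matches any single character
    (r"[a.c]", ".", True),          # inside brackets the dot is literal
    (r"[a.c]", "b", False),
    (r"[abcx-z]", "y", True),       # literal characters mixed with a range
    (r"[abc-]", "-", True),         # hyphen last: literal
    (r"[-abc]", "-", True),         # hyphen first: literal
    (r"[a\-bc]", "-", True),        # hyphen escaped: literal
    (r"[^abc]", "d", True),         # negated bracket expression
    (r"[^a-z]", "q", False),
    (r"ab*c", "ac", True),          # zero or more
    (r"ab*c", "abbbc", True),
    (r"(ab)*", "", True),
    (r"(ab)*", "ababab", True),
    (r"ba?", "b", True),            # zero or one
    (r"ba?", "baa", False),
    (r"ba+", "baaa", True),         # one or more
    (r"ba+", "b", False),
    (r"abc|def", "def", True),      # choice
    (r"(ab)\1", "abab", True),      # \1 repeats what group 1 matched
    (r"(ab)\1", "abba", False),
]

for pattern, text, expected in CASES:
    assert matches(pattern, text) == expected, (pattern, text)
```

The patterns are written as raw strings (r"...") so that Python does not interpret the backslashes before the regex engine sees them.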
Examples:
.at | : | matches any three-character string ending with "at", including "hat", "cat", and "bat". |
[hc]at | : | matches "hat" and "cat". |
[^b]at | : | matches all strings matched by .at except "bat". |
^[hc]at | : | matches "hat" and "cat", but only at the beginning of the string or line. |
[hc]at$ | : | matches "hat" and "cat", but only at the end of the string or line. |
[hc]+at | : | matches "hat", "cat", "hhat", "chat", "hcat", "ccchat", and so on, but not "at". |
[hc]?at | : | matches "hat", "cat", and "at". |
cat|dog | : | matches "cat" or "dog". |
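The examples above can be tried against a small word list. The following is a rough sketch, again assuming Python's re module; the names WORDS, full_matches, and found are made up for illustration. full_matches keeps the words a pattern matches in full, while found uses re.search to show how the ^ and $ anchors restrict where a match may occur inside a longer string.

```python
import re

# A small, made-up word list for trying out the example patterns.
WORDS = ["at", "hat", "cat", "bat", "chat", "hhat", "dog"]


def full_matches(pattern: str, words=WORDS) -> list:
    """Return the words that `pattern` matches in their entirety."""
    return [w for w in words if re.fullmatch(pattern, w)]


def found(pattern: str, text: str) -> bool:
    """Return True when `pattern` matches somewhere inside `text`."""
    return re.search(pattern, text) is not None


print(full_matches(r".at"))      # ['hat', 'cat', 'bat']
print(full_matches(r"[hc]at"))   # ['hat', 'cat']
print(full_matches(r"[^b]at"))   # ['hat', 'cat']
print(full_matches(r"[hc]+at"))  # ['hat', 'cat', 'chat', 'hhat']
print(full_matches(r"[hc]?at"))  # ['at', 'hat', 'cat']
print(full_matches(r"cat|dog"))  # ['cat', 'dog']

# Anchors: the same bracket pattern, limited to the start or end of the text.
print(found(r"^[hc]at", "hat trick"))  # True
print(found(r"^[hc]at", "that"))       # False
print(found(r"[hc]at$", "top hat"))    # True
print(found(r"[hc]at$", "hats"))       # False
```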
More information on this subject can be found online at the following links:
Wikipedia.com
Regular-Expressions.info