
Regular Expression Operators

   You can combine regular expressions with the following characters,
called "regular expression operators", or "metacharacters", to increase
the power and versatility of regular expressions.

   Here is a table of metacharacters.  All characters not listed in the
table stand for themselves.

`^'
     This matches the beginning of the string or the beginning of a line
     within the string.  For example:

          ^@chapter

     matches the `@chapter' at the beginning of a string, and can be
     used to identify chapter beginnings in Texinfo source files.

`$'
     This is similar to `^', but it matches only at the end of a string
     or the end of a line within the string.  For example:

          p$

     matches a record that ends with a `p'.
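
     The two anchors, `^' for the beginning and `$' for the end, can be
     tried from the shell.  This is a minimal sketch, assuming a POSIX
     shell with an `awk' on the path; the three input records are
     invented for illustration.

```shell
# Hypothetical input: one record per line.
input='@chapter Intro
see @chapter two
@chap'

# The ^ anchor ties the match to the start of the record.
starts=$(printf '%s\n' "$input" | awk '/^@chapter/ { print }')

# The $ anchor ties the match to the end of the record.
ends=$(printf '%s\n' "$input" | awk '/p$/ { print }')

echo "$starts"
echo "$ends"
```

     Only the first record begins with `@chapter', and only the last
     one ends with a `p'.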

`.'
     This matches any single character except a newline.  For example:

          .P

     matches any single character followed by a `P' in a string.  Using
     concatenation we can make regular expressions like `U.A', which
     matches any three-character sequence that begins with `U' and ends
     with `A'.
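
     The `U.A' pattern can be checked against a few made-up records.
     This sketch assumes a POSIX shell and `awk'; the input strings are
     hypothetical.

```shell
# The dot must consume exactly one character between U and A.
dot=$(printf '%s\n' 'USA' 'UA' 'U.A' 'UKSA' | awk '/U.A/ { print }')
echo "$dot"
```

     `UA' is rejected because nothing sits between the two letters, and
     `UKSA' is rejected because two characters do.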

`[...]'
     This is called a "character set".  It matches any one of the
     characters that are enclosed in the square brackets.  For example:

          [MVX]

     matches any one of the characters `M', `V', or `X' in a string.

     Ranges of characters are indicated by using a hyphen between the
     beginning and ending characters, and enclosing the whole thing in
     brackets.  For example:

          [0-9]

     matches any digit.

     To include the character `\', `]', `-' or `^' in a character set,
     put a `\' in front of it.  For example:

          [d\]]

     matches either `d', or `]'.

     This treatment of `\' is compatible with other `awk'
     implementations, and is also mandated by the POSIX Command Language
     and Utilities standard.  The regular expressions in `awk' are a
     superset of the POSIX specification for Extended Regular
     Expressions (EREs).  POSIX EREs are based on the regular
     expressions accepted by the traditional `egrep' utility.

     In `egrep' syntax, backslash is not syntactically special within
     square brackets.  This means that special tricks have to be used to
     represent the characters `]', `-' and `^' as members of a
     character set.

     In `egrep' syntax, to match `-', write it as `---', which is a
     range containing only `-'.  You may also give `-' as the first or
     last character in the set.  To match `^', put it anywhere except
     as the first character of a set.  To match a `]', make it the
     first character in the set.  For example:

          []d^]

     matches either `]', `d' or `^'.
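
     The character-set forms described above (a plain set, a range, and
     a set containing a backslash-escaped `]') can be exercised
     together.  This is a sketch under the assumption of a POSIX shell
     and an `awk' that follows the POSIX treatment of `\' inside
     brackets; the one-character records are invented.

```shell
# Hypothetical input: five single-character records.
input='M
7
]
d
q'

# A plain set: any one of M, V, or X.
set_mvx=$(printf '%s\n' "$input" | awk '/[MVX]/ { print }')

# A range: any digit.
digits=$(printf '%s\n' "$input" | awk '/[0-9]/ { print }')

# A set containing d and an escaped closing bracket.
escaped=$(printf '%s\n' "$input" | awk '/[d\]]/ { print }')

echo "$set_mvx"
echo "$digits"
echo "$escaped"
```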

`[^ ...]'
     This is a "complemented character set".  The first character after
     the `[' *must* be a `^'.  It matches any character *except* those
     in the square brackets (or newline).  For example:

          [^0-9]

     matches any character that is not a digit.
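
     Note that a record matches `[^0-9]' when it contains at least one
     non-digit, not when it is free of digits.  A small sketch, assuming
     a POSIX shell and `awk', with made-up records:

```shell
# The last record is empty, so it has no character to match.
nondigit=$(printf '%s\n' '2024' 'a1' '42b' '' | awk '/[^0-9]/ { print }')
echo "$nondigit"
```

     `2024' is skipped because every character in it is a digit; `a1'
     and `42b' are printed even though they contain digits.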

`|'
     This is the "alternation operator" and it is used to specify
     alternatives.  For example:

          ^P|[0-9]

     matches any string that matches either `^P' or `[0-9]'.  This
     means it matches any string that contains a digit or starts with
     `P'.

     The alternation applies to the largest possible regexps on either
     side.

`(...)'
     Parentheses are used for grouping in regular expressions as in
     arithmetic.  They can be used to concatenate regular expressions
     containing the alternation operator, `|'.
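
     The effect of grouping on alternation can be seen by comparing
     `^P|[0-9]' with `^(P|[0-9])'.  This is a sketch assuming a POSIX
     shell and `awk'; the records are hypothetical.

```shell
# Hypothetical input records.
input='Pat
x9
zip
9 lives'

# Without parentheses, ^ belongs only to the left alternative:
# starts with P, or contains a digit anywhere.
loose=$(printf '%s\n' "$input" | awk '/^P|[0-9]/ { print }')

# With parentheses, ^ applies to both alternatives:
# starts with P, or starts with a digit.
anchored=$(printf '%s\n' "$input" | awk '/^(P|[0-9])/ { print }')

echo "$loose"
echo "$anchored"
```

     `x9' matches only the ungrouped form, because its digit is not at
     the beginning of the record.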

`*'
     This symbol means that the preceding regular expression is to be
     repeated as many times as possible to find a match.  For example:

          ph*

     applies the `*' symbol to the preceding `h' and looks for matches
     to one `p' followed by any number of `h's.  This will also match
     just `p' if no `h's are present.

     The `*' repeats the *smallest* possible preceding expression.
     (Use parentheses if you wish to repeat a larger expression.)  It
     finds as many repetitions as possible.  For example:

          awk '/\(c[ad][ad]*r x\)/ { print }' sample

     prints every record in the input containing a string of the form
     `(car x)', `(cdr x)', `(cadr x)', and so on.

`+'
     This symbol is similar to `*', but the preceding expression must be
     matched at least once.  This means that:

          wh+y

     would match `why' and `whhy' but not `wy', whereas `wh*y' would
     match all three of these strings.  This is a simpler way of
     writing the last `*' example:

          awk '/\(c[ad]+r x\)/ { print }' sample
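
     To see that the `*' and `+' forms of this pattern select the same
     records, both can be run on the same input.  This sketch assumes a
     POSIX shell and `awk', and feeds made-up records through a pipe in
     place of the `sample' file.

```shell
# Hypothetical stand-in for the sample file.
input='(car x)
(cdr x)
(cadr x)
(cr x)
(cddar x)'

with_star=$(printf '%s\n' "$input" | awk '/\(c[ad][ad]*r x\)/ { print }')
with_plus=$(printf '%s\n' "$input" | awk '/\(c[ad]+r x\)/ { print }')

echo "$with_star"
echo "$with_plus"
```

     Both commands skip `(cr x)', which has no `a' or `d' between the
     `c' and the `r'.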

`?'
     This symbol is similar to `*', but the preceding expression can be
     matched once or not at all.  For example:

          fe?d

     will match `fed' and `fd', but nothing else.
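
     The three repetition operators `*', `+', and `?' can be compared
     side by side.  This is a sketch assuming a POSIX shell and `awk';
     the patterns are wrapped in `^' and `$' so that each test covers
     the whole record, and the records themselves are invented.

```shell
# Hypothetical input: w, zero to three h characters, then y.
words='wy
why
whhy
whhhy'

# Zero or more h.
star=$(printf '%s\n' "$words" | awk '/^wh*y$/ { print }')

# One or more h.
plus=$(printf '%s\n' "$words" | awk '/^wh+y$/ { print }')

# Zero or one h.
opt=$(printf '%s\n' "$words" | awk '/^wh?y$/ { print }')

echo "$star"
echo "$plus"
echo "$opt"
```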

`\'
     This is used to suppress the special meaning of a character when
     matching.  For example:

          \$

     matches the character `$'.

     The escape sequences used for string constants (*note Constant
     Expressions: Constants.) are valid in regular expressions as well;
     they are also introduced by a `\'.
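
     The difference between an escaped and an unescaped `$' shows up
     clearly on a small input.  This sketch assumes a POSIX shell and
     `awk'; the two records are made up.

```shell
# Escaped: matches only records containing a literal dollar sign.
dollar=$(printf '%s\n' 'cost $5' 'cost 5' | awk '/\$/ { print }')

# Unescaped: $ is the end-of-string anchor, so every record matches.
every=$(printf '%s\n' 'cost $5' 'cost 5' | awk '/$/ { print }')

echo "$dollar"
echo "$every"
```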

   In regular expressions, the `*', `+', and `?' operators have the
highest precedence, followed by concatenation, and finally by `|'.  As
in arithmetic, parentheses can change how operators are grouped.
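
   These precedence rules can be checked with a few contrasting
patterns.  This is a minimal sketch, assuming a POSIX shell and `awk';
the input records are hypothetical.

```shell
# Hypothetical input records.
input='ab
abbb
abab
xc'

# + binds tighter than concatenation: a followed by one or more b.
tight=$(printf '%s\n' "$input" | awk '/^ab+$/ { print }')

# Parentheses make + repeat the whole sequence ab.
grouped=$(printf '%s\n' "$input" | awk '/^(ab)+$/ { print }')

# | binds loosest: this reads as (^ab) or (c$).
loose=$(printf '%s\n' "$input" | awk '/^ab|c$/ { print }')

# Parentheses confine | so both anchors apply to each alternative.
strict=$(printf '%s\n' "$input" | awk '/^(ab|c)$/ { print }')

echo "$tight"
echo "$grouped"
echo "$loose"
echo "$strict"
```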