The parser accepts regular expressions of the form /REGEX/MODIFIERS where most of the modifiers get ignored except the ones in /ism which can change the semantics of the expression in question. The current parser do not support in-line modifiers, back-references, look-around expressions or any non-ASCII character sets. Apart from these exclusions, the parser supports the following constructs (presentation adopted from the Java documentation):
| Construct | Matches |
|---|---|
| Characters | |
| x | The character x |
| \\ | The backslash character |
| \0n | The character with octal value 0n (0 <= n <= 7) |
| \0nn | The character with octal value 0nn (0 <= n <= 7) |
| \0mnn | The character with octal value 0mnn (0 <= m <= 3, 0 <= n <= 7) |
| \xhh | The character with hexadecimal value 0xhh |
| \t | The tab character |
| \n | The newline (line feed) character |
| \r | The carriage-return character |
| \f | The form-feed character |
| \a | The alert (bell) character |
| \e | The escape character |
| \cx | The control character corresponding to x |
| Character classes | |
| [abc] | a, b, or c (simple class) |
| [^abc] | Any character except a, b, or c (negation) |
| [a-zA-Z] | a through z or A through Z, inclusive (range) |
| Predefined character classes | |
| . | Any character (except the line-terminator if /s is not set) |
| \d | A digit: [0-9] |
| \D | A non-digit: [^0-9] |
| \s | A whitespace character: [ \t\n\x0B\f\r] |
| \S | A non-whitespace character: [^\s] |
| \w | A word character: [a-zA-Z_0-9] |
| \W | A non-word character: [^\w] |
| Boundary matchers | |
| ^ | The beginning of a line |
| $ | The end of a line |
| \b | A word boundary |
| \B | A non-word boundary |
| \A | The beginning of the input |
| \z | The end of the input |
| Greedy quantifiers | |
| X? | X, once or not at all |
| X* | X, zero or more times |
| X+ | X, one or more times |
| X{n} | X, exactly n times |
| X{n,} | X, at least n times |
| X{n,m} | X, at least n but not more than m times |
| Reluctant quantifiers | |
| X?? | X, once or not at all |
| X*? | X, zero or more times |
| X+? | X, one or more times |
| X{n}? | X, exactly n times |
| X{n,}? | X, at least n times |
| X{n,m}? | X, at least n but not more than m times |
| Logical operators | |
| XY | X followed by Y |
| X|Y | Either X or Y |
| (X) | X, as a group |
| Quotation | |
| \ | Nothing, but quotes the following character |
| \Q | Nothing, but quotes all characters until \E |
| \E | Nothing, but ends quoting started by \Q |