Unix Power Tools

32.2. Don't Confuse Regular Expressions with Wildcards

Before we even start talking about regular expressions, a word of caution for beginners: regular expressions can be confusing because they look a lot like the file-matching patterns ("wildcards") the shell uses. Both the shell and programs that use regular expressions have special meanings for the asterisk (*), question mark (?), parentheses (( )), square brackets ([ ]), and vertical bar (|, the "pipe").

Some of these characters even act the same way -- almost.
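As a rough illustration of "almost", the sketch below uses the same text, a*.c, once as a shell wildcard and once as a regular expression given to grep. The scratch directory and the filenames in it are made up for the example.

```shell
# Scratch directory with a few hypothetical files.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch a.c ab.c abc.txt b.c notes

# As a shell wildcard, a*.c means: "a", then any string of characters,
# then a literal ".c". Only a.c and ab.c fit.
glob_matches=$(echo a*.c)

# As a regular expression, a*.c means: zero or more "a", then any one
# character, then "c", anywhere in the line. Every name here except
# "notes" contains some character followed by "c".
regex_matches=$(printf '%s\n' * | grep 'a*.c')

echo "wildcard: $glob_matches"
echo "regex:"
echo "$regex_matches"
```

The asterisk is the main trap: to the shell it stands for any string on its own, while in a regular expression it only repeats whatever comes just before it.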

Just remember, the shells, find, and some others generally use filename-matching patterns and not regular expressions.[99]

[99]Recent versions of many programs, including find, now support regex via special command-line options. For example, find on my Linux server supports the -regex and -iregex options, for specifying filenames via a regular expression, case-sensitive and -insensitive, respectively. But the find command on my OS X laptop does not. -- SJC
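Here is a small sketch of the two styles side by side in find. It assumes a find that supports -regex (GNU find does; as the footnote says, not every version will), and the chap* filenames are invented for the example.

```shell
# Scratch directory with a few hypothetical chapter files.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch chap1 chap10 chap11

# -name takes a filename-matching pattern: * stands for any string,
# so chap1* matches chap1, chap10, and chap11.
by_name=$(find . -name 'chap1*' | sort)

# -regex takes a regular expression that has to match the whole path,
# including the leading "./". Here 1* means "zero or more 1s", so
# chap10 is left out.
by_regex=$(find . -regex '.*/chap1*' | sort)

echo "by -name:"
echo "$by_name"
echo "by -regex:"
echo "$by_regex"
```

Both patterns are quoted so that the shell hands them to find untouched.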

You also have to remember that shell wildcards are expanded before the shell passes the arguments to the program. To prevent this expansion, the special characters in a regular expression must be quoted (Section 27.12) when passed as an argument from the shell.

The command:

$ grep [A-Z]*.c chap[12]

could, for example, be interpreted by the shell as:

grep Array.c Bug.c Comp.c chap1 chap2

and so grep would then try to find the pattern "Array.c" in files Bug.c, Comp.c, chap1, and chap2.

The simplest solution in most cases is to surround the regular expression with single quotes ('). If you aren't sure what a command line will turn into, put the echo command in front of it to see how the shell will interpret the special characters.
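The following sketch recreates the situation above in a scratch directory and uses echo to preview both forms of the command. The file contents are invented; only the filenames come from the example.

```shell
# Scratch directory holding the files from the example above.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch Array.c Bug.c Comp.c
printf 'int total;\n' > chap1      # hypothetical contents
printf 'see Array.c\n' > chap2     # hypothetical contents

# Unquoted: the shell expands [A-Z]*.c and chap[12] before grep runs.
unquoted=$(echo grep [A-Z]*.c chap[12])

# Quoted: the pattern is passed through as is; only chap[12] is expanded.
quoted=$(echo grep '[A-Z]*.c' chap[12])

echo "$unquoted"
echo "$quoted"

# With the quotes in place, grep gets the regular expression and
# searches just chap1 and chap2.
matches=$(grep '[A-Z]*.c' chap[12])
echo "$matches"
```

The first echo shows the pattern replaced by Array.c, Bug.c, and Comp.c; the second shows it arriving intact.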

--BB and DG, TOR




Copyright © 2003 O'Reilly & Associates. All rights reserved.