NCNU Linux User Group 2012 <Regular Expression>

NCNU Linux User Group 2012 <Regular Expression> 王惟綸 2012/07/10

Outline
• What’s a Regular Expression?
• The Purpose
• What’s grep?
• Various Operators
• Extended Regular Expressions
• Exercises
• References

What’s a Regular Expression?
• A regular expression is a pattern that describes a set of strings.
• Examples:
  X[2-7] = {X2, X3, X4, X5, X6, X7}
  T[ae]ste? = {Taste, Tast, Teste, Test}
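To check that X[2-7] really describes the set listed above, a minimal sketch follows. The candidate strings X0 to X9 are made up for illustration, and -x asks grep to match the whole line rather than a substring.

```shell
# Hypothetical candidates X0 .. X9, one per line.
candidates=$(printf 'X%s\n' 0 1 2 3 4 5 6 7 8 9)

# -x: the pattern must match the entire line.
in_set=$(printf '%s\n' "$candidates" | grep -x 'X[2-7]')
printf '%s\n' "$in_set"
```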

The Purpose
• Regular expressions are used to process strings. With the aid of special characters, they make searching, replacing, and deleting text easy.
• T[ae]ste? = {Taste, Tast, Teste, Test}: all four strings, Taste, Tast, Teste, and Test, can be found by searching for the single pattern “T[ae]ste?”.
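A small sketch of that single search, assuming a made-up word list. The "?" operator needs extended syntax, so grep -E is used here; -x keeps near misses such as Tasty out.

```shell
# Hypothetical word list; only the first four should match.
words='Taste
Tast
Teste
Test
Toast
Tasty'

# -E enables "?" as an operator; -x requires the whole line to match.
matched=$(printf '%s\n' "$words" | grep -E -x 'T[ae]ste?')
printf '%s\n' "$matched"
```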

What’s grep?
• The name stands for “global regular expression print”.
• The grep command searches its input for lines matching the given pattern and writes each matching line to standard output.
  [-i]: ignore the distinction between upper and lower case
  [-v]: invert the match, printing only the lines that do not match
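The two options can be tried on a few sample lines. This is only a sketch; the input text is invented for the example.

```shell
# Hypothetical input, three lines.
lines='Linux User Group
linux kernel
BSD'

# -i: "linux" matches regardless of case.
ci=$(printf '%s\n' "$lines" | grep -i 'linux')

# -v: keep only the lines that do NOT match.
inv=$(printf '%s\n' "$lines" | grep -i -v 'linux')

printf 'with -i:\n%s\n' "$ci"
printf 'with -i -v:\n%s\n' "$inv"
```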

alias & unalias

Various Operators
• [ ] Matches any one of the characters listed inside the brackets.
• [ - ] Matches any one character within the given range of the collating order.
• [^ ] Matches any one character that is not in the list or range.
• ^ Matches the empty string at the beginning of a line.
• $ Matches the empty string at the end of a line.
• . Matches any single character.
• * Matches the preceding item zero or more times.

1. [ ] Matches any one of the characters listed inside the brackets. th[ei] = {the, thi}
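A quick sketch with invented input. Note that grep prints every line that contains a match, so "this" is printed because it contains "thi".

```shell
# Hypothetical input lines.
text='the
this
that
thus'

# "th" followed by either "e" or "i".
hits=$(printf '%s\n' "$text" | grep 'th[ei]')
printf '%s\n' "$hits"
```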

2. [ - ] Matches any one character within the given range. What a range covers depends on the collating order of the current locale:
LANG=C: 0 1 2 3 4 ... A B C D ... Z a b c d ... z
LANG=zh_TW.Big5: 0 1 2 3 4 ... a A b B c C d D ... z Z
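Because of this, a range such as [A-Z] may behave differently from one locale to another. A hedged sketch: forcing the C locale for a single command (LC_ALL takes precedence over LANG) gives the plain ASCII ordering, and the POSIX class [[:upper:]] is a locale-independent alternative. The sample characters are made up.

```shell
# Hypothetical input: one character per line.
sample='a
B
7
_'

# In the C locale, [A-Z] covers only the uppercase ASCII letters.
upper=$(printf '%s\n' "$sample" | LC_ALL=C grep '[A-Z]')
lower=$(printf '%s\n' "$sample" | LC_ALL=C grep '[a-z]')
digit=$(printf '%s\n' "$sample" | LC_ALL=C grep '[0-9]')

# A character class avoids relying on the collating order at all.
upper_class=$(printf '%s\n' "$sample" | LC_ALL=C grep '[[:upper:]]')

printf 'upper=%s lower=%s digit=%s class=%s\n' "$upper" "$lower" "$digit" "$upper_class"
```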


3. [^ ] Matches any one character that is not in the list or range.
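For instance, [^g]oo means "oo preceded by any character other than g". A small sketch with invented input:

```shell
# Hypothetical input lines.
text='goo
foo
zoo'

# "goo" is rejected: the character before "oo" is "g".
hits=$(printf '%s\n' "$text" | grep '[^g]oo')
printf '%s\n' "$hits"
```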

4. ^ Matches the empty string at the beginning of a line.
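A sketch of the anchor on made-up input: only the line that starts with "the" is printed, even though another line contains it.

```shell
# Hypothetical input lines.
text='the start
at the end
The Start'

# ^ anchors the match to the beginning of the line (case-sensitive).
hits=$(printf '%s\n' "$text" | grep '^the')
printf '%s\n' "$hits"
```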

5. $ Matches the empty string at the end of a line.
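A sketch on invented input that finds lines ending with a period. The dot is escaped as \. because an unescaped dot matches any character.

```shell
# Hypothetical input lines.
text='done.
done. not
end'

# \. is a literal dot; $ anchors it to the end of the line.
hits=$(printf '%s\n' "$text" | grep '\.$')
printf '%s\n' "$hits"
```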

6. . Matches any single character.
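For example, g..d requires exactly two characters, whatever they are, between "g" and "d". A minimal sketch with made-up words:

```shell
# Hypothetical input lines.
text='good
gd
glad
god'

# Each dot stands for exactly one character.
hits=$(printf '%s\n' "$text" | grep 'g..d')
printf '%s\n' "$hits"
```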

7. * The preceding item will be matched zero or more times. go* = {g, go, goo, gooo, …} goo* = {go, goo, gooo, …}
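The difference between go* and goo* shows up when whole lines are matched. This sketch uses invented input and -x; without -x, go* would match any line that merely contains a "g".

```shell
# Hypothetical input lines.
text='g
go
gooo
dog
hello'

# go*  : "g" followed by zero or more "o".
go_star=$(printf '%s\n' "$text" | grep -x 'go*')

# goo* : "go" followed by zero or more "o", so a bare "g" no longer matches.
goo_star=$(printf '%s\n' "$text" | grep -x 'goo*')

printf 'go*:\n%s\n' "$go_star"
printf 'goo*:\n%s\n' "$goo_star"
```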

Extended Regular Expressions
• In basic regular expressions the metacharacters "?", "+", "{", "|", "(", and ")" lose their special meaning; instead use the backslashed versions "\?", "\+", "\{", "\|", "\(", and "\)".
• Use grep -E (or egrep) instead of plain grep to enable extended regular expressions.
• + Matches the preceding item one or more times.
• ? Matches the preceding item zero or one time.
• | Matches either the preceding item or the following item.
• ( ) Groups items into a single unit.
• {N} Matches the preceding item exactly N times.
• {N,} Matches the preceding item N or more times.
• {N,M} Matches the preceding item at least N times, but not more than M times.
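The same pattern can be written in both syntaxes. The sketch below, on invented input, looks for "ab" repeated twice: once with backslashed grouping and interval operators in plain grep, once unescaped with grep -E. Both should select the same lines.

```shell
# Hypothetical input lines.
text='abab
ab
ababab'

# Basic syntax: group and interval are written with backslashes.
bre=$(printf '%s\n' "$text" | grep '\(ab\)\{2\}')

# Extended syntax: the same operators without backslashes.
ere=$(printf '%s\n' "$text" | grep -E '(ab){2}')

printf 'basic:\n%s\n' "$bre"
printf 'extended:\n%s\n' "$ere"
```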

1. + The preceding item will be matched one or more times. goo+ = {goo, gooo, goooo, …}
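A sketch of goo+ on made-up input, matching whole lines with -x:

```shell
# Hypothetical input lines.
text='go
goo
goooo'

# "go" followed by one or more "o"; plain "go" is too short.
hits=$(printf '%s\n' "$text" | grep -E -x 'goo+')
printf '%s\n' "$hits"
```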

2. ? The preceding item will be matched zero or one time. goog? = {goog, goo}
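A sketch of goog? on made-up input, again matching whole lines:

```shell
# Hypothetical input lines.
text='goo
goog
googg'

# The final "g" is optional, but at most one is allowed.
hits=$(printf '%s\n' "$text" | grep -E -x 'goog?')
printf '%s\n' "$hits"
```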

3. | Matches either the preceding item or the following item. goo|fav = {goo, fav}
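A sketch of alternation on invented input; the pattern is quoted so the shell does not treat "|" as a pipe.

```shell
# Hypothetical input lines.
text='goo
fav
bar'

# Lines containing either "goo" or "fav".
hits=$(printf '%s\n' "$text" | grep -E 'goo|fav')
printf '%s\n' "$hits"
```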

4. ( ) Groups items into a single unit. f(oo|ee)d = {food, feed}
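Grouping limits the alternation to the part inside the parentheses. A sketch with made-up words:

```shell
# Hypothetical input lines.
text='food
feed
fd
fooeed'

# "f", then either "oo" or "ee" (exactly one of them), then "d".
hits=$(printf '%s\n' "$text" | grep -E -x 'f(oo|ee)d')
printf '%s\n' "$hits"
```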

5. {N} Matches the preceding item exactly N times. With grep -E: go{2} = {goo}, go{5} = {gooooo}. With plain grep, write go\{2\} and go\{5\}.
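A sketch of an exact repetition count on invented input. -x matters here: without it, a longer run such as "gooo" would also be printed, because it contains "goo".

```shell
# Hypothetical input lines.
text='go
goo
gooo'

# "g" followed by exactly two "o", matching the whole line.
hits=$(printf '%s\n' "$text" | grep -E -x 'go{2}')
printf '%s\n' "$hits"
```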

6. {N,} Matches the preceding item N or more times.
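A sketch of an open-ended count on made-up input. There is no space after the comma inside the braces.

```shell
# Hypothetical input lines.
text='go
goo
goooo'

# "g" followed by two or more "o".
hits=$(printf '%s\n' "$text" | grep -E -x 'go{2,}')
printf '%s\n' "$hits"
```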

7. {N,M} Matches the preceding item at least N times, but not more than M times. With grep -E: go{2,5}g = {goog, gooog, goooog, gooooog}. With plain grep, write go\{2,5\}g.
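A sketch of a bounded count on invented input, with one line below the range (one "o") and one above it (six "o").

```shell
# Hypothetical input lines: 1, 2, 5 and 6 "o" between the two "g".
text='gog
goog
gooooog
goooooog'

# Between two and five "o" are accepted.
hits=$(printf '%s\n' "$text" | grep -E -x 'go{2,5}g')
printf '%s\n' "$hits"
```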

Exercises
• What does grep -n '^[^A-z]' mean?
• What does grep -n 'g.*g' mean?
• What does egrep '(d(a|u)|cc?)d' mean?
• How do you print only the non-empty lines?
• How do you search for “[LUG\2012]”?
• Find all files under /etc whose contents contain the symbol “*”.

References
• http://linux.vbird.org/linux_basic/0330regularex.php
• http://tldp.org/LDP/Bash-Beginners-Guide/html/chap_04.html
• http://en.wikipedia.org/wiki/Regular_expression
• http://www.regular-expressions.info/posix.html
