
CS 497C – Introduction to UNIX, Lecture 31: Filters Using Regular Expressions – grep and sed



Presentation Transcript


  1. CS 497C – Introduction to UNIX, Lecture 31: Filters Using Regular Expressions – grep and sed. Chin-Chih Chang, chang@cs.twsu.edu

  2. Substitution
     • sed's strongest feature is substitution, performed with its s (substitute) command.
     • It has the following format: [address]s/expression1/string2/flag
     • This is how you replace every | with a colon (the g flag replaces all occurrences on a line, not just the first):
       $ sed 's/|/:/g' emp.lst | head -2
     • To check whether any substitution was performed, compare the output with the original file using cmp; the resulting count is the number of bytes that differ:
       $ sed 's/|/:/g' emp.lst | cmp -l - emp.lst | wc -l
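The substitution and the cmp check above can be tried on a small file. This is a minimal sketch: the two records are made-up stand-ins for emp.lst (the course file is not reproduced here), and the variable names are illustrative only.

```shell
# Work in a scratch directory so no existing emp.lst is overwritten.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample records, five | delimiters per line.
printf '%s\n' '2233|a.k. shukla|g.m.|sales|12/12/52|6000' \
              '9876|jai sharma|director|production|12/03/50|7000' > emp.lst

# No address is given, so s applies to every line; g replaces every | on the line.
converted=$(sed 's/|/:/g' emp.lst)
printf '%s\n' "$converted"

# Without the g flag only the first | on each line is replaced.
first_only=$(sed 's/|/:/' emp.lst | head -1)

# cmp -l prints one line per differing byte, so wc -l counts the replaced characters.
# tr strips the padding that some wc implementations add.
changed=$(sed 's/|/:/g' emp.lst | cmp -l - emp.lst | wc -l | tr -d ' ')
echo "bytes changed: $changed"
```

A count of zero from the last pipeline would mean that sed made no substitution at all.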

  3. Substitution
     • You can perform multiple substitutions with one invocation of sed by pressing [Enter] at the end of each instruction and closing the quote after the last one (the > is the shell's secondary prompt, not part of the command):
       $ sed 's/<I>/<EM>/g
       > s/<B>/<STRONG>/g' form.html
     • You can compress multiple spaces before each | as below; here ^ serves as the delimiter in place of /:
       $ sed 's^ *|^|^g' emp.lst | head -2
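A short sketch of both ideas, using made-up input strings in place of form.html and emp.lst. It also shows the -e option as an alternative way to pass several instructions; the slide itself uses only the multi-line form.

```shell
# Hypothetical one-line HTML fragment.
html='<B>Note:</B> use <I>sed</I> with <I>care</I>'

# Two instructions in one quoted script, separated by a newline.
multi=$(printf '%s\n' "$html" | sed 's/<I>/<EM>/g
s/<B>/<STRONG>/g')

# The same two instructions given with separate -e options.
multi_e=$(printf '%s\n' "$html" | sed -e 's/<I>/<EM>/g' -e 's/<B>/<STRONG>/g')
printf '%s\n' "$multi"

# Compress the padding before each |. The ^ is the delimiter here,
# so the pattern is " *|" (zero or more spaces, then |) and the replacement is "|".
padded='2233|a.k. shukla    |g.m.   |sales'
squeezed=$(printf '%s\n' "$padded" | sed 's^ *|^|^g')
printf '%s\n' "$squeezed"
```

Note that these patterns only touch the opening tags; the closing </I> and </B> tags would need their own instructions.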

  4. Substitution
       $ sed '/director/s/director/member/' emp.lst
       $ sed '/director/s//member/' emp.lst
     • The second command shows that sed "remembers" the scanned pattern and makes it available as // (two forward slashes).
     • The //, an empty (or null) regular expression, is interpreted to mean that the search pattern is the same as the most recently used one, here the address /director/. This is called the remembered pattern.
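The equivalence of the two commands can be checked directly. A minimal sketch with two made-up records:

```shell
# Hypothetical records; only the first contains "director".
records='9876|jai sharma|director|production
2233|a.k. shukla|g.m.|sales'

# Pattern written out in both the address and the s command.
explicit=$(printf '%s\n' "$records" | sed '/director/s/director/member/')

# Empty pattern in s: sed reuses the last regular expression, the address /director/.
remembered=$(printf '%s\n' "$records" | sed '/director/s//member/')

printf '%s\n' "$remembered"
```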

  5. Substitution
     • When the pattern in the source string also occurs in the replacement string, you can use the special character & to represent it.
       $ sed 's/director/executive director/' emp.lst
       $ sed 's/director/executive &/' emp.lst
     • These two commands are equivalent. The &, known as the repeated pattern, expands to the entire matched source string.
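A quick sketch of the repeated pattern on a made-up record. The last command is an addition to the slide: it shows that a backslash before & makes it a literal ampersand in the replacement.

```shell
# Hypothetical record.
line='9876|jai sharma|director|production'

# Replacement typed out in full.
literal=$(printf '%s\n' "$line" | sed 's/director/executive director/')

# & stands for whatever the pattern matched, here "director".
with_amp=$(printf '%s\n' "$line" | sed 's/director/executive &/')
printf '%s\n' "$with_amp"

# An escaped \& is taken literally instead of expanding to the match.
escaped=$(printf '%s\n' 'R and D' | sed 's/and/\&/')
printf '%s\n' "$escaped"
```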

  6. Regular Expressions
     • The interval regular expression (IRE) uses an escaped pair of curly braces, \{ and \}, with a single number or a pair of numbers between them.
     • We can use this sequence to display files that have write permission set for the group:
       $ ls -l | grep "^.\{5\}w"
     • The regular expression ^.\{5\}w matches any five characters (.\{5\}) at the beginning (^) of the line, followed by the character w.

  7. Regular Expressions
     • The \{5\} signifies that the previous character (.) has to occur five times. The . (dot) matches any single character.
     • The IRE has three forms:
       • ch\{m\} – the character ch occurs exactly m times.
       • ch\{m,n\} – ch occurs between m and n times.
       • ch\{m,\} – ch occurs at least m times.
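The three interval forms, and the group-write test from the previous slide, can be sketched on fixed text. The listing below is a made-up imitation of ls -l output (user, group, and file names are invented), used so that the result does not depend on the files in the current directory.

```shell
# Hypothetical ls -l style lines; character 6 is the group write bit.
listing='-rw-r--r-- 1 kumar staff 120 Jan 10 09:00 notes.txt
-rw-rw-r-- 1 kumar staff 340 Jan 10 09:05 shared.txt
-rwxr-xrwx 1 kumar staff  80 Jan 10 09:10 open.sh'

# Five arbitrary characters from the start of the line, then w.
group_w=$(printf '%s\n' "$listing" | grep '^.\{5\}w')
printf '%s\n' "$group_w"

# The three IRE forms applied to the character a.
# The ^ and $ anchors matter: without them a\{2\} would also match inside "aaab".
words='ab
aab
aaab
aaaab'
exactly_two=$(printf '%s\n' "$words" | grep '^a\{2\}b$')
two_to_three=$(printf '%s\n' "$words" | grep -c '^a\{2,3\}b$')
at_least_three=$(printf '%s\n' "$words" | grep -c '^a\{3,\}b$')
echo "$exactly_two $two_to_three $at_least_three"
```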

  8. Regular Expressions
     • We can display the listing for those files that have the write bit set for either group or others:
       $ ls -l | grep "^.\{5,8\}w"
     • To locate the people born in 1945 in the sample database, use sed as follows (this relies on the two-digit year occupying columns 50 and 51 of every line):
       $ sed -n '/^.\{49\}45/p' emp.lst
     • The tagged regular expression (TRE) uses \( and \) to enclose a pattern.
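Both commands depend on fixed column positions, which a small sketch makes visible. Everything here is an assumption for illustration: the listing is invented, and the row helper and its field widths are hypothetical, chosen only so that the year of birth lands in columns 50 and 51 as the sed command expects.

```shell
# Hypothetical ls -l style lines; w can appear at position 6 (group) or 9 (others).
listing='-rw-r--r-- 1 kumar staff 120 Jan 10 09:00 notes.txt
-rw-rw-r-- 1 kumar staff 340 Jan 10 09:05 shared.txt
-rwxr-xrwx 1 kumar staff  80 Jan 10 09:10 open.sh'
group_or_other=$(printf '%s\n' "$listing" | grep -c '^.\{5,8\}w')
echo "writable by group or others: $group_or_other"

# Hypothetical helper: pads the first four fields to 4, 18, 9 and 8 characters,
# so 43 characters precede the dd/mm/yy date and 49 precede the year.
row() { printf '%-4s|%-18s|%-9s|%-8s|%s|%s\n' "$@"; }
emp=$(row 2233 'a.k. shukla' 'g.m.' sales 12/12/52 6000
      row 9876 'jai sharma' director prodn 12/03/45 7000
      row 5678 'sumit chakrobarty' 'd.g.m.' mktg 19/04/43 6000)

# -n suppresses default output; p prints only lines with 45 in columns 50-51.
born_45=$(printf '%s\n' "$emp" | sed -n '/^.\{49\}45/p')
printf '%s\n' "$born_45"
```

Anchoring on the column keeps the 45 from matching elsewhere in the line, for example inside a salary figure.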

  9. Regular Expressions
     • Suppose you want to replace the words John Wayne with Wayne, John. The sed substitution instruction will then look like this:
       $ echo "John Wayne" | sed 's/\(John\) \(Wayne\)/\2, \1/'
     • Because the TRE remembers a grouped pattern, you can look for repeated words like this:
       $ grep "\([a-z][a-z][a-z]*\) *\1" note
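The tagged expressions can be sketched as follows. The second substitution generalizes the slide's command to any two alphabetic words, which is an extension for illustration, and the two text lines stand in for the file note.

```shell
# \1 and \2 refer to the first and second \( \) groups.
swapped=$(echo 'John Wayne' | sed 's/\(John\) \(Wayne\)/\2, \1/')
echo "$swapped"

# Generalized form (not on the slide): swap any two alphabetic words.
general=$(echo 'Grace Hopper' | sed 's/\([A-Za-z]*\) \([A-Za-z]*\)/\2, \1/')
echo "$general"

# Hypothetical contents for the file "note"; only the first line repeats a word.
note='it was the the best of times
one two three'
dup_count=$(printf '%s\n' "$note" | grep -c '\([a-z][a-z][a-z]*\) *\1')
echo "lines with a repeated word: $dup_count"
```

The grep pattern is not anchored to word boundaries, so it also matches a doubled letter sequence inside a single word such as "banana".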

  10. Regular Expressions
     • These are pattern-matching options used by grep, sed, and perl (Page 441):
       • abc : match the character string "abc".
       • * : zero or more occurrences of the previous character.
       • . : match any character except newline.
       • .* : nothing or any number of characters.
       • a? : match zero or one instance of "a" (an extended regular expression; with grep, use grep -E).
       • a* : match zero or more repetitions of "a".

  11. Regular Expressions
       • [abcde] : match any character within the brackets.
       • [a-b] : match any character within the range a to b.
       • [^abcde] : match any character except those within the brackets.
       • [^a-b] : match any character except those in the range a to b.
       • ^ : match beginning of line, e.g., /^#/.
       • ^$ : lines containing nothing.

  12. Regular Expressions
       • $ : match end of line, e.g., /money.$/.
       • a\{2\} : match exactly two repetitions of "a".
       • a\{4,\} : match four or more repetitions of "a".
       • a\{2,4\} : match between two and four repetitions of "a".
       • \(exp\) : tag the expression exp for later referencing with \1, \2, etc.
       • a|b : match a or b (an extended regular expression; with grep, use grep -E).
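Several entries from the summary can be exercised with grep on a small word list. This is a sketch on invented data. It also illustrates a point the summary does not state outright: ? and | act as operators only in extended regular expressions (grep -E), while plain grep treats | as an ordinary character.

```shell
# Hypothetical word list; the fourth line is empty.
words='color
colour
colouur

cat
dog
a|b'

# u? : zero or one u (extended syntax, so -E is required).
optional_u=$(printf '%s\n' "$words" | grep -E -c '^colou?r$')

# cat|dog : alternation, also extended syntax.
cat_or_dog=$(printf '%s\n' "$words" | grep -E -c '^(cat|dog)$')

# ^$ : lines containing nothing.
empty=$(printf '%s\n' "$words" | grep -c '^$')

# [^a-c] : first character is anything outside the range a to c.
outside_range=$(printf '%s\n' "$words" | grep -c '^[^a-c]')

# Plain grep takes | literally; grep -E reads the same pattern as "a or b".
literal_bar=$(printf '%s\n' "$words" | grep -c 'a|b')
either=$(printf '%s\n' "$words" | grep -E -c 'a|b')

echo "$optional_u $cat_or_dog $empty $outside_range $literal_bar $either"
```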
