grep (Global REgular expression Print)

nida

Presentation Transcript


  1. grep (Global REgular expression Print)
     • Operation
       • Search a group of files
       • Find all lines that contain a particular regular expression pattern
       • Write the matching lines to standard output (redirect with > to save them to a file)
       • grep returns to the prompt with no extra output when it is done
     • Syntax: grep [-cilLnrsvwx] pattern [list of files]
     • Examples
       • Find information about the user harley: >grep harley /etc/passwd
       • Find all lines containing the string xxx in the files of the current directory: >grep xxx *
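A minimal sketch of the basic operation, assuming a POSIX shell with mktemp available. The sample files and their contents are hypothetical. It shows that matching lines arrive on standard output, where they can be redirected, and that the exit status tells a script whether anything matched.

```shell
# Build a tiny sample directory (hypothetical file names and contents).
dir=$(mktemp -d)
printf 'alpha xxx\nbeta\n' > "$dir/a.txt"
printf 'gamma\nxxx delta\n' > "$dir/b.txt"

# Matching lines go to standard output; redirect them to keep them.
# With several files, each line is prefixed with its file name.
grep xxx "$dir"/*.txt > "$dir/result.out"
cat "$dir/result.out"

# The exit status is 0 when something matched and 1 when nothing did.
status=0
grep -q zzz "$dir"/*.txt || status=$?
echo "exit status when nothing matches: $status"
```

The -q flag used in the last step only suppresses output, so the exit status alone carries the answer.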

  2. grep Flags
     • -c count the number of matching lines
     • -i ignore case when searching for matches
     • -l list the file names containing matches
     • -L list files that do not have a match
     • -n write the line number in front of each matching line
     • -r perform a recursive directory search
     • -s suppress error messages about missing or unreadable files
     • -v search for lines without the matching pattern
     • -w search only for complete words
     • -x only match lines that exactly match the pattern
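The flags above can be tried on a small file. This is only a sketch; the four sample lines are made up for illustration.

```shell
f=$(mktemp)
printf 'cat\nCat nap\nconcatenate\ndog\n' > "$f"

count=$(grep -c cat "$f")       # lines containing "cat": "cat" and "concatenate"
icount=$(grep -ci cat "$f")     # ignoring case also counts "Cat nap"
numbered=$(grep -n dog "$f")    # the matching line prefixed with its line number
words=$(grep -w cat "$f")       # "cat" as a complete word only
exact=$(grep -xc cat "$f")      # lines that consist of exactly "cat"
inverse=$(grep -vc cat "$f")    # lines that do not contain "cat"

echo "$count $icount $numbered $words $exact $inverse"
```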

  3. Regular Expressions
     Note: Many UNIX programs use these (vi, sed, more, grep, awk)
     • Industry standard way to specify patterns
       • In Java: string.matches("pattern");
       • In Java: string.replaceAll("pattern", string)
     • Meta-characters and operators
       ^      beginning of a line
       $      end of a line
       *      match 0 or more of the previous group
       +      match 1 or more of the previous group
       ?      match 0 or 1 of the previous group
       {n}    match n of the previous group
       {m,n}  match m to n of the previous group
       {n,}   match n or more of the previous group
       |      match either the group before or the group after
       .      match any character except for new line
       \      literally interpret the following meta-character or operator
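A short sketch of several of these operators, run through grep -E so that +, ?, { }, ( ) and | need no backslashes. The sample lines are hypothetical.

```shell
f=$(mktemp)
printf 'ab\naab\nb\nabab\na.b\n' > "$f"

m_plus=$(grep -Ec '^a+b$' "$f")       # one or more "a", then "b": ab, aab
m_opt=$(grep -Ec '^a?b$' "$f")        # zero or one "a", then "b": ab, b
m_rep=$(grep -E '^(ab){2}$' "$f")     # the group "ab" exactly twice: abab
m_dot=$(grep -Ec '^a.b$' "$f")        # "." is any character: aab, a.b
m_lit=$(grep -E '^a\.b$' "$f")        # "\." is a literal period: a.b
m_alt=$(grep -Ec '^(b|abab)$' "$f")   # either alternative: b, abab

echo "$m_plus $m_opt $m_rep $m_dot $m_lit $m_alt"
```

The ^ and $ anchors force each pattern to cover the whole line, which keeps the counts easy to check by eye.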

  4. Regular Expression Examples
     Note: To use ( ), { }, or + with grep, use the -E (extended) switch or precede each character with \

  5. More grep Examples
     Contents of a file called homework:
       Math: problems 12-10 to 12-33, due Monday
       BasketWeaving: make a 6-inch basket, DONE
       Psychology: essay on Animal Existentialism, due end of term
       Surfing:catch at least 10 grep commands
     >grep -v DONE homework              displays all but line 2
     >grep -c DONE homework              displays 1
     >grep -wi ".*a.*" homework          displays all lines
     >grep -w "m.*e" homework            displays line 2
     >grep -i "d.*e" homework            displays lines 1, 2 and 3
     >grep '\(Ma\|DO\).*' homework       displays lines 1 and 2
     Note: the last example escapes the parentheses and the vertical bar
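The homework examples can be replayed with the sketch below. The helper function matched is a hypothetical name introduced here; it reports which line numbers a grep call selects. The last call uses -E with unescaped ( ) and |, which is assumed to be equivalent to the escaped form on the slide.

```shell
f=$(mktemp)
cat > "$f" <<'EOF'
Math: problems 12-10 to 12-33, due Monday
BasketWeaving: make a 6-inch basket, DONE
Psychology: essay on Animal Existentialism, due end of term
Surfing:catch at least 10 grep commands
EOF

# Hypothetical helper: print the numbers of the selected lines, comma separated.
matched() { grep -n "$@" "$f" | cut -d: -f1 | paste -sd, -; }

not_done=$(matched -v DONE)        # every line except the one marked DONE
done_count=$(grep -c DONE "$f")    # number of lines containing DONE
word_m_e=$(matched -w 'm.*e')      # a whole-word match starting with m, ending with e
d_to_e=$(matched -i 'd.*e')        # d followed later by e, ignoring case
ma_or_do=$(matched -E '(Ma|DO)')   # lines containing Ma or DO

echo "$not_done / $done_count / $word_m_e / $d_to_e / $ma_or_do"
```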

  6. Sorting Data
     • Background
       • Each line in a file is a record
       • Each line is a series of fields separated by spaces and/or tabs
     • Commands
       >sort fileName                   sorts fileName, comparing lines from the 1st field onward
       >sort +5 fileName                sorts on the 6th field of each line
       >sort -n +4 fileName             sorts on the 5th field numerically
       >sort -t ':' +3r +2 fileName     sorts descending on the 4th field, and then ascending on the 3rd, with ':' as the delimiter
       >sort -t ':' fileName            sorts using ':' as the separator character
       >sort -u +1 fileName             sorts on the 2nd field and removes duplicates (output lines are unique)
       >sort -k 3,4                     in a pipe, sorts by the key from field 3 through field 4
       >sort +4n +7                     sorts numerically by the 5th field and alphabetically by the 8th
     Note: the +N form counts fields from zero and is obsolete; current versions of sort expect -k N, which counts from one (sort +5 becomes sort -k 6)
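A brief sketch of field-based sorting using the -k key option, which counts fields from one. The colon-separated records are invented for the example.

```shell
f=$(mktemp)
printf 'carol:30:x\nalice:4:y\nbob:30:z\nalice:4:y\n' > "$f"

# Sort on field 1 only.
by_name=$(sort -t ':' -k 1,1 "$f" | paste -sd' ' -)

# Sort numerically on field 2; without n, "30" would sort before "4".
smallest=$(sort -t ':' -k 2,2n "$f" | sed -n '1p')

# Field 2 descending (numeric), ties broken by field 1 ascending.
largest=$(sort -t ':' -k 2,2nr -k 1,1 "$f" | sed -n '1p')

# -u drops duplicate lines from the sorted output.
unique_lines=$(sort -u "$f" | wc -l)

echo "$by_name | $smallest | $largest | $unique_lines"
```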

  7. SED (Stream Editor)
     • SED is a filter
       • Input from stdin or a file
       • Output to stdout or a file
       • Modifies the input to produce the output
       • Non-interactive
     • Processing
       • Read from an input stream
       • Perform line-oriented commands
       • Write to an output stream
     • Syntax: >sed [-i] command | [-e command] … [file]

  8. Search and Replace
     Note: The same s/old/new/ syntax works in vi; more and awk accept the same /pattern/ search syntax
     • Search, change and redirect to newFile: >sed 's/cat/dog/g' file > newFile
     • Search, change, and edit the file in place: >sed -i 's/cat/dog/g' file
     • Specific range of lines: >sed '5,10s/cat/dog/g' file
     • Apply the search only to lines containing OK: >sed '/OK/s/cat/dog/g' names
     • Apply the search only to lines having 2 consecutive numeric characters: >sed '/[0-9]\{2\}/s/cat/dog/g' names
     • Delete a range of lines: >sed '5,10d' file
     Note: single quotes suppress the shell's interpretation of special characters
     Note: You must escape the characters +, { and } for them to act as operators
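The addressing forms above can be compared side by side. This sketch uses a four-line hypothetical input and smaller line numbers than the slide so the effect is visible. It avoids -i so the input file is never modified.

```shell
f=$(mktemp)
printf 'cat 1\ncat OK\ncat 22\ncat cat\n' > "$f"

all=$(sed 's/cat/dog/g' "$f")                 # every line, every occurrence
range=$(sed '2,3s/cat/dog/g' "$f")            # only lines 2 through 3
ok=$(sed '/OK/s/cat/dog/g' "$f")              # only lines containing OK
two=$(sed '/[0-9]\{2\}/s/cat/dog/g' "$f")     # only lines with two digits in a row
del=$(sed '2,3d' "$f")                        # drop lines 2 through 3

printf '%s\n---\n' "$all" "$range" "$ok" "$two" "$del"
```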

  9. Complex Commands
     >sed -i \
         -e 's/mon/Monday/g' \
         -e 's/tue/Tuesday/g' \
         -e 's/wed/Wednesday/g' \
         -e 's/thu/Thursday/g' \
         -e 's/fri/Friday/g' \
         -e 's/sat/Saturday/g' \
         -e 's/sun/Sunday/g' \
         calendar
     • The backslash is a continuation character
     • The -e specifies another command (expression)
     • The trailing /g means change every occurrence on each line, not just the first
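To see what the g flag changes, the sketch below runs two -e commands over one hypothetical input line, once without g and once with it.

```shell
line='mon mon tue'

# Without g, each s command changes only the first occurrence on the line.
first_only=$(printf '%s\n' "$line" | sed -e 's/mon/Monday/' -e 's/tue/Tuesday/')

# With g, every occurrence on the line is changed.
every=$(printf '%s\n' "$line" | sed -e 's/mon/Monday/g' -e 's/tue/Tuesday/g')

echo "$first_only"
echo "$every"
```

The commands run in the order given, each one working on the result of the previous one.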

  10. AWK
      • AWK (Aho, Weinberger, Kernighan)
        • Special purpose programming language
        • Interpreted
        • Useful for UNIX scripts
      • Purposes
        • Filter text files based on supplied patterns
        • Produce reports
        • Callable from "vi"
        • Create simple databases
        • Simple mathematical operations
        • Creating scripts
      • Not good for large complicated tasks
      • Other interpreted languages: perl, php

  11. General Syntax
      >awk '<awk program>'
        BEGIN {<initialization>}
        <pattern> {<action>}
        <pattern> {<action>}
        • • •
        <pattern> {<action>}
        END {<final actions>}
      • The single quotes cause the shell to ignore special characters
      • The various clauses are optional
      • Much of the syntax for <action> clauses is C and Java compatible
      • The patterns utilize regular expressions

  12. AWK General Operation
      • Each file consists of a series of records
      • Each record is a series of fields
      • Defaults
        • Record separator: new line character
        • Field separator: white space characters
      • Flow of Operation
        • Read the input file line by line
        • If a pattern matches the line, run its action
        • Otherwise skip the line
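The flow of operation can be sketched with a three-clause program. The input records are hypothetical; the variable names n and total are chosen only for this example.

```shell
out=$(printf 'gold 1\nsilver 2\ngold 3\n' | awk '
BEGIN  { n = 0 }                 # runs once, before any input is read
/gold/ { n++; total += $2 }      # runs for each record that matches the pattern
END    { print n, total }        # runs once, after the last record
')
echo "$out"
```

The silver record matches no pattern, so it is read and skipped without any action.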

  13. Some AWK Simple Examples
      • Print fields of records in a file: >awk '{print $5, $6, $7, $8}' fileName
      • Print lines with a search string: >awk '/gold/ {print}' fileName
      • Print the number of records: >awk 'END {print NR, "records"}' fileName
      • Print records using a condition: >awk '{if ($3 < 1980) print $3}' fileName
      • Using variables: >awk '/gold/ {sum += $2} END {print "value = " sum}' fileName

  14. Longer Program in a file
      # awk program summarizing a coin collection
      BEGIN { num_gold = 0; wt_gold = 0 }
      /gold/ { num_gold++; wt_gold += $2 }
      END {
          val_gold = 485 * wt_gold
          printf("\n Gold Pieces: %2d", num_gold)
          printf("\n Gold Weight: %5.2f", wt_gold)
          printf("\n Gold Value: %7.2f\n", val_gold)
      }
      To execute an AWK program: >awk -f <program fileName> <dataFile>
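Running a program from a file might look like the sketch below. The file names coins.awk and coins.txt, and the sample records, are assumptions made for illustration; the program is a shortened variant of the one on the slide.

```shell
dir=$(mktemp -d)

# The awk program lives in its own file (hypothetical name).
cat > "$dir/coins.awk" <<'EOF'
/gold/ { num_gold++; wt_gold += $2 }
END    { printf("%d gold pieces, weight %.2f\n", num_gold, wt_gold) }
EOF

# Sample data: metal, weight, year (made up for the example).
printf 'gold 1.5 1986\nsilver 10 1981\ngold 0.25 1987\n' > "$dir/coins.txt"

# -f names the program file; the data file follows it.
report=$(awk -f "$dir/coins.awk" "$dir/coins.txt")
echo "$report"
```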

  15. Invoking AWK
      >awk [-F<ch>] [<program>] [-f <programFile>] [<vars>] [- | <dataFile>]
      • <ch> is a field separator (default: space, tab)
      • <program> an AWK program
      • <programFile> a file containing an AWK program
      • <vars> a series of variables to initialize: >awk -f program f1=file2 f2=file1 > output
      • - means accept AWK input from STDIN
      • <dataFile> a file containing data to process
      Note: AWK is often invoked repeatedly in shell scripts
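A small sketch of these invocation options, assuming a passwd-style file with hypothetical entries. The variable name id is made up; it is set by an assignment placed before the data file on the command line.

```shell
f=$(mktemp)
printf 'root:x:0\nharley:x:502\n' > "$f"

# -F: sets the field separator; id=502 initializes a variable before the file is read.
user=$(awk -F: '$3 == id { print $1 }' id=502 "$f")

# A lone - tells awk to read its data from standard input.
first=$(awk -F: 'NR == 1 { print $1 }' - < "$f")

echo "$user $first"
```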

  16. Search Patterns
      • An exact string: /The/
      • A string starting a line: /^The/
      • A string ending a line: /The$/
      • A string ignoring case of first letter: /[Tt]he/
      • Decimal: /[0-9]*\.[0-9]*/
      • Alphanumeric: /[a-zA-Z0-9]*/
      • Choice between two strings: /(da|De).*/
      • Numeric: /[+-]?[0-9]+/
      • Any Boolean expression: $4>90 or $4>$5
      Note: Some utilities require \(, \) and \| if you use the ( ) | regular expression characters
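Several of these patterns are exercised in the sketch below. Both inputs are hypothetical, and the decimal pattern uses + instead of * so that it requires at least one digit on each side of the period.

```shell
f=$(mktemp)
printf 'The end\nAt the The\nthe start\n3.14\n' > "$f"

starts=$(awk '/^The/ { n++ } END { print n + 0 }' "$f")      # The at the start of a line
ends=$(awk '/The$/ { n++ } END { print n + 0 }' "$f")        # The at the end of a line
either=$(awk '/[Tt]he/ { n++ } END { print n + 0 }' "$f")    # The or the anywhere
decimal=$(awk '/[0-9]+\.[0-9]+/ { print }' "$f")             # digits, a literal period, digits

# A Boolean expression on fields works as a pattern too.
high=$(printf 'x 95 90\ny 80 85\n' | awk '$2 > 90 && $2 > $3 { print $1 }')

echo "$starts $ends $either $decimal $high"
```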

  17. Built-in Variables
      • NR: Number of records read so far (the total when used in END)
      • NF: Number of fields in the current record
      • FILENAME: The current input file
      • FS: Field separator character
      • RS: Record separator character
      • OFS: Output field separator character
      • ORS: Output record separator character
      • OFMT: The output format for numbers in print (default "%.6g")
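A quick sketch showing a few of these variables at work on two hypothetical records with different field counts.

```shell
out=$(printf 'a:b:c\nd:e\n' | awk '
BEGIN { FS = ":"; OFS = "-" }    # split input on colons, join output with dashes
      { print NR, NF, $1 }       # record number, field count, first field
')
echo "$out"
```

The commas in the print statement are what insert OFS between the values.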

  18. Arrays and control structures
      • Indexed and associative arrays
        • By index: months[3] = "March";
        • Associative: debts["Kim"] = 1000;
        • Note: by convention arrays index from one, not zero
      • Counter controlled: for (i=1; i<100; i++) data[i] = i;
      • Iterator: for (i in myArray) print i, myArray[i];
      • Pre-test: i=0; while (i<20) { data[i] = i; i++ }
      • Condition:
          if (i==1) print debts["Kim"]; else print debts["Joe"];
          print ((i==1) ? debts["Kim"] : debts["Joe"]);
      • Unconditional control statements
        • break: jump out of a loop
        • continue: next iteration
        • next: get next line of input
        • exit: exit the AWK program
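The loop and array forms can be combined in one BEGIN block, as in this sketch. The names and amounts are hypothetical.

```shell
out=$(awk 'BEGIN {
    for (i = 1; i <= 3; i++) data[i] = i * i       # counter-controlled loop
    debts["Kim"] = 1000; debts["Joe"] = 250        # associative array
    for (name in debts) total += debts[name]       # iteration order is unspecified
    i = 1
    while (i <= 3) { sum += data[i]; i++ }         # pre-test loop
    print total, sum, ((i == 4) ? "done" : "not done")
}')
echo "$out"
```

Summing inside the for-in loop sidesteps the unspecified iteration order, since addition gives the same total either way.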

  19. Built-in functions
      • Square root: print sqrt(3.6)
      • Integer portion: print int(3.2)
      • Substring: print substr("abcde", 3, 2);
      • Split: n = split("a;b;c;d;e", letters, ";");
      • Position: print index("gorbachev", "bach");
      Note: if a substring doesn't exist, 0 is returned
      Note: strings index from one, not zero
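A sketch that calls each of these functions once. The array name letters and the count variable n are chosen for the example; sqrt(16) replaces sqrt(3.6) so the result is a whole number.

```shell
out=$(awk 'BEGIN {
    n = split("a;b;c;d;e", letters, ";")     # fills letters[1..n], returns n
    print n, letters[1], letters[5]
    print substr("abcde", 3, 2)              # 2 characters starting at position 3
    print index("gorbachev", "bach"), index("gorbachev", "xyz")
    print int(3.2), sqrt(16)
}')
echo "$out"
```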

  20. printf
      • printf(<template>, <arguments>);
      • printf applies the template to the arguments
      • Formats are specified in the templates
        • %d for integer output
        • %o for octal
        • %x for hexadecimal
        • %s for string
        • %e for exponential format
        • %f for floating point format
      • Greater control
        • %5.2f means 5 characters wide, two digits after the decimal point
        • %-8.4s means left justify, 8 wide, print at most 4 characters
        • %08d means 8 wide, padded with leading zeroes
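The width and precision controls are easier to see with visible delimiters. In this sketch the square brackets are only markers around each formatted value, and the values themselves are arbitrary.

```shell
out=$(awk 'BEGIN {
    printf("[%5.2f]\n", 3.14159)       # 5 wide, 2 digits after the decimal point
    printf("[%-8.4s]\n", "abcdefgh")   # left justified, 8 wide, at most 4 characters
    printf("[%08d]\n", 42)             # 8 wide, padded with leading zeroes
    printf("[%x %o]\n", 255, 8)        # hexadecimal and octal
}')
echo "$out"
```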

  21. Escape Characters
      • New line: \n
      • Carriage return: \r
      • Backspace: \b
      • Horizontal tab: \t
      • Form feed: \f
      • A quote: \"
      • A backslash: \\

  22. AWK redirection and pipes
      • Create a file with the first field
        >awk '{print $1 >> "file"}' fileName
      • Pipe output to another utility to translate from lower to upper case
        >ls -l | awk '{print $9}' | tr 'a-z' 'A-Z'
      • Sort the grades file and print the first field
        >sort +4n grades | awk '{print $1}'
      • List .txt files < 2000 bytes, print sorted descending
        >ls -l | grep '\.txt$' | awk '$5 < 2000 {print $9, $5}' | sort -nr +1
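The same ideas can be sketched without depending on the layout of ls -l output. The records below stand in for a file listing and are hypothetical, as is the output file name first.txt, which is passed to awk through the -v option. The sort step uses -k 2,2nr, the current spelling of a descending numeric sort on the 2nd field.

```shell
dir=$(mktemp -d)

# ">>" inside awk appends to the named file; out is set with -v.
printf 'b 2\na 1\n' | awk -v out="$dir/first.txt" '{ print $1 >> out }'
first=$(paste -sd, "$dir/first.txt")

# Filter with awk, then hand the result to sort through a pipe.
small=$(printf 'b.txt 1500\na.txt 2500\nc.txt 900\n' |
    awk '$2 < 2000 { print $1, $2 }' | sort -k 2,2nr | paste -sd, -)

# Pipe awk output to tr to translate lower case to upper case.
upper=$(printf 'abc 1\n' | awk '{ print $1 }' | tr 'a-z' 'A-Z')

echo "$first | $small | $upper"
```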

  23. More Examples
      • Print Bush's grades
        >awk '/Bush/ {print $3, $4}' grades
      • Print first name, last name, and quiz 3 grade for everyone who got more than a 90 on quiz 1 and 2
        >awk '{if ($4>90 && $5>90) print $3, $2, $6}' grades
        >awk '$4>90 && $5>90 {print $3, $2, $6}' grades
      • Print username for user with userid 502
        >awk -F: '{if ($3==502) print $1}' /etc/passwd
        >awk -F: '$3==502 {print $1}' /etc/passwd
