The grep Command

zandra

Presentation Transcript


  1. The grep Command

  2. Purpose & Use
     • Searches the input files for lines containing a match to a given pattern list
     • By default, copies each matching line to standard output
     • Otherwise produces whatever other sort of output is requested with options

  3. The grep command
     • Expects to do its matching on text
     • No limit on input line length other than available memory
     • Can match arbitrary characters within a line

  4. Invoking grep
     • General synopsis: grep options pattern input_file_names
     • Zero or more options
     • Zero or more input file names; with none, grep normally reads standard input
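
The synopsis above can be tried with a small sketch. The file name menu.txt and its contents are made up for illustration, and the commands assume a grep is on the PATH.

```shell
# Scratch directory with a small sample file (hypothetical contents).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'apple pie\nbanana split\napple tart\n' > "$tmp/menu.txt"

# Pattern plus one input file name.
from_file=$(grep 'apple' "$tmp/menu.txt")

# No input file names: grep reads standard input instead.
from_stdin=$(printf 'apple pie\nbanana split\n' | grep 'banana')

printf '%s\n' "$from_file" "$from_stdin"
```

The first call prints the two apple lines; the second prints the banana line read from the pipe.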

  5. Command-line Options
     • Some options are specified by POSIX.2
     • Others are GNU extensions
     • Long option names are always a GNU extension

  6. Generic Program Information
     • ‘--help’: prints a usage message summarizing the command-line options and the bug-reporting address, then exits
     • ‘-V’, ‘--version’: prints the version number of grep

  7. Matching Control
     • ‘-e pattern’, ‘--regexp=pattern’: use pattern as the pattern; can be repeated for several patterns, and protects a pattern that begins with ‘-’
     • ‘-f file’, ‘--file=file’: obtain patterns from file, one per line
     • ‘-i’, ‘-y’, ‘--ignore-case’: ignore case distinctions
     • ‘-v’, ‘--invert-match’: select non-matching lines
     • ‘-w’, ‘--word-regexp’: select only lines with matches that form whole words
     • ‘-x’, ‘--line-regexp’: select only matches that exactly match the whole line
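
A minimal sketch of the matching-control options listed above, assuming GNU grep. The files words.txt and patterns.txt are hypothetical scratch data.

```shell
# Scratch data; file names and contents are made up for illustration.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'Cat\ncat\nconcatenate\nthe cat sat\ndog\n' > "$tmp/words.txt"
printf 'dog\nsat\n' > "$tmp/patterns.txt"

ignore_case=$(grep -i 'cat' "$tmp/words.txt")              # Cat, cat, concatenate, the cat sat
inverted=$(grep -v 'cat' "$tmp/words.txt")                 # Cat, dog
whole_word=$(grep -w 'cat' "$tmp/words.txt")               # cat, the cat sat
whole_line=$(grep -x 'cat' "$tmp/words.txt")               # cat
two_patterns=$(grep -e 'dog' -e 'Cat' "$tmp/words.txt")    # Cat, dog
from_file=$(grep -f "$tmp/patterns.txt" "$tmp/words.txt")  # the cat sat, dog

printf '%s\n' "$whole_word"
```

Output always follows the order of lines in the input, not the order in which the patterns were given.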

  8. General Output Control
     • ‘-c’, ‘--count’: prints a count of matching lines instead of the lines themselves
     • ‘--color[=WHEN]’, ‘--colour[=WHEN]’: surrounds the matched strings, matching lines, context lines, file names, line numbers, byte offsets, and separators with escape sequences to display them in color on the terminal
     • ‘-L’, ‘--files-without-match’: prints the names of files without any match
     • ‘-l’, ‘--files-with-matches’: prints the names of files with a match
     • ‘-m num’, ‘--max-count=num’: stops reading a file after num matching lines; take care when combining it with ‘-c’ (the count is never greater than num) or ‘-v’ (grep stops after num non-matching lines)
     • ‘-o’, ‘--only-matching’: prints each matched part of each line on a separate line
     • ‘-q’, ‘--quiet’, ‘--silent’: no output; the result is reported through the exit status
     • ‘-s’, ‘--no-messages’: suppresses error messages about nonexistent or unreadable files
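
The output-control options above can be compared side by side. This is a sketch assuming GNU grep; a.log and b.log are hypothetical files.

```shell
# Two made-up log files: one with errors, one without.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'error: disk full\nok\nerror: timeout\n' > "$tmp/a.log"
printf 'ok\nok\n' > "$tmp/b.log"

count=$(grep -c 'error' "$tmp/a.log")                       # 2
with_match=$(grep -l 'error' "$tmp/a.log" "$tmp/b.log")     # path of a.log
without_match=$(grep -L 'error' "$tmp/a.log" "$tmp/b.log")  # path of b.log
first_only=$(grep -m 1 'error' "$tmp/a.log")                # error: disk full
capped_count=$(grep -c -m 1 'error' "$tmp/a.log")           # 1, not 2
only_matching=$(grep -o 'error' "$tmp/a.log")               # "error" on two lines

# -q prints nothing; only the exit status says whether a match exists.
if grep -q 'timeout' "$tmp/a.log"; then quiet_result=found; else quiet_result=missing; fi

printf '%s\n' "$count" "$first_only" "$quiet_result"
```

The capped_count line shows the interaction the slide warns about: ‘-m 1’ stops reading after the first match, so ‘-c’ reports 1 even though two lines would match.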

  9. Output Line Prefix Control
     • When several prefix fields are output, the order is always file name, line number, and byte offset
     • ‘-b’, ‘--byte-offset’: prints the 0-based byte offset within the input file before each line of output
     • ‘-H’, ‘--with-filename’: prints the file name for each match; the default when there is more than one file
     • ‘-h’, ‘--no-filename’: does not print file names
     • ‘--label=LABEL’: displays input actually coming from standard input as input coming from file LABEL
     • ‘-n’, ‘--line-number’: prefixes each line of output with its 1-based line number
     • ‘-T’, ‘--initial-tab’: makes sure the first character of actual line content lies on a tab stop
     • ‘-u’, ‘--unix-byte-offsets’: reports Unix-style byte offsets; used with ‘-b’
     • ‘-Z’, ‘--null’: outputs a zero byte (ASCII NUL) instead of the character that normally follows a file name
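
A short sketch of the prefix options above, assuming GNU grep (for ‘--label’ in particular). The file notes.txt and the label pipe-input are hypothetical.

```shell
# Three-line sample file: "alpha\n" is 6 bytes, "beta\n" is 5 bytes.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'alpha\nbeta\ngamma\n' > "$tmp/notes.txt"

line_no=$(grep -n 'beta' "$tmp/notes.txt")             # 2:beta
byte_off=$(grep -b 'gamma' "$tmp/notes.txt")           # 11:gamma
with_name=$(grep -H 'beta' "$tmp/notes.txt")           # <path>:beta
all_prefixes=$(grep -H -n -b 'gamma' "$tmp/notes.txt") # <path>:3:11:gamma
no_name=$(grep -h 'beta' "$tmp/notes.txt" "$tmp/notes.txt")  # beta, twice, no file names

# Give standard input a readable name in the output.
labeled=$(printf 'beta\n' | grep -H --label=pipe-input 'beta')  # pipe-input:beta

printf '%s\n' "$line_no" "$byte_off" "$labeled"
```

The all_prefixes line shows the fixed ordering: file name first, then line number, then byte offset.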

  10. Context Line Control
     • These options have no effect when ‘-o’ (‘--only-matching’) is given; grep warns if they are combined
     • ‘-A num’, ‘--after-context=num’: prints num lines of trailing context after matching lines
     • ‘-B num’, ‘--before-context=num’: prints num lines of leading context before matching lines
     • ‘-C num’, ‘-num’, ‘--context=num’: prints num lines of leading and trailing output context
     • ‘--group-separator=string’: prints string instead of ‘--’ between disjoint groups of lines
     • ‘--no-group-separator’: prints disjoint groups of lines adjacent to each other, with no separator
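
The context options above are easiest to see on a file with two separated matches. This sketch assumes GNU grep; story.txt is a hypothetical file.

```shell
# Eight-line sample with matches on lines 2 and 7.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'start\nhit one\nafter one\nfiller\nfiller\nbefore two\nhit two\nend\n' > "$tmp/story.txt"

after=$(grep -A 1 'hit' "$tmp/story.txt")       # each match plus the next line
before=$(grep -B 1 'hit' "$tmp/story.txt")      # each match plus the previous line
around=$(grep -C 1 'hit two' "$tmp/story.txt")  # one line on each side
custom=$(grep -A 1 --group-separator='=====' 'hit' "$tmp/story.txt")
joined=$(grep -A 1 --no-group-separator 'hit' "$tmp/story.txt")

printf '%s\n' "$after"
```

Because the two groups are not adjacent in the input, the first command prints a ‘--’ line between them; the last two commands replace or drop that separator.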

  11. File and Directory Selection
     • ‘-a’, ‘--text’: processes a binary file as if it were text
     • ‘--binary-files=type’: if the first few bytes of a file indicate binary data, treats the file as being of type type (‘binary’, ‘without-match’, or ‘text’)
     • ‘-D action’, ‘--devices=action’: for a device, FIFO, or socket, uses action to process it (‘read’ or ‘skip’)
     • ‘-d action’, ‘--directories=action’: for a directory, uses action to process it (‘read’, ‘skip’, or ‘recurse’)
     • ‘--exclude=glob’: skips files whose base name matches glob
     • ‘--exclude-dir=dir’: excludes directories matching the pattern dir from recursive directory searches
     • ‘-I’: processes a binary file as if it contained no match; same as ‘--binary-files=without-match’
     • ‘--include=glob’: searches only files whose base name matches glob
     • ‘-r’, ‘-R’, ‘--recursive’: for each directory given, processes all files in that directory, recursively
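
A sketch of recursive searching with the include and exclude filters above, assuming GNU grep. The directory layout (src, build) and file names are hypothetical; the results are sorted because traversal order is not guaranteed.

```shell
# Small made-up source tree.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/src" "$tmp/build"
printf 'int main(void) { return 0; } /* TODO cleanup */\n' > "$tmp/src/main.c"
printf '/* TODO document */\n' > "$tmp/src/util.h"
printf 'TODO in log\n' > "$tmp/build/out.log"

# Every file under the current directory that contains TODO.
all_files=$(cd "$tmp" && grep -r -l 'TODO' . | LC_ALL=C sort)

# Only files whose base name matches *.c.
only_c=$(cd "$tmp" && grep -r -l --include='*.c' 'TODO' .)

# Skip the build directory entirely.
no_build=$(cd "$tmp" && grep -r -l --exclude-dir=build 'TODO' . | LC_ALL=C sort)

printf '%s\n' "$all_files"
```

Quoting the glob keeps the shell from expanding ‘*.c’ before grep sees it.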

  12. Other Options
     • ‘--line-buffered’: uses line buffering on output; can cause a performance penalty
     • ‘--mmap’: ignored, kept only for backwards compatibility; it used to read input with the mmap system call, which rarely if ever yielded better performance
     • ‘-U’, ‘--binary’: treats the file(s) as binary
     • ‘-z’, ‘--null-data’: treats the input as a set of lines, each terminated by a zero byte instead of a newline
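
As a minimal illustration of ‘-z’ from the list above (assuming GNU grep), the input below consists of three records separated by zero bytes rather than newlines. The record contents are made up.

```shell
# Three NUL-terminated records; grep -z selects whole records and
# writes them NUL-terminated too, so tr converts them for display.
nul_records=$(printf 'one\0two\0three\0' | grep -z 'tw' | tr '\0' '\n')

printf '%s\n' "$nul_records"   # two
```

This pairs naturally with tools that emit NUL-separated file names, where a name may itself contain a newline.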

  13. Environment Variables
     • GREP_OPTIONS: specifies default options to be placed in front of any explicit options
     • LC_ALL, LC_COLLATE, LANG: specify the locale for the LC_COLLATE category, which determines the collating sequence used to interpret range expressions
     • LC_ALL, LC_CTYPE, LANG: specify the locale for the LC_CTYPE category, which determines the type of characters
     • LC_ALL, LC_MESSAGES, LANG: specify the locale for the LC_MESSAGES category, which determines the language that grep uses for messages
     • In each group the variables are examined in the order shown, and the first one that is set wins
     • POSIXLY_CORRECT: if set, grep behaves as POSIX.2 requires
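
Locale variables can be set for a single grep call, which is a common way to get predictable range expressions and messages. The sketch below assumes a POSIX shell; the path no-such-file is deliberately nonexistent. It leaves GREP_OPTIONS out because later GNU grep releases treat that variable as obsolescent.

```shell
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# In the C locale, [a-z] covers exactly the ASCII lowercase letters.
c_range=$(printf 'abc\nABC\n' | LC_ALL=C grep '[a-z]')   # abc

# LC_ALL also fixes the language of diagnostics (here: the C locale's English).
c_message=$(LC_ALL=C grep 'x' "$tmp/no-such-file" 2>&1)

printf '%s\n' "$c_range" "$c_message"
```

Setting the variable on the command line affects only that one invocation and leaves the rest of the session untouched.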

  14. Environment Variables
     • GREP_COLOR: specifies the color used to highlight matched text
     • GREP_COLORS: specifies the colors and other attributes used to highlight various parts of the output
     • GREP_COLORS capabilities include:
     • ‘sl=’: whole selected lines (or context matching lines when ‘rv’ and ‘-v’ are both in effect)
     • ‘cx=’: whole context lines (or selected non-matching lines when ‘rv’ and ‘-v’ are both in effect)
     • ‘rv’: Boolean value that reverses (swaps) the meanings of the ‘sl=’ and ‘cx=’ capabilities when the ‘-v’ command-line option is specified
     • ‘mt=01;31’: matching non-empty text in any matching line
     • ‘ln=32’: line numbers prefixing any content line
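
A sketch of overriding the ‘mt=’ capability, assuming GNU grep. The value 01;32 (bold green) is an arbitrary choice for illustration, and ‘--color=always’ is used only so the escape sequences survive being captured.

```shell
# Highlight matches in bold green instead of the default bold red.
colored=$(printf 'hello world\n' | GREP_COLORS='mt=01;32' grep --color=always 'world')

# cat -v makes the escape sequences visible instead of interpreting them.
printf '%s\n' "$colored" | cat -v
```

In normal interactive use ‘--color=auto’ is the safer setting, since it emits escape sequences only when the output is a terminal.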

  15. Exit Status
     • 0 if selected lines are found
     • 1 if selected lines are not found
     • 2 if an error occurred
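
The three exit statuses above can be observed directly. In this sketch users.txt is a hypothetical file and does-not-exist is a path that is deliberately missing.

```shell
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'root\ndaemon\n' > "$tmp/users.txt"

grep -q 'root' "$tmp/users.txt";               found=$?    # 0: a line was selected
grep -q 'nobody' "$tmp/users.txt";             missing=$?  # 1: no line was selected
grep -q 'root' "$tmp/does-not-exist" 2>/dev/null; failed=$?  # 2: an error occurred

printf 'found=%s missing=%s failed=%s\n' "$found" "$missing" "$failed"
```

Because 1 means "no match" rather than failure, scripts that run under ‘set -e’ usually need to handle that status explicitly.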

  16. grep Programs
     • Four major variants of grep, controlled by the following options:
     • ‘-G’, ‘--basic-regexp’: interpret the pattern as a basic regular expression (BRE). This is the default.
     • ‘-E’, ‘--extended-regexp’: interpret the pattern as an extended regular expression (ERE)
     • ‘-F’, ‘--fixed-strings’: interpret the pattern as a list of fixed strings, separated by newlines
     • ‘-P’, ‘--perl-regexp’: interpret the pattern as a Perl regular expression
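
The same pattern text can mean different things under each variant. The sketch below assumes GNU grep and uses a hypothetical file exprs.txt; ‘-P’ is left out because it is not compiled into every build.

```shell
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'a+b\naab\na.b\n' > "$tmp/exprs.txt"

basic=$(grep -G 'a+b' "$tmp/exprs.txt")     # a+b  ('+' is an ordinary character in a BRE)
extended=$(grep -E 'a+b' "$tmp/exprs.txt")  # aab  ('+' means one or more in an ERE)
fixed=$(grep -F 'a.b' "$tmp/exprs.txt")     # a.b  (no character is special)
basic_dot=$(grep 'a.b' "$tmp/exprs.txt")    # all three lines ('.' matches any character)

printf '%s\n' "$basic" "$extended" "$fixed"
```

‘-F’ is the convenient choice when the search text comes from user input and should never be interpreted as a regular expression.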

  17. Regular Expressions
     • A regular expression is a pattern that describes a set of strings
     • The fundamental building blocks are the regular expressions that match a single character
     • ‘.’ matches any single character
     • A regular expression may be followed by one of several repetition operators:
     • ‘?’ The preceding item will be matched at most once.
     • ‘*’ The preceding item will be matched zero or more times.
     • ‘+’ The preceding item will be matched one or more times.
     • ‘{n}’ The preceding item is matched exactly n times.
     • ‘{n,}’ The preceding item is matched n or more times.
     • ‘{,m}’ The preceding item is matched at most m times.
     • ‘{n,m}’ The preceding item is matched at least n times, but not more than m times.
     • Two regular expressions may be joined by the infix operator ‘|’; the result matches any string matching either one
     • Precedence: repetition binds tighter than concatenation, which binds tighter than alternation
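
Each repetition operator above can be checked against the same four strings. This sketch assumes GNU grep and uses ‘-E’ so the operators need no backslashes, with ‘-x’ so a pattern must cover the whole line. The helper name match and the file strings.txt are hypothetical.

```shell
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'ac\nabc\nabbc\nabbbc\n' > "$tmp/strings.txt"

# Hypothetical helper: list the lines fully matched by an ERE, joined by spaces.
match() { grep -xE "$1" "$tmp/strings.txt" | paste -sd ' ' -; }

optional=$(match 'ab?c')      # ac abc
star=$(match 'ab*c')          # ac abc abbc abbbc
plus=$(match 'ab+c')          # abc abbc abbbc
exactly2=$(match 'ab{2}c')    # abbc
at_least2=$(match 'ab{2,}c')  # abbc abbbc
at_most2=$(match 'ab{,2}c')   # ac abc abbc   (a GNU extension)
between=$(match 'ab{1,2}c')   # abc abbc
either=$(match 'ac|abbbc')    # ac abbbc

printf '%s\n' "$optional" "$plus" "$either"
```

In every pattern the operator applies only to the ‘b’ before it, which is the precedence rule in action: repetition is resolved before concatenation, and alternation last.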

  18. Bracket Expressions
     • A bracket expression is a list of characters enclosed by ‘[’ and ‘]’; it matches any single character in that list
     • If the first character of the list is the caret (‘^’), it matches any single character not in the list
     • Classes of characters (the class name must itself appear inside a bracket expression, as in ‘[[:alnum:]]’):
     • ‘[:alnum:]’ Alphanumeric characters
     • ‘[:alpha:]’ Alphabetic characters
     • ‘[:blank:]’ Blank characters: space and tab
     • ‘[:cntrl:]’ Control characters
     • ‘[:digit:]’ Digits
     • ‘[:graph:]’ Graphical characters: ‘[:alnum:]’ and ‘[:punct:]’
     • ‘[:lower:]’ Lower-case letters
     • ‘[:print:]’ Printable characters: ‘[:alnum:]’, ‘[:punct:]’, and space
     • ‘[:punct:]’ Punctuation characters: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
     • ‘[:space:]’ Space characters: tab, newline, vertical tab, form feed, carriage return, and space
     • ‘[:upper:]’ Upper-case letters
     • ‘[:xdigit:]’ Hexadecimal digits
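
A sketch of bracket expressions and character classes, assuming GNU grep and the C locale for predictable results. The helper name match and the file tokens.txt are hypothetical. Note the doubled brackets: the class name sits inside a bracket expression.

```shell
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'abc\nABC\n123\na1!\n' > "$tmp/tokens.txt"

# Hypothetical helper: lines consisting entirely of characters matched by
# the given ERE, joined by spaces.
match() { LC_ALL=C grep -xE "$1" "$tmp/tokens.txt" | paste -sd ' ' -; }

all_digits=$(match '[[:digit:]]+')   # 123
all_upper=$(match '[[:upper:]]+')    # ABC
no_digits=$(match '[^[:digit:]]+')   # abc ABC
alnum_only=$(match '[[:alnum:]]+')   # abc ABC 123
plain_list=$(match '[abc]+')         # abc

# Without -x: any line that contains at least one punctuation character.
has_punct=$(LC_ALL=C grep '[[:punct:]]' "$tmp/tokens.txt")   # a1!

printf '%s\n' "$all_digits" "$no_digits" "$has_punct"
```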

  19. The Backslash Character and Special Expressions
     • ‘\b’ Match the empty string at the edge of a word.
     • ‘\B’ Match the empty string provided it’s not at the edge of a word.
     • ‘\<’ Match the empty string at the beginning of a word.
     • ‘\>’ Match the empty string at the end of a word.
     • ‘\w’ Match a word constituent; it is a synonym for ‘[_[:alnum:]]’.
     • ‘\W’ Match a non-word constituent; it is a synonym for ‘[^_[:alnum:]]’.
     • Ex: ‘\brat\b’ matches the separate word ‘rat’; ‘\Brat\B’ matches ‘crate’ but not ‘furry rat’
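
The slide's ‘rat’ example can be run as written. This sketch assumes GNU grep, where these backslash expressions are available; the file animals.txt is hypothetical.

```shell
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'rat\ncrate\nfurry rat\nrats\n' > "$tmp/animals.txt"

whole_word=$(grep '\brat\b' "$tmp/animals.txt" | paste -sd ',' -)  # rat,furry rat
inside=$(grep '\Brat\B' "$tmp/animals.txt")                        # crate
word_start=$(grep '\<rat' "$tmp/animals.txt" | paste -sd ',' -)    # rat,furry rat,rats
word_end=$(grep 'rat\>' "$tmp/animals.txt" | paste -sd ',' -)      # rat,furry rat

# \w counts the underscore as a word constituent, but not the hyphen.
underscore=$(printf 'a_b\na-b\n' | grep -x '\w\w\w')               # a_b

printf '%s\n' "$whole_word" "$inside" "$underscore"
```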

  20. Example & Questions
     • grep -i 'hello.*world' menu.h main.c
     • How can you list just the names of matching files?
     • How do you search directories recursively?
     • What if a pattern has a leading ‘-’?
     • How do you search for a whole word, not a part of a word?
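
One possible set of answers, sketched with GNU grep and options from the earlier slides. The contents of menu.h and main.c are made up so that each command has something to find.

```shell
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf '/* Hello, World menu */\n' > "$tmp/menu.h"
printf 'puts("hello world");\n--version flag\nworldwide\n' > "$tmp/main.c"

# The slide's example: case-insensitive search in two files.
example=$(cd "$tmp" && grep -i 'hello.*world' menu.h main.c)

# Just the names of matching files: -l.
names=$(cd "$tmp" && grep -l -i 'hello' menu.h main.c)

# Search a directory recursively: -r (sorted, since traversal order may vary).
recursive=$(cd "$tmp" && grep -r -l -i 'hello' . | LC_ALL=C sort)

# A pattern with a leading '-': introduce it with -e.
leading_dash=$(grep -e '--version' "$tmp/main.c")

# A whole word, not part of a word: -w skips "worldwide".
whole_word=$(grep -w 'world' "$tmp/main.c")

printf '%s\n' "$example" "$leading_dash" "$whole_word"
```

Without ‘-e’, grep would try to parse ‘--version’ as one of its own options instead of treating it as the pattern.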
