The grep command

The grep Command - PowerPoint PPT Presentation



PowerPoint Slideshow about 'The grep Command' - zandra


Presentation Transcript

The grep Command


Purpose & Use

  • Searches the input files for lines containing a match to a given pattern list

  • By default, copies each matching line to standard output

  • Alternatively, produces whatever other sort of output is requested with options


The grep command

  • Expects to do its matching on text

  • No limit on input line length other than available memory

  • Can match arbitrary characters within a line


Invoking grep

  • General synopsis:

    • grep options pattern input_file_names

  • Zero or more options

  • Zero or more input file names
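A minimal sketch of the synopsis in practice, assuming GNU grep and a POSIX shell. The file names `a.txt` and `b.txt` are made up for the demo. It shows one invocation with an option, a pattern, and two input files, and one with no input file names, where grep reads standard input.

```shell
# Throwaway directory with two small, hypothetical input files.
demo=$(mktemp -d)
printf 'hello world\ngoodbye\n' > "$demo/a.txt"
printf 'say hello\n' > "$demo/b.txt"

# One option (-n), a pattern, and two input file names.
with_files=$(cd "$demo" && grep -n 'hello' a.txt b.txt)
printf '%s\n' "$with_files"

# Zero input file names: grep searches standard input.
from_stdin=$(printf 'hello again\nbye\n' | grep 'hello')
printf '%s\n' "$from_stdin"

rm -rf "$demo"
```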


Command-line Options

  • Some options come from POSIX.2

  • Others are GNU extensions

    • long option names are always a GNU extension


Generic Program Information

  • ‘--help’

    • prints a usage message summarizing the command-line options and the bug-reporting address, then exits

  • ‘-V’

    ‘--version’

    • prints the version number of grep


Matching Control

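  • ‘-e pattern’ Use pattern as the pattern; useful for giving several patterns or a pattern that begins with ‘-’

    ‘--regexp=pattern’

  • ‘-f file’ Obtain patterns from file, one per line

    ‘--file=file’

  • ‘-i’ ‘-y’ Ignore case distinctions in both the pattern and the input (‘-y’ is an obsolete synonym)

    ‘--ignore-case’

  • ‘-v’ Select non-matching lines

    ‘--invert-match’

  • ‘-w’ Select only lines containing matches that form whole words

    ‘--word-regexp’

  • ‘-x’ Select only matches that exactly match the whole line

    ‘--line-regexp’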
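The matching-control options above can be tried on a small file. This is a sketch assuming GNU grep; the file names and sample lines are invented for illustration.

```shell
demo=$(mktemp -d)
printf 'Hello world\nhello\nothello\nother line\n' > "$demo/in.txt"
printf 'world\nother\n' > "$demo/patterns.txt"

# -i: case-insensitive, so 'Hello world', 'hello', and 'othello' all match.
ignore_case=$(grep -i 'hello' "$demo/in.txt")

# -v: lines that do NOT contain lowercase 'hello'.
inverted=$(grep -v 'hello' "$demo/in.txt")

# -w: 'world' must appear as a whole word.
whole_word=$(grep -w 'world' "$demo/in.txt")

# -x: the pattern must match the entire line.
whole_line=$(grep -x 'hello' "$demo/in.txt")

# -e may be repeated; a line matching either pattern is selected.
two_patterns=$(grep -e 'world' -e 'othello' "$demo/in.txt")

# -f: read the patterns from a file, one per line.
from_file=$(grep -f "$demo/patterns.txt" "$demo/in.txt")

printf '%s\n' "$ignore_case" "$inverted" "$whole_word" "$whole_line" "$two_patterns" "$from_file"
rm -rf "$demo"
```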


General Output Control

  • ‘-c’, ‘--count’ prints a count of matching lines for each input file instead of the lines themselves

  • ‘--color[=WHEN]’, ‘--colour[=WHEN]’ surrounds the matched strings, matching lines, context lines, file names, line numbers, byte offsets, and separators with escape sequences to display them in color on the terminal; WHEN is ‘never’, ‘always’, or ‘auto’

  • ‘-L’ prints the names of files without any match

    ‘--files-without-match’

  • ‘-l’ prints the names of files with a match

    ‘--files-with-matches’

  • ‘-m num’ stops reading a file after num matching lines

    ‘--max-count=num’ (!) with ‘-c’ the count is never greater than num; with ‘-v’ grep stops after num non-matching lines

  • ‘-o’ prints only the matched parts of each matching line, each on a

    ‘--only-matching’ separate line

  • ‘-q’, ‘--quiet’, ‘--silent’ writes nothing to standard output; exits with status 0 as soon as a match is found

  • ‘-s’, ‘--no-messages’ suppresses error messages about nonexistent or unreadable files
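A short sketch of how some of these output options differ, assuming GNU grep (‘-o’ in particular is a GNU extension). The files `a.txt` and `b.txt` are hypothetical.

```shell
demo=$(mktemp -d)
printf 'foo bar foo\nbaz\nfoo\n' > "$demo/a.txt"
printf 'nothing here\n' > "$demo/b.txt"

# -c counts matching LINES (2), not individual matches (3).
count=$(grep -c 'foo' "$demo/a.txt")

# -o prints every match on its own line, so 'foo' appears three times.
only_matching=$(grep -o 'foo' "$demo/a.txt")

# -l lists files with a match, -L lists files without one.
with_match=$(cd "$demo" && grep -l 'foo' a.txt b.txt)
without_match=$(cd "$demo" && grep -L 'foo' a.txt b.txt)

# -m 1 stops reading after the first matching line.
first_only=$(grep -m 1 'foo' "$demo/a.txt")

# -q prints nothing; only the exit status says whether a match exists.
if grep -q 'foo' "$demo/a.txt"; then quiet_result=found; else quiet_result=missing; fi

printf '%s\n' "$count" "$only_matching" "$with_match" "$without_match" "$first_only" "$quiet_result"
rm -rf "$demo"
```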


Output Line Prefix Control

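  • when several prefix fields are printed, the order is always file name, line number, and byte offset

  • ‘-b’, ‘--byte-offset’ prints the 0-based byte offset of each output line within its input file

  • ‘-H’, ‘--with-filename’ prints the file name for each match; default when more than one file is searched

  • ‘-h’, ‘--no-filename’ suppresses file names; default when only one file (or standard input) is searched

  • ‘--label=LABEL’ displays input coming from standard input as if it came from file LABEL

  • ‘-n’, ‘--line-number’ prefixes each output line with its 1-based line number

  • ‘-T’, ‘--initial-tab’ makes sure the first character of line content lies on a tab stop

  • ‘-u’, ‘--unix-byte-offsets’ reports Unix-style byte offsets; has no effect unless ‘-b’ is also used

  • ‘-Z’, ‘--null’ outputs a zero byte (ASCII NUL) instead of the character that normally follows a file name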
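To make the prefix order concrete, here is a sketch assuming GNU grep. The file names and the label `input` are invented. The byte offset of `gamma` is 11 because the two preceding lines occupy 6 and 5 bytes including their newlines.

```shell
demo=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$demo/a.txt"
printf 'gamma ray\n' > "$demo/b.txt"

# -n: 1-based line number; -b: 0-based byte offset of the line.
line_no=$(grep -n 'gamma' "$demo/a.txt")
offset=$(grep -b 'gamma' "$demo/a.txt")

# All three prefixes together: file name, then line number, then byte offset.
all_prefixes=$(cd "$demo" && grep -Hnb 'gamma' a.txt)

# -h drops the file names that would otherwise appear with two input files.
no_names=$(cd "$demo" && grep -h 'gamma' a.txt b.txt)

# --label names standard input in the file-name prefix.
labelled=$(printf 'gamma\n' | grep -H --label=input 'gamma')

printf '%s\n' "$line_no" "$offset" "$all_prefixes" "$no_names" "$labelled"
rm -rf "$demo"
```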


Context Line Control

  • Has no effect when ‘-o’ (‘--only-matching’) is given; grep prints a warning instead

  • ‘-A num’ prints num lines of trailing context

    ‘--after-context=num’ after matching lines

  • ‘-B num’ prints num lines of leading context

    ‘--before-context=num’ before matching lines

  • ‘-C num’, ‘-num’ Print num lines of leading and trailing

    ‘--context=num’ output context

  • ‘--group-separator=string’ print string instead of ‘--’ around disjoint groups of lines.

  • ‘--no-group-separator’ print disjoint groups of lines adjacent to each other
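A sketch of the context options on a nine-line input, assuming a GNU grep recent enough to support ‘--group-separator’. The sample words are arbitrary.

```shell
demo=$(mktemp -d)
printf 'one\ntwo\nthree\nfour\nfive\nsix\nseven\neight\nnine\n' > "$demo/n.txt"

# One line of trailing context, then one line of leading context.
after=$(grep -A 1 'two' "$demo/n.txt")
before=$(grep -B 1 'two' "$demo/n.txt")

# Two disjoint groups of output are separated by '--' by default.
both=$(grep -C 1 -e 'two' -e 'eight' "$demo/n.txt")

# The separator can be replaced or dropped entirely.
custom=$(grep -C 1 --group-separator='====' -e 'two' -e 'eight' "$demo/n.txt")
joined=$(grep -C 1 --no-group-separator -e 'two' -e 'eight' "$demo/n.txt")

printf '%s\n' "$after" "$before" "$both" "$custom" "$joined"
rm -rf "$demo"
```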


File and Directory Selection

  • ‘-a’, ‘--text’ process a binary file as if it were text

  • ‘--binary-files=type’ if the first few bytes of a file indicate binary data, assume the file is of type type (‘binary’, ‘without-match’, or ‘text’)

  • ‘-D action’ for device, FIFO, or socket file, use

    ‘--devices=action’ action to process it (read or skip)

  • ‘-d action’ for directory file, use action to process it

    ‘--directories=action’ (read, skip, recurse)

  • ‘--exclude=glob’ skip files whose base name matches glob

  • ‘--exclude-dir=dir’ exclude directories matching the pattern dir from recursive directory searches

  • ‘-I’ process a binary file as if it did not contain matching data

    ‘--binary-files=without-match’

  • ‘--include=glob’ search only files whose base name matches glob

  • ‘-r’, ‘-R’, ‘--recursive’ for each directory operand, process all files under that directory, recursively
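The selection options combine naturally with a recursive search. The sketch below assumes GNU grep and uses a made-up directory layout (`src/`, `build/`). Results are piped through `sort` because the order in which grep visits directory entries is not something to rely on.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/build"
printf 'int main(void) { return 0; }\n' > "$demo/src/main.c"
printf 'main idea\n' > "$demo/src/notes.txt"
printf 'int main(void);\n' > "$demo/build/main.c"

# -r: search every file under the current directory.
all_files=$(cd "$demo" && grep -rl 'main' . | LC_ALL=C sort)

# --include: only files whose base name matches the glob.
c_only=$(cd "$demo" && grep -rl --include='*.c' 'main' . | LC_ALL=C sort)

# --exclude-dir: skip the build directory during recursion.
no_build=$(cd "$demo" && grep -rl --exclude-dir=build 'main' . | LC_ALL=C sort)

printf '%s\n' "$all_files" "$c_only" "$no_build"
rm -rf "$demo"
```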


Other Options

  • ‘--line-buffered’ use line buffering on output; can cause a performance penalty

  • ‘--mmap’ ignored for backwards compatibility; it used to read input with the mmap system call, which rarely if ever yields better performance on modern systems

  • ‘-U’, ‘--binary’ treats the file(s) as binary; affects only MS-DOS and MS-Windows, where grep otherwise strips carriage returns from text files

  • ‘-z’, ‘--null-data’ treats the input as a set of lines, each terminated by a zero byte (ASCII NUL) instead of a newline
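As an illustration of ‘-z’, the sketch below (assuming GNU grep) feeds three NUL-terminated records to grep. The selected records are written NUL-terminated as well, so `tr` converts the zero bytes to newlines for display.

```shell
# Three records separated by zero bytes rather than newlines.
selected=$(printf 'one\0two\0three\0' | grep -z 't' | tr '\0' '\n')
printf '%s\n' "$selected"
```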


Environment Variables

  • GREP_OPTIONS specifies default options to be placed in front of any explicit options

  • LC_ALL, LC_COLLATE, LANG (checked in that order) specify the locale for the LC_COLLATE category, which determines the collating sequence used to interpret range expressions like ‘[a-z]’

  • LC_ALL, LC_CTYPE, LANG (checked in that order) specify the locale for the LC_CTYPE category, which determines the type of characters, e.g., which characters are whitespace

  • LC_ALL, LC_MESSAGES, LANG (checked in that order) specify the locale for the LC_MESSAGES category, which determines the language that grep uses for messages

  • POSIXLY_CORRECT if set, grep behaves as POSIX.2 requires
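Because range expressions depend on the collating sequence, one common way to get predictable results is to set the locale for a single command. A minimal sketch, assuming the C locale's byte-order collation:

```shell
# In the C locale, [a-z] covers exactly the lowercase ASCII letters.
lower_only=$(printf 'abc\nABC\n' | LC_ALL=C grep '^[a-z]*$')
printf '%s\n' "$lower_only"
```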


Environment Variables

  • GREP_COLOR specifies the color used to highlight matched text; deprecated in favor of GREP_COLORS

  • GREP_COLORS specifies the colors and other attributes used to highlight various parts of the output

    • Capabilities:

      • sl= whole selected lines (matching lines, or non-matching lines when ‘-v’ is given)

      • cx= whole context lines (non-matching lines, or matching lines when ‘-v’ is given)

      • rv Boolean value that reverses (swaps) the meanings of the ‘sl=’ and ‘cx=’ capabilities when the ‘-v’ command-line option is specified

      • mt=01;31 matching non-empty text in any matching line (default: bold red)

      • ln=32 line numbers prefixing any content line (default: green)
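A sketch of overriding one capability, assuming GNU grep. ‘--color=always’ forces escape sequences even though the output is captured rather than shown on a terminal. The exact bytes grep emits around a match may vary between versions, so the example only shows that the chosen attribute string ends up in the output.

```shell
# Highlight matches in bold green (01;32) instead of the default bold red.
colored=$(printf 'hello world\n' | GREP_COLORS='mt=01;32' grep --color=always 'world')

# With --color=never the line is printed without any escape sequences.
plain=$(printf 'hello world\n' | grep --color=never 'world')

printf '%s\n' "$colored" "$plain"
```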


Exit Status

  • 0 if selected lines are found

  • 1 if selected lines are not found

  • 2 if an error occurred
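The three exit statuses can be observed directly from the shell. A minimal sketch with a made-up file name; ‘-q’ keeps grep silent and the error message for the missing file is discarded.

```shell
demo=$(mktemp -d)
printf 'needle\n' > "$demo/hay.txt"

found=0;   grep -q 'needle' "$demo/hay.txt" || found=$?
missing=0; grep -q 'absent' "$demo/hay.txt" || missing=$?
failed=0;  grep -q 'needle' "$demo/no-such-file" 2>/dev/null || failed=$?

echo "$found $missing $failed"   # prints: 0 1 2
rm -rf "$demo"
```

In scripts the status is usually tested directly, as in `if grep -q pattern file; then ...; fi`.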


grep Programs

  • four major variants of grep, controlled by the following options:

    • ‘-G’, ‘--basic-regexp’

      • Interpret the pattern as a basic regular expression (BRE). This is the default.

    • ‘-E’, ‘--extended-regexp’

      • Interpret the pattern as an extended regular expression (ERE)

    • ‘-F’, ‘--fixed-strings’

      • Interpret the pattern as a list of fixed strings, separated by newlines, any of which is to be matched

    • ‘-P’, ‘--perl-regexp’

      • Interpret the pattern as a Perl regular expression
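The same pattern text can select different lines depending on the variant. The sketch below assumes GNU grep and leaves out ‘-P’, since Perl-regexp support is not compiled into every build.

```shell
input=$(printf 'a+b\naab\nab\n')

# -G (the default): in a basic regular expression '+' is an ordinary character.
basic=$(printf '%s\n' "$input" | grep -G 'a+b')

# -E: '+' is a repetition operator, so this means one or more 'a' followed by 'b'.
extended=$(printf '%s\n' "$input" | grep -E 'a+b')

# -F: the pattern is a fixed string; no character is special.
fixed=$(printf '%s\n' "$input" | grep -F 'a+b')

# -F with a newline-separated list: a line matching any of the strings is selected.
fixed_list=$(printf '%s\n' "$input" | grep -F "$(printf 'a+b\naab')")

printf '%s\n' "$basic" "$extended" "$fixed" "$fixed_list"
```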


Regular Expressions

  • A regular expression is a pattern that describes a set of strings

  • The fundamental building blocks are the regular expressions that match a single character

    • ‘.’ matches any single character

  • A regular expression may be followed by one of several repetition operators:

    • ‘?’ The preceding item is optional and will be matched at most once.

    • ‘*’ The preceding item will be matched zero or more times.

    • ‘+’ The preceding item will be matched one or more times.

    • ‘{n}’ The preceding item is matched exactly n times.

    • ‘{n,}’ The preceding item is matched n or more times.

    • ‘{,m}’ The preceding item is matched at most m times.

    • ‘{n,m}’ The preceding item is matched at least n times, but not more than m times.

  • Two regular expressions may be joined by the infix operator ‘|’; the result matches any string matching either one

  • Precedence: repetition binds tighter than concatenation, which binds tighter than alternation
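A sketch of these operators, assuming GNU grep. The examples use ‘-E’ because in basic regular expressions ‘?’, ‘+’, ‘{’, and ‘|’ are ordinary characters unless backslashed; the last command shows the backslashed basic form. The sample strings are arbitrary.

```shell
input=$(printf 'ac\nabc\nabbc\nabbbc\nxyz\n')
try() { printf '%s\n' "$input" | grep "$@"; }   # small helper for this demo only

optional=$(try -E 'ab?c')        # zero or one 'b'
star=$(try -E 'ab*c')            # zero or more 'b'
plus=$(try -E 'ab+c')            # one or more 'b'
exactly=$(try -E 'ab{2}c')       # exactly two 'b'
at_least=$(try -E 'ab{2,}c')     # two or more 'b'
at_most=$(try -E 'ab{,1}c')      # at most one 'b'
between=$(try -E 'ab{1,2}c')     # one or two 'b'
any_char=$(try -E 'a.c')         # 'a', any single character, 'c'
either=$(try -E 'ac|xyz')        # alternation binds loosest
basic_form=$(try 'ab\{2\}c')     # the same interval in a basic regular expression

printf '%s\n' "$optional" "$star" "$plus" "$exactly" "$at_least" \
  "$at_most" "$between" "$any_char" "$either" "$basic_form"
```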


Bracket Expressions

  • A list of characters enclosed by ‘[’ and ‘]’; matches any single character in that list

  • If the first character of the list is the caret ‘^’, it matches any character not in the list

  • classes of characters:

    • ‘[:alnum:]’ Alphanumeric characters

    • ‘[:alpha:]’ Alphabetic characters

    • ‘[:blank:]’ Blank characters: space and tab.

    • ‘[:cntrl:]’ Control characters.

    • ‘[:digit:]’ Digits

    • ‘[:graph:]’ Graphical characters: ‘[:alnum:]’ and ‘[:punct:]’.

    • ‘[:lower:]’ Lower-case letters

    • ‘[:print:]’ Printable characters: ‘[:alnum:]’, ‘[:punct:]’, and space.

    • ‘[:punct:]’ Punctuation characters: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~.

    • ‘[:space:]’ Space characters: tab, newline, vertical tab, form feed, carriage return, and space.

    • ‘[:upper:]’ Upper-case letters

    • ‘[:xdigit:]’ Hexadecimal digits
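Note that a class name such as ‘[:digit:]’ is only valid inside a bracket expression, so it is written with two pairs of brackets. A short sketch on invented sample lines:

```shell
input=$(printf 'abc\n123\nABC\na-1\n')
try() { printf '%s\n' "$input" | grep "$@"; }   # small helper for this demo only

in_list=$(try '[c3]')                            # contains 'c' or '3'
not_in_list=$(try '^[^abc]')                     # first character is not a, b, or c
digits_only=$(try '^[[:digit:]][[:digit:]]*$')   # one or more digits, nothing else
has_upper=$(try '[[:upper:]]')
has_punct=$(try '[[:punct:]]')

printf '%s\n' "$in_list" "$not_in_list" "$digits_only" "$has_upper" "$has_punct"
```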


The Backslash Character and Special Expressions

  • ‘\b’ Match the empty string at the edge of a word.

  • ‘\B’ Match the empty string provided it’s not at the edge of a word.

  • ‘\<’ Match the empty string at the beginning of a word.

  • ‘\>’ Match the empty string at the end of a word.

  • ‘\w’ Match word constituent; it is a synonym for ‘[_[:alnum:]]’.

  • ‘\W’ Match non-word constituent; it is a synonym for ‘[^_[:alnum:]]’.

  • Ex: ‘\brat\b’ matches the separate word ‘rat’; ‘\Brat\B’ matches ‘crate’ but not ‘furry rat’
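The slide's example can be checked directly. This sketch assumes GNU grep, where these backslash expressions are available. The line `rat_race` is added to show that the underscore counts as a word constituent.

```shell
input=$(printf 'furry rat\ncrate\nrat_race\n')
try() { printf '%s\n' "$input" | grep "$@"; }   # small helper for this demo only

separate_word=$(try '\brat\b')   # 'rat' with a word edge on both sides
inside_word=$(try '\Brat\B')     # 'rat' with no word edge on either side
word_start=$(try '\<rat')        # 'rat' at the beginning of a word
word_end=$(try 'rat\>')          # 'rat' at the end of a word

# \w includes the underscore, so the whole identifier is one run of word characters.
word_run=$(printf 'rat_race!\n' | grep -o '\w\w*')
non_word=$(printf 'rat_race!\n' | grep -o '\W')

printf '%s\n' "$separate_word" "$inside_word" "$word_start" "$word_end" "$word_run" "$non_word"
```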


Example & Questions

grep -i 'hello.*world' menu.h main.c

  • How can you list just the names of matching files?

  • How do you search directories recursively?

  • What if a pattern has a leading ‘-’?

  • How do you search for a whole word, not a part of a word?
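One possible set of answers to the four questions, sketched with GNU grep on a hypothetical directory of files. All file names and contents are invented for the demo.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf '%s\n' 'printf("hello world");' > "$demo/main.c"
printf '%s\n' 'say_hello();' > "$demo/util.c"
printf '%s\n' 'hello' > "$demo/sub/deep.c"
printf '%s\n' '--cut here--' > "$demo/notes.txt"
cd "$demo" || exit 1

# 1. List just the names of matching files: -l
names=$(grep -l 'hello' *.c)

# 2. Search directories recursively: -r
recursive=$(grep -rl 'hello' . | LC_ALL=C sort)

# 3. A pattern with a leading '-': pass it with -e so it is not taken as an option
leading_dash=$(grep -e '--cut here--' notes.txt)

# 4. Whole words only: -w ('say_hello' does not count, since '_' is a word character)
whole_word=$(grep -w 'hello' *.c)

printf '%s\n' "$names" "$recursive" "$leading_dash" "$whole_word"
cd / && rm -rf "$demo"
```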