Lecture 10
Introduction to AWK

COP 3344 Introduction to UNIX



What is AWK

  • Important early text manipulation language

    • Created by Al Aho, Peter Weinberger & Brian Kernighan

  • This Unix utility manipulates text files that are viewed as arranged in columns

  • awk reads its input line by line (from standard input or a set of files), splits each line into fields, and processes it. Fields are separated by whitespace by default, but any specified character can serve as the field separator

  • There are also other flavors of awk such as nawk and gawk
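As a small illustration of the field splitting described above, the sketch below feeds one line to awk twice: once with the default whitespace separator and once with a comma. The sample text and the shell variable names are made up for this example.

```shell
# Default splitting: any run of blanks or tabs separates fields.
default_split=$(printf 'alice   42\tadmin\n' | awk '{ print $2 }')
echo "$default_split"

# Splitting on a specific character instead (here a comma).
comma_split=$(printf 'alice,42,admin\n' | awk -F, '{ print $3 }')
echo "$comma_split"
```

The first command prints the second column even though the columns are separated by a mix of spaces and a tab.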



Awk Command Structure

  • awk [options] 'program' [file(s)]

  • awk [options] -f programfile [file(s)]

  • A program can be one or more pairs of the following:

    • pattern { procedure }

  • BEGIN and END constructs can also be used

  • An important option is -Fc, where c is the field separator to use. For example, awk -F: . . . indicates that the separator is ":"

  • Example

    • awk -F: '/this/ { print $2 }' file1
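The two invocation forms can be compared side by side. This is only a sketch: the temporary directory, the file names file1 and prog.awk, and the sample records are assumptions made for the example.

```shell
# Hypothetical colon-separated input.
workdir=$(mktemp -d)
printf 'this:first:x\nthat:second:y\n' > "$workdir/file1"

# Form 1: program given inline on the command line.
inline_out=$(awk -F: '/this/ { print $2 }' "$workdir/file1")

# Form 2: the same program stored in a file and loaded with -f.
printf '/this/ { print $2 }\n' > "$workdir/prog.awk"
file_out=$(awk -F: -f "$workdir/prog.awk" "$workdir/file1")

echo "$inline_out"
echo "$file_out"
rm -r "$workdir"
```

Both forms should print the same result; -f tends to be more convenient once a program grows beyond one line.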



Awk Program Processing

  • awk scans each input line for each pattern, and when a match occurs the actions defined by the associated procedure are executed. The general form of a program is:

    • BEGIN { initial statements }

    • pattern { procedure }

    • pattern { procedure }

    • END { final statements }

  • If the pattern is missing, the procedure is applied to each line

  • If procedure is missing, then the matched lines are written to standard output

  • Fields are referred to by the variables $1, $2, …, $n. $0 refers to the entire record (the line).

  • Statements following BEGIN are done before any pattern-procedures; statements after END are done after all pattern-procedures.

  • Many short awk programs consist of a single pattern { procedure } pair
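The processing rules above can be tried on a few lines of input. The fruit data and variable names below are invented for illustration; the three commands show a full BEGIN/pattern/END program, a pattern with no procedure, and a procedure with no pattern.

```shell
input='apple 3
banana 5
cherry 7'

# BEGIN runs once before input, the middle rule once per matching line, END once after.
report=$(printf '%s\n' "$input" | awk '
BEGIN { print "start" }
/an/  { print $1 }
END   { print "lines read:", NR }')
echo "$report"

# Pattern with no procedure: matching lines are printed unchanged.
only_pattern=$(printf '%s\n' "$input" | awk '$2 > 4')
echo "$only_pattern"

# Procedure with no pattern: applied to every line.
only_proc=$(printf '%s\n' "$input" | awk '{ print $2 }')
echo "$only_proc"
```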



    awk patterns

    • awk patterns can take any of the following forms

      • /regular expression/

      • relational expression

      • field-matching expression

    • Example patterns

      • /this/

      • /^alpha*/

      • NF > 2

      • $1 == $2

      • $1 ~ /m$/
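To see which lines each example pattern selects, the sketch below runs all five against the same small input. The sample lines and the helper function try are hypothetical and exist only for this demonstration.

```shell
sample='this is a line
alpha beta
alph gamma
same same
gum drop'

# Helper (hypothetical name): run one awk program over the sample lines.
try() { printf '%s\n' "$sample" | awk "$1"; }

re_out=$(try '/this/')          # regular expression on the whole line
star_out=$(try '/^alpha*/')     # * applies only to the final a, so alph also matches
nf_out=$(try 'NF > 2')          # relational expression on the field count
eq_out=$(try '$1 == $2')        # relational expression comparing two fields
fm_out=$(try '$1 ~ /m$/')       # field-matching expression

printf '%s\n' "$re_out" "$star_out" "$nf_out" "$eq_out" "$fm_out"
```

Note that /^alpha*/ selects both the alpha and the alph line, because the star makes only the last a optional and repeatable.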



    Example pattern-procedures

    • Print the second field of each line

      { print $2 }

    • Print the first field of all lines that contain the pattern alpha

      /alpha/ { print $1 }

    • Print all records containing more than two fields

      NF > 2

    • Add numbers in second column if first field matches the word “add”

      $1 ~ /^add$/ { total += $2 }

      END { print "total is", total }
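The last pattern-procedure pair can be run as a complete program. The input records here are made up; the address line is included to show that the anchored pattern /^add$/ does not match a longer word that merely starts with add.

```shell
data='add 10
sub 4
add 5
address 100'

total_line=$(printf '%s\n' "$data" | awk '
$1 ~ /^add$/ { total += $2 }
END          { print "total is", total }')
echo "$total_line"
```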



    awk Regular Expressions

    • Regular expressions are formed in the same way as for extended grep (egrep); all of its operators are available

    • Note that regular expressions must be placed between slashes: /<regular expression>/

    • Examples

      • /D[Rr]\./ #matches any line containing DR. or Dr.

      • /^alpha/ #matches any line starting with alpha

      • /^[a-zA-Z]+/ #matches any line starting with a sequence of letters (one or more)
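The three example expressions can be checked against a handful of lines. The text below is invented for the purpose; the line beginning with Drive is there to show that the escaped dot in /D[Rr]\./ must match a literal period.

```shell
text='Call DR. Smith
ask Dr. Jones
Drive home
alphabet soup
42 is a number'

dr_out=$(printf '%s\n' "$text" | awk '/D[Rr]\./')
alpha_out=$(printf '%s\n' "$text" | awk '/^alpha/')
# Count the lines that begin with at least one letter.
letter_count=$(printf '%s\n' "$text" | awk '/^[a-zA-Z]+/ { n++ } END { print n }')

echo "$dr_out"
echo "$alpha_out"
echo "$letter_count"
```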



    awk Relational Expressions

    • Relational expressions can consist of strings, numbers, arithmetic / string operators, relational operators, defined variables, and predefined variables.

      • $1, …, $n, are the fields of the record

      • $0 is the entire line

      • NF is the number of fields in the current line

      • NR is the number of the current line

      • FS is the field separator

      • FILENAME is the current filename

    • Relational operators (<, <=, ==, !=, >=, >) and logical operators (&&, ||, !) are available

      • NF > 5 && $1 == $2

      • /while/ || /do/

    • Note: variables can be assigned with the "=" operator

      • FS = ","

      • total = 5
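A short sketch of the predefined variables in use, assuming a hypothetical two-line file named data.csv created in a temporary directory. FS is assigned in a BEGIN block so that it takes effect before the first line is split.

```shell
workdir=$(mktemp -d)
printf 'a,b,c\nd,e\n' > "$workdir/data.csv"

vars_out=$(cd "$workdir" && awk '
BEGIN { FS = "," }                      # set the separator before any input is read
      { print FILENAME ": line " NR " has " NF " fields" }
' data.csv)
echo "$vars_out"

# Combining conditions with && : lines after the first that have exactly two fields.
combo_out=$(awk -F, 'NR > 1 && NF == 2 { print $0 }' "$workdir/data.csv")
echo "$combo_out"
rm -r "$workdir"
```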



    awk field matching expressions

    • Field-matching expressions check whether a field matches ("~") or does not match ("!~") a regular expression.

    • Examples

      • $1 ~ /D[Rr]\./ #true if the first field contains DR. or Dr.

      • $1 !~ /From/ #true if the first field does not contain From
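The difference between matching a field and matching the whole line shows up in the sketch below. The input lines are made up; the last one contains DR. only in its second field, so the test on $1 skips it.

```shell
lines_in='From alice
To bob
Dr. Who
Mr. DR.'

# Only lines whose FIRST field contains DR. or Dr.
match_out=$(printf '%s\n' "$lines_in" | awk '$1 ~ /D[Rr]\./')
echo "$match_out"

# Lines whose first field does not contain From.
nomatch_out=$(printf '%s\n' "$lines_in" | awk '$1 !~ /From/')
echo "$nomatch_out"
```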



    awk procedures

    • An awk procedure specifies the processing of a line that matches a given pattern. A procedure is enclosed in "{" and "}" and consists of statements separated by semicolons or newlines.

    • awk is a full programming language and provides control statements (such as do-while, for, if, break, and continue)

    • Note that BEGIN can be used to initialize variables and END can be used to do post-processing after all records have been processed
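A minimal sketch of a multi-statement procedure, using a for loop and an if/else together with BEGIN and END. The score data, the pass mark of 60, and the variable names are all assumptions chosen for the example.

```shell
scores='ann 90 80
bob 50 65
cy 70'

summary=$(printf '%s\n' "$scores" | awk '
BEGIN { passed = 0 }
{
    sum = 0
    for (i = 2; i <= NF; i++)      # loop over the numeric fields
        sum += $i
    avg = sum / (NF - 1)
    if (avg >= 60) { passed++; print $1, "pass" }
    else           print $1, "fail"
}
END { print passed, "passed" }')
echo "$summary"
```

The loop runs from field 2 to NF, so each record may carry a different number of scores.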



    awk examples

    • #print the second and then the first field of each line that contains the string this

      awk '/this/ { print $2, $1 }' file1

    • #sum the values in the second column for lines whose first field matches add, then print the final sum

      awk 'BEGIN { sum=0 } $1 ~ /add/ { sum += $2 } \
      END{ print sum }' file2

    • # illustrating if statements and the or operator

      awk '/green/ || /yellow/ \
      {if ($1=="green") print $1 ; \
      else if ($1=="yellow") print "SLOW DOWN";}' \
      file3
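The if/else example can be exercised without a file by piping in a few lines. The traffic-light records below are hypothetical, and the program is laid out over several lines rather than joined with backslashes, which assumes a Bourne-style shell.

```shell
lights='green go
yellow wait
red stop
light green'

light_out=$(printf '%s\n' "$lights" | awk '
/green/ || /yellow/ {
    if ($1 == "green")       print $1
    else if ($1 == "yellow") print "SLOW DOWN"
}')
echo "$light_out"
```

The last input line passes the pattern because it contains green, but nothing is printed for it since its first field is neither green nor yellow.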
