Introduction to Awk

Awk is a convenient and expressive programming language that can be applied to a wide variety of computing and data manipulation tasks.


Awk

Works well on record-type data

Reads input file(s) a line at a time

Parses each line into fields

Performs user-defined tests against each line, performs actions on matches


Other Common Uses

  • Input validation

    • Does every record have the same number of fields?

    • Do values make sense (negative time, hourly wage > $1000, etc.)?

  • Filtering out certain fields

  • Searches

    • Who got a zero on lab 3?

    • Who got the highest grade?

  • Many others
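
The validation and search tasks listed above can be sketched as short awk commands. The grade data and its layout (a name followed by three lab scores) are made up for illustration; the `+ 0` forces a numeric comparison.

```shell
# Hypothetical grade file: name followed by three lab scores.
grades='ann 10 9 0
bob 8 7 10
cat 9 10
dan 7 0 6'

# Input validation: report records that do not have exactly 4 fields.
bad=$(printf '%s\n' "$grades" | awk 'NF != 4 { print "line " NR ": " NF " fields" }')

# Search: who got a zero on lab 3 (the 4th field)?
# NF == 4 skips short records, whose missing $4 would otherwise compare equal to 0.
zeros=$(printf '%s\n' "$grades" | awk 'NF == 4 && $4 == 0 { print $1 }')

# Search: who got the highest lab 1 score (the 2nd field)?
top=$(printf '%s\n' "$grades" | awk 'NR == 1 || $2 + 0 > max { max = $2 + 0; who = $1 } END { print who, max }')

printf '%s\n' "$bad" "$zeros" "$top"
```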


Invocation

  • Can write little one-liners on the command line (very handy):

    • print the 3rd field of every line:

      $ awk '{ print $3 }' input.txt

  • Execute an awk script file:

    $ awk -f script.awk input.txt

  • Or, use this shebang as the first line (adjust the path if awk lives elsewhere, e.g. /usr/bin/awk), and give the script execute permission:

    #!/bin/awk -f
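
The three invocation styles can be tried side by side. This is only a sketch: the temporary directory and the file names are illustrative, and the shebang path is looked up with `command -v` because awk is not installed in /bin on every system.

```shell
# Work in a throwaway directory; all file names here are illustrative.
dir=$(mktemp -d)
printf 'a b c\nd e f\n' > "$dir/input.txt"

# 1. One-liner on the command line.
one=$(awk '{ print $3 }' "$dir/input.txt")

# 2. The same program stored in a file and run with -f.
printf '{ print $3 }\n' > "$dir/script.awk"
two=$(awk -f "$dir/script.awk" "$dir/input.txt")

# 3. Executable script whose first line names the awk interpreter.
printf '#!%s -f\n{ print $3 }\n' "$(command -v awk)" > "$dir/third.awk"
chmod +x "$dir/third.awk"
three=$("$dir/third.awk" "$dir/input.txt")

printf '%s\n' "$one" "$two" "$three"
rm -r "$dir"
```

All three runs should print the third field of each line.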


Form of an AWK program

  • AWK programs are entries of the form:

    pattern { action }

    • pattern – a test applied to each line: a regular expression or a C-like condition

      • if omitted, the action is applied to every line

    • action – a statement or set of statements

      • if not provided, the default action is to print the entire line, much like grep
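
A small sketch of the three shapes an entry can take (pattern only, action only, both). The fruit data is invented for the example.

```shell
lines='apple 3
banana 0
cherry 7'

# Pattern only: the default action prints the whole matching line (like grep).
p_only=$(printf '%s\n' "$lines" | awk '/an/')

# Action only: with no pattern, the action runs for every line.
a_only=$(printf '%s\n' "$lines" | awk '{ print $1 }')

# Pattern and action: a C-like condition selects lines, the action formats them.
both=$(printf '%s\n' "$lines" | awk '$2 > 5 { print $1 " has " $2 }')

printf '%s\n' "$p_only" "$a_only" "$both"
```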


Form of an AWK program

  • Input files are parsed, a record (line) at a time

  • Each line is checked against each pattern, in order

  • There are 2 special patterns:

    • BEGIN – true before any records are read

    • END – true at end of input (after all records have been read)
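
As a sketch of how BEGIN, ordinary rules, and END fit together, the program below counts records two ways. The input numbers and the variable names `big` and `even` are illustrative.

```shell
out=$(printf '5\n12\n21\n' | awk '
BEGIN       { print "start" }    # runs once, before the first record is read
$1 > 10     { big++ }            # first rule, tested against every record
$1 % 2 == 0 { even++ }           # second rule, tested against the same record
END         { print NR " records, " big " big, " even " even" }   # runs once, after the last record
')
printf '%s\n' "$out"
```

A record can match more than one rule: 12 is counted by both.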


Awk Features

  • Patterns can be regular expressions or C-like conditions.

  • Each line of the input is matched against the patterns, one after the next. If a match occurs the corresponding action is performed.

  • Input lines are parsed and split into fields, which are accessed by $1,…,$NF, where NF is a variable set to the number of fields. The variable $0 contains the entire line, and by default lines are split on whitespace (blanks, tabs).
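
A quick sketch of field splitting with the default separator: runs of blanks and tabs count as a single separator, and $NF is the last field whatever its position. The sample words are arbitrary.

```shell
# The first line mixes several blanks and a tab; the second has a single field.
out=$(printf 'alpha   beta\tgamma\nsolo\n' | awk '{ print NF ": first=" $1 ", last=" $NF }')
printf '%s\n' "$out"
```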


Variables

  • Neither declared nor typed

  • No character type

    • Only strings and floating-point numbers (integer values are stored as floating point)

  • $n refers to the nth field (where n is some integer value)

    # prints each field on the line, one per output line
    {
        for (i = 1; i <= NF; ++i)
            print $i
    }
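
Because variables are untyped, the same value is treated as a number or a string depending on context. A minimal sketch (the variable names are arbitrary):

```shell
out=$(awk 'BEGIN {
    n = "3" + 4                # a numeric-looking string is converted for arithmetic
    s = 3 " apples"            # a number is converted to a string for concatenation
    print n
    print s
    print never_set + 0        # an unset variable acts as 0 in numeric context
    print "[" never_set "]"    # and as the empty string in string context
}')
printf '%s\n' "$out"
```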


Some Built-in Variables

FS – the input field separator

OFS – the output field separator

NF – the number of fields in the current record; changes with each record

NR – the number of records read so far, i.e. the current record number

FNR – the # of records read so far, reset for each named file

$0 – the entire input line
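
The difference between NR and FNR only shows up with more than one input file. The sketch below uses two small, made-up CSV files in a temporary directory and prints NR, FNR, NF and the first field of each record.

```shell
dir=$(mktemp -d)
printf 'a,1\nb,2\n' > "$dir/first.csv"
printf 'c,3\n' > "$dir/second.csv"

# FS splits input on commas; OFS is placed between the comma-separated print arguments.
out=$(awk 'BEGIN { FS = ","; OFS = " | " }
           { print NR, FNR, NF, $1 }' "$dir/first.csv" "$dir/second.csv")

printf '%s\n' "$out"
rm -r "$dir"
```

NR keeps counting across files, while FNR restarts at 1 for the second file.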


Example

Print pay for those employees who actually worked

$ awk '$3>0 {print $1, $2*$3}' emp.data

Kathy 40

Mark 100

Mary 121

Susie 76.5

$ cat emp.data

Beth 4.00 0

Dan 3.75 0

Kathy 4.00 10

Mark 5.00 20

Mary 5.50 22

Susie 4.25 18


Example – CSV file

$ cat students.csv

smith,john,js12

jones,fred,fj84

bee,sue,sb23

fife,ralph,rf86

james,jim,jj22

cook,nancy,nc54

banana,anna,ab67

russ,sam,sr77

loeb,lisa,guitarHottie

$ cat getEmails.awk

#!/bin/awk -f

BEGIN { FS = "," }

{ printf( "%s's email is: [email protected]\n", $2, $3 ); }

$ getEmails.awk students.csv

john's email is: [email protected]

fred's email is: [email protected]

sue's email is: [email protected]

ralph's email is: [email protected]

jim's email is: [email protected]

nancy's email is: [email protected]

anna's email is: [email protected]

sam's email is: [email protected]

lisa's email is: [email protected]
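
A minimal sketch of the same printf pattern with two %s placeholders, one per argument. The domain example.com is a stand-in chosen for this sketch (an assumption, not the one from the slides), and only the first two records of students.csv are used.

```shell
out=$(printf 'smith,john,js12\njones,fred,fj84\n' |
      awk 'BEGIN { FS = "," }
           # Each %s is filled, in order, by $2 (first name) and $3 (user id).
           { printf("%s -> %s@example.com\n", $2, $3) }')
printf '%s\n' "$out"
```

Unlike print, printf adds no newline on its own, so the format string ends with \n.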


Example – output separator

$ cat out.awk

#!/bin/awk -f

BEGIN { FS = ","; OFS = "-*-"; }

{ print $1, $2, $3; }

$ out.awk students.csv

smith-*-john-*-js12

jones-*-fred-*-fj84

bee-*-sue-*-sb23

fife-*-ralph-*-rf86

james-*-jim-*-jj22

cook-*-nancy-*-nc54

banana-*-anna-*-ab67

russ-*-sam-*-sr77

loeb-*-lisa-*-guitarHottie


Flow Control

  • Awk syntax is much like C

  • Same loops, if statements, etc.

  • AWK: Aho, Weinberger, Kernighan

  • Kernighan also co-wrote The C Programming Language with Dennis Ritchie, the creator of C
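
To illustrate the C-like syntax, here is a small sketch combining if / else if / else with a for loop. The input values are arbitrary.

```shell
out=$(printf '3\n0\n-2\n' | awk '{
    if ($1 > 0) {
        stars = ""
        for (i = 0; i < $1; i++)    # same three-part for loop as in C
            stars = stars "*"       # string concatenation is just juxtaposition
        print stars
    } else if ($1 == 0) {
        print "zero"
    } else {
        print "negative"
    }
}')
printf '%s\n' "$out"
```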


Associative Arrays

  • Awk also supports arrays that can be indexed by arbitrary strings. They are implemented using hash tables.

    • Total["Sue"] = 100;

  • It is possible to loop over all indices that have currently been assigned values.

    for (name in Total)
        print name, Total[name];
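
Beyond assignment and looping, arrays also support a membership test with `in` and element removal with `delete`. A short sketch, counting made-up colour names:

```shell
out=$(printf 'red\nblue\nred\n' | awk '
{ count[$1]++ }                    # a new key starts at 0, so ++ counts occurrences
END {
    if ("red" in count) print "red", count["red"]
    if (!("green" in count)) print "green never seen"
    delete count["red"]            # remove a single element
    n = 0
    for (k in count) n++           # count the keys that remain
    print n " key(s) left"
}')
printf '%s\n' "$out"
```

Testing with `in` does not create the element, whereas merely referencing count["green"] would.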


Example using Associative Arrays

$ cat scores

Fred 90

Sue 100

Fred 85

Sam 70

Sue 98

Sam 50

Fred 70

$ cat total.awk

{ Total[$1] += $2 }

END {
    for (i in Total)
        print i, Total[i];
}

$ awk -f total.awk scores

Sue 198

Sam 120

Fred 245
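
The order in which `for (i in Total)` visits keys is unspecified, so the lines above may come out in a different order on another awk. One way to get stable output, sketched here with the same scores, is to pipe the result through sort:

```shell
scores='Fred 90
Sue 100
Fred 85
Sam 70
Sue 98
Sam 50
Fred 70'

# Sum the scores per name, then sort so the output order is predictable.
out=$(printf '%s\n' "$scores" |
      awk '{ Total[$1] += $2 } END { for (i in Total) print i, Total[i] }' |
      sort)
printf '%s\n' "$out"
```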


Useful one-liners

  • Line count:

    awk 'END {print NR}'

  • grep

    awk '/pat/'

  • head

    awk 'NR<=10'

  • Add line #s to a file

    awk '{print NR, $0}'

    awk '{ printf( "%5d %s\n", NR, $0 )}'

  • Many more. See the resources tab on the course webpage for links to more examples.
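
The one-liners above read standard input when no file is named, so they can be tried on a few lines of made-up text:

```shell
text='one 1
two 2
three 3'

count=$(printf '%s\n' "$text" | awk 'END { print NR }')                    # line count
matches=$(printf '%s\n' "$text" | awk '/t/')                               # grep-like search
first2=$(printf '%s\n' "$text" | awk 'NR <= 2')                            # head-like selection
numbered=$(printf '%s\n' "$text" | awk '{ printf("%5d %s\n", NR, $0) }')   # numbered lines

printf '%s\n' "$count" "$matches" "$first2" "$numbered"
```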

