Unix Talk #2

AWK overview

Patterns and actions

Records and fields

Print vs. printf


Introduction

  • Students' grades in a text file

  • John 22 56 38 70 85 80

  • Alex 90 89 79 98 35

  • How can I calculate John's current average within this file?

  • grep?

    • Search for John with grep? That gives me the line.

    • Now I can use my calculator to figure it out.

  • sed?

    • sed will allow me to print, change, delete, etc.

  • I really want to automatically manipulate the values within this line.

  • This is where awk comes in.

  • (awk me amadeus)
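As a preview, here is a minimal sketch of how awk could compute John's average directly. The file name grades.txt and the variable john_avg are assumptions made for illustration; the data is the two lines shown on the slide.

```shell
# Recreate the sample grades from the slide (the file name is hypothetical).
cat > grades.txt <<'EOF'
John 22 56 38 70 85 80
Alex 90 89 79 98 35
EOF

# On the line starting with "John", add up fields 2..NF
# and divide by the number of grades (NF - 1).
john_avg=$(awk '/^John/ {
    sum = 0
    for (i = 2; i <= NF; i++) sum += $i
    printf "%s %.2f", $1, sum / (NF - 1)
}' grades.txt)

echo "$john_avg"    # John 58.50
```

The pattern picks the line, and the action does the arithmetic that grep and sed could not.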


awk

  • Named for the initials of its authors' last names: Aho, Weinberger, and Kernighan

  • Which awk are we tawking about?

    • awk

    • nawk – new awk ( on CS machines )

    • gawk – GNU awk ( bart )


AWK syntax

  • awk '/pattern/' file

  • awk '{action}' file

  • awk '/pattern/ {action;}' file

  • cat file | awk '{action}'

    Awk automatically reads in the file for you line by line.

    • No need to open/close file. (like in C or Java)

    • pattern section FINDS LINES with that pattern

    • action section does the actions you defined on the lines it found

    • The original file does not change.


Simple example

  • awk '{ print }' fruit_prices

  • Note: the pattern is omitted here, so awk runs the print action on every line it reads.


Simple example

awk '

/\$[0-9]*\.[0-9][0-9]*/ { print }

' fruit_prices


Action

  • Actions are written by the programmer and are not limited to print, delete, etc. (the p/d/s commands of sed). That is why it is so awesome!

  • Actions consist of

    • variable assignments,

    • arithmetic and logic operators,

    • decision structures,

    • looping structures.

  • For example, print, if, while and for

  • awk '{print}' filename


Execution types

  • format 1: awk 'script'

    • where INPUT must come from a pipe or STDIN

    • command | awk 'script'

  • format 2: awk 'script' input1 input2 ... inputn

    • where we supply input FILES as input1, input2, etc.

  • format 3: awk -f script_file input1...

  • (in a script, # starts a comment)
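The three invocation formats can be tried side by side. This is a small sketch; the file names names.txt and first_field.awk and the shell variables are hypothetical, chosen only for the demo.

```shell
# Hypothetical input file for the demo.
printf 'alice 1\nbob 2\n' > names.txt

# Format 1: input arrives on stdin through a pipe.
from_pipe=$(cat names.txt | awk '{ print $1 }')

# Format 2: input files are named on the command line.
from_file=$(awk '{ print $1 }' names.txt)

# Format 3: the script lives in its own file; '#' starts a comment there.
cat > first_field.awk <<'EOF'
# print the first field of every line
{ print $1 }
EOF
from_script=$(awk -f first_field.awk names.txt)

echo "$from_script"
```

All three runs should print the same two names, since only the way the script and input are supplied differs.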


Pattern

  • Types

    • Regular expressions

    • BEGIN

      • Do all the stuff BEFORE reading any input

    • END

      • does all this stuff AFTER reading ALL input.

  • Pattern is optional

  • If no pattern is specified, the action runs on EVERY LINE, one at a time.

  • awk '{Action}' filename

  • awk '{print;}' names prints all lines

  • awk 'BEGIN {print "The average grades"}'


Awk Regular Expression Metacharacters

  • Supports

    • ^, $, ., *, +, ?, [ABC], [^ABC],

    • [A-Z], A|B, (AB)+, \, &

  • Does not support

    • Backreferencing, \( \)

    • Repetition, \{ \} (POSIX awk and recent gawk do accept the unescaped interval form {n,m})


awk '

BEGIN { actions ; }

/pattern/ { actions ; }

/pattern/ { actions ; }

END { actions ; }

' files

Execution steps:

  • If a BEGIN pattern is present, awk executes its actions

  • Reads an input line and parses it into fields

  • Compares each of the specified patterns against the input line; if it finds a match, it executes the associated actions. This step is repeated for all patterns.

  • Repeats steps 2 and 3 while input lines are present

  • After the script has read all the input lines, if the END pattern is present, executes its actions
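The execution steps above can be watched in order with a tiny run. This is a sketch; the input file steps.txt and its three lines are made up for the demo.

```shell
# Hypothetical three-line input.
printf 'apple\nbanana\ncherry\n' > steps.txt

steps_out=$(awk '
BEGIN { print "begin" }                   # runs once, before any input is read
/an/  { print "match: " $0 }              # tested against every input line
/^c/  { print "starts with c" }           # second pattern, tested on the same line
END   { print "end after " NR " lines" }  # runs once, after all input is read
' steps.txt)

echo "$steps_out"
```

With this input the BEGIN line comes first, then one line for banana, one for cherry, and finally the END line reporting 3 lines.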


Try This!

  • Place the following in the file tryawk1.awk

    BEGIN { print "Starting to read input";

    nLines = 0; }

    /^.*$/ { nLines++; }

    END { print "DONE: Total lines = " nLines; }

    • Run the command: cat tryawk1.awk | awk -f tryawk1.awk

    • Counts the # of lines in the input

      • nLines is a variable … note NO declaration, just use

      • print command prints a line of text, adds newline to end of the line


Records and fields

  • awk has RECORDS (lines) and FIELDS

  • $0 represents the entire line of input

  • $1 represents the first field

  • print works just like echo

    • print $1 $2 # $1 concatenated with $2

    • print $1, $2 # $1 OFS $2

  • cat fruit_prices

  • awk '{print;}' fruit_prices #prints all lines

  • awk '{print $0;}' fruit_prices #prints each entire line

  • awk '{print $1;}' fruit_prices #prints first field in each line

  • awk '{print $2;}' fruit_prices #prints second field in each line
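The difference between the two print forms is easy to miss, so here is a short sketch. The sample line is borrowed from the phones.data example later in the talk; the shell variable names are assumptions.

```shell
line='John Robinson 234-3456'

concat=$(echo "$line" | awk '{ print $1 $2 }')     # no comma: the strings are concatenated
with_ofs=$(echo "$line" | awk '{ print $1, $2 }')  # comma: fields are joined by OFS (a space by default)
whole=$(echo "$line" | awk '{ print $0 }')         # $0 is the entire record

echo "$concat"     # JohnRobinson
echo "$with_ofs"   # John Robinson
echo "$whole"      # John Robinson 234-3456
```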


Examples

cat phones.data

John Robinson 234-3456

Yin Pan 123-4567

awk '{ print $1, $2, $3 }' phones.data

John Robinson 234-3456

Yin Pan 123-4567

awk '{ print $2 ",", $1, $3 }' phones.data

Robinson, John 234-3456

Pan, Yin 123-4567

awk '/^$/ { print x += 1 }' phones.data

awk '/Mary/ { print $0 }' phones.data


Examples (cont.)

  • ls -l | awk '

    $6 == "Oct" { sum += $5 ; }

    END { print sum ; }

    '

  • ls -l | awk -f block_use.awk

    cat block_use.awk

    $6 == "Oct" { sum += $5 ; }

    END { print sum ; }


Taking Pattern-specific Actions

#!/bin/sh

awk '

/\$[1-9][0-9]*\.[0-9][0-9]*/ { print $0, "*"; }

/\$0\.[0-9][0-9]*/ { print; }

' fruit_prices
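To see what the script above does, it helps to run it on concrete data. The slides never show the contents of fruit_prices, so the five lines below are invented for this sketch.

```shell
# Hypothetical contents for fruit_prices (not shown in the slides).
cat > fruit_prices <<'EOF'
Fruit Price/lbs
Banana $0.89
Peach $0.79
Kiwi $1.50
Pineapple $1.29
EOF

flagged=$(awk '
/\$[1-9][0-9]*\.[0-9][0-9]*/ { print $0, "*" }   # a dollar or more: append a star
/\$0\.[0-9][0-9]*/           { print }           # under a dollar: print unchanged
' fruit_prices)

echo "$flagged"
```

The header line matches neither pattern, so it is not printed; Kiwi and Pineapple come out with a trailing star.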


Intrinsic variables

  • awk defines RECORDS (lines) and FIELDS

    • FS, input field separator (default=space/tab)

    • OFS, output field separator (default=space)

    • ORS, Output record separator (default=newline)

    • RS, Input record separator (default=newline)

    • NR, number of the current record being processed

    • NF, number of fields within current record

    • FILENAME, the name of the file awk is currently reading. (If you have more than one input file, awk resets this variable as it reads each file in turn.)
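A quick way to get a feel for NR, NF, and FILENAME is to print them for every record. In this sketch the two input files, first.txt and second.txt, are hypothetical.

```shell
# Two small hypothetical input files.
printf 'a b c\nd e\n' > first.txt
printf 'f\n' > second.txt

# Print the current file name, the running record number, and the field count.
vars_out=$(awk '{ print FILENAME, NR, NF }' first.txt second.txt)

echo "$vars_out"
# first.txt 1 3
# first.txt 2 2
# second.txt 3 1
```

Note that NR keeps counting across files, while FILENAME changes when awk moves to the second file.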


How does awk work

  • awk '{print $1, $3}' names

    • Reads a line of input into $0, using RS to find where the record ends

    • Splits the line into fields based on FS and stores them in numbered variables, starting with $1

    • Prints the requested fields with print (or another output statement), using OFS to separate them

    • After awk displays its output, it moves to the next line and repeats. The output lines are separated by ORS.
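The role of the output separators becomes visible once they are changed from their defaults. A minimal sketch, using the two sample lines from phones.data and an assumed shell variable sep_out:

```shell
# OFS goes between the printed fields, ORS goes after each output record.
sep_out=$(printf 'John Robinson 234-3456\nYin Pan 123-4567\n' |
    awk 'BEGIN { OFS = "-"; ORS = ";" } { print $1, $3 }')

echo "$sep_out"   # John-234-3456;Yin-123-4567;
```

Because ORS is no longer a newline, both records end up on one line.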


Changing the Input Field Separator

  • Manually resetting FS in a BEGIN pattern

    • Forces you to hard code the value of the field separator

    • BEGIN { FS=":" ; }

    • Example:

      • $ awk 'BEGIN { FS=":" ; } { print $1, $6 ; }' /etc/passwd

  • Specifying the -F option to awk

    • awk -F: '{ … }'

    • Enables using a shell variable to specify the field separator dynamically

    • Example:

      • $ sep=':'

      • $ awk -F"$sep" '{ print $1, $6 ; }' /etc/passwd
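Both ways of setting the separator can be compared on a small file. To keep the output predictable, this sketch uses a hypothetical two-line passwd.sample in /etc/passwd format instead of the real file.

```shell
# Hypothetical sample in /etc/passwd format.
cat > passwd.sample <<'EOF'
root:x:0:0:root:/root:/bin/sh
alice:x:1000:1000:Alice:/home/alice:/bin/sh
EOF

# Option 1: hard-code the separator in a BEGIN block.
via_begin=$(awk 'BEGIN { FS = ":" } { print $1, $6 }' passwd.sample)

# Option 2: pass the separator with -F, here taken from a shell variable.
sep=':'
via_flag=$(awk -F"$sep" '{ print $1, $6 }' passwd.sample)

echo "$via_flag"
# root /root
# alice /home/alice
```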


Example

  • FirstName;LastName;Address;City;State;Zip;Phone

  • SSN:DOB:NumberOfDependents

  • HospitalizationCode,DentalCode,LifeCode

  • Convert this file format to:

  • SSN,LastName,FirstName,Address,….


  • awk 'BEGIN { OFS=","; FS=";" }

    NR%3==1 { F=$1; L=$2; A=$3; …; FS=":" }   # prepare FS for the next line

    NR%3==2 { SSN=$1; DOB=$2; …; FS="," }

    NR%3==0 { …; print SSN, L, F, A, …; FS=";" }

    ' filename
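For comparison, here is a runnable sketch of the same conversion. It swaps in split() on $0 instead of reassigning FS, so it does not depend on when a new FS value takes effect. The sample record, the file name employees.txt, and the output order after Address are all assumptions, since the slide elides them.

```shell
# One hypothetical three-line record in the input format from the slide.
cat > employees.txt <<'EOF'
Jane;Doe;1 Main St;Springfield;IL;62701;555-0100
123-45-6789:1980-01-02:2
H1,D2,L3
EOF

merged=$(awk '
BEGIN { OFS = "," }
NR % 3 == 1 { split($0, p, ";") }   # FirstName;LastName;Address;City;State;Zip;Phone
NR % 3 == 2 { split($0, s, ":") }   # SSN:DOB:NumberOfDependents
NR % 3 == 0 {                       # HospitalizationCode,DentalCode,LifeCode
    split($0, c, ",")
    # Assumed output order: SSN, LastName, FirstName, Address, then the rest.
    print s[1], p[2], p[1], p[3], p[4], p[5], p[6], p[7], s[2], s[3], c[1], c[2], c[3]
}' employees.txt)

echo "$merged"
```

Each group of three input lines produces one comma-separated output line, starting with the SSN.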


Print vs. printf

  • printf

    • 1st argument is a string … the ‘format’

    • Prints each character of the format

      • Upon reaching a %, the next few characters are a format specifier

      • The next argument is printed according to the specifier

    • Does not append a newline

    • More control over appearance of output

    • Consider

      awk 'BEGIN { printf "%5.2f\n", 2/3; }'

      • Prints 0.67 (here, the  represents a space)

      • %5.2f means print a fractional number (the ‘f’) in a field 5 characters wide, with 2 digits to the right of the decimal point.
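Since a leading space is hard to see on a slide, this small sketch wraps the value in brackets. The brackets and the shell variable names are additions for illustration only.

```shell
# Width 5 with 2 decimals: "0.67" is four characters, so one space of padding is added.
padded=$(awk 'BEGIN { printf "[%5.2f]", 2/3 }')
echo "$padded"   # [ 0.67]

# printf adds no newline of its own, so consecutive calls share a line.
joined=$(awk 'BEGIN { printf "a"; printf "b\n" }')
echo "$joined"   # ab
```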


Why printf

  • printf - for formatting the output of your "print"

  • We already have print, so why printf?

    • printf allows us to FORMAT stuff.

    • can FORCE printing of string

    • Decimals

    • whole numbers

    • how many digits fall on either side of decimal pt

    • scientific notation

    • make things line up nicely


printf

  • printf (format, what to print)

  • printf ( "%s", x)

    • %s is a PLACEHOLDER for some OUTPUT.

    • s is a specific type of output (string)

    • Each item (%s) must have ONE matching thing to print in the "what to print" part

    • The format goes inside quotes, followed by a comma, followed by the variables to print, outside the quotes.

  • printf ( " s = %s ", x )

    • " s = " is a LITERAL string


Printf format

  • s = A character string

  • f = A floating point number

  • d or i = the integer part of a decimal number

  • e = a floating point number in scientific notation; g = the shorter of the e and f forms

  • c = An ASCII character

  • if x=65 and I use this printf statement

  • printf ( " s = %c ", x )

  • output is " s = A "

  • awk 'BEGIN{x=65; printf("char: %c\n", x)}'
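The specifiers in this list can be tried together in one call. A sketch with made-up sample values; the | characters are only there to separate the results.

```shell
spec_out=$(awk 'BEGIN {
    x = 65
    # string | integer part | character for code 65 | fixed point | scientific notation
    printf "%s|%d|%c|%.2f|%.2e\n", "abc", 3.99, x, 3.14159, 12345.678
}')

echo "$spec_out"   # abc|3|A|3.14|1.23e+04
```

Note how %d keeps only the integer part of 3.99 rather than rounding it.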


Printf

  • More control:

    • %wd

      • Print an integer out in a field of width w

      • If the number is smaller than w characters, print leading spaces

      • Try awk 'BEGIN { printf "%10d\n", 10; }' /dev/null

    • Try to add a ‘-’ immediately after the %

      • Left justifies the value in the field


Printf

  • %ws

    • Print a string out in a field of width w

    • Supply leading spaces as necessary

  • Place a ‘-’ immediately after the % to get left justification


Printf

  • %w.df

    • Prints the value in a field of width w

    • Shows d digits to the right of the decimal point

    • Place a ‘-’ immediately after the % to get left justification
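Width and justification are easiest to compare when the field edges are visible. In this sketch the brackets mark the edges of each field; the sample values are arbitrary.

```shell
width_out=$(awk 'BEGIN {
    printf "[%5d]\n",   42        # integer, right-justified in width 5
    printf "[%-5d]\n",  42        # same width, left-justified
    printf "[%8s]\n",   "awk"     # string, right-justified in width 8
    printf "[%-8s]\n",  "awk"     # same width, left-justified
    printf "[%7.2f]\n", 3.14159   # width 7, two digits after the decimal point
}')

echo "$width_out"
```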


Printf examples

  • Apple 10 20 25

  • <---10----><-5-><-5-><-5->

  • awk '{printf (" %10s %5d %5d %d ", $1, $2, $3, $4 )}' file

  • awk '{printf (" %-10s %5d %5d %d ", $1, $2, $3, $4 )}' file

  • minus sign designates that this field will be LEFT JUSTIFIED

  • awk '{printf (" %-10s %-5d %-5d %d ", $1, $2, $3, $4 )}' file

  • awk '{printf ("|%-15s|\n", $1)}' file


Printf examples

  • Let’s put an average in there...

  • printf (" %-10s %-5d %-5d %-5d %f ", $1, $2, $3, $4, average )

  • Prints the RAW number with the default of 6 digits to the RIGHT of the decimal point

  • printf (" %-10s %-5d %-5d %-5d %.2f ", $1, $2, $3, $4, average )

  • %.2f says use TWO char's to RIGHT of decimal

  • printf doesn't provide the newline automatically....

  • printf (" %-10s %-5d %-5d %-5d %.2f \n ", $1, $2, $3, $4, average )
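The slides use average without showing where it comes from. One plausible way to compute it, assuming it is the mean of the three numeric fields on the sample line:

```shell
row=$(echo 'Apple 10 20 25' | awk '{
    average = ($2 + $3 + $4) / 3   # assumption: the mean of fields 2 through 4
    printf "%-10s %-5d %-5d %-5d %.2f\n", $1, $2, $3, $4, average
}')

echo "$row"
```

The row ends in 18.33, since 55 / 3 is rounded to two decimals by %.2f.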


The OFMT variable (stands for Output Formatting for numbers)

  • A special awk variable

  • Control the printing of numbers when using print function

  • awk 'BEGIN{print 1.243434534;}'

  • awk 'BEGIN{OFMT="%.2f"; print 1.23344455;}'
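Running the two commands shows the effect of OFMT. The expected values below assume the standard default of OFMT="%.6g"; the shell variable names are hypothetical.

```shell
# With the default OFMT, print shows six significant digits.
default_fmt=$(awk 'BEGIN { print 1.243434534 }')

# With OFMT changed, print formats numbers with two decimals.
two_places=$(awk 'BEGIN { OFMT = "%.2f"; print 1.23344455 }')

echo "$default_fmt"   # 1.24343
echo "$two_places"    # 1.23
```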

