Math 272



A Mikati


M Shaito

supervised by:

Dr. A Nasri

Awk Utility

  • Introduction

  • Some basics

  • Some samples

  • Patterns & Actions

  • Regular Expressions

  • Boolean

  • Start / end

  • BEGIN / END


Awk Utility (continued)

  • Awk variables

  • Control of flow statements:

  • a: If_Else statement

  • b: While Statement

  • c: For statement


  • History:

  • The name awk comes from the initials of its designers: Alfred V. Aho, Peter J. Weinberger, and Brian W. Kernighan. The original version of awk was written in 1977. In 1985 a new version made the programming language more powerful, introducing user-defined functions, multiple input streams, and computed regular expressions.

    Introduction (cont’d):

    If you are like many computer users, you would frequently like to make changes in various text files wherever certain patterns appear, or extract data from parts of certain lines while discarding the rest. To write a program to do this in a language such as C or Pascal is a time-consuming inconvenience that may take many lines of code. The job may be easier with awk.

    The awk utility interprets a special-purpose programming language that makes it possible to handle simple data-reformatting jobs easily with just a few lines of code.

    Some Basics:

    • The basic function of awk is to search files for lines (or other units of text) that contain certain patterns.

      • Awk recognizes the concepts of "file", "record", and "field".

      • A file consists of records, which by default are the lines of the file. One line becomes one record.

      • Awk operates on one record at a time.

      • A record consists of fields, which by default are separated by any number of spaces or tabs.

      • Field number 1 is accessed with $1, field 2 with $2, and so forth. $0 refers to the whole record.
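    The record and field model above can be tried out with a minimal sketch like the following. The two input lines are invented for illustration, and NR (record number) and NF (number of fields) are awk's built-in variables.

```shell
# Two hypothetical records; fields are separated by runs of spaces or a tab.
out=$(printf 'alpha beta gamma\none   two\tthree\n' |
      awk '{ print NR ": first=" $1 ", third=" $3 ", fields=" NF }')
printf '%s\n' "$out"
```

    Both records report three fields, even though the second one mixes several spaces and a tab as separators.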

    Some Samples:

    >awk '{print $0}' filename

    Perhaps the quickest way of learning awk is to look at some sample programs. The one above will print the file in its entirety, just like cat(1). Here are some others, along with a quick description of what they do.

    >awk '{print $2,$1}' filename

    will print the second field, then the first. All other fields are ignored.

    What if you don't want to apply the program to each line of the file? Say, for example, that you only wanted to process lines that had the first field greater than the second. The following program will do that:

    >awk '$1 > $2 {print $1,$2,$1-$2}' filename
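    As a small sketch of that program in action, the input below is hypothetical two-column numeric data fed through a pipe instead of a file.

```shell
# Only records whose first field is greater than the second reach the action.
out=$(printf '10 4\n3 8\n7 7\n' |
      awk '$1 > $2 { print $1, $2, $1 - $2 }')
printf '%s\n' "$out"
```

    Only the first record passes the test, so the program prints its two fields followed by their difference.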

    Patterns & Actions:

    The part outside the curly braces is called the "pattern", and the part inside is the "action". The comparison operators are the ones from C:

    == != < > <= >=

    The C conditional operator ?: is also available in expressions.

    If no pattern is given, then the action applies to all lines. This fact was used in the sample programs above. If no action is given, then the entire line is printed. If "print" is used all by itself, the entire line is printed. Thus, the following are equivalent:

    awk '$1 > $2' filename

    awk '$1 > $2{print}' filename

    awk '$1 > $2{print $0}' filename
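    One way to check that the three forms really are equivalent is to run them on the same input and compare the results. This is only a sketch; the sample data and the variable names a, b and c are made up for the illustration.

```shell
# Run the three equivalent programs on the same hypothetical input.
a=$(printf '10 4\n3 8\n9 2\n' | awk '$1 > $2')
b=$(printf '10 4\n3 8\n9 2\n' | awk '$1 > $2 {print}')
c=$(printf '10 4\n3 8\n9 2\n' | awk '$1 > $2 {print $0}')
printf '%s\n' "$a"
[ "$a" = "$b" ] && [ "$b" = "$c" ] && echo "all three agree"
```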

    Patterns & Actions: (cont’d)

    The various fields in a line can also be treated as strings instead of numbers. To compare a field to a string, use the following method:

    >awk '$1=="foo"{print $2}' filename

    There are various kinds of patterns and actions, which are explained in detail below.

    Kinds of patterns:

    • /regular expression/

      • A regular expression as a pattern. It matches when the text of the input record fits the regular expression.

    • expression

      • A single expression. It matches when its value, converted to a number, is nonzero (if a number) or non-null (if a string).

    • BEGIN, END

      • Special patterns to supply start-up or clean-up information to awk.

    • null

      • The empty pattern matches every input record.

    Regular Expressions:

    A regular expression, or regexp, is a way of describing a class of strings. A regular expression enclosed in slashes (`/') is an awk pattern that matches every input record whose text belongs to that class.

    The simplest regular expression is a sequence of letters, numbers, or both. Such a regexp matches any string that contains that sequence. Thus, the regexp `foo' matches any string containing `foo'. Therefore, the pattern /foo/ matches any input record containing `foo'. Other kinds of regexps let you specify more complicated classes of strings.

    >awk '/foo.*bar/{print $1,$3}' filename
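    A short sketch of what this regexp pattern selects, using invented input lines: a record matches only when `foo' appears somewhere before `bar'.

```shell
# /foo.*bar/ needs "foo", then any characters, then "bar" in the same record.
out=$(printf 'foo x bar\nbar x foo\na foobar c\nnothing here now\n' |
      awk '/foo.*bar/ { print $1, $3 }')
printf '%s\n' "$out"
```

    The second line contains both words but in the wrong order, so it is skipped.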


    Boolean:

    A Boolean pattern is an expression which combines other patterns using the Boolean operators "or" (`||'), "and" (`&&'), and "not" (`!'). Whether the Boolean pattern matches an input record depends on whether its subpatterns match.

    For example, the following command prints all records in the input file `filename' that contain both `2400' and `foo'.

    awk '/2400/ && /foo/' filename
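    The same idea can be sketched on a few made-up records, alongside a "not" pattern for comparison. The variable names both and nofoo are chosen only for this illustration.

```shell
# "and": records containing both 2400 and foo, in any order.
both=$(printf '2400 foo\n2400 bar\nfoo only\nx foo 2400\n' |
       awk '/2400/ && /foo/')
# "not": records that do not contain foo at all.
nofoo=$(printf '2400 foo\n2400 bar\nfoo only\nx foo 2400\n' |
        awk '!/foo/')
printf '%s\n' "$both"
printf '%s\n' "$nofoo"
```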

    Start & end:

    There are three special forms of patterns that do not fit the descriptions above: the range pattern and the BEGIN and END patterns. A range pattern (a start-end pair) is made of two patterns separated by a comma, of the form startpat, endpat. It matches ranges of consecutive input records: startpat controls where the range begins, and endpat controls where it ends. For example, the following prints every record from one whose first field is "on" through the next one whose first field is "off", inclusive:

    awk '$1 == "on", $1 == "off"' filename
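    As a minimal sketch of a range pattern at work, with invented input: the records outside the on/off pair are dropped, and both boundary records are kept.

```shell
# Prints from the record whose first field is "on" through the one with "off".
out=$(printf 'skip 1\non 2\ndata 3\noff 4\nskip 5\n' |
      awk '$1 == "on", $1 == "off"')
printf '%s\n' "$out"
```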

    BEGIN / END:

    • Any action associated with the BEGIN pattern will happen before any line-by-line processing is done. Actions with the END pattern will happen after all lines are processed.

    • But how do you put more than one pattern-action pair into an awk program? There are several choices.

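      • One is to just mash them together, like so:

    >awk 'BEGIN{print"fee"}\
    $1=="foo"{print"fi"}\
    END{print"fo fum"}' filename

    (The trailing backslashes are needed in csh-style shells; in sh-style shells the quoted program can simply span several lines.)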
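    Here is a sketch of the same three-rule program run on a few hypothetical records, written for an sh-style shell. It shows the BEGIN action firing once before any input, the middle rule firing once per matching record, and the END action firing once after the last record.

```shell
# BEGIN runs first, the middle rule runs per matching record, END runs last.
out=$(printf 'foo 1\nbar 2\nfoo 3\n' |
      awk 'BEGIN { print "fee" }
           $1 == "foo" { print "fi" }
           END { print "fo fum" }')
printf '%s\n' "$out"
```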

    BEGIN / END: (cont’d)

    • Another choice is to put the program into a file, like so:

    BEGIN{print"fee"}
    $1=="foo"{print"fi"}
    END{print"fo fum"}

    • Let's say that's in the file giant.awk. Now, run it using the "-f" flag to awk:

    >awk -f giant.awk filename

    BEGIN / END: (cont’d)

    • A third choice is to create a file that calls awk all by itself. The following form will do the trick:

    #!/usr/bin/awk -f
    BEGIN{print"fee"}
    $1=="foo"{print"fi"}
    END{print"fo fum"}

    • If we call this file giant2.awk, we can run it by first giving it execute permissions,

    >chmod u+x giant2.awk

    • and then just call it like so:

    >./giant2.awk filename

    BEGIN /END: (cont’d)

    awk has variables that can be either real numbers or strings. For example, the following code prints a running total of the fifth column:

    >awk '{print x+=$5,$0 }' filename

    This can be used when looking at file sizes from an "ls -l". It is also useful for balancing one's checkbook, if the amount of the check is kept in one column.
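    A running total can be sketched as follows. The five-column input is made up, and the variable name total is chosen for the illustration; the accumulation is written as a separate statement rather than inside the print.

```shell
# total starts out empty (treated as 0) and grows by field 5 on every record.
out=$(printf 'a b c d 10\na b c d 5\na b c d 20\n' |
      awk '{ total += $5; print total, $0 }')
printf '%s\n' "$out"
```

    Each output line starts with the total so far, followed by the original record.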


    Actions:

    An awk program or script consists of a series of rules and function definitions, interspersed. A rule contains a pattern and an action, either of which may be omitted. The purpose of the action is to tell awk what to do once a match for the pattern is found. Thus, the entire program looks somewhat like this:

    [pattern] [{ action }]

    [pattern] [{ action }]

    function name (args) { ... }

    An action consists of one or more awk statements, enclosed in curly braces (`{' and `}'). Each statement specifies one thing to be done. The statements are separated by newlines or semicolons.

    Actions: (cont’d)

    Here are the kinds of statements supported in awk:

    1) Expressions, which can call functions or assign values to variables. Executing this kind of statement simply computes the value of the expression and then ignores it. This is useful when the expression has side effects.

    2) Control statements, which specify the control flow of awk programs. The awk language gives you C-like constructs (if, for, while, and so on) as well as a few special ones.

    3) Compound statements, which consist of one or more statements enclosed in curly braces. A compound statement is used in order to put several statements together in the body of an if, while, do or for statement.

    4) Input control, using the getline command and the next statement.

    5) Output statements, print and printf.

    6) Deletion statements, for deleting array elements.

    Awk variables

    Most awk variables are available for you to use for your own purposes; they never change except when your program assigns values to them, and never affect anything except when your program examines them.

    A few variables have special built-in meanings. Some of them awk examines automatically, so that they enable you to tell awk how to do certain things. Others are set automatically by awk, so that they carry information from the internal workings of awk to your program.

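    User-modified: built-in variables that you change to control awk, such as FS (the input field separator) and OFS (the output field separator).

    Auto-set: built-in variables through which awk gives you information, such as NR (the number of records read so far) and NF (the number of fields in the current record).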
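    The two groups of built-in variables can be illustrated with a short sketch. FS and OFS are set by the program to control awk, while NR and NF are filled in by awk itself. The colon-separated input is invented for the example.

```shell
# FS and OFS are set by the program; NR and NF are maintained by awk.
out=$(printf 'root:x:0\ndaemon:x:1:extra\n' |
      awk 'BEGIN { FS = ":"; OFS = " | " }
           { print NR, NF, $1 }')
printf '%s\n' "$out"
```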

    Control of flow statements:

    Control statements such as if, while, and so on control the flow of execution in awk programs. Most of the control statements in awk are patterned on similar statements in C.

    All the control statements start with special keywords such as if and while, to distinguish them from simple expressions.

    Many control statements contain other statements; for example, the if statement contains another statement which may or may not be executed. The contained statement is called the body. If you want to include more than one statement in the body, group them into a single compound statement with curly braces, separating them with newlines or semicolons.

    If-Else Statement:

    The if-else statement is awk's decision-making statement. It looks like this:

    if (condition) then-body [else else-body]

    condition is an expression that controls what the rest of the statement will do. If condition is true, then-body is executed; otherwise, else-body is executed (assuming that the else clause is present). The else part of the statement is optional. The condition is considered false if its value is zero or the null string, and true otherwise.

    For example, with x taken from the first field of each record:

    awk '{ x = $1; if (x % 2 == 0) print "x is even"; else print "x is odd" }' filename
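    A runnable sketch of the if-else statement, using made-up one-column numeric input and testing the first field directly:

```shell
# The condition is true when the remainder of $1 divided by 2 is zero.
out=$(printf '4\n7\n10\n' |
      awk '{ if ($1 % 2 == 0) print $1, "is even"; else print $1, "is odd" }')
printf '%s\n' "$out"
```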

    While Statement :

    In programming, a loop means a part of a program that is (or at least can be) executed two or more times in succession.

    The while statement is the simplest looping statement in awk. It repeatedly executes a statement as long as a condition is true. It looks like this:

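    while (condition)
        body

    Here body is the statement to repeat, and condition is an expression that controls how long the loop keeps running.

    This example prints the first three fields of each record, one per line:

    awk '{ i = 1
           while (i <= 3) { print $i; i++ }
    }' filename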
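    A variation on the while loop, sketched with invented input: instead of stopping at a fixed count of three, the loop runs up to NF, so it prints every field of each record however many there are.

```shell
# i walks from 1 to NF, the number of fields in the current record.
out=$(printf 'a b c\nd e\n' |
      awk '{ i = 1
             while (i <= NF) {
                 print $i
                 i++
             }
           }')
printf '%s\n' "$out"
```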




    For Statement :

    The for statement makes it more convenient to count iterations of a loop. The general form of the for statement looks like this:

    for (initialization; condition; increment)
        body


    This statement starts by executing initialization. Then, as long as condition is true, it repeatedly executes body and then increment.

    Here is an example of a for statement:

    awk '{ for (i = 1; i <= 3; i++)
               print $i
    }' filename


    This prints the first three fields of each input record, one field at a time.
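    The for statement is also handy for accumulating a value across fields. The following sketch adds up all the fields of each record; the numeric input and the variable name sum are assumptions made for the example.

```shell
# sum is reset for every record, then each field is added in turn.
out=$(printf '1 2 3\n10 20\n' |
      awk '{ sum = 0
             for (i = 1; i <= NF; i++)
                 sum += $i
             print sum }')
printf '%s\n' "$out"
```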

    Thanks for listening

    A Mikati

    M Shaito

    For more information about Awk utility