slide1

Math 272

AWK UTILITY

slide2

BY

A Mikati

&

M Shaito

slide3

supervised by:

Dr. A Nasri

awk utility
Awk Utility
  • Introduction
  • Some basics
  • Some samples
  • Patterns & Actions
  • Regular Expressions
  • Boolean
  • start /end
  • BEGIN /END
awk utility continued
Awk Utility (continued)
  • Awk variables
  • Control of flow statements:
  • a: If_Else statement
  • b: While Statement
  • c: For statement
introduction
Introduction:
      • History:
  • The name awk comes from the initials of its designers: Alfred V. Aho, Peter J. Weinberger, and Brian W. Kernighan. The original version of awk was written in 1977. In 1985 a new version made the programming language more powerful, introducing user-defined functions, multiple input streams, and computed regular expressions.
introduction cont d
Introduction (cont’d):

If you are like many computer users, you would frequently like to make changes in various text files wherever certain patterns appear, or extract data from parts of certain lines while discarding the rest. To write a program to do this in a language such as C or Pascal is a time-consuming inconvenience that may take many lines of code. The job may be easier with awk.

The awk utility interprets a special-purpose programming language that makes it possible to handle simple data-reformatting jobs easily with just a few lines of code.

some basics
Some Basics:
  • The basic function of awk is to search files for lines (or other units of text) that contain certain patterns.
        • Awk recognizes the concepts of "file", "record", and "field".
        • A file consists of records, which by default are the lines of the file. One line becomes one record.
        • Awk operates on one record at a time.
        • A record consists of fields, which by default are separated by any number of spaces or tabs.
        • Field number 1 is accessed with $1, field 2 with $2, and so forth. $0 refers to the whole record.
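As a small illustration of records and fields, the sketch below feeds two made-up records to awk. The input text is hypothetical, and NR (record number) and NF (number of fields) are standard awk built-in variables that the slides do not name.

```shell
# Hypothetical input: two records, fields separated by runs of spaces or a tab.
out=$(printf 'alpha   beta\tgamma\none two\n' |
  awk '{ print NR, NF, $2, $1 }')   # record number, field count, 2nd field, 1st field
echo "$out"
```

Each input line is one record, and the mixed spaces and tab in the first record still yield three fields.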
some samples
Some Samples:

>awk '{print $0}' filename

Perhaps the quickest way of learning awk is to look at some sample programs. The one above will print the file in its entirety, just like cat(1). Here are some others, along with a quick description of what they do.

>awk '{print $2,$1}' filename

will print the second field, then the first. All other fields are ignored.

What if you don't want to apply the program to each line of the file? Say, for example, that you only wanted to process lines that had the first field greater than the second. The following program will do that:

>awk '$1 > $2 {print $1,$2,$1-$2}' filename

patterns actions
Patterns & Actions:

The part outside the curly braces is called the "pattern", and the part inside is the "action". The comparison operators are the ones from C:

== != < > <= >=

The C conditional operator ?: is also available in expressions.

If no pattern is given, then the action applies to all lines. This fact was used in the sample programs above. If no action is given, then the entire line is printed. If "print" is used all by itself, the entire line is printed. Thus, the following are equivalent:

awk '$1 > $2' filename

awk '$1 > $2{print}' filename

awk '$1 > $2{print $0}' filename
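The equivalence can be checked with a quick sketch. The three-line input is made up for illustration, and the variable names are arbitrary.

```shell
# Hypothetical input: two numeric fields per record.
data='5 3
2 9
10 4'
a=$(printf '%s\n' "$data" | awk '$1 > $2')              # pattern only
b=$(printf '%s\n' "$data" | awk '$1 > $2 {print}')      # bare print
c=$(printf '%s\n' "$data" | awk '$1 > $2 {print $0}')   # explicit $0
[ "$a" = "$b" ] && [ "$b" = "$c" ] && echo "all three forms agree"
echo "$a"
```

Only the records whose first field is numerically greater than the second are printed, whichever form is used.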

patterns actions cont d
Patterns & Actions: (cont’d)

The various fields in a line can also be treated as strings instead of numbers. To compare a field to a string, use the following method:

>awk '$1=="foo"{print $2}' filename

There are various types of patterns and actions, which are explained in detail below.

kinds of patterns
Kinds of patterns:
      • /regular expression/
    • A regular expression as a pattern. It matches when the text of the input record fits the regular expression.
  • expression
    • A single expression. It matches when its value is nonzero (if a number) or non-null (if a string).
  • BEGIN END
    • Special patterns to supply start-up or clean-up information to awk.
  • null
    • The empty pattern matches every input record.
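The kinds of patterns listed above can be combined in one program. The following is only a sketch on made-up input; the counter name n is arbitrary.

```shell
# Hypothetical input: a name and a number on each record.
out=$(printf 'foo 0\nbar 7\nbaz 0\n' | awk '
  BEGIN { print "start" }          # BEGIN: runs before any input is read
  /^ba/ { print "regexp:", $1 }    # regular expression pattern
  $2    { print "nonzero:", $1 }   # expression pattern: matches when $2 is nonzero
        { n++ }                    # empty pattern: matches every record
  END   { print "records:", n }    # END: runs after the last record
')
echo "$out"
```

Every rule is tried against each record in order, so a single record can trigger more than one action.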
regular expressions
Regular Expressions:

A regular expression, or regexp, is a way of describing a class of strings. A regular expression enclosed in slashes (`/') is an awk pattern that matches every input record whose text belongs to that class.

The simplest regular expression is a sequence of letters, numbers, or both. Such a regexp matches any string that contains that sequence. Thus, the regexp `foo' matches any string containing `foo'. Therefore, the pattern /foo/ matches any input record containing `foo'. Other kinds of regexps let you specify more complicated classes of strings.

>awk '/foo.*bar/{print $1,$3}' filename
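To see which records a pattern like /foo.*bar/ selects, here is a sketch on three invented records. The pattern requires `foo' to appear somewhere before `bar' in the record.

```shell
# Hypothetical input: only records with "foo" followed later by "bar" should match.
out=$(printf 'foo x bar\nbar x foo\nfoodbar 1 2\n' |
  awk '/foo.*bar/ { print $1, $3 }')
echo "$out"
```

The second record contains both words but in the wrong order, so it is skipped.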

boolean
Boolean:

A Boolean pattern is an expression which combines other patterns using the Boolean operators "or" (`||'), "and" (`&&'), and "not" (`!'). Whether the Boolean pattern matches an input record depends on whether its subpatterns match.

For example, the following command prints all records in the input file `filename' that contain both `2400' and `foo'.

awk '/2400/ && /foo/' filename
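The three Boolean operators can be compared side by side. This sketch uses an invented four-record input; the shell variable names are arbitrary.

```shell
# Hypothetical input covering every combination of "2400" and "foo".
data='2400 foo
2400 bar
1200 foo
1200 bar'
both=$(printf '%s\n' "$data" | awk '/2400/ && /foo/')     # and
either=$(printf '%s\n' "$data" | awk '/2400/ || /foo/')   # or
no_foo=$(printf '%s\n' "$data" | awk '!/foo/')            # not
printf 'and:\n%s\nor:\n%s\nnot:\n%s\n' "$both" "$either" "$no_foo"
```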

start end
Start & end:

There are three special forms of patterns that do not fit the above descriptions. One is the start-end pair of patterns, also known as a range pattern. It is made of two patterns separated by a comma, of the form startpat, endpat, and it matches ranges of consecutive input records. The first pattern, startpat, controls where the range begins, and the second one, endpat, controls where it ends. For example,

awk '$1 == "on", $1 == "off"' filename

prints every record from one whose first field is "on" through the next record whose first field is "off", inclusive.
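A short sketch of a range pattern in action, using invented input. Records before "on" and after "off" fall outside the range.

```shell
# Hypothetical input: the range should cover the "on" record through the "off" record.
out=$(printf 'x 1\non 2\ny 3\noff 4\nz 5\n' |
  awk '$1 == "on", $1 == "off"')
echo "$out"
```

Both boundary records are included in the output.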

begin end
BEGIN /END:
  • Any action associated with the BEGIN pattern will happen before any line-by-line processing is done. Actions with the END pattern will happen after all lines are processed.
  • But how do you put more than one pattern-action pair into an awk program? There are several choices.
        • One is to just mash them together, like so:
        • >awk 'BEGIN{print"fee"} $1=="foo"{print"fi"} END{print"fo fum"}' filename
begin end cont d
BEGIN /END: (cont’d)
  • Another choice is to put the program into a file, like so:
  • BEGIN{print"fee"}
  • $1=="foo"{print"fi"}
  • END{print"fo fum"}
  • Let's say that's in the file giant.awk. Now, run it using the "-f" flag to awk:
  • >awk -f giant.awk filename
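The program-file approach can be tried end to end with the sketch below. The temporary directory and the three-line input file are assumptions made for the demonstration; only the giant.awk program itself comes from the slide.

```shell
tmp=$(mktemp -d)                       # scratch directory for the demo
cat > "$tmp/giant.awk" <<'EOF'
BEGIN { print "fee" }
$1 == "foo" { print "fi" }
END { print "fo fum" }
EOF
printf 'foo 1\nbar 2\nfoo 3\n' > "$tmp/input.txt"   # hypothetical input file
out=$(awk -f "$tmp/giant.awk" "$tmp/input.txt")
echo "$out"
rm -r "$tmp"
```

The BEGIN action runs once before the input, the middle rule fires for each record whose first field is "foo", and the END action runs once after the last record.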
begin end cont d1
BEGIN / END : (cont’d)
        • A third choice is to create a file that calls awk all by itself. The following form will do the trick:
        • #!/usr/bin/awk -f
        • BEGIN{print"fee"}
        • $1=="foo"{print"fi"}
        • END{print"fo fum"}
  • If we call this file giant2.awk, we can run it by first giving it execute permissions,
  • >chmod u+x giant2.awk
  • and then just call it like so:
  • >./giant2.awk filename
begin end cont d2
BEGIN /END: (cont’d)

awk has variables that can be either real numbers or strings. For example, the following code prints a running total of the fifth column:

>awk '{print x+=$5,$0 }' filename

This can be used when looking at file sizes from an "ls -l". It is also useful for balancing one's checkbook, if the amount of the check is kept in one column.
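The running-total idea can be sketched with the assignment split out for readability. The input records are invented, and the variable name total and the final END line are additions for illustration.

```shell
# Hypothetical input: the amount of interest is in the fifth column.
out=$(printf 'a b c d 10\na b c d 25\na b c d 7\n' | awk '
  { total += $5; print total, $0 }   # running total, then the original record
  END { print "total:", total }      # grand total after the last record
')
echo "$out"
```

An awk variable starts out empty, which acts as zero in arithmetic, so no explicit initialization is needed.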

actions
Actions:

An awk program or script consists of a series of rules and function definitions, interspersed. A rule contains a pattern and an action, either of which may be omitted. The purpose of the action is to tell awk what to do once a match for the pattern is found. Thus, the entire program looks somewhat like this:

[pattern] [{ action }]

[pattern] [{ action }]

function name (args) { ... }

An action consists of one or more awk statements, enclosed in curly braces (`{' and `}'). Each statement specifies one thing to be done. The statements are separated by newlines or semicolons.
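A minimal sketch of this program layout, with one function definition and two rules. The function name square and the input numbers are hypothetical.

```shell
# Hypothetical input: one number per record.
out=$(printf '3\n4\n' | awk '
  function square(n) { return n * n }           # function definition
  { print $1, square($1); sum += square($1) }   # rule with no pattern: two statements
  END { print "sum of squares:", sum }          # rule with the END pattern
')
echo "$out"
```

The two statements in the middle rule are separated by a semicolon; a newline would work equally well.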

actions cont d
Actions: (cont’d)

Here are the kinds of statements supported in awk:

1)Expressions, which can call functions or assign values to variables. Executing this kind of statement simply computes the value of the expression and then ignores it. This is useful when the expression has side effects.

2)Control statements, which specify the control flow of awk programs. The awk language gives you C-like constructs (if, for, while, and so on) as well as a few special ones.

3)Compound statements, which consist of one or more statements enclosed in curly braces. A compound statement is used in order to put several statements together in the body of an if, while, do or for statement.

actions cont d1
Actions:(cont’d)

4)Input control, using the getline command and the next statement

5)Output statements, print and printf.

6)Deletion statements, for deleting array elements.
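Several of these statement kinds can appear together in a short program. The sketch below is one possible illustration on invented input; the array name count is arbitrary.

```shell
# Hypothetical input: a comment line followed by name/quantity records.
out=$(printf '# comment\napple 3\npear 5\napple 4\n' | awk '
  /^#/ { next }                       # input control: skip to the next record
  { count[$1] += $2 }                 # expression statement with a side effect
  END {
    delete count["pear"]              # deletion statement: drop one array element
    for (k in count)                  # control statement
      printf "%s=%d\n", k, count[k]   # output statement
  }
')
echo "$out"
```

After the deletion only one array element remains, so the order in which for-in visits elements does not matter here.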

awk variables
Awk variables

Most awk variables are available for you to use for your own purposes; they never change except when your program assigns values to them, and never affect anything except when your program examines them.

A few variables have special built-in meanings. Some of them awk examines automatically, so that they enable you to tell awk how to do certain things. Others are set automatically by awk, so that they carry information from the internal workings of awk to your program.

User-modified: built-in variables that you change to control awk.

Auto-set: built-in variables that awk sets to give you information.
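As an example of each group, the sketch below sets FS and OFS (the input and output field separators) and reads NR and NF (the record number and field count). These four are standard awk built-ins that the slides do not name, and the colon-separated input is made up.

```shell
# Hypothetical colon-separated input.
out=$(printf 'root:x:0\nalice:x:1000\n' | awk '
  BEGIN { FS = ":"; OFS = " -> " }   # user-modified: tell awk how to split and join fields
  { print NR, NF, $1 }               # auto-set: awk reports record number and field count
')
echo "$out"
```

Setting FS in a BEGIN action ensures it takes effect before the first record is split.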

control of flow statements
Control of flow statements:

Control statements such as if, while, and so on control the flow of execution in awk programs. Most of the control statements in awk are patterned on similar statements in C.

All the control statements start with special keywords such as if and while, to distinguish them from simple expressions.

Many control statements contain other statements; for example, the if statement contains another statement which may or may not be executed. The contained statement is called the body. If you want to include more than one statement in the body, group them into a single compound statement with curly braces, separating them with newlines or semicolons.

if statement
If-Else Statement:

The if-else statement is awk's decision-making statement. It looks like this:

if (condition) then-body [else else-body]

condition is an expression that controls what the rest of the statement will do. If condition is true, then-body is executed; otherwise, else-body is executed (assuming that the else clause is present). The else part of the statement is optional. The condition is considered false if its value is zero or the null string, and true otherwise.

awk '{ x = $1; if (x % 2 == 0) print "x is even"; else print "x is odd" }'
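A runnable variant of the even/odd test, sketched on two invented input numbers. It tests the first field of each record directly.

```shell
# Hypothetical input: one integer per record.
out=$(printf '4\n7\n' | awk '{
  if ($1 % 2 == 0)
    print $1, "is even"
  else
    print $1, "is odd"
}')
echo "$out"
```

Each branch here is a single statement, so no curly braces are needed around the then-body or the else-body.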

while statement
While Statement :

In programming, a loop means a part of a program that is (or at least can be) executed two or more times in succession.

The while statement is the simplest looping statement in awk. It repeatedly executes a statement as long as a condition is true. It looks like this:

while (condition)

body

This example prints the first three fields of each record, one per line.

awk '{ i = 1
       while (i <= 3) {
           print $i
           i++
       }
}'
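A while loop can also run over however many fields a record has. This sketch sums the fields of each record, using the standard built-in NF (number of fields) as the bound; the input is invented.

```shell
# Hypothetical input: records with differing numbers of numeric fields.
out=$(printf '1 2 3\n10 20\n' | awk '{
  i = 1; sum = 0
  while (i <= NF) {    # loop while there are fields left
    sum += $i          # $i is the i-th field
    i++
  }
  print sum
}')
echo "$out"
```

The body holds two statements, so they are grouped into a compound statement with curly braces.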

for statement
For Statement :

The for statement makes it more convenient to count iterations of a loop. The general form of the for statement looks like this:

for (initialization; condition; increment)

body

This statement starts by executing initialization. Then, as long as condition is true, it repeatedly executes body and then increment.

Here is an example of a for statement:

awk '{ for (i = 1; i <= 3; i++)
          print $i
}'

This prints the first three fields of each input record, one field at a time.
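The initialization, condition, and increment parts need not count upward. As a sketch, the loop below walks the fields of each record from last to first; NF is the standard built-in field count, and the input is made up.

```shell
# Hypothetical input: print each record with its fields in reverse order.
out=$(printf 'a b c\nx y\n' | awk '{
  for (i = NF; i >= 1; i--)
    printf "%s%s", $i, (i > 1 ? " " : "\n")   # space between fields, newline after the last
}')
echo "$out"
```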

slide28

Thanks for listening

A Mikati

M Shaito

slide29

For more information about Awk utility

VISIT

http://mshaito.tripod.com/awk/awk.html
