
## PowerPoint Slideshow about 'CSC 4630' - taniel

### CSC 4630

Meeting 12

Exam Reprise

Regular Expressions

In awk distinguish among

• /[a-dw-z]/
• /^[a-dw-z]/
• /[^a-dw-z]/
• /^[a-dw-z]$/
• !/[a-dw-z]/
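
One way to see the difference is to run all five patterns against the same input. The sketch below is only an illustration: the five sample lines and the helper name show are made up for this example, and the expected results in the comments assume the bracket expression covers exactly the letters a-d and w-z.

```sh
# show PATTERN: print the sample lines that PATTERN selects, on one line.
# (show and the sample lines are hypothetical, chosen for illustration.)
show() {
    printf '%s\n' apple mango m x 123 | awk "$1" | paste -sd' ' -
}

show '/[a-dw-z]/'     # contains a listed letter:       apple mango x
show '/^[a-dw-z]/'    # starts with a listed letter:    apple x
show '/[^a-dw-z]/'    # contains some unlisted char:    apple mango m 123
show '/^[a-dw-z]$/'   # is exactly one listed letter:   x
show '!/[a-dw-z]/'    # contains no listed letter:      m 123
```

Note that /[^a-dw-z]/ and !/[a-dw-z]/ disagree on apple and mango: the first asks for at least one character outside the set, the second for no character inside it.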

Variables in Scripts
• Built-in vs. user defined
• Initialization conventions
• Values vs. names
• Types and coercion
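
A minimal sketch of these points, run entirely in a BEGIN block so that no input is needed. The variable names u, s, and n are made up for the example; NR and FS are built in.

```sh
out=$(awk 'BEGIN {
    # Built-in variables already have values: no records read yet,
    # and FS starts out as a single space.
    print NR, (FS == " ")
    # An unset user-defined variable acts as "" in string context
    # and as 0 in numeric context.
    print "[" u "]", u + 0, length(u)
    # Arithmetic coerces a numeric string to a number.
    s = "3.0"; print s + 1
    # Concatenation and length coerce a number to a string.
    n = 42; print n "", length(n)
}')
printf '%s\n' "$out"
```

The four output lines should be 0 1, then [] 0 0, then 4, then 42 2.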

awk Variables

Compare

• NF
• $NF
• NR
• $NR
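
The four can be printed side by side. In this sketch the three input lines are made up; NF and NR are counts, while $NF and $NR use those counts as field positions.

```sh
out=$(printf '%s\n' 'a b c' 'd e' 'f g h i' |
    awk '{print NF, $NF, NR, $NR}')
printf '%s\n' "$out"
# 3 c 1 a    three fields, last is c; record 1, field 1 is a
# 2 e 2 e    two fields, last is e;   record 2, field 2 is e
# 4 i 3 h    four fields, last is i;  record 3, field 3 is h
```

$NR is rarely useful in practice; it is empty whenever the record number exceeds the number of fields on the line.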

awk Variables (2)

Example: Implement wc for numnames

{len += length($1)}

END {print NR, NR, NR + len}

Notes:

• len initializes to 0
• NR increments with each line read
• NF is consistently 1
• length($0) = length($1) = length($NF)
• length does not count the \n line terminators
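
A quick check of the script against the real wc. The contents of numnames are not shown here, so this sketch substitutes four made-up words, one per line, through a hypothetical helper called words.

```sh
# Hypothetical stand-in for numnames: one word per line.
words() { printf '%s\n' one two three four; }

mine=$(words | awk '{len += length($1)} END {print NR, NR, NR + len}')
theirs=$(words | wc | awk '{print $1, $2, $3}')   # squeeze the padding wc adds

echo "awk: $mine"     # awk: 4 4 19
echo "wc:  $theirs"   # wc:  4 4 19
```

The character count is 15 letters plus 4 line terminators, which is why the script adds NR to len.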

awk Strings

Suppose each line of a file looks like the 14-character string

(610) 519-4505

How many fields are there in the string?

• If using default FS, two fields, lengths 5, 8
• If FS = "-", two fields, lengths 9, 4
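
Both answers can be confirmed by asking awk for NF and the field lengths. This is a small sketch; the shell variable names are arbitrary.

```sh
phone='(610) 519-4505'

# Default FS: split on the space.
by_space=$(printf '%s\n' "$phone" |
    awk '{print length($0), NF, length($1), length($2)}')

# FS = "-": split on the hyphen.
by_dash=$(printf '%s\n' "$phone" |
    awk 'BEGIN {FS = "-"} {print length($0), NF, length($1), length($2)}')

echo "$by_space"   # 14 2 5 8
echo "$by_dash"    # 14 2 9 4
```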

awk Strings (2)

Reformatting the string (610) 519-4505 and others of this form

{print substr($1,2,3) "-" $2}

Notes:

• No pattern means apply to all lines
• No commas between print arguments means the strings are concatenated; commas insert OFS between them
• Original awk does not allow multiple simultaneous field separators (except the default space and tab); POSIX awk (nawk, gawk) accepts a regular expression as FS, such as "[ -]"
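
A sketch of the reformatting script in action, with a comma variant alongside to show the OFS behavior. The second phone number is made up for the example.

```sh
numbers() { printf '%s\n' '(610) 519-4505' '(215) 555-0100'; }

# No commas: the three pieces are concatenated.
joined=$(numbers | awk '{print substr($1,2,3) "-" $2}')

# With a comma: OFS (a space by default) is inserted between arguments.
spaced=$(numbers | awk '{print substr($1,2,3), $2}')

printf '%s\n' "$joined"   # 610-519-4505 and 215-555-0100, one per line
printf '%s\n' "$spaced"   # 610 519-4505 and 215 555-0100, one per line
```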

awk Scripts

Given data separated by tabs, fifth field is elapsed time given as hh:mm

Script computes average elapsed time.

Try again (by Wednesday), using a template that

• has one action statement in body
• has one calculation and one print statement for the END pattern
• uses one user-defined variable called t
• uses + * / % operators, int and substr functions
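
One possible shape for such a script, offered as a sketch and not as the expected answer. It assumes every line has a two-digit hh and a two-digit mm in field 5, that there is no header line, and that the input is not empty; the three data lines are made up.

```sh
out=$(printf 'j1\tu1\tq1\tok\t01:30\nj2\tu2\tq2\tok\t02:15\nj3\tu3\tq3\tok\t00:45\n' |
    awk -F'\t' '
        # Body: accumulate elapsed time in minutes (hh * 60 + mm).
        {t += 60 * substr($5, 1, 2) + substr($5, 4, 2)}
        # END: one calculation (mean minutes, truncated), one print.
        END {t = int(t / NR); print int(t / 60) ":" (t % 60)}
    ')
echo "$out"   # 1:30
```

The total is 90 + 135 + 45 = 270 minutes over 3 lines, so the mean is 90 minutes. Because the template calls for print and not printf, the minutes are not zero-padded: a mean of 65 minutes would print as 1:5.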

awk Scripts (2)

Given a file of words, one per line.

Script returns frequency count of the letters in the words.

Try again (by Wednesday), using a template that

• has one action statement in body, a for loop
• has one statement for the END pattern, a for loop that controls the printing
• uses one user-defined variable, an array called lc
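
A sketch that fits this template, under the assumption that the loop indices (called i and c here) do not count against the one-variable limit. The two input words are made up, and the output is piped through sort only because awk does not specify the order of a for-in loop.

```sh
out=$(printf '%s\n' bee bad |
    awk '
        # Body: one for loop that counts each letter of the word.
        {for (i = 1; i <= length($1); i++) lc[substr($1, i, 1)]++}
        # END: one for loop that controls the printing.
        END {for (c in lc) print c, lc[c]}
    ' | sort)
printf '%s\n' "$out"
# a 1
# b 2
# d 1
# e 2
```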