Introduction to Pattern Matching in Perl: Concepts and Techniques

Perl Chapter 7 Pattern Matching

Introduction • Scanning strings for substrings is useful in many applications • grep, finding files, compilers, … • Perl's pattern matching derives from UNIX (egrep) and awk • The basis is regular expressions • which come from the theory of computation • Patterns are boolean expressions: a match yields true or false • Patterns remember the parts they matched (as a list)

Syntax • m dl pattern dl [modifiers], where dl is a delimiter character • m is the match operator • Using / ... / as the delimiters makes m optional • Examples: m~pattern~ (use a delimiter such as ~ when the pattern contains /) or /pattern/
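A minimal sketch of the delimiter choice described above. The sample path is a made-up string; both forms test for the same substring.

```perl
use strict;
use warnings;

# Hypothetical sample string that contains slashes.
my $path = "/usr/local/bin/perl";

# With the default / delimiters, each / inside the pattern must be escaped.
my $slash_form = ($path =~ /\/local\/bin/) ? 1 : 0;

# With an explicit m and another delimiter (here ~), no escaping is needed.
my $tilde_form = ($path =~ m~/local/bin~) ? 1 : 0;

print "slash form: $slash_form, tilde form: $tilde_form\n";
```

Both variables end up 1; the second form is usually easier to read when the pattern is a path or URL.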

Simple Patterns • Match individual characters or character classes • Characters in a pattern fall into 3 categories • normal chars, which match themselves • metachars, which have special meanings in patterns (such as \, $, ?, +) • a backslash turns a metachar into a normal char: \? matches a literal ? • the period • Escape sequences such as \t can appear in a pattern, where they match the character they denote (\t matches a tab)

Default string to match is $_ • if (/snow/) { print "snow in \$_ \n"; } • /snow/ returns T/F • The period matches any char except a newline • /a../ matches an a followed by 2 non-newline chars
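As a rough illustration of matching against $_ and of the period, the sketch below uses a few made-up strings. A for loop without a named variable aliases each element to $_, so /snow/ needs no binding operator.

```perl
use strict;
use warnings;

# Collect the strings that contain "snow"; /snow/ tests $_ implicitly.
my @with_snow;
for ("heavy snow tonight", "clear skies") {
    push @with_snow, $_ if /snow/;
}

# /a../ needs an a followed by two characters that are not newlines.
my $abc_matches     = ("abc"   =~ /a../) ? 1 : 0;   # b and c follow the a
my $newline_matches = ("a\nbc" =~ /a../) ? 1 : 0;   # a newline follows the a

print "with snow: @with_snow\n";
print "abc: $abc_matches, a-newline-bc: $newline_matches\n";
```

The second test fails because the period refuses to match the newline that follows the only a in the string.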

Matching Character classes • defined by placing chars in [ ]s • [A-Za-z] • [0-7] octal digit • [aeiou] • [^A-Za-z] chars NOT in char class

Common character classes • \d [0-9] a digit • \D [^0-9] • \w [A-Za-z0-9_] a word char • \W [^A-Za-z0-9_] • \s [ \r\t\n\f] white space • \S [^ \r\t\n\f]
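The shorthand classes can be probed one character at a time. This is a small sketch for plain ASCII input; the test characters are arbitrary. It shows that \w covers letters, digits, and the underscore.

```perl
use strict;
use warnings;

# Keep only the single characters that \w accepts.
my @word_chars = grep { /^\w$/ } ('a', 'Z', '7', '_', '-', ' ');

# Keep only the single characters that \s accepts.
my @space_chars = grep { /^\s$/ } ("\t", ' ', 'x');

# \D matches any character that is not a digit.
my $has_non_digit = ("12a4" =~ /\D/) ? 1 : 0;

print "word chars: @word_chars\n";
print "whitespace count: ", scalar(@space_chars), "\n";
print "12a4 has a non-digit: $has_non_digit\n";
```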

/[A-Z]"\s/ matches an uppercase letter, a double quote, and a whitespace char • /[\dA-Fa-f]/ matches one hex digit • Variables are interpolated in patterns: $pattern = " slkdjfsdf"; if (/$pattern/) { … }
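A brief sketch of these examples in use. The word list and the value stored in $pattern are illustrative choices, not taken from the slides; the anchors ^ and $ are added so each test covers the whole string.

```perl
use strict;
use warnings;

# An uppercase letter, a double quote, then a whitespace character.
my $quote_match = ('say A" now' =~ /[A-Z]"\s/) ? 1 : 0;

# Single characters that are hex digits.
my @hex = grep { /^[\dA-Fa-f]$/ } qw(7 c F g);

# A pattern held in a variable is interpolated before matching.
my $pattern = "c[aou]t";
my @hits = grep { /$pattern/ } qw(cat cot cit cut);

print "quote match: $quote_match\n";
print "hex digits: @hex\n";
print "hits: @hits\n";
```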

Quantifiers • {n} exactly n reps • {m,} at least m reps • {m,n} at least m, but not more than n • /a{1,3}b/ matches ab, aab, aaab • /(cats){3}/ matches catscatscats • /[abc]{1,2}/ matches a, b, c, aa, ab, ac, ba, bb, bc, ca, cb, cc • * 0 or more, so it can match the empty string • + 1 or more • ? 0 or 1 • for comparison, an unquantified . matches exactly 1 char

/\w+/ matches 1 or more word chars • /\d+\.\d+/ matches 1 or more digits, a decimal point, and 1 or more digits (i.e., a decimal number); note that \. matches a literal decimal point • /\$?\d+\.\d\d/ matches a price with or without $ • /ba(ll)*/ matches ba followed by 0 or more occurrences of the string ll • /\d{3}-\d{2}-\d{4}/ matches an SSN
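The quantifier examples can be checked against a few candidate strings. In this sketch the candidates are made up, and the patterns are anchored with ^ and $ (an addition to the slide versions) so that a pattern must account for the entire string rather than any substring.

```perl
use strict;
use warnings;

# {1,3}: between one and three a's before the b.
my @ab = grep { /^a{1,3}b$/ } qw(b ab aab aaab aaaab);

# (ll)*: zero or more repetitions of the two-character group.
my @ball = grep { /^ba(ll)*$/ } qw(ba ball ballll bal);

# Optional $ sign, digits, a literal point, exactly two digits.
my @prices = grep { /^\$?\d+\.\d\d$/ } ('$4.99', '12.50', '7', '3.5');

# Three digits, two digits, four digits, separated by hyphens.
my $ssn_ok = ("123-45-6789" =~ /^\d{3}-\d{2}-\d{4}$/) ? 1 : 0;

print "a{1,3}b: @ab\n";
print "ba(ll)*: @ball\n";
print "prices:  @prices\n";
print "ssn ok:  $ssn_ok\n";
```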

Questions • Assume $_ = "Tommie"; • Which m in Tommie does /m/ match? • What do /m*/, /m+/, and /m*i/ match? • Answers: /m/ matches the leftmost m • /m*/ matches the empty string at the beginning • /m+/ matches mm • /m*i/ matches mmi
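One way to check these answers is to capture what each pattern matched and where. The sketch below wraps each pattern in parentheses and reads the start offset from the built-in @- array; the %result hash is just a holder for this illustration.

```perl
use strict;
use warnings;

$_ = "Tommie";

# For each pattern, record the matched text and its starting offset.
my %result;
for my $re ('m', 'm*', 'm+', 'm*i') {
    if (/($re)/) {
        $result{$re} = { text => $1, start => $-[0] };
        print "/$re/ matched '$1' at offset $-[0]\n";
    }
}
```

/m*/ succeeds at offset 0 with an empty match because zero m's is an acceptable result for *, and the leftmost possible match wins.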

Matching • .* in greedy mode (the default) matches the maximum possible number of non-newline chars • $_ = "Bob Bobcat Bobolink"; /.*Bob/ matches everything through the Bob in Bobolink • .* first matches the whole string, then backs up one character at a time until the rest of the pattern, Bob, matches, which finds the rightmost occurrence • All quantified patterns work this way

Matching • $_ = "Freddie's hot dogs are really hot!"; • /Fred+/ matches Fredd • /Fred+?/ (a trailing ? selects minimal mode) matches Fred • /.*hot/ matches through the last hot • /.*?hot/ matches through the first hot
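Greedy and minimal matching can be compared directly by capturing the matched text. This sketch adds capturing parentheses around the slide patterns so the result can be stored; the variable names are illustrative.

```perl
use strict;
use warnings;

my $s = "Freddie's hot dogs are really hot!";

# In list context a match returns its captured groups.
my ($greedy)       = $s =~ /(Fred+)/;    # + takes as many d's as it can
my ($minimal)      = $s =~ /(Fred+?)/;   # +? takes as few d's as it can
my ($to_last_hot)  = $s =~ /(.*hot)/;    # greedy .* runs to the last hot
my ($to_first_hot) = $s =~ /(.*?hot)/;   # minimal .*? stops at the first hot

my ($bob) = "Bob Bobcat Bobolink" =~ /(.*Bob)/;

print "$greedy | $minimal\n";
print "$to_last_hot\n$to_first_hot\n$bob\n";
```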

Precedence • From highest to lowest • ( ) • Quantifiers • Character sequence • Alternation, e.g. /belly|belts|bells/ • Be careful mixing alternation with character classes • [belly|belts|bells] is not an alternation: it is a class matching one char, equivalent to [belyts|]
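The difference between an alternation and a character class, and the effect of quantifier precedence, can be seen in a short sketch. The test strings are made up, and ^ and $ are added so each pattern must cover the whole string.

```perl
use strict;
use warnings;

# Alternation inside parentheses: one of three whole words.
my @alt = grep { /^(belly|belts|bells)$/ } ('belly', 'belts', 'bells', 'bet');

# The same text inside [ ] is a class: exactly one of b e l y | t s.
my @class = grep { /^[belly|belts|bells]$/ } ('b', 'y', '|', 'belly');

# A quantifier binds tighter than sequence: b+ repeats only the b.
my $seq   = ("abab" =~ /^ab+$/)   ? 1 : 0;
# Parentheses bind tightest, so (ab)+ repeats the pair.
my $group = ("abab" =~ /^(ab)+$/) ? 1 : 0;

print "alternation: @alt\n";
print "class: @class\n";
print "ab+: $seq, (ab)+: $group\n";
```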

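Binding operators • A pattern can be matched against any string, not just $_ • Binding operators connect a string to a pattern • $stringvar =~ /[,;:]/; is true if the pattern matches somewhere in $stringvar • $string !~ /[,;:]/; performs the same match but inverts the result: true if the pattern does not match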
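A small sketch of the two binding operators, sorting some made-up strings by whether they contain a comma, semicolon, or colon.

```perl
use strict;
use warnings;

my @items = ("alpha", "beta;gamma", "delta, epsilon");

my (@with_punct, @without_punct);
for my $item (@items) {
    # =~ is true when the pattern matches somewhere in $item.
    push @with_punct, $item if $item =~ /[,;:]/;
    # !~ is true when the pattern matches nowhere in $item.
    push @without_punct, $item if $item !~ /[,;:]/;
}

print "with: ", join(" | ", @with_punct), "\n";
print "without: ", join(" | ", @without_punct), "\n";
```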

Remembering matches • $s = "TD ran for 305 yards today"; $s =~ /(\d+) (\w+) (\w+)/; print "$1 $2 $3 \n"; • prints 305 yards today • Nested parentheses are numbered by their opening parenthesis: $s =~ /((\d+) (\w+) (\w+))/; • $1 305 yards today • $2 305 • $3 yards • $4 today
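The capture numbering can be confirmed with a short sketch. It writes the pattern with single spaces between the groups, which the match needs in order to yield 305, yards, and today, and it relies on a match in list context returning all captured groups in order.

```perl
use strict;
use warnings;

my $s = "TD ran for 305 yards today";

# Three side-by-side groups: $1, $2, $3 in left-to-right order.
my @flat;
if ($s =~ /(\d+) (\w+) (\w+)/) {
    @flat = ($1, $2, $3);
}

# An enclosing group opens first, so it becomes $1 and the rest shift up.
my @nested = $s =~ /((\d+) (\w+) (\w+))/;

print join(" | ", @flat), "\n";
print join(" | ", @nested), "\n";
```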

Split with a pattern • $s = "Betty, Bert, Bart, Bartholomew"; @names = split /, /, $s; • $s = "Betty:778:Bert:222:Bart:43297:Bartholomew"; @names = split /:\d+:/, $s; • $names[0] = Betty, $names[1] = Bert, $names[2] = Bart, $names[3] = Bartholomew
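Both separators can be tried with split, which takes the pattern first and the string second. This is a minimal sketch; the array names are illustrative.

```perl
use strict;
use warnings;

# Fixed separator: a comma followed by a space.
my @by_comma = split /, /, "Betty, Bert, Bart, Bartholomew";

# Pattern separator: a colon, one or more digits, and another colon.
my @by_number = split /:\d+:/, "Betty:778:Bert:222:Bart:43297:Bartholomew";

print "@by_comma\n";
print "@by_number\n";
```

The text matched by the separator pattern is discarded, so both arrays hold only the four names.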

Substitutions • $x = "no more apples!"; $x =~ s/apples/applets/; changes $x to "no more applets!" • $x = "12034005"; $x =~ s/0//g; changes $x to "12345" • The g modifier changes every occurrence, not just the first
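The effect of the g modifier shows up when the same substitution is run with and without it. In this sketch the return value of s///, the number of substitutions made, is also kept; the variable names are illustrative.

```perl
use strict;
use warnings;

my $x = "no more apples!";
my $replaced = ($x =~ s/apples/applets/);   # one substitution made

# Without g, only the first 0 is removed.
my $first_only = "12034005";
$first_only =~ s/0//;

# With g, every 0 is removed; the return value counts them.
my $every = "12034005";
my $removed = ($every =~ s/0//g);

print "$x ($replaced)\n";
print "$first_only\n";
print "$every ($removed)\n";
```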

Translating characters • tr/search-list/replacement-list/ • tr/a-z/A-Z/; replaces all lowercase letters with uppercase and returns the number of chars replaced • tr/\./\./; replaces every . with ., so the string is unchanged, but the return value counts the periods • $s = "Hello"; $s =~ tr/a-z/A-Z/; changes $s to HELLO and returns 4 (true)
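A short sketch of tr used both to translate and to count. The counting example writes the period without a backslash, on the understanding that tr takes character lists rather than patterns, so the period has no special meaning there; the sample text is made up.

```perl
use strict;
use warnings;

# Translate lowercase to uppercase; the return value counts changed chars.
my $s = "Hello";
my $changed = ($s =~ tr/a-z/A-Z/);

# Map . to itself: the string is unchanged and the return value is a count.
my $text = "a.b.c.";
my $dots = ($text =~ tr/././);

print "$s ($changed)\n";
print "$text has $dots periods\n";
```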
