
Programming in Perl regular expressions and m,s operators

Peter Verhás, January 2002.




  1. Programming in Perl: regular expressions and the m and s operators. Peter Verhás, January 2002.

  2. Pattern Matching Operator

     expression =~ m/regexp/options;

     $a = "apple";
     print "yes!" if $a =~ m/pp/;

     In scalar context the result is true (1) or false (the empty string, which counts as 0 in numeric context).
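
The two result values can be inspected directly. The following is a minimal sketch; the variable names $fruit, $hit and $miss are illustrative and do not come from the slides.

```perl
use strict;
use warnings;

my $fruit = "apple";

# In scalar context a successful match yields 1 ...
my $hit  = ($fruit =~ m/pp/);
# ... and a failed match yields the empty string, which is 0 as a number.
my $miss = ($fruit =~ m/xx/);

print "hit: '$hit'\n";                      # hit: '1'
print "miss: '$miss'\n";                    # miss: ''
print "miss as number: ", $miss + 0, "\n";  # miss as number: 0
```

Printing the false value shows nothing, which is why a failed match looks like an empty field in output.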

  3. M operator options
     • g global search
     • i case-insensitive search
     • m treat the string as multiple lines
     • s treat the string as a single line
     • o compile the pattern only once
     • x extended regular expression syntax

     Now let's see what a regular expression is; then we will return to the fine points of the m operator.

  4. Regular Expressions
     • A regular expression is a pattern string built from ordinary characters, joker (meta) characters and joker expressions.
     • We will look at examples to explain it.

  5. Regular Expression to Verify Email (1)

     @mail = (
       'peter@verhas.com',
       'hab.akukk%mikkamakka@jeno',
     );
     for( @mail ){
       if( /^.*\@\w+\..+$/ ){
         print "$_ seems to be a good eMail\n";
       }else{
         print "$_ bad address\n";
       }
     }

     OUTPUT:
     peter@verhas.com seems to be a good eMail
     hab.akukk%mikkamakka@jeno bad address

     NOTES:
     • $_ is used as the default string to match against
     • m/ is the default when / is used as delimiter, so the condition is the same as $_ =~ m/^.*@\w+\..+$/
     • @ would also work instead of \@, but \@ is safe

  6. Regular Expression to Verify Email (2)

     /^.*\@\w+\..+$/

     • ^ at the start of the string
     • .* zero or more of any character
       • * means zero or more of what stands before it
     • \@ a single @ character
     • \w+ one or more word characters (letters, digits, underscore)
       • + means one or more of what stands before it
     • \. one . (dot) character
       • a special regexp character is escaped with \
     • .+ one or more of any character
     • $ until the end of the string

  7. Search and Replace Example of Regular Expressions

     $text = 'JavaScript is not used on island Java.';
     $text =~ s/Java(?!Script)/Borneo/;
     print $text;

     OUTPUT:
     JavaScript is not used on island Borneo.

     NOTES:
     • Operator s will be discussed later in detail
     • (?! ) is a zero-length negative look forward, detailed later

  8. Meta (joker) Characters
     • . any character but new line
     • ^ start of string
     • $ end of string
     • \ escaping the next character
     • \w any word character (letter, digit or underscore)
     • \W any non-word character
     • \s any white space
     • \S any non-whitespace

     These are only examples; there are other meta characters, see the Perl manual.
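
A few of the meta characters listed above can be tried out as follows. This is a small sketch, not part of the original slides; the sample string and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $line = "user_42 logged in";

# ^ with \w+ : the leading run of word characters (letters, digits, _)
my ($first) = $line =~ /^(\w+)/;
# \S+ with $ : the trailing run of non-whitespace characters
my ($last)  = $line =~ /(\S+)$/;
# . does not match a new line unless the s option is given
my $dot_matches_newline = ("a\nb" =~ /a.b/) ? "yes" : "no";

print "first: $first\n";                               # first: user_42
print "last: $last\n";                                 # last: in
print "dot matches new line: $dot_matches_newline\n";  # dot matches new line: no
```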

  9. Parentheses (1)

     $text = 'Hook is not used on island Java.';
     $text =~ /(Ho(ok))\s(is?).*\3((l|s)(a|l))/;
     print "$1 $2 $3 $4 $5 $6\n";
     # same pattern, but the text has 'i' instead of 'is'
     $text = 'Hook i not used on island Java.';
     $text =~ /(Ho(ok))\s(is?).*\3((l|s)(a|l))/;
     print "$1 $2 $3 $4 $5 $6\n";

     OUTPUT:
     Hook ok is la l a
     Hook ok i sl s l

     NOTES:
     • Numbering is in the order of the opening parentheses

  10. Parentheses without $n

     $text = 'Hook is not used on island Java.';
     $text =~ /(Ho(ok))\s(is?).*\3((?:l|s)(a|l))/;
     print "$1 $2 $3 $4 $5 .$6.\n";
     $text = 'Hook i not used on island Java.';
     $text =~ /(Ho(ok))\s(is?).*\3((?:l|s)(a|l))/;
     print "$1 $2 $3 $4 $5 .$6.\n";

     OUTPUT:
     Hook ok is la a ..
     Hook ok i sl l ..

     NOTES:
     • (?: ) groups a sub-expression without creating a reference
     • $6 is undefined here because only five capturing groups remain; it prints as an empty string

  11. Character classes
     • List of characters between [ and ]
     • Interval, e.g. [a-f]
     • Negative character set, e.g. [^a-f]
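
The three forms of character class can be compared on one string. The sketch below is illustrative only; the sample string "cafe-1F" and the variable names are assumptions, not material from the slides.

```perl
use strict;
use warnings;

my $id = "cafe-1F";

# A list combined with intervals: every hexadecimal digit in the string
my @hex_digits = $id =~ /[0-9a-fA-F]/g;
# A negative set: every character that is NOT a lowercase a..f
my @outside    = $id =~ /[^a-f]/g;

print scalar(@hex_digits), "\n";   # 6
print join('', @outside), "\n";    # -1F
```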

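  12. Repetitions
     • * zero or more times
     • + one or more times
     • ? zero or one time
     • {n} exactly n times
     • {n,} at least n times
     • {n,m} at least n times, at most m times

     NOTES:
     • There is {n,} but there is no {,m}. Why? (Hint: {0,m} already expresses "at most m", but there is no number to write for "no upper limit" in {n,???}.)
     • Perl 5.34 and later also accept {,m} as another way of writing {0,m}.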
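
The counted forms are easiest to compare on a list of candidate strings. This is a minimal sketch with made-up test strings; grep keeps the strings that match each pattern.

```perl
use strict;
use warnings;

my @candidates = ("b", "ab", "aab", "aaab", "aaaab");

# {2,3}: two or three 'a' characters before the final 'b'
my @two_to_three = grep { /^a{2,3}b$/ } @candidates;
# {2,}: at least two 'a' characters
my @at_least_two = grep { /^a{2,}b$/ } @candidates;
# {0,1}: at most one 'a' character, the same as ?
my @at_most_one  = grep { /^a{0,1}b$/ } @candidates;

print "@two_to_three\n";   # aab aaab
print "@at_least_two\n";   # aab aaab aaaab
print "@at_most_one\n";    # b ab
```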

  13. Greedy repetition
     • Repetitions are greedy: they eat as many characters as possible
     • A ? after the repetition makes it non-greedy

     $text = 'Hook is not used on island Java.';
     $text =~ /(.*)is/;       #1 greedy: runs up to the last 'is'
     print "$1.\n";
     $text =~ /(.*?)is/;      #2 non-greedy: stops at the first 'is'
     print "$1.\n";
     $text =~ /(.*?)is.*n/;   #3
     print "$1.\n";

     OUTPUT:
     Hook is not used on .
     Hook .
     Hook .

  14. Other extensions
     • Other UNIX tools also use similar, but simpler, regular expressions
     • Perl regular expressions are more powerful

     A list of some extensions follows on the next slides.

  15. Regular expression comment

     (?# comment comes here)

     • Use comments!
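
An inline comment can sit directly after the piece of pattern it describes. The following sketch is an illustration only; the date string and variable names are assumptions, and \d (a digit) is a standard Perl meta character that the slides do not list.

```perl
use strict;
use warnings;

my $date = "2002-01-15";

# Each (?# ...) comment documents the group that precedes it.
# The comment text must not contain a closing parenthesis or the / delimiter.
my ($year, $month) =
    $date =~ /^(\d{4})(?# four-digit year)-(\d{2})(?# two-digit month)/;

print "$year / $month\n";   # 2002 / 01
```

For longer patterns, the x option discussed later in the deck is usually easier to read than many inline comments.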

  16. Regular Expression Parentheses
     • (?: sub expression w/o $n)

     (We have already discussed this construct because it came up in an earlier example, but this is its proper place.)

  17. Positive look forward

     (?= subregexp)

     $t = 'jamaica rum rum kingston rum';
     $t =~ s/([aeoui])(?=\w)/uc($1)/ge;
     print $t;

     OUTPUT:
     jAmAIca rUm rUm kIngstOn rUm

     Example: uppercase every vowel that is followed by another word character, i.e. every vowel that is not the last character of a word.

  18. Negative look forward

     (?! subregexp)

     $t = 'jamaica rum rum kingston rum';
     $t =~ s/([aeoui])(?!\w)/uc($1)/ge;
     print $t;

     OUTPUT:
     jamaicA rum rum kingston rum

     Example: uppercase every vowel standing at the end of a word.

  19. Option change inside the regular expression

     (?imsx)

     • This can be used inside the m/ or s/ operator.
     • Only the i, m, s and x options can be set this way; the g and o options (and e of the s operator) cannot.

     Now we go back to operator m/ and discuss some details.
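
The effect of an embedded option depends on where it stands in the pattern. A minimal sketch, with illustrative variable names that are not from the slides:

```perl
use strict;
use warnings;

# (?i) at the very start behaves like the i option on the whole pattern.
my $whole = ("apple" =~ /(?i)AppLe/) ? "match" : "no match";

# Placed after the first letter, it leaves that letter case sensitive.
my $lower_start = ("apple" =~ /A(?i)ppLe/) ? "match" : "no match";
my $upper_start = ("Apple" =~ /A(?i)ppLe/) ? "match" : "no match";

print "$whole\n";         # match
print "$lower_start\n";   # no match
print "$upper_start\n";   # match
```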

  20. M operator array result

     @k = "abbabaa" =~ m/(bb).+(a.)/;
     print $#k;
     print ' ',$k[0],' ',$k[1],"\n";

     OUTPUT:
     1 bb aa

     NOTES:
     • Parts of the expression are enclosed in ( )
     • In list context the match returns the captured substrings
     • $1, $2 ... are the default variables where the same substrings are put

  21. M operator option g (1)

     @k = "abbabaa" =~ m/(b)(a)/g;
     print $#k,' ',$k[0],' ',$k[1],' ',$k[2],' ',$k[3],"\n";

     OUTPUT:
     3 b a b a

     NOTES:
     • With option g in list context, the captured substrings of every match are returned one after the other

  22. M operator option g (2)

     $t = "abbabaa";
     while( $t =~ m/(ab)(b|a)/g ){
       # pos($t) is the position where the next search will start
       print pos($t)," $1 $2\n";
     }

     OUTPUT:
     3 ab b
     6 ab a

  23. M operator option i
     • Case-insensitive match

     print '.',"apple" =~ /AppLe/,".\n";
     print '.',"apple" =~ /AppLe/i,".\n";

     OUTPUT:
     ..
     .1.

  24. M operator options m and s

     $t = "mah\na\nb";
     while( $t =~ /(.?.)$/mg ){ print '.',$1; }
     print ".\n";
     while( $t =~ /(.?.)$/sg ){ print '.',$1; }
     print ".\n";
     while( $t =~ /(.?.)$/g ){ print '.',$1; }
     print ".\n";

     OUTPUT:
     .ah.a.b.
     .
     b.
     .b.

     NOTES:
     • m lets $ match before every \n in the string
     • s lets . match \n (otherwise . is any character but \n)

  25. M operator option o
     • Compile the regular expression (and interpolate its variables) only once, to save processor time

     $t = "al brab";
     $a = 'al'; $b = 'rab';
     &q; &p;
     $b = 'fe';
     &q; &p;
     sub q { print ' q',$t =~ /$a\sb$b/o }   # compiled once: keeps using $b = 'rab'
     sub p { print ' p',$t =~ /$a\sb$b/ }    # rebuilt on every call: sees $b = 'fe'

     OUTPUT:
      q1 p1 q1 p

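  26. M operator option x

     @k = "abbabaa" =~ m/(bb)   #exactly two 'b' characters, captured as the first group
                         .+     #one or more of any character
                         (a.)   #a letter 'a' and exactly one of any character
                        /x;     #space and comment allowed
     print $#k;
     print ' ',$k[0],' ',$k[1],"\n";

     OUTPUT:
     1 bb aa

     This option allows white space and comments inside the pattern to ease readability. White space in the pattern is ignored; write "\ " (backslash and space) to match a space. Variables are still interpolated inside the comments, and the comments must not contain the delimiter.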

  27. Operator s

     $text =~ s/regexp/replace/egimosx

     Options:
     • e replace is interpreted as an expression
     • g global search and replace
     • i case-insensitive search
     • m string is treated as multi-line
     • o regular expression is evaluated only once
     • s string is treated as single-line
     • x extended syntax for the regexp

  28. Global Search and Replace

     $t = "abbab";
     $t =~ s/ab/aa/g;
     print $t;

     OUTPUT:
     aabaa

     Option g replaces all occurrences of the search regular expression with the replacement string.

  29. m and s operators with different delimiters
     • / is the default, but you can use
       • ' to have a non-interpolated string
       • other non-alphanumeric characters
       • () {} [] with matching character pairs
     • With matching pairs the s operator is written s{search}{replace}

  30. m and s operators with different delimiters example

     $text = 'a@bba@bbabb';
     @b = ('bba');
     $text =~ s{@b}{q}g;
     print "$text\n";
     $text = 'a@bba@bbabb';
     $text =~ s'@b'q'g;
     print "$text\n";

     OUTPUT:
     a@q@qbb
     aqbaqbabb

     @b is evaluated (interpolated) in the first search but not in the second.

  31. Thank you for your kind attention.
