PRX Functions: There is Hardly Anything Regular About Them!

Ken Borowiak

Regular Expressions

A string that describes a PATTERN


Why Should You Care About Regex?

  • Flexibility

    • INDEX

    • Colon modifier

    • LIKE operator in a WHERE clause

  • Ubiquity

    • SAS V9

    • Oracle 10g

    • Java

    • Perl, grep, sed

    • Text Editors – SAS Enhanced Editor, TextPad, etc.

    • Applications – ODS Tagsets, more

  • Portable syntax

  • Tons of documentation

Assert your:

  • Geekness

  • Nerdness

  • Coolness


What Can You Do With Regex?

  • Match

    • Subsetting

    • Conditional logic

    • Validation


ODM – ISO Time Validation

<xs:simpleType name="time">

  <xs:restriction base="xs:time">

    <xs:pattern value="(((([0-1][0-9])|([2][0-3])):([0-5][0-9]):([0-5][0-9])(\.[0-9]+)?)(((\+|-)(([0-1][0-9])|([2][0-3])):[0-5][0-9])|(Z))?)"/>

  </xs:restriction>

</xs:simpleType>
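
The same pattern can be tried from SAS. The following is a minimal sketch, not part of the ODM schema: the pattern is copied from the slide, while the `^` and `\s*$` anchors are an assumption added here (XML Schema patterns match the whole value implicitly, and SAS pads character variables with trailing blanks). The test values are made up for illustration.

```sas
/* Sketch: validate time strings against the ODM pattern with PRXMATCH. */
/* Assumption: ^ and \s*$ are added to force a whole-value match.       */
data _null_ ;
  infile datalines truncover ;
  input t $char20. ;
  retain re ;
  /* Compile the pattern once, on the first iteration only */
  if _n_ = 1 then re = prxparse(
    '/^(((([0-1][0-9])|([2][0-3])):([0-5][0-9]):([0-5][0-9])(\.[0-9]+)?)(((\+|-)(([0-1][0-9])|([2][0-3])):[0-5][0-9])|(Z))?)\s*$/'
  ) ;
  valid = ( prxmatch(re, t) > 0 ) ;   /* 1 = passes, 0 = fails */
  put t= valid= ;
  datalines ;
12:30:00
23:59:59.123Z
08:15:00+05:30
24:00:00
7:05:00
;
run ;
```

The first three values should be flagged valid; `24:00:00` fails on the hour range and `7:05:00` fails because the hour needs two digits.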



What Can You Do with Regex?

  • Match

  • Extract

  • Substitution (Find-&-Replace)

    • Compression
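
Extraction and substitution are not demonstrated elsewhere in these slides, so here is a hedged sketch of both. It uses PRXPARSE, PRXPOSN and PRXCHANGE; the variable names (`title`, `squeezed`) and the input value with extra blanks are invented for illustration.

```sas
/* Sketch: extract a captured group and do a find-and-replace. */
data _null_ ;
  length name $40 title $10 squeezed $40 ;
  name = 'Mr.   Austin   D.   Powers' ;   /* made-up value with extra blanks */

  /* Extract: capture a leading 'mr' or 'dr' in any case */
  re = prxparse('/^(mr|dr)\b/i') ;
  if prxmatch(re, name) then title = prxposn(re, 1, name) ;

  /* Substitute: collapse every run of whitespace to one blank (-1 = all matches) */
  squeezed = prxchange('s/\s+/ /', -1, strip(name)) ;

  put title= squeezed= ;
run ;
```

Here `title` receives `Mr` and `squeezed` receives `Mr. Austin D. Powers`.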


PRX* Functions

  • New in SAS V9

  • Regex engine of Perl 5.6.1


Sample Data

MR Bigglesworth

Mini-mr biggggleswerth

Mr. Austin D. Powers

dr evil

MINI-ME(1/8th size of dr evil)

mr bIgglesWorTH

Mi$$e$ Vanessa Kensington

Sc0tt Evil
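
To follow along, the sample data can be loaded into a data set. This is a sketch under assumptions: the slides only show that the data set is called `characters` and the variable `name`; the length of 40 and the use of DATALINES are choices made here.

```sas
/* Sketch: build the CHARACTERS data set used in the examples. */
/* Assumption: a 40-character NAME variable is wide enough.    */
data characters ;
  infile datalines truncover ;
  input name $char40. ;
  datalines ;
MR Bigglesworth
Mini-mr biggggleswerth
Mr. Austin D. Powers
dr evil
MINI-ME(1/8th size of dr evil)
mr bIgglesWorTH
Mi$$e$ Vanessa Kensington
Sc0tt Evil
;
run ;
```

The observation numbers then line up with the `obs` column shown in the RESULT slides.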


Matching via PRXMATCH

proc print data=characters label ;

  where prxmatch('/Mr/', name) > 0 ;

run ;

RESULT

obs  name

3    Mr. Austin D. Powers

IMPORTANT POINT

The default setting is case-sensitive


Match 'M' followed by 'R' or 'r'

proc print data=characters label ;

  where prxmatch('/M[Rr]/', name) ;

run ;

CHARACTER CLASS: [Rr]

RESULT

obs  name

1    MR Bigglesworth

3    Mr. Austin D. Powers


Match 'M' followed by 'R' or 'rs'

proc print data=characters label ;

  where prxmatch('/M(R|rs)/', name) ;

run ;

Alternation: (R|rs)

RESULT

obs  name

1    MR Bigglesworth



Case Insensitive Search for ‘MR’

proc print data=characters label ;

  where prxmatch('/MR/i', name) ;

run ;

Modifier: i

RESULT

obs  name

1    MR Bigglesworth

2    Mini-mr biggggleswerth

3    Mr. Austin D. Powers

6    mr bIgglesWorTH


Case Insensitive Search for ‘MR’ at Start of Field

proc print data=characters label ;

  where prxmatch('/^MR/i', name) ;

run ;

Anchor: ^

RESULT

obs  name

1    MR Bigglesworth

3    Mr. Austin D. Powers

6    mr bIgglesWorTH


Metacharacters

  • [ Beginning of character class

  • ] End of character class

  • ^ Beginning of field anchor (1st pos of regex)

  • [^ ] Negated character class

  • ( Beginning of grouping for alternation
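
The negated character class is the one item in this list that the later slides do not exercise. A small sketch, assuming the goal is to flag names containing anything other than letters, blanks, periods and hyphens (the data set name `unusual` is invented here):

```sas
/* Sketch: keep only names with a character outside the allowed set. */
data unusual ;
  infile datalines truncover ;
  input name $char40. ;
  /* [^...] matches any single character NOT listed inside the brackets */
  if prxmatch('/[^A-Za-z .\-]/', name) ;
  datalines ;
Mr. Austin D. Powers
Mini-mr biggggleswerth
MINI-ME(1/8th size of dr evil)
Mi$$e$ Vanessa Kensington
Sc0tt Evil
;
run ;

proc print data=unusual ;
run ;
```

Only the last three names are kept: they contain a parenthesis, slash or digit, dollar signs, and a zero respectively. Trailing blanks do not trigger a match because the blank is inside the class.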


More Metacharacters

  • . Match any character

QUANTIFIERS

  • ? Match preceding subexpression 0 or 1 times

  • * Match preceding subexpression 0 or more times

  • + Match preceding subexpression 1 or more times


Matching a Metacharacter

Case Insensitive Search for ‘MR.’

proc print data=characters label ;

  where prxmatch('/MR./i', name) ;

run ;

RESULT

obs  name

1    MR Bigglesworth

2    Mini-mr biggggleswerth

3    Mr. Austin D. Powers

6    mr bIgglesWorTH

The unescaped '.' matches any character, so it matches the blank after 'MR' as well as the period.

proc print data=characters label ;

  where prxmatch('/MR\./i', name) ;

run ;

The '.' is ‘backwhacked’ (masked) with '\', so it matches only a literal period.

RESULT

obs  name

3    Mr. Austin D. Powers


Quantifiers

Find misspellings of ‘bigglesworth’

obs  name

1    MR Bigglesworth

2    Mini-mr biggggleswerth

6    mr bIgglesWorTH

'/bigg+lesw(o|e)rth/i'

The + quantifier applies only to the second ‘g’

'/big{2,}lesw(o|e)rth/i'

{2,} matches at least 2 ‘g’
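
As a quick check that the two patterns behave alike, the sketch below runs both against the three spellings plus two non-matching names. The flag variable names and the single-g value 'Mr Biglesworth' are made up for illustration.

```sas
/* Sketch: compare the + form and the {2,} form of the quantifier. */
data _null_ ;
  infile datalines truncover ;
  input name $char40. ;
  plus_form  = ( prxmatch('/bigg+lesw(o|e)rth/i',   name) > 0 ) ;
  brace_form = ( prxmatch('/big{2,}lesw(o|e)rth/i', name) > 0 ) ;
  put name= plus_form= brace_form= ;
  datalines ;
MR Bigglesworth
Mini-mr biggggleswerth
mr bIgglesWorTH
Mr Biglesworth
dr evil
;
run ;
```

Both flags are 1 for the first three names and 0 for the last two, since each pattern requires two or more consecutive ‘g’.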


Predefined Character Classes

  • \d Any digit, same as [0-9]

  • \D Any non-digit, same as [^0-9]

  • [[:digit:]] POSIX bracketed expression for a digit

  • \w Any word character, same as [A-Za-z0-9_]



Search for a Digit

prxmatch('/\d/', name) ;

RESULT

obs  name

5    MINI-ME(1/8th size of dr evil)

8    Sc0tt Evil


Search for a Digit

prxmatch('/[[:digit:]]/', name) ;

RESULT

obs  name

5    MINI-ME(1/8th size of dr evil)

8    Sc0tt Evil
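
PRXMATCH returns the position of the first match (0 when there is none), which is why it works as a true/false test in a WHERE clause. The sketch below shows the two digit classes side by side along with the returned position; the data set and variable names are invented here.

```sas
/* Sketch: \d and [[:digit:]] give the same answer; PRXMATCH returns a position. */
data digit_check ;
  infile datalines truncover ;
  input name $char40. ;
  perl_class      = ( prxmatch('/\d/',          name) > 0 ) ;
  posix_class     = ( prxmatch('/[[:digit:]]/', name) > 0 ) ;
  first_digit_pos = prxmatch('/\d/', name) ;   /* 0 if no digit is found */
  datalines ;
dr evil
MINI-ME(1/8th size of dr evil)
Sc0tt Evil
;
run ;

proc print data=digit_check ;
run ;
```

For these three names `first_digit_pos` is 0, 9 and 3.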


Quiz

Rewrite the following with PRX

where substr( ATC, 1, 3 )

in ( 'C01' 'C03' 'C07' 'C08' 'C09' ) ;


Solution

prxmatch( '/^C0[13789]/' , ATC ) ;

prxmatch( '/^C0[137-9]/' , ATC ) ;

prxmatch( '/^C0(1|3|7|8|9)/' , ATC ) ;
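
One way to convince yourself that the three solutions agree with the original SUBSTR logic is to compute all four flags side by side. This is a sketch only: the ATC values below are chosen for illustration and the data set and flag names are made up.

```sas
/* Sketch: compare the SUBSTR/IN test with the three PRX solutions. */
data atc_check ;
  input ATC $ ;
  by_substr = ( substr(ATC, 1, 3) in ('C01' 'C03' 'C07' 'C08' 'C09') ) ;
  by_class  = ( prxmatch('/^C0[13789]/',     ATC) > 0 ) ;
  by_range  = ( prxmatch('/^C0[137-9]/',     ATC) > 0 ) ;
  by_alt    = ( prxmatch('/^C0(1|3|7|8|9)/', ATC) > 0 ) ;
  datalines ;
C01AA05
C02AB01
C03CA01
C07AB02
C08CA01
C09AA02
C10AA01
A10BA02
;
run ;

proc print data=atc_check ;
run ;
```

All four flags should agree on every row: 1 for the C01, C03, C07, C08 and C09 codes and 0 for the rest.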


SUMMARY

  • PRX* functions are powerful

  • Learning curve can be steep

    • Start with easy tasks

  • They shine in the face of difficult tasks


Contact Info

Ken Borowiak

[email protected]

[email protected]

