PRX Functions: There is Hardly Anything Regular About Them!

Ken Borowiak


Regular Expressions

A string that describes a PATTERN


Why Should You Care About Regex?

  • Flexibility

    • INDEX

    • Colon modifier

    • LIKE operator in a WHERE clause

  • Ubiquity

    • SAS V9

    • Oracle 10g

    • Java

    • Perl, grep, sed

    • Text editors – SAS Enhanced Editor, TextPad, etc.

    • Applications – ODS tagsets, more

  • Portable syntax

  • Tons of documentation

Assert your:

  • Geekness

  • Nerdness

  • Coolness


What Can You Do With Regex?

  • Match

    • Subsetting

    • Conditional logic

    • Validation


ODM – ISO Time Validation

<xs:simpleType name="time">
  <xs:restriction base="xs:time">
    <xs:pattern value="(((([0-1][0-9])|([2][0-3])):([0-5][0-9]):([0-5][0-9])(\.[0-9]+)?)(((\+|-)(([0-1][0-9])|([2][0-3])):[0-5][0-9])|(Z))?)"/>
  </xs:restriction>
</xs:simpleType>
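The time pattern can also be tried outside an XML validator. The sketch below uses Python's `re` module as a stand-in, which is an assumption since the slide only shows the schema. XML Schema patterns have to match the whole value, so `re.fullmatch` is used. `is_iso_time` is a hypothetical helper name, and the test values are illustrative.

```python
import re

# Pattern copied from the xs:pattern facet on the slide, split in two for readability.
ISO_TIME = re.compile(
    r"(((([0-1][0-9])|([2][0-3])):([0-5][0-9]):([0-5][0-9])(\.[0-9]+)?)"
    r"(((\+|-)(([0-1][0-9])|([2][0-3])):[0-5][0-9])|(Z))?)"
)

def is_iso_time(value):
    """Return True when the whole string satisfies the time pattern."""
    return ISO_TIME.fullmatch(value) is not None

# Illustrative values: the first three are well formed, the last two are not.
for candidate in ("23:59:59", "08:15:30.25+05:30", "12:30:00Z", "24:00:00", "12:60:00"):
    print(candidate, is_iso_time(candidate))
```

Hours are limited to 00-23 and minutes and seconds to 00-59, so "24:00:00" and "12:60:00" are rejected.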


What Can You Do With Regex?

  • Match

  • Extract

  • Substitution (Find-&-Replace)

    • Compression
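The slides that follow concentrate on matching, so here is a small sketch of the other two tasks. It uses Python's `re` module rather than SAS, and it reads "compression" as substitution with an empty replacement; both choices are assumptions. In SAS these jobs would typically go to functions such as PRXCHANGE and PRXPOSN, which this deck does not demonstrate.

```python
import re

# Substitution (find-and-replace): turn every '$' into 's'.
fixed = re.sub(r"\$", "s", "Mi$$e$ Vanessa Kensington")
print(fixed)

# Compression, read here as substitution with an empty string: drop the digits.
compressed = re.sub(r"\d", "", "Sc0tt Evil")
print(compressed)

# Extraction: capture the fraction that follows the opening parenthesis.
found = re.search(r"\((\d+/\d+)", "MINI-ME(1/8th size of dr evil)")
fraction = found.group(1) if found else None
print(fraction)
```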


PRX* Functions

  • New in SAS V9

  • Regex engine of Perl 5.6.1


Sample Data

obs name

1 MR Bigglesworth
2 Mini-mr biggggleswerth
3 Mr. Austin D. Powers
4 dr evil
5 MINI-ME(1/8th size of dr evil)
6 mr bIgglesWorTH
7 Mi$$e$ Vanessa Kensington
8 Sc0tt Evil


Matching via PRXMATCH

proc print data=characters label ;
  where prxmatch('/Mr/', name) > 0 ;
run ;

RESULT

obs name

3 Mr. Austin D. Powers
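PRXMATCH returns the position where the match starts, or 0 when there is no match, which is why the WHERE clause tests for a value greater than 0. The sketch below mimics that behaviour with Python's `re` module as a stand-in for the SAS function. `prxmatch_like` is a hypothetical helper, and list position plays the role of the obs number.

```python
import re

# Sample names from the slides; index + 1 stands in for the obs number.
names = [
    "MR Bigglesworth",
    "Mini-mr biggggleswerth",
    "Mr. Austin D. Powers",
    "dr evil",
    "MINI-ME(1/8th size of dr evil)",
    "mr bIgglesWorTH",
    "Mi$$e$ Vanessa Kensington",
    "Sc0tt Evil",
]

def prxmatch_like(pattern, text):
    """Return the 1-based start of the first match, or 0 if there is none."""
    found = re.search(pattern, text)
    return found.start() + 1 if found else 0

# Keep only the rows where the case-sensitive pattern 'Mr' is found.
hits = [(obs, name) for obs, name in enumerate(names, start=1)
        if prxmatch_like(r"Mr", name) > 0]
print(hits)
```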


IMPORTANT POINT

The default setting is case-sensitive.


Match 'M' followed by 'R' or 'r'

proc print data=characters label ;
  where prxmatch('/M[Rr]/', name) ;
run ;

CHARACTER CLASS: [Rr]

RESULT

obs name

1 MR Bigglesworth
3 Mr. Austin D. Powers
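A character class matches exactly one character drawn from the listed set. The sketch below repeats the search with Python's `re` module (an assumption; the slides use SAS) and checks that `[Rr]` behaves like the single-character alternation `(R|r)`.

```python
import re

# Sample names from the slides; index + 1 stands in for the obs number.
names = [
    "MR Bigglesworth",
    "Mini-mr biggggleswerth",
    "Mr. Austin D. Powers",
    "dr evil",
    "MINI-ME(1/8th size of dr evil)",
    "mr bIgglesWorTH",
    "Mi$$e$ Vanessa Kensington",
    "Sc0tt Evil",
]

# 'M' followed by one character from the set {R, r}.
class_hits = [obs for obs, name in enumerate(names, start=1)
              if re.search(r"M[Rr]", name)]

# The same search written as an alternation of single characters.
alt_hits = [obs for obs, name in enumerate(names, start=1)
            if re.search(r"M(R|r)", name)]

print(class_hits, alt_hits)
```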


Match 'M' followed by 'R' or 'rs'

proc print data=characters label ;
  where prxmatch('/M(R|rs)/', name) ;
run ;

ALTERNATION: (R|rs)

RESULT

obs name

1 MR Bigglesworth
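Alternation chooses between whole subexpressions, so `(R|rs)` is not the same as the class `[Rrs]`, which matches a single character. A sketch with Python's `re` module (an assumption; the slides use SAS) makes the difference visible. The strings "Mrs Kensington" and "Ms Kensington" are hypothetical and not part of the sample data.

```python
import re

# Sample names from the slides; index + 1 stands in for the obs number.
names = [
    "MR Bigglesworth",
    "Mini-mr biggggleswerth",
    "Mr. Austin D. Powers",
    "dr evil",
    "MINI-ME(1/8th size of dr evil)",
    "mr bIgglesWorTH",
    "Mi$$e$ Vanessa Kensington",
    "Sc0tt Evil",
]

# 'M' followed by either 'R' or the two-character string 'rs'.
hits = [obs for obs, name in enumerate(names, start=1)
        if re.search(r"M(R|rs)", name)]
print(hits)

# Hypothetical strings: alternation needs the full 'rs', a class takes any one of R, r, s.
alt_mrs = bool(re.search(r"M(R|rs)", "Mrs Kensington"))
alt_ms = bool(re.search(r"M(R|rs)", "Ms Kensington"))
class_ms = bool(re.search(r"M[Rrs]", "Ms Kensington"))
print(alt_mrs, alt_ms, class_ms)
```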


Case Insensitive Search for ‘MR’

proc print data=characters label ;
  where prxmatch('/MR/i', name) ;
run ;

MODIFIER: i

RESULT

obs name

1 MR Bigglesworth
2 Mini-mr biggggleswerth
3 Mr. Austin D. Powers
6 mr bIgglesWorTH


Case Insensitive Search for ‘MR’ at Start of Field

proc print data=characters label ;
  where prxmatch('/^MR/i', name) ;
run ;

ANCHOR: ^

RESULT

obs name

1 MR Bigglesworth
3 Mr. Austin D. Powers
6 mr bIgglesWorTH
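The effect of the `i` modifier and the `^` anchor can be compared side by side. The sketch below uses Python's `re` module as a stand-in for PRXMATCH (an assumption), where `re.IGNORECASE` or an inline `(?i)` plays the part of the trailing `i`. `obs_matching` is a hypothetical helper.

```python
import re

# Sample names from the slides; index + 1 stands in for the obs number.
names = [
    "MR Bigglesworth",
    "Mini-mr biggggleswerth",
    "Mr. Austin D. Powers",
    "dr evil",
    "MINI-ME(1/8th size of dr evil)",
    "mr bIgglesWorTH",
    "Mi$$e$ Vanessa Kensington",
    "Sc0tt Evil",
]

def obs_matching(pattern, flags=0):
    """Return the obs numbers of the names in which the pattern is found."""
    regex = re.compile(pattern, flags)
    return [obs for obs, name in enumerate(names, start=1) if regex.search(name)]

anywhere = obs_matching(r"MR", re.IGNORECASE)   # 'mr' in any case, anywhere
at_start = obs_matching(r"^MR", re.IGNORECASE)  # 'mr' in any case, first position only
inline = obs_matching(r"(?i)^MR")               # same search with an inline modifier

print(anywhere, at_start, inline)
```

Obs 2 drops out once the anchor is added, because its "mr" sits in the middle of the field.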


Metacharacters

  • [ Beginning of character class

  • ] End of character class

  • ^ Beginning of field anchor (1st pos of regex)

  • [^ ] Negated character class

  • ( Beginning of grouping for alternation


More Metacharacters

  • . Match any character

QUANTIFIERS

  • ? Match preceding subexpression 0 or 1 times

  • * Match preceding subexpression 0 or more times

  • + Match preceding subexpression 1 or more times


Matching a Metacharacter

Case Insensitive Search for ‘MR.’

proc print data=characters label ;
  where prxmatch('/MR./i', name) ;
run ;

RESULT

obs name

1 MR_Bigglesworth
2 Mini-mr_biggggleswerth
3 Mr. Austin D. Powers
6 mr_bIgglesWorTH

An unescaped '.' matches any character; the underscores appear to mark the blank it matched.

proc print data=characters label ;
  where prxmatch('/MR\./i', name) ;
run ;

The period is ‘backwhacked’ (masked) with a backslash.

RESULT

obs name

3 Mr. Austin D. Powers
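The two searches differ only by one backslash. The sketch below runs both with Python's `re` module (an assumption; the slides use SAS). `obs_matching` is a hypothetical helper, and `re.escape` is Python's own way of masking metacharacters in a literal string.

```python
import re

# Sample names from the slides; index + 1 stands in for the obs number.
names = [
    "MR Bigglesworth",
    "Mini-mr biggggleswerth",
    "Mr. Austin D. Powers",
    "dr evil",
    "MINI-ME(1/8th size of dr evil)",
    "mr bIgglesWorTH",
    "Mi$$e$ Vanessa Kensington",
    "Sc0tt Evil",
]

def obs_matching(pattern):
    """Return the obs numbers whose name matches, ignoring case."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [obs for obs, name in enumerate(names, start=1) if regex.search(name)]

any_char = obs_matching(r"MR.")    # '.' is a metacharacter: any character after 'mr'
literal = obs_matching(r"MR\.")    # '\.' is a literal period
escaped = re.escape("MR.")         # builds the masked pattern from plain text

print(any_char, literal, escaped)
```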


Quantifiers

Find misspellings of ‘bigglesworth’

obs name

1 MR Bigglesworth
2 Mini-mr biggggleswerth
6 mr bIgglesWorTH

'/bigg+lesw(o|e)rth/i'

The quantifier applies only to the second ‘g’.

'/big{2,}lesw(o|e)rth/i'

Match at least 2 ‘g’s.
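Both patterns ask for two or more consecutive ‘g’s, so they should select the same rows. The sketch below checks that with Python's `re` module (an assumption; the slides use SAS). The single-g spelling "Biglesworth" is a hypothetical test string, not part of the sample data.

```python
import re

# Sample names from the slides; index + 1 stands in for the obs number.
names = [
    "MR Bigglesworth",
    "Mini-mr biggggleswerth",
    "Mr. Austin D. Powers",
    "dr evil",
    "MINI-ME(1/8th size of dr evil)",
    "mr bIgglesWorTH",
    "Mi$$e$ Vanessa Kensington",
    "Sc0tt Evil",
]

def obs_matching(pattern):
    """Return the obs numbers whose name matches, ignoring case."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [obs for obs, name in enumerate(names, start=1) if regex.search(name)]

plus_form = obs_matching(r"bigg+lesw(o|e)rth")     # one 'g', then one or more 'g'
brace_form = obs_matching(r"big{2,}lesw(o|e)rth")  # 'g' repeated at least twice

# Hypothetical spelling with a single 'g': too few for either pattern.
single_g = bool(re.search(r"big{2,}lesw(o|e)rth", "Biglesworth", re.IGNORECASE))

print(plus_form, brace_form, single_g)
```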


Predefined Character Classes

  • \d Any digit: [0-9]

  • \D Any non-digit: [^0-9]

  • [[:digit:]] POSIX bracketed expression

  • \w Any word character: [A-Za-z0-9_]


Search for a Digit

prxmatch('/\d/', name) ;

RESULT

obs name

5 MINI-ME(1/8th size of dr evil)
8 Sc0tt Evil

prxmatch('/[[:digit:]]/', name) ;

RESULT

obs name

5 MINI-ME(1/8th size of dr evil)
8 Sc0tt Evil
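The shorthand classes carry over to other regex engines with small differences. The sketch below uses Python's `re` module (an assumption; the slides use SAS). Python does not support POSIX bracketed expressions such as `[[:digit:]]`, so the explicit class `[0-9]` is used in its place. The `re.ASCII` flag keeps `\d` and `\w` limited to the ASCII ranges listed on the slide.

```python
import re

# Sample names from the slides; index + 1 stands in for the obs number.
names = [
    "MR Bigglesworth",
    "Mini-mr biggggleswerth",
    "Mr. Austin D. Powers",
    "dr evil",
    "MINI-ME(1/8th size of dr evil)",
    "mr bIgglesWorTH",
    "Mi$$e$ Vanessa Kensington",
    "Sc0tt Evil",
]

# Rows that contain a digit, using the shorthand and the explicit class.
shorthand = [obs for obs, name in enumerate(names, start=1)
             if re.search(r"\d", name, re.ASCII)]
explicit = [obs for obs, name in enumerate(names, start=1)
            if re.search(r"[0-9]", name)]

# Characters in obs 7 that are neither word characters nor blanks.
odd_chars = re.findall(r"[^\w ]", names[6], re.ASCII)

print(shorthand, explicit, odd_chars)
```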


Quiz

Rewrite the following with PRX

where substr( ATC, 1, 3 )
  in ( 'C01' 'C03' 'C07' 'C08' 'C09' ) ;


Solution

prxmatch( '/^C0[13789]/' , ATC ) ;

prxmatch( '/^C0[137-9]/' , ATC ) ;

prxmatch( '/^C0(1|3|7|8|9)/' , ATC ) ;
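One way to gain confidence in the three solutions is to compare each against the original SUBSTR logic on a handful of codes. The sketch below does that with Python's `re` module (an assumption; the slides use SAS). The codes in `atc_codes` are illustrative values chosen for this check, not data from the presentation.

```python
import re

# Illustrative codes (assumed for this check, not from the slides).
atc_codes = ["C01AA05", "C02AC01", "C03CA01", "C07AB02", "C08CA01",
             "C09AA02", "C10AA01", "A10BA02", "XC01"]

wanted = {"C01", "C03", "C07", "C08", "C09"}

# Reference answer: the first three characters are in the wanted list.
by_substr = [code for code in atc_codes if code[:3] in wanted]

# The three regex solutions from the slide.
solutions = [r"^C0[13789]", r"^C0[137-9]", r"^C0(1|3|7|8|9)"]
by_regex = [[code for code in atc_codes if re.search(pattern, code)]
            for pattern in solutions]

print(by_substr)
print(all(result == by_substr for result in by_regex))
```

The `^` anchor matters here: without it, "XC01" would also be selected.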


SUMMARY

  • PRX* functions are powerful

  • Learning curve can be steep

    • Start with an easy task

  • Shine in the face of difficult tasks


Contact Info

Ken Borowiak

[email protected]

[email protected]

