This guide provides a comprehensive overview of using regular expressions in PHP for pattern matching and string manipulation. Learn how to define and compare patterns, test for matches, and replace matched strings using PHP's built-in functions. Explore different types of regular expression engines, including PCRE and POSIX, and understand the significance of metacharacters in crafting complex patterns. Gain insights into searching for exact matches, case sensitivity, and handling multiple occurrences of characters. Perfect for beginner to advanced PHP developers.
ECA 236 Open Source Server Side Scripting
PHP Regular Expressions
regular expressions
• pattern matching
  • define a pattern
  • compare it to a string
  • test for a match
  • replace the match if desired
• example pattern: ^([0-9]{3})-([0-9]{2})-([0-9]{4})$
RE functions
• 2 functions to search strings for a match
  • ereg( ): performs a case sensitive search
  • eregi( ): performs a case insensitive search
• 2 functions to replace matched strings
  • ereg_replace( ): performs a case sensitive search and replace
  • eregi_replace( ): performs a case insensitive search and replace
• note: the ereg family was deprecated in PHP 5.3 and removed in PHP 7; preg_match( ) and preg_replace( ) are the PCRE counterparts
RE types
• PCRE: Perl Compatible Regular Expressions
  • more powerful
  • harder to learn
  • can be used on binary data
• POSIX Extended
  • less powerful
  • a little slower
  • easier to learn than PCRE
defining a pattern
• a regular expression defines a pattern to search against
  • specific set of rules
  • simple to complex
• simplest pattern to match is a string literal
  • single letter
  • string of letters
defining a pattern cont …
• for example, to search for the string "cak", case insensitive
  • $pattern = "cak";
• a match will be returned for the following strings
  • I’m flying out of CAK on Wednesday.
  • I took my baby to a cake walk Saturday night.
  • Mud was caked to my dog’s belly.
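The literal search above can be tried out with a short script. This is only a sketch: it assumes PHP 7 or later, where eregi( ) is not available, so preg_match( ) with the i flag is swapped in. The fourth test string is made up as a counterexample.

```php
<?php
// PCRE stand-in for eregi("cak", $string). PCRE patterns need delimiters
// ('/' here); the trailing i makes the search case insensitive.
$pattern = '/cak/i';

$strings = [
    "I'm flying out of CAK on Wednesday.",
    "I took my baby to a cake walk Saturday night.",
    "Mud was caked to my dog's belly.",
    "No such letters here.",   // made-up counterexample
];

$results = [];
foreach ($strings as $s) {
    // preg_match() returns 1 when the pattern is found anywhere in $s.
    $found = preg_match($pattern, $s) === 1;
    $results[] = $found;
    echo ($found ? 'match:    ' : 'no match: '), $s, "\n";
}
```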
metacharacters
• special symbols that have meaning beyond their literal meaning
• a period means any single character
  • letter
  • number
  • space
  • symbol
metacharacters cont …
• the regular expression
  • $pattern = "ca.k";
• will not match any of the previous examples, but (case insensitive) will match
  • She cackled like a chicken when I slipped.
  • cask of Amontillado
  • 300 meters off Cape Rocca K 2 conducts dive exercises.
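A small check of the period metacharacter, using the example strings from this slide and the previous one. As an assumption, preg_match( ) with the i flag stands in for eregi( ), which PHP 7 and later no longer provide.

```php
<?php
// The period matches exactly one character of any kind, so "ca.k" needs
// one character between "ca" and "k".
$pattern = '/ca.k/i';

$tests = [
    'She cackled like a chicken when I slipped.',              // "cack"
    'cask of Amontillado',                                     // "cask"
    '300 meters off Cape Rocca K 2 conducts dive exercises.',  // "ca K"
    'I took my baby to a cake walk Saturday night.',           // "cak" + "e": no match
    'Mud was caked to my dog\'s belly.',                       // no match
];

$results = [];
foreach ($tests as $s) {
    $results[] = preg_match($pattern, $s) === 1;
}
var_dump($results);
```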
metacharacters cont …
• to match any of these metacharacters as a string literal, escape it with a backslash, e.g. \. matches a literal period
metacharacters cont …
• the caret and the dollar sign specify where characters must be found
• ^ caret
  • the following characters must be at the beginning of the string
  • $pattern = "^mush.";
  • will match (case insensitive): Mushrooms are yummy.
  • but not: I love yummy mushrooms.
metacharacters cont …
• $ dollar sign
  • the preceding characters must be at the end of the string
  • $pattern = "mush\.$";
  • will match: I love fried mush.
  • but not: I love yummy mushrooms.
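The two anchors can be checked side by side. This sketch assumes PHP 7 or later and therefore uses preg_match( ) rather than ereg( ) and eregi( ); the variable names are illustrative.

```php
<?php
// ^ ties the match to the start of the string; the i flag mirrors eregi().
$startPattern = '/^mush./i';
// $ ties the match to the end; \. is a literal period, not "any character".
$endPattern = '/mush\.$/';

$atStart    = preg_match($startPattern, 'Mushrooms are yummy.');
$notAtStart = preg_match($startPattern, 'I love yummy mushrooms.');
$atEnd      = preg_match($endPattern, 'I love fried mush.');
$notAtEnd   = preg_match($endPattern, 'I love yummy mushrooms.');

// preg_match() returns 1 for a match and 0 for no match.
var_dump($atStart, $notAtStart, $atEnd, $notAtEnd);
```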
metacharacters cont …
• | pipe
  • OR
  • $pattern = "t|f";
  • will match (case insensitive): Number six is T.
  • but will also match: Do you know the answer to number six?
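metacharacters cont …
• | pipe peccadillo
  • $pattern = "true|false";
  • the pipe has the lowest precedence, so this means "true" OR "false"
  • the pattern is unanchored, so it will also match strings that merely contain either word, such as trufalse or truealse
• to match whole words only, group the alternatives using parentheses and anchor the group
  • $pattern = "^(true|false)$";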
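How the pipe interacts with parentheses and anchors can be seen in a quick experiment. The sketch below uses preg_match( ) as a stand-in for ereg( ) (an assumption for PHP 7 and later); the pattern names $loose, $strict, and $narrow are made up for illustration.

```php
<?php
// Unanchored alternation: succeeds if "true" or "false" appears anywhere.
$loose = '/true|false/';
// Grouped and anchored: the whole string must be "true" or "false".
$strict = '/^(true|false)$/';
// Parentheses limit the alternation to a single position.
$narrow = '/^tru(e|f)alse$/';

$looseHit    = preg_match($loose, 'trufalse');   // contains "false"
$strictMiss  = preg_match($strict, 'trufalse');  // not a whole word
$strictHit   = preg_match($strict, 'false');
$narrowHit   = preg_match($narrow, 'truealse');
$narrowMiss  = preg_match($narrow, 'true');

var_dump($looseHit, $strictMiss, $strictHit, $narrowHit, $narrowMiss);
```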
matching functions
• ereg( )
  • performs a case sensitive search
• eregi( )
  • performs a case insensitive search
• both functions return a value that evaluates to TRUE if a match is found, FALSE otherwise
  • ereg( "pattern", "string" );
  • $pattern = "pattern";
    $string = "string";
    ereg( $pattern, $string );
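On PHP 7 and later the ereg functions are not available, so the same test has to go through preg_match( ). The following is a minimal sketch of that swap; the helper name matches( ) is hypothetical and not part of PHP, and it assumes the pattern contains no already-escaped "/" characters.

```php
<?php
// Hypothetical helper (not a PHP built-in): wraps an ereg-style pattern in
// '/' delimiters so that preg_match() can run it.
function matches(string $pattern, string $string, bool $ignoreCase = false): bool
{
    // Escape any '/' inside the pattern so it cannot end the delimiter early.
    $regex = '/' . str_replace('/', '\/', $pattern) . '/' . ($ignoreCase ? 'i' : '');
    return preg_match($regex, $string) === 1;
}

$pattern = "cak";
$string  = "I'm flying out of CAK on Wednesday.";

$caseSensitive   = matches($pattern, $string);        // plays the role of ereg()
$caseInsensitive = matches($pattern, $string, true);  // plays the role of eregi()

var_dump($caseSensitive, $caseInsensitive);
```

The case sensitive call fails because the string only contains the uppercase "CAK".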
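metacharacters cont …
• to match multiple occurrences of the previous character
• ? question mark
  • 0 or 1 instance of the previous character
  • $pattern = "r?";
  • accepts a single r, as in: Mary is coming.
  • but not a run of r's, as in: Harry is coming. or Arrrrfff, howled the dog.
  • note: on its own, an unanchored "r?" can match zero characters and so succeeds on any string; the limit only shows inside a longer pattern, e.g. "^Mar?y" matches May and Mary but not Marry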
metacharacters cont …
• * asterisk
  • 0 or more instances of the previous character
  • $pattern = "r*";
  • will match: Bob is coming.
  • will match: Harry is coming.
  • will match: Arrrrfff, howled the dog.
metacharacters cont …
• + plus sign
  • 1 or more instances of the previous character
  • $pattern = "r+";
  • will match: Mary is coming.
  • will match: Arrrrfff, howled the dog.
  • but not: Bob is coming.
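The three quantifiers can be compared directly. One caveat the sketch makes visible: because ? and * allow zero occurrences, an unanchored "r?" or "r*" succeeds on any string, so a longer pattern ("^Mar?y", chosen here for illustration) is needed to see ? reject something. preg_match( ) is assumed in place of ereg( ).

```php
<?php
// + needs at least one r.
$plusMary = preg_match('/r+/', 'Mary is coming.');
$plusArf  = preg_match('/r+/', 'Arrrrfff, howled the dog.');
$plusBob  = preg_match('/r+/', 'Bob is coming.');

// * and ? both accept zero r's, so even "Bob" matches.
$starBob     = preg_match('/r*/', 'Bob is coming.');
$questionBob = preg_match('/r?/', 'Bob is coming.');

// Inside a longer, anchored pattern ? really means "at most one r".
$may   = preg_match('/^Mar?y/', 'May is coming.');
$mary  = preg_match('/^Mar?y/', 'Mary is coming.');
$marry = preg_match('/^Mar?y/', 'Marry is coming.');

var_dump($plusMary, $plusArf, $plusBob, $starBob, $questionBob, $may, $mary, $marry);
```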
metacharacters cont …
• to match a specific number of occurrences
• {x}
  • exactly that number of the previous character
  • $pattern = "^v{3}$";
  • will match: vvv
  • but not: vvvvvvv
  • the anchors are needed: an unanchored "v{3}" also matches vvvvvvv, because that string contains vvv
metacharacters cont …
• {x,y}
  • occurs between and including the 2 numbers
  • $pattern = "^a{2,4}$";
  • will match: aa, aaa, aaaa
  • but not: aaaaaa
metacharacters cont …
• {x,}
  • occurs at least the number of times indicated
  • $pattern = "m{2,}";
  • will match: mmmmmmmmm
  • but not: m
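The brace quantifiers count repetitions, but the count only limits the whole string when the pattern is anchored. The sketch below illustrates this with preg_match( ) (assumed here in place of ereg( )); the anchored patterns are an illustrative choice.

```php
<?php
// Exactly three v's, anchored so nothing else may surround them.
$exact3      = preg_match('/^v{3}$/', 'vvv');
$exact3Long  = preg_match('/^v{3}$/', 'vvvvvvv');
// Without anchors the longer string still matches, since it contains "vvv".
$unanchored  = preg_match('/v{3}/', 'vvvvvvv');

// Between two and four a's.
$range = [];
foreach (['aa', 'aaa', 'aaaa', 'aaaaaa'] as $s) {
    $range[] = preg_match('/^a{2,4}$/', $s);
}

// At least two m's.
$atLeastMany = preg_match('/m{2,}/', 'mmmmmmmmm');
$atLeastOne  = preg_match('/m{2,}/', 'm');

var_dump($exact3, $exact3Long, $unanchored, $range, $atLeastMany, $atLeastOne);
```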
character classes
• created by placing certain characters inside square brackets
• use of a hyphen indicates a range
• will match any lowercase letter from a to z
  • $pattern = "[a-z]";
• will match any uppercase letter from M to R
  • $pattern = "[M-R]";
character classes cont …
• provide a range for numbers
  • $pattern = "[0-9]";
  • $pattern = "[1-9]";
• combine character class with metacharacters
  • $pattern = "[a-zA-Z]{2,4}";
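The range classes can be exercised with a small table of checks. This is a sketch using preg_match( ) (assumed as the replacement for ereg( ) on PHP 7.1 and later); the test strings are made up.

```php
<?php
// Each row: pattern, made-up test string, whether a match is expected.
$checks = [
    ['/[a-z]/',          'PHP 7',   false],  // no lowercase letter
    ['/[a-z]/',          'Php',     true],
    ['/[M-R]/',          'PHP',     true],   // P lies between M and R
    ['/[M-R]/',          'CSS',     false],
    ['/[0-9]/',          'zero: 0', true],
    ['/[1-9]/',          'zero: 0', false],  // 0 is outside 1-9
    ['/[a-zA-Z]{2,4}/',  'ab12',    true],   // two letters in a row
    ['/[a-zA-Z]{2,4}/',  'a1b2',    false],  // letters never adjacent
];

$allAsExpected = true;
foreach ($checks as [$pattern, $string, $expected]) {
    $actual = preg_match($pattern, $string) === 1;
    if ($actual !== $expected) {
        $allAsExpected = false;
    }
    echo $pattern, ' on "', $string, '": ', ($actual ? 'match' : 'no match'), "\n";
}
```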
character classes cont …
• test for vowels
  • $pattern = "[aeiou]";
  • $pattern = "[aeiou]{2}";
• predefined character classes
• test for any letter or number
  • $pattern = "[[:alnum:]]";
character classes cont …
• test for any letter, lower or upper case
  • $pattern = "[[:alpha:]]";
• test for any tab or space
  • $pattern = "[[:blank:]]";
• test for any number
  • $pattern = "[[:digit:]]";
character classes cont …
• test for any lower case letter
  • $pattern = "[[:lower:]]";
• test for any upper case letter
  • $pattern = "[[:upper:]]";
• test for any punctuation character
  • $pattern = "[[:punct:]]";
character classes cont …
• test for any white space
  • $pattern = "[[:space:]]";
• test for any three letter word
  • $pattern = "[[:space:]][[:alpha:]]{3}[[:space:]]";
• test for any combination of three digits
  • $pattern = "[[:digit:]]{3}";
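The predefined classes also work inside PCRE bracket expressions, so they can be tried with preg_match( ) (used here as an assumed stand-in for ereg( )). The test strings are made up for illustration.

```php
<?php
$alnum  = preg_match('/[[:alnum:]]/', '---');       // no letter or digit
$blank  = preg_match('/[[:blank:]]/', "tab\there"); // contains a tab
$punct  = preg_match('/[[:punct:]]/', 'Hello!');
$upper  = preg_match('/[[:upper:]]/', 'php');
$digits = preg_match('/[[:digit:]]{3}/', 'room 101');
$twoDig = preg_match('/[[:digit:]]{3}/', 'room 42');

// A three letter word, defined as three letters with white space on each side.
$word   = '/[[:space:]][[:alpha:]]{3}[[:space:]]/';
$inside = preg_match($word, 'I saw the cat today');
$alone  = preg_match($word, 'saw');                 // no surrounding space

var_dump($alnum, $blank, $punct, $upper, $digits, $twoDig, $inside, $alone);
```

The last check shows a limitation of the three letter word pattern: a word at the very start or end of the string has no white space on one side and is not matched.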
character classes cont …
• inside square brackets a caret ^ has special meaning
  • indicates exclusion
  • $pattern = "[^aeiou]";
• to test for a social security number, xxx-xx-xxxx
  • $pattern = "^([0-9]{3})-([0-9]{2})-([0-9]{4})$";
• ???
  • "^\(?([0-9]{3})\)?(-|\.|[[:space:]]|/)?([0-9]{3})(-|\.|[[:space:]]|/)?([0-9]{4})$"
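A brief sketch of the exclusion class and the social security pattern, run through preg_match( ) (assumed here because ereg( ) is unavailable on PHP 7 and later). The sample number is made up; the optional third argument $groups collects the parenthesized parts.

```php
<?php
// [^aeiou] matches any single character that is NOT a lowercase vowel.
$onlyVowels = preg_match('/[^aeiou]/', 'aeiou');  // nothing but vowels
$hasOther   = preg_match('/[^aeiou]/', 'audio');  // the "d" qualifies

// Three digits, hyphen, two digits, hyphen, four digits, nothing else.
$ssnPattern = '/^([0-9]{3})-([0-9]{2})-([0-9]{4})$/';

$valid   = preg_match($ssnPattern, '123-45-6789', $groups);
$invalid = preg_match($ssnPattern, '123456789');

// $groups[0] is the whole match; [1], [2], [3] are the three groups.
print_r($groups);
```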
phone number
• phone
  • parentheses around area code are optional
  • set of three numbers, followed by three numbers, followed by four numbers
  • sets of numbers generally separated by hyphen, period, space, forward slash, or nothing
• "^\(?([0-9]{3})\)?(-|\.|[[:space:]]|/)?([0-9]{3})(-|\.|[[:space:]]|/)?([0-9]{4})$"
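The phone pattern can be tested against a few made-up numbers. In this sketch preg_match( ) replaces ereg( ) (an assumption for PHP 7 and later), and "~" is chosen as the delimiter because the pattern itself contains a forward slash.

```php
<?php
// Same pattern as on the slide, wrapped in '~' delimiters for PCRE.
$phonePattern = '~^\(?([0-9]{3})\)?(-|\.|[[:space:]]|/)?([0-9]{3})(-|\.|[[:space:]]|/)?([0-9]{4})$~';

// Made-up sample numbers.
$samples = [
    '(330) 555-1234',
    '330.555.1234',
    '330/555/1234',
    '3305551234',
    '555-1234',        // too short: no area code
    '(330 555-1234',   // unbalanced parenthesis
];

$results = [];
foreach ($samples as $sample) {
    $results[] = preg_match($phonePattern, $sample) === 1;
    echo $sample, ': ', (end($results) ? 'match' : 'no match'), "\n";
}
```

Because each parenthesis is optional on its own, the pattern also accepts the unbalanced last sample; a stricter check would need a separate alternative for the parenthesized form.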
replacement functions
• ereg_replace( )
  • performs a case sensitive search and replace
• eregi_replace( )
  • performs a case insensitive search and replace
• both functions take 3 parameters
  • pattern to be matched
  • replacement string
  • string to be tested
replacement functions
• both functions return the modified string; if no match is found, the original string is returned unchanged
  • ereg_replace( "pattern", "replacement", "string" );
  • $pattern = "pattern";
    $replacement = "replacement";
    $string = "string";
    $result = eregi_replace( $pattern, $replacement, $string );
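A sketch of search and replace on PHP 7 and later, where preg_replace( ) is assumed as the stand-in for ereg_replace( ) and eregi_replace( ). It takes the same three parameters in the same order; the replacement words are made up.

```php
<?php
$pattern     = '/mush\.$/';   // "mush." at the end of the string
$replacement = 'okra.';

// A match is found, so the returned string contains the replacement.
$changed = preg_replace($pattern, $replacement, 'I love fried mush.');

// No match: the original string comes back unchanged.
$unchanged = preg_replace($pattern, $replacement, 'I love yummy mushrooms.');

// The i flag gives the case insensitive behavior of eregi_replace().
$anyCase = preg_replace('/MUSH/i', 'grits', 'I love fried mush.');

echo $changed, "\n", $unchanged, "\n", $anyCase, "\n";
```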