Learn about literals, wildcards, groups, capturing groups, optimizing, special characters, and more. Explore examples and quizzes to enhance your understanding of regex for efficient coding.
Regex Wildcards on steroids
Regular Expressions • You’ve likely used the wildcard (*) in Windows search or in code; regular expressions take this to the next level * 10 (largest number I can think of). • A useful tool for anyone, not just developers. • A powerful search utility in most tools (Visual Studio, Word, Excel, …)
Literals • A literal character in regex is any character that you want to match exactly as written. • “Google” in regex will match any text containing “Google”, regardless of what comes before or after it. • So it will also match inside “Googleplex”. • “\bGoogle\b” says that you want to match the whole word only, which solves the problem above.
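The difference between the plain literal and the word-boundary version can be sketched in Python. The sample sentence is made up for illustration, and other tools may expose the same idea through a "whole word" checkbox rather than “\b”.

```python
import re

# Hypothetical sample text containing both "Googleplex" and the standalone word.
text = "Visit the Googleplex or search on Google today."

# Plain literal: matches "Google" wherever it appears, including inside "Googleplex".
literal_hits = [m.start() for m in re.finditer(r"Google", text)]

# \b marks a word boundary, so only the standalone word matches.
whole_word_hits = [m.start() for m in re.finditer(r"\bGoogle\b", text)]

print(len(literal_hits), len(whole_word_hits))
```

The literal pattern finds two positions here, while the bounded pattern finds only the standalone word.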
Wildcards • “*” means match the preceding item 0 or more times • “+” means match the preceding item 1 or more times • “.” means match any character (except a line break) • Combined, “.*” matches anything and everything until it hits the first line break, i.e. Windows: “\r\n” or UNIX: “\n”
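A minimal Python sketch of these three symbols follows. The test strings are invented, and the sketch uses a UNIX-style "\n" line break because engines differ in how they treat "\r".

```python
import re

# "*" allows zero repetitions, so "ab*" accepts a bare "a".
star_matches = re.fullmatch(r"ab*", "a") is not None

# "+" needs at least one repetition, so "ab+" rejects a bare "a".
plus_matches = re.fullmatch(r"ab+", "a") is not None

# "." does not cross "\n" by default, so ".*" stops at the end of the first line.
first_line = re.match(r".*", "line one\nline two").group()

print(star_matches, plus_matches, repr(first_line))
```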
Groups • Square brackets let you define a set of characters (a character class); the set matches exactly one character. • “[a-z]” will match any character that is a lowercase letter. • “[0-9]” will match any digit. • “[a-zA-Z]” will match any character that is a lowercase or uppercase letter. • “[aZd3]” will match only the characters listed in the group: “a”, “Z”, “d”, and “3”. • “[^0-9]” will match any character that is not a digit.
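To see each bracketed set pick out single characters, here is a small Python sketch run against one made-up sample string.

```python
import re

# Hypothetical sample: mixed case, digits, a space and punctuation.
sample = "aZd3 b7!"

lower = re.findall(r"[a-z]", sample)         # lowercase letters only
digits = re.findall(r"[0-9]", sample)        # digits only
letters = re.findall(r"[a-zA-Z]", sample)    # letters of either case
listed = re.findall(r"[aZd3]", sample)       # only the four listed characters
non_digits = re.findall(r"[^0-9]", sample)   # everything that is not a digit

print(lower, digits, letters, listed, non_digits)
```

Note that the negated set also returns the space and the "!", since they are not digits either.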
Pop Quiz • Write a regular expression to find a phone number matching the following pattern (xxx)xxx-xxxx • \([0-9][0-9][0-9]\)[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9] • Why did we have to use “\(” and “\)”? • A “\” escapes a regex special character • “(” and “)” are special characters that create a capturing group (capture to a back reference)
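The effect of the escapes can be checked with a short Python sketch. The phone number and surrounding sentence are placeholders, not data from the slides.

```python
import re

# Escaped parentheses match the literal "(" and ")" characters.
phone = r"\([0-9][0-9][0-9]\)[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]"
text = "Call (555)123-4567 before noon."  # made-up number

match = re.search(phone, text)
found = match.group() if match else None

# Without the backslashes the parentheses become a capturing group,
# so the pattern demands six digits in a row and no longer fits the text.
unescaped = r"([0-9][0-9][0-9])[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]"
unescaped_match = re.search(unescaped, text)

print(found, unescaped_match)
```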
Capturing Groups • These are handy for the replace section of Find and Replace. • A “(” followed by a “)” creates a capturing group that is automatically back referenced by its index in the regex (numbering starts at 1); some languages also allow named capture groups. • “(.*)” captures any run of characters (except line breaks) and puts it into back reference 1.
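As a rough illustration of back references in a replace, the Python sketch below swaps two captured words. The name is only sample data, and the replacement syntax is an assumption tied to Python: it writes back references as \1 and \2, whereas some other tools use $1 and $2.

```python
import re

# Two capturing groups: group 1 is the last name, group 2 the first name.
pattern = r"(\w+), (\w+)"

# The replacement refers to the groups by index, starting at 1.
swapped = re.sub(pattern, r"\2 \1", "Lovelace, Ada")

# Python also supports named groups with the (?P<name>...) syntax.
named = re.search(r"(?P<last>\w+), (?P<first>\w+)", "Lovelace, Ada")
first = named.group("first")

print(swapped, first)
```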
DEMO • Word • Visual Studio • Notepad++
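Don’t go . crazy • Be wary: regular expression quantifiers are greedy by default, so they match as much as they can. • Example: • This is like hell this merchant just keeps wanting to sell...it's so frustrating • “.*?ell” is non-greedy and stops at the first “ell”, matching “This is like hell” • “.*ell” is greedy and runs to the last “ell”, matching everything through “sell”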
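Running both patterns against the slide's example sentence in Python shows where each one stops. This is a sketch of the behaviour in one engine; other regex engines that support “*?” should behave the same way on this input.

```python
import re

text = "This is like hell this merchant just keeps wanting to sell...it's so frustrating"

# Non-greedy: ".*?" takes as little as possible, so the match ends at the first "ell".
lazy = re.match(r".*?ell", text).group()

# Greedy: ".*" takes as much as possible, so the match ends at the last "ell".
greedy = re.match(r".*ell", text).group()

print(lazy)
print(greedy)
```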
Special Characters • “\d” is any digit, equivalent to “[0-9]” • “\D” is the opposite of the above, equivalent to “[^0-9]” • “\w” is any letter, digit, or underscore, equivalent to “[0-9a-zA-Z_]” • “\W” is the opposite of the above, equivalent to “[^0-9a-zA-Z_]” • “\s” is any whitespace character (spaces, tabs, and line breaks) • “\S” is the opposite of the above • Some engines are Unicode-aware by default, in which case “\d” and “\w” also match digits and letters outside these ASCII ranges
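The shorthand classes can be tried out with the Python sketch below. The sample string is invented, and the re.ASCII flag is used on the assumption that you want the exact ASCII equivalences listed on the slide, since Python's str patterns are Unicode-aware by default.

```python
import re

# Hypothetical sample with letters, digits, an underscore, spaces and a tab.
sample = "id_42 = 7\tok"

digits = re.findall(r"\d", sample, re.ASCII)      # single digits
words = re.findall(r"\w+", sample, re.ASCII)      # runs of letters, digits, underscores
non_word = re.findall(r"\W", sample, re.ASCII)    # everything \w does not match
spaces = re.findall(r"\s", sample, re.ASCII)      # whitespace characters
non_space = re.findall(r"\S+", sample, re.ASCII)  # runs of non-whitespace

print(digits, words, non_word, spaces, non_space)
```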
Optimizing • It is valuable to be as specific as possible: a tightly constrained pattern gives the engine fewer positions to try and avoids accidental matches. • “^” matches the beginning of the string • “$” matches the end of the string • “^\(\d\d\d\)\d\d\d-\d\d\d\d$” is ideal if the whole string is supposed to be a phone number, and will usually be faster than “\(\d\d\d\)\d\d\d-\d\d\d\d” because a match can only start at the beginning of the string
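The Python sketch below compares the anchored and unanchored patterns on two made-up inputs. It only demonstrates the difference in what gets accepted; it makes no attempt to measure speed.

```python
import re

anchored = r"^\(\d\d\d\)\d\d\d-\d\d\d\d$"
unanchored = r"\(\d\d\d\)\d\d\d-\d\d\d\d"

exact = "(555)123-4567"                # the whole string is a phone number
embedded = "fax: (555)123-4567 ext. 9" # the number is surrounded by other text

# The anchored pattern accepts only a string that is nothing but the number.
anchored_on_exact = re.search(anchored, exact) is not None
anchored_on_embedded = re.search(anchored, embedded) is not None

# The unanchored pattern finds the number anywhere in the string.
unanchored_on_embedded = re.search(unanchored, embedded) is not None

print(anchored_on_exact, anchored_on_embedded, unanchored_on_embedded)
```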
Character Counts • Specify the number of times you want something to repeat • “\d{3}” will match exactly 3 digits • “\d{1,3}” will match 1 to 3 digits
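A quick Python sketch of the two counted forms, using an invented string of digit runs of different lengths:

```python
import re

# Hypothetical sample: runs of 2, 3 and 4 digits.
sample = "12 345 6789"

# Exactly three digits: "12" is too short, and "6789" yields "678" with "9" left over.
exactly_three = re.findall(r"\d{3}", sample)

# One to three digits: each run is consumed up to three digits at a time.
one_to_three = re.findall(r"\d{1,3}", sample)

print(exactly_three, one_to_three)
```

The counted forms still find matches inside longer runs of digits, so combine them with anchors or word boundaries when the length of the whole number matters.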
Pop Quiz • Q: Write a regular expression to find a phone number matching the following pattern (xxx)xxx-xxxx • A: “\(\d{3}\)\d{3}-\d{4}”
DEMO • Show how to code with Regex