Regular Expressions: The Proper Care and Feeding Zain Naboulsi MSDN Developer Evangelist Microsoft
Introduction to Regular Expressions What Are Regular Expressions? Why Would I Want To Use Them? Common Misconceptions Anatomy of a Regular Expression
Disclaimer All opinions in this session are provided "AS IS" with no warranties, and confer no rights. All opinions are mine and don't necessarily reflect the opinion of Microsoft.
What Are Regular Expressions?
Regular Expressions “Regular expressions provide a powerful, flexible, and efficient method for processing text. [They allow] you to quickly parse large amounts of text to find specific character patterns; to extract, edit, replace, or delete text substrings; or to add the extracted strings to a collection in order to generate a report.”
Do What? Simply put, regular expressions help you find text patterns and do pretty much whatever you want with them. It sounds simple, but regular expressions are one of the most difficult and least understood constructs in programming.
Warning Regular expressions are part art and part science. There is a steep learning curve but the rewards are significant.
The Possibilities
Okay, So What Is A Pattern? “a regular or repetitive form, order, or arrangement”
PATTERNS ARE EVERYWHERE
Checker Board
Fibonacci Sequence
Text The IP Address for the server is but it should be , and I am not sure how we managed to get into the subnet but we need to remove ourselves from it immediately unless we are moving to it then I want the new IP to be I suppose.
YOU HAVE USED PATTERNS BEFORE
Wildcard Searches For Files Wildcards are VERY simple pattern-matching constructs and are NOT regular expressions. Examples: *.txtb*b*?un.txt
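To make the difference between wildcards and regular expressions concrete, here is a minimal Python sketch. The file names are hypothetical, and the regular expression shown is just one way to express the same idea as the `*.txt` wildcard, not the only one.

```python
import fnmatch
import re

filenames = ["notes.txt", "bun.txt", "report.doc"]  # hypothetical file names

# Wildcard matching: "*" stands for any run of characters.
wildcard_hits = [name for name in filenames if fnmatch.fnmatchcase(name, "*.txt")]

# A comparable regular expression: "." is any character, "*" repeats it,
# and the literal dot before "txt" must be escaped as "\.".
regex_hits = [name for name in filenames if re.fullmatch(r".*\.txt", name)]

print(wildcard_hits)
print(regex_hits)
```

Both lists contain the same two `.txt` names, but the regular expression spells out each piece of the pattern explicitly, which is what lets it scale to far more complex patterns than a wildcard can describe.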
Why Use Regular Expressions?
Major Uses of Regular Expressions Matching = find any text anywhere, regardless of complexity. Substitution = once found, you can replace text.
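The two uses named above, matching and substitution, can be sketched in a few lines of Python. The sample text and the `#` placeholder are assumptions chosen purely for illustration.

```python
import re

text = "Order 66 shipped; order 7 is pending."  # hypothetical sample text

# Matching: find every run of digits anywhere in the text.
numbers = re.findall(r"\d+", text)

# Substitution: once found, replace each match with a placeholder.
masked = re.sub(r"\d+", "#", text)

print(numbers)
print(masked)
```

The same pattern, `\d+`, drives both operations; only the function applied to it changes.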
Features Can literally turn 10 lines of code into 1. Extremely efficient pattern matching mechanism. Once learned, becomes one of the most indispensable techniques you can have.
Languages That Support Regular Expressions All .NET languages, JScript, XML: XPath & XQuery, T-SQL, Perl, Java, [insert language here]
ASP.NET Control
Common Misconceptions
Misconceptions Regular Expressions can do complex programming logic. Regular Expressions can do math. Regular Expressions will give me winning lottery numbers.
Anatomy of a Regular Expression
A Sample Expression
Anatomy Characters, Metacharacters, Subexpressions
Characters A literal character represents any valid value represented by the current encoding method. For example, the literal character “A” is represented as the decimal value 65 in the ASCII encoding system.
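A short Python sketch of the point about literal characters: the character "A" has the code 65, and in a pattern it matches only itself. The sample string is a made-up value for illustration.

```python
import re

# The literal character "A" has decimal value 65 in ASCII.
code = ord("A")

# As a pattern, a literal character simply matches itself.
match = re.search("A", "xyzABC")  # hypothetical sample string
position = match.start()

print(code, match.group(), position)
```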
Metacharacters Unlike literal characters, metacharacters are used as “place holders” for characters. For example, the metacharacter “\t” in regular expressions represents the tab character, whereas “\d” matches any digit 0 through 9.
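The two metacharacters mentioned above can be tried out in Python as follows. The sample line is hypothetical, and the `re.ASCII` flag is an assumption added here so that `\d` is limited to the digits 0 through 9, as the slide describes; without it, Python also accepts digits from other scripts.

```python
import re

line = "id\t42"  # hypothetical sample: "id", a tab character, then two digits

# "\t" matches exactly one tab character.
tab_match = re.search(r"\t", line)

# "\d" stands in for any single digit; re.ASCII restricts it to 0-9.
digits = re.findall(r"\d", line, flags=re.ASCII)

print(tab_match.start(), digits)
```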
Subexpressions These are simply smaller expressions nested inside larger ones. For example, the following expression has a subexpression inside it: (john|jane)doe
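As a rough illustration of the subexpression in `(john|jane)doe`, the Python sketch below checks a few made-up names and pulls out whatever the parenthesized part matched. The helper name `first_name` and the test names are assumptions for this example only.

```python
import re

pattern = re.compile(r"(john|jane)doe")

def first_name(candidate):
    """Return the text matched by the subexpression, or None if there is no match."""
    match = pattern.fullmatch(candidate)
    # group(1) is the part matched by the first parenthesized subexpression.
    return match.group(1) if match else None

names = ["johndoe", "janedoe", "jackdoe"]  # hypothetical inputs
results = {name: first_name(name) for name in names}

print(results)
```

The parentheses do two jobs here: they limit the alternation to "john" or "jane", and they capture which of the two was actually found.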
Must Have Resources
Tools
Book
Tools
Summary
Summary Regular expressions can be used to manipulate and change text. While there is a steep learning curve, regular expressions are invaluable as a programming tool. Regular expressions are supported by virtually all major programming languages.
Next Steps Check out some of the patterns on the RegExLib site. Do a live search on regular expressions and see what others have to say about them. Prepare yourself mentally for a rewarding journey into the world of regular expressions. Have Fun!!!