Advanced Regular Expressions

Presentation transcript:

Advanced Regular Expressions
Or: What’s Special About RegEx in MX
CFUNITED – The premier ColdFusion conference www.cfunited.com

Your Presenter
- Michael Dinowitz
- Head of House of Fusion
- Publisher of Fusion Authority
- Founding member of Team Macromedia
- Doing this since June 95
- Called on for the black magic code
June 28th – July 1st 2006

Disclaimer & Introduction
- If you don’t know the basics, get out
- No real changes from CF 5 or CFMX 6

Basic additions
- Greedy vs. lazy
  - + is one or more, and as many as it can
  - +? is one or more, but only as many as it needs
  - ++ is the same as greedy but does not allow backtracking (not in CFMX)
- Nested sub-expressions
  - Numbered in order of execution: from the outside in, then left to right
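The difference between greedy and lazy matching, and the numbering of nested sub-expressions, can be sketched as follows. Python's `re` module stands in for the ColdFusion engine here (an assumption of this sketch, not something the slides use), and the sample strings are made up for illustration.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ takes as much as it can and gives back only what it must.
greedy = re.search(r"<.+>", text).group(0)

# Lazy: .+? takes only as much as it needs for the rest of the pattern to match.
lazy = re.search(r"<.+?>", text).group(0)

# Nested sub-expressions are numbered by their opening parenthesis:
# the outer group first, then the inner groups left to right.
groups = re.search(r"((\d+)-(\d+))-(\d+)", "call 555-867-5309 now").groups()

print(greedy)  # <b>bold</b> and <i>italic</i>
print(lazy)    # <b>
print(groups)  # ('555-867', '555', '867', '5309')
```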

Character vs. POSIX classes
- Non-special characters become special
- Use a backslash (\) to mark a character as special
- Shorter than POSIX classes
- Harder to ‘read’ for newbies

Basic Character Classes
- \b: word boundary
  - Any jump from alphanumeric to non-alphanumeric (or back)
  - refindnocase('\bbig\b', 'big')
- \B: non-boundary, the spot between any 2 characters of the same ‘type’
  - refindnocase('\B', 'big') = 2
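A small sketch of `\b` and `\B`, again using Python's `re` as a stand-in (an assumption; the slide's own calls are `refindnocase`). Python reports 0-based offsets, so 1 is added to line up with ColdFusion's 1-based positions.

```python
import re

# \b matches where a word character meets a non-word character
# (or the start or end of the string).
whole_word = re.search(r"\bbig\b", "a big deal", re.IGNORECASE)
inside_word = re.search(r"\bbig\b", "bigger", re.IGNORECASE)

# \B matches between two characters of the same type.
# In "big" the first such spot is between "b" and "i".
first_non_boundary = re.search(r"\B", "big").start()

print(whole_word.group(0))     # big
print(inside_word)             # None
print(first_non_boundary + 1)  # 2, the 1-based position from the slide
```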

More Character Classes
- \A: same as ^ (when not combined with (?m))
- \Z: same as $ (when not combined with (?m))
- \n: newline
- \r: carriage return
- \t: tab
- \d: any digit ([0-9])
- \D: any non-digit ([^0-9])
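To see why `\A` only equals `^` outside multi-line mode, the following sketch compares the two under `(?m)`. It uses Python's `re` as a stand-in (an assumption), and the sample strings are hypothetical.

```python
import re

text = "line one\nline two"

# With (?m), ^ matches at the start of every line...
caret_hits = [m.start() for m in re.finditer(r"(?m)^line", text)]

# ...while \A still matches only at the very start of the string.
anchor_hits = [m.start() for m in re.finditer(r"(?m)\Aline", text)]

# \d is any digit; \D is anything that is not a digit.
digits = re.findall(r"\d", "CF5 to CFMX7")
digits_only = re.sub(r"\D", "", "June 28th 2006")

print(caret_hits)   # [0, 9]
print(anchor_hits)  # [0]
print(digits)       # ['5', '7']
print(digits_only)  # 282006
```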

More Character Classes
- \w: any alphanumeric character ([[:alnum:]])
- \W: any non-alphanumeric character ([^[:alnum:]])
- \s: any whitespace character, including tab, space, newline, carriage return, and form feed ([ \t\n\r\f])
- \S: any non-whitespace character ([^ \t\n\r\f])
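A brief sketch of these shorthand classes, with Python's `re` as a stand-in (an assumption). One difference worth flagging: Python's `\w` also matches the underscore, so it is slightly wider than `[[:alnum:]]`. The sample line is invented for illustration.

```python
import re

line = "Order 66\tshipped to Dept_9\r\n"

# \w+ grabs runs of word characters (in Python this includes "_").
words = re.findall(r"\w+", line)

# \s+ matches any run of spaces, tabs, carriage returns, and newlines.
collapsed = re.sub(r"\s+", " ", line).strip()

# \W is the complement of \w.
symbols = re.findall(r"\W", "a-b c!")

print(words)      # ['Order', '66', 'shipped', 'to', 'Dept_9']
print(collapsed)  # Order 66 shipped to Dept_9
print(symbols)    # ['-', ' ', '!']
```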

Expression Modifiers
- Placed at the beginning of the expression
- (?i): makes the expression case insensitive (same as the NoCase version of the function)
- (?m): multi-line mode
  - ^ and $ match at each line, not just the entire string
  - A carriage return, Chr(13), is not recognized as a line break
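The effect of `(?i)` and `(?m)` can be sketched with Python's `re`, which accepts the same inline modifiers (using Python here is an assumption, and the log text is made up). Python's multi-line mode likewise treats only the newline as a line break, so a trailing carriage return gets in the way of `$`.

```python
import re

# (?i) makes the whole expression case insensitive.
ci = re.search(r"(?i)coldfusion", "Powered by ColdFusion MX")

log = "ok: start\nerror: disk full\nok: done\nerror: timeout"

# With (?m), ^ and $ match at the start and end of each line.
errors = re.findall(r"(?m)^error: (.+)$", log)

# Without it, ^ and $ only match at the ends of the whole string.
no_multiline = re.findall(r"^error: (.+)$", log)

# A carriage return is not a line break: "one" is followed by \r, so $ fails there.
crlf = re.findall(r"(?m)^(\w+)$", "one\r\ntwo")

print(ci.group(0))   # ColdFusion
print(errors)        # ['disk full', 'timeout']
print(no_multiline)  # []
print(crlf)          # ['two']
```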

Expression Modifiers
- (?x): ignores all white space in the expression
  - Also allows ## for comments (a single #, doubled because CFML strings require it)
  - ## comments out everything to the end of the line

    reFind("(?x)
        one                  ##first option
        |two                 ##second option
        |three\ point\ five  ## note escaped spaces
        ", "three point five")
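The same pattern can be tried in Python's `re`, which supports `(?x)` as well (Python is a stand-in here, an assumption). In Python the comment marker is a single `#`, since there is no CFML-style escaping to work around.

```python
import re

pattern = r"""(?x)
    one                   # first option
  | two                   # second option
  | three\ point\ five    # note the escaped spaces
"""

match = re.search(pattern, "three point five")

# Unescaped spaces are ignored, so this pattern is really "threepointfive".
unescaped = re.search(r"(?x) three point five", "three point five")
squashed = re.search(r"(?x) three point five", "threepointfive")

print(match.group(0))     # three point five
print(unescaped)          # None
print(squashed.group(0))  # threepointfive
```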

Group Modifiers
- Affect only the group they are in
- Must be at the beginning of the group
- (?##...): comment
  - The # must be escaped (hence ##)
- (?:...): does not add the group to the return collection
- (?=...): positive lookahead
- (?!...): negative lookahead
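A short sketch of the non-capturing and comment groups, using Python's `re` as a stand-in (an assumption). Python writes the comment group with a single `#`. The date string is just sample data.

```python
import re

date = "2006-06-28"

# Every plain group lands in the returned collection.
capturing = re.search(r"(\d{4})-(\d{2})-(\d{2})", date).groups()

# (?:...) groups for structure only; it is left out of the results.
non_capturing = re.search(r"(?:\d{4})-(\d{2})-(\d{2})", date).groups()

# (?#...) is a comment group; it matches nothing and captures nothing.
commented = re.search(r"(\d{4})(?#year)-(\d{2})(?#month)", date).groups()

print(capturing)      # ('2006', '06', '28')
print(non_capturing)  # ('06', '28')
print(commented)      # ('2006', '06')
```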

Positive Lookahead
- Tests whether the text in the parentheses exists
- Does not save the text into the return collection
- Does not ‘consume’ text
- Example: <a(?=.+href).+?href="([^"]+).+?>
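The three properties above can be checked with a small sketch in Python's `re` (a stand-in for the ColdFusion engine, which is an assumption). The second half runs the slide's own pattern against a made-up anchor tag.

```python
import re

# The lookahead must succeed, but its text is neither consumed nor captured.
m = re.search(r"ColdFusion(?= MX)", "ColdFusion MX 7")
no_match = re.search(r"ColdFusion(?= MX)", "ColdFusion 5")

# The slide's pattern: only match <a ...> tags that contain an href,
# and capture the href value.
html = '<a class="nav" href="http://www.cfunited.com" target="_blank">CFUNITED</a>'
link = re.search(r'<a(?=.+href).+?href="([^"]+).+?>', html)

print(m.group(0))     # ColdFusion  (" MX" was tested, not consumed)
print(m.groups())     # ()          (nothing added to the results)
print(no_match)       # None
print(link.group(1))  # http://www.cfunited.com
```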

Negative Lookahead
- Tests whether the text in the parentheses does not exist
- Does not save the text into the return collection
- Does not ‘consume’ text
- Example: (<a(?!.+?target) [^>]+>)
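The slide's pattern can be exercised as below, using Python's `re` as a stand-in (an assumption) and invented tags. One thing the sketch brings out: `.+?target` looks ahead through the rest of the line, not just the current tag, so on a line with several tags a later `target` also rejects earlier tags. The `bounded` variant that stops at `>` is a suggested variation, not something from the slides.

```python
import re

slide = r"(<a(?!.+?target) [^>]+>)"
bounded = r"(<a(?![^>]*target) [^>]+>)"  # variation: look no further than this tag

plain = '<a href="a.cfm">one</a>'
targeted = '<a href="b.cfm" target="_blank">two</a>'
page = plain + " " + targeted

plain_hit = re.search(slide, plain)
targeted_hit = re.search(slide, targeted)

# On one line holding both tags, the slide's lookahead sees the later "target".
slide_on_page = re.findall(slide, page)
bounded_on_page = re.findall(bounded, page)

print(plain_hit.group(1))  # <a href="a.cfm">
print(targeted_hit)        # None
print(slide_on_page)       # []
print(bounded_on_page)     # ['<a href="a.cfm">']
```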

Replace conversion
- Used in REReplace()/REReplaceNoCase()
- Converts either the ‘next’ character or a specific block of characters
- \u: converts the next character to uppercase
- \l: converts the next character to lowercase
- \U…\E: converts a block to uppercase
- \L…\E: converts a block to lowercase
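Python's `re` has no `\u`, `\l`, `\U`, or `\L` in replacement strings, so the sketch below swaps in a different technique, a replacement function, to show roughly what each conversion produces. The helper name `upper_first` and the sample strings are hypothetical.

```python
import re

def upper_first(match):
    # Rough equivalent of \u\1: uppercase only the first character of the group.
    word = match.group(1)
    return word[0].upper() + word[1:]

# Like \u applied to each word.
title = re.sub(r"\b([a-z]\w*)", upper_first, "advanced regular expressions")

# Like \U\1\E: uppercase the whole captured block.
shout = re.sub(r"<b>(.+?)</b>", lambda m: m.group(1).upper(),
               "this is <b>important</b> text")

# Like \L\1\E: lowercase the whole captured block.
calm = re.sub(r"\b([A-Z]+)\b", lambda m: m.group(1).lower(), "Stop SHOUTING now")

print(title)  # Advanced Regular Expressions
print(shout)  # this is IMPORTANT text
print(calm)   # Stop shouting now
```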

Not Supported
- Positive lookbehinds
- Negative lookbehinds
- Other features
- All accessible through the Java RegEx engine
  - Massimo has a CFC pre-built to do this
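For readers who have not met lookbehinds, here is a minimal sketch of what they do. It uses Python's `re` for illustration only (an assumption; the slides point to the Java engine for this), and Python requires the lookbehind text to have a fixed width. The sample string is made up.

```python
import re

prices = "cost: $25, weight: 30kg, fee: $7"

# Positive lookbehind: digits that are preceded by a dollar sign.
dollars = re.findall(r"(?<=\$)\d+", prices)

# Negative lookbehind: whole numbers that are not preceded by a dollar sign.
not_dollars = re.findall(r"(?<!\$)\b\d+", prices)

print(dollars)      # ['25', '7']
print(not_dollars)  # ['30']
```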

Resources
- Chapters in most CFMX books
- CF-RegEx mailing list
- This presentation
- Books:
  - Mastering Regular Expressions, 2nd Edition
  - Teach Yourself Regular Expressions in 10 Minutes
  - Java Regular Expressions: Taming the java.util.regex Engine