Hands-on Regular Expressions Simple rules for powerful changes.

June 27th - 30th 2007 www.cfunited.com

Definitions
String - Any collection of 0 or more characters. Example: "This is a String"
SubString - A segment of a String. Example: "is a"
Case Sensitivity - Whether a comparison treats the upper-case and lower-case versions of a character as different.

Simple Task
Find the word "Name" inside a string:
Position=#Find('Name', String)#
Position=0
(Find() is case sensitive, so a capitalized "Name" is not found in a string that contains "name".)

Simple Task
Find the word "Name" inside a string:
Position=#Find('name', String)#
Position=4

Simple Task
Find the word "Name" inside a string:
Position=#FindNoCase('Name', String)#
Position=4

Simple Task
Find the word "Name" inside a string using Regular Expressions:
Position=#REFindNoCase('Name', String)#
Position=4
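The four searches above can be tried outside ColdFusion. The following is a minimal sketch in Python, not ColdFusion code: `find`, `find_no_case`, and `re_find_no_case` are hypothetical helpers that imitate the 1-based positions (0 for "not found") that the ColdFusion functions return. The sample string is an assumption, since the transcript never shows the original; it was chosen so that the positions agree with the slides.

```python
import re

# Assumed sample string: the transcript never shows the original String.
TEXT = "My name is Michael Dinowitz."

def find(substring, text):
    """Case-sensitive literal search, 1-based; 0 means not found."""
    return text.find(substring) + 1

def find_no_case(substring, text):
    """Case-insensitive literal search, 1-based; 0 means not found."""
    return text.lower().find(substring.lower()) + 1

def re_find_no_case(pattern, text):
    """Case-insensitive regex search, 1-based; 0 means not found."""
    match = re.search(pattern, text, re.IGNORECASE)
    return match.start() + 1 if match else 0

print(find("Name", TEXT))             # 0
print(find("name", TEXT))             # 4
print(find_no_case("Name", TEXT))     # 4
print(re_find_no_case("Name", TEXT))  # 4
```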

Intro to Regular Expressions
Referred to as RegEx
Matches patterns of characters
Used in many languages (ColdFusion, Perl, JavaScript, etc.)
Uses a small syntax library to do 'dynamic' matches
Can be used for Search and/or Replace actions
Slightly slower than the similar Find() and Replace() functions
Has both a case sensitive and a non-case sensitive version of each operation:
• REFind()
• REFindNoCase()
• REReplace()
• REReplaceNoCase()

RegEx Basics
Rule 1: A character matches itself as long as it is not a control character.
Example: A = "A"; A = "a" (non-case sensitive)
Position=#REFindNoCase('n', String)#
Position=4

RegEx Basics
Rule 1a: A search will return the first successful match. To get a different match, set the start position (the optional third argument of the function).
Position1=#REFindNoCase('M', String)#
Position2=#REFindNoCase('M', String, 2)#
Position1=1
Position2=6
(The search is not case sensitive, so starting from position 2 it first finds the lowercase "m" of "name", which begins at position 4.)
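Rule 1a can be sketched in Python as follows. This is an illustration under assumptions, not the ColdFusion implementation: `re_find_no_case` is a hypothetical helper that mimics a 1-based start position, and the sample string is assumed because the transcript does not show the original.

```python
import re

# Assumed sample string: the transcript never shows the original String.
TEXT = "My name is Michael Dinowitz."

def re_find_no_case(pattern, text, start=1):
    """Case-insensitive regex search from a 1-based start position."""
    compiled = re.compile(pattern, re.IGNORECASE)
    match = compiled.search(text, start - 1)  # Python positions are 0-based
    return match.start() + 1 if match else 0

print(re_find_no_case("M", TEXT))     # 1
print(re_find_no_case("M", TEXT, 2))  # 6
print(re_find_no_case("M", TEXT, 7))  # 12

# Every match position in one pass, instead of repeated searches.
positions = [m.start() + 1 for m in re.finditer("M", TEXT, re.IGNORECASE)]
print(positions)                      # [1, 6, 12]
```

Moving the start position just past the previous match is how a loop would walk through every occurrence with a function that only returns the first one.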

RegEx Basics
Rule 2: A sequence of non-control characters matches the same sequence of characters in the string.
Example: AA = "AA"; AA != "Aa" (case sensitive); AA = "Aa" (non-case sensitive); A A = "A A" (notice the space)
Position=#REFindNoCase('y n', String)#
Position=2

RegEx Basics
Rule 3: A period (.) is a control character that matches ANY other character.
Example: . = "A"; A. = "Ac"; A.A = "A A"
Position=#REFindNoCase('N.me', String)#
Position=4

RegEx Basics
Rule 4: A control character can be 'escaped' by placing a backslash (\) before it. The escaped control character then matches only the literal character itself.
Example: . = "." (a period matches any character, including a period); \. = "." (and nothing else); A\.A = "A.A"
Position=#REFindNoCase('tz\.', String)#
Position=26
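The difference between Rules 3 and 4 is easiest to see side by side. This is a hedged Python sketch, not ColdFusion: `re_find_no_case` is a hypothetical helper returning 1-based positions, and the sample string is assumed. `re.escape` is Python's standard way to escape a whole literal; the slides do not mention a ColdFusion equivalent.

```python
import re

# Assumed sample string: the transcript never shows the original String.
TEXT = "My name is Michael Dinowitz."

def re_find_no_case(pattern, text):
    """Case-insensitive regex search, 1-based; 0 means not found."""
    match = re.search(pattern, text, re.IGNORECASE)
    return match.start() + 1 if match else 0

print(re_find_no_case("N.me", TEXT))     # 4  (the period matches "a")
print(re_find_no_case(r"tz\.", TEXT))    # 26 (escaped: a literal period)
print(re_find_no_case("A.A", "A A"))     # 1  (the period matches the space)
print(re_find_no_case(r"A\.A", "A A"))   # 0  (a space is not a period)
print(re_find_no_case(r"A\.A", "A.A"))   # 1

# Escaping an entire literal string in one call.
print(re.escape("A.A"))                  # A\.A
```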

RegEx Anchoring
Rule 5a: Using the caret (^) will make sure the text you're searching for is at the start of the string.
Example: ^A = "A"; ^M != "AM"
Position1=#REFindNoCase('^My', String)#
Position2=#REFindNoCase('^is', String)#
Position1=1
Position2=0

RegEx Anchoring
Rule 5b: Using the dollar sign ($) will make sure the text you're searching for is at the end of the string.
Example: A$ = "A"; M$ = "MAM" (the second M will be returned)
Position1=#REFindNoCase('\.$', String)#
Position1=28
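Both anchors behave the same way in Python's `re` module, which is used here only as a stand-in for ColdFusion. `re_find_no_case` is a hypothetical helper returning 1-based positions, and the sample string is assumed.

```python
import re

# Assumed sample string: the transcript never shows the original String.
TEXT = "My name is Michael Dinowitz."

def re_find_no_case(pattern, text):
    """Case-insensitive regex search, 1-based; 0 means not found."""
    match = re.search(pattern, text, re.IGNORECASE)
    return match.start() + 1 if match else 0

print(re_find_no_case("^My", TEXT))   # 1  ("My" is at the start)
print(re_find_no_case("^is", TEXT))   # 0  ("is" occurs, but not at the start)
print(re_find_no_case(r"\.$", TEXT))  # 28 (a literal period at the end)
print(re_find_no_case("M$", "MAM"))   # 3  (only the final M is at the end)
```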

RegEx Ranges
Rule 6: When looking for one of a group of characters, place them inside square brackets ([]).
Example: '[abc]' will match either a, b, or c. '[.+$^]' will match either a period (.), a plus (+), a dollar sign ($) or a caret (^).
Note that most special characters lose their special meaning inside square brackets, so they do not need to be escaped there. The exceptions are a caret in the first position and a dash between two characters.
Position1=#REFindNoCase('M[aeiou]', String)#
Position1=6

RegEx Ranges
Rule 7a: A caret (^), when used within square brackets ([]), has the effect of saying 'NOT these characters'. It must be the first character inside the brackets for this to work.
Example: '[^abc]' will match ANY character other than a, b, or c.
Position1=#REFindNoCase('M[^aeiou]', String)#
Position1=1

RegEx Ranges
Rule 7b: A dash (-), when used within square brackets ([]), has the effect of saying 'all characters from the first character through the last'.
Example: '[a-e]' will match ANY character from a through e (a, b, c, d, or e).
Position1=#REFindNoCase('M[a-m]', String)#
Position1=6
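Rules 6, 7a, and 7b can be checked together. The following is a sketch in Python rather than ColdFusion, with a hypothetical `re_find_no_case` helper (1-based positions, 0 for no match) and an assumed sample string.

```python
import re

# Assumed sample string: the transcript never shows the original String.
TEXT = "My name is Michael Dinowitz."

def re_find_no_case(pattern, text):
    """Case-insensitive regex search, 1-based; 0 means not found."""
    match = re.search(pattern, text, re.IGNORECASE)
    return match.start() + 1 if match else 0

print(re_find_no_case("M[aeiou]", TEXT))   # 6  ("me" in "name")
print(re_find_no_case("M[^aeiou]", TEXT))  # 1  ("My": y is not in the set)
print(re_find_no_case("M[a-m]", TEXT))     # 6  ("me": e lies in a through m)
print(re_find_no_case("[.+$^]", TEXT))     # 28 (the unescaped period is literal here)
```

Note that the search is case-insensitive, so `M[aeiou]` finds the lowercase "me" in "name" before it reaches the "Mi" of "Michael".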

RegEx Ranges
Rule 8: ColdFusion has a series of pre-built character ranges. These are referenced as [[:range name:]].
Example: [[:digit:]] - same as [0-9] (all numbers); [[:alpha:]] - same as [A-Za-z] (all letters of both cases)
Position1=#REFindNoCase('[[:space:]]', String)#
Position1=3
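Not every regex engine knows these named ranges. Python's `re` module, for example, does not, so a pattern written for ColdFusion has to be translated first. The sketch below shows one way to do that for the three ranges named above. `POSIX_CLASSES` and `expand_posix` are hypothetical names invented for this illustration, the table is deliberately incomplete, and the sample string is assumed.

```python
import re

# Assumed sample string: the transcript never shows the original String.
TEXT = "My name is Michael Dinowitz."

# Hypothetical, incomplete table: named range -> equivalent set contents.
POSIX_CLASSES = {
    "[:digit:]": "0-9",
    "[:alpha:]": "A-Za-z",
    "[:space:]": r"\s",
}

def expand_posix(pattern):
    """Rewrite the named ranges in the table into plain set contents."""
    for name, contents in POSIX_CLASSES.items():
        pattern = pattern.replace(name, contents)
    return pattern

def re_find_no_case(pattern, text):
    """Case-insensitive regex search, 1-based; 0 means not found."""
    match = re.search(pattern, text, re.IGNORECASE)
    return match.start() + 1 if match else 0

print(expand_posix("[[:digit:]]+"))                            # [0-9]+
print(re_find_no_case(expand_posix("[[:space:]]"), TEXT))      # 3
print(re_find_no_case(expand_posix("[[:digit:]]+"), "Room 42"))  # 6
```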

RegEx Character Classes

RegEx Multipliers
Any character or character class can be assigned a multiplier that defines how many times the character or class may occur. Multipliers can say that a character must exist, is optional, may exist for a certain minimum or maximum number of times, etc.
Multiplier characters include:
• Plus (+): one or more
• Asterisk (*): 0 or more
• Question Mark (?): may or may not exist, once
• Curly Brackets ({}): a specific range of occurrences

RegEx Multipliers
The Plus (+) multiplier specifies that the character or character group must exist but can exist more than once.
Example: A+ - A followed by any number of additional A's; [[:digit:]]+ - A number (0-9) followed by any amount of additional numbers
Position1=#REFindNoCase('is+i', String)#
Position1=2

RegEx Multipliers
The Asterisk (*) multiplier specifies that the character or character group may or may not exist, and can exist more than once (i.e. 0 or more).
Example: A* - Either no A or an A followed by any number of additional A's; [[:digit:]]* - Either no number (0-9) or a number followed by any amount of additional numbers
Position1=#REFindNoCase('si*s', String)#
Position1=3

RegEx Multipliers
The Question mark (?) multiplier specifies that the character or character group may or may not exist, but at most once.
Example: A? - Either one A or no A; [[:digit:]]? - One or no numbers (0-9)
Position1=#REFindNoCase('p?i', String)#
Position1=2

RegEx Multipliers
Curly brackets ({}) can be used to specify a minimum and maximum number of times for a character to appear. The format is {min,max}.
Example: A{2,4} - 2 As or more, but no more than 4; [[:digit:]]{1,6} - 1 number (0-9) or more, but no more than 6
Position1=#REFindNoCase('s{2,3}', String)#
Position1=3
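The four multipliers can be compared on a single string. This is a sketch in Python, standing in for ColdFusion. `re_find_no_case` and `matched_text` are hypothetical helpers, and the sample string is an assumption that fits the replacement output shown on the REReplace slide.

```python
import re

# Assumed sample string, consistent with the REReplace slide's output.
WORD = "Mississippi is a hard word"

def re_find_no_case(pattern, text):
    """Case-insensitive regex search, 1-based; 0 means not found."""
    match = re.search(pattern, text, re.IGNORECASE)
    return match.start() + 1 if match else 0

def matched_text(pattern, text):
    """Return the text of the first match, or None if nothing matches."""
    match = re.search(pattern, text, re.IGNORECASE)
    return match.group(0) if match else None

print(re_find_no_case("is+i", WORD), matched_text("is+i", WORD))      # 2 issi
print(re_find_no_case("si*s", WORD), matched_text("si*s", WORD))      # 3 ss
print(re_find_no_case("p?i", WORD), matched_text("p?i", WORD))        # 2 i
print(re_find_no_case("s{2,3}", WORD), matched_text("s{2,3}", WORD))  # 3 ss
```

Printing the matched text next to the position makes it clear that `si*s` matched with zero occurrences of "i", and that `p?i` matched without any "p".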

RegEx SubExpressions
SubExpressions are a way of grouping characters together. This allows us to reference the entire group at once. To group characters together, place them within parentheses ().
Example: (Name) = name; (Name)+ = name, namename, or basically one or more names
Position1=#REFindNoCase('(iss)+', String)#
Position1=2

RegEx SubExpressions
An additional special character that is usable within a subExpression is the pipe (|). This means either the first group of text or the second (or more).
Example: (Na|me) = na or me; (Name|Date) = name or date
Position1=#REFindNoCase('(hard|word)', String)#
Position1=18

RegEx SubExpressions
SubExpressions allow us to do something else that's special: back referencing. This is the ability to reference one or more groups directly. This is done by using the backslash (\) followed by a number that specifies which subexpression we want.
Example: (name)\1 = namename; (Name|Date)\1 = namename or datedate
Position1=#REFindNoCase('(is )\1', String)#
Position1=13
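Grouping, alternation, and back references can be sketched in Python as follows. This is an illustration, not ColdFusion: `re_find_no_case` is a hypothetical helper returning 1-based positions, and both sample strings are assumptions chosen for the example, so the positions printed here are not meant to reproduce every position on the slides.

```python
import re

# Assumed sample strings, chosen for this illustration.
WORD = "Mississippi is a hard word"
DOUBLED = "It is is a typo"

def re_find_no_case(pattern, text):
    """Case-insensitive regex search, 1-based; 0 means not found."""
    match = re.search(pattern, text, re.IGNORECASE)
    return match.start() + 1 if match else 0

# Grouping: the multiplier applies to the whole group "iss".
print(re_find_no_case("(iss)+", WORD))        # 2

# Alternation: whichever alternative occurs first in the string wins.
print(re_find_no_case("(hard|word)", WORD))   # 18

# Back reference: \1 must repeat exactly what group 1 matched.
print(re_find_no_case(r"(is )\1", DOUBLED))   # 4
print(re_find_no_case(r"(s)\1", WORD))        # 3
print(re_find_no_case(r"(hard|word)\1", "wordword"))  # 1
print(re_find_no_case(r"(hard|word)\1", "hardword"))  # 0
```

The last two lines show the point of a back reference: `(hard|word)\1` accepts "wordword" but not "hardword", because `\1` repeats the text that was matched, not the pattern.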

REReplace
The REReplace() and REReplaceNoCase() functions use everything you've learned about searching and allow you to 'work' with the search results, i.e. replace them with something. Without the optional fourth argument only the first match is replaced; passing 'all' replaces every match.
Example:
Position1=#REReplaceNoCase(String, 'iss', 'emm')#
Position2=#REReplaceNoCase(String, 'iss', 'emm', 'all')#
Position1=Memmissippi is a hard word
Position2=Memmemmippi is a hard word
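The first-match versus all-matches behaviour can be sketched with Python's `re.sub`. This is a stand-in, not ColdFusion: `re_replace_no_case` is a hypothetical helper that copies the argument order used above, and the input string is inferred from the output shown on the slide.

```python
import re

# Input string inferred from the replacement output shown on the slide.
WORD = "Mississippi is a hard word"

def re_replace_no_case(text, pattern, replacement, scope="one"):
    """Replace the first match, or every match when scope is 'all'."""
    count = 0 if scope.lower() == "all" else 1  # re.sub: count=0 means all
    return re.sub(pattern, replacement, text, count=count, flags=re.IGNORECASE)

print(re_replace_no_case(WORD, "iss", "emm"))
# Memmissippi is a hard word
print(re_replace_no_case(WORD, "iss", "emm", "all"))
# Memmemmippi is a hard word
```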