BBK P1 Module 2010/11: Regular Expressions

More complicated checks..
It is usually possible to use a combination of various built-in PHP functions to achieve what you want. However, sometimes things get more complicated. When this happens, we turn to regular expressions.

Regular Expressions
Regular expressions are a concise (but obscure!) way of pattern matching within a string. There are different flavours of regular expression (Perl-compatible and POSIX), but we will just look at the faster and more powerful version (Perl-compatible).

Some definitions
Data: the actual data that we are going to work upon (e.g. an email address string).
Regular expression: the definition of the string pattern, e.g. '…]+\.)+[a-z]{2,6}$/i'
PHP functions: functions that do something with the data and the regular expression, e.g. preg_match(), preg_replace().

Regular Expressions
Regular expressions are complicated! Each one is a definition of a pattern, usually used to validate a string or to extract data from it.

Regex: Delimiters
The regex definition is always bracketed by delimiters, usually a '/':
$regex = '/php/';
Matches: 'php', 'I love php'
Doesn't match: 'PHP', 'I love ph'

Regex: First impressions
Note how the regular expression matches anywhere in the string: the whole regular expression has to be matched, but the whole data string doesn't have to be used. It is a case-sensitive comparison.
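
As a rough check of the behaviour described so far, the sketch below runs the '/php/' pattern through preg_match() against the sample strings from the delimiters slide. The variable names are only for illustration.

```php
<?php
// The pattern and the sample strings come from the slides; $results is a hypothetical helper array.
$regex = '/php/';
$samples = ['php', 'I love php', 'PHP', 'I love ph'];

$results = [];
foreach ($samples as $sample) {
    // preg_match() returns 1 on a match, 0 on no match (and false on error).
    $results[$sample] = preg_match($regex, $sample) === 1;
    echo $sample, ': ', $results[$sample] ? 'match' : 'no match', PHP_EOL;
}
```

The pattern is found inside 'I love php' because nothing ties it to the start or end of the string, while 'PHP' fails because the comparison is case sensitive.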

Regex: Case insensitive
Extra switches can be added after the last delimiter. The only switch we will use is the 'i' switch, which makes the comparison case insensitive:
$regex = '/php/i';
Matches: 'php', 'I love pHp', 'PHP'
Doesn't match: 'I love ph'
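
A minimal sketch comparing the same input with and without the 'i' switch. The variable names are assumptions made for the example.

```php
<?php
$input = 'I love pHp';

// Without the switch the mixed-case 'pHp' is not found.
$strict = preg_match('/php/', $input);   // 0
// With the 'i' switch after the closing delimiter, case is ignored.
$loose = preg_match('/php/i', $input);   // 1

var_dump($strict, $loose);
```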

Regex: Character groups
A regex is matched character by character. You can specify multiple options for a character using square brackets:
$regex = '/p[hu]p/';
Matches: 'php', 'pup'
Doesn't match: 'phup', 'pop', 'PHP'

Regex: Character groups
You can also specify a digit or alphabetical range in square brackets:
$regex = '/p[a-z1-3]p/';
Matches: 'php', 'pup', 'pap', 'pop', 'p3p'
Doesn't match: 'PHP', 'p5p'
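
One way to try both character-group patterns at once is preg_grep(), which returns the entries of an array that match a pattern. This is only a sketch; the candidate list is assembled from the examples on the two slides.

```php
<?php
$candidates = ['php', 'pup', 'pap', 'pop', 'p3p', 'p5p', 'phup', 'PHP'];

// preg_grep() keeps the original array keys, so array_values() re-indexes the result.
$eitherHOrU = array_values(preg_grep('/p[hu]p/', $candidates));
$inRanges   = array_values(preg_grep('/p[a-z1-3]p/', $candidates));

echo implode(', ', $eitherHOrU), PHP_EOL; // php, pup
echo implode(', ', $inRanges), PHP_EOL;   // php, pup, pap, pop, p3p
```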

Regex: Predefined classes
There are a number of pre-defined classes available:
\d matches a single character that is a digit (0-9)
\s matches any whitespace character (includes tabs and line breaks)
\w matches any "word" character: alphanumeric characters plus underscore

Regex: Predefined classes
$regex = '/p\dp/';
Matches: 'p3p', 'p7p'
Doesn't match: 'p10p', 'P7p'
$regex = '/p\wp/';
Matches: 'p3p', 'pHp', 'pop'
Doesn't match: 'phhp'
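
The predefined classes can be tried out directly. The sketch below uses a few extra inputs that are not on the slides ('p_p' and a string containing a tab) to show the underscore and whitespace cases.

```php
<?php
// Each call returns 1 for a match and 0 for no match.
$oneDigit   = preg_match('/p\dp/', 'p7p');   // 1
$twoDigits  = preg_match('/p\dp/', 'p10p');  // 0: \d stands for exactly one digit
$letter     = preg_match('/p\wp/', 'pHp');   // 1
$underscore = preg_match('/p\wp/', 'p_p');   // 1: underscore is a "word" character
$tab        = preg_match('/p\sp/', "p\tp");  // 1: a tab counts as whitespace

var_dump($oneDigit, $twoDigits, $letter, $underscore, $tab);
```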

Regex: the Dot
The special dot character matches anything apart from line breaks:
$regex = '/p.p/';
Matches: 'php', 'p&p', 'p(p', 'p3p', 'p$p'
Doesn't match: 'PHP', 'phhp'
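
A short sketch of the dot in action, including the line-break case. The last line uses the 's' switch, which is not covered in these slides and is shown only as an aside.

```php
<?php
$regex = '/p.p/';

$symbol  = preg_match($regex, 'p&p');     // 1
$newline = preg_match($regex, "p\np");    // 0: the dot does not match a line break
$tooLong = preg_match($regex, 'phhp');    // 0: the dot stands for exactly one character

// Aside (not from the slides): the 's' switch lets the dot match line breaks too.
$dotAll = preg_match('/p.p/s', "p\np");   // 1

var_dump($symbol, $newline, $tooLong, $dotAll);
```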

Regex: Repetition
There are a number of special characters that indicate that the preceding character or character group may be repeated:
? zero or 1 times
* zero or more times
+ 1 or more times
{a,b} between a and b times

Regex: Repetition
$regex = '/ph?p/';
Matches: 'pp', 'php'
Doesn't match: 'phhp', 'pap'
$regex = '/ph*p/';
Matches: 'pp', 'php', 'phhhhp'
Doesn't match: 'pop', 'phhohp'

Regex: Repetition
$regex = '/ph+p/';
Matches: 'php', 'phhhhp'
Doesn't match: 'pp', 'phyhp'
$regex = '/ph{1,3}p/';
Matches: 'php', 'phhhp'
Doesn't match: 'pp', 'phhhhp'
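
To compare the four repetition operators side by side, the sketch below runs each pattern over the same list of inputs. The input list and the $matched array are assumptions made for the example.

```php
<?php
$inputs   = ['pp', 'php', 'phhp', 'phhhp', 'phhhhp'];
$patterns = ['/ph?p/', '/ph*p/', '/ph+p/', '/ph{1,3}p/'];

$matched = [];
foreach ($patterns as $pattern) {
    // Collect the inputs that each pattern accepts.
    $matched[$pattern] = array_values(preg_grep($pattern, $inputs));
    echo $pattern, ' => ', implode(', ', $matched[$pattern]), PHP_EOL;
}
```

Only '/ph*p/' accepts every input here: '?' stops at one 'h', '+' needs at least one, and '{1,3}' allows one to three.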

Regex: Bracketed repetition
The repetition operators can be used on bracketed expressions to repeat multiple characters:
$regex = '/(php)+/';
Matches: 'php', 'phpphp', 'phpphpphp'
Doesn't match: 'ph', 'popph'
Will it match 'phpph'?
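
The closing question can be tested directly. This sketch passes a third argument to preg_match() so that the matched text can be inspected; the variable names are illustrative.

```php
<?php
$regex = '/(php)+/';

$partial = preg_match($regex, 'phpph', $m1);   // 1: the first three characters are enough
$double  = preg_match($regex, 'phpphp', $m2);  // 1: the group repeats twice
$none    = preg_match($regex, 'popph');        // 0: 'php' never appears in full

echo $m1[0], PHP_EOL; // php
echo $m2[0], PHP_EOL; // phpphp
```

So 'phpph' does match: the pattern is not anchored, and the leftover 'ph' is simply ignored.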

Regex: Anchors
So far, we have matched anywhere within a string (either the entire data string or part of it). We can change this behaviour by using anchors:
^ start of the string
$ end of the string

Regex: Anchors
With NO anchors:
$regex = '/php/';
Matches: 'php', 'php is great', 'in php we..'
Doesn't match: 'pop'

Regex: Anchors
With start and end anchors:
$regex = '/^php$/';
Matches: 'php'
Doesn't match: 'php is great', 'in php we..', 'pop'
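
The effect of each anchor can be seen by applying the four combinations to a single input. This is a sketch with assumed variable names. The patterns are written in single quotes so that PHP leaves the '$' alone.

```php
<?php
$input = 'php is great';

$anywhere = preg_match('/php/', $input);    // 1: no anchors, found at the start
$whole    = preg_match('/^php$/', $input);  // 0: the string is longer than 'php'
$prefix   = preg_match('/^php/', $input);   // 1: the string starts with 'php'
$suffix   = preg_match('/php$/', $input);   // 0: the string does not end with 'php'

var_dump($anywhere, $whole, $prefix, $suffix);
```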

Regex: Escape special characters
We have seen that characters such as ?, ., $, * and + have a special meaning. If we want to actually use them as a literal, we need to escape them with a backslash.
$regex = '/p\.p/';
Matches: 'p.p'
Doesn't match: 'php', 'p1p'
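
The following sketch checks the escaped dot and then shows preg_quote(), a built-in PHP function that adds the backslashes for you. preg_quote() is not mentioned in the slides; it is included here as one possible way to escape text that comes from elsewhere.

```php
<?php
$literalDot = '/p\.p/';

$dotted = preg_match($literalDot, 'p.p');  // 1: a real dot is present
$letter = preg_match($literalDot, 'php');  // 0: 'h' is not a dot

// preg_quote() escapes the special characters in a string; the second
// argument names the delimiter so that it is escaped as well.
$escaped = preg_quote('p.p', '/');
$built   = '/' . $escaped . '/';

echo $escaped, PHP_EOL;                    // p\.p
$builtMatch = preg_match($built, 'p.p');   // 1
```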

So.. An example
Let's define a regex that matches an email address:
$…Regex = '…z]{2,6}$/i';
Matches: …
Doesn't match: 'not.an. .com'

So.. An example
The pattern breaks down as follows:
/^ : starting delimiter and start-of-string anchor
… : user name; allows any length of letters, numbers, dots, pluses, dashes, percent or quotes
@ : separator
([a-z\d-]+\.)+ : domain (letters, digits or dash only), with repetition to include subdomains
[a-z]{2,6} : com, uk, info, etc.
$/i : end anchor, end delimiter, case insensitive

Phew..
So we now know how to define regular expressions. Further explanation can be found at: …
We still need to know how to use them!

Boolean Matching
We can use the function preg_match() to test whether a string matches or not.
// match an email
$input = …;
if (preg_match($…Regex, $input)) {
    echo 'Is a valid email';
} else {
    echo 'NOT a valid email';
}
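
Since the full pattern and the sample input are not preserved in this transcript, the sketch below is a stand-in rather than the original example. The domain and top-level parts follow the breakdown given earlier, but the user-name part '[a-z\d._%+-]+', the function name isValidEmail() and the test addresses are all assumptions.

```php
<?php
// Hypothetical helper: wraps preg_match() in a true/false test.
function isValidEmail(string $input): bool
{
    // ASSUMPTION: the user-name part before '@' is a guess for illustration;
    // the domain and top-level parts follow the slide's breakdown.
    $emailRegex = '/^[a-z\d._%+-]+@([a-z\d-]+\.)+[a-z]{2,6}$/i';

    return preg_match($emailRegex, $input) === 1;
}

foreach (['someone@example.com', 'not.an.email.com'] as $candidate) {
    echo $candidate, ': ', isValidEmail($candidate) ? 'Is a valid email' : 'NOT a valid email', PHP_EOL;
}
```

For real form validation, PHP's built-in filter_var() with FILTER_VALIDATE_EMAIL may be a safer choice than a hand-written pattern.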

Pattern replacement
We can use the function preg_replace() to replace any matching strings.
// strip any multiple spaces
$input = 'Some    comment    string';
$regex = '/\s\s+/';
$clean = preg_replace($regex, ' ', $input);
// 'Some comment string'
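
A runnable version of the same idea, assuming an input that mixes spaces and a tab (the exact spacing is invented for the example):

```php
<?php
// Three spaces after 'Some'; a space, a tab and two spaces after 'comment'.
$input = "Some   comment \t  string";

// \s\s+ means "one whitespace character followed by at least one more",
// so only runs of two or more whitespace characters are replaced.
$clean = preg_replace('/\s\s+/', ' ', $input);

echo $clean, PHP_EOL; // Some comment string
```

A lone tab or line break would be left untouched by this pattern; '/\s+/' would turn those into a single space as well.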

Sub-references
We're not quite finished: we need to master the concept of sub-references. Any bracketed expression in a regular expression is regarded as a sub-reference. You use it to extract the bits of data you want from the matched string. Easiest with an example..

Sub-reference example:
I start with a date string in a particular format:
$str = '10, April 2007';
The regex that matches this is:
$regex = '/\d+,\s\w+\s\d+/';
If I want to extract the bits of data, I bracket the relevant bits:
$regex = '/(\d+),\s(\w+)\s(\d+)/';

Extracting data..
I then pass in an extra argument to the function preg_match():
$str = 'The date is 10, April 2007';
$regex = '/(\d+),\s(\w+)\s(\d+)/';
preg_match($regex, $str, $matches);
// $matches[0] = '10, April 2007'
// $matches[1] = 10
// $matches[2] = April
// $matches[3] = 2007
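
In practice it is worth checking the return value before reading $matches. The sketch below does that and unpacks the sub-references into named variables; the variable names and the integer casts are assumptions added for the example.

```php
<?php
$str   = 'The date is 10, April 2007';
$regex = '/(\d+),\s(\w+)\s(\d+)/';

$day = $month = $year = null;
if (preg_match($regex, $str, $matches) === 1) {
    // $matches[0] is the whole match; 1 to 3 are the bracketed parts, in order.
    list(, $day, $month, $year) = $matches;
    // The extracted values are strings, so cast the numeric ones if needed.
    $day  = (int) $day;
    $year = (int) $year;
}

printf("day=%d month=%s year=%d\n", $day, $month, $year); // day=10 month=April year=2007
```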

Back-references
This technique can also be used to reference the original text during replacements, with $1, $2, etc. in the replacement string:
$str = 'The date is 10, April 2007';
$regex = '/(\d+),\s(\w+)\s(\d+)/';
$str = preg_replace($regex, '$1-$2-$3', $str);
// $str = 'The date is 10-April-2007'
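
Because each back-reference is numbered, the pieces can also be put back in a different order. A minimal sketch, with the year-first ordering chosen only for illustration:

```php
<?php
$str   = 'The date is 10, April 2007';
$regex = '/(\d+),\s(\w+)\s(\d+)/';

// $1 = day, $2 = month, $3 = year; here they are written back year first.
$reordered = preg_replace($regex, '$3-$2-$1', $str);

echo $reordered, PHP_EOL; // The date is 2007-April-10
```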

Phew Again!
We now know how to define regular expressions. We now also know how to use them: matching, replacement, data extraction.
HOE 14: Regex