Regular Expressions for PHP Adding magic to your programming. Geoffrey Dunn

What are Regular Expressions
Regular expressions are a syntax for matching text. They date back to mathematical notation developed in the 1950s, and became embedded in Unix systems through tools like ed and grep.

What are RE
Perl in particular promoted the use of very complex regular expressions. They are now available in nearly all popular programming languages, and they allow much more complex matching than strpos().

Why use RE
You can use regular expressions to enforce rules on formats like phone numbers, addresses or URLs. You can also use them to find key data within logs, configuration files or web pages.

Why use RE
They can quickly make replacements that would otherwise be complex, like finding all email addresses in a page and rewriting them as address [AT] site [dot] com. You can also make your code really hard to understand.

Syntax basics
The entire regular expression is a sequence of characters between two forward slashes (/).
abc - most characters are normal character matches. This looks for the exact character sequence a, b and then c.
. - a period will match any character (except a newline, but that can change).
[abc] - square brackets will match any one of the characters inside. Here: a, b or c.

Syntax basics
? - marks the previous item as optional, so a? means there might be an a.
(abc)* - parentheses group patterns and the asterisk marks zero or more of the previous item. So this would match an empty string or abcabcabcabc.
\.+ - the backslash is an all-purpose escape character, and the + marks one or more of the previous item. So this would match a run of periods such as ......
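The basic pieces above can be tried out with preg_match(). This is a minimal sketch rather than anything from the slides: the subjects are made up, and ^ and $ are added to each pattern (an assumption of this example) so that the whole subject has to match.

```php
<?php
// Each row: [pattern, subject, expected preg_match() result].
// ^ and $ are added so the whole subject has to match.
$checks = [
    ['/^abc$/',    'abc',          1], // literal sequence a, b, c
    ['/^a.c$/',    'axc',          1], // . stands for any one character
    ['/^a.c$/',    "a\nc",         0], // ...but not a newline by default
    ['/^a.c$/s',   "a\nc",         1], // the s modifier changes that
    ['/^[abc]$/',  'b',            1], // one of a, b or c
    ['/^[abc]$/',  'd',            0],
    ['/^ba?d$/',   'bd',           1], // the a is optional
    ['/^ba?d$/',   'bad',          1],
    ['/^(abc)*$/', '',             1], // zero repetitions of the group
    ['/^(abc)*$/', 'abcabcabcabc', 1], // four repetitions
    ['/^\.+$/',    '......',       1], // one or more literal periods
    ['/^\.+$/',    'abc',          0], // an escaped period is not a wildcard
];

$mismatches = 0;
foreach ($checks as [$pattern, $subject, $expected]) {
    $actual = preg_match($pattern, $subject);
    if ($actual !== $expected) {
        $mismatches++;
    }
    // json_encode() makes the newline in the subject visible as \n.
    printf("%-12s %-16s => %d\n", $pattern, json_encode($subject), $actual);
}
```

Single-quoted PHP strings are used so that backslashes reach the regular expression engine unchanged.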

More syntax tricks
[0-4] - match any digit from 0 to 4.
[^0-4] - match any character that is not a digit from 0 to 4.
\sword\s - match word where there is white space before and after.
\bword\b - \b marks a word boundary: the position between a word character and a non-word character (such as white space or a new line), or the start or end of the string.

More syntax tricks
\d{3,12} - \d matches any digit ([0-9]) while the braces mark the min and max count of the previous item. In this case 3 to 12 digits.
[a-z]{8,} - must be at least 8 lowercase letters.
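A short sketch of these tricks in use. The sample strings and variable names are invented for illustration; the point is mainly to show how \b differs from \s and how the brace counts behave at their limits.

```php
<?php
// preg_match_all() returns the number of matches and fills $m[0] with them.
preg_match_all('/[0-4]/', '90210', $m);
$lowDigits = $m[0];                     // every digit in the range 0-4

preg_match_all('/[^0-4]/', '90210', $m);
$otherChars = $m[0];                    // everything else

// \b needs no character on either side; \s needs a real whitespace character.
$subject = 'a word, swordfish, word';
$boundaryCount = preg_match_all('/\bword\b/', $subject);
$spaceCount    = preg_match_all('/\sword\s/', $subject);

// {3,12}: between 3 and 12 of the previous item; {8,}: 8 or more.
$tooShort  = preg_match('/^\d{3,12}$/', '12');
$justRight = preg_match('/^\d{3,12}$/', '123456');
$tooLong   = preg_match('/^\d{3,12}$/', '1234567890123');
$longWord  = preg_match('/^[a-z]{8,}$/', 'swordfish');
$shortWord = preg_match('/^[a-z]{8,}$/', 'sword');

printf("boundary matches: %d, whitespace matches: %d\n", $boundaryCount, $spaceCount);
```

In the sample sentence \bword\b finds the word before the comma and the one at the end of the string, while \sword\s finds neither, because neither is followed by white space.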

Matching Text
Simple check: z0-9]+\.)*[a-z0-9]+$/i", $ _address) > 0
Finding: preg_match("/\bcolou?r:\s+([a-zA-Z]+)\b/", $text, $matches); echo $matches[1];
Find all: preg_match_all("/ ]+)>/", $html, $tags); echo $tags[2][1];
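The three uses above can be sketched as follows. The colour pattern is the one from the slide; the sample text, the $username value and the alphanumeric check are assumptions made for this example and stand in for the slide's own simple check, which is not reproduced here.

```php
<?php
$text = 'Width: 20; colour: red; color:  Blue';

// Simple check: preg_match() returns 1 on a match, 0 on no match, false on error.
$username = 'Geoff42';                                  // hypothetical input
$isAlnum = preg_match('/^[a-z0-9]+$/i', $username) === 1;

// Finding: $matches[0] is the whole match, $matches[1] the first group.
preg_match('/\bcolou?r:\s+([a-zA-Z]+)\b/', $text, $matches);
$firstColour = $matches[1];

// Find all: with the default PREG_PATTERN_ORDER, $all[1] lists every
// capture of group 1, so $all[1][1] is group 1 of the second match.
$count = preg_match_all('/\bcolou?r:\s+([a-zA-Z]+)\b/', $text, $all);
$secondColour = $all[1][1];

echo $firstColour, ' ', $secondColour, ' ', $count, "\n";
```

Comparing the result with === 1 rather than treating it as a boolean keeps a false return (a pattern error) from being mistaken for "no match".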

Matching Lines
This is mostly for looking through files, but it works on any array of text.
$new_lines = preg_grep("/Jan[a-z]*[\s\/\-](20)?07/", $old_lines);
To get the lines that do not match, pass PREG_GREP_INVERT as a third parameter rather than complicating your regular expression into something like /^[^\/]|(\/[^p])|(\/p[^r]) etc...
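A small sketch of both directions, using the slide's pattern on an invented $old_lines array. Note that preg_grep() keeps the original array keys, which is easy to overlook when the result is later indexed from 0.

```php
<?php
// Invented sample lines; in practice these might come from file().
$old_lines = [
    'Jan 07 backup ok',
    'January/2007 report',
    'Feb 07 backup ok',
    'Jan-2008 audit',
];
$pattern = '/Jan[a-z]*[\s\/\-](20)?07/';

// Lines that match, and lines that do not.
$matching = preg_grep($pattern, $old_lines);
$others   = preg_grep($pattern, $old_lines, PREG_GREP_INVERT);

// Keys are preserved; array_values() renumbers them if needed.
print_r(array_values($matching));
print_r(array_values($others));
```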

Replacing text
preg_replace( array(" [AT] ", " [dot] "), $post);
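One possible way to get the address [AT] site [dot] com effect is sketched below. The address pattern, the sample $post and the use of preg_replace_callback() are assumptions of this example, not the slide's own code; the inner preg_replace() call shows the array form, where each pattern is paired with the replacement at the same position.

```php
<?php
$post = 'Mail address@site.com or john.doe@example.org today.';

// Assumed, deliberately loose pattern for spotting an address.
$emailPattern = '/[a-z0-9._-]+@[a-z0-9.-]+\.[a-z]{2,}/i';

$obfuscated = preg_replace_callback(
    $emailPattern,
    function (array $m) {
        // Arrays are applied pairwise and in order: first @, then each period.
        return preg_replace(['/@/', '/\./'], [' [AT] ', ' [dot] '], $m[0]);
    },
    $post
);

echo $obfuscated, "\n";
```

Restricting the array replacement to the matched address keeps ordinary sentence periods in the post untouched.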

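Splitting text
$date_parts = preg_split("/[-.,\/\\\\\s]+/", $date_string);
The class lists a dash, period, comma, slash, backslash and white space. In PHP source the literal backslash takes four backslashes, so that the \s after it still reaches the pattern as white space.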
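A sketch of splitting dates written with different separators. The sample dates are invented, and the pattern is written here so that the class contains a dash, period, comma, slash, backslash and white space; the last line shows what happens when one level of backslash escaping goes missing.

```php
<?php
// Separators: dash, period, comma, slash, backslash, whitespace.
// In PHP source, four backslashes yield the two that PCRE needs
// for one literal backslash.
$pattern = '/[-.,\/\\\\\s]+/';

$samples = ['2007-01-15', '15/01/2007', '15 Jan, 2007', '15\01\2007'];
$parts = [];
foreach ($samples as $date_string) {
    $parts[$date_string] = preg_split($pattern, $date_string);
}
print_r($parts);

// With only three backslashes before the s, PCRE receives a literal
// backslash followed by a literal s: whitespace no longer splits,
// but the letter s does.
$surprise = preg_split("/[-\.,\/\\\s]+/", '1 August 2007');
print_r($surprise);
```

Because the quantifier + treats a run of separators as one, "15 Jan, 2007" yields three parts rather than an empty string between the comma and the space.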

Tips
Comment what your regular expression is doing.
Test your regular expression for speed. Some can cause a noticeable slowdown.
There are plenty of simple uses like /Width: (\d+)/
Watch out for greedy expressions. E.g. a greedy pattern will not pull out "b" and "/b" from "<b>test</b>" but will instead pull "b>test</b".
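The greedy-matching tip can be illustrated like this. The three patterns are chosen for this sketch and are not necessarily the ones from the slide; they show the greedy behaviour and two common ways around it.

```php
<?php
$html = '<b>test</b>';

// Greedy: .+ runs on to the last > it can reach.
preg_match('/<(.+)>/', $html, $greedy);

// Lazy: .+? stops at the first > that lets the match succeed.
preg_match_all('/<(.+?)>/', $html, $lazy);

// Negated class: the group cannot cross a > at all.
preg_match_all('/<([^>]+)>/', $html, $negated);

echo $greedy[1], "\n";
echo implode(', ', $lazy[1]), "\n";
echo implode(', ', $negated[1]), "\n";
```

The negated class is often the better habit of the two fixes, since it says directly what the group may contain instead of relying on backtracking.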

References Thank you