

Regular Expressions CISC/QCSE 810

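Recognizing Matching Strings
ls *.exe translates to "any set of characters, followed by the exact string '.exe'".
The "*.exe" is a pattern. Strictly speaking it is a shell glob, a simpler relative of a regular expression, but the idea is the same: out of all the file names, only those that match the pattern "*.exe" are returned. (The shell does the matching and passes the resulting names to ls.)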

In Perl
In Perl, you can test whether a string matches a pattern using the =~ operator:
    $s = "Cat In the Hat";
    if ($s =~ /Cat/) { print "Matches Cat"; }
    if ($s =~ /Chat/) { print "Matches Chat"; }
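A minimal runnable sketch of the same test, assuming a modern Perl with strict and warnings enabled. The variable names and the extra "Dog" check with the negated operator !~ are illustrative additions, not part of the original slide.

```perl
use strict;
use warnings;

my $s = "Cat In the Hat";

# =~ binds the string to the pattern; the match is true when the
# pattern occurs anywhere in the string.
my $has_cat  = ($s =~ /Cat/)  ? 1 : 0;
my $has_chat = ($s =~ /Chat/) ? 1 : 0;

# !~ is the negated form: true when the pattern does not match.
my $no_dog = ($s !~ /Dog/) ? 1 : 0;

print "Matches Cat\n"  if $has_cat;
print "Matches Chat\n" if $has_chat;   # not printed: "Chat" never appears
print "No Dog here\n"  if $no_dog;
```

Matching is case sensitive by default, so /cat/ would not match this string.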

Common references
    \w   Word character             \W   Non-word character
    \s   Space, tab                 \S   Non-whitespace character
    \d   Match a digit              \D   Non-digit match
    \n   Newline                    \t   Tab
    .    Any character
    ^    Start of string            $    End of string
Modifiers
    *      0 or more occurrences    {n}    Exactly n matches
    {n,}   n or more matches        {n,m}  Match n to m matches
Character Groups
    [a-z]       [xyz]
    [0-9A-Z]    [\w_]
    [^a-z]      NOT a-z
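The entries in the table can be tried out with a short script. This is only a sketch: the test strings and labels are hypothetical, chosen so that each pattern matches its string.

```perl
use strict;
use warnings;

# Hypothetical test strings, chosen only to exercise the table entries.
my %checks = (
    'digits only'      => [ '2024',    qr/^\d+$/        ],
    'exactly 3 digits' => [ '123',     qr/^\d{3}$/      ],
    '2 to 4 letters'   => [ 'abc',     qr/^[a-z]{2,4}$/ ],
    'no lowercase'     => [ 'ABC 123', qr/^[^a-z]*$/    ],
    'word space word'  => [ 'foo bar', qr/^\w+\s\w+$/   ],
);

my %result;
for my $name (sort keys %checks) {
    my ($text, $re) = @{ $checks{$name} };
    $result{$name} = ($text =~ $re) ? 1 : 0;
    printf "%-18s %-10s %s\n", $name, "'$text'",
        $result{$name} ? 'match' : 'no match';
}
```

The ^ and $ anchors force the whole string to fit the pattern; without them, /\d{3}/ would also accept "12345" because three digits occur somewhere inside it.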

Exercise 1
Write a regexp that matches only Canadian postal codes

Exercise 2
Write a regexp that matches typical intermediate files (.o, .dvi, .tmp)
Helpful if you want a systematic way to delete them

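String Substitution
Found an input file (*.dat), looking for a matching output file
    @input_files = glob("*.dat");   # assumed: the list assignment is missing from the transcript
    foreach $input_file (@input_files) {
        # Copy to output name
        $output_file = $input_file;
        # replace .dat with .out (dot escaped, anchored to the end of the name)
        $output_file =~ s/\.dat$/.out/;
        if (! -f $output_file) {
            print "Need to create output for $output_file\n";
        }
    }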
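One detail of this substitution is worth seeing in action: an unescaped "." matches any character, so s/.dat/.out/ can change the wrong part of a name. The sketch below compares the loose and the strict pattern on hypothetical file names; no files are read or written.

```perl
use strict;
use warnings;

# Hypothetical file names; no files are touched.
my @input_files = ('run1.dat', 'update.dat', 'mydata.dat');

my (@loose, @strict);
for my $input_file (@input_files) {
    # "." matches any character, so the first "<char>dat" is replaced
    (my $loose_name = $input_file) =~ s/.dat/.out/;
    # literal dot, and only at the very end of the name
    (my $strict_name = $input_file) =~ s/\.dat$/.out/;

    push @loose,  $loose_name;
    push @strict, $strict_name;
    print "$input_file -> loose: $loose_name, strict: $strict_name\n";
}
```

For "update.dat" the loose pattern hits the "pdat" inside "update" and produces "u.oute.dat", while the strict pattern gives "update.out".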

Translating
    $s = "Alternate Ending";
    $s =~ tr/a-z/A-Z/;
Can also use 'uc' and 'lc' (more generic for non-English languages)
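A small sketch of the difference between tr and uc, assuming a Perl with Unicode string support. The Greek sample string is a hypothetical example, written with \x{...} escapes so the source file stays plain ASCII.

```perl
use strict;
use warnings;
binmode STDOUT, ':encoding(UTF-8)';   # so the Greek letters print cleanly

my $s = "Alternate Ending";
(my $upper = $s) =~ tr/a-z/A-Z/;      # character-by-character mapping

# tr/a-z/A-Z/ only knows the 26 ASCII letters; uc follows Unicode case rules.
my $greek    = "\x{3b1}\x{3b2}\x{3b3}";    # lowercase alpha, beta, gamma
(my $tr_greek = $greek) =~ tr/a-z/A-Z/;    # unchanged
my $uc_greek  = uc $greek;                 # uppercase Alpha, Beta, Gamma

# With an empty replacement list, tr changes nothing and returns a count.
my $vowels = ($s =~ tr/aeiouAEIOU//);

print "$upper\n$tr_greek\n$uc_greek\nvowels: $vowels\n";
```

Note that tr takes character lists rather than a regular expression, which is why square brackets are not needed around a-z.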

Grabbing Substrings
Get root URL
    $url = "...";    # URL not preserved in the transcript
    $url =~ /(www[\w.]*)/;
    $short_url = $1;
    print "Full URL: $url\n";
    print "Site URL: $short_url\n";
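A runnable sketch of capturing with parentheses. The URL is hypothetical, used only for illustration, and the second pattern (splitting scheme, host and path) is an added example rather than something from the slides.

```perl
use strict;
use warnings;

# Hypothetical URL, used only for illustration.
my $url = "http://www.example.com/docs/index.html";

# $1 is only meaningful after a successful match, so test the match first.
my $short_url;
if ($url =~ /(www[\w.]*)/) {
    $short_url = $1;    # text matched by the first pair of parentheses
}

# In list context a match returns all captured groups at once.
my ($scheme, $host, $path) = $url =~ m{^(\w+)://([\w.]+)(/.*)?$};

print "Full URL: $url\n";
print "Site URL: $short_url\n";
print "Scheme: $scheme, host: $host, path: $path\n";
```

Using m{...} instead of /.../ avoids having to escape the slashes in the pattern.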

End options
s/a/A/g : global; swaps all matches
    changes "aaaba" to "AAAbA"
Compare with s/a/A/
    changes "aaaba" to "Aaaba"
/tmp/i : case insensitive
    recognizes "tmp", "Tmp", "tMP", "TMP", …
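The two flags can be checked with a few lines of Perl. This is a sketch; the counting idiom and the extra word "temp" in the list are illustrative additions.

```perl
use strict;
use warnings;

my $text = "aaaba";

(my $first_only = $text) =~ s/a/A/;     # replaces the first match only
(my $all        = $text) =~ s/a/A/g;    # /g replaces every match

# /g also works for matching: in list context it returns every match,
# and assigning through an empty list yields the number of matches.
my $count = () = $text =~ /a/g;

# /i ignores case; "temp" is included as a non-matching control.
my @hits = grep { /tmp/i } ('tmp', 'Tmp', 'tMP', 'TMP', 'temp');

print "$first_only $all $count @hits\n";
```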

Exercise
Write a regexp line that returns all the integers in the text
Can it be extended to handle floating point values?

Functions with Regex
split
    split /\s+/, $line;
    split /,/, $line;
    split /\t/, $line;
    split //, $line;
grep
    … = qw( aaa bba …
    … = grep …
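A short sketch of split and grep with regular expressions. All input strings and the word list are hypothetical; the slide's own grep example is incomplete in the transcript, so the grep patterns here are stand-ins rather than a reconstruction.

```perl
use strict;
use warnings;

# split takes a pattern describing the separator.
my @words  = split /\s+/, "alpha  beta\tgamma";   # any run of whitespace
my @fields = split /,/,   "1,2,,4";               # empty field is kept
my @chars  = split //,    "abc";                  # empty pattern: one char each

# grep keeps the list elements for which the pattern matches.
my @list     = qw( aaa bba abc cab );             # hypothetical word list
my @with_bb  = grep { /bb/ } @list;
my @starts_a = grep { /^a/ } @list;

print "words: @words\n";
print "fields: ", scalar(@fields), "\n";
print "chars: @chars\n";
print "with bb: @with_bb, starting with a: @starts_a\n";
```

Splitting on /\s+/ rather than on a single space is what makes the double space and the tab in the first string behave like one separator each.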

Longer example: Log files
Parsing log files
    [25/Mar/2003:02:22: ] "GET /gcs/new.gif HTTP/1.1"
    [25/Mar/2003:02:22: ] "GET /gcs/update.gif HTTP/1.1"
    proxy.skynet.be - - [25/Mar/2003:02:40: ] "GET /gcs/gc1hint.html HTTP/1.1"
    j3194.inktomisearch.com - - [25/Mar/2003:03:13: ] "GET /~gcs/K-12.html HTTP/1.0"
    kittyhawk.hhmi.org - - [25/Mar/2003:03:17: ] "HEAD /gcs/ HTTP/1.0"
    j3104.inktomisearch.com - - [25/Mar/2003:03:54: ] "GET /gcs/pa.html HTTP/1.0"
    crawl11-public.alexa.com - - [25/Mar/2003:04:51: ] "GET /gcs/clinical.html HTTP/1.0"
    …
    livebot search.live.com - - [24/Jul/2007:22:16: ] "GET /gcs/webstats/usage_ html HTTP/1.0"
    [24/Jul/2007:22:22: ] "GET /gcs/status/statuscheck.html HTTP/1.1"
    livebot search.live.com - - [24/Jul/2007:22:47: ] "GET /gcs/webstats/usage_ html HTTP/1.0"
    …
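Log entries of this shape can be pulled apart with one pattern and several capture groups. The sketch below assumes complete lines in the usual web server layout (host, two placeholder fields, bracketed timestamp, quoted request). The host names, timestamps, status codes and sizes are hypothetical, since the transcript does not preserve complete entries.

```perl
use strict;
use warnings;

# Hypothetical, complete log lines in the same layout as the slide.
my @log = (
    'proxy.example.org - - [25/Mar/2003:02:40:11 -0500] "GET /gcs/gc1hint.html HTTP/1.1" 200 5120',
    'crawler.example.net - - [25/Mar/2003:03:13:42 -0500] "GET /gcs/pa.html HTTP/1.0" 200 2048',
    'proxy.example.org - - [25/Mar/2003:03:17:05 -0500] "HEAD /gcs/ HTTP/1.0" 200 0',
);

my (%hits_by_host, @get_paths);
for my $entry (@log) {
    # host, two ignored fields, [timestamp], "METHOD path protocol"
    next unless $entry =~ m{^(\S+) \S+ \S+ \[([^\]]+)\] "(\w+) (\S+) ([^"]+)"};
    my ($host, $when, $method, $path, $protocol) = ($1, $2, $3, $4, $5);

    $hits_by_host{$host}++;
    push @get_paths, $path if $method eq 'GET';
}

print "$_: $hits_by_host{$_}\n" for sort keys %hits_by_host;
print "GET paths: @get_paths\n";
```

Lines that do not fit the pattern are skipped by the "next unless", which keeps malformed entries from polluting the counts.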

Alternate uses
If you write your own program with many print statements, you can
1. make the print statements meaningful
    "Time spent on loading: 23.5s"
2. parse the output afterwards to process/store values
    $line =~ m/: ([\d.]+)s/;
    $time = $1;
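A sketch of parsing such timing messages back out of program output. The output lines other than the "loading" one are hypothetical. Note where the + sits: inside the parentheses the group captures the whole number, whereas ([\d.])+ would leave only the last character in $1.

```perl
use strict;
use warnings;

# Hypothetical program output; only the first line comes from the slide.
my @output = (
    "Time spent on loading: 23.5s",
    "Time spent on solving: 101.25s",
    "Done.",
);

my %seconds;
for my $line (@output) {
    # Capture the phase name and the number in front of the trailing "s".
    if ($line =~ m/^Time spent on (\w+): ([\d.]+)s$/) {
        $seconds{$1} = $2;
    }
}

my $total = 0;
$total += $_ for values %seconds;

# The misplaced quantifier keeps only the last repetition of the group.
my ($last_char) = "Time spent on loading: 23.5s" =~ m/: ([\d.])+s/;

print "$_: $seconds{$_}s\n" for sort keys %seconds;
print "total: ${total}s\n";
```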

Resources
Any web search for "perl regular expression tutorial"
Perl reg exp by example
Reference card
Perl site reference