Published by Evan Neil Young. Modified over 9 years ago.
1
Regular Expressions CISC/QCSE 810
2
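Recognizing Matching Strings
ls *.exe translates to "any set of characters, followed by the exact string '.exe'"
The "*.exe" is a shell wildcard (glob) pattern, a simpler relative of a regular expression; the equivalent regular expression is /^.*\.exe$/
The shell compares the pattern against the file names in the directory and hands ls only those that match "*.exe"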
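The same filter can be written with a regular expression in Perl. This is a minimal sketch: the file names are made up for illustration, and the list stands in for a real directory listing.

```perl
use strict;
use warnings;

# Hypothetical directory listing (names invented for this example).
my @files = ('setup.exe', 'notes.txt', 'game.exe', 'exe.bak', 'readme');

# Keep names that end in the literal string ".exe".
# The dot is escaped because an unescaped "." means "any character".
my @exe_files = grep { /\.exe$/ } @files;

print "$_\n" for @exe_files;   # setup.exe and game.exe
```

Note that 'exe.bak' is rejected because the $ anchor requires ".exe" to sit at the end of the name.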
3
In Perl
In Perl, you can check whether a string matches using the =~ operator

    $s = "Cat In the Hat";
    if ($s =~ /Cat/)  { print "Matches Cat"; }    # printed: "Cat" occurs in $s
    if ($s =~ /Chat/) { print "Matches Chat"; }   # not printed: "Chat" does not occur in $s
4
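Common references

    \w   Word character (letter, digit, _)    \W   Non-word character
    \s   Whitespace (space, tab, newline)     \S   Non-whitespace character
    \d   Match a digit                        \D   Non-digit match
    \n   Newline                              \t   Tab
    .    Any character (except newline)
    ^    Start of string                      $    End of string

Modifiers

    *      0 or more occurrences              {n}    Exactly n matches
    {n,}   n or more matches                  {n,m}  Match n to m matches

Character Groups

    [a-z]      One lowercase letter           [xyz]    One of x, y or z
    [0-9A-Z]   One digit or uppercase letter  [\w_]    One word character (\w already includes _)
    [^a-z]     NOT a-z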
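A few of these entries can be tried out with a short script. This is only a sketch: the helper name matches_fully and the sample strings are hypothetical, chosen to exercise the classes and modifiers listed above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the pattern matches the whole string, else 0.
# ^ and $ anchor the pattern to the start and end of the string.
sub matches_fully {
    my ($pattern, $string) = @_;
    return $string =~ /^$pattern$/ ? 1 : 0;
}

my @cases = (
    [ '\d{3}',         '810'      ],   # exactly three digits: match
    [ '\d{3}',         '81'       ],   # only two digits: no match
    [ '\w*',           'QCSE_810' ],   # word characters only: match
    [ '[^a-z]{2,}',    'AB12'     ],   # two or more non-lowercase characters: match
    [ '\S\s\S',        'a b'      ],   # non-space, space, non-space: match
    [ '[0-9A-Z]{2,4}', 'abc'      ],   # lowercase is outside the group: no match
);

for my $case (@cases) {
    my ($pattern, $string) = @$case;
    my $result = matches_fully($pattern, $string) ? 'match' : 'no match';
    printf "%-14s %-10s %s\n", $pattern, $string, $result;
}
```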
5
Exercise 1
Write a regexp that matches only Canadian postal codes
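One possible answer, offered as a sketch rather than the intended solution: it checks only the letter-digit-letter, digit-letter-digit shape with an optional space in the middle. Real postal codes also restrict which letters may appear, which this pattern ignores. The sub name looks_like_postal_code and the test strings are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if the text has the shape "A1A 1A1" (space optional), else 0.
# The /i flag accepts lowercase letters as well.
sub looks_like_postal_code {
    my ($text) = @_;
    return $text =~ /^[A-Z]\d[A-Z] ?\d[A-Z]\d$/i ? 1 : 0;
}

for my $candidate ('K7L 3N6', 'k7l3n6', 'K7L 3N', '90210') {
    my $verdict = looks_like_postal_code($candidate) ? 'ok' : 'rejected';
    print "$candidate: $verdict\n";
}
```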
6
Exercise 2
Write a regexp that matches typical intermediate files (.o, .dvi, .tmp)
Helpful if you want a systematic way to delete them
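One possible answer, again only a sketch: an alternation of the three extensions anchored at the end of the name. The file names are invented, and the script only reports what it would delete instead of calling unlink.

```perl
use strict;
use warnings;

# Hypothetical file names, made up for illustration.
my @files = ('main.c', 'main.o', 'thesis.tex', 'thesis.dvi', 'scratch.tmp', 'video.mov');

# A literal dot, then one of the three extensions, then the end of the name.
my @intermediate = grep { /\.(o|dvi|tmp)$/ } @files;

print "Would delete: $_\n" for @intermediate;
```

Keeping the selection separate from the deletion makes it easy to inspect the list before anything is removed.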
7
String Substitution
Found an input file (*.dat), looking for a matching output file (.out)

    # The right-hand side was lost from the slide; a glob over *.dat is assumed here
    @input_files = glob("*.dat");
    foreach $input_file (@input_files) {
        # Copy to output name
        $output_file = $input_file;
        # Replace .dat with .out (dot escaped, anchored to the end of the name)
        $output_file =~ s/\.dat$/.out/;
        if (! -f $output_file) {
            print "Need to create output for $output_file\n";
        }
    }
8
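Translating

    $s = "Alternate Ending";
    $s =~ tr/a-z/A-Z/;    # $s is now "ALTERNATE ENDING"

tr takes plain character ranges, not regexp character groups, so no square brackets are needed
Can also use 'uc' and 'lc' (more generic for non-English languages)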
9
Grabbing Substrings
Get root URL

    $url = "http://www.mast.queensu.ca/~math224/Slides/Week_09/driven_spring2.m";
    $url =~ /(www[\w.]*)/;    # parentheses capture the matched text into $1
    $short_url = $1;          # "www.mast.queensu.ca"
    print "Full URL: $url\n";
    print "Site URL: $short_url\n";
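A slightly more defensive variant is sketched below. It assumes the site name is whatever follows "scheme://" up to the first slash (so it does not require the name to begin with "www"), and it uses a list assignment so that a failed match yields undef instead of a stale $1. The sub name site_of is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the host part of a URL, or undef if there is none.
sub site_of {
    my ($url) = @_;
    # In list context a match returns its captures, so $site is undef on failure.
    my ($site) = $url =~ m{^\w+://([\w.-]+)};
    return $site;
}

my $url  = "http://www.mast.queensu.ca/~math224/Slides/Week_09/driven_spring2.m";
my $site = site_of($url);
print "Site URL: $site\n" if defined $site;
```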
10
End options
s/a/A/g - global; swap all matches
    changes "aaaba" to "AAAbA"
Compare with s/a/A/
    changes "aaaba" to "Aaaba"
/tmp/i - case insensitive
    recognizes "tmp", "Tmp", "tMP", "TMP"…
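The two options can be checked directly. This short sketch reuses the strings from the slide; the variable names and the extra word "temp" are hypothetical additions.

```perl
use strict;
use warnings;

my $global = "aaaba";
my $first  = "aaaba";

# With /g, s/// replaces every match and returns how many it replaced.
my $count = ($global =~ s/a/A/g);

# Without /g, only the first match is replaced.
$first =~ s/a/A/;

# /i ignores case, so all four spellings of "tmp" match; "temp" does not.
my @hits = grep { /tmp/i } ('tmp', 'Tmp', 'tMP', 'TMP', 'temp');

print "$global ($count replacements)\n";   # AAAbA (4 replacements)
print "$first\n";                          # Aaaba
print scalar(@hits), " names matched\n";   # 4 names matched
```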
11
Exercise
Write a regexp line that returns all the integers in the text
Can it be extended to handle floating point values?
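One possible approach, as a sketch: in list context a match with the /g option returns every match, so a single line can collect all the numbers. The sample text is made up. Note that the plain integer pattern splits "23.5" into two integers, which is what the floating point extension addresses.

```perl
use strict;
use warnings;

# Hypothetical sample text.
my $text = "Slide 11 of 15, 810 students, time 23.5s";

# Every run of digits; "23.5" is seen as the two integers 23 and 5.
my @integers = $text =~ /\d+/g;

# Digits, optionally followed by a dot and more digits.
my @numbers = $text =~ /\d+(?:\.\d+)?/g;

print "Integers: @integers\n";   # Integers: 11 15 810 23 5
print "Numbers:  @numbers\n";    # Numbers:  11 15 810 23.5
```

The (?:...) group is non-capturing, so the match still returns whole numbers rather than just the fractional part.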
12
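Functions with Regex
split

    split /\s+/, $line;    # split on runs of whitespace
    split /,/, $line;      # split on commas
    split /\t/, $line;     # split on tabs
    split //, $line;       # split into individual characters

grep

    @v = qw( aaa bba bbc);
    @matches = grep /bb/, @v;    # ("bba", "bbc")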
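A short script shows what these calls return. The input strings are hypothetical; the grep list is the one from the slide.

```perl
use strict;
use warnings;

# Runs of spaces or tabs count as a single separator.
my @words = split /\s+/, "alpha  beta\tgamma";

# An empty field between two commas is kept as an empty string.
my @fields = split /,/, "1,2,,4";

# An empty pattern splits between every character.
my @chars = split //, "abc";

# grep keeps the elements for which the pattern matches.
my @v = qw( aaa bba bbc);
my @matches = grep /bb/, @v;

print scalar(@words),  " words: @words\n";     # 3 words: alpha beta gamma
print scalar(@fields), " fields\n";            # 4 fields
print scalar(@chars),  " chars: @chars\n";     # 3 chars: a b c
print "Matches: @matches\n";                   # Matches: bba bbc
```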
13
Longer example – Log files
Parsing log files

    195.5.23.103 - - [25/Mar/2003:02:22:11 -0800] "GET /gcs/new.gif HTTP/1.1" 200 926
    195.5.23.103 - - [25/Mar/2003:02:22:11 -0800] "GET /gcs/update.gif HTTP/1.1" 200 971
    proxy.skynet.be - - [25/Mar/2003:02:40:54 -0800] "GET /gcs/gc1hint.html HTTP/1.1" 200 16358
    j3194.inktomisearch.com - - [25/Mar/2003:03:13:12 -0800] "GET /~gcs/K-12.html HTTP/1.0" 200 3235
    kittyhawk.hhmi.org - - [25/Mar/2003:03:17:20 -0800] "HEAD /gcs/ HTTP/1.0" 200 0
    j3104.inktomisearch.com - - [25/Mar/2003:03:54:43 -0800] "GET /gcs/pa.html HTTP/1.0" 200 5614
    crawl11-public.alexa.com - - [25/Mar/2003:04:51:41 -0800] "GET /gcs/clinical.html HTTP/1.0" 200 20132
    …
    livebot-65-55-208-64.search.live.com - - [24/Jul/2007:22:16:58 -0700] "GET /gcs/webstats/usage_200602.html HTTP/1.0" 200 128720
    203.129.234.42 - - [24/Jul/2007:22:22:39 -0700] "GET /gcs/status/statuscheck.html HTTP/1.1" 200 1522624
    livebot-65-55-208-65.search.live.com - - [24/Jul/2007:22:47:32 -0700] "GET /gcs/webstats/usage_200610.html HTTP/1.0" 200 132580
    …
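One way such lines could be taken apart is sketched below. It assumes every line has exactly the layout shown above (host, two placeholder fields, bracketed timestamp, quoted request, status, byte count) and that the byte count is always numeric. The sub name parse_log_line and the hash keys are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash reference of fields, or undef if the line does not fit.
sub parse_log_line {
    my ($line) = @_;
    my ($host, $when, $method, $path, $status, $bytes) = $line =~ m{
        ^(\S+)\s+\S+\s+\S+\s+         # host, then the two fields shown as "-"
        \[([^\]]+)\]\s+               # timestamp between square brackets
        "(\S+)\s+(\S+)\s+[^"]*"\s+    # request: method, path, protocol
        (\d+)\s+(\d+)                 # status code and byte count
    }x or return;
    return { host => $host, when => $when, method => $method,
             path => $path, status => $status, bytes => $bytes };
}

# Two lines copied from the log excerpt; total the bytes sent to each host.
my @lines = (
    '195.5.23.103 - - [25/Mar/2003:02:22:11 -0800] "GET /gcs/new.gif HTTP/1.1" 200 926',
    '195.5.23.103 - - [25/Mar/2003:02:22:11 -0800] "GET /gcs/update.gif HTTP/1.1" 200 971',
);
my %bytes_by_host;
for my $line (@lines) {
    my $entry = parse_log_line($line) or next;   # skip lines that do not parse
    $bytes_by_host{ $entry->{host} } += $entry->{bytes};
}
print "$_: $bytes_by_host{$_} bytes\n" for sort keys %bytes_by_host;
```

The /x option lets the pattern be spread over several commented lines, which helps with a pattern this long.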
14
Alternate uses
If you write your own program, with many print statements, you can
1. make print statements meaningful

    "Time spent on loading: 23.5s"

2. parse the output afterwards to process/store values

    $line =~ m/: ([\d.]+)s/;    # the + belongs inside the parentheses to capture the whole number
    $time = $1;                 # "23.5"
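As a sketch of the second step, the script below collects timings from several output lines. Only the first line comes from the slide; the other lines, the "Time spent on" pattern with a captured label, and the variable names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical program output; only the first line appears on the slide.
my @output = (
    "Time spent on loading: 23.5s",
    "Time spent on solving: 101.25s",
    "Done.",
);

my %time_for;
my $total = 0;
for my $line (@output) {
    # Capture the stage name and the number of seconds; skip lines that do not fit.
    next unless $line =~ /^Time spent on (\w+): ([\d.]+)s$/;
    $time_for{$1} = $2;
    $total += $2;
}

print "$_: $time_for{$_}s\n" for sort keys %time_for;
print "Total: ${total}s\n";   # Total: 124.75s
```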
15
Resources
Any web search for "perl regular expression tutorial"
Perl reg exp by example: http://www.somacon.com/p127.php
Reference card: http://www.erudil.com/preqr.pdf
Perl site reference: http://perldoc.perl.org/perlre.html