Perl Day 4: Fuzzy Matches


1 Perl Day 4

2 Fuzzy Matches
We know about eq and ne, but they only match things exactly.
–Sometimes you want to match things more vaguely, like:
  I don’t care what number it is, but I care that it’s a number.
  I need to make sure the phone number entered has 10 digits.
Regular expressions make this possible.
–They are arguably the most powerful part of Perl.

3 Match
Let’s first deal with matching things.
–Imagine you ask the user to type in a phone number. You should next check that you got a valid phone number (10 digits). This first version only checks that the input contains at least one digit; an exact 10-digit check needs the counts and anchors from later slides.
    print("Enter a phone number\n");
    $PhoneNum = <STDIN>;
    chomp($PhoneNum);
    if ($PhoneNum =~ /\d+/) {
        print("good job\n");
    } else {
        print("That wasn't a phone number\n");
    }
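A stricter check for exactly 10 digits might look like the sketch below. It assumes the ^ and $ anchors and the {10} count that later slides introduce; is_phone_number is a hypothetical helper name, not something from the slides.

```perl
use strict;
use warnings;

# Hypothetical helper: true only if the whole string is exactly 10 digits.
sub is_phone_number {
    my ($input) = @_;
    # ^ and $ pin the pattern to the start and end, so nothing else may surround the digits.
    return $input =~ /^\d{10}$/ ? 1 : 0;
}

for my $candidate ('4042602694', '404-260-2694', 'call 4042602694 now') {
    my $verdict = is_phone_number($candidate)
        ? 'good job'
        : "That wasn't a phone number";
    print "$candidate: $verdict\n";
}
```

Only the first candidate passes. The unanchored /\d+/ from the slide would accept all three, because each contains at least one digit.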

4 =~
Up until now we’ve dealt with ==, !=, <, >, <=, >=, eq and ne in tests.
=~ means you are doing a regular expression match. Note that you are not looking for the literal string '/\d+/'; you are looking for what that pattern means.

5 \d, \w, \s
In a regular expression, you’ll often see \[something]. Each has a different meaning:
–\d: a digit (0-9)
–\w: a word character (a-z, A-Z, 0-9, _)
–\s: a whitespace character (space, tab or newline)
–. matches any single character except a newline
–\. matches only a dot
–Plain words match themselves exactly (e.g. /enda/ matches only the letters enda, anywhere in the string).
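The tokens above can be tried out with a small table of strings and patterns. This is only an illustrative sketch; the test strings are made up for the example.

```perl
use strict;
use warnings;

# Each pair is a test string and the pattern tried against it.
my @checks = (
    [ 'abc123',   qr/\d/    ],  # contains a digit
    [ 'abc',      qr/\d/    ],  # no digit anywhere
    [ 'a_b',      qr/\w_\w/ ],  # underscore between two word characters
    [ 'one two',  qr/\s/    ],  # contains a space
    [ 'file.txt', qr/\./    ],  # contains a literal dot
    [ 'filetxt',  qr/\./    ],  # no dot, so \. fails
    [ 'brenda',   qr/enda/  ],  # "enda" appears inside "brenda"
);

# 1 where the pattern matched, 0 where it did not.
my @matched = map { $_->[0] =~ $_->[1] ? 1 : 0 } @checks;
print "@matched\n";   # prints: 1 0 1 1 1 0 1
```

The last row shows that a plain word matches anywhere inside a longer string unless the pattern is anchored.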

6 + * {}
Any of the previous tokens can be followed by:
–+ means there must be 1 or more
–* means there can be any number (including 0)
–{7} means there must be exactly 7
–{1,4} means there must be between 1 and 4
–{0,10} means there can be at most 10 (the shorthand {,10} is only a quantifier in Perl 5.34 and later)
e.g.:
    =~ /\d{7}/ means the string must contain 7 digits in a row
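A short sketch of how these counts behave. It assumes the ^ and $ anchors from a later slide so that the count applies to the whole string; matches is a hypothetical helper, and the sample strings are invented.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if the string matches the pattern, otherwise 0.
sub matches {
    my ($string, $pattern) = @_;
    return $string =~ $pattern ? 1 : 0;
}

my $exactly_seven = matches('2602694',  qr/^\d{7}$/);    # 1: seven digits and nothing else
my $eight_digits  = matches('26026941', qr/^\d{7}$/);    # 0: one digit too many
my $unanchored    = matches('26026941', qr/\d{7}/);      # 1: seven digits occur inside the eight
my $one_to_four   = matches('abc',      qr/^\w{1,4}$/);  # 1: three word characters
my $star_on_empty = matches('',         qr/^\d*$/);      # 1: * accepts zero digits
my $plus_on_empty = matches('',         qr/^\d+$/);      # 0: + needs at least one digit

print "$exactly_seven $eight_digits $unanchored $one_to_four $star_on_empty $plus_on_empty\n";
```

The second and third lines differ only in the anchors, which is why an unanchored /\d{7}/ is closer to "at least 7 digits in a row" than "exactly 7".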

7 Endings
After the last /, you can put additional modifiers:
–i makes the match case insensitive
–g allows it to match more than once (globally)
–m lets ^ and $ match at the start and end of every line, not just of the whole string
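One way to see the effect of each modifier is to count matches in a small multi-line string. The sketch below uses the list-assignment counting idiom, which is not covered in the slides; the sample text is invented.

```perl
use strict;
use warnings;

my $text = "Apple\napple pie\nAPPLE";

# Assigning the match list to () and then to a scalar yields the number of matches.
my $case_sensitive   = () = $text =~ /apple/g;     # 1: only the lowercase spelling
my $case_insensitive = () = $text =~ /apple/gi;    # 3: i ignores case
my $string_start     = () = $text =~ /^apple/gi;   # 1: ^ means start of the string
my $line_starts      = () = $text =~ /^apple/gim;  # 3: m lets ^ match at each line start

print "$case_sensitive $case_insensitive $string_start $line_starts\n";
```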

8 Search and Replace
Uses the same language as matching.
–However, after the =~ you put an s.
–When you were doing matching there was secretly an m there; it’s optional.
    $Text = 'abc123 def456';
    $Text =~ s/\d/x/g;
–This will search for a digit and replace it with x. The g indicates it’ll do it everywhere it finds a digit.
  The result: $Text = 'abcxxx defxxx';
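To make the role of g concrete, the sketch below runs the same substitution with and without it, working on copies so the original string stays intact. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $text = 'abc123 def456';

# Copy first, then substitute on the copy.
(my $first_only = $text) =~ s/\d/x/;    # without g: only the first digit changes
(my $everywhere = $text) =~ s/\d/x/g;   # with g: every digit changes

# s///g also returns how many replacements it made.
my $copy         = $text;
my $replacements = ($copy =~ s/\d/x/g);

print "$first_only\n";     # prints: abcx23 def456
print "$everywhere\n";     # prints: abcxxx defxxx
print "$replacements\n";   # prints: 6
```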

9 More Examples
    $Text = 'abc123 def456';
    $Text =~ s/\d+//g;
    $Text =~ s/c/b/g;
    $Text =~ s/\d{2}/a/;
    $Text =~ s/\s*//g;
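The slide does not say whether these substitutions are meant to run one after another or each against the original string. The sketch below assumes the second reading and applies each one to a fresh copy, so the effect of every pattern can be seen on its own.

```perl
use strict;
use warnings;

my $original = 'abc123 def456';

# Assumption: each substitution starts again from the original string.
(my $no_digits  = $original) =~ s/\d+//g;    # every run of digits removed
(my $c_to_b     = $original) =~ s/c/b/g;     # every c becomes b
(my $two_digits = $original) =~ s/\d{2}/a/;  # first pair of digits becomes a
(my $no_spaces  = $original) =~ s/\s*//g;    # all whitespace removed

print "$no_digits\n";    # prints: abc def
print "$c_to_b\n";       # prints: abb123 def456
print "$two_digits\n";   # prints: abca3 def456
print "$no_spaces\n";    # prints: abc123def456
```

If the four lines are instead run in sequence on the same variable, the third one finds no digits left to replace and the final value is abbdef.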

10 Matching Specific Places
2 additional special characters:
–^ means match at the start of the string only
–$ means match at the end of the string only
Sometimes you only want to work on the first 3 digits:
    $Phone = 4042602694;
–This would remove the area code:
    $Phone =~ s/^\d{3}//g;
Sometimes you only want to work on the last 4:
    $Phone =~ s/\d{4}$//g;
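The anchored substitutions remove digits rather than keep them. The sketch below shows both directions: removing the anchored part, and keeping it instead by using capturing parentheses, which the next slide introduces. Variable names are illustrative.

```perl
use strict;
use warnings;

my $phone = '4042602694';

# Removing: work on copies so $phone itself is left alone.
(my $without_area_code = $phone) =~ s/^\d{3}//;   # drops the first 3 digits
(my $without_last_four = $phone) =~ s/\d{4}$//;   # drops the last 4 digits

# Keeping: a match in list context returns whatever the () captured.
my ($first_three) = $phone =~ /^(\d{3})/;
my ($last_four)   = $phone =~ /(\d{4})$/;

print "$without_area_code\n";   # prints: 2602694
print "$without_last_four\n";   # prints: 404260
print "$first_three\n";         # prints: 404
print "$last_four\n";           # prints: 2694
```

Because an anchored pattern can match in only one place, the g modifier makes no difference to these substitutions.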

11 Capturing
Anything you wrap in ()’s will be captured:
–The first ()’s are $1, the second are $2, etc.
    $Phone = 4042602694;
    $Phone =~ /(\d{3})(\d{3})(\d{4})/;
    $AreaCode = $1;
    $Exchange = $2;
    $Extension = $3;
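$1, $2 and $3 are only meaningful after a successful match; after a failed match they still hold values from an earlier one. A cautious version tests the match before reading them. In this sketch split_phone is a hypothetical helper, and the anchors are an added assumption so that only a 10-digit string is accepted.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the three parts, or an empty list if there is no match.
sub split_phone {
    my ($phone) = @_;
    if ($phone =~ /^(\d{3})(\d{3})(\d{4})$/) {
        return ($1, $2, $3);   # safe: the match just succeeded
    }
    return;
}

my ($area_code, $exchange, $extension) = split_phone('4042602694');
print "$area_code-$exchange-$extension\n";   # prints: 404-260-2694

my @parts = split_phone('not a number');
print scalar(@parts), "\n";                  # prints: 0
```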

12 Translation
Changing strings to upper case is easy.
–The command is translate (tr). It is written like match and search-and-replace, but it takes lists of characters rather than a regular expression.
    $Text = "this is lower case";
    $Text =~ tr/a-z/A-Z/;
In a regular expression, square brackets create a “character class” (tr does not need them):
–a-z means all letters between a and z
–[c-k] would be all letters from c to k
–[asdf] would be a, s, d and f
–[ab]+ would be any combination of a and b, like:
  a
  ab
  aaaaa
  bbbbb
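The sketch below puts tr next to two related tools: the built-in uc function, which is the more usual way to upper-case a string, and a character class inside an ordinary match. uc and the counting use of tr are not in the slides and are shown only for comparison.

```perl
use strict;
use warnings;

my $text = 'this is lower case';

# tr maps each character in the first list to the one at the same position in the second.
(my $upper = $text) =~ tr/a-z/A-Z/;

# uc gives the same result for plain ASCII letters.
my $via_uc = uc $text;

# With an empty replacement list, tr changes nothing and just counts the listed characters.
my $vowel_count = ($text =~ tr/aeiou//);

# A character class belongs in a regular expression, not in tr.
my $only_a_and_b = 'abba' =~ /^[ab]+$/ ? 1 : 0;   # 1: nothing but a and b
my $has_other    = 'abc'  =~ /^[ab]+$/ ? 1 : 0;   # 0: c is not in the class

print "$upper\n";         # prints: THIS IS LOWER CASE
print "$vowel_count\n";   # prints: 6
```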

13 Is Tomato a Fruit or Veg?
grep can help. It looks in an array to tell you if a pattern is in the array.
–The pattern can be any regular expression like what you just learned.
    @Fruits = ('apple', 'banana', 'orange');
    @Veg = ('potato', 'carrot', 'tomato');
    if (grep(/tomato/i, @Fruits)) {
        print("It's a fruit\n");
    } elsif (grep(/tomato/i, @Veg)) {
        print("It's a veg\n");
    }
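grep returns a count when used as a condition or assigned to a scalar, and the matching elements when assigned to an array. The sketch below shows both, plus an anchored pattern for whole-word matches; the extra list entry is invented for the example.

```perl
use strict;
use warnings;

my @items = ('potato', 'carrot', 'tomato', 'Tomato soup');

# Scalar context: how many elements match.
my $count = grep { /tomato/i } @items;

# List context: the matching elements themselves.
my @hits = grep { /tomato/i } @items;

# Anchoring with ^ and $ keeps only elements that are exactly "tomato".
my @exact = grep { /^tomato$/i } @items;

print "$count\n";                # prints: 2
print join(', ', @hits), "\n";   # prints: tomato, Tomato soup
print join(', ', @exact), "\n";  # prints: tomato
```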

14 Split
If you have a string, and you want to make it into an array, split can help.
    $Text = "This is a string";
    @Words = split(/\s/, $Text);
    print("$Words[2]\n");
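Splitting on /\s/ works when words are separated by exactly one space. With uneven spacing it produces empty elements, and /\s+/ is a common alternative. The sketch below compares the two on an invented string.

```perl
use strict;
use warnings;

my $text = 'This  is a   string';   # two spaces after "This", three after "a"

# Every single whitespace character is a separator, so empty fields appear between neighbours.
my @by_single = split(/\s/, $text);

# A run of whitespace counts as one separator.
my @by_run = split(/\s+/, $text);

print scalar(@by_single), "\n";   # prints: 7
print scalar(@by_run), "\n";      # prints: 4
print "$by_run[2]\n";             # prints: a
```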

