Scripting Languages Chapter 8 More About Regular Expressions.

1 Scripting Languages Chapter 8 More About Regular Expressions

2 Character Classes
• A character class is a list of possible characters inside square brackets ( [ ] ).
• It matches exactly one character, which may be any of the characters listed.

3 Square Brackets
• Square brackets are similar to the period, except that the match is limited to the characters listed within the brackets.
• Example: /h[aeiou]t/ matches hat, het, hit, hot, and hut, but not ht or hrt.
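
The example on this slide can be tried with a few lines of Perl. This is a minimal sketch; the helper name vowel_matches and the sample word list are illustrative assumptions, not part of the lecture.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: keep only the words that contain
# an h, then exactly one vowel from the class, then a t.
sub vowel_matches {
    my @words = @_;
    return grep { /h[aeiou]t/ } @words;
}

my @sample = qw(hat het hit hot hut ht hrt);
print join(" ", vowel_matches(@sample)), "\n";   # prints: hat het hit hot hut
```

The class stands for exactly one character, which is why ht (no vowel) and hrt (r is not in the class) are filtered out.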

4 Shortcuts
• Some character classes are used so often that shortcuts help.
• A dash ( - ) inside square brackets specifies a range:
    [0-9] is the same as [0123456789]
    [a-z] is the same as [abcdefghijklmnopqrstuvwxyz]
    [a-fA-F] is the same as [abcdefABCDEF]
• /[a-z][0-9][A-Z]/ matches a string that contains a lowercase letter, followed by a digit, followed by an uppercase letter.
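
A short sketch of the three-range pattern from this slide. The function name has_lower_digit_upper and the test strings are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true (1) when the string contains a lowercase
# letter, a digit, and an uppercase letter in three consecutive positions.
sub has_lower_digit_upper {
    my ($s) = @_;
    return $s =~ /[a-z][0-9][A-Z]/ ? 1 : 0;
}

for my $s ("a1B", "xxq7Zyy", "A1b", "ab") {
    printf "%-8s %s\n", $s, has_lower_digit_upper($s) ? "match" : "no match";
}
```

Because the pattern is not anchored, "xxq7Zyy" matches on the q7Z in the middle, while "A1b" fails because the cases are in the wrong order.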

5 ( - ) limitations
• The dash ( - ) has this special meaning only when it specifies a range inside square brackets.
• /-[0-9]/ matches any string that contains a dash followed by a digit.
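
The sketch below shows the dash as an ordinary character. Outside square brackets it is always literal; inside them it is also literal when it comes first, as in the class [-+] used later in this chapter. The helper names are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Dash outside a class: a literal dash followed by one digit.
sub dash_then_digit {
    my ($s) = @_;
    return $s =~ /-[0-9]/ ? 1 : 0;
}

# Dash first inside a class: a literal dash or a literal plus.
sub has_sign {
    my ($s) = @_;
    return $s =~ /[-+]/ ? 1 : 0;
}

for my $s ("a-7b", "3-x", "42", "+5") {
    printf "%-5s dash_then_digit=%d has_sign=%d\n",
        $s, dash_then_digit($s), has_sign($s);
}
```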

6 Caret character
• The caret has a special meaning if it is the first character within square brackets.
• There it represents all characters except those that follow it.
• This program prints the lines that contain at least one character that is not a digit:
    while (<STDIN>) {
        chomp;
        print "$_\n" if /[^0-9]/;
    }
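
The position of the caret changes the meaning of a pattern, and the three cases are easy to confuse. The sketch below contrasts them; the function names are assumptions chosen for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Caret inside the brackets: at least one character that is NOT a digit.
sub has_non_digit {
    my ($s) = @_;
    return $s =~ /[^0-9]/ ? 1 : 0;
}

# Caret outside the brackets: the string STARTS with a digit.
sub starts_with_digit {
    my ($s) = @_;
    return $s =~ /^[0-9]/ ? 1 : 0;
}

# Negated match: the string contains no digit at all.
sub has_no_digit {
    my ($s) = @_;
    return $s !~ /[0-9]/ ? 1 : 0;
}

for my $s ("abc", "12a", "123") {
    printf "%-4s non_digit=%d starts_with_digit=%d no_digit=%d\n",
        $s, has_non_digit($s), starts_with_digit($s), has_no_digit($s);
}
```

Note that "12a" contains a non-digit and also contains digits, so /[^0-9]/ matches it even though it is not digit-free.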

7 Multiple-character matches
• Use a set of curly braces to match multiple occurrences of a character.
• The number inside the curly braces applies to the item immediately to the left of the braces.
• /x[0-9]{2}x/ matches exactly two digits surrounded by the "x" character.

8 Examples
• /x[0-9]{2,5}x/ matches two to five digits surrounded by the "x" character.
• /a[0-9]{5,}s/ matches at least five digits preceded by "a" and followed by "s".
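
The three brace forms ({n}, {n,m}, and {n,}) can be compared side by side. This is a small sketch using the patterns exactly as written on the slides; the function names and test strings are illustrative assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# {2}: exactly two digits between the x characters.
sub exactly_two {
    my ($s) = @_;
    return $s =~ /x[0-9]{2}x/ ? 1 : 0;
}

# {2,5}: between two and five digits between the x characters.
sub two_to_five {
    my ($s) = @_;
    return $s =~ /x[0-9]{2,5}x/ ? 1 : 0;
}

# {5,}: five or more digits between an a and an s.
sub five_or_more {
    my ($s) = @_;
    return $s =~ /a[0-9]{5,}s/ ? 1 : 0;
}

for my $s ("x12x", "x123x", "x123456x") {
    printf "%-9s exactly_two=%d two_to_five=%d\n",
        $s, exactly_two($s), two_to_five($s);
}
for my $s ("a1234s", "a12345s") {
    printf "%-9s five_or_more=%d\n", $s, five_or_more($s);
}
```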

9 Multiple-character Patterns
• Several multiple-character patterns occur more commonly than others.
• Perl has special regular expression characters for these situations:
    ? matches zero or one of whatever is at its left
    * matches zero or more of whatever is at its left
    + matches one or more of whatever is at its left
• Remember that these symbols represent themselves when enclosed in square brackets.

10 Example
• /[-+]?[0-9]+/ matches an optionally signed run of digits, that is, one or more digits possibly preceded by either a + or a -.
• Note: the two +'s have different meanings. The first, inside square brackets, is a literal plus sign; the second means "one or more".
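
To see what the signed-number pattern actually picks out of a string, the sketch below wraps it in parentheses so the matched text can be returned. The function name first_number is a hypothetical helper, and the capture group is an addition for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the first optionally signed run of
# digits found in the string, or undef when there is none.
sub first_number {
    my ($s) = @_;
    return $s =~ /([-+]?[0-9]+)/ ? $1 : undef;
}

for my $s ("-256hello", "hello+256", "the2ndone", "hello") {
    my $found = first_number($s);
    printf "%-10s %s\n", $s, defined($found) ? $found : "(no match)";
}
```

The last string has no digit, so the + quantifier cannot be satisfied and the match fails even though the sign is optional.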

11 Other Special Symbols
• Remember (from last lecture) that | lets you determine whether a string contains one of a set of alternatives. More examples:
    print if /(10|15|19)/;
    print if /1(0|5|9)/;
• Note that the parentheses are used to group a set of alternatives.
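
The two alternation patterns accept the same strings: one lists the alternatives in full, the other factors out the shared leading 1. A brief sketch, with hypothetical function names and sample inputs:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Alternatives written out in full.
sub full_alternatives {
    my ($s) = @_;
    return $s =~ /(10|15|19)/ ? 1 : 0;
}

# Common prefix "1" factored out of the group.
sub factored_alternatives {
    my ($s) = @_;
    return $s =~ /1(0|5|9)/ ? 1 : 0;
}

for my $s ("room 15", "219", "11", "1 0") {
    printf "%-8s full=%d factored=%d\n",
        $s, full_alternatives($s), factored_alternatives($s);
}
```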

12 Other shortcuts
• \w - Matches a word character (a letter, a digit, or an underscore: [a-zA-Z0-9_])
• \W - Matches a non-word character
• \s - Matches a whitespace character (blank, tab, or newline)
• \S - Matches a non-whitespace character
• \d - Matches a digit character
• \D - Matches a non-digit character
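
Each lowercase shortcut and its uppercase partner split the characters of a string between them. The sketch below counts the matches of each shortcut in one ASCII sample string; the helper count_matches and the sample text are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count how many times a pattern matches in a string.
sub count_matches {
    my ($s, $re) = @_;
    my @hits = $s =~ /$re/g;   # list context with /g returns every match
    return scalar @hits;
}

my $text = "ab_1 2\tc";        # 8 characters: a b _ 1 space 2 tab c

my @shortcuts = (
    [ '\w', qr/\w/ ], [ '\W', qr/\W/ ],
    [ '\s', qr/\s/ ], [ '\S', qr/\S/ ],
    [ '\d', qr/\d/ ], [ '\D', qr/\D/ ],
);

for my $pair (@shortcuts) {
    my ($name, $re) = @$pair;
    printf "%s matches %d of %d characters\n",
        $name, count_matches($text, $re), length($text);
}
```

For this string the counts are 6 and 2 for \w and \W, 2 and 6 for \s and \S, and 2 and 6 for \d and \D, so each pair adds up to the string length.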

13 More About Anchoring Patterns
• Earlier we saw that this expression matches an optionally signed run of digits: /[-+]?[0-9]+/
• Each of these strings matches the pattern:
    -256hello
    hello+256
    1yes
    the2ndone

14
• If you wanted to match a string that contains only a number, the pattern on the previous slide is probably not what you intended.
• Example: if you asked the user for a number, you would expect responses such as -256, +256, or 345.
• To solve this, we need to anchor the match to certain boundaries.

15 The Caret, Dollar Sign -- Again
• Remember that the caret lets you match a pattern only if it is at the beginning of a string.
• The $ lets you match a pattern only if it is at the end of a string.
• Note: $ also matches just before a newline that ends the string; a \n anywhere else must be matched explicitly.
• The \b sequence lets you match a string at a word boundary.

16 Examples
    /^this/      # at the beginning of the string
    /this$/      # at the end of the string
    /this/       # anywhere in the string
    /\bthis\b/   # only as a whole word
    /^this$/     # only if the line contains exactly 'this'
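
The five anchored patterns can be checked against a handful of strings. In this sketch the function which_match and its labels (start, end, anywhere, word, whole) are hypothetical names introduced for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: list which of the five patterns match the string.
sub which_match {
    my ($s) = @_;
    my @labels;
    push @labels, "start"    if $s =~ /^this/;
    push @labels, "end"      if $s =~ /this$/;
    push @labels, "anywhere" if $s =~ /this/;
    push @labels, "word"     if $s =~ /\bthis\b/;
    push @labels, "whole"    if $s =~ /^this$/;
    return join(",", @labels);
}

for my $s ("this", "this is it", "do this", "thistle", "not this one") {
    printf "%-14s %s\n", $s, which_match($s);
}
```

"thistle" starts with this but is not matched by the \b version, because there is no word boundary between the s and the t that follows it.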

17 This code asks the user for an integer and then checks the input with a regular expression. If the input is not an integer, the program asks the user to re-enter it. At the end, the program prints the number of attempts it took to enter a valid integer.

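18
    #!/usr/bin/perl
    print "Enter a number: ";
    $count = 1;                           # number of attempts so far
    while (1) {
        $_ = <STDIN>;                     # read one line of input
        chomp;
        last if /^[-+]?[0-9]+$/;          # stop once the input is an integer
        print "$_ is not a number, re-enter: ";
        $count++;
    }
    print "$count tries to enter a number\n";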
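
The check inside the loop can also be tried without typing any input by wrapping the pattern in a function. This is a sketch only; the name is_integer and the sample inputs are assumptions, while the pattern is the one from the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true (1) when the whole string is an optionally
# signed integer, using the anchored pattern from the script.
sub is_integer {
    my ($input) = @_;
    return $input =~ /^[-+]?[0-9]+$/ ? 1 : 0;
}

for my $input ("-256", "+256", "345", "-256hello", "the2ndone", "12.5", "") {
    printf "'%s' -> %s\n", $input, is_integer($input) ? "integer" : "not an integer";
}
```

With both anchors in place, strings such as -256hello and the2ndone are rejected, because the digits no longer account for the entire string.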

19 Exercises
• Get the previous script working in your account and make sure you understand its contents.
• Complete exercises 2 and 3 on pages 113-114.

