Regular Expressions.


Regular Expressions

Tokenizing strings
When you read a sentence, your mind breaks it into tokens: individual words and punctuation marks that convey meaning. String method split breaks a String into its component tokens and returns an array of Strings. Tokens are separated by delimiters, typically white-space characters such as space, tab, newline and carriage return. Other characters can also be used as delimiters to separate tokens.
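A minimal sketch of tokenizing with split, assuming whitespace delimiters in the first method and a comma delimiter in the second. The class and method names are illustrative and do not come from the slides.

```java
import java.util.Arrays;

class TokenizeDemo {
    // Splits a sentence into tokens, using runs of whitespace as the delimiter.
    static String[] tokenize(String sentence) {
        // trim() avoids an empty leading token when the input starts with whitespace
        return sentence.trim().split("\\s+");
    }

    // Splits on a different delimiter: a comma with optional surrounding whitespace.
    static String[] splitOnCommas(String list) {
        return list.trim().split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokenize("one two\tthree\nfour")));
        System.out.println(Arrays.toString(splitOnCommas("red, green ,blue")));
    }
}
```

The argument to split is itself a regular expression, which is why a single call can treat spaces, tabs and newlines as equivalent delimiters.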

Regular expressions
A regular expression is a specially formatted String that describes a search pattern. Regular expressions are useful for validating input. One application is constructing a compiler, where large and complex regular expressions are used to check program syntax: if the program code does not match the regular expression, the compiler knows that there is a syntax error.

Regular Expressions (cont'd)
String method matches receives a String specifying the regular expression, matches the contents of the String object on which it is called against that regular expression, and returns a boolean indicating whether the match succeeded. The match succeeds only if the entire String matches the pattern. A regular expression consists of literal characters and special symbols.
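A short sketch of String.matches, showing that the whole string has to match the pattern. The class name, method name and sample inputs are illustrative.

```java
class MatchesDemo {
    // Returns true only if the WHOLE string is a sequence of one or more digits.
    static boolean isAllDigits(String s) {
        return s.matches("[0-9]+");
    }

    public static void main(String[] args) {
        System.out.println(isAllDigits("12345")); // true
        System.out.println(isAllDigits("123a"));  // false: the trailing 'a' is not a digit
        System.out.println(isAllDigits(""));      // false: '+' requires at least one digit
    }
}
```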

Character classes
A character class is an escape sequence that represents a group of characters. Each character class matches a single character in the search object.
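As an illustration, the sketch below uses three of Java's predefined character classes: \d (a digit), \w (a word character: letter, digit or underscore) and \s (a whitespace character). The class and method names are made up for this example.

```java
class CharacterClassDemo {
    // Each pattern is a single character class, so it matches exactly one character.
    static boolean isDigit(String s) {
        return s.matches("\\d");
    }

    static boolean isWordChar(String s) {
        return s.matches("\\w");
    }

    static boolean isWhitespace(String s) {
        return s.matches("\\s");
    }

    public static void main(String[] args) {
        System.out.println(isDigit("7"));       // true
        System.out.println(isDigit("77"));      // false: two characters, one class
        System.out.println(isWordChar("_"));    // true
        System.out.println(isWhitespace("\t")); // true
    }
}
```

In Java source code the backslash must itself be escaped, which is why \d is written "\\d" inside a string literal.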

Common Matching Symbols

Regular Expression    Description
.                     Matches any single character
^regex                regex must match at the beginning of the line
regex$                regex must match at the end of the line
[abc]                 Set definition: matches the letter a, b or c
[abc][vz]             Set definition: matches a, b or c followed by either v or z
[^abc]                A "^" as the first character inside [] negates the set: matches any character except a, b or c
[a-d1-7]              Ranges: matches one letter between a and d or one digit from 1 to 7; does not match the two-character string d1
X|Z                   Finds X or Z
XZ                    Finds X directly followed by Z
$                     Checks whether a line end follows
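A few of these symbols can be tried out with the sketch below. It assumes a small helper, found, that reports whether a pattern occurs anywhere in the text; the helper and class name are illustrative.

```java
import java.util.regex.Pattern;

class MatchingSymbolsDemo {
    // True if the pattern is found anywhere in the text
    // (unlike String.matches, which requires the whole text to match).
    static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(found("^Hello", "Hello world")); // true: at the beginning
        System.out.println(found("^world", "Hello world")); // false: not at the beginning
        System.out.println(found("world$", "Hello world")); // true: at the end
        System.out.println("bz".matches("[abc][vz]"));      // true
        System.out.println("d1".matches("[a-d1-7]"));       // false: the set matches one character
        System.out.println("x".matches("[^abc]"));          // true
    }
}
```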

Ranges
Ranges of characters are determined by the characters' integer values. Example: "[A-Za-z]" matches all uppercase and lowercase letters. The range "[A-z]" matches all letters and also matches those characters (such as [ and \) whose integer value lies between uppercase A and lowercase z.
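The difference between the two ranges can be checked with a sketch like the following; the class and method names are illustrative.

```java
class RangeDemo {
    // Letters only: two separate ranges, A-Z and a-z.
    static boolean inLetterRange(String ch) {
        return ch.matches("[A-Za-z]");
    }

    // One wide range from 'A' (65) to 'z' (122), which also covers [ \ ] ^ _ and `.
    static boolean inWideRange(String ch) {
        return ch.matches("[A-z]");
    }

    public static void main(String[] args) {
        System.out.println(inLetterRange("q")); // true
        System.out.println(inLetterRange("[")); // false
        System.out.println(inWideRange("["));   // true: '[' is 91, between 'Z' (90) and 'a' (97)
        System.out.println(inWideRange("\\"));  // true: a single backslash is 92
    }
}
```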

Grouping
Parts of a regex can be grouped using "()". In a replacement string, "$" followed by a group number refers to the text matched by that group. Example: removing whitespace between a word character and "." or ",":
    String pattern = "(\\w)(\\s+)([\\.,])";
    System.out.println(str.replaceAll(pattern, "$1$3"));
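A runnable version of this grouping example might look like the sketch below. The class name, method name and sample sentence are illustrative; the pattern and replacement string are the ones from the slide.

```java
class GroupingDemo {
    // Group 1 is the word character, group 2 the whitespace, group 3 the punctuation.
    // The replacement keeps groups 1 and 3 and drops group 2.
    static String removeSpaceBeforePunctuation(String str) {
        String pattern = "(\\w)(\\s+)([\\.,])";
        return str.replaceAll(pattern, "$1$3");
    }

    public static void main(String[] args) {
        System.out.println(removeSpaceBeforePunctuation("Hello , world ."));
        // prints: Hello, world.
    }
}
```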

Negative look-ahead
A negative look-ahead, written (?!pattern), excludes matches that are followed by pattern. Example: a(?!b) matches a only if it is not followed by b.
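To see which occurrences a(?!b) selects, the sketch below replaces every matched a with X. The class and method names are illustrative.

```java
class NegativeLookaheadDemo {
    // Replaces each 'a' that is NOT immediately followed by 'b'.
    // The look-ahead only inspects the next character; it does not consume it.
    static String markUnfollowedA(String text) {
        return text.replaceAll("a(?!b)", "X");
    }

    public static void main(String[] args) {
        System.out.println(markUnfollowedA("ab ac ad"));
        // prints: ab Xc Xd
    }
}
```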

Quantifiers
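The quantifiers that appear in the patterns on these slides are * (zero or more), + (one or more), ? (zero or one), {n} (exactly n) and {n,m} (between n and m). A minimal sketch of how each behaves with String.matches follows; the class name and sample strings are illustrative.

```java
class QuantifierDemo {
    // Small helper so the pattern reads first in each call.
    static boolean check(String regex, String input) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        System.out.println(check("a*", ""));            // true: zero occurrences allowed
        System.out.println(check("a+", ""));            // false: at least one required
        System.out.println(check("colou?r", "color"));  // true: the 'u' is optional
        System.out.println(check("a{3}", "aaa"));       // true: exactly three
        System.out.println(check("a{2,3}", "aaaa"));    // false: four is more than three
    }
}
```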

Matches Method: Examples
Validating a first name:
    firstName.matches("[A-Z][a-zA-Z]*");
A pattern that accepts one word, or two words separated by a single whitespace character:
    "([a-zA-Z]+|[a-zA-Z]+\\s[a-zA-Z]+)"
The character "|" matches the expression to its left or to its right. "Hi (John|Jane)" matches both "Hi John" and "Hi Jane".
Validating a Zip code:
    "\\d{5}"
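These patterns could be wrapped in small validation methods, as in the sketch below. The class and method names are illustrative and not from the slides.

```java
class NameAndZipValidator {
    // One uppercase letter followed by any number of letters.
    static boolean isValidFirstName(String firstName) {
        return firstName.matches("[A-Z][a-zA-Z]*");
    }

    // One word, or two words separated by a single whitespace character.
    static boolean isOneOrTwoWords(String name) {
        return name.matches("([a-zA-Z]+|[a-zA-Z]+\\s[a-zA-Z]+)");
    }

    // Exactly five digits.
    static boolean isValidZip(String zip) {
        return zip.matches("\\d{5}");
    }

    public static void main(String[] args) {
        System.out.println(isValidFirstName("Jane"));             // true
        System.out.println(isValidFirstName("jane"));             // false: no leading capital
        System.out.println(isOneOrTwoWords("Van Dyke"));          // true
        System.out.println(isValidZip("12345"));                  // true
        System.out.println(isValidZip("1234"));                   // false: only four digits
        System.out.println("Hi Jane".matches("Hi (John|Jane)"));  // true
    }
}
```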

Split Method: examples
    public class RegexTestStrings {
        public static final String EXAMPLE_TEST = "This is my small example "
                + "string which I'm going to " + "use for pattern matching.";

        public static void main(String[] args) {
            // The whole string must match: one word character, then anything
            System.out.println(EXAMPLE_TEST.matches("\\w.*"));
            // Split on runs of whitespace
            String[] splitString = EXAMPLE_TEST.split("\\s+");
            System.out.println(splitString.length); // Should be 14
            for (String string : splitString) {
                System.out.println(string);
            }
            // Replace all whitespace with tabs
            System.out.println(EXAMPLE_TEST.replaceAll("\\s+", "\t"));
        }
    }

RegEx examples
    // Returns true if the string matches exactly "true"
    public boolean isTrue(String s) {
        return s.matches("true");
    }

    // Returns true if the string matches exactly "true" or "True"
    public boolean isTrueVersion2(String s) {
        return s.matches("[tT]rue");
    }

    // Returns true if the string matches exactly "true" or "True"
    // or "yes" or "Yes"
    public boolean isTrueOrYes(String s) {
        return s.matches("[tT]rue|[yY]es");
    }

    // Returns true if the string contains the text "true" anywhere
    public boolean containsTrue(String s) {
        return s.matches(".*true.*");
    }

RegEx examples (cont'd)
    // Returns true if the string consists of exactly three letters
    public boolean isThreeLetters(String s) {
        return s.matches("[a-zA-Z]{3}");
    }

    // Returns true if the string starts with a character that is not a digit
    // (an empty string returns false)
    public boolean isNoNumberAtBeginning(String s) {
        return s.matches("^[^\\d].*");
    }

    // Returns true if the string consists of any number of word characters,
    // none of which is b
    public boolean isIntersection(String s) {
        return s.matches("([\\w&&[^b]])*");
    }
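The last two patterns have edge cases that are easy to miss, so a quick check can help. The sketch below uses static copies of the two methods under illustrative names; the patterns are equivalent to the ones on the slide.

```java
class IntersectionDemo {
    // First character must exist and must not be a digit.
    static boolean hasNoLeadingDigit(String s) {
        return s.matches("[^\\d].*");
    }

    // [\w&&[^b]] is the intersection of "word character" and "anything except b".
    static boolean isWordCharsExceptB(String s) {
        return s.matches("[\\w&&[^b]]*");
    }

    public static void main(String[] args) {
        System.out.println(hasNoLeadingDigit("a1"));   // true
        System.out.println(hasNoLeadingDigit("1a"));   // false
        System.out.println(hasNoLeadingDigit(""));     // false: there is no first character
        System.out.println(isWordCharsExceptB("acd")); // true
        System.out.println(isWordCharsExceptB("abc")); // false: contains b
        System.out.println(isWordCharsExceptB(""));    // true: '*' allows zero characters
    }
}
```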

Pattern and Matcher classes
Java provides the package java.util.regex, which helps developers manipulate regular expressions. Class Pattern represents a regular expression. Class Matcher contains both a search pattern and the CharSequence object in which to search. If a regular expression will be used only once, use static method matches of class Pattern, which accepts a regular expression and a search object and returns a boolean value indicating whether the entire search object matches.
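For the one-off case, a call to the static Pattern.matches method might look like the sketch below. The class name, method name and the five-digit pattern are chosen for illustration.

```java
import java.util.regex.Pattern;

class OneOffMatchDemo {
    // Compiles the regex, matches it against the whole input, and discards the Pattern.
    static boolean isFiveDigits(CharSequence input) {
        return Pattern.matches("\\d{5}", input);
    }

    public static void main(String[] args) {
        System.out.println(isFiveDigits("12345"));  // true
        System.out.println(isFiveDigits("12345x")); // false
    }
}
```

Because the pattern is recompiled on every call, this form suits occasional checks rather than loops over many inputs.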

Pattern and Matcher classes (cont'd)
If a regular expression is used more than once, use static method compile of class Pattern to create a specific Pattern object based on the regular expression. Use the resulting Pattern object to call method matcher, which receives a CharSequence to search and returns a Matcher. Finally, use the following methods of the obtained Matcher: find, group, lookingAt, replaceFirst and replaceAll.
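The compile-once, match-many workflow described here can be sketched as follows. The class name, method names and the "***" mask are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReusablePatternDemo {
    // Compiled once and reused for every call.
    private static final Pattern WORD = Pattern.compile("\\w+");

    // Collects every match using find() and group().
    static List<String> words(CharSequence text) {
        List<String> result = new ArrayList<>();
        Matcher matcher = WORD.matcher(text);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    // Replaces only the first match.
    static String maskFirstWord(CharSequence text) {
        return WORD.matcher(text).replaceFirst("***");
    }

    // Replaces every match.
    static String maskAllWords(CharSequence text) {
        return WORD.matcher(text).replaceAll("***");
    }

    public static void main(String[] args) {
        System.out.println(words("one, two"));         // [one, two]
        System.out.println(maskFirstWord("one, two")); // ***, two
        System.out.println(maskAllWords("one, two"));  // ***, ***
    }
}
```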

Methods of Matcher
The dot character "." in a regular expression matches any single character except a newline character. Matcher method find attempts to match a piece of the search object to the search pattern. Each call to this method starts at the point where the last call ended, so multiple matches can be found. Matcher method lookingAt differs in that it always starts at the beginning of the search object, and it succeeds only if the pattern matches there. Unlike matches, it does not require the entire search object to match.
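The difference between find, lookingAt and matches can be seen in a sketch like this one; the class name, method names and the digit pattern are illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindVsLookingAtDemo {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // find(): each call resumes where the previous match ended.
    static int countMatches(String text) {
        Matcher matcher = DIGITS.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    // lookingAt(): the pattern must match at the start, but may leave a remainder.
    static boolean startsWithDigits(String text) {
        return DIGITS.matcher(text).lookingAt();
    }

    // matches(): the pattern must cover the entire input.
    static boolean isOnlyDigits(String text) {
        return DIGITS.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(countMatches("a1b22c333")); // 3
        System.out.println(startsWithDigits("12ab"));  // true
        System.out.println(startsWithDigits("ab12"));  // false
        System.out.println(isOnlyDigits("12ab"));      // false
        System.out.println("\n".matches("."));         // false: dot skips newline by default
    }
}
```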

Pattern and Matcher example
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class RegexTestPatternMatcher {
        public static final String EXAMPLE_TEST =
                "This is my small example string which I'm going to use for pattern matching.";

        public static void main(String[] args) {
            // Find every run of word characters and report where it starts and ends
            Pattern pattern = Pattern.compile("\\w+");
            Matcher matcher = pattern.matcher(EXAMPLE_TEST);
            while (matcher.find()) {
                System.out.print("Start index: " + matcher.start());
                System.out.print(" End index: " + matcher.end());
                System.out.println(" " + matcher.group());
            }
            // Replace every run of whitespace with a tab
            Pattern replace = Pattern.compile("\\s+");
            Matcher matcher2 = replace.matcher(EXAMPLE_TEST);
            System.out.println(matcher2.replaceAll("\t"));
        }
    }

Appendix: More examples of Regular Expressions in Java

Validating a username
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class UsernameValidator {
        private Pattern pattern;
        private Matcher matcher;

        // 3 to 15 characters: lowercase letters, digits, underscore or hyphen
        private static final String USERNAME_PATTERN = "^[a-z0-9_-]{3,15}$";

        public UsernameValidator() {
            pattern = Pattern.compile(USERNAME_PATTERN);
        }

        /**
         * Validate username with regular expression
         * @param username username for validation
         * @return true valid username, false invalid username
         */
        public boolean validate(final String username) {
            matcher = pattern.matcher(username);
            return matcher.matches();
        }
    }
Examples of usernames that don't match: mk (too short, min 3 chars); w@lau ("@" not allowed)

Validating image file extension
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class ImageValidator {
        private Pattern pattern;
        private Matcher matcher;

        // A name without whitespace, then ".jpg", ".png", ".gif" or ".bmp" (case-insensitive)
        private static final String IMAGE_PATTERN = "([^\\s]+(\\.(?i)(jpg|png|gif|bmp))$)";

        public ImageValidator() {
            pattern = Pattern.compile(IMAGE_PATTERN);
        }

        /**
         * Validate image with regular expression
         * @param image image for validation
         * @return true valid image, false invalid image
         */
        public boolean validate(final String image) {
            matcher = pattern.matcher(image);
            return matcher.matches();
        }
    }

Time in 12 Hours Format validator
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Time12HoursValidator {
        private Pattern pattern;
        private Matcher matcher;

        // Hour 1-12, ":", minutes 00-59, optional whitespace, then am or pm (case-insensitive)
        private static final String TIME12HOURS_PATTERN =
                "(1[012]|[1-9]):[0-5][0-9](\\s)?(?i)(am|pm)";

        public Time12HoursValidator() {
            pattern = Pattern.compile(TIME12HOURS_PATTERN);
        }

        /**
         * Validate time in 12 hours format with regular expression
         * @param time time for validation
         * @return true valid time format, false invalid time format
         */
        public boolean validate(final String time) {
            matcher = pattern.matcher(time);
            return matcher.matches();
        }
    }

Validating date
Date format validation:
    (0?[1-9]|[12][0-9]|3[01])/(0?[1-9]|1[012])/((19|20)\\d\\d)

    (                 # start of group #1
      0?[1-9]         #   01-09 or 1-9
      |               #   ..or
      [12][0-9]       #   10-19 or 20-29
      |               #   ..or
      3[01]           #   30, 31
    )                 # end of group #1
    /                 # followed by a "/"
    (                 # start of group #2
      0?[1-9]         #   01-09 or 1-9
      |               #   ..or
      1[012]          #   10, 11, 12
    )                 # end of group #2
    /                 # followed by a "/"
    (                 # start of group #3
      (19|20)\\d\\d   #   19[0-9][0-9] or 20[0-9][0-9]
    )                 # end of group #3
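Following the shape of the other validators in this appendix, the date pattern could be used as in the sketch below. The class name DateFormatValidator and the sample dates are assumptions made for illustration; only the pattern itself comes from the slide.

```java
import java.util.regex.Pattern;

class DateFormatValidator {
    // day/month/year with a four-digit year starting with 19 or 20
    private static final Pattern DATE_PATTERN = Pattern.compile(
            "(0?[1-9]|[12][0-9]|3[01])/(0?[1-9]|1[012])/((19|20)\\d\\d)");

    // Checks the FORMAT only; it does not know how many days each month has.
    static boolean validate(String date) {
        return DATE_PATTERN.matcher(date).matches();
    }

    public static void main(String[] args) {
        System.out.println(validate("31/12/2019")); // true
        System.out.println(validate("1/1/1999"));   // true: leading zeros are optional
        System.out.println(validate("32/01/2020")); // false: day out of range
        System.out.println(validate("15/13/2020")); // false: month out of range
        System.out.println(validate("31/02/2020")); // true: format is valid even though the date is not
    }
}
```

Because the regular expression accepts impossible dates such as 31/02/2020, a real validator would need an additional calendar check after the format check.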