Fall 2006 CSE 467/567: RE review (Perl syntax)


RE review (Perl syntax)
– single-character disjunction: [aeiou]
– ranges: [0-9]
– negation: [^aeiou]
– conjunction: /cat/
– matching zero or one: /cats?/
– Kleene * and +: /[ab]+/ matches ‘a’, ‘b’, ‘aa’, ‘ab’, ‘ba’, ‘bb’, etc.
– wildcard: /c.t/ matches “cat”, “cbt”, “cct”, …
– anchors: ^, $, \b, \B

/projects/CSE467/Resources/Code/Perl
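The review items can be tried out directly in Perl. The following is a minimal sketch, not course code: the test strings and the `@checks` and `@results` names are chosen here purely for illustration.

```perl
use strict;
use warnings;

# Each entry pairs a pattern from the review with an illustrative test string.
my @checks = (
    [ qr/[aeiou]/,  'sky'         ],   # no vowel in 'sky': no match
    [ qr/[0-9]/,    'room 7'      ],   # contains a digit: match
    [ qr/[^aeiou]/, 'aei'         ],   # only vowels: no match
    [ qr/cats?/,    'cat'         ],   # the 's' is optional: match
    [ qr/^[ab]+$/,  'abba'        ],   # one or more of a/b: match
    [ qr/c.t/,      'cut'         ],   # '.' stands for any character: match
    [ qr/\bcat\b/,  'concatenate' ],   # 'cat' is not a whole word: no match
);

my @results;
for my $check (@checks) {
    my ($re, $text) = @$check;
    my $matched = $text =~ $re ? 1 : 0;
    push @results, $matched;
    printf "%-16s %-14s %s\n", $re, "'$text'", $matched ? 'match' : 'no match';
}
```

Note that a pattern without anchors matches anywhere inside the string, which is why /[ab]+/ is wrapped in ^ and $ above to test the whole string.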

Conjunction
Two regular expressions are conjoined by juxtaposition (placing the expressions side by side).
Examples:
– /a/ matches ‘a’
– /m/ matches ‘m’
– /am/ matches ‘am’ but not ‘a’ or ‘m’ alone

Disjunction
We have already seen disjunction of characters using the square bracket notation.
General disjunction is expressed using the vertical bar (|), also called the pipe symbol. This form of disjunction lets us match any one of several alternative patterns, not just single characters as with the [ ] form. For example, /cat|dog/ matches either ‘cat’ or ‘dog’.
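The difference between the two forms of disjunction can be sketched as follows. The word list and variable names are assumptions made for this example only.

```perl
use strict;
use warnings;

# [ ] chooses among single characters; | chooses among whole patterns.
my @words = ('cat', 'dog', 'cot', 'bird');

my @bracket_hits = grep { /c[ao]t/ } @words;    # 'cat' or 'cot'
my @pipe_hits    = grep { /cat|dog/ } @words;   # 'cat' or 'dog'

print "bracket: @bracket_hits\n";
print "pipe:    @pipe_hits\n";
```

The bracket form cannot express the second choice, because ‘cat’ and ‘dog’ differ in more than one character position.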

Grouping
Parentheses, ‘(’ and ‘)’, are used to group subpatterns of a larger pattern.
Ex: /[Gg](ee|oo)se/ matches ‘geese’ and ‘goose’ (and ‘Geese’, ‘Goose’).
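Where the parentheses go matters, because | binds more loosely than juxtaposition. The sketch below compares two placements of the parentheses on a few sample words; the words and variable names are chosen here for illustration, and the anchors ^ and $ are added so that whole words are tested.

```perl
use strict;
use warnings;

# Alternation inside one group: G or g, then 'ee' or 'oo', then 'se'.
my $grouped = qr/^[Gg](ee|oo)se$/;

# Alternation outside the groups: "starts with Gee/gee" OR "ends with oose".
my $ungrouped = qr/^[Gg](ee)|(oo)se$/;

my @words = ('goose', 'Geese', 'moose', 'geek');

my @grouped_hits   = grep { $_ =~ $grouped }   @words;
my @ungrouped_hits = grep { $_ =~ $ungrouped } @words;

print "grouped:   @grouped_hits\n";
print "ungrouped: @ungrouped_hits\n";
```

Only the first pattern restricts the match to forms of ‘goose’ and ‘geese’; the second also accepts ‘moose’ and ‘geek’.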

Replacement
In addition to matching, we can do replacements when a match is found.
Example: to replace the British spelling ‘colour’ with the American spelling ‘color’, we can write:
s/colour/color/
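A short sketch of the substitution operator applied to a string. The sample sentence and variable names are assumptions for this example. It also shows the /g modifier, which the slide does not mention: without it, s/// replaces only the first match.

```perl
use strict;
use warnings;

my $text = "The colour of the sky; two colours of paint.";

# Copy $text, then substitute in the copy so the original is preserved.
(my $first_only = $text) =~ s/colour/color/;    # first match only
(my $all        = $text) =~ s/colour/color/g;   # /g: every match

print "$first_only\n";
print "$all\n";
```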

Registers – saving matches
To save the match from part of a pattern for reuse later on, Perl provides registers. Registers are named \#, where # is the number of the register; \1 holds the text matched by the first parenthesized group, \2 the second, and so on.
Ex.
DE DO DO DO DE DA DA DA
IS ALL I WANT TO SAY TO YOU
– /(D[AEO].)*/ will match the first line
– /(D[AEO])(.D[AEO])\2\2\s\1(.D[AEO])\3\3/ matches it more specifically
– This pattern also matches strings like DA DE DE DE DA DO DO DO
– \s matches a whitespace character
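The register pattern can be checked against a few lines, as in the sketch below. Two assumptions are made here: the pattern is written without spaces between its pieces, since each saved group such as ‘ DO’ already carries its own leading character, and ^ and $ are added to test whole lines. The third test line is made up to show a failure.

```perl
use strict;
use warnings;

# \1, \2, \3 must repeat exactly what groups 1, 2, 3 matched earlier.
my $pattern = qr/^(D[AEO])(.D[AEO])\2\2\s\1(.D[AEO])\3\3$/;

my @lines = (
    'DE DO DO DO DE DA DA DA',   # the example line
    'DA DE DE DE DA DO DO DO',   # same shape, different syllables
    'DE DO DA DO DE DA DA DA',   # second repeat differs: should fail
);

my @matched;
for my $line (@lines) {
    if ($line =~ $pattern) {
        push @matched, $line;
        # Outside the pattern, the saved groups are available as $1, $2, $3.
        print "match:    $line  (1='$1', 2='$2', 3='$3')\n";
    } else {
        print "no match: $line\n";
    }
}
```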

For more information
– PERL Regular Expression TUTorial
– PERL Regular Expression reference page

Eliza
– Published by Weizenbaum in 1966
– Modelled a Rogerian therapist
– Had no intelligence – worked by pattern matching and replacement
– Had some people convinced that it really understood!
– demo at
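The idea of responding by pattern matching and replacement can be sketched in a few lines of Perl. The two rules and the `respond` function below are hypothetical, written for illustration only; they are not taken from Weizenbaum's program.

```perl
use strict;
use warnings;

# Hypothetical Eliza-style rules: each rule is a substitution that rewrites
# the whole input into a reply, reusing part of what the user said.
sub respond {
    my ($input) = @_;
    my $reply = uc $input;            # work in upper case
    $reply =~ s/[.!?]+\s*$//;         # drop trailing punctuation

    return $reply
        if $reply =~ s/.*\bI AM (SAD|DEPRESSED)\b.*/I AM SORRY TO HEAR YOU ARE $1/;
    return $reply
        if $reply =~ s/.*\bMY (\w+).*/TELL ME MORE ABOUT YOUR $1/;
    return "PLEASE GO ON";            # fallback when no rule applies
}

print respond("I am sad today."), "\n";
print respond("My mother hates me"), "\n";
print respond("Hello"), "\n";
```

The rules are tried in order and the first one that matches wins, so more specific patterns should come before more general ones.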

Wordcount program
The Unix wordcount program (wc) counts lines, words and characters.
Determining counts & probabilities of words has many applications:
– augmentative communication
– context-sensitive spelling error correction
– speech recognition
– handwriting recognition
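The three counts that wc reports can be approximated with regular expressions. The sketch below is not the wc implementation: the `wc_counts` function is a name invented here, it treats a word as a maximal run of non-whitespace, and it counts characters rather than bytes.

```perl
use strict;
use warnings;

# Count lines, words and characters in a string, in the spirit of wc.
sub wc_counts {
    my ($text) = @_;
    my $lines = () = $text =~ /\n/g;     # number of newline characters
    my $words = () = $text =~ /\S+/g;    # maximal runs of non-whitespace
    my $chars = length $text;            # characters, including newlines
    return ($lines, $words, $chars);
}

my ($lines, $words, $chars) = wc_counts("the cat sat\non the mat\n");
print "$lines $words $chars\n";
```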

Counting words in a corpus (preview)

    #!/usr/bin/perl
    # From the Perl book, page 39.
    use strict;
    use warnings;

    $/ = "";    # Enable paragraph mode.
    # The original also set $* = 1 to enable multi-line patterns. Modern Perl
    # no longer supports $*, and no pattern below uses ^ or $, so it is omitted.

    # Now read each paragraph and split it into words. Record each
    # instance of a word in the %wordcount hash (associative array).
    my %wordcount;
    my $total = 0;
    while (<>) {
        s/-\n//g;        # Dehyphenate hyphenations (across lines).
        s/<[^>]*>//g;    # Remove markup tags (assumed; the pattern was lost from the source).
        tr/A-Z/a-z/;     # Canonicalize to lower case.
        my @words = split(/\W*\s+\W*/, $_);
        foreach my $word (@words) {
            next if $word eq '';    # Skip an empty field produced by split.
            $wordcount{$word}++;    # Increment the entry.
            $total++;
        }
    }

    # Now print out all the entries in the %wordcount hash.
    foreach my $word (sort keys(%wordcount)) {
        printf "(%8.6f%%) %20s occurs %3d time(s)\n",
            (100 * $wordcount{$word} / $total), $word, $wordcount{$word};
    }
    printf "Total number of words is %d (%d distinct).\n",
        $total, scalar(keys %wordcount);