LING 388: Language and Computers Sandiway Fong Lecture 3: 8/28.

Today’s Lecture
regexp: recap
hands-on introduction to Perl
–follow along with your laptop
–do the background reading
practice writing Perl
–Homework 1 will be out on Thursday

Background Reading
Perl Quick Intro
Perl Regular Expressions (RE)
–perlrequick - Perl regular expressions quick start
–perlretut - Perl regular expressions tutorial

regexp: Recap
Repetition abbreviations:
–a exactly one a
–a? a optional
–a* zero or more a’s
–a+ one or more a’s
–a{n,m} between n and m a’s
–a{n,} at least n a’s
–a{n} exactly n a’s
Metacharacters:
–{}[]()^$.|*+?\
–may be escaped by prefixing the metacharacter with a backslash (\)
Concatenation
–two regexps may be concatenated to form a new regexp
Disjunction
–infix operator: | (vertical bar)
–[set of characters] match one of the characters
–[^set of characters] match one character that is not in the set
–[char1-char2] dash (-) shorthand for a range of characters (ASCII)
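
The recap above can be tried out directly in Perl. The following is a minimal sketch, not part of the original slides: the helper name matches and the test strings are made up for illustration, and the anchors ^ and $ are added so that each pattern has to cover the whole string.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the slides): 1 if $text matches $re, else 0.
sub matches {
    my ($text, $re) = @_;
    return ($text =~ $re) ? 1 : 0;
}

# Repetition
print matches("color",  qr/^colou?r$/),  "\n";  # u? : the u is optional
print matches("colour", qr/^colou?r$/),  "\n";
print matches("b",      qr/^a*b$/),      "\n";  # a* : zero a's is allowed
print matches("b",      qr/^a+b$/),      "\n";  # a+ : needs at least one a
print matches("aaab",   qr/^a{2,3}b$/),  "\n";  # a{n,m} : between 2 and 3 a's
print matches("aaaab",  qr/^a{2,3}b$/),  "\n";  # four a's is too many

# Escaping a metacharacter with backslash
print matches("3x14",   qr/^3.14$/),     "\n";  # unescaped . matches any character
print matches("3x14",   qr/^3\.14$/),    "\n";  # \. matches only a literal period

# Disjunction and character sets
print matches("dog",    qr/^(cat|dog)$/), "\n"; # | : either alternative
print matches("e",      qr/^[aeiou]$/),   "\n"; # one character from the set
print matches("e",      qr/^[^aeiou]$/),  "\n"; # one character NOT in the set
print matches("d",      qr/^[a-f]$/),     "\n"; # range shorthand
```

Each line prints 1 for a match and 0 otherwise, so the pairs (for example a* versus a+ on the string b) can be compared side by side.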

regexp: Recap
Range abbreviations:
–period (.) stands for any character (except newline)
–\d (digit) = [0-9]
–\s (whitespace character) = space (SP), tab (HT), carriage return (CR), newline (LF) or form feed (FF)
–\w (word character) = [0-9a-zA-Z_]
–uppercase versions, e.g. \D and \W, denote negation
Line-oriented metacharacters:
–caret (^) at the beginning of a regexp string matches the “beginning of a line”
–dollar sign ($) at the end of a regexp string matches the “end of the line”
Word-oriented metacharacters:
–a word is any sequence of digits [0-9], underscores (_) and letters [a-zA-Z]
–\b matches a word boundary
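
A short sketch of these abbreviations and anchors in use. It is an illustration only: the helper find_all and the sample sentence are assumptions, not material from the lecture. The helper relies on Perl returning all matches when a /g match is evaluated in list context.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the slides): every non-overlapping match of $re in $text.
sub find_all {
    my ($text, $re) = @_;
    my @hits = $text =~ /$re/g;   # list-context /g match returns all matched substrings
    return @hits;
}

my $line = "The cat sat on the mat at 10 am.";

print join("|", find_all($line, qr/\d+/)), "\n";   # runs of digits
print join("|", find_all($line, qr/\w+/)), "\n";   # runs of word characters
print join("|", find_all($line, qr/\S+/)), "\n";   # \S: anything that is not whitespace

# \b restricts "at" to a whole word; without it, cat/sat/mat also count.
my @at_anywhere = find_all($line, qr/at/);
my @at_word     = find_all($line, qr/\bat\b/);
print scalar(@at_anywhere), " vs ", scalar(@at_word), "\n";

# Line-oriented anchors
print "starts with The\n" if $line =~ /^The/;
print "ends with am.\n"   if $line =~ /am\.$/;
```

Note that \w+ drops the final period from "am." while \S+ keeps it, because the period is neither a word character nor whitespace.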

Perl
we’re going to use the regexp facility built into Perl

Perl
Run from the command line in Windows
–Start > Run...
–cmd (brings up command line interpreter)
Running a Perl program:
–perl -help (gives options)
–perl filename.pl (runs Perl command file filename.pl)
–perl filename.pl inputfile.txt (runs Perl command file filename.pl; inputfile.txt is supplied to filename.pl)
e.g. filename.pl reads and processes input file inputfile.txt

Perl
Example Perl program (match.pl) to read in a text file and print lines matching a regexp enclosed by /.../
    open (F,$ARGV[0]) or die "$ARGV[0] not found!\n";
    while (<F>) {
      print $_ if (/The/);
    }
Example input file (text.txt)
    This is a test.
    The cat sat on the mat.
    These shoes are made for walking.
    Otherwise, I thought it was cold.
    45
Command
    perl match.pl text.txt

Perl
Program:
    open (F,$ARGV[0]) or die "$ARGV[0] not found!\n";
    while (<F>) {
      print $_ if (/The/);
    }
Program explained:
–open the file referenced in $ARGV[0] for input
–$ARGV[0] is the first command line argument following the program name
–F is the filehandle associated with the opened file
–if there is a problem opening the file, e.g. file doesn’t exist, program execution dies and prints the value of the string enclosed in double quotes "$ARGV[0] not found!\n"
–while (<F>) first evaluates <F>
–<F> reads in a line from the file referenced by F and places the line in the program variable $_
–then it executes the program code between the curly braces
–then it goes back and reads another line
–it does this repeatedly while <F> produces a valid line
–if we reach the end of the file, the while loop stops
–print $_ if (/.../); is conditional code that means print the contents of variable $_ if the regexp between the /.../ can be found in $_
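
The same read-and-match loop can be packaged as a subroutine, which makes it easy to experiment with different regexps. This is a sketch under stated assumptions rather than the lecture's program: the name matching_lines is made up, it uses a lexical filehandle and three-argument open instead of the bareword F, and it names the line variable explicitly instead of relying on $_. A temporary file (via the core File::Temp module) stands in for text.txt so the sketch runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not from the slides): lines of the file at $path that match $re.
sub matching_lines {
    my ($path, $re) = @_;
    open(my $fh, '<', $path) or die "$path not found!\n";
    my @hits;
    while (my $line = <$fh>) {          # one line per iteration; stops at end of file
        push @hits, $line if $line =~ $re;
    }
    close($fh);
    return @hits;
}

# Throwaway copy of the example input file.
my ($tmp, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp} "This is a test.\n",
             "The cat sat on the mat.\n",
             "These shoes are made for walking.\n",
             "Otherwise, I thought it was cold.\n",
             "45\n";
close($tmp);

print matching_lines($tmp_name, qr/The/);      # "The" also occurs inside "These"
print "--\n";
print matching_lines($tmp_name, qr/\bThe\b/);  # whole word only
```

Matching is case-sensitive, so "This", "Otherwise" and "thought" do not count as containing The.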

More Perl
Reference:
–/perlintro.html

More Perl
Variables:
–always prefixed by $
–e.g. $count, $i
Assignment and arithmetic expressions:
–e.g.
–$count = 0;
–$count = $count + 1;
–$count++; (auto-increment)
Arithmetic operators:
+ addition
- subtraction
* multiplication
** exponentiation
/ division
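
A small runnable sketch of the assignments and operators listed above. The subroutine names count_up and arithmetic_demo are made up for illustration. One point worth noticing is that / in Perl is ordinary (non-integer) division.

```perl
use strict;
use warnings;

# Hypothetical helper: the three assignment statements from the slide, in order.
sub count_up {
    my $count = 0;
    $count = $count + 1;   # 1
    $count++;              # 2 (auto-increment)
    return $count;
}

# Hypothetical helper: apply each arithmetic operator to $x and $y.
sub arithmetic_demo {
    my ($x, $y) = @_;
    return ($x + $y, $x - $y, $x * $y, $x ** $y, $x / $y);
}

print count_up(), "\n";                          # prints 2
print join(" ", arithmetic_demo(7, 2)), "\n";    # prints 9 5 14 49 3.5
```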

More Perl
Variables and strings:
–$i = "this";
–$i = $i . " moment";
–. is the string concatenation operator
Printing:
–print $count;
–print "Count: ", $count, "\n";
means print the string "Count: " followed by the value of the variable $count followed by a newline
–, is a separator
–\n is the newline character
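
The string operations above, as a sketch that can be run as is. The subroutine names build_phrase and count_line are assumptions for illustration; the statements inside them follow the slide.

```perl
use strict;
use warnings;

# Hypothetical helper: build a string with the . concatenation operator.
sub build_phrase {
    my $i = "this";
    $i = $i . " moment";
    return $i;
}

# Hypothetical helper: the same text that print "Count: ", $count, "\n" would emit,
# assembled with . instead of the , separator.
sub count_line {
    my ($count) = @_;
    return "Count: " . $count . "\n";
}

print build_phrase(), "\n";        # prints: this moment

my $count = 3;
print "Count: ", $count, "\n";     # prints: Count: 3
print count_line($count);          # same output, built by concatenation
```

With print, the comma separates the items to be printed one after another, while the period joins strings into a single new string first; the visible output is the same here.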

More Perl
Conditionals:
–if ($count < 1000) {... }
–infix version
–... if ($count != 0); (a single statement, not a {... } block, goes before the if)
–if-then-else version
–if ($count == 1000) {... } else {...}
Numeric comparisons:
== equality
!= inequality
< less than
> greater than
<= less than or equal
>= greater than or equal
String comparisons:
eq equality
ne inequality
lt less than
gt greater than
le less than or equal
ge greater than or equal
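
The difference between the numeric and the string comparison operators is easiest to see on values such as 10 and 9. The sketch below is illustrative only; describe_count, numeric_less and string_less are made-up names, and the labels "small" and "large" are arbitrary.

```perl
use strict;
use warnings;

# Hypothetical helper: if-then-else plus the infix (statement modifier) form.
sub describe_count {
    my ($count) = @_;
    my $label;
    if ($count < 1000) {
        $label = "small";
    } else {
        $label = "large";
    }
    $label = $label . " (nonzero)" if ($count != 0);   # infix version
    return $label;
}

# Hypothetical helpers: the same question asked numerically and as strings.
sub numeric_less {
    my ($x, $y) = @_;
    if ($x < $y) { return "yes"; } else { return "no"; }
}

sub string_less {
    my ($x, $y) = @_;
    if ($x lt $y) { return "yes"; } else { return "no"; }
}

print describe_count(0), "\n";       # prints: small
print describe_count(5), "\n";       # prints: small (nonzero)
print describe_count(1000), "\n";    # prints: large (nonzero)

print numeric_less(10, 9), "\n";     # prints: no  (10 is not less than 9)
print string_less(10, 9), "\n";      # prints: yes ("10" sorts before "9" character by character)
```

Using lt where < was intended (or the reverse) does not raise an error, so it is worth choosing the operator by the kind of data being compared.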