Programming Languages Meeting 13 December 2/3, 2014.


Planning
Matrix System: Lisp code and semantic specifications due Thursday, December 4 by 11:59 pm.
Next week: Finish scripting languages; review for final
Final Exam: Tuesday, December 16, 2014, 2:30 – 5:00 pm, MSC 290

Matrix System
Remember that recursion is your friend.
Questions?

Continuing with AWK
AWK statement format:
    pattern {action}
    pattern {action}
Defaults
– No action: print the current line, namely {print $0}
– No pattern: every line matches, that is, the action is performed for every line in the input file(s)
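The two defaults can be checked with a small run. This is only a sketch: the three-line sample is made up for illustration and is not one of the course data files.

```shell
# Hypothetical sample input (not the course data).
sample='alpha 1
beta 2
gamma 3'

# Pattern with no action: each matching line is printed, as if the action were {print $0}.
no_action=$(printf '%s\n' "$sample" | awk '/beta/')

# Action with no pattern: the action runs for every input line.
no_pattern=$(printf '%s\n' "$sample" | awk '{print $2}')

printf '%s\n' "$no_action" "$no_pattern"
```

The first command prints only the line containing beta; the second prints the second field of all three lines.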

Your Turn (1)
Write an AWK program to determine how many flights were cancelled.
    /Cancelled/ {count++}
    END {print count, "flights were cancelled"}
Running the program:
    awk -f cancelled.awk flightaware.txt

Your Turn (2)
Write an AWK program to find the latest arriving flight from those in the list.
    {if ($11 > latest) latest = $11}
    END {print latest}
Running the program:
    awk -f latest.awk flightaware.txt

Your Turn (3)
Write an AWK program to find the earliest arriving flight from those in the list.
Try modifying your “latest” program by switching the inequality. What else do you have to change to make the program work?
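One possible answer, as a sketch rather than the intended solution: besides switching the inequality, the running minimum has to be seeded from the first record, because an unset AWK variable is the empty string and nothing compares below it. The function name earliest, the column number passed to it, and the sample data are all assumptions made here; the sketch also assumes zero-padded 24-hour times so that string comparison agrees with clock order.

```shell
# earliest FIELD: print the smallest value found in column FIELD of standard input.
# Assumption: times are zero-padded 24-hour strings such as 07:05.
earliest() {
    awk -v f="$1" '
        # Seed the minimum from the first record; after that, keep any smaller value.
        NR == 1 || $f < min { min = $f }
        END { print min }
    '
}

# Hypothetical three-flight sample with the arrival time in column 2.
printf 'UA1 14:20\nUA1 07:05\nUA1 22:47\n' | earliest 2
```

On the real data, cancelled flights may have no arrival time at all and would need to be filtered out first.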

Your Turn (4)
Write an AWK program to list the different aircraft used for this flight number.
    {type[$2]++}
    END {for (a in type) print a}
Modify the program to print the frequency of use for each type.
Modify again to remove the cancelled flights.
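A sketch of both modifications together, assuming (as in the program above) that the aircraft type is in field 2 and that cancelled flights contain the word Cancelled somewhere on the line. The function name and the sample lines are hypothetical, and sort is added only because the order of a for-in loop over an AWK array is unspecified.

```shell
# aircraft_counts: tally field 2 over lines that do not contain "Cancelled",
# then print one "type count" pair per aircraft type.
aircraft_counts() {
    awk '!/Cancelled/ { type[$2]++ }
         END { for (a in type) print a, type[a] }' | sort
}

# Hypothetical sample: day, aircraft type, status.
printf 'Mon B738 On-time\nTue A320 On-time\nWed B738 Cancelled\nThu B738 On-time\n' |
    aircraft_counts
```

Here the cancelled Wednesday flight is left out, so B738 is counted twice and A320 once.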

Constructing Patterns
Arithmetic comparisons: NF > 0
Regular expressions – coming next
Combinations with logical operators: && || !
Matching: ~ !~
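The pattern forms listed here can be combined freely. The sketch below uses an invented four-line sample (one line is blank) to show an arithmetic comparison, a logical combination, and a negated match.

```shell
# Hypothetical sample: id, vehicle, distance; the second line is blank.
sample='AA10 plane 120

BB20 train 300
CC30 plane 80'

# Arithmetic comparison: NF is the number of fields, so NF > 0 skips blank lines.
nonblank=$(printf '%s\n' "$sample" | awk 'NF > 0 { n++ } END { print n }')

# Logical AND: field 2 matches /plane/ and field 3 is greater than 100.
big_planes=$(printf '%s\n' "$sample" | awk '$2 ~ /plane/ && $3 > 100 { print $1 }')

# Negated match: non-blank lines whose field 2 does not match /plane/.
not_planes=$(printf '%s\n' "$sample" | awk 'NF > 0 && $2 !~ /plane/ { print $1 }')

printf '%s\n' "$nonblank" "$big_planes" "$not_planes"
```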

Regular Expressions
– A single character not otherwise endowed with special meaning matches that character.
  Example: m matches any string containing an m
– Two regular expressions concatenated match a match of the first pattern followed immediately by a match of the second.
  Examples:
  mc matches any string containing mc
  Robert matches any string containing Robert (this is a concatenation of 6 regular expressions)

Regular Expressions (2)
Characters with special meaning in certain contexts are:
    \ [ ] . ^ $ * + ? ( ) |
Two questions:
1. What are the “certain contexts” that give rise to “special meanings”?
2. What if you don’t want a “special meaning”?
Answer to question 2: A \ followed by a single character matches that character.
Examples:
    \$ matches the dollar sign $
    \\ matches the backslash \
    \\\\ matches two backslashes in a row
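The escaping rule can be tried on a few made-up lines. In this sketch the helper name lines and its contents are assumptions; single quotes keep the shell from touching the backslashes, so AWK sees the patterns exactly as written.

```shell
# Hypothetical test lines: one with a dollar sign, one with a single backslash,
# one with two backslashes in a row, and one with neither.
lines() { printf '%s\n' 'cost $5' 'C:\temp' 'a\\b' 'plain'; }

# \$ matches a literal dollar sign.
dollar=$(lines | awk '/\$/')

# \\ matches one backslash, so it finds both lines that contain a backslash.
one_backslash=$(lines | awk '/\\/ { n++ } END { print n }')

# \\\\ matches two backslashes in a row.
two_backslashes=$(lines | awk '/\\\\/')

printf '%s\n' "$dollar" "$one_backslash" "$two_backslashes"
```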

Regular Expressions (3)
A period (.) matches any (one) character.
Examples: A.b matches Aeb, A$b, A_b, Abb

Regular Expressions (4)
A string enclosed in brackets [ ] matches any single character from the string. Some characters in the string may have special meanings depending on their positions.
    [aeiou] matches a line with a lowercase vowel in it
To include ] as a string element, it must be the first character.
    []a] matches a line containing either ] or a

Regular Expressions (5)
A hyphen (-) between two characters in ascending ASCII order denotes the inclusive set of characters; otherwise, it has no special meaning.
    A-M denotes the first 13 upper case letters.
A caret (^) as the first string element indicates the complement of the string in the alphabet; otherwise, it has no special meaning.
    [^0-9] matches a string that has a character other than a digit in it
    [^a-zA-Z0-9] matches a string that has at least one symbol (a character that is neither a letter nor a digit) in it
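A short run over invented test lines illustrates the bracket rules, including ] in first position and the complement with ^. The helper name lines and its six sample strings are assumptions made for this sketch. Note that a space counts as a symbol for the last pattern.

```shell
# Hypothetical test lines.
lines() { printf '%s\n' 'sky' 'tea' 'x]y' '2014' 'R2-D2' 'a b'; }

vowel=$(lines | awk '/[aeiou]/')          # lines containing a lowercase vowel
bracket_or_a=$(lines | awk '/[]a]/')      # lines containing ] or a
non_digit=$(lines | awk '/[^0-9]/')       # lines with at least one non-digit
symbol=$(lines | awk '/[^a-zA-Z0-9]/')    # lines with a character that is not a letter or digit

printf 'vowel:\n%s\nbracket or a:\n%s\nnon-digit:\n%s\nsymbol:\n%s\n' \
    "$vowel" "$bracket_or_a" "$non_digit" "$symbol"
```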

Regular Expressions (6)
1. A regular expression followed by * matches a sequence of 0 or more matches of the regular expression.
2. A regular expression followed by + matches a sequence of 1 or more matches of the regular expression.
3. A regular expression preceded by ^ is constrained to match at the beginning of the line.
4. A regular expression followed by $ is constrained to match at the end of the line.
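The four rules can be compared side by side on one small word list. In this sketch the helper matches and the five test words are hypothetical; the regular expression is passed in as a string and applied with ~, which for these patterns behaves the same as writing it between slashes.

```shell
# matches RE: print, on one line, the test words that match the regular expression RE.
matches() {
    printf '%s\n' ac abbbc xabc abcx bc |
        awk -v re="$1" '$0 ~ re { out = out sep $0; sep = " " } END { print out }'
}

matches 'ab*c'     # ac abbbc xabc abcx  (zero or more b)
matches 'ab+c'     # abbbc xabc abcx     (one or more b)
matches '^ab*c'    # ac abbbc abcx       (match must start the line)
matches 'ab*c$'    # ac abbbc xabc       (match must end the line)
matches '^ab+c$'   # abbbc               (the whole line must match)
```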

AWK Patterns
One form of a pattern for AWK is a regular expression enclosed in a pair of slashes / /
A pattern can be limited to one field by using the match (~) or does not match (!~) symbols, with the field on the left and the regular expression on the right:
    $2 ~ /plane/
    $4 !~ /train/
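The difference between matching the whole line and matching one field shows up on a small invented sample. In this sketch the match operators are written with the field on the left and the regular expression on the right, which is the form AWK implementations expect; the sample lines are hypothetical.

```shell
# Hypothetical sample: id, vehicle, city, place.
sample='101 plane Boston train-station
102 train Denver airport
103 bus planeview depot'

# /plane/ alone is tested against the whole line.
whole_line=$(printf '%s\n' "$sample" | awk '/plane/ { print $1 }')

# $2 ~ /plane/ looks only at field 2.
field_two=$(printf '%s\n' "$sample" | awk '$2 ~ /plane/ { print $1 }')

# $4 !~ /train/ selects lines whose field 4 does not contain train.
no_train=$(printf '%s\n' "$sample" | awk '$4 !~ /train/ { print $1 }')

printf '%s\n' "$whole_line" "$field_two" "$no_train"
```

Line 103 matches /plane/ only because of planeview in field 3, so it drops out once the pattern is limited to field 2.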

AWK Patterns (2) Your turn: For each of the patterns on the next slide, describe the lines that would be matched by an AWK program using that pattern.

AWK Patterns (3)
1. /F$/
2. /[xy]/
3. /Mc/
4. /ab*c/
5. /[A-Za-z]*/
6. /[0-9].[0-9]/
7. /$$/
8. /ab*c*/
9. /^[^A-Za-z0-9]/
10. /[JS]r\.$/
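One way to check an answer is to count matches on a few test lines. The sketch below does this for patterns 4, 5, and 8 only, plus a + variant of pattern 5 for contrast; the helper count_matches and its six test lines (one of them empty) are assumptions made here.

```shell
# count_matches RE: count how many of six test lines match the regular expression RE.
# The test lines are: abc, ac, a, xyz, an empty line, and 123.
count_matches() {
    printf '%s\n' abc ac a xyz '' 123 |
        awk -v re="$1" '$0 ~ re { n++ } END { print n + 0 }'
}

count_matches 'ab*c'        # 2: abc and ac
count_matches 'ab*c*'       # 3: every line containing an a
count_matches '[A-Za-z]*'   # 6: zero letters is still a match, so every line qualifies
count_matches '[A-Za-z]+'   # 4: only lines with at least one letter
```

A pattern that can match the empty string, such as patterns 5 and 8 after their first character, is much less selective than it looks.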

AWK Patterns (4)
1. For the web form data, generate a list of the properly constructed US and Canada telephone numbers.
2. Find all proper names in the Moby Dick extract. Caution: This is an HTML file, so tags abound.
3. For the countries data:
   a. What is the population of Asia?
   b. What percentage of the world’s population lives in Asia?
   c. How many countries the size of Sudan will fit inside Canada?

AWK Exercises
1. Given a file of words, one per line, write an AWK script that returns the frequency count of the letters in the words. Use a template that
– has one action statement in the body, a for loop
– has one statement for the END pattern, a for loop that controls the printing
– uses one user-defined variable, an array called lc
– uses the substring function, substr, to split each word into its individual characters.
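A sketch that follows this template, offered as one possible reading of it rather than the expected answer: the body is a single for loop that walks each word with substr, and the END action is a single for loop that prints the array lc. The loop indices i and c are used in addition to lc, the function name and the two-word sample are hypothetical, and sort is outside the template, added only to make the output order stable.

```shell
# letter_counts: count how often each character occurs in the words on standard input.
letter_counts() {
    awk '{ for (i = 1; i <= length($0); i++) lc[substr($0, i, 1)]++ }
         END { for (c in lc) print c, lc[c] }' | sort
}

# Hypothetical two-word input.
printf 'awk\nwalk\n' | letter_counts
```

For this input the letters a, k, and w each occur twice and l occurs once.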

AWK Exercises
2. Suppose each line of a file looks like the 14-character string that represents a US phone number in cellphone standard form (610)
   a. How many fields are there in the string if you are using default FS? What are their lengths?
   b. What if FS = “ - ” ?
3. Neither of these answers is acceptable for processing phone numbers. Reformat the string (610) and others of this form by using an AWK script with a single action statement and writing the string as