CSE 340 Recitation Week 3 : Sept 1st – 7th Regular Expressions


CSE 340 Recitation Week 3: Sept 1st – 7th, Regular Expressions
- Questions
- Project 2
- Regular Expressions
  - Valid Syntax
  - Using REs to Define a Language
  - Evaluating whether “????” is in a Language
  - RE v. MATH
  - Application of REs: Tokens

Questions?
- Current Project
- Homework
- Lecture
- Other?

Project 2
- Project 2 is due FRIDAY 9/9/16 before 11:59pm
- Part 1 (30 points): Debug buggy program
  - Common mistakes
  - Extract with: $ tar -xzvf <filename>
  - Included test script runs the 6 test cases in test.sh
  - Use the diff command to create a patch: $ diff -uwB buggy_program.c good_program.c > fix_bugs.patch

Project 2 – Part 2: Lexer + Linked List
- Will use getToken() to get input from STDIN
- getToken() returns a token_type enum: ID, NUM, IF, WHILE, DO, THEN, PRINT
- Global variables set by getToken():
  - T_type: same as the value returned by getToken()
  - current_token: contains the token value, or blank
  - token_length: length of the string stored in current_token
  - line: the line number of current_token

Project 2 – Part 2: Lexer + Linked List
- ID and NUM tokens are stored in a Linked List
- Need to store Token Type, Token Value, and Line number
- Will need to print the output in reverse (one option might be to use a doubly linked list)
[Figure: three linked-list nodes, each holding Token Type, Value, Line #, and Next, joined by Next and Previous links]

Project 2 – Part 2: Lexer + Linked List
- Must use a Linked List data structure
- Must create the reversed output from the list; you will not receive full credit if you write the output to a string and then print that string in reverse
[Figure: three linked-list nodes, each holding Token Type, Value, Line #, and Next, joined by Next and Previous links]
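The project itself requires a C struct with a self-referential field, so the following is only a language-neutral illustration of the traversal idea, written in Python. It is a minimal sketch, assuming hypothetical names (`TokenNode`, `TokenList`, `reversed_rows`) and an output order of line, type, value that the slides do not specify.

```python
class TokenNode:
    """One list node: token type, token value, line number, and two links."""

    def __init__(self, token_type, value, line):
        self.token_type = token_type
        self.value = value
        self.line = line
        self.next = None   # link toward the tail
        self.prev = None   # link toward the head


class TokenList:
    """Doubly linked list that appends at the tail (hypothetical helper)."""

    def __init__(self):
        self.head = None
        self.tail = None

    def append(self, token_type, value, line):
        node = TokenNode(token_type, value, line)
        if self.tail is None:
            # Empty list: the new node is both head and tail.
            self.head = self.tail = node
        else:
            node.prev = self.tail
            self.tail.next = node
            self.tail = node

    def reversed_rows(self):
        """Walk from tail to head by following the prev links."""
        rows = []
        node = self.tail
        while node is not None:
            rows.append((node.line, node.token_type, node.value))
            node = node.prev
        return rows


# Made-up sample tokens, in the order a lexer might return them.
tokens = TokenList()
tokens.append("ID", "count", 1)
tokens.append("NUM", "42", 1)
tokens.append("ID", "total", 2)

for line, token_type, value in tokens.reversed_rows():
    print(line, token_type, value)
```

The reversal comes from walking the list itself through its `prev` links, not from building a string and reversing it, which is the distinction the slide draws.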

Project 2 – Part 2: Lexer + Linked List
[Figure: panels labeled Standard Output, Standard Input, Standard Output]

Project 2 – Part 2: Lexer + Linked List, Evaluation
- Testing
  - A test script that runs multiple test cases is provided
  - Details on the test scripts are in the project document
- Evaluation
  - Graded on whether the test cases are passed
  - Must use a C-style linked list (a struct with a self-referential field)
  - CANNOT USE the STL

Regular Expressions
- Valid Syntax for REs
- Definition of Languages using REs
- Is “????” ∈ Language (i.e. L(RE) or {})
- RE v MATH

Valid Syntax for REs
A syntactically valid regular expression has one of these forms:
- ∅
- 𝜺
- a, where a is an element of the alphabet
- R1 | R2, where R1 and R2 are regular expressions
- R1 . R2, where R1 and R2 are regular expressions
- (R), where R is a regular expression
- R*, where R is a regular expression

Valid Syntax for REs
Given: 𝛴 = {a, b, c, d, 1, 2, 3, 4}
Are these valid REs?
- Z = a | b | c
- Y = (1 | 2 | 3)*
- X = (Z | 𝜺)
- W = ((1.2 | 2.3 ) 4*)
- V = X.Z.Y.X.Y
Check each one against the syntax rules: ∅; 𝜺; a, where a is an element of the alphabet; R1 | R2; R1 . R2; (R); R*.

Valid Syntax for REs
Given: 𝛴 = {a, b, c, d, 1, 2, 3, 4}
Are these valid REs?
- Z = a | b | c
- Y = (1 | 2 | 3)*
- X = (Z | 𝜺)
- W = ((1.2 | 2.3 ).4*)
- V = X.Z.Y.X.Y
Here W is written with an explicit "." before 4*. The earlier form, ((1.2 | 2.3 ) 4*), fits none of the rules, because concatenation must be written as R1 . R2.
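The seven syntax rules can be checked mechanically. The sketch below is one possible recursive-descent checker in Python, not something from the slides. It assumes single-character symbols, ignores spaces, applies the usual precedence (`*` binds tighter than `.`, which binds tighter than `|`), and does not expand named sub-expressions such as Z or X. The function name `is_valid_re` is hypothetical.

```python
SIGMA = set("abcd1234")
EPSILON, EMPTY = "𝜺", "∅"


def is_valid_re(text, sigma=SIGMA):
    """Return True if text follows the RE syntax rules over sigma."""
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def atom():
        # (R), a symbol of the alphabet, 𝜺, or ∅
        tok = peek()
        if tok == "(":
            advance()
            if not expr() or peek() != ")":
                return False
            advance()
            return True
        if tok is not None and (tok in sigma or tok in (EPSILON, EMPTY)):
            advance()
            return True
        return False

    def factor():
        # R*
        if not atom():
            return False
        while peek() == "*":
            advance()
        return True

    def term():
        # R1 . R2 (the dot is mandatory)
        if not factor():
            return False
        while peek() == ".":
            advance()
            if not factor():
                return False
        return True

    def expr():
        # R1 | R2
        if not term():
            return False
        while peek() == "|":
            advance()
            if not term():
                return False
        return True

    return expr() and pos == len(tokens)


for candidate in ["a | b | c", "(1 | 2 | 3)*",
                  "((1.2 | 2.3 ) 4*)", "((1.2 | 2.3 ).4*)"]:
    print(candidate, is_valid_re(candidate))
```

Under these assumptions the version of W without the dot is rejected and the version with the dot is accepted.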

Definition of Languages using REs
- What is 𝛴? What is 𝛴*?
- Given: 𝛴 = {a, b, c, d, 1, 2, 3}
- Is “dddddddddddddddcccccccccccccccaaaaaaaaaaaaaaaa11111111aaaaaaaaaaaaaa333333333333bbbbbbbbbbbbaaaaaaaa3333333” ∈ 𝛴*? YES
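Since 𝛴* is the set of all finite strings over 𝛴 (including the empty string), membership only requires that every character comes from 𝛴. A minimal Python sketch, with the hypothetical helper name `in_sigma_star`:

```python
SIGMA = set("abcd123")


def in_sigma_star(s, sigma=SIGMA):
    """A string is in sigma* when each of its characters is in sigma."""
    return all(ch in sigma for ch in s)


print(in_sigma_star("dddcccaaa111aaa333bbbaaa333"))
print(in_sigma_star(""))
print(in_sigma_star("a4"))
```

The empty string passes because `all` over no characters is true, matching the fact that the empty string belongs to 𝛴* for any alphabet.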

Definition of Languages using REs
- What is a Language? A Language is a subSET of 𝛴*, i.e., L ⊆ 𝛴*
- How do we describe that subset? Using REs

Definition of Languages using REs
𝛴 = {a, b, c, d, 1, 2, 3, 4}
Z = a | b | c
Y = (1 | 2 | 3)*
X = (Z | 𝜺)
W = (2.3.4*) | 𝜺
V = X.Y.Z.W.X.Y
L(V) = {…}

Definition of Languages using REs
L(V) = {…}
This uses the regular expression V to define a subset of 𝛴*.
Thus, L(V) ⊆ 𝛴*
--------------------
𝛴 = {a, b, c, d, 1, 2, 3, 4}
Z = a | b | c
Y = (1 | 2 | 3)*
X = (Z | 𝜺)
W = (2.3.4*) | 𝜺
V = X.Y.Z.W.X.Y

Definition of Languages using REs
- What are some examples of strings that are in L(V)? L(V) = {a1a234a3, 1aa3, …}
- Is “a1a234a3” ∈ 𝛴*? YES
--------------------
𝛴 = {a, b, c, d, 1, 2, 3, 4}
Z = a | b | c
Y = (1 | 2 | 3)*
X = (Z | 𝜺)
W = (2.3.4*) | 𝜺
V = X.Y.Z.W.X.Y

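Is “????” ∈ Language (i.e. L(RE) or {})
Does L(V) contain:
- a: √
- b123: √
- ab123ab123: X (four letters, but V has room for at most three: X, Z, X)
- ab123c8: X (8 is not in 𝛴)
- a12344321: X (4 can appear only in W after 2.3, directly after the Z letter)
- 312d333: X (d is not matched by Z or X)
Why is it not in L(V)?
--------------------
𝛴 = {a, b, c, d, 1, 2, 3, 4}
Z = a | b | c
Y = (1 | 2 | 3)*
X = (Z | 𝜺)
W = (2.3.4*) | 𝜺
V = X.Y.Z.W.X.Y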
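Membership questions like these can be checked with Python's `re` module. The pattern below is a hand translation of V = X.Y.Z.W.X.Y into Python syntax, which is an assumption of this sketch: Python writes concatenation by juxtaposition instead of ".", `[abc]?` stands for X = (Z | 𝜺), and `(?:234*)?` stands for W = (2.3.4*) | 𝜺. The helper name `in_L_V` is hypothetical.

```python
import re

# X        Y       Z     W          X      Y
V = re.compile(r"[abc]?[123]*[abc](?:234*)?[abc]?[123]*")


def in_L_V(s):
    """True when the whole string s is matched by V."""
    return V.fullmatch(s) is not None


for s in ["a", "b123", "ab123ab123", "ab123c8", "a12344321", "312d333"]:
    print(s, in_L_V(s))
```

`fullmatch` is used instead of `search` because the question is whether the entire string is in the language, not whether some substring is.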

Is “????” ∈ Language (i.e. L(RE) or {})
- What must every string in L(V) contain? At least one of a, b, or c
- Why? Because of the RE Z: it appears in V on its own, with no 𝜺 alternative
--------------------
𝛴 = {a, b, c, d, 1, 2, 3, 4}
Z = a | b | c
Y = (1 | 2 | 3)*
X = (Z | 𝜺)
W = (2.3.4*) | 𝜺
V = X.Y.Z.W.X.Y

EPSILON
- Is this a valid set? 𝛴 = {a, b, c, d, 𝜺, |, .}
- Is this a valid regular expression? Z = a.b.c.𝜺.𝜺
- Is 𝜺 the character from the alphabet or the RE representation of the empty string?

EPSILON
- Usually we will define it like this, for clarity: 𝛴 = {a, b, c, d, \𝜺, \|, \.}
- Is this a valid regular expression? Z = a.b.c.\𝜺.𝜺
- Is “abc𝜺” ∈ L(Z)?

EPSILON
𝛴 = {a, b, c, d}
- Is this a valid regular expression? Z = a.b.c.(a|𝜺).b.c
- What strings are in L(Z)?
- Is “abc𝜺” ∈ L(Z)?

EPSILON
𝛴 = {a, b, c, d}
- Is this a valid regular expression? Z = a.b.c.(a|𝜺).b.c
- What is in the language? L(Z) = {abcabc, abcbc}
- Are these strings in L(Z)?
  - “abcbc” ∈ L(Z)? √
  - “abc𝜺bc” ∈ L(Z)? X
- WHY?
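One way to see why 𝜺 never shows up inside a string: in Python's `re` syntax the 𝜺 alternative is written as an empty branch, so it contributes no character at all. The pattern below is an assumed translation of Z = a.b.c.(a|𝜺).b.c, and the helper name `in_L_Z` is hypothetical.

```python
import re

# (?:a|) is "a or nothing", the counterpart of (a|𝜺) on the slide.
Z = re.compile(r"abc(?:a|)bc")


def in_L_Z(s):
    """True when the whole string s is matched by Z."""
    return Z.fullmatch(s) is not None


for s in ["abcabc", "abcbc", "abc𝜺bc"]:
    print(s, in_L_Z(s))
```

The third string contains the literal character 𝜺, which is not in 𝛴 = {a, b, c, d}, so it cannot be in L(Z).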

RE v. Math
A regular expression defines a subset of 𝛴*.
RE syntax                                          Language it defines
∅                                                  L(∅) = ∅
𝜺                                                  L(𝜺) = {𝜺}
a, where a is an element of the alphabet           L(a) = {a}
R1 | R2, where R1 and R2 are regular expressions   L(R1 | R2) = L(R1) ∪ L(R2)
R1 . R2, where R1 and R2 are regular expressions   L(R1 . R2) = L(R1) . L(R2)
(R), where R is a regular expression               L((R)) = L(R)
R*, where R is a regular expression                L(R*) = ∪_{i≥0} L^i(R)

RE v. Math
Operator Precedence: Just like Math (highest first)
Math    RE
()      ()
^       *
.       .
+       |    (| is similar to +)
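Python's `re` module follows the same precedence order, so it can be used to see how an unparenthesized expression groups. In this sketch the slide-notation expression a | b.c* is written `a|bc*` in Python syntax (concatenation is implicit there), and the two parenthesized variants are included only for contrast. The helper name `matches` is hypothetical.

```python
import re


def matches(pattern, s):
    """True when pattern matches the whole string s."""
    return re.fullmatch(pattern, s) is not None


default = r"a|bc*"          # groups as a | (b.(c*))
alt_first = r"(?:a|b)c*"    # forces (a | b).(c*)
star_group = r"a|(?:bc)*"   # forces a | ((b.c)*)

for s in ["a", "bcc", "ac", "bcbc"]:
    print(s, matches(default, s), matches(alt_first, s), matches(star_group, s))
```

The strings "ac" and "bcbc" are rejected by the default grouping but accepted once parentheses change which operator applies first.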

RE v. Math
L(R1 . R2) = L(R1) . L(R2)
For two sets A and B of strings: A . B = {xy : x ∈ A and y ∈ B}

RE v. Math
𝛴 = {a, b, c, d, 1, 2, 3}
--------------------
Z = a | b | c
X = (Z | 𝜺)
A . B = {xy : x ∈ A and y ∈ B}
Example:
L(X.Z) = L(X) . L(Z)
       = L(Z | 𝜺) . L(a | b | c)
       = (L(Z) ∪ L(𝜺)) . (L(a) ∪ L(b) ∪ L(c))
       = (L(a) ∪ L(b) ∪ L(c) ∪ L(𝜺)) . (L(a) ∪ L(b) ∪ L(c))
       = ({a} ∪ {b} ∪ {c} ∪ {𝜺}) . ({a} ∪ {b} ∪ {c})
       = {a, b, c, 𝜺} . {a, b, c}
       = {aa, ab, ac, ba, bb, bc, ca, cb, cc, a, b, c}
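The set definition A . B = {xy : x ∈ A and y ∈ B} maps directly onto a Python set comprehension. This is a small sketch with the hypothetical helper name `concat`; it represents 𝜺 as the empty string `""`, which is why concatenating with it leaves a, b, and c unchanged.

```python
def concat(A, B):
    """A . B: every string of A followed by every string of B."""
    return {x + y for x in A for y in B}


L_Z = {"a", "b", "c"}     # L(a | b | c)
L_X = L_Z | {""}          # L(Z | 𝜺), with 𝜺 as the empty string
L_XZ = concat(L_X, L_Z)   # L(X.Z)

print(sorted(L_XZ, key=lambda s: (len(s), s)))
print(len(L_XZ))
```

The result has twelve strings: the nine two-letter combinations plus a, b, and c, in agreement with the derivation.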

RE v. Math
L(R*) = ∪_{i≥0} L^i(R), where L^0(R) = {𝜺}
Definition: L^i(R) = L^{i-1}(R) . L(R) for i ≥ 1
L(R*) = {𝜺} ∪ ∪_{i≥1} (L^{i-1}(R) . L(R))

RE v. Math
L(R*) = ∪_{i≥0} L^i(R), where L^0(R) = {𝜺}
L(R*) = {𝜺} ∪ L(R) ∪ L(R) . L(R) ∪ L(R) . L(R) . L(R) ∪ …

RE v. Math
𝛴 = {a, b, c, d, 1, 2, 3}
--------------------
Y = (1 | 2 | 3)*
Rules used:
L(R1 | R2) = L(R1) ∪ L(R2)
L(R1 . R2) = L(R1) . L(R2)
A . B = {xy : x ∈ A and y ∈ B}
L(R*) = ∪_{i≥0} L^i(R), where L^0(R) = {𝜺}
Example: L(Y)
L((1 | 2 | 3)*) = {𝜺} ∪ L(1|2|3) ∪ L(1|2|3) . L(1|2|3) ∪ L(1|2|3) . L(1|2|3) . L(1|2|3) ∪ …
L(1|2|3) = L(1) ∪ L(2) ∪ L(3) = {1} ∪ {2} ∪ {3} = {1, 2, 3}
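L(R*) is infinite whenever L(R) contains a non-empty string, so code can only enumerate a finite prefix of the union. The sketch below builds L^i(R) from the recurrence L^i(R) = L^{i-1}(R) . L(R) and unions the powers up to a chosen bound. The names `concat`, `power`, and `star_up_to` are hypothetical, and 𝜺 is again represented by the empty string.

```python
def concat(A, B):
    """A . B: every string of A followed by every string of B."""
    return {x + y for x in A for y in B}


def power(L, i):
    """L^i, starting from L^0 = {""} and applying L^i = L^(i-1) . L."""
    result = {""}
    for _ in range(i):
        result = concat(result, L)
    return result


def star_up_to(L, max_power):
    """Finite approximation of L*: the union of L^0 through L^max_power."""
    result = set()
    for i in range(max_power + 1):
        result |= power(L, i)
    return result


L = {"1", "2", "3"}       # L(1 | 2 | 3)
approx = star_up_to(L, 2)
print(sorted(approx, key=lambda s: (len(s), s)))
print(len(approx))        # 1 + 3 + 9 = 13
```

Raising the bound adds longer strings, for example "123" appears once `max_power` is at least 3, but no finite bound produces all of L(Y).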