Theory of Computation Languages.



Languages
We’re well accustomed to at least one programming language, whether that’s a low-level language (like Assembly) or a high-level language (like Java). However, languages aren’t restricted to creating programs. We can use them to represent lots of different things, like pattern matching pieces of text!

Languages
These non-programming languages can be split into two distinct categories:
- Regular Languages
- Context-Free Languages
Both can be used to represent or create words and phrases, or even whole programming languages! We’re going to look at one notation for each category:
- Regular Languages: Regular Expressions
- Context-Free Languages: Backus-Naur Form
We’ll see the differences between these two types during this presentation.

Languages
It is worth mentioning something before looking at these examples. Both can end up with a similar result, but each is better suited to a different kind of expression:
- Regular Languages: small, singular expressions (like single words)
- Context-Free Languages: defining entire languages
Both Regular and Context-Free languages are aimed at creating expressions (which could be sentences in a language, or statements in a programming language).

Regular Languages
Regular languages have very specific, strict rules that work best for small pieces of text, like single words. The rules are also set up in such a way that they can be checked by Finite State Machines (specifically, Deterministic Finite Automata). The most common notation for regular languages these days is the Regular Expression (Regex, for short).

Regular Languages: Regex Overview
Regex is a popular notation for regular languages, used primarily for pattern matching. Its alphabet is small and easy to use. It works by creating an expression, which we can compare against any input. If the input follows the rules of the expression, we return true (or the position of the match). Otherwise, we return false.

Regular Languages: Regex Overview
We can also think of a Regex as representing a set: the set of all accepted inputs. For example, say we wanted to create a set of inputs using only 0s and 1s. The set only allows 0s, followed by 1s. We cannot mix-and-match.

Regular Languages: Regex Overview
The set comprehension for this would look like so (taking ℕ to include 0):
$\mathit{Input} = \{\, 0^x 1^y \mid x, y \in \mathbb{N} \,\}$
However, we can simplify this into a single Regex:
$\mathit{Input} = 0^* 1^*$
That’s much simpler! Each symbol in a Regex has a certain rule associated with it.
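
As a rough illustration of the "return true or false" idea, here is a minimal sketch using Python's standard `re` module. The helper name `accepts` and the sample inputs are assumptions made for this example, not part of the original slides.

```python
import re

# The Regex 0*1* from the slide: any number of 0s followed by any number of 1s.
PATTERN = re.compile(r"0*1*")


def accepts(text: str) -> bool:
    """Return True when the whole input follows the rules of 0*1*."""
    # fullmatch requires the entire string to match, not just a prefix.
    return PATTERN.fullmatch(text) is not None


# Inputs that mix 0s after 1s are rejected.
for candidate in ["", "0", "0011", "111", "10", "0101"]:
    print(repr(candidate), accepts(candidate))
```

Using `fullmatch` rather than `search` matters here: `search` would report a match for almost any input, because 0*1* can match an empty piece of text.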

Regular Languages: Regex Rules
Here is a handy table of the Regex symbols and their respective rules. There are other symbols with rules, but any other ones will be explained next to exam questions requiring them.

Symbol   Rule
*        0 or more occurrences
+        1 or more occurrences
?        0 or 1 occurrences
|        Alternation (either the left side or the right side occurs)
()       Group rules into a priority (like how brackets work in maths)
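
Each symbol in the table can be tried out directly. The sketch below uses Python's standard `re` module; the patterns and sample strings are made up for illustration, and `matches` is a hypothetical helper name.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Return True when the whole of text is accepted by pattern."""
    return re.fullmatch(pattern, text) is not None


# One small pattern per symbol, with an accepted and a rejected input.
CHECKS = [
    (r"ab*", "a", "b"),        # * : zero or more b after the a
    (r"ab+", "abbb", "a"),     # + : at least one b is required
    (r"ab?", "ab", "abb"),     # ? : at most one b is allowed
    (r"a|b", "b", "ab"),       # | : either a or b, not both
    (r"(ab)+", "abab", "aba"), # (): the + applies to the whole group ab
]

for pattern, good, bad in CHECKS:
    print(pattern, matches(pattern, good), matches(pattern, bad))
```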

Can you write at least three accepted inputs for each of the following Regexes?
a+b*
01+
01+0
(ab?|ba?)*
Can you come up with a Regex for any UK mobile number?

Regular Languages: Regex → FSM
Because of the simple, strict nature of Regex rules, it is straightforward to turn an expression into its own Finite State Machine. It takes some practice, but it is possible to make an FSM for every Regex. The best way to make one is to closely examine the rules used in an expression, and the order they are used in.

Regular Languages: Regex → FSM
If a Regex is being used to say true or false to an input, then the FSM will need two things:
- A starting state
- At least one accepting state
We can take any input we can think of and ‘feed’ it into the FSM. Each character of the input is consumed in turn, and the FSM transitions from its current state to the next state based on that character. If we consume the final character and finish on an accepting state, it is a valid input.

Regular Languages: Regex → FSM
For example, we had the Regex a+b* earlier. This is the FSM for that expression.

Regular Languages: Regex → FSM
The a has the “1 or more” symbol (+) after it, so we need at least one a before getting to the accepting state. However, we can then have as many of these as we want, until we move on to b. At that point, we can have as many of those as we want.
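
The behaviour described for a+b* can be sketched as a small table-driven FSM in Python. The state names (`start`, `seen_a`, `seen_b`) are assumptions chosen for this example and need not match the diagram on the original slide.

```python
# Transition table for a+b*: (current state, input character) -> next state.
TRANSITIONS = {
    ("start", "a"): "seen_a",   # the first a is required
    ("seen_a", "a"): "seen_a",  # further a's loop on the same state
    ("seen_a", "b"): "seen_b",  # move on to b
    ("seen_b", "b"): "seen_b",  # further b's loop on the same state
}

# Finishing in either of these states means the input is valid.
ACCEPTING = {"seen_a", "seen_b"}


def run_fsm(text: str) -> bool:
    """Feed text into the FSM one character at a time."""
    state = "start"
    for character in text:
        state = TRANSITIONS.get((state, character))
        if state is None:
            # No transition exists for this character, so the input is rejected.
            return False
    return state in ACCEPTING


for candidate in ["a", "aabbb", "", "b", "aba"]:
    print(repr(candidate), run_fsm(candidate))
```

Both `seen_a` and `seen_b` are accepting because b* allows zero b's, so an input made only of a's is already valid.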

Regular Languages: Regex → FSM
Here are some tips you can follow to create FSMs:
- If it asks for 0 or more, be ready to either skip or ‘land’ on that state (by using it as an accepting state).
- If it asks for 1 or more, we can add a ‘transition’ state that uses at least one of those symbols, then have a transition to itself if needed.
- For 0 or 1, the state itself needs to be accepting (unless it’s moving over to somewhere else). Then have a single state it can transition to using that symbol. That state can then transition to other states as needed.

Create state transition diagrams for the following Regexes:
01+
01+0
(ab?|ba?)*

Convert the following FSM into a Regex

Context-Free Languages
Context-Free languages are similar to Regular languages: they still use rules to define acceptable inputs. However, rather than being strict like regular languages (strict enough that we can create an FSM from them), Context-Free languages are a lot more ‘loose’, and will often rely on recursion to freely ‘build up’ an acceptable input.

Context-Free Languages
We often form actual languages (whether they’re spoken or programming-based) from Context-Free rules, because of how free we can be with the rules. So, programming languages (like Java) are created using Context-Free rules that first define acceptable words (called lexemes), then define acceptable statements (the syntax of the language).

Context-Free Languages: BNF
One common notation for Context-Free languages is Backus-Naur Form (BNF). It is a ‘language’ designed for creating other languages using set rules. BNF is built on the idea of terminals and nonterminals:
- Terminals: hard-coded characters/symbols that are accepted
- Nonterminals: named groups that are replaced, rule by rule, with terminals (and other nonterminals) as needed

Context-Free Languages: BNF
We use BNF to create production rules: nonterminals that build up into accepted inputs. We can check if an input is valid by breaking it down into its constituent terminals, keeping track of the nonterminals we used along the way. The output of breaking this input down is a syntax tree. Let’s look at an example of BNF production rules first.

Context-Free Languages: BNF Rules
We define any nonterminal via angle brackets, and any raw text is understood to be a terminal. For example, here we create a nonterminal for a digit, using the possible terminals for its value:
<digit> ::= 0|1|2|3|4|5|6|7|8|9

Context-Free Languages: BNF Rules
Like in a Regex, the vertical bar | represents a choice. So, a <digit> is a single numeral from 0 to 9.
<digit> ::= 0|1|2|3|4|5|6|7|8|9

Context-Free Languages: BNF Rules
We can then build this up into a <number>. However, a number should be able to have one or more digits. So, we start by saying a number is a single digit. Then we say it could also be a digit followed by a number.
<digit> ::= 0|1|2|3|4|5|6|7|8|9
<number> ::= <digit> | <digit><number>
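
The two production rules translate almost word for word into recursive Python functions. This is only a sketch: the function names `is_digit` and `is_number` are assumptions chosen to mirror the nonterminals.

```python
DIGITS = set("0123456789")


def is_digit(text: str) -> bool:
    """<digit> ::= 0|1|2|3|4|5|6|7|8|9"""
    return len(text) == 1 and text in DIGITS


def is_number(text: str) -> bool:
    """<number> ::= <digit> | <digit><number>"""
    # First alternative: the whole input is a single digit.
    if is_digit(text):
        return True
    # Second alternative: a digit followed by a (shorter) number.
    return len(text) > 1 and is_digit(text[0]) and is_number(text[1:])


for candidate in ["7", "123", "", "12a"]:
    print(repr(candidate), is_number(candidate))
```

The recursion always stops because each call passes on a string that is one character shorter.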

Context-Free Languages: BNF Rules
Since a number could itself be a single digit, this leads to a recursive possibility. For example, the number 123 would be broken up as follows:
123 = <number 123>
    = <digit 1><number 23>
    = <digit 1><digit 2><number 3>
    = <digit 1><digit 2><digit 3>
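
To see the same breakdown for other inputs, the steps can be generated mechanically. The sketch below is an illustration under the two rules for <digit> and <number>; the function name `derive` and the `<number 123>` labelling style (copied from the slide) are the only conventions it relies on.

```python
def derive(text: str) -> list[str]:
    """List the steps that break a string of digits into <digit> nonterminals."""
    if not text or any(ch not in "0123456789" for ch in text):
        raise ValueError("input must be one or more digits")

    steps = [f"<number {text}>"]
    prefix = ""   # the <digit ...> parts peeled off so far
    rest = text   # the part still labelled as a <number>
    while len(rest) > 1:
        # Apply <number> ::= <digit><number>: peel off the leading digit.
        prefix += f"<digit {rest[0]}>"
        rest = rest[1:]
        steps.append(f"{prefix}<number {rest}>")
    # Apply <number> ::= <digit> to the final remaining character.
    steps.append(f"{prefix}<digit {rest}>")
    return steps


for step in derive("123"):
    print(step)
```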

Come up with the BNF production rules for a UK telephone number. Be sure to include all terminals/nonterminals needed.
Come up with the BNF production rules for a licence plate of a car:
- Licence plates are made up of 4 letters followed by 4 numbers
- Letters include the alphabet, from A to M (all capitals)
- Numbers include digits, from 0 to 9

Context-Free Languages: Syntax Trees
Going back to the example of an input for a number, the BNF production rules broke the input into the following sequence. This creates a syntax tree (though it’s not obvious now).
123 = <number 123>
    = <digit 1><number 23>
    = <digit 1><digit 2><number 3>
    = <digit 1><digit 2><digit 3>

Context-Free Languages: Syntax Trees
The syntax tree would look something like this. Note that we start with the topmost nonterminal (<number> in this case). Then we break it up into the smaller nonterminals, eventually ending up at the terminals.
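
One way to picture such a tree is to build it as nested data and print it with indentation. This is a minimal sketch: representing nodes as `(label, payload)` tuples and the names `parse_number` and `render` are assumptions made for this example.

```python
def parse_number(text: str):
    """Build a syntax tree for text using <number> ::= <digit> | <digit><number>."""
    if not text or any(ch not in "0123456789" for ch in text):
        raise ValueError("input must be one or more digits")
    digit = ("digit", text[0])  # a <digit> node holding its terminal
    if len(text) == 1:
        return ("number", [digit])
    # A <number> node whose children are a <digit> and a smaller <number>.
    return ("number", [digit, parse_number(text[1:])])


def render(node, depth: int = 0) -> list[str]:
    """Turn a tree into indented lines, topmost nonterminal first."""
    label, payload = node
    indent = "  " * depth
    if label == "digit":
        return [f"{indent}<digit> {payload}"]
    lines = [f"{indent}<number>"]
    for child in payload:
        lines.extend(render(child, depth + 1))
    return lines


print("\n".join(render(parse_number("123"))))
```

Each extra level of indentation corresponds to one step down the tree, with the terminals appearing next to their <digit> nodes.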

Create a syntax tree for each of the following inputs (using your licence plate BNF rules from the previous exercise):
PLAT3487
ABCD0123

Regular vs Context-Free
Regular languages tend to focus on small, strict rules involving small inputs, like what letters need to be followed by other letters. Context-free languages focus on overall grammatical rules, like what words/expressions we can or cannot add after others. Everything we can create in a programming language (like statements, if-statements, and class declarations) is possible because context-free languages are more grammatical in nature.

Regular vs Context-Free
It also helps that context-free languages separate their rules into groups: the different nonterminals. That means we can define parts of a language (often recursively) using other parts of the language, via the nonterminals we put them in. Regular languages, however, cannot have these different, split categories. They require a single rule, using lots of symbols.
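
A classic illustration of this difference is nested brackets, which need a rule that refers to itself. The grammar below is a hypothetical example that does not appear in the slides, and `match_nested` and `is_nested` are assumed names. No Regex built only from the *, +, ?, | and () rules can accept exactly the correctly nested inputs at every depth.

```python
# Hypothetical grammar for this sketch (not from the slides):
#   <nested> ::= "" | "(" <nested> ")" <nested>


def match_nested(text: str, pos: int = 0):
    """Return the index just past the <nested> starting at pos, or None on failure."""
    # The loop handles the trailing <nested> in the rule (one bracket pair after another).
    while pos < len(text) and text[pos] == "(":
        # The recursive call handles the <nested> between the brackets.
        inner_end = match_nested(text, pos + 1)
        if inner_end is None or inner_end >= len(text) or text[inner_end] != ")":
            return None  # an opening bracket was never closed
        pos = inner_end + 1
    return pos


def is_nested(text: str) -> bool:
    """Accept only when the whole input is consumed as a <nested>."""
    return match_nested(text) == len(text)


for candidate in ["", "()", "(()())", "(()", ")(", "())"]:
    print(repr(candidate), is_nested(candidate))
```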