Formal Language Theory

Homework
Read documentation on Graphviz – –
Use Graphviz to generate figures like these (more or less): [figures not reproduced in the transcript]

Back to Regular Expressions
    import re
    # The phone numbers in this sample string were lost from the transcript.
    myString = "I have red shoes and blue pants and a green shirt. My phone number is and my friend's phone number is (800) and my cell number is You could also call me at if you'd like."
    # Raw string, so the backslashes reach the regex engine unchanged.
    phoneNumbersRegEx = re.compile(r'1?-?\(?\d{3}\)?-?\d{3}-?\d{4}')
    print(phoneNumbersRegEx.findall(myString))
10. A more interesting example
Answer is here, but let's derive it together
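Because the phone numbers did not survive in the transcript, the following is a minimal sketch of what the pattern from the slide picks up. The sample text and its numbers are invented for illustration; only the regular expression comes from the slide.

```python
import re

# Pattern copied from the slide, written as a raw string.
phone_re = re.compile(r'1?-?\(?\d{3}\)?-?\d{3}-?\d{4}')

# Hypothetical sample text; these numbers are made up for the example.
sample = "Call 555-123-4567 or (800)555-0199 or 1-800-555-0100."
matches = phone_re.findall(sample)
print(matches)

# The pattern has no provision for whitespace, so a number written
# with a space after the area code is not matched.
spaced = phone_re.findall("(800) 555-0199")
print(spaced)
```

The second call is worth noticing: the pattern allows optional hyphens and parentheses but no spaces, which is one reason a "more interesting example" needs a more careful pattern.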

Formal Definition of Regular Expressions
– Base case: a single character is a regular expression; parentheses ( ) group subexpressions
– Concatenation: α β
– Union: α + β
– Kleene Star: (α)*
Characters:
– lower case: a-z
– upper case: A-Z
– digits: 0-9
– special cases: \t \n
– octal codes: \000
– any single character: .
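The three operators can be tried out in Python's re module. This is a small sketch, not part of the slides: the helper name accepts is hypothetical, and note that re writes union as | (in re, + means "one or more", not union).

```python
import re

# Concatenation: "a" followed by "b".
concat = re.compile(r'ab')
# Union: the formal notation a + b is written a|b in Python's re module.
union = re.compile(r'a|b')
# Kleene star: zero or more repetitions of the group (ab).
star = re.compile(r'(ab)*')

def accepts(pattern, s):
    """Return True when the whole string s belongs to the pattern's language."""
    return pattern.fullmatch(s) is not None

for pattern, s in [(concat, 'ab'), (union, 'b'), (star, ''), (star, 'abab'), (star, 'aba')]:
    print(pattern.pattern, repr(s), accepts(pattern, s))
```

Using fullmatch rather than search matters here: the formal definition talks about whole strings being in a language, not about substrings.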

An Equivalence Relation (=R)
A Partition of S ≡ Set of Subsets of S
– Mutually Exclusive & Exhaustive
Equivalence Classes ≡ A Partition such that
– All the elements in a class are equivalent (with respect to =R)
– No element from one class is equivalent to an element from another
Example: Partition integers into evens & odds
– Even integers: 2, 4, 6, …
– Odd integers: 1, 3, 5, …
– x =R y ⇔ x has the same parity as y
Three Properties
– Reflexive: a =R a
– Symmetric: a =R b ⇒ b =R a
– Transitive: a =R b & b =R c ⇒ a =R c
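The parity example can be checked by brute force on a small range. This is an illustrative sketch; the function name same_parity and the range 1..10 are choices made here, not part of the slides.

```python
from itertools import product

def same_parity(x, y):
    """x =R y exactly when x and y are both even or both odd."""
    return x % 2 == y % 2

S = range(1, 11)

# Check the three properties exhaustively over S.
reflexive = all(same_parity(a, a) for a in S)
symmetric = all(same_parity(b, a)
                for a, b in product(S, repeat=2) if same_parity(a, b))
transitive = all(same_parity(a, c)
                 for a, b, c in product(S, repeat=3)
                 if same_parity(a, b) and same_parity(b, c))

# Build the partition: one equivalence class per remainder mod 2.
classes = {}
for x in S:
    classes.setdefault(x % 2, []).append(x)
evens, odds = classes[0], classes[1]

print(reflexive, symmetric, transitive)
print(evens, odds)
```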

WordNet (Ch2): An Equivalence Relation
    >>> for s in wn.synsets('car'): print s.lemma_names
    ['car', 'auto', 'automobile', 'machine', 'motorcar']
    ['car', 'railcar', 'railway_car', 'railroad_car']
    ['car', 'gondola']
    ['car', 'elevator_car']
    ['cable_car', 'car']
    >>> for s in wn.synsets('car'): print flatten(s.lemma_names) + ': ' + s.definition
    car auto automobile machine motorcar: a motor vehicle with four wheels; usually propelled by an internal combustion engine
    car railcar railway_car railroad_car: a wheeled vehicle adapted to the rails of railroad
    car gondola: the compartment that is suspended from an airship and that carries personnel and the cargo and the power plant
    car elevator_car: where passengers ride up and down
    cable_car car: a conveyance for passengers or freight on a cable railway

Synonymy: An Equivalence Relation?
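One way to probe the question is to test the three properties on the 'car' synsets listed in the transcript. The sketch below assumes a particular reading of synonymy, namely "two words are synonyms if they share at least one synset"; that definition and the helper name synonyms are assumptions made for illustration.

```python
from itertools import product

# Lemma lists copied from the wn.synsets('car') output in the transcript.
synsets = [
    ['car', 'auto', 'automobile', 'machine', 'motorcar'],
    ['car', 'railcar', 'railway_car', 'railroad_car'],
    ['car', 'gondola'],
    ['car', 'elevator_car'],
    ['cable_car', 'car'],
]

def synonyms(a, b):
    """Assumed definition: a and b appear together in at least one synset."""
    return any(a in s and b in s for s in synsets)

words = sorted({w for s in synsets for w in s})

reflexive = all(synonyms(w, w) for w in words)
symmetric = all(synonyms(b, a)
                for a, b in product(words, repeat=2) if synonyms(a, b))
# Collect triples that break transitivity.
counterexamples = [(a, b, c) for a, b, c in product(words, repeat=3)
                   if synonyms(a, b) and synonyms(b, c) and not synonyms(a, c)]
transitive = not counterexamples

print(reflexive, symmetric, transitive)
print(('auto', 'car', 'gondola') in counterexamples)
```

Under this reading, 'auto' and 'car' are synonyms, and so are 'car' and 'gondola', but 'auto' and 'gondola' are not. Transitivity fails because 'car' is ambiguous, so word-level synonymy does not partition the vocabulary the way parity partitions the integers.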

Comments

A Partial Order (≤R)
Powerset({x,y,z})
– Subsets ordered by inclusion
– a ≤R b ⇔ a ⊆ b
Three properties
– Reflexive: a ≤ a
– Antisymmetric: a ≤ b & b ≤ a ⇒ a = b
– Transitive: a ≤ b & b ≤ c ⇒ a ≤ c
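The powerset example is small enough to verify exhaustively. This is a sketch under the slide's definition (a ≤R b exactly when a ⊆ b); the helper name powerset is introduced here for illustration.

```python
from itertools import combinations, product

def powerset(items):
    """Return all subsets of items as frozensets."""
    items = list(items)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

P = powerset(['x', 'y', 'z'])

# For frozensets, a <= b is the subset test, i.e. the ordering on the slide.
reflexive = all(a <= a for a in P)
antisymmetric = all(a == b
                    for a, b in product(P, repeat=2) if a <= b and b <= a)
transitive = all(a <= c
                 for a, b, c in product(P, repeat=3) if a <= b and b <= c)

# Pairs that are unrelated in either direction: the order is partial, not total.
incomparable = [(a, b) for a, b in product(P, repeat=2)
                if not a <= b and not b <= a]

print(len(P), reflexive, antisymmetric, transitive, len(incomparable))
```

The incomparable pairs, such as {x} and {y}, are what make the order partial: neither set contains the other.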

Wordnet: A Partial Order
    >>> for h in wn.synsets('car')[0].hypernym_paths()[0]: print h.lemma_names
    ['entity']
    ['physical_entity']
    ['object', 'physical_object']
    ['whole', 'unit']
    ['artifact', 'artefact']
    ['instrumentality', 'instrumentation']
    ['container']
    ['wheeled_vehicle']
    ['self-propelled_vehicle']
    ['motor_vehicle', 'automotive_vehicle']
    ['car', 'auto', 'automobile', 'machine', 'motorcar']
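A hypernym path like this is a natural candidate for a Graphviz figure. The sketch below writes the path as DOT text in plain Python, with no NLTK or Graphviz bindings needed; the function name to_dot and the graph name are hypothetical, and taking only the first lemma of each synset is a simplification made here.

```python
# First lemma of each synset on the hypernym path shown in the transcript.
path = [
    'entity', 'physical_entity', 'object', 'whole', 'artifact',
    'instrumentality', 'container', 'wheeled_vehicle',
    'self-propelled_vehicle', 'motor_vehicle', 'car',
]

def to_dot(nodes, name='hypernyms'):
    """Return DOT source with one edge from each node to the next."""
    lines = ['digraph %s {' % name]
    for parent, child in zip(nodes, nodes[1:]):
        # Quote node names so characters such as '-' are legal in DOT.
        lines.append('    "%s" -> "%s";' % (parent, child))
    lines.append('}')
    return '\n'.join(lines)

dot_text = to_dot(path)
print(dot_text)
```

Saving the output to a file such as car.dot and running `dot -Tpng car.dot -o car.png` should render the chain as an image.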

Help
    >>> s = wn.synsets('car')[0]
    >>> s.name
    'car.n.01'
    >>> s.pos
    'n'
    >>> s.lemmas
    [Lemma('car.n.01.car'), Lemma('car.n.01.auto'), Lemma('car.n.01.automobile'), Lemma('car.n.01.machine'), Lemma('car.n.01.motorcar')]
    >>> s.examples
    ['he needs a car to get to work']
    >>> s.definition
    'a motor vehicle with four wheels; usually propelled by an internal combustion engine'
    >>> s.hyponyms()[0:3]
    [Synset('stanley_steamer.n.01'), Synset('hardtop.n.01'), Synset('loaner.n.02')]
    >>> s.hypernyms()
    [Synset('motor_vehicle.n.01')]

CFGs: Context Free Grammars (Ch8)

Ambiguity

The Chomsky Hierarchy
– Type 0 > Type 1 > Type 2 > Type 3
– Recursively Enumerable > CS > CF > Regular
Examples
– Type 3: Regular (Finite State): Grep & Regular Expressions
  – Right-Branching: A → a A
  – Left-Branching: B → B b
– Type 2: Context-Free (CF):
  – Center-Embedding: C → … → x C y
  – Parenthesis Grammars: S → ( S )
  – w w^R
– Type 1: Context-Sensitive (CS): w w
– Type 0: Recursively Enumerable
– Beyond Type 0: Halting Problem
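The example languages are easy to recognize in ordinary Python, which helps make them concrete. The three functions below are illustrative membership tests written for this purpose, with hypothetical names; they say nothing by themselves about which grammar class each language needs.

```python
def balanced(s):
    """Parenthesis language: every prefix has at least as many '(' as ')'."""
    depth = 0
    for ch in s:
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
            if depth < 0:
                return False
        else:
            return False  # only parentheses are in the alphabet
    return depth == 0

def is_w_wr(s):
    """w w^R: a string followed by its reverse (even-length palindromes)."""
    return len(s) % 2 == 0 and s == s[::-1]

def is_ww(s):
    """w w: a string followed by an exact copy of itself."""
    half, rem = divmod(len(s), 2)
    return rem == 0 and s[:half] == s[half:]

print(balanced('(()())'), balanced('())('))
print(is_w_wr('abba'), is_w_wr('abab'))
print(is_ww('abab'), is_ww('abba'))
```

The contrast between the last two is the interesting part: 'abba' mirrors around its centre, which a stack can track, while 'abab' repeats in the same order, which is what pushes w w out of the context-free class.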

Syntax & Semantics
Syntax: Symbol pushing / Parsing
– Parsing: use context-free grammar to map string → tree
Semantics: Meaning (making sense of trees)
– Is synonymy an equivalence relation?
Dichotomy is important both for
– Natural Languages (English, FIGS, CJK, etc.)
  – FIGS: French, Italian, German & Spanish
  – CJK: Chinese, Japanese & Korean
– as well as Artificial Languages
  – Python, HTML, Javascript, SQL, C
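To make "map string → tree" concrete, here is a minimal recursive-descent sketch for a toy parenthesis grammar, S → ( S ) S | empty. The grammar, the function names, and the tuple representation of trees are all assumptions chosen for illustration; this is not the parser used in the course.

```python
def parse(s):
    """Parse s with S -> ( S ) S | empty; return a tree or raise ValueError."""
    tree, pos = _parse_s(s, 0)
    if pos != len(s):
        raise ValueError('unexpected %r at position %d' % (s[pos], pos))
    return tree

def _parse_s(s, pos):
    """Parse one S starting at pos; return (tree, position after it)."""
    # S -> empty: chosen whenever the next symbol is not '('.
    if pos >= len(s) or s[pos] != '(':
        return ('S',), pos
    # S -> ( S ) S
    inner, pos = _parse_s(s, pos + 1)
    if pos >= len(s) or s[pos] != ')':
        raise ValueError('expected ")" at position %d' % pos)
    rest, pos = _parse_s(s, pos + 1)
    return ('S', '(', inner, ')', rest), pos

print(parse('()'))
print(parse('(())()'))
```

Each tuple is a tree node labelled S with its children in order. Everything after this point (deciding what the tree means) is the job of semantics.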

Summary
Chapter 1
– NLTK (Natural Lang Toolkit)
  – Unix for Poets without Unix
  – Unix → Python
– Object-Oriented
  – Polymorphism: "len" applies to lists, sets, etc. Ditto for: +, help, print, etc.
– Types & Tokens
  – "to be or not to be": 6 tokens & 4 types
– FreqDist: sort | uniq -c
– Concordances
Chapters 2-8
– Chapter 3: URLs
– Chapter 2
  – Equivalence Relations: Parity, Synonymy (?)
  – Partial Orders: Wordnet Ontology
– Chapter 8: CF Parsing
  – Chomsky Hierarchy: CS > CF > Regular
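The types/tokens distinction and the sort | uniq -c idea can be sketched with the standard library alone. Here collections.Counter stands in for NLTK's FreqDist, and whitespace splitting stands in for a real tokenizer; both are simplifications made for this example.

```python
from collections import Counter

text = "to be or not to be"

tokens = text.split()      # every occurrence counts: the running words
types = set(tokens)        # distinct words only
freq = Counter(tokens)     # word -> count, like sort | uniq -c

print(len(tokens), 'tokens;', len(types), 'types')
for word, count in freq.most_common():
    print(count, word)
```

The same len call works on the list and on the set, which is the polymorphism point from Chapter 1.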