Center for PersonKommunikation P.1 Background for NLP Questions brought up by N. Chomsky in the 1950s: –Can a natural language like English be described.

Presentation transcript:

Center for PersonKommunikation P.1 Background for NLP
Questions brought up by N. Chomsky in the 1950s:
– Can a natural language like English be described (“parsed”, “compiled”) with the same methods as those used for formal/artificial (programming) languages in computer science?
– Can we use simple finite state grammars or context-free grammars to describe English?
– Or does linguistics need to invent its own, more powerful type of grammar to describe natural languages?
Offshoots: “The Chomsky Hierarchy of Grammars”, “Natural Language Processing”, “Generative Transformational Grammar”.

Center for PersonKommunikation P.2 Chomsky: Grammar Theory 0
Some key extracts/quotations from “Syntactic Structures”:
– A language is an (infinite) set of sentences, each finite in length and constructed out of a finite set of elements.
– A grammar is a device that separates the grammatical sequences from the ungrammatical sequences and generates the structures of the grammatical ones.
– A grammar is a reconstruction of the native speaker’s competence, that is, the ability to generate (produce and understand) an infinite number of sentences.
– A grammar is a theory of a language. It must comply with the empiricist axioms: the theory must be adequate and simple.

Center for PersonKommunikation P.3 Chomsky: Grammar Theory 1
[Diagram: English Native Speaker; English Grammar/Language Theory]
Native speaker: “Have you a book on modern music?” “The book seems interesting.” ...
Grammar/language theory: “Sentence parsable!” ...
The grammar must generate (“parse”) ALL sentences acceptable to the native speaker and ...

Center for PersonKommunikation P.4 Chomsky: Grammar Theory 2
[Diagram: English Grammar/Language Theory; English Native Speaker]
Grammar/language theory, random sentence generation: 1) “Colorless green ideas sleep furiously.” 2) “Have you a book on modern music?” ...
Native speaker: “According to my intuition this sentence is OK!” ...
... the grammar must generate NOTHING BUT sentences acceptable to the native speaker and ...
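The random sentence generation test on this slide can be sketched in a few lines of Python. This is only an illustration: the rules and lexicon below are hypothetical, not taken from the slides. The grammar is expanded top-down with random choices, and every output would then be shown to a native speaker for an acceptability judgement.

```python
import random

# Hypothetical toy grammar and lexicon, for illustration only.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Adj", "Noun"], ["Noun"]],
    "VP": [["Verb", "Adv"], ["Verb"]],
}
LEXICON = {
    "Adj": ["colorless", "green", "modern"],
    "Noun": ["ideas", "books", "music"],
    "Verb": ["sleep", "seem"],
    "Adv": ["furiously", "quietly"],
}


def generate(symbol, rng):
    """Expand `symbol` top-down, choosing rules and words at random."""
    if symbol in LEXICON:
        return [rng.choice(LEXICON[symbol])]
    words = []
    for child in rng.choice(RULES[symbol]):
        words.extend(generate(child, rng))
    return words


rng = random.Random(0)  # fixed seed so that runs are repeatable
for _ in range(3):
    print(" ".join(generate("S", rng)))
```

If any generated string is rejected by the native speaker, the grammar over-generates and violates the NOTHING BUT requirement.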

Center for PersonKommunikation P.5 Chomsky: Grammar Theory 3
[Diagram: Grammar A and Grammar B (equivalent grammars); set of sentences generated by A and B (Language l); preferable grammar]
... the grammar must be as SIMPLE (e.g. “small”) as possible

Center for PersonKommunikation P.6 Chomsky: Grammar Theory 4
What’s in the “Black Box”? What type of grammar can “generate” a natural language like English?
– A Finite State Grammar without/with loops? (No! “Syntactic Structures” pp. 18 ff.)
– A Phrase Structure Grammar? (No! “Syntactic Structures” pp. 26 ff.)
– A Transformational Grammar? (Yes/Maybe! According to “Syntactic Structures” pp. 34 ff., BUT “Generative Transformational Grammar” has turned out to be a “blind alley” in computational linguistics.)

Center for PersonKommunikation P.7 Chomsky: Hierarchy of Grammars
Type 3: Regular Grammars
– Equivalent to finite state automata, finite state transition networks, Markov models (probabilistic type).
Type 2: Context-free Grammars
– E.g. recursive transition networks (RTNs), phrase structure grammars (PSGs), and unification grammars where attributes take values drawn from a finite table.
Type 1: Context-sensitive Grammars
– Augmented transition networks (ATNs), transformational grammars, some unification grammars.
Type 0: Unrestricted Grammars

Center for PersonKommunikation P.8 Finite State Grammar
Structure:
– Directed graph/transition network structure
– All transitions are terminals
– The terminal symbols are either words or POS (word class) names like Noun, Verb, Pronoun
– The network structure may involve loops (iterations) and “empty” transitions (jumps, skips)
[Figure labels: 4 nodes, transition, loop, jump (j)]
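The structure listed above can be sketched as a small recognizer. The network below is a hypothetical example, not the one drawn on the slide: it has 4 nodes, POS names as terminals, one loop and one jump. The arc encoding and the function names are assumptions made for this sketch.

```python
# Hypothetical toy network; node numbers and arc labels are illustrative only.
# Each arc is (source, label, target); the label None marks an empty "jump" arc.
ARCS = [
    (0, "Det", 1),
    (0, None, 1),     # jump: the determiner is optional
    (1, "Adj", 1),    # loop: any number of adjectives
    (1, "Noun", 2),
    (2, "Verb", 3),
]
START, FINAL = 0, {3}


def closure(nodes):
    """Return all nodes reachable from `nodes` via jump arcs alone."""
    seen = set(nodes)
    stack = list(nodes)
    while stack:
        node = stack.pop()
        for src, label, dst in ARCS:
            if src == node and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def accepts(tags):
    """True if the POS-tag sequence traces a path from START to a final node."""
    current = closure({START})
    for tag in tags:
        step = {dst for src, label, dst in ARCS if src in current and label == tag}
        current = closure(step)
        if not current:
            return False
    return bool(current & FINAL)
```

For instance, `accepts(["Det", "Adj", "Adj", "Noun", "Verb"])` follows the loop twice, and `accepts(["Noun", "Verb"])` uses the jump to skip the determiner.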

Center for PersonKommunikation P.9 Recursive Transition Network Grammar
Structure:
– A SET of named directed graph/transition network structures
– Transitions are terminals or NON-TERMINALS
– Terminal symbols/loops/jumps: see the FSN slide
– A non-terminal symbol is the name of a network in the set included in the RTN
[Figure labels: network X with arcs a, b, X and a jump]
Equivalent BNF/PSG:
X -> a b
X -> a X b
(the A^n B^n problem, “Syntactic Structures”, p. 30)

Center for PersonKommunikation P.10 What’s wrong with FSNs & RTNs according to Chomsky?
– FSNs without loops can only generate a finite set of sentences, but English is an infinite set.
– FSNs with loops generate infinite sets of sentences but cannot describe A^n B^n sequences found in constructions with “respectively”.
– RTNs (PSGs/BNFs) generate infinite sets of sentences and can describe A^n B^n sequences, but applied to English they require a huge number of symbols (conflict with simplicity).
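The second and third points can be illustrated with a small comparison. The looping FSN is stood in for by the regular expression `a+b+`, which is an assumption chosen for this sketch as the closest simple finite state approximation. It accepts every A^n B^n string but cannot keep the two counts equal, whereas a recursive check mirroring X -> a b | a X b can. This is an illustration, not a proof.

```python
import re

# A finite state network with loops corresponds to a regular expression.
# "a+b+" is an illustrative approximation: one loop on a, one loop on b.
LOOPING_FSN = re.compile(r"a+b+")


def fsn_accepts(s):
    """True if the whole string matches the looping network."""
    return LOOPING_FSN.fullmatch(s) is not None


def is_anbn(s):
    """Recursive check mirroring the rules X -> a b and X -> a X b."""
    if s == "ab":
        return True
    return len(s) >= 4 and s[0] == "a" and s[-1] == "b" and is_anbn(s[1:-1])
```

Both functions accept "aabb", but only `fsn_accepts` also accepts "aaab", where the counts differ. The finite state version over-generates.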

Center for PersonKommunikation P.11 Generative Transformational Grammar

Center for PersonKommunikation P.12 Young et al.: Grammar types in speech recognition
Level Building [obsolete] (Young p. 8 f.):
– finite state grammar without loops
Viterbi/Token passing (Young p. 8 f., p. 11 ff.):
– finite state grammar with loops
– context-free grammar, provided that every non-terminal refers to a unique instance of a sub-network (no recursion!)
Conclusion: Decoding algorithms used in modern speech recognition technology can only be applied to the weakest grammar type within the Chomsky Hierarchy.
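The “unique instance of a sub-network” condition can be sketched as follows. This is not the algorithm from Young et al.; it is a hedged illustration with a hypothetical toy grammar. A non-recursive context-free grammar can be flattened by substituting a fresh copy of each sub-network at every reference, which leaves a purely finite state description. A recursive grammar would make this substitution run forever, so the sketch checks for recursion first.

```python
# Hypothetical toy grammar: non-terminal -> list of alternatives (symbol sequences).
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "Noun"], ["Noun"]],
    "VP": [["Verb", "NP"], ["Verb"]],
}


def is_recursive(grammar):
    """True if some non-terminal can reach itself through the rules."""
    def reaches(symbol, target, seen):
        for alternative in grammar.get(symbol, []):
            for child in alternative:
                if child == target:
                    return True
                if child in grammar and child not in seen:
                    seen.add(child)
                    if reaches(child, target, seen):
                        return True
        return False
    return any(reaches(nt, nt, {nt}) for nt in grammar)


def expand(grammar, symbol):
    """Substitute a copy of each sub-network; return all terminal sequences."""
    if symbol not in grammar:
        return [[symbol]]
    results = []
    for alternative in grammar[symbol]:
        partial = [[]]
        for child in alternative:
            partial = [p + s for p in partial for s in expand(grammar, child)]
        results.extend(partial)
    return results
```

Here `is_recursive(GRAMMAR)` is false, so `expand(GRAMMAR, "S")` terminates and lists every terminal sequence. A practical decoder would build a shared network rather than enumerate sequences, but the substitution step is the same idea.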

Center for PersonKommunikation P.13 Exercise 1
The following extremely simple grammar generates all (typed) English sentences. What is wrong with it according to the Chomsky theory? Describe the concepts “native speaker”, “intuition”, “generate” as used in the Chomsky theory.
[Figure: network whose transition is labeled char]
Lexicon: char = a, b, c, … z, ‘;’, ‘.’, ‘?’, … etc.

Center for PersonKommunikation P.14 Exercise 2
Consider the "correct" (or "grammatical") ANSI C printf sequences in I and compare them with the "false" (or "ungrammatical") ones in II:
I:
printf("%d %s",integer,string)
printf("%s %d %d",string,integer,integer)
printf("%d",integer)
etc.
II:
printf("%d %s",integer,string,string)
printf("%d %s",integer)
printf("%d %s",string,integer)
etc.
Is it possible to design a regular (finite state) grammar that generates the correct sequences in I without generating those in II? If not, can a context-free grammar be designed that meets these conditions?