Syntax Sudeshna Sarkar 25 Aug 2008.

Presentation transcript:

Syntax Sudeshna Sarkar 25 Aug 2008

Some Fundamental Questions What is Language? How to define a Language? What makes a language different from another? Is there anything common to all languages?

Syntax Syntax: from Greek syntaxis, “setting out together, arrangement”. Refers to the way words are arranged together, and the relationship between them. Distinction: Prescriptive grammar: how people ought to talk. Descriptive grammar: how they do talk. The goal of syntax is to model the knowledge that people unconsciously have about the grammar of their native language.

The Two Schools Rationalists: It’s all hardcoded in our brains. Principles and Parameters theory. Poverty of the stimulus. Recursion. Empiricists: Just a special kind of pattern recognition. No different from other cognitive abilities like vision. Language is a stochastic phenomenon.

The Generative Grammar “The grammatical principles underlying languages are innate and fixed, and the differences among the world's languages can be characterized in terms of parameter settings in the brain …” - www.wikipedia.org Noam Chomsky [1928-] Courtesy www.chomsky.info

I & E Languages I – Language: Mentally represented system of rules (I – internal) E – Language: Observable external products of I-language (written text, utterances) Language: Collective E-language of a very large group of speakers Syntax: Study of the I-language from E-language

The Chomsky Hierarchy
Grammar   Languages                Automaton                                         Production rules
Type-0    Recursively enumerable   Turing machine                                    No restrictions
Type-1    Context-sensitive        Linear-bounded non-deterministic Turing machine   αAβ → αγβ
Type-2    Context-free             Non-deterministic pushdown automaton              A → γ
Type-3    Regular                  Finite state automaton                            A → aB, A → a
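
The production-rule column of the hierarchy can be read as a set of shape checks on rules. The following is a minimal sketch under stated assumptions, not a complete classifier: the function name `chomsky_type` and the (lhs, rhs) tuple representation are illustrative choices, and empty right-hand sides get no special treatment.

```python
def chomsky_type(rules, nonterminals):
    """Return the most restrictive type (3, 2, 1 or 0) that every rule satisfies.

    rules is a list of (lhs, rhs) pairs; lhs and rhs are tuples of symbols.
    Simplification (assumption): empty right-hand sides are not treated specially.
    """
    def regular(lhs, rhs):
        # Type-3 shapes: A -> a  or  A -> a B
        if len(lhs) != 1 or lhs[0] not in nonterminals:
            return False
        if len(rhs) == 1:
            return rhs[0] not in nonterminals
        return len(rhs) == 2 and rhs[0] not in nonterminals and rhs[1] in nonterminals

    def context_free(lhs, rhs):
        # Type-2 shape: a single non-terminal on the left, anything on the right
        return len(lhs) == 1 and lhs[0] in nonterminals

    def context_sensitive(lhs, rhs):
        # Type-1 shape: alpha A beta -> alpha gamma beta, with gamma non-empty
        for i, symbol in enumerate(lhs):
            if symbol not in nonterminals:
                continue
            alpha, beta = lhs[:i], lhs[i + 1:]
            gamma_len = len(rhs) - len(alpha) - len(beta)
            if (gamma_len >= 1 and rhs[:len(alpha)] == alpha
                    and rhs[len(rhs) - len(beta):] == beta):
                return True
        return False

    for level, test in ((3, regular), (2, context_free), (1, context_sensitive)):
        if all(test(lhs, rhs) for lhs, rhs in rules):
            return level
    return 0


regular_rules = [(("A",), ("a", "B")), (("A",), ("a",))]
cfg_rules = [(("S",), ("NP", "VP"))]
print(chomsky_type(regular_rules, {"A", "B"}))      # 3
print(chomsky_type(cfg_rules, {"S", "NP", "VP"}))   # 2
```

The checks run from most to least restrictive, so a grammar is reported at the highest-numbered type whose rule shape all of its rules fit.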

From Formal to Natural Languages Organizational Unit Complexity Word Regular Sounds Sentence Context-free Discourse ??

Some Observations on NLs Constituency: A group of words acts as a single unit: phrases, clauses etc. Grammatical Relations: Different words/phrases are related to the main verb of the sentence: object, subject, instrument. Subcategorization and Dependency Relations: Not all verbs can take all types of arguments: transitive, intransitive etc.

Syntax Why should you care? Grammar checkers; Question answering; Information extraction; Machine translation

Why NLP is difficult: Newspaper headlines
Iraqi Head Seeks Arms
Juvenile Court to Try Shooting Defendant
Teacher Strikes Idle Kids
Stolen Painting Found by Tree
Local High School Dropouts Cut in Half
Red Tape Holds Up New Bridges
Clinton Wins on Budget, but More Lies Ahead
Hospitals Are Sued by 7 Foot Doctors
Kids Make Nutritious Snacks

Why is NLU difficult? The hidden structure of language is hugely ambiguous Tree for: Fed raises interest rates 0.5% in effort to control inflation (NYT headline 5/17/00)

Where are the ambiguities?

The bad effects of V/N ambiguities

Context-Free Grammars Capture constituency and ordering. Ordering is easy: what are the rules that govern the ordering of words and bigger units in the language? What’s constituency? How words group into units and how the various kinds of units behave with respect to one another.

Constituency We have NLP classes from 5:30 to 6:30 pm on Tuesday. On Tuesday we have NLP classes from 5:30 – 6:30 pm. From 5:30 to 6:30 pm on Tuesday we have NLP classes. We have NLP on Tuesday from 5:30 to 6:30 pm classes. On we have NLP classes from Tuesday 5:30 to 6:30 pm. From 5:30 we have to 6:30 pm on Tuesday NLP classes.

Phrases Phrase: Group of words that act as a unit. Noun Phrase (NP): A midsummer night’s dream; My experiments with truth; The man who knew infinity. Verb Phrase (VP): Gone with the wind; Saving private Ryan. Prepositional Phrase (PP): Of sons and lovers; To sir with love; Beyond the blue mountains; Into the heart of the mind.

Modelling the Syntax of English Let us try CFGs. S → NP VP: I love India. S → VP: Love your country. S → Aux NP VP: Do you love your country? S → Wh-NP VP: Who loves his country? S → Wh-NP Aux NP VP: Which country do you live in?

Phrase Structure Grammar Context Free Grammars are also called phrase structure grammars Phrases are the building blocks of any PSG (i.e. CFG) Phrases in turn are defined by CFG (PSG)

Is CFG Necessary? Can we model the syntax of English using a Regular Grammar (RG)? No! An RG cannot model general recursion (such as center embedding). A CFG expresses recursion directly: S → NP VP; VP → Verb S. Example: I think that Einstein thought that Newton said …

CFG Examples S -> NP VP; NP -> Det NOMINAL; NOMINAL -> Noun; VP -> Verb; Det -> a; Noun -> flight; Verb -> left

CFGs S -> NP VP. This says that there are units called S, NP, and VP in this language, and that an S consists of an NP followed immediately by a VP. It doesn’t say that that’s the only kind of S, nor does it say that this is the only place that NPs and VPs occur.

Context Free Grammars A CFG consists of a tuple (N, T, S, P). N is a finite set of non-terminal symbols. T is a finite set of terminal symbols. S is the start symbol. P is a finite set of rules of the form X → γ, where X ∈ N and γ ∈ (N ∪ T)*.
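
The tuple definition above maps naturally onto a small data structure. The sketch below is one possible encoding, not a standard API: the class name `CFG`, its field names, and the `validate` method are illustrative assumptions, and the disjointness of N and T is the usual textbook requirement rather than something stated on the slide. The example instance uses the "a flight left" rules from the CFG Examples slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    nonterminals: frozenset   # N
    terminals: frozenset      # T
    start: str                # S
    rules: tuple              # P, as (X, gamma) pairs with gamma a tuple of symbols

    def validate(self):
        """Check the side conditions of the (N, T, S, P) definition."""
        if self.nonterminals & self.terminals:
            raise ValueError("N and T must be disjoint")
        if self.start not in self.nonterminals:
            raise ValueError("start symbol must be in N")
        vocabulary = self.nonterminals | self.terminals
        for lhs, rhs in self.rules:
            if lhs not in self.nonterminals:
                raise ValueError(f"left-hand side {lhs!r} is not in N")
            for symbol in rhs:
                if symbol not in vocabulary:
                    raise ValueError(f"unknown symbol {symbol!r} in rule for {lhs}")
        return True


flight_grammar = CFG(
    nonterminals=frozenset({"S", "NP", "VP", "NOMINAL", "Det", "Noun", "Verb"}),
    terminals=frozenset({"a", "flight", "left"}),
    start="S",
    rules=(
        ("S", ("NP", "VP")),
        ("NP", ("Det", "NOMINAL")),
        ("NOMINAL", ("Noun",)),
        ("VP", ("Verb",)),
        ("Det", ("a",)),
        ("Noun", ("flight",)),
        ("Verb", ("left",)),
    ),
)
print(flight_grammar.validate())  # True
```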

Phrase Structure Parsing Phrase structure organizes words into phrases, often called constituents. This organization is hierarchical. For a given string there is often ambiguity about the correct phrase structure. This ambiguity often corresponds to semantic ambiguity.

Simple example of a CFG Take the non-terminals {S, NP, VP, V} and the terminals {boys, study, play, books, cricket}. Let the start symbol be S. Let the rule set be: S → NP VP; VP → V; VP → V NP; NP → boys; NP → books; NP → cricket; V → study; V → play. This CFG licenses a finite number of sentences.
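
Because this grammar has no recursion, the whole language can be listed. The following sketch enumerates it; the dictionary `RULES` and the helper `expand` are illustrative names, not part of the original slides.

```python
from itertools import product

# The rule set from the slide: non-terminal -> list of right-hand sides.
RULES = {
    "S": [["NP", "VP"]],
    "VP": [["V"], ["V", "NP"]],
    "NP": [["boys"], ["books"], ["cricket"]],
    "V": [["study"], ["play"]],
}


def expand(symbol):
    """Return every word sequence derivable from symbol.

    Terminates only because this particular grammar is non-recursive.
    """
    if symbol not in RULES:          # terminal: derives just itself
        return [[symbol]]
    results = []
    for rhs in RULES[symbol]:
        # Combine every choice of expansion for each right-hand-side symbol.
        for parts in product(*(expand(s) for s in rhs)):
            results.append([word for part in parts for word in part])
    return results


sentences = [" ".join(words) for words in expand("S")]
print(len(sentences))   # 24
print(sentences[0])     # boys study
```

The count is 3 choices of subject NP times 8 verb phrases (2 bare verbs plus 2 × 3 verb-object pairs). The list also contains strings such as "cricket study books": the grammar licenses them syntactically even though they make little sense.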

Generativity As with FSAs and FSTs, you can view these rules as either analysis or synthesis machines: generate strings in the language; reject strings not in the language; impose structures (trees) on strings in the language.

Derivations A derivation is a sequence of rules applied to a string that accounts for that string: it covers all the elements in the string, and it covers only the elements in the string.
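
A derivation can be traced step by step. The sketch below prints a leftmost derivation of "a flight left" using the rules from the CFG Examples slide. It relies on an assumption that holds only for that toy grammar: each non-terminal has exactly one rule, so no choice between rules is ever needed. The names `RULES` and `leftmost_derivation` are illustrative.

```python
# One rule per non-terminal, as in the "a flight left" example grammar.
RULES = {
    "S": ["NP", "VP"],
    "NP": ["Det", "NOMINAL"],
    "NOMINAL": ["Noun"],
    "VP": ["Verb"],
    "Det": ["a"],
    "Noun": ["flight"],
    "Verb": ["left"],
}


def leftmost_derivation(start):
    """Rewrite the leftmost non-terminal until only terminals remain."""
    form = [start]
    steps = [list(form)]
    while True:
        # Index of the leftmost symbol that still has a rule, if any.
        idx = next((i for i, s in enumerate(form) if s in RULES), None)
        if idx is None:
            return steps
        form = form[:idx] + RULES[form[idx]] + form[idx + 1:]
        steps.append(list(form))


for step in leftmost_derivation("S"):
    print(" ".join(step))
```

Each printed line is one sentential form; the last line contains only terminals, and every word in it is accounted for by some rule application.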

Derivations as Trees

Two views of linguistic structure: 1. Constituency (phrase structure) Phrase structure organizes words into nested constituents. How do we know what is a constituent? (Not that linguists don't argue about some cases.) Distribution: a constituent behaves as a unit that can appear in different places: John talked [to the children] [about drugs]. John talked [about drugs] [to the children]. *John talked drugs to the children about Substitution/expansion/pro-forms: I sat [on the box/right on top of the box/there]. Coordination, regular internal structure, no intrusion, fragments, semantics, …

Two views of linguistic structure: 2. Dependency structure Dependency structure shows which words depend on (modify or are arguments of) which other words. Example: The boy put the tortoise on the rug. [Figure: dependency tree rooted at "put", with "boy", "tortoise" and "on" below it, "rug" below "on", and each "the" below its noun.]
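
A dependency structure can be stored as one head index per word. The sketch below encodes the example sentence this way; the specific attachments are an assumption reconstructed from the slide's tree figure (root "put"; "boy", "tortoise" and "on" depend on "put"; "rug" depends on "on"; each determiner depends on its noun), and the names `SENTENCE`, `HEAD` and `dependents` are illustrative.

```python
SENTENCE = ["The", "boy", "put", "the", "tortoise", "on", "the", "rug"]

# HEAD[i] is the index of the word that word i depends on; None marks the root.
# Assumed reading of the figure, not an authoritative annotation.
HEAD = [1, 2, None, 4, 2, 2, 7, 5]


def dependents(head_index):
    """Indices of the words that depend directly on the word at head_index."""
    return [i for i, h in enumerate(HEAD) if h == head_index]


root = HEAD.index(None)
print(SENTENCE[root])                               # put
print([SENTENCE[i] for i in dependents(root)])      # ['boy', 'tortoise', 'on']
```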

Parsing Parsing is the process of taking a string and a grammar and returning a (many?) parse tree(s) for that string. It is completely analogous to running a finite-state transducer with a tape. It’s just more powerful. Remember, this means that there are languages we can capture with CFGs that we can’t capture with finite-state methods.
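
To make "a string and a grammar in, parse trees out" concrete, here is a minimal sketch of an exhaustive span-based parser. It is not one of the standard algorithms (such as CKY or Earley), just a memoized brute-force search, and it assumes a grammar with no empty rules and no cycles of unit rules. The grammar is the boys/books/cricket one from the earlier slide; `GRAMMAR`, `parse`, `trees` and `cover` are illustrative names.

```python
from functools import lru_cache

GRAMMAR = {
    "S": [("NP", "VP")],
    "VP": [("V",), ("V", "NP")],
    "NP": [("boys",), ("books",), ("cricket",)],
    "V": [("study",), ("play",)],
}


def parse(tokens, start="S"):
    """Return all parse trees for tokens as nested tuples (label, child, ...)."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def trees(symbol, i, j):
        """All trees rooted at symbol that cover exactly tokens[i:j]."""
        if symbol not in GRAMMAR:    # terminal: must match a single token
            return (symbol,) if j == i + 1 and tokens[i] == symbol else ()
        found = []
        for rhs in GRAMMAR[symbol]:
            for children in cover(rhs, i, j):
                found.append((symbol,) + children)
        return tuple(found)

    def cover(rhs, i, j):
        """All ways for the symbol sequence rhs to cover tokens[i:j]."""
        if not rhs:
            return [()] if i == j else []
        first, rest = rhs[0], rhs[1:]
        results = []
        # first takes tokens[i:k]; leave at least one token per remaining symbol.
        for k in range(i + 1, j - len(rest) + 1):
            for left in trees(first, i, k):
                for right in cover(rest, k, j):
                    results.append((left,) + right)
        return results

    return list(trees(start, 0, len(tokens)))


print(parse("boys play cricket".split()))
print(parse("play boys".split()))   # [] : the string is rejected
```

An empty result means the string is not in the language; a list with more than one tree would indicate structural ambiguity, which this small grammar does not have.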

Other Options Regular languages (expressions): too weak. Context-sensitive or Turing-equivalent: too powerful (maybe).

Context? The notion of context in CFGs has nothing to do with the ordinary meaning of the word context in language. All it really means is that the non-terminal on the left-hand side of a rule is out there all by itself (free of context). A -> B C means that I can rewrite an A as a B followed by a C regardless of the context in which A is found. Or, when I see a B followed by a C, I can infer an A regardless of the surrounding context.

Key Constituents (English) Sentences; Noun phrases; Verb phrases; Prepositional phrases