Syntax: Structural Descriptions of Sentences

Why Study Syntax?
Syntax provides systematic rules for forming new sentences in a language.
It can be used to verify whether a sentence is legitimate in a language.
It is a step closer to the “meaning” of a sentence.
  – Who did what to whom (semantics)
Applications:
Improving precision in search applications
  – Yankees beat Red Sox
  – Red Sox beat Yankees
Paraphrasing
  – John loves Mary = Mary is loved by John
Information extraction
  – Fill in a form by extracting information from a document.

Structure of Words
What are words? Orthographic tokens separated by white space.
In some languages the distinction between words and sentences is less clear.
  – Chinese, Japanese: no white space between words
    nowhitespace → no white space / no whites pace / now hit esp ace
  – Turkish: a single word can represent a complete “sentence”
    e.g. uygarlastiramadiklarimizdanmissinizcasina
Morphology: the structure of words
  – Basic elements: morphemes
  – Morphological rules: how to combine morphemes
Syntax: the structure of sentences
  – Rules for ordering words in a sentence
  – Elementary units: phrases and clauses
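The segmentation ambiguity in the “nowhitespace” example can be made concrete with a small search over a word list. The following is only a sketch: the lexicon is a toy set chosen to reproduce the slide's example, and a real segmenter would need a proper dictionary and some way of ranking the candidates.

```python
from functools import lru_cache

# Toy lexicon (an assumption for illustration, not a real dictionary).
LEXICON = frozenset(
    {"no", "now", "white", "whites", "space", "pace", "hit", "esp", "ace"}
)

def segmentations(text, lexicon=LEXICON):
    """Return every way to split `text` into words found in `lexicon`."""

    @lru_cache(maxsize=None)
    def split(start):
        if start == len(text):
            return [[]]  # exactly one way to segment the empty suffix
        results = []
        for end in range(start + 1, len(text) + 1):
            word = text[start:end]
            if word in lexicon:
                for rest in split(end):
                    results.append([word] + rest)
        return results

    return split(0)

for seg in segmentations("nowhitespace"):
    print(" ".join(seg))
```

With this lexicon the search finds the three readings listed on the slide; without white space nothing in the string itself says which one is intended.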

Morphology and Syntax
Interplay between syntax and morphology
  – How much information does a language allow to be packed into a word, and how easy is it to unpack?
  – More information → less rigid syntax → freer word order
Hindi: “John likes Mary”: all six orders are possible, due to rich morphological information.
  – John-nom Mary-acc likes
English expresses relations between words through word order.
Morphologically rich languages have freer word order. However, some parts have rigid word order.
  – Noun groups in Hindi: “one yellow book”

Outline
Constituency
  – How does this notion arise?
  – Types of constituents
Representation: tree structure
Formal device: Context-Free Grammars
  – Derived tree and derivation tree
  – Grammar equivalence: strong and weak generative capacity, Chomsky Normal Form
Other formal frameworks (Tree-Adjoining Grammar)
Other topics in syntax
  – Dependency
  – Spoken language syntax
  – Structural priming

Constituency
Words are grouped into part-of-speech groups
  – Similar morphological inflections
  – Allows us to create new word forms (“blog”, “xerox”)
  – Nouns, Verbs, Determiners, Adjectives etc.
Certain sequences of words in a sentence are grouped as constituents
  – Distributionally similar behavior
  – Cohesive units (move around in a sentence as a unit)
    In the morning I take a walk
    I take a walk in the morning
Substrings are typed: “Clause”, “Noun Phrase”, “Verb Phrase”, “Preposition Phrase” etc.

Constituency – contd.
Examples of constituents:
Noun phrase:
  – the dog, two big light blue vans
Preposition phrase:
  – in the box, under the bridge
Clause:
  – the dog bit the man, John thought the dog bit the man
The type of a constituent is derived from the “head word” of the constituent.

Constituent Structure
Decomposition of a sentence into its constituents.
Attaching constituents to each other to reflect relations among words: emergence of tree structure
John saw the man with the telescope
  – (S (NP John) saw (NP (NP the man) (PP with (NP the telescope))))
  – (S (NP John) saw (NP the man) (PP with (NP the telescope)))
Exercise: select a sentence from a newspaper text and provide its constituent structure.
Evidence of another constituent: the verb phrase (“VP”)
Substrings involving a verb move around and can be referred to as a unit.
  – VP-fronting (and quickly clean the carpet he did!)
  – VP-ellipsis (He cleaned the carpets quickly, and so did she)
  – Adjuncts can occur before and after the VP, but not inside it (He often eats beans, *he eats often beans)
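To see that the two bracketings of the telescope sentence cover the same words but group them differently, the bracket notation can be read into a nested structure. This is a minimal sketch: the function names and the (label, children) tuple representation are choices made here for illustration, not part of the original slides.

```python
def parse_brackets(text):
    """Parse a string like "(NP the man)" into a (label, children) tuple."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        if pos + 1 >= len(tokens):
            raise ValueError("bracket opened without a label")
        label = tokens[pos + 1]  # tokens[pos] is the opening bracket
        pos += 2
        children = []
        while pos < len(tokens) and tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())      # nested constituent
            else:
                children.append(tokens[pos])  # plain word
                pos += 1
        if pos >= len(tokens):
            raise ValueError("missing closing bracket")
        pos += 1  # consume the closing bracket
        return (label, children)

    if not tokens or tokens[0] != "(":
        raise ValueError("expected an opening bracket")
    tree = node()
    if pos != len(tokens):
        raise ValueError("extra material after the tree")
    return tree

def words(tree):
    """Left-to-right words covered by a tree."""
    _, children = tree
    out = []
    for child in children:
        if isinstance(child, tuple):
            out.extend(words(child))
        else:
            out.append(child)
    return out

# PP attached inside the object NP: the man has the telescope.
np_attach = parse_brackets(
    "(S (NP John) saw (NP (NP the man) (PP with (NP the telescope))))")
# PP attached at sentence level: the seeing is done with the telescope.
s_attach = parse_brackets(
    "(S (NP John) saw (NP the man) (PP with (NP the telescope)))")

print(words(np_attach) == words(s_attach))   # same word string
print(len(np_attach[1]), len(s_attach[1]))   # different top-level grouping
```

The word sequence is identical in both trees; only the number and shape of the constituents under S differ, which is exactly the ambiguity the bracketings encode.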

Relations among Words
Types of relations between words
  – Arguments: subject, object, indirect object, prepositional object
  – Adjuncts: temporal, locative, causal, manner, …
  – Function words
Subcategorization: the list of arguments of a word (verb), with features about their realization (POS, perhaps case, verb form etc.)
  – For English, the argument order: Subject-Object-IndirectObj
  – Examples:
    like: NP-NP (“John likes Mary”), NP-VP(to-inf) (“John likes to watch movies”)
    think: NP-S (“John thought Mary was going to the party”)
    put: NP-NP-PP
Adjuncts are optional (typically modifiers of an action)
  – John put the book on the table at 3pm yesterday
There are words with “demands” and words that fill the “demands”. Demands are typed (NP, VP, PP, S).
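The idea of typed “demands” can be sketched as a lookup table from verbs to the argument frames they accept. The table below only encodes the three verbs from the slide; the dictionary layout and the function name are illustrative assumptions.

```python
# Subcategorization frames from the slide, subject first.
# "VP[to-inf]" stands for an infinitival VP; the notation is an assumption.
SUBCAT = {
    "like": [("NP", "NP"), ("NP", "VP[to-inf]")],
    "think": [("NP", "S")],
    "put": [("NP", "NP", "PP")],
}

def satisfies(verb, arguments):
    """True if the argument types match one of the verb's frames exactly."""
    return tuple(arguments) in SUBCAT.get(verb, [])

print(satisfies("put", ["NP", "NP", "PP"]))  # John put the book on the table
print(satisfies("put", ["NP", "NP"]))        # *John put the book
```

Adjuncts such as “at 3pm yesterday” are deliberately left out of the frames, since they are optional and not demanded by the verb.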

English Syntax: A Sample
Sentence types:
  – Declarative (John closed the door)
  – Imperative (close the door!!)
  – Yes-No-Question (can you close the door?)
  – Wh-question (who closed the door? What did John close?)
Clause types:
  – Infinitival (to read a book)
  – Gerundive (reading of a book)
  – Relative Clause (that has a green cover)

English Syntax: A Sample – contd.
Noun Phrase:
Before the head noun:
  – Pre-determiner Determiner Post-determiner (Adjective|Noun) Noun
After the head noun (modifiers):
  – Preposition phrases
  – Relative clauses (the book that has only one sentence)
  – Gerundive (the flight arriving after 10pm)
Auxiliary Verbs
  – Modal (could, might, will, should…) < perfect (have) < progressive (be) < passive (be)
  – “might have been destroyed”
Large wide-coverage grammars have been developed or are under development: XTAG (www.cis.upenn.edu/~xtag), HPSG, LFG

Two Representations of Syntactic Structure
Phrase structure: shows the constituents and their types.
Dependency structure: relations between words without intervening structure.
[Figure: a phrase-structure tree (nodes S, NP, DetP, Adv) and a dependency tree (edges labelled arg0, arg1, adj, fw) over the words the, boy, reads, a, book, slowly.]

Context-Free Grammars
String rewriting systems: transform one string into another (until termination)
G = (V, T, P, S) where
  – V: vocabulary of non-terminals
  – T: vocabulary of terminals
  – S: start symbol
  – P: set of productions of the form α → β, where α ∈ V and β ∈ (V ∪ T)*
Derivation: rewrite a non-terminal using a production of the grammar until no non-terminals remain in the string. Start with “S”.
Sample Context-Free Grammar, derivation and derived structure.
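A derivation in such a grammar can be traced step by step. The sketch below uses a toy grammar invented for illustration (it is not the sample grammar from the slide, which did not survive in the transcript) and always rewrites the leftmost non-terminal.

```python
# Hypothetical toy grammar G = (V, T, P, S); all symbols and rules are illustrative.
V = {"S", "NP", "VP", "Det", "N", "Verb"}
T = {"the", "a", "boy", "book", "reads"}
P = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["Verb", "NP"]],
    "Det": [["the"], ["a"]],
    "N": [["boy"], ["book"]],
    "Verb": [["reads"]],
}
START = "S"

def leftmost_derivation(choices):
    """Rewrite the leftmost non-terminal until only terminals remain.

    `choices` supplies, for each step, the index of the production to use
    for that non-terminal. Returns the list of intermediate strings.
    """
    form = [START]
    steps = [list(form)]
    picks = iter(choices)
    while any(sym in V for sym in form):
        i = next(k for k, sym in enumerate(form) if sym in V)
        rhs = P[form[i]][next(picks)]
        form = form[:i] + rhs + form[i + 1:]
        steps.append(list(form))
    return steps

for step in leftmost_derivation([0, 0, 0, 0, 0, 0, 0, 1, 1]):
    print(" ".join(step))
```

The last line printed is the derived string; the sequence of rewrites that produced it is the derivation history, which is what a phrase-structure tree records.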

Two Representations
String rewriting system: we derive a string (= derived structure)
But the derivation history is represented by a phrase-structure tree (= derivation structure)!
Grammar Equivalence
  – Different grammars can generate the same set of strings (weak equivalence)
  – Different grammars can have the same set of derivation trees (strong equivalence)
  – Strong equivalence implies weak equivalence
CFG Normal Forms:
  – Chomsky Normal Form (A → B C or A → a)
  – Greibach Normal Form (A → w γ, where w is a terminal and γ is a string of non-terminals)
Exercise: convert a grammar into CNF and GNF
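One step of a conversion to Chomsky Normal Form, splitting right-hand sides longer than two symbols, illustrates weak equivalence: the new grammar generates the same strings but assigns different trees. The sketch below covers only this binarization step (no removal of empty or unit productions), and the fresh symbol names X1, X2, … are assumed not to clash with existing symbols.

```python
def is_cnf(productions, nonterminals):
    """True if every rule is A -> B C (two non-terminals) or A -> a (one terminal)."""
    for rhss in productions.values():
        for rhs in rhss:
            if len(rhs) == 2 and all(s in nonterminals for s in rhs):
                continue
            if len(rhs) == 1 and rhs[0] not in nonterminals:
                continue
            return False
    return True

def binarize(productions):
    """Split right-hand sides longer than two symbols using fresh non-terminals."""
    new = {}
    counter = 0
    for lhs, rhss in productions.items():
        for rhs in rhss:
            head, rest = lhs, list(rhs)
            while len(rest) > 2:
                counter += 1
                fresh = f"X{counter}"  # assumed unused elsewhere in the grammar
                new.setdefault(head, []).append([rest[0], fresh])
                head, rest = fresh, rest[1:]
            new.setdefault(head, []).append(rest)
    return new

# Toy grammar with one ternary rule (illustrative only).
grammar = {
    "VP": [["V", "NP", "PP"]],
    "V": [["put"]],
    "NP": [["it"]],
    "PP": [["there"]],
}
nonterminals = {"VP", "V", "NP", "PP"}

binary = binarize(grammar)
print(is_cnf(grammar, nonterminals))            # ternary rule present
print(binary["VP"], binary["X1"])
print(is_cnf(binary, nonterminals | {"X1"}))
```

The binarized grammar still derives “put it there”, but through an extra X1 node, so the two grammars are weakly but not strongly equivalent.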

Penn Treebank (PTB)
Syntactically annotated corpus (phrase structure)
Contains 1 million words of Wall Street Journal sentences marked up with syntactic structure.
Can be converted into a dependency treebank.
  – Needs head percolation tables
Completely flat structure in NP
  – brown bag lunch, pink-and-yellow child seat
Represents a particular linguistic theory
PropBank: PTB with some grammatical relations made explicit
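The role of a head percolation table in the conversion to dependencies can be sketched as follows. The table here is a tiny hypothetical one (real tables are much larger and encode search direction as well), and the tree encoding is an assumption made for the example.

```python
# Hypothetical, heavily simplified head table: for each phrase label,
# the child labels to try in priority order.
HEAD_TABLE = {
    "S": ["VP"],
    "VP": ["V"],
    "NP": ["N"],
    "PP": ["P"],
}

def head_and_deps(tree, deps):
    """Return the head word of `tree`, appending (head, dependent) pairs to `deps`.

    A phrase is (label, [children]); a leaf is (pos_tag, word).
    """
    label, body = tree
    if isinstance(body, str):  # leaf: the word is its own head
        return body
    child_heads = [(child[0], head_and_deps(child, deps)) for child in body]
    head_idx = len(child_heads) - 1  # fallback: rightmost child
    for wanted in HEAD_TABLE.get(label, []):
        matches = [i for i, (lab, _) in enumerate(child_heads) if lab == wanted]
        if matches:
            head_idx = matches[0]
            break
    head = child_heads[head_idx][1]
    # Every non-head child's head word becomes a dependent of the phrase head.
    for i, (_, dependent) in enumerate(child_heads):
        if i != head_idx:
            deps.append((head, dependent))
    return head

tree = ("S", [
    ("NP", [("Det", "the"), ("N", "boy")]),
    ("VP", [("V", "reads"), ("NP", [("Det", "a"), ("N", "book")])]),
])
deps = []
root = head_and_deps(tree, deps)
print(root, deps)
```

The head word percolates up from each phrase to its parent, and the resulting pairs form a dependency tree rooted in the verb.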

Unification
A mechanism is needed to pass and check constraints.
Constraints, syntactic and semantic:
Subject-verb agreement
  – S → NP VP
  – the boy reads / the boys read / * the boys reads
Subject/Auxiliary inversion (Yes-no-question)
  – S → AuxVerb NP VP
  – Do you have flights / * does you have flights
Selectional restrictions:
  – An apple reads a book
One way to encode these constraints: refine the non-terminal set.
  – S → 3sgAux 3sgNP VP ; 3sgAux → does | has …
  – S → Non3sgAux Non3sgNP VP ; Non3sgAux → do | have | can
  – We need to split the NP rule into 3sgNP and Non3sgNP.
The size of the grammar grows; can we factor these constraints out of the structure of the rules?
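How quickly the grammar grows when agreement is compiled into the non-terminals can be illustrated by copying a rule once per feature combination. This is a sketch with assumed feature inventories and an ad hoc bracket notation for the refined symbols; it uses a full number-by-person split rather than the slide's coarser 3sg/Non3sg split.

```python
from itertools import product

NUMBER = ["sg", "pl"]      # assumed feature values
PERSON = ["1", "2", "3"]

def agreement_copies(lhs, rhs, agreeing):
    """Copy one rule per (number, person) pair, tagging the symbols that must agree."""
    rules = []
    for num, per in product(NUMBER, PERSON):
        tag = f"[{per}{num}]"
        rules.append((lhs, [sym + tag if sym in agreeing else sym for sym in rhs]))
    return rules

copies = agreement_copies("S", ["NP", "VP"], {"NP", "VP"})
for lhs, rhs in copies:
    print(lhs, "->", " ".join(rhs))
print(len(copies), "rules instead of 1")
```

Each additional feature multiplies the number of copies again, which is the motivation for factoring the constraints out of the rules.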

Unification – contd.
Attribute value matrices:
  – boy: Cat N; Number sg; Person 3
  – boys: Cat N; Number pl; Person 3
  – reads: Cat V; Subj agr: Number sg
  – read: Cat V; Subj agr: Number pl, Person 3, or Number sg, Person 1|2
Check constraints:
  – S → NP VP
    NP.number = VP.subj.agr.number
    NP.person = VP.subj.agr.person
Percolate constraints:
  – VP → V
    VP.number = V.subj.agr.number
    VP.person = V.subj.agr.person
The boy reads / * the boys reads / the boys read
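Checking agreement by unification can be sketched with nested dictionaries standing in for attribute value matrices. This is a simplified illustration: it has no disjunctive values (such as Person 1|2) and no shared substructures, and the feature values given to the lexical entries below are assumptions made for the example.

```python
def unify(a, b):
    """Unify two feature structures given as nested dicts.

    Returns the merged structure, or None if some feature has conflicting values.
    """
    result = dict(a)
    for key, b_val in b.items():
        if key not in result:
            result[key] = b_val
            continue
        a_val = result[key]
        if isinstance(a_val, dict) and isinstance(b_val, dict):
            merged = unify(a_val, b_val)
            if merged is None:
                return None
            result[key] = merged
        elif a_val != b_val:
            return None  # clash, e.g. number sg vs pl
    return result

# Illustrative lexical entries (feature values assumed for the example).
boy = {"cat": "N", "agr": {"number": "sg", "person": 3}}
boys = {"cat": "N", "agr": {"number": "pl", "person": 3}}
reads = {"cat": "V", "subj": {"agr": {"number": "sg", "person": 3}}}
read = {"cat": "V", "subj": {"agr": {"number": "pl"}}}

def agrees(noun, verb):
    """S -> NP VP check: the NP's agreement must unify with the verb's subject agreement."""
    return unify(noun["agr"], verb["subj"]["agr"]) is not None

print(agrees(boy, reads))   # the boy reads
print(agrees(boys, reads))  # * the boys reads
print(agrees(boys, read))   # the boys read
```

The rule S → NP VP stays a single rule; the agreement check is carried by the feature structures rather than by extra non-terminals.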

Structural Priming
The structure of preceding sentences helps or hinders the reading times of subsequent sentences.
Dative alternation
  – The woman gave her car to the church
  – The woman gave the church her car
One of these forms is primed depending on what the prime was
  – V NP NP → gave the church her car
  – V NP PP → gave her car to the church

Spoken Language Syntax
Not as “clean”: rampant disfluency
  – Edits (restarts, repairs)
  – Filled pauses
  – Ungrammaticality
The unit of analysis is the utterance rather than the sentence.
“Clean up” the utterance first before understanding it.