CSA3050: NLP Algorithms. Sentence Grammar. 19.11.2002



Introduction
This lecture has two aims:
Give a crash course in sentence-level grammar.
Show how different linguistic phenomena can be captured by grammar rules.
See also Jurafsky and Martin, Chapter 9, and the Internet Grammar of English: http://www.ucl.ac.uk/internet-grammar/

Part I: Grammar of English

Different Kinds of Rule
Morphological rules govern how words may be composed: re+invest+ing = reinvesting.
Syntactic rules govern how words and constituents combine to form grammatical sentences.
Semantic rules govern how meanings may be combined.

Syntax: Why?
You need knowledge of syntax in many applications:
Parsing
Grammar checkers
Question answering / database access
Information extraction
Generation
Translation
Full versus superficial analysis?

Levels of Grammar Organisation
Word Classes: different parts of speech (POS).
Phrase Classes: sequences of words inheriting the characteristics of certain word classes.
Clause Classes: sequences of phrases containing at least one verb phrase.
On the basis of these one may define:
Grammatical Relations: the role played by constituents, e.g. subject, predicate, object.
The syntax-semantics interface.

Word Classes
Closed classes:
determiners: the, a, an, four
pronouns: it, he, etc.
prepositions: by, on, with
conjunctions: and, or, but
Open classes:
nouns refer to objects or concepts: cat, beauty, Coke
adjectives describe or qualify nouns: fried chickens
verbs describe what the noun does: John jumps
adverbs describe how it is done: John runs quickly

Word Class Characteristics
Different word classes have characteristic subclasses and properties:

Word class   Subclasses                 Properties
Noun         proper; mass; count        number; gender
Verb         transitive; intransitive   number; gender; person; tense
Adjective    dimension; age; colour     number; gender

Further Notes on Word Classes
Nouns can be classified into different types:
PROPER nouns are names like John or Paris.
COMMON nouns like cow, house stand for collections of properties that characterise classes of object.
ABSTRACT nouns like beauty stand for abstract ideas rather than physically realised objects.
MASS nouns such as bread name substances rather than individual things; consequently, they cannot be counted.
COUNT nouns like loaf are precisely the opposite.
Nouns also have properties such as number (singular or plural), gender (masculine, feminine, neuter) and, in some languages like German, case (e.g. nominative, dative, accusative).
Nouns are often preceded by the words the, a, or an. These words are called determiners. They indicate the kind of reference the noun has, which may mark:
Definiteness/indefiniteness: a, the
Proximity to speaker: this, that
Quantity: all, both, many, several, four
Adjectives come before nouns (in English) and typically describe attributes of them. These attributes can be classified into types (e.g. colour, size, age), and in some languages there is a default ordering between the types. In English, for example, we prefer big brown dog to brown big dog. Like nouns, adjectives have properties of number, gender and case, and agree with the nouns they modify: hence mejda gbira, not mejda gbir.
Adjectives can be modified by adverbs, which in English come before them (e.g. largely irrelevant).
Finally, most adjectives admit comparison:
Basic form: young, recent
Comparative form: younger, more recent
Superlative form: youngest, most recent

Phrases
A phrase of several words may be used rather than a single word, while fulfilling the same role in the sentence.
Noun phrases refer to objects: four fried chickens.
Verb phrases state what the noun phrase does: kicks the dog.
Adjective phrases describe or qualify an object: sickly sweet.
Adverbial phrases describe how it is done: very carefully.
Prepositional phrases add information to a verb phrase: on the table.

Phrases can be Complex
e.g. Noun Phrases:
Proper name or pronoun: Monday; it
Specifier, noun: the day
Specifier, premodifier, noun: the first wet day
Specifier, qualifiers, noun, postmodifier: the first wet day that I enjoyed in June

But they all fit the same context:
Monday
It
The day
The first wet day
The first wet day that I enjoyed in June
... was sunny.

Clauses
A clause is a combination of noun phrases and verb phrases.
Clauses can exist at the top level (main clause) or can be embedded (subordinate clause).
A top-level clause is a sentence, e.g. The cat ate the mouse.
An embedded clause is subordinate, e.g. John said that Sandy is sick.
Unlike phrases, whole sentences can be used to say something complete, e.g. to state a fact or ask a question.

Different Kinds of Sentences
Assertion: John ate the cat.
Yes/No question: Did John eat the cat?
Wh-question: What did John eat?
Command: Eat the cat, John!

Part II: Context Free Grammar Rules

Formal Grammar
A formal grammar consists of:
Terminal symbols (T)
Non-terminal symbols (NT), disjoint from T
A start symbol (a distinguished NT)
Rewrite rules of the form α → β, where α and β are strings of symbols
Different classes of grammar result from various restrictions on the form of the rules.
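The four components can be written down directly as data. Below is a minimal sketch in Python; the dictionary layout and the tiny toy grammar are illustrative choices, not something prescribed by the lecture.

```python
# A formal grammar as a 4-tuple (T, NT, S, R): terminal symbols,
# non-terminal symbols, a start symbol, and rewrite rules alpha -> beta,
# where each side is represented as a tuple of symbols.
grammar = {
    "terminals": {"the", "cat", "ate", "mouse"},
    "nonterminals": {"s", "np", "vp", "det", "n", "v"},
    "start": "s",
    "rules": [
        (("s",),   ("np", "vp")),
        (("np",),  ("det", "n")),
        (("vp",),  ("v", "np")),
        (("det",), ("the",)),
        (("n",),   ("cat",)),
        (("n",),   ("mouse",)),
        (("v",),   ("ate",)),
    ],
}
```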

Classes of Grammar

Type   Grammars                          Languages                 Machines
0      Phrase structure (unrestricted)   Recursively enumerable    Turing machine (TM)
1      Context sensitive                 Context-sensitive         Linear bounded automaton (LBA)
2      Context free                      Context-free              Push-down automaton (PDA)
3      Regular                           Regular                   Finite state automaton (FSA)

The strength of the grammar type increases from type 3 (regular) up to type 0 (unrestricted).

Restrictions on Rules
For all rules α → β:
Type 0 (unrestricted): no restrictions.
Type 1 (context sensitive): |α| ≤ |β|.
Type 2 (context free): α is a single NT symbol.
Type 3 (regular): every rule is of the form A → aB or A → a, where A, B ∈ NT and a ∈ T.
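These restrictions are easy to check mechanically. The sketch below, using the rule representation from the earlier snippet, returns the most restrictive type a rule satisfies; the function names are illustrative, and corner cases such as empty right-hand sides are ignored.

```python
def rule_type(lhs, rhs, nonterminals, terminals):
    """Most restrictive Chomsky type (3, 2, 1 or 0) that this rule satisfies."""
    # Type 3 (regular): A -> a B or A -> a, with A, B non-terminals and a a terminal.
    if (len(lhs) == 1 and lhs[0] in nonterminals
            and 1 <= len(rhs) <= 2
            and rhs[0] in terminals
            and (len(rhs) == 1 or rhs[1] in nonterminals)):
        return 3
    # Type 2 (context free): the left-hand side is a single non-terminal.
    if len(lhs) == 1 and lhs[0] in nonterminals:
        return 2
    # Type 1 (context sensitive): the right-hand side is at least as long as the left.
    if len(lhs) <= len(rhs):
        return 1
    # Type 0 (unrestricted): anything goes.
    return 0


def grammar_type(rules, nonterminals, terminals):
    # A grammar only belongs to a class if every rule obeys its restriction,
    # so the grammar's type is the minimum over its rules.
    return min(rule_type(lhs, rhs, nonterminals, terminals) for lhs, rhs in rules)
```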

Which Class for NLP?
Type 3 (Regular): good for morphology, but cannot handle centre embedding of sentences, e.g. The man that John saw eating died.
Type 2 (Context Free): OK, but problems handling certain phenomena, e.g. agreement.
Type 1 (Context Sensitive): computational properties not well understood; too powerful.
Type 0 (Turing): too powerful.

Weak versus Strong
A grammar class that is too weak cannot characterise/discriminate exactly the sentence structures of natural language.
A grammar class that is too strong has the power to characterise/discriminate structures that don't exist in human languages.
A stronger grammar class means higher complexity, and hence less efficient computation.

Example Grammar
Cabinet discusses police chief's case
French gunman kills four

s → np vp
np → n
np → adj n
np → n np
vp → v np

Classifying the Symbols
NT: symbols appearing on the left.
Start: the symbol appearing only on the left, from which every other symbol can be derived.
T: symbols appearing only on the right.
To include words we also need special rules such as
n → [police]
n → [gunman]
n → [four]
The latter rules define the lexicon or "dictionary interface".
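As a concrete illustration, the example grammar plus a small lexicon can be tried out with an off-the-shelf chart parser. The sketch below assumes the NLTK toolkit (an assumption; any CFG tool would do) and uses NLTK's notation, in which terminals are quoted.

```python
import nltk

# The example grammar with a lexicon large enough to parse the second headline.
grammar = nltk.CFG.fromstring("""
    S  -> NP VP
    NP -> N | ADJ N | N NP
    VP -> V NP
    ADJ -> 'French'
    N  -> 'gunman' | 'four'
    V  -> 'kills'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("French gunman kills four".split()):
    tree.pretty_print()   # draws the phrase-structure tree as text
```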

Grammar Induces Phrase Structure
The rules above assign the headline a tree, shown here in bracketed form:
[s [np [adj French] [n gunman]] [vp [v kills] [np [n four]]]]

Phrase Structure
PS includes information about:
precedence between constituents
dominance between constituents
PS constitutes a trace of the rule applications used to derive a sentence.
PS does not tell you the order in which the rules were used.

Handling Sentence Types
Declaratives: John left.  S → NP VP
Imperatives: Leave!  S → VP
Yes-No questions: Did John leave?  S → Aux NP VP
Wh-questions: When did John leave?  S → Wh-word Aux NP VP

Handling Recursive Structures
Flights to Miami
Flights to Miami from Boston
Flights to Miami from Boston in April
Flights to Miami from Boston in April on Friday
Flights to Miami from Boston in April on Friday under $300
Flights to Miami from Boston in April on Friday under $300 with lunch

Recursive Rules
NP → NP PP
PP → Preposition NP
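Because NP appears on both sides of the first rule, it can be applied any number of times, which is exactly what licenses the ever-longer flight phrases above. A toy illustration in Python (the list of PPs is just the one from the previous slide):

```python
# Each iteration applies NP -> NP PP once, attaching one more prepositional phrase.
pps = ["to Miami", "from Boston", "in April", "on Friday", "under $300", "with lunch"]

np = "Flights"
for pp in pps:
    np = f"{np} {pp}"
    print(np)
```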

Handling Agreement
NP → Determiner N should include these days, this day, but exclude this days, these day.
One solution is to split the categories:
NP → NP_Sing
NP → NP_Plur
NP_Sing → Det_Sing N_Sing
NP_Plur → Det_Plur N_Plur
Agreement features include number, gender and case.
Danger: proliferation of categories/rules.
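An alternative to multiplying symbols is to attach features to categories and require them to agree, the idea behind feature-structure grammars. Below is a minimal sketch, with an invented four-word lexicon, that enforces number agreement for NP → Det N:

```python
# Each lexical entry pairs a category with a small feature dictionary.
lexicon = {
    "this":  ("Det", {"num": "sing"}),
    "these": ("Det", {"num": "plur"}),
    "day":   ("N",   {"num": "sing"}),
    "days":  ("N",   {"num": "plur"}),
}

def parse_np(det_word, n_word):
    det_cat, det_feats = lexicon[det_word]
    n_cat, n_feats = lexicon[n_word]
    # NP -> Det N, subject to number agreement between Det and N.
    if det_cat == "Det" and n_cat == "N" and det_feats["num"] == n_feats["num"]:
        return ("NP", n_feats)
    return None

print(parse_np("these", "days"))  # ('NP', {'num': 'plur'})
print(parse_np("this", "days"))   # None: agreement fails
```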

Subcategorisation
Intransitive verb: no object
John disappeared
John disappeared the cat*
Transitive verb: one object
John opened the window
John opened*
Ditransitive verb: two objects
John gave Mary the book
John gave Mary*

Subcategorisation Rules
Intransitive verb (no object): VP → V
Transitive verb (one object): VP → V NP
Ditransitive verb (two objects): VP → V NP NP
If you take account of the category of items following the verb, there are about 40 different patterns like this in English.
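One common way to keep the number of VP rules down is to record each verb's subcategorisation frame in the lexicon and check it during parsing. A minimal sketch of that idea (the dictionary and function names are invented for illustration):

```python
# Map each verb to the categories it expects after it inside the VP.
subcat = {
    "disappeared": [],            # intransitive:  VP -> V
    "opened":      ["NP"],        # transitive:    VP -> V NP
    "gave":        ["NP", "NP"],  # ditransitive:  VP -> V NP NP
}

def vp_ok(verb, complements):
    """True if the complements following the verb match its frame."""
    return subcat.get(verb) == complements

print(vp_ok("opened", ["NP"]))       # True:  John opened the window
print(vp_ok("disappeared", ["NP"]))  # False: John disappeared the cat*
```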

Overgeneration
A grammar should generate only sentences in the language. It should exclude sentences not in the language.
s → n vp
vp → v
n → [John]
v → [snore]
v → [snores]
This grammar overgenerates: it produces the ungrammatical John snore as well as John snores.

Undergeneration
A grammar should generate all sentences in the language. There should not be sentences in the language that are not generated by the grammar.
s → n vp
vp → v
n → [John]
n → [gold]
v → [found]
This grammar undergenerates: it cannot produce John found gold, because there is no rule vp → v np.
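Both failure modes can be checked mechanically by asking a parser which candidate sentences receive a parse. The sketch below again assumes NLTK (an assumption, as before):

```python
import nltk

def accepts(grammar, sentence):
    """True if the grammar assigns at least one parse to the sentence."""
    parser = nltk.ChartParser(grammar)
    return any(True for _ in parser.parse(sentence.split()))

# The overgenerating grammar from the slide above.
over = nltk.CFG.fromstring("""
    S -> N VP
    VP -> V
    N -> 'John'
    V -> 'snore' | 'snores'
""")
print(accepts(over, "John snores"))       # True
print(accepts(over, "John snore"))        # True, although the sentence is ungrammatical

# The undergenerating grammar: no VP -> V NP rule.
under = nltk.CFG.fromstring("""
    S -> N VP
    VP -> V
    N -> 'John' | 'gold'
    V -> 'found'
""")
print(accepts(under, "John found gold"))  # False, although the sentence is grammatical
```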

Movement
John looked up the number
John looked the number up
The particle up can move away from the verb and appear after the object; the grammar needs to allow both orders.

Appropriate Structures
A grammar should assign linguistically plausible structures.
For example, a flat analysis of John ate a juicy hamburger,
[s [n John] [vp [v ate] [d a] [a juicy] [n hamburger]]]
is less plausible than one that groups a juicy hamburger into a noun phrase:
[s [np [n John]] [vp [v ate] [np [d a] [a juicy] [n hamburger]]]]

Ambiguity
np → np pp
pp → prep np
(the man) (on the hill with a telescope by the sea)
(the man on the hill) (with a telescope by the sea)
(the man on the hill with a telescope) (by the sea)
etc.
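The number of distinct attachments grows quickly: with the two rules above, an NP followed by n prepositional phrases has a Catalan number of analyses (a standard counting result, not stated on the slide). A quick check in Python:

```python
from math import comb

def catalan(n):
    """Number of analyses for an NP followed by n PPs under np -> np pp, pp -> prep np."""
    return comb(2 * n, n) // (n + 1)

for n in range(1, 6):
    print(n, "PPs:", catalan(n), "analyses")
# 1 PP has 1 analysis, 2 PPs have 2, 3 PPs have 5, then 14, 42, ...
```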

Criteria for Evaluating Grammars
Does it undergenerate?
Does it overgenerate?
Does it assign appropriate structures to the sentences it generates?
Is it simple to understand? How many rules are there?
Does it contain generalisations, or is it just a collection of special cases?
How ambiguous is it?