Natural Language Processing - English Grammar -


74.406 Natural Language Processing - English Grammar -
(Mostly) English Grammar
  Morphology, Word Classes, POS Tagging
  Grammar Extensions on the Sentence and Phrase Level
    Sentence Level Constructs
    Noun Phrase - Modifications
    Verb Phrase - Subcategorization
(Jurafsky, Ch. 3, 6.1, 8 and 9; Allen Ch. 2)

Morphology

Basics of Morphology
Morpheme = "minimal meaning-bearing unit in a language", e.g. cats = cat + -s
Non-Concatenative Morphology
  templatic morphology: modify word templates
  Hebrew: lmd (study, learn) - limed ("he taught") - lumad ("he was taught")
Concatenative Morphology
  word stem + prefix + suffix (+ infix + circumfix)
Inflectional Morphology
  word stem + grammatical morpheme; same word class; cat+s
Derivational Morphology
  word stem + grammatical morpheme; other word class; mob+b+ing

Inflectional Morphology
word stem + grammatical morpheme, e.g. cat+s
only for nouns, verbs, and some adjectives
Nouns
  plural:
    regular: +s, +es
    irregular: mouse - mice; ox - oxen
    rules for exceptions: e.g. -y -> -ies, as in butterfly - butterflies
  possessive: +'s, +'
Verbs
  main verbs (sleep, eat, walk)
  modal verbs (can, will, should)
  primary verbs (be, have, do)

Inflectional Morphology (verbs)
Verb inflections apply only to main verbs (sleep, eat, walk) and primary verbs (be, have, do).
  Morphological Form      Regularly Inflected Form
  stem                    walk      merge     try       map
  -s form                 walks     merges    tries     maps
  -ing participle         walking   merging   trying    mapping
  past; -ed participle    walked    merged    tried     mapped
  Morphological Form      Irregularly Inflected Form
  stem                    eat       catch     cut
  -s form                 eats      catches   cuts
  -ing participle         eating    catching  cutting
  -ed past                ate       caught    cut
  -ed participle          eaten     caught    cut
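The verb tables above can be reproduced with a few spelling rules plus a small exception list. The following is a minimal sketch, not the method used in the course: the function names, the hand-listed `DOUBLING` set, and the exact rule conditions are assumptions made for illustration, and the rules cover only the verbs shown in the tables.

```python
VOWELS = "aeiou"

# Irregular forms copied from the table: stem -> (past, -ed participle).
IRREGULAR = {
    "eat": ("ate", "eaten"),
    "catch": ("caught", "caught"),
    "cut": ("cut", "cut"),
}

# Stems whose final consonant doubles before -ing / -ed. Listed by hand
# (an assumption) because doubling depends on stress, not spelling alone.
DOUBLING = {"map", "cut"}


def ends_in_consonant_y(stem):
    """True for stems like 'try' (consonant + y), False for 'play'."""
    return len(stem) > 1 and stem.endswith("y") and stem[-2] not in VOWELS


def s_form(stem):
    """Third person singular: walks, merges, tries, catches."""
    if stem.endswith(("s", "x", "z", "ch", "sh")):
        return stem + "es"
    if ends_in_consonant_y(stem):
        return stem[:-1] + "ies"
    return stem + "s"


def ing_form(stem):
    """-ing participle: walking, merging, mapping."""
    if stem in DOUBLING:
        return stem + stem[-1] + "ing"
    if stem.endswith("e") and not stem.endswith("ee"):
        return stem[:-1] + "ing"
    return stem + "ing"


def ed_form(stem):
    """Regular past / -ed participle: walked, merged, tried, mapped."""
    if stem in DOUBLING:
        return stem + stem[-1] + "ed"
    if stem.endswith("e"):
        return stem + "d"
    if ends_in_consonant_y(stem):
        return stem[:-1] + "ied"
    return stem + "ed"


def inflect(stem):
    """Return all forms of a verb; irregular entries override the rules."""
    past, participle = IRREGULAR.get(stem, (ed_form(stem), ed_form(stem)))
    return {
        "stem": stem,
        "-s form": s_form(stem),
        "-ing participle": ing_form(stem),
        "past": past,
        "-ed participle": participle,
    }


for verb in ["walk", "merge", "try", "map", "eat", "catch", "cut"]:
    print(inflect(verb))
```

Keeping the irregular forms in a lookup table and the regular forms in rules mirrors the split between lexicon and rules that the slides describe.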

Inflectional and Derivational Morphology (adjectives)
Adjective Inflections and Derivations:
  prefix  un-     unhappy     adjective, negation
  suffix  -ly     happily     adverb, manner
          -er     happier     adjective, comparative
          -est    happiest    adjective, superlative
  suffix  -ness   happiness   noun
plus combinations, like unhappiest, unhappiness.
Distinguish different adjective classes, which can or cannot take certain inflectional or derivational forms, e.g. no un- negation for big.

Morphological Processing
Knowledge
  lexical entry: stem plus possible prefixes, suffixes plus word classes, e.g. endings for verb forms (see tables above)
  rules: how to combine stem and affixes, e.g. add s to form the plural of a noun, as in dogs
  orthographic rules: spelling, e.g. double consonant as in mapping
Processing: Finite State Transducers
  take the information above and analyze a word token / generate a word form
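The interplay of lexicon, combination rules, and orthographic rules can be illustrated for noun plurals. This is a rough sketch only: it imitates the lexical, intermediate, and surface levels with plain Python functions instead of a real finite state transducer, and the lexicon entries, the `+N +SG` / `+N +PL` feature labels, and the `^` / `#` boundary markers are assumptions chosen for illustration.

```python
# Tiny lexicon (illustrative assumption).
REGULAR_NOUNS = {"cat", "dog", "fox", "butterfly"}
IRREGULAR_PLURALS = {"mice": "mouse", "oxen": "ox"}
IRREGULAR_SINGULARS = set(IRREGULAR_PLURALS.values())
VOWELS = ("a", "e", "i", "o", "u")


def to_intermediate(stem):
    """Lexical -> intermediate level: attach the plural morpheme (fox^s#)."""
    return stem + "^s#"


def to_surface(intermediate):
    """Intermediate -> surface level: apply spelling rules, drop markers."""
    stem = intermediate.split("^")[0]
    if stem.endswith(("s", "x", "z", "ch", "sh")):
        return stem + "es"                 # e-insertion: fox^s# -> foxes
    if stem.endswith("y") and stem[-2:-1] not in VOWELS:
        return stem[:-1] + "ies"           # butterfly^s# -> butterflies
    return stem + "s"                      # cat^s# -> cats


def analyze(word):
    """Return every lexical analysis of a surface word (may be empty)."""
    analyses = []
    if word in REGULAR_NOUNS or word in IRREGULAR_SINGULARS:
        analyses.append(word + " +N +SG")
    if word in IRREGULAR_PLURALS:
        analyses.append(IRREGULAR_PLURALS[word] + " +N +PL")
    # Analysis by generation: a regular plural is accepted only if some
    # stem in the lexicon produces exactly this surface form.
    for stem in sorted(REGULAR_NOUNS):
        if to_surface(to_intermediate(stem)) == word:
            analyses.append(stem + " +N +PL")
    return analyses


for token in ["cats", "foxes", "butterflies", "mice", "ox", "oxes"]:
    print(token, analyze(token))
```

Because "ox" is stored only as an irregular noun, the form "oxes" receives no analysis, which shows how the lexicon blocks overgeneration by the regular rules.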

Fig. 3.3 FSA for verb inflection.

Fig. 3.4 Simple FSA for adjective inflection. Fig. 3.5 More detailed FSA for adjective inflection.

Fig. 3.7 Compiled FSA for noun inflection.

Fig. 3.12 Lexical and intermediate tape of a FS Transducer Fig. 3.13 Lexical, intermediate, and surface tape after spelling transformation.

Word Classes and POS Tagging

Word Classes
Sort words into categories according to:
  morphological properties
    Which types of morphological forms do they take?
    e.g. form plural: noun+s; 3rd person: verb+s
  distributional properties
    What other words or phrases can occur nearby?
    e.g. possessive pronoun before noun
  semantic coherence
    Classify according to similar semantic type.
    e.g. nouns refer to object-like entities

Open vs. Closed Word Classes
Open Class Types
  The set of words in these classes can change over time, with the development of the language, e.g. spaghetti and download
  Open Class Types: nouns, verbs, adjectives, adverbs

Open vs. Closed Word Classes
Closed Class Types
  The set of words in these classes is largely fixed and hardly ever changes within a language.
  Closed Class Types: prepositions, determiners, pronouns, conjunctions, auxiliary verbs, particles, numerals

Open Class Words: Nouns
denote objects, concepts, entities, events
  Proper Nouns
    Names for specific individual objects, entities, e.g. the Eiffel Tower, Dr. Kemke
  Common Nouns
    Names for categories, classes, abstracts, events, e.g. fruit, banana, table, freedom, sleep, race, ...
  Count Nouns
    enumerable entities, e.g. two bananas
  Mass Nouns
    not countable items, e.g. water, salt, freedom

Open Class Words: Verbs
denote actions, processes, and states, e.g. smoke, dream, rest, run
several morphological forms, e.g.
  non-3rd person                                  - eat
  3rd person                                      - eats
  progressive / present participle / gerundive    - eating
  past participle                                 - eaten
  simple past                                     - ate

Open Class Words: Verbs (2)
Verbs - use of morphological forms, examples:
  non-3rd person     eat      I eat. We eat. They eat.
  3rd person         eats     He eats. She eats. It eats.
  progressive        eating   He is eating. He will be eating. He has been eating.
    present participle        He is eating.
    gerundive                 Eating scorpions [NP] is common in China.
    use as adjective          Eating children [NP] are common at McDonalds.
  past participle    eaten    He has eaten the scorpion. The scorpion was eaten.
  simple past        ate      He ate the scorpion.

Open Class Words: Adjectives
denote qualities or properties of objects, e.g. heavy, blue, content
most languages have concepts for
  colour - white, green, ...
  age - young, old, ...
  value - good, bad, ...
not all languages have adjectives as a separate class

Open Class Words: Adverbs 1
denote modifications of actions (verbs) or qualities (adjectives), e.g. walk slowly or heavily drunk
  Directional or Locational Adverbs
    specify direction or location, e.g. go home, stay here

Open Class Words: Adverbs 2
  Degree Adverbs
    specify extent of process, action, property, e.g. extremely slow, very modest
  Manner Adverbs
    specify manner of action or process, e.g. walk slowly, run fast
  Temporal Adverbs
    specify time of event or action, e.g. yesterday, Monday

Closed Word Classes
Closed Class Types:
  Prepositions: on, under, over, at, from, to, with, ...
  Determiners: a, an, the, ...
  Pronouns: he, she, it, his, her, who, I, ...
  Conjunctions: and, or, as, if, when, ...
  Auxiliary verbs: can, may, should, are, ...
  Particles: up, down, on, off, in, out, ...
  Numerals: one, two, three, ..., first, second, ...

Closed Word Class: Prepositions
occur before noun phrases; describe relations, often spatial or temporal
  e.g. on the table (spatial)
       in two hours (temporal)

Closed Word Class: Pronouns
reference to entities, events, relations etc.
  Personal Pronouns
    refer to persons or entities, e.g. you, he, it, ...
  Possessive Pronouns
    possession or relation between person and object, e.g. his, her, my, its, ...
  Wh-Pronouns
    reference in a question or back reference, e.g. Who did this ..., Frieda, who is 80 years old ...

Closed Word Class: Conjunctions
join phrases or sentences; semantics is varied and complex
  Coordinating Conjunction
    Join two phrases or sentences on the same level through conjunctions like and, or, but, ...
    e.g. He takes a cat and a dog.
         He takes a dog and she takes a cat.
  Subordinating Conjunction
    Connect embedded phrases through e.g. that
    e.g. He thinks that the cat is nicer than the dog.

Closed Word Class: Auxiliary Verbs
Mark semantic features of the main verb. Often describe tense, aspect, and modality. Semantics is difficult.
  Tense: addition expressing present, past, or future, ...
    e.g. He will take the cat home.
  Aspect: addition expressing whether an action is completed or still in progress
    e.g. He is taking the cat home. (incomplete)
  Mood: addition expressing necessity or possibility of an action
    e.g. He can take the cat home. (possible)

Closed Word Class: Copula, Modal Verbs
The copula be, the verbs do and have, and the Modal Verbs (can, should, ...) are subclasses of Auxiliary Verbs.
Describe state, process, or tense / modality of action.
Semantics: difficult (e.g. modal logic)
  State / Process: be and do
    e.g. He is at home. He does nothing.
  Tense: have
    e.g. He has taken the cat home.
  Modality: can, ought to, should, must
    e.g. He can take the cat home. (possibility)

POS Tagging - Taggers
Methods for POS Tagging:
  Rule-Based Tagging
    use a dictionary to assign POS; then use rules to disambiguate words
  Stochastic Tagging
    determines tags based on the probability of the occurrence of the tag, given the observed word, in the context of the preceding tags. Similar to Hidden Markov Models (probabilistic finite state machines).
  Learn tagging rules.
Problem in POS Tagging: Ambiguity
Problem in POS Tagging: Which tag set to use?
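The idea behind stochastic tagging can be sketched with a bigram model that scores each candidate tag t for a word w as P(w | t) * P(t | previous tag). The sketch below is a simplification under stated assumptions: it tags greedily from left to right instead of running a full Hidden Markov Model search, it uses no smoothing, and the three-sentence `TOY_CORPUS`, the `<s>` start marker, and the `UNK` fallback tag are invented for illustration.

```python
from collections import Counter, defaultdict

# Hand-tagged toy corpus (illustrative assumption, not real corpus data).
TOY_CORPUS = [
    [("the", "DT"), ("flight", "NN"), ("leaves", "VBZ")],
    [("book", "VB"), ("the", "DT"), ("flight", "NN")],
    [("she", "PRP"), ("reads", "VBZ"), ("the", "DT"), ("book", "NN")],
]


def train(corpus):
    """Count word emissions per tag and tag-to-tag transitions."""
    emission = defaultdict(Counter)    # tag -> counts of words
    transition = defaultdict(Counter)  # previous tag -> counts of next tag
    for sentence in corpus:
        prev = "<s>"                   # sentence start marker
        for word, tag in sentence:
            emission[tag][word] += 1
            transition[prev][tag] += 1
            prev = tag
    return emission, transition


def prob(counter, item):
    """Relative frequency of item in counter (0.0 if counter is empty)."""
    total = sum(counter.values())
    return counter[item] / total if total else 0.0


def tag_sentence(words, emission, transition):
    """Greedy tagging: pick the best tag for each word given the last tag."""
    tags, prev = [], "<s>"
    for word in words:
        candidates = [t for t in emission if emission[t][word] > 0]
        if candidates:
            best = max(
                candidates,
                key=lambda t: prob(emission[t], word) * prob(transition[prev], t),
            )
        else:
            best = "UNK"               # unseen word: no smoothing here
        tags.append(best)
        prev = best
    return tags


emission, transition = train(TOY_CORPUS)
print(tag_sentence(["book", "the", "book"], emission, transition))
```

In this toy model the first "book" is tagged VB because only VB has been seen at the start of a sentence, while the second is tagged NN because only NN has been seen after DT; this is the kind of ambiguity that the context of preceding tags is meant to resolve.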

POS Tagging - Tagsets
Tagsets for English
  Penn Treebank, 45 tags
  Brown corpus, 87 tags
  C5 tagset, 61 tags
  C7 tagset, 146 tags
For references see Jurafsky, p. 296.
C5 and C7 tagsets are listed in Appendix C.

Fig. 8.6 Penn Treebank, 45 tags

Fig. 8.5 English modal verbs and frequency counts from the CELEX on-line dictionary.

Ambiguity in POS Tagging Fig. 8.7 Word types and ambiguity in the Brown corpus.

Sentence Level Constructs

Sentence Level Constructs I
  declarative    "This flight leaves at 9 am."    S → NP VP
  imperative     "Book this flight for me."       S → VP

Sentence Level Constructs II
  yes-no-question    "Does this flight leave at 9 am?"          S → Aux NP VP
  wh-question        "When does this flight leave Winnipeg?"    S → Wh-NP Aux NP VP
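The four sentence-level rules can be tried out with a small recognizer. This is a naive top-down sketch, not the Earley parser used in the course: only the four S rules come from the slides, while the NP, VP, and PP rules, the lexicon (including treating "am" as a noun and "when" as a Wh-NP), and all function names are assumptions chosen so that the four example sentences are covered.

```python
# Sentence-level rules from the slides.
S_RULES = {
    "declarative": ["NP", "VP"],
    "imperative": ["VP"],
    "yes-no-question": ["Aux", "NP", "VP"],
    "wh-question": ["Wh-NP", "Aux", "NP", "VP"],
}

# Minimal phrase rules and lexicon (illustrative assumptions).
PHRASE_RULES = {
    "NP": [["Det", "Noun"], ["Num", "Noun"], ["Pronoun"], ["ProperNoun"]],
    "VP": [["Verb", "NP", "PP"], ["Verb", "NP"], ["Verb", "PP"], ["Verb"]],
    "PP": [["Prep", "NP"]],
}
LEXICON = {
    "this": "Det", "flight": "Noun", "am": "Noun", "9": "Num",
    "leaves": "Verb", "leave": "Verb", "book": "Verb",
    "at": "Prep", "for": "Prep", "me": "Pronoun",
    "does": "Aux", "when": "Wh-NP", "winnipeg": "ProperNoun",
}


def ends(symbol, words, start):
    """All positions where `symbol` can end when it begins at `start`."""
    if symbol not in PHRASE_RULES:     # word category: match a single word
        if start < len(words) and LEXICON.get(words[start]) == symbol:
            return {start + 1}
        return set()
    result = set()
    for rhs in PHRASE_RULES[symbol]:
        result |= ends_of_sequence(rhs, words, start)
    return result


def ends_of_sequence(symbols, words, start):
    """All positions where the symbol sequence can end, starting at `start`."""
    positions = {start}
    for symbol in symbols:
        positions = {e for p in positions for e in ends(symbol, words, p)}
    return positions


def sentence_types(sentence):
    """Names of the S rules that derive the whole sentence."""
    words = sentence.lower().rstrip(".?!").split()
    return [
        name for name, rhs in S_RULES.items()
        if len(words) in ends_of_sequence(rhs, words, 0)
    ]


for s in ["This flight leaves at 9 am.", "Book this flight for me.",
          "Does this flight leave at 9 am?",
          "When does this flight leave Winnipeg?"]:
    print(s, "->", sentence_types(s))
```

The recognizer works only because none of the assumed rules is left-recursive; a rule such as Nominal → Nominal PP would make this top-down strategy loop, which is one motivation for chart parsers such as Earley's.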

Noun Phrase Modification 1
Noun Phrase Modifiers
  head = the central noun of the NP
  modifiers = additions to the head noun included in the NP
    modifiers before the head noun (prenominal)
    modifiers after the head noun (post-nominal)
  examples: determiners, adjectives, PPs
    e.g. the young man
         the girl with the red hat

Noun Phrase Modification - Prenominal
  determiner: the, a, this, some, ...
  predeterminer: all the flights
  cardinal numbers, ordinal numbers: one flight, the first flight, ...
  quantifiers: much, little

Noun Phrase Modification - Prenominal
  adjectives: a first-class flight, a long flight
  adjective phrase: the least expensive flight
Grammar Rule
  NP → (Det) (Card) (Ord) (Quant) (AP) Nominal
PROJECT!
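A parser that does not support parenthesized optional constituents needs the NP rule written out as plain rules. The following sketch shows one way to do the expansion; the `(symbol, optional)` pair representation and the function name `expand_optional` are assumptions for illustration, not part of the course parser.

```python
from itertools import product


def expand_optional(lhs, rhs):
    """Expand a rule with optional symbols into all plain rules.

    `rhs` is a list of (symbol, optional) pairs. Each optional symbol is
    either left out or kept, so n optional symbols give 2**n rules.
    """
    choices = [[[], [symbol]] if optional else [[symbol]]
               for symbol, optional in rhs]
    return [
        (lhs, [symbol for part in combination for symbol in part])
        for combination in product(*choices)
    ]


# NP → (Det) (Card) (Ord) (Quant) (AP) Nominal
NP_RULE = [("Det", True), ("Card", True), ("Ord", True),
           ("Quant", True), ("AP", True), ("Nominal", False)]

np_rules = expand_optional("NP", NP_RULE)
print(len(np_rules))
for lhs, symbols in np_rules[:4]:
    print(lhs, "→", " ".join(symbols))
```

With five optional constituents the single rule stands for 32 plain rules, ranging from NP → Nominal to NP → Det Card Ord Quant AP Nominal.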

Noun Phrase Modification - Postnominal
  prepositional phrase PP
    all flights from Chicago
    Nominal → Nominal PP (PP) (PP)
  non-finite clause, gerundive postmodifiers
    all flights arriving after 7 pm
    Nominal → GerundVP
    GerundVP → GerundV NP | GerundV PP | ...
  relative clause
    a flight that serves breakfast
    Nominal → Nominal RelClause
    RelClause → (who | that) VP

Verb Subcategorization
Different verbs accept or need different constituents or complements.
VP = Verb + other constituents (complements)
  e.g. He buys the books.
Verbs can be classified according to the complements they accept or need.
  e.g. give needs two complements: He gave her the books.
       sleep accepts no complement: He sleeps.

Verb Complements
  sentential complement    VP → Verb inf-sentence    I want to fly from Boston to Chicago.
  NP complement            VP → Verb NP              I want this flight.
  no complement            VP → Verb                 I sleep.

Other Verb Complements
Prepositional Phrases and other Modifiers can be added to specify location or time of the action, state, or event described by the verb.
  VP → Verb PP PP     I fly from Boston to Chicago.
  VP → Verb PP        I sleep in the barn.
  VP → Verb PP ADV    I sleep in the barn tonight.
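Subcategorization can be modelled as a lexicon that lists, for each verb, the complement sequences it licenses, with PP and ADV modifiers allowed after the complements. The sketch below makes several assumptions: the frame entries (in particular treating fly as taking no complement), the rule that modifiers only appear at the end of the VP, and the function name `licensed` are illustrative and not taken from the course grammar.

```python
# Verb -> licensed complement frames (illustrative assumption).
SUBCAT_FRAMES = {
    "sleep": [()],                          # no complement
    "fly": [()],                            # PPs treated as modifiers
    "buy": [("NP",)],
    "want": [("NP",), ("inf-sentence",)],
    "give": [("NP", "NP")],                 # two complements
}

# Constituents that may follow the complements of any verb.
MODIFIERS = {"PP", "ADV"}


def licensed(verb, constituents):
    """Check whether the constituents after `verb` fit one of its frames.

    Trailing modifiers are stripped first; what remains must match a
    complement frame of the verb exactly.
    """
    complements = list(constituents)
    while complements and complements[-1] in MODIFIERS:
        complements.pop()
    return tuple(complements) in SUBCAT_FRAMES.get(verb, [])


print(licensed("give", ["NP", "NP"]))       # He gave her the books.
print(licensed("sleep", ["PP", "ADV"]))     # I sleep in the barn tonight.
print(licensed("sleep", ["NP"]))            # *He sleeps the books.
print(licensed("want", ["inf-sentence"]))   # I want to fly ...
```

In a context-free grammar the same information is usually encoded by splitting Verb into subclasses, one per frame, so that each VP rule mentions only the verbs that accept its complements.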

Assignment 1-B
Extend the grammar in the Earley Parser by integrating:
  complex VPs through sub-categorization and complements
  complex NPs through pre- and post-modifiers
  some adverbs (e.g. temporal or manner) plus rule extensions
You should define 3-5 new / modified rules in each category.
Write down the new rules, and add sample parse outputs generated with the parser program to illustrate how your rules work (the last chart state is sufficient).