THEORIES AND TECHNIQUES OF RECOGNITION Computational linguistics in Python: Syntactic analysis (parsing)

FROM CHUNKING TO FULL SYNTACTIC ANALYSIS

PROBLEM: AMBIGUITY While hunting in Africa, I shot an elephant in my pajamas. How an elephant got into my pajamas I'll never know.

CHARACTERIZING THE SYNTAX OF A LANGUAGE: CONTEXT-FREE GRAMMARS Slides ELN?

CHARACTERIZING THE SYNTAX OF A LANGUAGE: CONTEXT-FREE GRAMMARS
Context-free grammars capture constituency and ordering:
– Ordering: what rules govern the order of words and larger units in the language?
– Constituency: how do words group into units, and how do the various kinds of units behave?

Constituency
E.g., noun phrases (NPs):
– Three parties from Brooklyn
– A high-class spot such as Mindy’s
– The Broadway coppers
– They
– Harry the Horse
– The reason he comes into the Hot Box
How do we know these form a constituent?

Constituency (II)
– They can all appear before a verb:
  Three parties from Brooklyn arrive…
  A high-class spot such as Mindy’s attracts…
  The Broadway coppers love…
  They sit
– But individual words can’t always appear before verbs:
  *from arrive… *as attracts… *the is… *spot is…
– We must be able to state generalizations like: noun phrases occur before verbs

Constituency (III)
Preposing and postposing:
– On September 17th, I’d like to fly from Atlanta to Denver
– I’d like to fly on September 17th from Atlanta to Denver
– I’d like to fly from Atlanta to Denver on September 17th
But not:
– *On September, I’d like to fly 17th from Atlanta to Denver
– *On I’d like to fly September 17th from Atlanta to Denver

Indicating constituents: brackets, trees
[S [NP [Pro I]] [VP [V prefer] [NP [Det a] [Nom [N morning] [N flight]]]]]
[Figure: parse tree of “I prefer a morning flight”, with nodes S, NP, VP, Pro, Verb, Det, Nom, Noun]

CFG example
S -> NP VP
NP -> Det NOMINAL
NOMINAL -> Noun
VP -> Verb
Det -> a
Noun -> flight
Verb -> left

Beyond regular languages: Context-Free Grammars
S -> NP VP
NP -> Det Nominal
Nominal -> Noun
VP -> V
Det -> the
Det -> a
Noun -> flight
V -> left

CFGs: set of rules
S -> NP VP
– This says that there are units called S, NP, and VP in this language
– That an S consists of an NP followed immediately by a VP
– It doesn’t say that that’s the only kind of S
– Nor does it say that this is the only place that NPs and VPs occur

Generativity
As with FSAs, you can view these rules as either analysis or synthesis machines:
– Generate strings in the language
– Reject strings not in the language
– Impose structures (trees) on strings in the language
How can we define grammatical vs. ungrammatical sentences?

Derivations
A derivation is a sequence of rules applied to a string that accounts for that string:
– It covers all the elements in the string
– It covers only the elements in the string

Derivations as Trees
[Figure: parse tree of “I prefer a morning flight”, with nodes S, NP, VP, Pro, Verb, Det, Nom, Noun]

CFGs more formally
A context-free grammar has 4 parameters (“is a 4-tuple”):
1) A set of non-terminal symbols (“variables”) $N$
2) A set of terminal symbols $\Sigma$ (disjoint from $N$)
3) A set of productions $P$, each of the form $A \rightarrow \alpha$, where $A$ is a non-terminal and $\alpha$ is a string of symbols from the infinite set of strings $(\Sigma \cup N)^*$
4) A designated start symbol $S$

Defining a CF language via derivation
A string A derives a string B if A can be rewritten as B via some series of rule applications.
More formally:
– If $A \rightarrow \beta$ is a production of $P$, and $\alpha$ and $\gamma$ are any strings in the set $(\Sigma \cup N)^*$, then we say that $\alpha A \gamma$ directly derives $\alpha \beta \gamma$, or $\alpha A \gamma \Rightarrow \alpha \beta \gamma$
– Derivation is a generalization of direct derivation
– Let $\alpha_1, \alpha_2, \ldots, \alpha_m$ be strings in $(\Sigma \cup N)^*$, $m \geq 1$, such that $\alpha_1 \Rightarrow \alpha_2, \alpha_2 \Rightarrow \alpha_3, \ldots, \alpha_{m-1} \Rightarrow \alpha_m$. We say that $\alpha_1$ derives $\alpha_m$, or $\alpha_1 \Rightarrow^* \alpha_m$
– We then formally define the language $L_G$ generated by grammar $G$ as the set of strings composed of terminal symbols derived from $S$:
  $L_G = \{ w \mid w \in \Sigma^* \text{ and } S \Rightarrow^* w \}$

Derivations
A DERIVATION of a string is a sequence of rule applications
– E.g., the string “a flight” can be derived from the grammar above and the symbol NP by the (leftmost first) derivation NP => Det Nominal => a Nominal => a Noun => a flight
Derivations can be visualized as PARSE TREES
The LANGUAGE defined by a CFG is the set of strings derivable from the start symbol S (for Sentence)
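The leftmost derivation of “a flight” from NP can be reproduced with a short search. This is only a minimal sketch in plain Python, not NLTK code: the dictionary GRAMMAR and the function leftmost_derivation are illustrative names, and the pruning step assumes a grammar with no empty productions and no left recursion.

```python
# Toy grammar from the slides: non-terminal -> list of right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "Nominal"]],
    "Nominal": [["Noun"]],
    "VP": [["V"]],
    "Det": [["the"], ["a"]],
    "Noun": [["flight"]],
    "V": [["left"]],
}


def leftmost_derivation(start, target, grammar):
    """Return the sentential forms of a leftmost derivation of target, or None."""

    def expand(form):
        # Locate the leftmost non-terminal in the current sentential form.
        for i, sym in enumerate(form):
            if sym in grammar:
                break
        else:
            # Only terminals are left: succeed if we produced the target.
            return [form] if form == target else None
        # Prune: the terminals already produced must match the target, and
        # (with no empty productions) the form can never shrink again.
        if form[:i] != target[:i] or len(form) > len(target):
            return None
        # Rewrite the leftmost non-terminal with each rule in turn.
        for rhs in grammar[sym]:
            rest = expand(form[:i] + rhs + form[i + 1:])
            if rest is not None:
                return [form] + rest
        return None

    return expand([start])


steps = leftmost_derivation("NP", ["a", "flight"], GRAMMAR)
print(" => ".join(" ".join(form) for form in steps))
```

Each list in steps is one sentential form, so joining them with “=>” gives the same chain as on the slide.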

Derivations and parse trees

A more formal definition A CFG is a 4-tuple consisting of

What ‘context free’ means

Derivations and languages
The language $L_G$ GENERATED by a CFG grammar $G$ is the set of strings of TERMINAL symbols that can be derived from the start symbol $S$ using the production rules in $G$:
– $L_G = \{ w \mid w \in \Sigma^* \text{ and } S \text{ derives } w \}$
The strings in $L_G$ are called GRAMMATICAL
The strings not in $L_G$ are called UNGRAMMATICAL
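For the toy flight grammar the generated language is finite, so it can be enumerated directly and grammaticality becomes a set-membership test. The sketch below is plain Python with illustrative names (GRAMMAR, generated_strings, is_grammatical); it assumes a non-recursive grammar, since a recursive one generates infinitely many strings and this function would not terminate.

```python
from itertools import product

# Toy grammar from the slides: non-terminal -> list of right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "Nominal"]],
    "Nominal": [["Noun"]],
    "VP": [["V"]],
    "Det": [["the"], ["a"]],
    "Noun": [["flight"]],
    "V": [["left"]],
}


def generated_strings(symbol, grammar):
    """List every terminal string derivable from symbol (finite grammars only)."""
    if symbol not in grammar:
        # A terminal derives only itself.
        return [[symbol]]
    strings = []
    for rhs in grammar[symbol]:
        # Expand every symbol of the right-hand side, then combine the pieces.
        expansions = [generated_strings(s, grammar) for s in rhs]
        for combination in product(*expansions):
            strings.append([word for part in combination for word in part])
    return strings


# The language is the set of terminal strings derivable from the start symbol S.
LANGUAGE = {" ".join(words) for words in generated_strings("S", GRAMMAR)}


def is_grammatical(sentence):
    """A string is grammatical exactly when it belongs to the language."""
    return sentence in LANGUAGE


print(sorted(LANGUAGE))
print(is_grammatical("a flight left"), is_grammatical("flight the left"))
```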

Grammar development
One of the most basic skills in NLE is the ability to write a CFG for some fragment of a language (e.g., dates). We’ll briefly cover some of the issues to be addressed when writing small CFG grammars.

CFG IN PYTHON (NLTK, 8.3)
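As a rough illustration, the toy flight grammar from the earlier slides can be written down with NLTK's CFG.fromstring. The variable name flight_grammar is an assumption made for this sketch; terminals are quoted, non-terminals are not, and the left-hand side of the first rule is taken as the start symbol.

```python
import nltk

# The toy grammar from the slides in NLTK's string notation.
# Alternatives for the same left-hand side are separated by "|".
flight_grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det Nominal
Nominal -> Noun
VP -> V
Det -> 'the' | 'a'
Noun -> 'flight'
V -> 'left'
""")

# Inspect the start symbol and the individual productions.
print(flight_grammar.start())
for production in flight_grammar.productions():
    print(production)
```

Each alternative counts as its own production, so the two Det rules are listed separately.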

SYNTACTIC ANALYSIS (PARSING)
TOP-DOWN search: the parse tree has to be rooted in the start symbol S
– EXPECTATION-DRIVEN parsing
– Example: RECURSIVE DESCENT
BOTTOM-UP search: the parse tree must be an analysis of the input
– DATA-DRIVEN parsing
– Example: SHIFT-REDUCE

TOP-DOWN PARSING WITH NLTK
Recursive descent parsing (NLTK, 8.3)
– nltk.RecursiveDescentParser(grammar)
– nltk.app.rdparser()
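A minimal sketch of recursive descent parsing with the toy flight grammar follows. The grammar and the sentence are illustrative choices, not taken from the NLTK book. The grammar is deliberately free of left-recursive rules such as NP -> NP PP, because a recursive descent parser can loop forever on them.

```python
import nltk

# Toy grammar (no left recursion, so top-down expansion terminates).
grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det Nominal
Nominal -> Noun
VP -> V
Det -> 'the' | 'a'
Noun -> 'flight'
V -> 'left'
""")

# The parser starts from S and expands rules until they match the input.
parser = nltk.RecursiveDescentParser(grammar)
tokens = "the flight left".split()
trees = list(parser.parse(tokens))

for tree in trees:
    print(tree)
```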

BOTTOM-UP PARSING WITH NLTK
Shift-reduce (NLTK, 8.3, p. 305)
– nltk.app.srparser()
– nltk.ShiftReduceParser(grammar)
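The same toy sentence can be parsed bottom-up. This is a sketch under the same assumptions as before (illustrative grammar and sentence). NLTK's ShiftReduceParser does not backtrack, so it returns at most one tree and can miss a parse that exists; on this small unambiguous grammar that limitation does not arise.

```python
import nltk

# Toy grammar from the slides.
grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det Nominal
Nominal -> Noun
VP -> V
Det -> 'the' | 'a'
Noun -> 'flight'
V -> 'left'
""")

# The parser shifts words onto a stack and reduces whenever the top of the
# stack matches the right-hand side of a production.
parser = nltk.ShiftReduceParser(grammar)
tokens = "the flight left".split()
trees = list(parser.parse(tokens))

for tree in trees:
    print(tree)
```

Passing trace=2 to the constructor prints the individual shift and reduce steps, which helps when following the algorithm by hand.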

MORE ADVANCED PARSING MODELS
– Left corner (NLTK)
– Chart (NLTK)
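Chart parsers store partial results in a table, which lets them handle left-recursive rules that defeat plain recursive descent. The sketch below runs NLTK's ChartParser and LeftCornerChartParser on a small grammar with the left-recursive rule NP -> NP PP. The grammar and sentence are invented for illustration.

```python
from nltk import CFG
from nltk.parse.chart import ChartParser, LeftCornerChartParser

# Illustrative grammar with a left-recursive NP rule.
grammar = CFG.fromstring("""
S -> NP VP
NP -> Det Noun | NP PP
PP -> P NP
VP -> V
Det -> 'the'
Noun -> 'flight' | 'airport'
V -> 'left'
P -> 'from'
""")

tokens = "the flight from the airport left".split()

# Both parsers fill a chart of edges; they differ in which edges they propose.
chart_trees = list(ChartParser(grammar).parse(tokens))
left_corner_trees = list(LeftCornerChartParser(grammar).parse(tokens))

for tree in chart_trees:
    print(tree)
print(chart_trees == left_corner_trees)
```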

DEPENDENCIES AND DEPENDENCY GRAMMAR (NLTK, 8.5)
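A dependency grammar lists, for each head word, the words that may depend on it. The sketch below follows the style of the example in the NLTK book section cited on the slide, using the elephant sentence from the opening slides. The variable names are illustrative, and the exact set of analyses printed may depend on the NLTK version.

```python
import nltk

# Each rule reads: head word -> possible dependents.
dependency_grammar = nltk.DependencyGrammar.fromstring("""
'shot' -> 'I' | 'elephant' | 'in'
'elephant' -> 'an' | 'in'
'in' -> 'pajamas'
'pajamas' -> 'my'
""")

# A projective parser only builds analyses without crossing dependencies.
parser = nltk.ProjectiveDependencyParser(dependency_grammar)
tokens = "I shot an elephant in my pajamas".split()
trees = list(parser.parse(tokens))

for tree in trees:
    print(tree)
```

Because 'in' may depend on either 'shot' or 'elephant', the grammar licenses more than one analysis of the sentence, mirroring the attachment ambiguity of the phrase-structure version.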

THE PROBLEM OF AMBIGUITY
Ambiguity
– Church and Patil (1982): the number of attachment ambiguities grows like the Catalan numbers: C(2) = 2, C(3) = 5, C(4) = 14, C(5) = 42, C(6) = 132, C(7) = 429, C(8) = 1430
Avoiding reparsing
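The growth of the Catalan numbers is easy to check with the closed form C(n) = (2n choose n) / (n + 1). The function name catalan is an illustrative choice for this sketch.

```python
from math import comb


def catalan(n):
    """n-th Catalan number via the closed form C(n) = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


# Print C(2) .. C(8) to see how quickly the number of analyses grows.
for n in range(2, 9):
    print(f"C({n}) = {catalan(n)}")
```

A parser that enumerates every attachment separately therefore does work that grows very quickly with sentence length, which is why sharing sub-analyses (avoiding reparsing) matters.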

COMMON STRUCTURAL AMBIGUITIES
COORDINATION ambiguity
– OLD (MEN AND WOMEN) vs (OLD MEN) AND WOMEN
ATTACHMENT ambiguity
– Gerundive VP attachment ambiguity: I saw the Eiffel Tower flying to Paris
– PP attachment ambiguity: I shot an elephant in my pajamas

PP ATTACHMENT AMBIGUITY
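The PP attachment ambiguity of the elephant sentence can be made concrete with a small grammar in the style of the NLTK book example. The rule set below is an illustrative sketch: the PP “in my pajamas” can attach either to the noun phrase (NP -> Det N PP) or to the verb phrase (VP -> VP PP), so a chart parser returns two trees.

```python
import nltk

# Small grammar in which a PP can modify either an NP or a VP.
groucho_grammar = nltk.CFG.fromstring("""
S -> NP VP
PP -> P NP
NP -> Det N | Det N PP | 'I'
VP -> V NP | VP PP
Det -> 'an' | 'my'
N -> 'elephant' | 'pajamas'
V -> 'shot'
P -> 'in'
""")

tokens = "I shot an elephant in my pajamas".split()

# A chart parser copes with the left-recursive rule VP -> VP PP.
parser = nltk.ChartParser(groucho_grammar)
trees = list(parser.parse(tokens))

for tree in trees:
    print(tree)
```

One tree places the PP inside the object NP (the elephant is in the pajamas); the other attaches it to the VP (the shooting happened in pajamas).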

AMBIGUITY: SOLUTIONS
– Use a PROBABILISTIC GRAMMAR (not covered in this module)
– Use semantics

WRITING A GRAMMAR (NLTK, 8.6)