Grammar Engineering: What is it good for? Miriam Butt (University of Konstanz) and Martin Forst (NetBase Solutions) Colombo 2014.

Slides:



Advertisements
Similar presentations
Husse-9 Conference Pécs, January, 2009 HunGram vs. EngGram in ParGram On the Comparison of Hungarian and English in an International Computational.
Advertisements

 Christel Kemke 2007/08 COMP 4060 Natural Language Processing Feature Structures and Unification.
Feature Structures and Parsing Unification Grammars Algorithms for NLP 18 November 2014.
Lexical Functional Grammar History: –Joan Bresnan (linguist, MIT and Stanford) –Ron Kaplan (computational psycholinguist, Xerox PARC) –Around 1978.
Grammar Development Platform Miriam Butt October 2002.
Introduction to LFG Kersti Börjars & Nigel Vincent {k.borjars, University of Manchester Winter school in LFG July University.
Grammar Engineering: Set-valued Attributes Various Kinds of Constraints Case Restrictions on Arguments Miriam Butt (University of Konstanz) and Martin.
Grammatical Relations and Lexical Functional Grammar Grammar Formalisms Spring Term 2004.
The Lexicon II Miriam Butt December Hand-Coding vs. Lexicon Induction Languages contain closed class as well as open class items Closed Class: Auxiliaries,
LFG Slides based on slides by: Kersti Börjars & Nigel Vincent {k.borjars, University of Manchester Winter school in LFG July
Kakia Chatsiou GreekGram: Building a parallel grammar for Modern Greek LAC day GreekGram Building a parallel grammar for Modern Greek Kakia.
Kakia Chatsiou Modern Greek Grammar fragment Implementation using XLE FLATLANDS GreekGram Reporting on the progress of the implementation.
LING NLP 1 Introduction to Computational Linguistics Martha Palmer April 19, 2006.
LEXICAL FUNCTIONAL GRAMMAR (LFG) Anca-Diana BIBIRI 1 st semester
Generation Miriam Butt January The Two Sides of Generation 1) Natural Language Generation (NLG) Systems which take information from some database.
Natural Language Processing - Feature Structures - Feature Structures and Unification.
Towards an NLP `module’ The role of an utterance-level interface.
NLP and Speech Course Review. Morphological Analyzer Lexicon Part-of-Speech (POS) Tagging Grammar Rules Parser thethe – determiner Det NP → Det.
NLP and Speech 2004 Feature Structures Feature Structures and Unification.
Probabilistic Parsing: Enhancements Ling 571 Deep Processing Techniques for NLP January 26, 2011.
PCFG Parsing, Evaluation, & Improvements Ling 571 Deep Processing Techniques for NLP January 24, 2011.
Parsing I Miriam Butt May 2005 Jurafsky and Martin, Chapters 10, 13.
Issues in Computational Linguistics: Parsing and Generation Dick Crouch and Tracy King.
Integrating Finite-state Morphologies with Deep LFG Grammars Tracy Holloway King.
 Christel Kemke 2007/08 COMP 4060 Natural Language Processing Feature Structures and Unification.
Issues in Computational Linguistics: Grammar Engineering Dick Crouch and Tracy King.
1 Kakia Chatsiou Department of Language and Linguistics University of Essex XLE Tutorial & Demo LG517. Introduction to LFG Introduction.
Machine Translation Prof. Alexandros Potamianos Dept. of Electrical & Computer Engineering Technical University of Crete, Greece May 2003.
1 CONTEXT-FREE GRAMMARS. NLE 2 Syntactic analysis (Parsing) S NPVP ATNNSVBD NP AT NNthechildrenate thecake.
Machine Learning in Natural Language Processing Noriko Tomuro November 16, 2006.
Semi-Automatic Learning of Transfer Rules for Machine Translation of Low-Density Languages Katharina Probst April 5, 2002.
Linguistics II Syntax. Rules of how words go together to form sentences What types of words go together How the presence of some words predetermines others.
Feature structures and unification Attributes and values.
Empirical Methods in Information Extraction Claire Cardie Appeared in AI Magazine, 18:4, Summarized by Seong-Bae Park.
Probabilistic Parsing Reading: Chap 14, Jurafsky & Martin This slide set was adapted from J. Martin, U. Colorado Instructor: Paul Tarau, based on Rada.
Kakia Chatsiou A brief introduction to XLE LG617 - XLE Lab1 LG617 A brief introduction to XLE Kakia Chatsiou Dept of Language.
Kakia Chatsiou A brief introduction to XLE LG617 - XLE Lab1 LG617 A brief introduction to XLE Kakia Chatsiou Dept of Language.
A Cascaded Finite-State Parser for German Michael Schiehlen Institut für Maschinelle Sprachverarbeitung Universität Stuttgart
11 Chapter 14 Part 1 Statistical Parsing Based on slides by Ray Mooney.
Notes on Pinker ch.7 Grammar, parsing, meaning. What is a grammar? A grammar is a code or function that is a database specifying what kind of sounds correspond.
Grammar Engineering: Coordination and Macros METARULEMACRO Interfacing finite-state morphology Miriam Butt (University of Konstanz) and Martin Forst (NetBase.
1 Introduction to Computational Linguistics Eleni Miltsakaki AUTH Fall 2005-Lecture 4.
Head-driven Phrase Structure Grammar (HPSG)
For Wednesday Read chapter 23 Homework: –Chapter 22, exercises 1,4, 7, and 14.
CSA2050 Introduction to Computational Linguistics Lecture 1 Overview.
Culture , Language and Communication
Parsing with Context-Free Grammars for ASR Julia Hirschberg CS 4706 Slides with contributions from Owen Rambow, Kathy McKeown, Dan Jurafsky and James Martin.
What you have learned and how you can use it : Grammars and Lexicons Parts I-III.
CPE 480 Natural Language Processing Lecture 4: Syntax Adapted from Owen Rambow’s slides for CSc Fall 2006.
Rules, Movement, Ambiguity
The Minimalist Program
October 25, : Grammars and Lexicons Lori Levin.
November 16, 2004 Lexicon (An Interacting Subsystem in UG) Part-II Rajat Kumar Mohanty IIT Bombay.
Supertagging CMSC Natural Language Processing January 31, 2006.
CPSC 422, Lecture 27Slide 1 Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 27 Nov, 16, 2015.
1 Introduction to Computational Linguistics Eleni Miltsakaki AUTH Spring 2006-Lecture 2.
ACTL 2008 Syntax: Introduction to LFG Peter Austin, Linguistics Department, SOAS with thanks to Kersti Börjars & Nigel Vincent.
1 Natural Language Processing Lecture 6 Features and Augmented Grammars Reading: James Allen NLU (Chapter 4)
NLP. Introduction to NLP (U)nderstanding and (G)eneration Language Computer (U) Language (G)
1.[ S I forced him [ S PRO to be kind]] Phrase structure analyses in traditional transformational grammar:
Lexical-Functional Grammar A Formal System for Grammatical Representation Kaplan and Bresnan, 1982 Erin Fitzgerald NLP Reading Group October 18, 2006.
NATURAL LANGUAGE PROCESSING
September 26, : Grammars and Lexicons Lori Levin.
10/31/00 1 Introduction to Cognitive Science Linguistics Component Topic: Formal Grammars: Generating and Parsing Lecturer: Dr Bodomo.
Roadmap Probabilistic CFGs –Handling ambiguity – more likely analyses –Adding probabilities Grammar Parsing: probabilistic CYK Learning probabilities:
Natural Language Processing Vasile Rus
Lecture – VIII Monojit Choudhury RS, CSE, IIT Kharagpur
Department of Language and Linguistics
Parsing Unrestricted Text
Presentation transcript:

Grammar Engineering: What is it good for? Miriam Butt (University of Konstanz) and Martin Forst (NetBase Solutions) Colombo 2014

Outline What is a deep grammar and why would you want one? XLE and ParGram Robustness techniques Generation and Disambiguation Some Applications: –Question-Answering System –Murrinh-Patha Translation System –Computer Assisted Language Learning (CALL) –[Text Summarization]

Deep grammars Provide detailed syntactic/semantic analyses –HPSG (LinGO, Matrix), LFG (ParGram) –Grammatical functions, tense, number, etc. Mary wants to leave. subj(want~1,Mary~3) comp(want~1,leave~2) subj(leave~2,Mary~3) tense(leave~2,present) Usually manually constructed

Why don't people use them? Time consuming and expensive to write –shallow parsers can be induced automatically from a training set Brittle –shallow parsers produce something for everything Ambiguous –shallow parsers rank the outputs Slow –shallow parsers are very fast (real time)

Why should one pay attention now? Robustness: –Integrated Chunk Parsers/Fragment Grammars –Bad input always results in some (possibly good) output Ambiguity : –Integration of stochastic methods –Optimality Theory used to rank/pick alternatives Speed: getting closer to shallow parsers Accuracy and information content: –far beyond the capabilities of shallow parsers. New Generation of Large-Scale Grammars:

XLE at PARC Platform for Developing Large-Scale LFG Grammars LFG (Lexical-Functional Grammar) –Invented in the 1980s (Joan Bresnan and Ronald Kaplan) –Theoretically stable  Solid Implementation XLE is implemented in C, used with emacs, tcl/tk XLE includes a parser, generator and transfer (XFR) component.

Project Structure Languages: Arabic, Chinese, Danish, English, French, Georgian, German, Hungarian, Irish Gaelic, Indonesian, Japanese, Malagasy, Murrihn-Patha, Norwegian, Polish, Tigrinya, Turkish, Urdu, Welsh, Wolof… Theory: Lexical-Functional Grammar Platform: XLE –parser –generator –machine translation Loose organization: no common deliverables, but common interests.

Grammar Components Each Grammar contains: Annotated Phrase Structure Rules (S --> NP VP) Lexicon (verb stems and functional elements) Finite-State Morphological Analyzer A version of Optimality Theory (OT): used as a filter to restrict ambiguities and/or parametrize the grammar.

The Parallel in ParGram Analyze languages to a degree of abstraction that reflects the common underlying structure (i.e., identiy the subject, the object, the tense, mood, etc.) Even at this level, there is usually more than one way to analyze a construction The same theoretical analysis may have different possible implementations The ParGram Project decides on common analyses and implementations (via meetings and the feature committee)

The Parallel in ParGram Analyses at the level of c-structure are allowed to differ (variance across languages) Analyses at f-structure are held as parallel as possible across languages (crosslinguistic invariance). Theoretical Advantage: This models the idea of UG. Applicational Advantage: machine translation is made easier; applications are more easily adapted to new languages (e.g., Kim et al. 2003).

Basic LFG Constituent-Structure: tree Functional-Structure: Attribute Value Matrix universal NP PRON they S VP V appear PRED'pro' PERS 3 NUMpl SUBJ TENSEpres PRED'appear '

Basic LFG (cont.) Three basic wellformedness principles: Consistency Every attribute can only have one value. Completeness All grammatical functions listed in a subcategorization frame must be present, and if thematic, they must have a PRED. Coherence All grammatical functions in an f-structure must be licensed by a subcategorization frame.

Syntactic rules Annotated phrase structure rules Category --> Cat1: Schemata1; Cat2: Schemata2; Cat3: Schemata3. S --> NP: (^ SUBJ)=! (! CASE)=NOM; VP: ^=!.

Another sample rule "indicate comments" VP --> V: ^=!; "head" (NP: (^ OBJ)=! "() = optionality" (! CASE)=ACC) PP*: ! $ (^ ADJUNCT). "$ = set" VP consists of: a head verb an optional object zero or more PP adjuncts

Lexicon Basic form for lexical entries: word Category1 Morphcode1 Schemata1; Category2 Morphcode2 Schemata2. walk V * (^ PRED)='WALK '; N * (^ PRED) = 'WALK'. girl N * (^ PRED) = 'GIRL'. kick V * { (^ PRED)='KICK ' |(^ PRED)='KICK '}. the D * (^ DEF)=+.

Parsing a string create-parser grammar1.lfg parse ”the dog eats bones” Ungrammatical via Unification, etc. Simple Demo