Introduction to Natural Language Generation


Introduction to Natural Language Generation
Yael Netzer
Department of Computer Science, Ben Gurion University

Outline
- Introduction: what is NLG
- Traditional architecture of an NLG system
- Statistical methods in NLG
- FUF/SURGE
- An example in Hebrew: the noun phrase
- A statistical method for generation

Yael Netzer, BGU, November 6, 2001

What is Natural Language Generation (NLG)?
- NLG is the process of constructing natural language outputs from non-linguistic inputs. [VanLinden]
- NLG is the mapping of some communicative goal to a surface utterance that satisfies the goal. [Reiter & Dale]

Aspects of NLG
Interest in NLG is both theoretical and practical:
- Theoretical: modeling human language representation and production at various depths.
- Practical: engineering human/computer interfaces (the computer as an author or as an authoring aid).

Example systems
NLG as an author:
- Weather reports (FoG)
- Stock market descriptions
- Descriptions of museum artifacts (ILEX)
- "Personal" letters to customers (AlethGen)
NLG as an author aid
Integrated (partial) uses of NLG:
- NLG in augmentative and alternative communication
- Summarization (integrating "cut and paste" techniques with generation)
- Machine translation (generation from an interlingua)

Inputs of NLG systems
Formally, the input to an NLG system can be defined as a four-tuple {k, c, u, d}:
- k, the knowledge source: tables of numbers, a knowledge representation language, etc. It is domain dependent, so no generalizations can be made about its form.
- c, the communicative goal: the intended consequence of a given execution of the system (taking the appropriate information into account).

NLG input specification, continued
- u, the user model: a characterization of the hearer or intended audience for whom the text is generated.
- d, the discourse history: previous interactions between the user and the NLG system, used to control anaphoric forms and to prevent repetition.
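As an illustration, the four-tuple described above could be held in a small record type. This is only a sketch: the class name `NLGInput`, its field names and the sample weather values are assumptions made for the example and do not come from any particular NLG system.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class NLGInput:
    """Hypothetical container for the four-tuple {k, c, u, d}."""
    knowledge_source: Any                                   # k: domain-dependent data
    communicative_goal: str                                 # c: what the text should achieve
    user_model: dict = field(default_factory=dict)          # u: who the text is for
    discourse_history: list = field(default_factory=list)   # d: what was said so far


# Made-up example request for a monthly weather summary.
request = NLGInput(
    knowledge_source={"month": "May", "mean_temp_c": 17.2, "rain_mm": 3.0},
    communicative_goal="summarize-month",
    user_model={"expertise": "layperson"},
)

# Each generated utterance is recorded so that later calls can avoid repetition.
request.discourse_history.append("The month was cooler than average.")
```

Using `default_factory` gives every request its own user model and history, so the discourse history of one dialogue cannot leak into another.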

The output of an NLG system
- Any text that conveys the communicative goal: it can be a single word, such as "yes" in a dialogue, or a text of many paragraphs.
- The output should suit the medium: web pages with hyperlinks, a voice stream, etc.

Main (pipeline) architecture
- Content determination: what information should be included in the text?
- Document structuring: how should the text be organized?
- Lexicalisation: choosing particular words or phrases.
- Aggregation: composing chunks of information into sentences.
- Referring expression generation: which properties should be used to refer to an entity?
- Surface realization: mapping the underlying content of the text to a grammatically correct sentence that expresses the desired meaning.
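The six stages can be pictured as functions applied one after another, each adding to a shared document state. The sketch below is a deliberately tiny toy, not a real NLG system: the stage bodies, the `ORDER` and `LEXICON` tables, the `generate` helper and the weather data are all assumptions chosen only to make the data flow visible.

```python
# Toy tables (assumed): presentation order and a minimal lexicon.
ORDER = ["temperature", "rainfall"]
LEXICON = {
    ("temperature", "below"): "cooler", ("temperature", "above"): "warmer",
    ("rainfall", "below"): "drier", ("rainfall", "above"): "wetter",
}


def content_determination(doc):
    # Keep only attributes that deviate from the average.
    doc["messages"] = [(a, r) for a, r in doc["data"].items() if r != "average"]
    return doc


def document_structuring(doc):
    # Impose a fixed order on the selected messages.
    doc["messages"].sort(key=lambda m: ORDER.index(m[0]))
    return doc


def lexicalisation(doc):
    # Map each message to a word.
    doc["words"] = [LEXICON[m] for m in doc["messages"]]
    return doc


def aggregation(doc):
    # Combine the words into a single predicate.
    doc["predicate"] = " and ".join(doc["words"]) + " than average"
    return doc


def referring_expression(doc):
    # Use a pronoun if the entity was already mentioned.
    mentioned = doc["entity"] in doc["history"]
    doc["subject"] = "it" if mentioned else "the " + doc["entity"]
    return doc


def surface_realization(doc):
    # Produce the final string with capitalisation and punctuation.
    sentence = f"{doc['subject']} was {doc['predicate']}."
    return sentence[0].upper() + sentence[1:]


PIPELINE = [content_determination, document_structuring, lexicalisation,
            aggregation, referring_expression, surface_realization]


def generate(data, entity, history=()):
    result = {"data": data, "entity": entity, "history": history}
    for stage in PIPELINE:
        result = stage(result)
    return result


data = {"rainfall": "below", "temperature": "below", "sunshine": "average"}
first = generate(data, "month")
later = generate(data, "month", history=("month",))
```

Here `first` is a full sentence with the subject "the month", while `later` uses the pronoun because the entity is already in the discourse history.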

Content determination
- The process of deciding what to say.
- There are no general rules; it is domain specific.
- It decides what is important: what should always be included, what counts as exceptional information, etc.
- In practice, it constructs a set of messages from the underlying data (entities, concepts and relations).
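To make "constructing a set of messages from the underlying data" concrete, here is a minimal sketch for a weather domain. The `Message` type, the `determine_content` function, the 10% tolerance rule and the figures are all hypothetical; a real system would encode whatever its domain experts consider worth reporting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    entity: str      # what the message is about
    attribute: str   # which property is described
    relation: str    # "below" or "above" the long-term average


def determine_content(entity, observed, averages, tolerance=0.1):
    """Report only values that differ from the average by more than `tolerance`."""
    messages = []
    for attribute, value in observed.items():
        average = averages[attribute]
        if abs(value - average) <= tolerance * average:
            continue  # unexceptional value: say nothing about it
        relation = "below" if value < average else "above"
        messages.append(Message(entity, attribute, relation))
    return messages


# Made-up monthly figures and long-term averages.
observed = {"temperature": 15.0, "rainfall": 20.0, "wind": 10.2}
averages = {"temperature": 18.0, "rainfall": 45.0, "wind": 10.0}
messages = determine_content("May", observed, averages)
```

With these figures, temperature and rainfall yield messages and wind is dropped as unexceptional.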

Document structuring
Imposing an ordering and a structure on the information:
- conceptual grouping
- rhetorical relationships

Lexical choice
- The lexical chooser (LC) determines the particular words used to express concepts and relations.
- Trade-off: complexity of coding vs. richer language.
- Choosing content words: information is mapped from a conceptual vocabulary.
- The LC should supply a variety of words, take the user model into account (e.g. a precise vs. a general description of a weather phenomenon), and account for pragmatic considerations (formal vs. casual style).
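One simple way to picture a lexical chooser that respects the user model and the style is a lookup keyed by concept, expertise and register. The lexicon entries, the key names and the fallback policy below are invented for illustration only.

```python
# Hypothetical lexicon: concept -> {(expertise, style): wording}.
LEXICON = {
    "heavy-rain": {
        ("expert", "formal"): "intense precipitation",
        ("layperson", "formal"): "heavy rain",
        ("layperson", "casual"): "a downpour",
    },
}


def choose_lexeme(concept, expertise="layperson", style="formal"):
    """Pick a wording for `concept`, falling back to the plain formal form."""
    entries = LEXICON[concept]
    return entries.get((expertise, style), entries[("layperson", "formal")])


precise = choose_lexeme("heavy-rain", expertise="expert")
casual = choose_lexeme("heavy-rain", style="casual")
fallback = choose_lexeme("heavy-rain", expertise="expert", style="casual")
```

The table grows with every concept and every user or style distinction, which is the coding-complexity side of the trade-off noted above.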

Aggregation
Aggregation can be performed at various stages:
- In the planner: similar data are combined.
- In lexicalization: several concepts are aggregated into one lexical element.
- Aggregation of sentences: "The month was cooler than average." and "The month was drier than average." become "The month was cooler and drier than average."
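Sentence aggregation of the kind shown in the example can be sketched as grouping clauses that share a subject and a complement, then coordinating what differs. The triple representation and the `aggregate` function are assumptions for this sketch; the copula "was" and the plain "and" coordination are hard-coded to fit the example.

```python
def aggregate(clauses):
    """Merge (subject, adjective, complement) clauses that differ only in the adjective."""
    groups = {}
    for subject, adjective, complement in clauses:
        # Clauses with the same subject and complement land in the same group.
        groups.setdefault((subject, complement), []).append(adjective)
    sentences = []
    for (subject, complement), adjectives in groups.items():
        sentences.append(f"{subject} was {' and '.join(adjectives)} {complement}.")
    return sentences


clauses = [
    ("The month", "cooler", "than average"),
    ("The month", "drier", "than average"),
]
sentences = aggregate(clauses)
```

Clauses with a different subject or complement are left as separate sentences.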

Referring expression generation
An entity can be referred to in many ways: on initial or subsequent mention, by a distinguishing or definite description, or by a pronoun.
- Proper names: באר שבע (Be'er Sheva); באר שבע בית הנגב
- Definite descriptions: the train that leaves at 10am; the next train
- Pronouns: it
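A very rough policy for choosing among these forms can be driven by the discourse history: a full description on first mention, a pronoun immediately afterwards, and a shorter description later. The `refer` function, the entity fields and this three-way rule are assumptions made for illustration; real algorithms also check that the chosen expression distinguishes the entity from its competitors.

```python
def refer(entity, history):
    """Choose a referring expression and record the mention in `history`."""
    if entity["id"] not in history:
        expression = entity["initial"]      # first mention: full description
    elif history[-1] == entity["id"]:
        expression = entity["pronoun"]      # just mentioned: pronoun
    else:
        expression = entity["subsequent"]   # mentioned earlier: shorter description
    history.append(entity["id"])
    return expression


# Hypothetical entities; the train descriptions reuse the slide's examples.
train = {"id": "train-1", "initial": "the train that leaves at 10am",
         "subsequent": "the next train", "pronoun": "it"}
bus = {"id": "bus-1", "initial": "the bus to the university",
       "subsequent": "the bus", "pronoun": "it"}

history = []
mentions = [refer(train, history), refer(train, history),
            refer(bus, history), refer(train, history)]
```

The pronoun is used only when the entity was the most recent mention, which avoids an ambiguous "it" after the bus has been introduced.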

Syntactic realizer
- The syntactic realizer handles syntax and morphology.
- It is the most general, domain-independent component (but definitely language dependent).
- Various usage scenarios.
- The input to syntactic realization is not observable.
Input for syntactic realizers in NLG:
- What knowledge is needed to prepare the input?
- Who supplies this knowledge?
- Can we find a common abstraction, shared across languages and applications?

Possible techniques for realizers
- Bi-directional grammar specifications
- Grammar specifications tuned for generation
- Templates
- Corpus statistics

A note on bi-directional grammars
- Realization is, in some respects, easier than parsing: there is no need to handle the full range of syntax that a human might use, no need to resolve ambiguities, and no need to recover from ill-formed input.
- A bi-directional grammar is, in theory, an elegant approach.
- However, most NLG systems use a generation-oriented grammar.

Why not bi-directional?
- The output of an NLU parser is very different from the input to an NLG realizer.
- It is not obvious that lexicalization is part of realization.
- In practice, large bi-directional grammars are not easy to engineer.
- Moreover, generation is a process of making choices, including the choice to use "canned text" when needed.

Syntactic realizer
This work concerns syntactic realizers, that is, the grammar.
- Input to the grammar: a lexicalized representation of a phrase, at various levels of abstraction.
- Output of the grammar: a grammatical string that represents the information in the input as accurately as possible.

The input question is: what should the input to the syntactic realizer be?
[Figure labels: Knowledge base; Application; Content planner and lexicon; Input (??); Syntactic realizer]

FUF/SURGE: implementation
- The grammar is written in FUF, the Functional Unification Formalism [Elhadad].
- An FD (functional description) is a list of (att val) pairs, where val = atom | fd | path.
- The grammar is a meta-FD: disjunction is expressed with ALT, control with NONE, GIVEN and ANY.
- All components of the generation process can be implemented with this formalism.
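The core operation behind this formalism, unifying an input FD with a grammar that contains alternatives, can be approximated in a few lines. The sketch below is not FUF: it represents FDs as nested Python dicts, omits paths, NONE, GIVEN, ANY and backtracking, and the helper names `unify` and `unify_alt` and the toy grammar are assumptions made for the example.

```python
def unify(fd1, fd2):
    """Return the unification of two FDs (nested dicts), or None if they clash."""
    result = dict(fd1)
    for attr, val2 in fd2.items():
        if attr not in result:
            result[attr] = val2           # new attribute: just add it
            continue
        val1 = result[attr]
        if isinstance(val1, dict) and isinstance(val2, dict):
            sub = unify(val1, val2)       # both are FDs: unify recursively
            if sub is None:
                return None
            result[attr] = sub
        elif val1 != val2:
            return None                   # conflicting atoms: failure
    return result


def unify_alt(fd, alternatives):
    """ALT-style disjunction: return the first alternative that unifies with `fd`."""
    for branch in alternatives:
        unified = unify(fd, branch)
        if unified is not None:
            return unified
    return None


# Toy grammar with two branches selected by the cat feature.
grammar = [
    {"cat": "clause", "mood": "declarative"},
    {"cat": "np", "number": "singular", "definite": "yes"},
]
enriched = unify_alt({"cat": "np", "lex": "girl"}, grammar)
```

The input is compatible only with the second branch, which also enriches it with features the input left unspecified.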

Requirements for a syntactic realizer
- Mapping thematic structure onto syntactic roles.
- Control of syntactic paraphrasing and alternations.
- Provision of defaults for syntactic features.
- Propagation of agreement features.
- Selection of closed-class words.
- Imposition of linear precedence constraints.
- Inflection of open-class words.

SURGE [Elhadad & Robin 96]
- Draws on functional grammar, HPSG and descriptive studies of language.
- The input to the grammar is a lexicalized representation of a phrase (a clause, NP, AP).
- Keeping syntactic information in the input minimal frees the earlier stages of the process from purely syntactic knowledge, gives the grammar paraphrasing power, and is also useful for multilingual applications.

Input for SURGE in general
- Each constituent has the feature cat, which determines which part of the grammar it will be unified with.
- The representation of a clause is mostly semantic: a process (in SFL terms) and its participants.
- A clause can be paraphrased by changing a single feature, such as focus.
- The input for an NP uses mostly syntactic features, so paraphrasing an NP requires a different input.

An example
The following input generates "John kissed the girl." Adding the feature (focus {partic affected}) generates "The girl was kissed by John." instead.

((cat clause)
 (tense past)
 (process ((type material)
           (agentless no)
           (lex "kiss")))
 (partic ((agent ((cat proper)
                  (lex "John")))
          (affected ((cat common)
                     (lex "girl"))))))
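To see how a single focus feature can switch between the two paraphrases, the input can be mirrored as a nested Python dict and passed to a toy realizer. This is not SURGE: the `realize` and `noun_phrase` functions are hypothetical, the participants are stored under the key `partic` so that the focus path resolves, only regular past-tense verbs are handled, and common nouns are assumed to be definite.

```python
def noun_phrase(fd):
    # Proper nouns are bare; common nouns get a definite article (an assumption).
    return fd["lex"] if fd["cat"] == "proper" else "the " + fd["lex"]


def realize(clause):
    """Toy realizer for past-tense material clauses with an agent and an affected."""
    verb = clause["process"]["lex"] + "ed"   # regular verbs only
    agent = noun_phrase(clause["partic"]["agent"])
    affected = noun_phrase(clause["partic"]["affected"])
    if clause.get("focus") == ("partic", "affected"):
        words = f"{affected} was {verb} by {agent}"   # focus on affected: passive
    else:
        words = f"{agent} {verb} {affected}"          # default: active
    return words[0].upper() + words[1:] + "."


clause = {
    "cat": "clause",
    "tense": "past",
    "process": {"type": "material", "agentless": "no", "lex": "kiss"},
    "partic": {
        "agent": {"cat": "proper", "lex": "John"},
        "affected": {"cat": "common", "lex": "girl"},
    },
}

active = realize(clause)
passive = realize({**clause, "focus": ("partic", "affected")})
```

The two calls differ only in the added focus feature; the process and its participants stay the same.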