A short introduction to Natural Language Generation Kees van Deemter Computing Science University of Aberdeen.



These introductory slides... owe much to earlier slides by Chris Mellish

First: NLG from a practical perspective
Goal (usually): computer software which produces understandable and appropriate texts in some human language
Input: some non-linguistic representation of information (e.g., tables in a database, logical formulas, Java code, ...)
Output: documents, reports, explanations, help messages, ...
Knowledge sources required: knowledge of language and of the domain; maybe of the intended audience as well
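The input/output picture above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `WeatherRecord` fields, the 40 km/h threshold, and the sentence template are invented, and a fixed template is far simpler than what fielded systems do.

```python
from dataclasses import dataclass


@dataclass
class WeatherRecord:
    """Hypothetical non-linguistic input: one row of a weather table."""
    region: str
    wind_dir: str
    wind_kmh: int
    sky: str


def describe(record: WeatherRecord) -> str:
    """Map one data record to one English sentence using a fixed template."""
    # Domain knowledge (assumed): a made-up threshold for calling a wind "strong".
    strength = "strong" if record.wind_kmh >= 40 else "light"
    # Language knowledge: word order and punctuation live in the template.
    return (f"In {record.region}, expect {record.sky} skies "
            f"with {strength} {record.wind_dir} winds.")


report = describe(WeatherRecord("Aberdeen", "northerly", 45, "cloudy"))
print(report)
```

Even this toy separates the two knowledge sources named on the slide: the threshold encodes domain knowledge, the template encodes knowledge of language.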

Language Technology (diagram)
Natural Language Understanding: Text to Meaning
Natural Language Generation: Meaning to Text
Speech Recognition: Speech to Text
Speech Synthesis: Text to Speech

Example System: FoG
Function: Produces textual weather reports in English and French
Input: Graphical/numerical weather depiction
User: Environment Canada (Canadian Weather Service)
Developer: CoGenTex [Kittredge, Goldberg and Driedger 1994]
Status: Fielded, in operational use since 1992

FoG: Input

FoG: Output

Example System: STOP
Function: Produce personalised stop-smoking leaflets
Input: Questionnaire about smoking status, beliefs, etc.
Target user: NHS
Developer: Aberdeen University (CS, Medicine, GP Depts) [Reiter & Robertson 1999]
Status: Clinical trial suggested not effective

STOP: Input

STOP: Output
Dear Ms Cameron,
Thank you for taking the trouble to return the smoking questionnaire that we sent you. It appears from your answers that although you're not planning to stop smoking in the near future, you would like to stop if it was easy. You think it would be difficult to stop because smoking helps you cope with stress, it is something to do when you are bored, and smoking stops you putting on weight. However, you have reasons to be confident of success if you did try to stop, and there are ways of coping with the difficulties.

Example System: Dial Your Disc (DYD)
Function: Context-sensitive descriptions of Mozart's instrumental music
Input: Music database + history of interaction
Target user: Music industry, customers for music-on-demand
Developer: Philips Electronics (Nat Lab – IPO, Eindhoven) [Van Deemter & Odijk 1995]
Status: Not deployed; methods reused in GOALGETTER and other systems

Example System: Dial Your Disc (DYD)
The user composes a home-made CD. A number of tracks are on the CD already.
Speech (with keyword spotting) tells the system what type of music the user would like to add to the CD. E.g., "I'd like some piano music. I'm interested in solo performances." Spotted keywords: piano, solo.
The system chooses one composition with solo piano (at random). The music starts. After a while, a text is spoken (while the music is turned down).
Previous descriptions are taken into account. For example, the second time a piano sonata is selected, the following text may be generated:
(Many choices were randomised, so you would seldom get the same monologue twice.)

Example System: Dial Your Disc (DYD)
Example of approximate output, in its most elaborate form:
The following+ composition+, from which you are going to hear a fragment+ of part three+, was written+ by Mozart in the beginning+ of seventeen+ seventy+ five+, in Munich+. The work is also+ a sonata+ in f+, like the preceding+ composition, but now+ for piano+. The KV+ number of this work is K. two+ eight+ zero+. This sonata+ consists of three+ parts+: allegro assai+, adagio+, and presto+. The presto lasts two+ minutes+ forty+ five+ seconds+. This presto is located on track six+ of first+ CD+ of volume seventeen+. The piano+ is played by Mitsuko Uchida+. The recording+ of the sonata+ was made+ in the Henry Wood+ Hall in London+, England, in the eighties+. The quality+ of its recording is DDD+. The following+ is a fragment+ of the third+ part+. [A fragment follows]
Each + marks a pitch accent on the preceding word.

Example System: ILEX
Function: Context-sensitive descriptions of museum artefacts
Input: Museum database + history of interaction
Target user: National Museums of Scotland
Developer: Edinburgh University [R. Dale et al. 1998; Oberlander et al. 1998]
Status: Commercial application under investigation

When to use NLG?
NLG is better than having people write texts when:
There are many potential documents to be written, differing according to the context (user, situation, language).
There are some general principles behind document design.

Why is NLG hard?
NLG involves making many choices, e.g. which content to include, what order to say it in, what words and syntactic constructions to use.
Linguistics does not yet provide us with a ready-made, precise theory about how to make such choices to produce coherent text.

Why is NLG hard?
The choices to be made interact with one another in complex ways.
Many results of choices (e.g. length and readability of the text) are only visible at the end of the process.

Choices
The Serbian Prime Minister, Zoran Djindjic, has been assassinated in the capital, Belgrade. The pro-reform, pro-Western leader was shot in the stomach and in the back outside government offices at around 1300 (1200 gmt), and died of his wounds in hospital. (BBC news, UK edition, 12/3/03)

Tasks and architecture
Most practical NLG systems use a fixed order in which these generation tasks are performed.
After Reiter 1994, we often speak of the NLG pipeline.
Different systems use slightly different orderings.

Tasks and Architecture in NLG
Document Planning: Content Determination, Document Structuring
Microplanning: Aggregation, Lexicalisation, Generation of Referring Expressions
Surface Realisation: Linguistic Realisation, Physical Realisation
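The fixed ordering of tasks can be sketched as a chain of functions, one per stage. This is only a toy under stated assumptions: the `Fact` dictionaries, the `priority` field, and the rule "full name first, then the pronoun He" are all hypothetical, and each real task is far richer than the one-liner standing in for it here.

```python
from typing import Dict, List

Fact = Dict[str, str]  # hypothetical fact format: subject, predicate, priority

FACTS: List[Fact] = [
    {"subject": "James Sportler", "predicate": "drives an ancient pink Jaguar", "priority": "2"},
    {"subject": "James Sportler", "predicate": "is a famous British designer", "priority": "1"},
    {"subject": "Thomas Wendsop", "predicate": "won the first prize in the FWJG awards", "priority": "1"},
]


def content_determination(facts: List[Fact], topic: str) -> List[Fact]:
    """Document planning: keep only the facts about the requested topic."""
    return [f for f in facts if f["subject"] == topic]


def document_structuring(facts: List[Fact]) -> List[Fact]:
    """Document planning: order the facts (here by an assumed priority field)."""
    return sorted(facts, key=lambda f: int(f["priority"]))


def microplanning(facts: List[Fact]) -> List[Dict[str, str]]:
    """Microplanning: pick referring expressions.
    Assumption: first mention uses the full name, later mentions the pronoun 'He'."""
    specs = []
    for i, fact in enumerate(facts):
        referent = fact["subject"] if i == 0 else "He"
        specs.append({"subject": referent, "predicate": fact["predicate"]})
    return specs


def surface_realisation(specs: List[Dict[str, str]]) -> str:
    """Surface realisation: turn each sentence specification into a string."""
    return " ".join(f"{s['subject']} {s['predicate']}." for s in specs)


def generate(facts: List[Fact], topic: str) -> str:
    """Run the stages in a fixed order, each consuming the previous stage's output."""
    selected = content_determination(facts, topic)
    ordered = document_structuring(selected)
    return surface_realisation(microplanning(ordered))


print(generate(FACTS, "James Sportler"))
```

Because each stage only sees the output of the one before it, a late stage cannot ask an early one to reconsider, which is one reason interacting choices are awkward to handle in a strict pipeline.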

Example: Noun Phrase design
A noun phrase can convey an arbitrary amount of information:
Someone vs a designer vs an old designer vs an old designer with red hair …
How much information should we pack into a given NP?

Some Issues to Consider
Telling the reader what they need to know (e.g., who you're talking about, and what's worth knowing about them)
Clarity and readability of the NP; other effects on the reader (e.g., via politeness)
Successful use of pronouns and abbreviated references

Example Content (NB we assume that words, basic syntax etc. have been chosen)
This T-shirt was made by James Sportler. Sportler is a famous British designer. He drives an ancient pink Jaguar. He works in London with Thomas Wendsop. Wendsop won the first prize in the FWJG awards.
Can/should we add more to the NP?

One possible addition
This T-shirt was made by James Sportler, who works in London with Thomas Wendsop. Sportler is a famous British designer. He drives an ancient pink Jaguar. Wendsop won the first prize in the FWJG awards.
Facts about Wendsop are now separated from one another (focus).
Wendsop now has greater prominence in the text (ordering).

Another possible addition
This T-shirt was made by James Sportler, a famous British designer who works in London with Thomas Wendsop, who won the first prize in the FWJG awards. Sportler drives an ancient pink Jaguar.
The NP is now very complex (readability).
"He" now doesn't seem to work in the second sentence (pronouns).

Another possible addition
This T-shirt was made by James Sportler, a famous British designer. He drives an ancient pink Jaguar. He works in London with Thomas Wendsop. Wendsop won the first prize in the FWJG awards.
Possibly the best solution, but why?
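The alternatives above differ in how much material is folded into the first NP. A small sketch can make the options explicit; `build_np` is a hypothetical helper, and word count is used only as a crude, assumed stand-in for NP complexity, not as a real readability measure.

```python
from typing import Optional


def build_np(name: str,
             appositive: Optional[str] = None,
             relative_clause: Optional[str] = None) -> str:
    """Build a noun phrase from a name plus optional modifiers."""
    np = name
    if appositive:
        np += f", {appositive}"
    if relative_clause:
        # After an appositive the clause attaches directly; after a bare name it needs a comma.
        np += f" {relative_clause}" if appositive else f", {relative_clause}"
    return np


NAME = "James Sportler"
APPOSITIVE = "a famous British designer"
RELATIVE = "who works in London with Thomas Wendsop"

variants = [
    build_np(NAME),
    build_np(NAME, relative_clause=RELATIVE),
    build_np(NAME, appositive=APPOSITIVE),
    build_np(NAME, appositive=APPOSITIVE, relative_clause=RELATIVE),
]

for np in variants:
    # Word count as a rough proxy for how heavy the NP has become.
    print(f"{len(np.split()):2d} words: This T-shirt was made by {np}.")
```

Enumerating the variants is the easy part; the open question from the slides is how a system should choose among them given focus, ordering, readability, and later pronoun use.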

NLG Beyond Words
Plain text: words and punctuation
Printed documents (e.g. newspapers) need to consider typography, layout, graphics
Online documents (e.g. Web pages) need to consider hypertext links
Speech (e.g. radio broadcasts, telephone) needs to consider prosody
Visual presentation (e.g. Embodied Conversational Agents) needs to consider animation and facial expressions too

Plain Text
When time is limited, travel by limousine, unless cost is also limited, in which case go by train. When only cost is limited a bicycle should be used for journeys of less than 10 kilometers, and a bus for longer journeys. Taxis are recommended when there are no constraints on time or cost, unless the distance to be travelled exceeds 10 kilometers. For journeys longer than 10 kilometers, when time and cost are not important, journeys should be made by hire car.

With Typography and Layout
When only time is limited:
    travel by Limousine
When only cost is limited:
    travel by Bus if journey more than 10 kilometers
    travel by Bicycle if journey less than 10 kilometers
When both time and cost are limited:
    travel by Train
When time and cost are not limited:
    travel by Hire Car if journey more than 10 kilometers
    travel by Taxi if journey less than 10 kilometers
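Both presentations express the same underlying rules, which can be written down once as a decision function. This is a sketch: the function name and signature are invented, and treating a journey of exactly 10 kilometers as short is an assumption, since the laid-out version leaves that case open.

```python
def recommend_transport(time_limited: bool, cost_limited: bool, distance_km: float) -> str:
    """Return the recommended transport for the travel rules on the slides."""
    if time_limited and cost_limited:
        return "train"
    if time_limited:
        return "limousine"
    # Assumption: exactly 10 km counts as a short journey.
    long_journey = distance_km > 10
    if cost_limited:
        return "bus" if long_journey else "bicycle"
    return "hire car" if long_journey else "taxi"


# Print one line per combination of constraints, for a short and a long journey.
for time_limited in (True, False):
    for cost_limited in (True, False):
        for km in (5, 25):
            choice = recommend_transport(time_limited, cost_limited, km)
            print(f"time limited={time_limited}, cost limited={cost_limited}, {km} km: {choice}")
```

From an NLG point of view the function is the non-linguistic input; the plain-text paragraph and the laid-out version are two alternative realisations of it.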

Text and Graphics (e.g. Andre and Rist 2000)
Plain text: Push the code switch S-4 to the right. The code switch is located in front of the transformer.

Embodied Conversational Agents (ECAs)
Until recently, textual aspects of ECAs were largely canned.
Recent systems use NLG.
Example: NECA e-Showroom system for car sales.
Input to NLG includes: facts about the car, agents' interests, interaction history

Second perspective: NLG as a branch of linguistics
The choices made by an NLG system involve the mapping between words and things/ideas. Surely, this is linguistic territory!
If linguists cannot say how the different stories about James Sportler differ, then who can?
An NLG program might be seen as a model of language production (in terms of its output; the human production process may be very different).
This course is neutral between the practical and the theoretical perspective, but I am mostly interested in contributions to (linguistic) theory.

Conclusions
NLG is the (somewhat less investigated) twin brother of NL Understanding.
Just like the interpretive perspective (of NLU), the generative perspective (of NLG) poses deep theoretical problems about language and communication.
NLG has great potential for applications.
In applications and theory alike, NLG and NLU are sometimes difficult to separate.

Hidden agenda
Highlight open questions.
Get more people to work on Natural Language Generation (NLG).