Generation of Referring Expressions (GRE) Reading: Dale & Reiter (1995) (key paper in this area)

The task: GRE
NLG can have different kinds of inputs:
– Flat data (collections of atoms, e.g., in the tables of a database)
– Logically complex data
In both cases, unfamiliar constants may be used, and this is sometimes unavoidable.

No familiar constant available:
1. The referent has a familiar name, but it's not unique, e.g., John Smith.
2. The referent has no familiar name: trains, furniture, trees, atomic particles, … (In such cases, databases use database keys, e.g., Smith$73527$, TRAIN-3821.)
3. Similar: sets of objects (lecture 4).

Natural languages are too economical to have a proper name for everything. Names may not even be the most appropriate choice. So speakers/NLG systems have to invent ways of referring to things, e.g., the 7:38 Trenton express. Note: the problem arises whether the referent is a token or a type.

GRE tries to find the best description. GRE is a microcosm of NLG: it determines, e.g.,
– which properties to express (Content Determination)
– which syntactic configuration to use (Syntactic Realization)
– which words to choose (Lexical Choice)

This lecture:
Simplification 1: Content Determination only (until lecture 5).
Simplification 2: Definite descriptions only (pronouns, demonstratives, etc., are disregarded until tomorrow).

Dale & Reiter (1995): the best description fulfills the Gricean maxims. E.g.,
– (Quality) List properties truthfully.
– (Quantity) List sufficient properties to allow the hearer to identify the referent – but not more.
– (Relevance) Use properties that are of interest in themselves.*
– (Manner) Be brief.
* Slightly different from D&R 1995.

D&R's expectation: violation of a maxim leads to implicatures. For example,
– [Quantity] the pitbull (when there is only one dog).
– [Manner] Get the cordless drill that's in the toolbox (Appelt).
There's just one problem: …

… people don't speak this way. For example,
– [Manner] the red chair (when there is only one red object in the domain).
– [Manner/Quantity] I broke my arm (when I have two).
In general, empirical work shows much redundancy. Similar for the other maxims, e.g.,
– [Quality] the man with the martini (Donnellan).

Example situation:
Swedish: a (£100), c (£100)
Italian: b (£150), d (£150), e (£?)

Formalized in a KB:
Type: furniture (abcde), desk (ab), chair (cde)
Origin: Sweden (ac), Italy (bde)
Colours: dark (ade), light (bc), grey (a)
Price: 100 (ac), 150 (bd), 250 ({})
Contains: wood ({}), metal (abcde), cotton (d)
Assumption: all this is shared knowledge.
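For concreteness in what follows, here is a sketch of this KB in Python (the dict encoding and the snake_case names for the prices are my own choices, not part of the lecture):

```python
# Shared-knowledge KB: each property maps to its extension,
# i.e., the set of domain objects it is true of.
DOMAIN = {"a", "b", "c", "d", "e"}

KB = {
    "furniture":  {"a", "b", "c", "d", "e"},
    "desk":       {"a", "b"},
    "chair":      {"c", "d", "e"},
    "Swedish":    {"a", "c"},
    "Italian":    {"b", "d", "e"},
    "dark":       {"a", "d", "e"},
    "light":      {"b", "c"},
    "grey":       {"a"},
    "100_pounds": {"a", "c"},
    "150_pounds": {"b", "d"},
    "250_pounds": set(),
    "wooden":     set(),
    "metal":      {"a", "b", "c", "d", "e"},
    "cotton":     {"d"},
}
```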

Violations of …
Manner: *the £100 grey Swedish desk which is made of metal (description of a).
Relevance: The cotton chair is a fire hazard. ?Then why not buy the Swedish chair? (descriptions of d and c respectively).

In fact, there is a second problem with Manner. Consider the following formalization:
Full Brevity: never use more than the minimal number of properties required for identification (Dale 1989).
An algorithm:

Dale 1989:
1. Check whether 1 property is enough.
2. Check whether 2 properties are enough.
… etc., until success (a minimal description is generated) or failure (no description is possible).
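A minimal Python sketch of this procedure (my reconstruction of the idea, not Dale's original code), reusing the KB sketch above:

```python
from itertools import combinations

def full_brevity(referent, domain, kb):
    """Return a smallest set of properties that singles out the referent,
    or None if no distinguishing description exists."""
    props = [p for p, ext in kb.items() if referent in ext]
    for k in range(1, len(props) + 1):       # try 1 property, then 2, ...
        for combo in combinations(props, k):
            satisfiers = set(domain)
            for p in combo:                  # objects satisfying all of combo
                satisfiers &= kb[p]
            if satisfiers == {referent}:     # distinguishing, and minimal by
                return set(combo)            # construction: stop here
    return None

print(full_brevity("d", DOMAIN, KB))         # {'cotton'}: one property suffices
```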

Problem: exponential complexity. In the worst case, this algorithm has to inspect all combinations of properties: n properties give 2^n combinations. (Recall: one grain of rice on square one; twice as many on each subsequent square.) Some algorithms may be faster, but … Theoretical result: any such algorithm must be exponential in the number of properties.

D&R conclude that Full Brevity cannot be achieved in practice. They designed an algorithm that only approximates Full Brevity: the Incremental Algorithm.

Incremental Algorithm (informal):
– Properties are considered in a fixed order: P = ⟨P1, …, Pn⟩.
– A property is included if it is useful: true of the target, false of some distractors.
– Stop when done; so earlier properties have a greater chance of being included (e.g., a perceptually salient property). The order is therefore called a preference order.

r = individual to be described
P = list of properties, in preference order
P_i = a property in P
L = properties in the generated description
(Recall: we're not worried about realization today.)
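The algorithm itself, as a minimal Python sketch reconstructed from the informal description above (not D&R's original formulation; the names r, P and L follow the slide):

```python
def incremental(r, P, domain, kb):
    """Basic Incremental Algorithm: r = referent, P = properties in
    preference order. Returns the chosen properties L, or None."""
    L = []
    C = set(domain) - {r}                 # distractors still to be ruled out
    for prop in P:
        # 'useful': true of the target and false of some distractor
        if r in kb[prop] and C - kb[prop]:
            L.append(prop)
            C &= kb[prop]                 # only these distractors survive
            if not C:                     # referent uniquely identified
                return L
    return None                           # failure; note: no backtracking
```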

P = ⟨furniture (abcde), desk (ab), chair (cde), Swedish (ac), Italian (bde), dark (ade), light (bc), grey (a), £100 (ac), £150 (bd), £250 ({}), wooden ({}), metal (abcde), cotton (d)⟩
Domain = {a,b,c,d,e}. Now describe:
a =
d =
e =

P = ⟨furniture (abcde), desk (ab), chair (cde), Swedish (ac), Italian (bde), dark (ade), light (bc), grey (a), £100 (ac), £150 (bd), £250 ({}), wooden ({}), metal (abcde), cotton (d)⟩
Domain = {a,b,c,d,e}. Now describe:
a = {desk, Swedish}
d = {chair, Italian, £150} (Nonminimal)
e = (Impossible)
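With the KB and incremental sketches above, these outcomes can be reproduced:

```python
P = ["furniture", "desk", "chair", "Swedish", "Italian", "dark", "light",
     "grey", "100_pounds", "150_pounds", "250_pounds", "wooden", "metal",
     "cotton"]

print(incremental("a", P, DOMAIN, KB))  # ['desk', 'Swedish']
print(incremental("d", P, DOMAIN, KB))  # ['chair', 'Italian', '150_pounds']
                                        #   (nonminimal: cotton alone suffices)
print(incremental("e", P, DOMAIN, KB))  # None: distractor d is never removed
```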

[An aside: shared information will be assumed to be complete and uncontroversial. Consider
Speaker: [[Student]] = {a, b, …}
Hearer: [[Student]] = {a, c, …}
Does this make a referable?]

Incremental Algorithm
It's a hill-climbing algorithm: ever better approximations of a successful description. "Incremental" means no backtracking. Not always the minimal number of properties.

Incremental Algorithm
Logical completeness: a unique description is found in finite time if one exists (given reasonable assumptions; see van Deemter 2002).
Computational complexity: assume that testing for usefulness takes constant time. Then worst-case time complexity is O(n_p), where n_p is the number of properties in P.

Better approximation of Full Brevity (D&R 1995): the Attribute + Value model.
Properties are grouped together as in the original example:
Origin: Sweden, Italy, …
Colour: dark, grey, …
Optimization within the set of properties based on the same Attribute.

Incremental Algorithm, using Attributes and Values
r = individual to be described
A = list of Attributes, in preference order
Def: V(i, j) = Value i of Attribute j
L = properties in the generated description

FindBestValue(r, A):
– Find the Values of A that are true of r while removing some distractors. (If none exist, go to the next Attribute.)
– Within this set, select the Value that removes the largest number of distractors.
– If there's a tie, select the most general one.
– If there's still a tie, select an arbitrary one.
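A Python sketch of FindBestValue and the surrounding loop, reconstructed from the prose above (the explicit generality ranking is my own device; the slide does not say how generality is represented):

```python
def find_best_value(r, values, C, kb, generality):
    """Among the Values of one Attribute that are true of r and remove at
    least one distractor, pick the one removing the most distractors;
    break ties by generality (lower rank = more general), then arbitrarily."""
    candidates = [v for v in values if r in kb[v] and C - kb[v]]
    if not candidates:
        return None                       # no useful Value: try next Attribute
    return max(candidates, key=lambda v: (len(C - kb[v]), -generality[v]))

def incremental_av(r, attributes, domain, kb, generality):
    """Incremental Algorithm over Attributes (each a list of its Values),
    Attributes in preference order."""
    L = []
    C = set(domain) - {r}                 # distractors still to be ruled out
    for values in attributes:
        v = find_best_value(r, values, C, kb, generality)
        if v is not None:
            L.append(v)
            C &= kb[v]
            if not C:                     # referent uniquely identified
                return L
    return None
```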

Example: D = {a,b,c,d,f,g}
Type: furniture (abcd), desk (ab), chair (cd)
Origin: Europe (bdfg), USA (ac), Italy (bd)
Describe a: {desk, American} (furniture removes fewer distractors than desk)
Describe b: {desk, European} (European is more general than Italian)
N.B. This disregards relevance, etc.

P.S. Note the similarity with van Rooy & Dekker's semantics of answers. Let A and B be truthful answers to a question; then A is a better answer than B iff
Utility(A) > Utility(B), or
Utility(A) = Utility(B) and B entails A.
(More about this in the next lecture …)

Exercise on logical completeness: construct an example where no description is found, although one exists. Hint: let an Attribute have Values whose extensions overlap.

Example: D = {a,b,c,d,f}
Contains: wood (abe), plastic (acdf)
Colour: grey (ab), yellow (cd)
Describe a: {wood, grey, …} – failure (wood removes more distractors than plastic).
Compare: Describe a: {plastic, grey} – success.
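The failure can be reproduced with the incremental_av sketch above. (Note that e appears in wood's extension on the slide although it lies outside D; intersecting with the distractor set makes this harmless.)

```python
D2 = {"a", "b", "c", "d", "f"}
KB2 = {"wood": {"a", "b", "e"}, "plastic": {"a", "c", "d", "f"},
       "grey": {"a", "b"}, "yellow": {"c", "d"}}
ATTRS = [["wood", "plastic"], ["grey", "yellow"]]  # Contains, then Colour
RANK = {v: 0 for v in KB2}                         # no generality distinctions

print(incremental_av("a", ATTRS, D2, KB2, RANK))
# None: wood beats plastic (removes 3 distractors vs. 1), after which
# grey removes nothing -- although {plastic, grey} would have succeeded.
```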

Complexity of the algorithm
n_d = number of distractors
n_l = number of properties in the description
n_v = number of Values (for all Attributes)
Alternative assessment: O(n_v) (worst-case running time).
According to D&R: O(n_d · n_l) (typical running time).

Minor complication: head nouns. Another way in which human descriptions are nonminimal:
– A description needs a Noun, but not all properties are expressed as Nouns.
– Example: suppose Colour was the most-preferred Attribute, and target = a.

Colours: dark (ade), light (bc), grey (a)
Type: furniture (abcde), desk (ab), chair (cde)
Origin: Sweden (ac), Italy (bde)
Price: 100 (ac), 150 (bd), 250 ({})
Contains: wood ({}), metal (abcde), cotton (d)
Target = a. Describe a: {grey} → "the grey …"? (Not in English.)

D&R's repair: assume that Values of the Attribute Type can be expressed in a Noun. After the core algorithm:
– check whether Type is represented;
– if not, add the best Value of the Type Attribute to the description.
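As code, the repair is a small post-hoc step. In this sketch, ensure_type and the specific-to-general ordering of the Type Values are my assumptions; the slide says only to add the "best" Value of Type:

```python
def ensure_type(r, L, type_values, kb):
    """D&R's repair: if no Type Value made it into L, add one that is
    true of r, even though it removes no distractors."""
    if not any(v in type_values for v in L):
        for v in type_values:            # assumed ordered specific -> general
            if r in kb[v]:
                L.append(v)
                break
    return L

# With Colour most preferred, the core algorithm yields {grey} for a;
# the repair turns this into {grey, desk}: "the grey desk".
print(ensure_type("a", ["grey"], ["desk", "chair", "furniture"], KB))
```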

Versions of Dale and Reiter's Incremental Algorithm have often been implemented. It is still the starting point for many new algorithms (see later lectures). Worth reading!

Limitations of the algorithm
1. Redundancy does not arise for principled reasons, e.g., for
– marking topic changes, etc. (corpus work by Pam Jordan et al.);
– making it easy to find the referent (experimental work by Paraboni et al.; next lecture).

Limitations of the algorithm
2. Targets are individual objects, never sets. What changes when target = {a,b,c}? (Lecture 4)
3. The Incremental Algorithm uses only conjunctions of atomic properties. No negations, disjunctions, etc. (Lecture 4)

Limitations of the algorithm
4. No relations with other objects, e.g., the orange on the table. (Lecture 3)
5. Differences in salience are not taken into account. (Lecture 3)
6. Language realization is disregarded. (Lecture 5)

Discussion: how bad is it for a GRE algorithm to take exponential time?
– More complex types of referring expressions ⇒ the problem becomes even harder.
– Restricting to combinations whose length is less than x ⇒ the problem is no longer exponential.
– Example: descriptions containing at most n properties (Full Brevity).

However:
– The mathematician's view: the structure of the problem shows when no restrictions are imposed.
– What if the input does not conform to these restrictions? (GRE does not control its own input!)

Compare with Description Logic:
– increasingly complex algorithms …
– that tackle larger and larger fragments of logic …
– and whose complexity is conservative.
Question: how do human speakers cope?