Lecture 3: Salience and Relations
Reading: Krahmer and Theune (2002), in Van Deemter and Kibble (Eds.), "Information Sharing: Reference and Presupposition in Language Generation and Interpretation", CSLI Publications.

Leftovers from yesterday
D&R's algorithm embodies the assumption that Content Determination can be done before everything else. (An alternative account comes in Lecture 5.) Some issues:

Leftovers from yesterday
Does Content Determination know which properties can be expressed in the language? The strong form of the assumption: Realization may take any amount of 'space', e.g., '(The treasure can be found...)
– … at the peak of the hill'
– … on the hill; the steep one with lots of green grass'
– (even an entire book)

Leftovers from yesterday
Properties can be context-dependent and vague (e.g., 'steep (hill)').
– In context, the description can 'nail' the target
– GRE algorithms can be expanded to do this
– vague descriptions from crisp input
– L now really becomes a list
These and other extensions: see the web page.

1. Salience in GRE
Before talking about 'proper' GRE, let's briefly talk about category choice. Let every xᵢ be a referring expression:
… x₁ … x₁ … x₂ … x₂ … x₂ … x₁ … x₁ … x₂ …
Definite descriptions are one option among many:

Category choice
Choosing between proper names, pronouns, demonstratives, definite descriptions, etc. Theories about category choice are often studied using corpora, via hypothesis testing or learning. Salience is a key concept, which takes a different form in different theories (e.g., centering theory). Related notions: focus, discourse-old/new, … (e.g., McCoy & Strube 1999; Henschel, Cheng & Poesio 2000).

Most research has focussed on the possibility of pronominal reference. 'Use a pronoun if there is an antecedent in the previous clause, and there is no competing referent' (Dale and Reiter 1995, as discussed by K&Th). This undergenerates pronouns. An example of a more generous account:

Henschel, Cheng & Poesio (2000)
Choose a pronoun if
– the antecedent is realized as subject or is discourse-old, and
– no competing referent is realized as subject or is discourse-old, and
– no competing referent is 'amplified' by an appositive or restrictive relative clause.
Otherwise choose a definite description.
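A minimal sketch of this rule as a decision procedure; the Referent record and its field names are my own illustrative assumptions, not the paper's actual data structures:

from dataclasses import dataclass

@dataclass
class Referent:
    subject: bool            # realized as subject in the previous clause?
    discourse_old: bool      # already given in the discourse?
    amplified: bool = False  # 'amplified' by an appositive or restrictive relative?

def choose_category(antecedent: Referent, competitors: list[Referent]) -> str:
    """Return the category for the next mention of the antecedent's referent."""
    prominent = antecedent.subject or antecedent.discourse_old
    blocked = any(c.subject or c.discourse_old or c.amplified for c in competitors)
    return "pronoun" if prominent and not blocked else "definite description"

# E.g., a subject antecedent with one non-prominent competitor -> pronoun:
print(choose_category(Referent(True, True), [Referent(False, False)]))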

We will largely ignore category choice, focussing on generation of definite descriptions. So far, we have also ignored salience, arguably at our peril...

Salience in GRE
Reiter and Dale (2000), "Building Natural Language Generation Systems": Domain = { elements that are salient enough }.
Krahmer and Theune (2002):
1. This disregards different degrees of salience within the Domain.
2. This fails to reflect that even the least salient object can be referable.

Salience in GRE
1. Suppose D contains many dogs. Still, if my chihuahua is the most salient dog in D, then 'the dog' refers unambiguously to it.
2. If my chihuahua is the least salient object in D, then we might still refer to him (e.g., 'the small ratty creature that's trying to hide behind the chair').

Krahmer and Theune (2002)
Abandon D&R's dichotomy. Assume: 'the N' = 'the most salient N'.
Exercise: Get the Incremental Algorithm to say 'the N' iff N is the most salient N.
Reminder: This is the Incremental Algorithm …

Krahmer and Theune (2002)
(My version): re-interpret Domain as { x : Sal(x) ≥ Sal(r) }, the set of objects at least as salient as the target r.

Example situation (figure): five pieces of furniture a–e — a £100, b £150, c £100, d £150, e £? — some Swedish, some Italian, ranked from most salient to least salient.

Sal_Max = {a,c}, Sal_Mid = {b}, Sal_Min = {d,e}
Type: furniture (abcde), desk (ab), chair (cde)
Origin: Sweden (ac), Italy (bde)
Colours: dark (ade), light (bc), brown (a)
Price: 100 (ac), 150 (bd), 250 ({})
Contains: wood ({}), metal (abcde), cotton (d)
Exercise: Describe a; Describe b; Describe d.

Sal_Max = {a,c}, Sal_Mid = {b}, Sal_Min = {d,e}
Type: furniture (abcde), desk (ab), chair (cde)
Origin: Sweden (ac), Italy (bde)
Colours: dark (ade), light (bc), brown (a)
Price: 100 (ac), 150 (bd), 250 ({})
Contains: wood ({}), metal (abcde), cotton (d)
a: Domain = {a,c}; description = {desk}
b: Domain = {a,b,c}; description = {desk, Italy}
d: Domain = {a,b,c,d,e}; description = {chair, Italy, 150}
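This is the answer to the exercise: restrict the domain to objects at least as salient as the target, then run the ordinary Incremental Algorithm. A minimal sketch (the encoding and function names are mine) that reproduces the three answers above:

# Salience levels and property extensions follow the example above.
SALIENCE = {'a': 3, 'c': 3, 'b': 2, 'd': 1, 'e': 1}

PREFERENCE = [  # preference order: Type, Origin, Colour, Price, Contains
    ('desk', {'a', 'b'}), ('chair', {'c', 'd', 'e'}),
    ('Sweden', {'a', 'c'}), ('Italy', {'b', 'd', 'e'}),
    ('dark', {'a', 'd', 'e'}), ('light', {'b', 'c'}), ('brown', {'a'}),
    ('100', {'a', 'c'}), ('150', {'b', 'd'}), ('250', set()),
    ('wood', set()), ('metal', {'a', 'b', 'c', 'd', 'e'}), ('cotton', {'d'}),
]

def describe(r):
    # K&T: the domain is the set of objects at least as salient as r
    domain = {x for x in SALIENCE if SALIENCE[x] >= SALIENCE[r]}
    distractors = domain - {r}
    description = []
    for prop, ext in PREFERENCE:
        if r in ext and not distractors <= ext:  # Useful: rules out a distractor
            description.append(prop)
            distractors &= ext
            if not distractors:
                return description
    return None  # failure: no distinguishing description in this domain

print(describe('a'))  # ['desk']
print(describe('b'))  # ['desk', 'Italy']
print(describe('d'))  # ['chair', 'Italy', '150']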

Krahmer & Theune are noncommittal about how salience is determined. Compare the Praguian/centering account. The focus is on textual salience:
… x₁ … x₁ … x₂ … x₂ … x₂ … x₁ … x₁ … x₂ …
Salience has a physical component as well (e.g., 'the door' = the nearest door).

Pronouns
K&Th explore how their account may be generalized to generate pronouns:
– 'it/he/she' = 'the object' (etc.)
– Given their account, this means 'the most salient object'.
The predictions look OK, though the account does not seem to allow antecedents beyond the previous clause.

Pronouns
Example: 'The white chihuahua₁ was chasing the cat₂. It₁/the cat₂ ran fast.'
K&Th: Perhaps it's not enough to be slightly more salient than your competitors:
– 'The white chihuahua₁ was chasing the cat₂. The chihuahua₁/the cat₂ ran fast.'
– 'The white chihuahua₁ was eating. It₁ was eating a cat.'

K&Th discuss two other extensions:
– Bridging (e.g., 'the car … the motor')
– Relational properties
Since bridging involves a relation, let us start with relations.

2. Relational properties
Tuesday's lecture: Some properties involve a relation with another object, e.g.,
Origin: Sweden (ac), Italy (bde)
From(a, Sweden)
Recursion requires reification: 'x comes from the country where y lives'.

Dale & Haddock (1991)
D&H modelled 2-place relations in GRE, from a constraint satisfaction perspective, e.g.,
Constraints: {Orange(a), Orange(b), Table(c), On(a,c)}
Problem: construct a set of atoms that has r as the only value of a designated variable:
{Orange(x), Table(y), On(x,y)}

D&H accumulate atoms until the target r is identified. This can be done in any order (cf. Dale and Reiter 1995). D&H choose a 'greedy' order: at each step they add an atom that removes the maximum number of distractors (sketched below).
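A sketch of that greedy loop on the orange-and-table scene. The brute-force solver and the seeding with the head noun are my simplifications; D&H's own system propagates constraints in a network:

from itertools import product

# Ground facts of the example scene
FACTS = {('Orange', ('a',)), ('Orange', ('b',)),
         ('Table', ('c',)), ('On', ('a', 'c'))}
ENTITIES = {'a', 'b', 'c'}

# Candidate description atoms; 'x' is the designated (target) variable
ATOMS = [('Orange', ('x',)), ('Table', ('y',)), ('On', ('x', 'y'))]

def x_values(atoms):
    """Values of 'x' in assignments satisfying every atom (brute force)."""
    variables = sorted({'x'} | {v for _, args in atoms for v in args})
    values = set()
    for combo in product(ENTITIES, repeat=len(variables)):
        binding = dict(zip(variables, combo))
        if all((pred, tuple(binding[v] for v in args)) in FACTS
               for pred, args in atoms):
            values.add(binding['x'])
    return values

def greedy_describe(target, head):
    chosen = [head]  # start from the head noun of the target
    remaining = [a for a in ATOMS if a != head]
    while x_values(chosen) != {target}:
        # keep only atoms that remain true of the target
        candidates = [a for a in remaining if target in x_values(chosen + [a])]
        if not candidates:
            return None  # failure: target cannot be distinguished
        # greedy step: add the atom that removes the most distractors
        best = max(candidates,
                   key=lambda a: len(x_values(chosen) - x_values(chosen + [a])))
        chosen.append(best)
        remaining.remove(best)
    return chosen

print(greedy_describe('a', ('Orange', ('x',))))
# [('Orange', ('x',)), ('On', ('x', 'y'))] -- only a can instantiate x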

Exercise (relations)
Greediness: you always add an atom that removes the maximum number of distractors. Construct an example showing that this approach is not logically complete.

There are many later accounts, e.g., by Horacek (also Krahmer et al.). Krahmer and Theune's paper contains an alternative model that we will use for expository purposes:
– one of the 'extensions' in K&Th
– incremental rather than greedy

Krahmer and Theune (2002)
K&Th mix Content Determination with Syntactic Realization and Lexical Choice. We will continue to focus on Content Determination. We will make some other simplifications:

Simplifications
Unlike K&Th,
– we forget about salience
– property P instead of ⟨…⟩
– no indefinite descriptions
– nothing about contrastive stress. (Reminder: NLG is relevant to speech!)

Krahmer and Theune (2002)
The preference ordering P contains ordinary properties and relations: x:chair(x), x:from(x,Italy). Properties precede relations; in other respects they are treated alike. (Alternative: Mariet Theune's thesis.)

D&R, simplified:
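The slide's figure is not part of the transcript; as a sketch (my own rendering), the simplified algorithm that the next slides modify looks like this:

# Sketch of the simplified Incremental Algorithm: r is the target,
# P a preference-ordered list of (property, extension) pairs,
# C the set of distractors.
def ref(r, P, C):
    L = set()  # the description under construction
    for prop, ext in P:
        if r in ext and not C <= ext:  # Useful: rules out a distractor
            L.add(prop)
            C = C & ext
            if not C:
                return L  # success: r is the only object satisfying L
    return None  # failure: distractors remain after all properties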

Changes to the incremental algorithm
The function Ref now needs to become recursive. Whether a property is Useful may depend on the properties already present in L. Suppose you want to identify x; this makes properties of y irrelevant … unless L contains a relation between x and y. This leads to the following changes:

Changes to the incremental algorithm
1. Make L an argument of Useful and Ref.
2. Record in L
   – the properties that were found useful
   – the things of which they were true
3. Useful(P, r, P, L) ⇔def Confusables_r(L ∪ {P}) ⊂ Confusables_r(L)

Example
P = ⟨x:dog(x), x:in(x,h1), x:red(x), …⟩, r = d1

Example (steps)
Step 1: r = d1, P = x:dog(x)

Example (steps)
Step 1: r = d1, P = x:dog(x)
Step 2: r = d1, P = x:in(x,h1) (success if h1 can be identified)

Example (steps)
Step 1: r = d1, P = x:dog(x)
Step 2: r = d1, P = x:in(x,h1) (success if h1 can be identified)
Step 3 (recursion): r = h1, P = x:red(x) (success)

Example (details)
Step 1: r = d1, P = dog(x)
d1 ∈ [[P]]
Conf_d1(L ∪ {P}) ⊂ Conf_d1(L)
(Therefore, P is a useful addition to L)

Example (details)
Step 2: r = d1, P = in(x,h1)
d1 ∈ [[P]]
Conf_d1(L ∪ {P}) ⊂ Conf_d1(L)

Example (details)
Step 3 (recursion): r = h1, P = red(x)
h1 ∈ [[P]]
Conf_h1(L ∪ {P}) ⊂ Conf_h1(L)
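Putting changes 1–3 together, here is a runnable sketch of the recursive Ref on the dog/doghouse example. The encoding, the names, and the simplified distractor sets are my own; K&Th's actual definitions differ in detail:

# Two dogs and two doghouses: d1 sits in the red house h1, d2 in h2.
DOMAIN = {'d1', 'd2', 'h1', 'h2'}
PROPS = {'dog': {'d1', 'd2'}, 'doghouse': {'h1', 'h2'}, 'red': {'h1'}}
RELS = {'in': {('d1', 'h1'), ('d2', 'h2')}}

def extension(p):
    """Objects satisfying a 1-place property or an instantiated relation."""
    if p[0] == 'prop':
        return PROPS[p[1]]
    _, rel, relatum = p  # e.g. ('rel', 'in', 'h1')
    return {a for (a, b) in RELS[rel] if b == relatum}

def ref(r, P, C, L):
    """L records the useful properties and the things they were true of."""
    for p in P:
        ext = extension(p)
        if r in ext and not C <= ext:  # Useful: rules out a distractor
            L.append((p, r))
            C = C & ext
            if p[0] == 'rel':
                # recursive call: the relatum must itself be identified
                # (no guard here against the loops discussed on a later slide)
                relatum = p[2]
                if ref(relatum, P, DOMAIN - {relatum}, L) is None:
                    return None  # simplification: no backtracking
            if not C:
                return L
    return None

P = [('prop', 'dog'), ('rel', 'in', 'h1'), ('prop', 'red')]
print(ref('d1', P, DOMAIN - {'d1'}, []))
# [(('prop','dog'),'d1'), (('rel','in','h1'),'d1'), (('prop','red'),'h1')]
# i.e. Steps 1-3 above: 'the dog in the red (doghouse)'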

Example 2
P = ⟨…⟩, r = d1
Failure during Ref(h1, P, C, L), where L = ⟨…⟩

Example 3
P = ⟨…⟩, r = d1
Success through mutual identification: 'the dog in the doghouse' (D&H)

Problems with algorithms like this:
Not very elegant; easy to make errors. (Worse with relations of larger arity.)
Risk of loops: 'the orange on the table under the orange on the table, …'.
Variant proposals:
– Krahmer et al. (2001): labelled directed graphs
– Gardent (2002): constraint satisfaction
– etc.

A more general problem
Any preference order will sometimes have strange results. Exercise: construct an example where putting 1-place properties first causes an excessively lengthy description.

Complexity
The theoretical worst-case complexity of GRE with relations is exponential. In this algorithm:
– the number of loops is bounded by the number of (n-ary) properties
– whenever a relation is used, another recursive call of Ref may be necessary

A red thread
'Simple' GRE produces plausible descriptions at reasonable speed. But when relations are added, fairly awful descriptions are generated slowly. This will become worse when other complications are taken into account: more options ⇒ more problems (an 'embarrassment of riches').

Combining relations and salience: Bridging
{ trailer(t1), trailer(t2), car(c1), car(c2), behind(t1,c1) }
Sal(c1) > Sal(c2), Sal(t1) > Sal(t2):
– 'The trailer behind the car'
– 'The trailer'

Bridging (etc.)
But … what if
{ trailer(t1), trailer(t2), car(c1), car(c2), behind(t1,c1), behind(t2,c2) }
Sal(t2) > Sal(t1), Sal(c2) < Sal(c1)?
Can we still say 'the trailer behind the car', or 'the trailer'?

The problem: Relations involve more than one object, and these objects can have different degrees of salience. It is unclear how this should affect the algorithm. In fact, this is a very common problem: different extensions of GRE combine in nontrivial ways.

Combining salience and relations: Paraboni and Van Deemter (2002)
GRE algorithms tend to be applied to 'flat' domains. Let's see what happens in a hierarchically ordered domain. Before doing this, let us step back …

Making references easy
Consider these descriptions:
1. 'the woman with red hair' (easy to find)
2. 'the woman with green eyes' (difficult to find)
The Incremental Algorithm can deal with this by making Hair-Colour more preferred than Eyes-Colour.

Making references easy: the case of hierarchically ordered domains
Now consider these descriptions:
1. '… no. 2068, Lincoln Street, Brighton'
2. '… no. 2068, Brighton'
Determining the sense is faster with (2); determining the reference is faster with (1).

Hierarchically ordered domains can be used to highlight some interesting issues. First issue: Salience can be determined by factors other than discourse structure.

Example (figure): To describe the TARGET, it's enough to distinguish it from the distractors in building 1. So: here 'the copier' is specific enough.

So far, K&Th's account applies, provided salience is measured adequately:
SAL(tree(parent(d))) = max
SAL(tree(parent(parent(d)))) = max − 1
…
Given a starting point d, the focus domain is the smallest subtree that contains d and r.
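A small sketch of this salience measure over an illustrative hierarchy; the PARENT table and all names are assumptions for illustration, loosely following the copier-and-buildings example:

# Copiers inside buildings inside a complex.
PARENT = {'copier1': 'building1', 'copier2': 'building2',
          'building1': 'complex', 'building2': 'complex',
          'complex': None}

def ancestors(node):
    """node, parent(node), parent(parent(node)), ... up to the root."""
    chain = [node]
    while PARENT[chain[-1]] is not None:
        chain.append(PARENT[chain[-1]])
    return chain

def salience(d, x, max_sal=10):
    """SAL: objects in tree(parent(d)) get max, one less per step further up."""
    x_ups = set(ancestors(x))
    for steps_up, a in enumerate(ancestors(d)[1:]):
        if a in x_ups:  # x lies inside the subtree rooted at a
            return max_sal - steps_up
    return None  # x is not in the same tree as d

def focus_domain_root(d, r):
    """Root of the smallest subtree containing both d and r."""
    d_ups = ancestors(d)
    return next(a for a in ancestors(r) if a in d_ups)

print(salience('copier1', 'copier1'))           # 10: same building as d
print(salience('copier1', 'copier2'))           # 9: one level further up
print(focus_domain_root('copier1', 'copier2'))  # 'complex'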

So far, hierarchy does not pose any big problems. But let’s consider some possible preference orders ….

Exercise: if this is the situation (figure), then which properties will be chosen to identify the TARGET?

1. [Complex preferred over Building]: 'the copier in the Medical complex' (−unique). This is not optimally helpful.

2. [Building preferred over Complex]: 'the copier in building 2' (−unique). This seems actually infelicitous. No preference ordering gives accurate results.

Issues
Issue 1: Salience can be determined by non-textual factors. Our example: structural 'distance' between description and target.
Issue 2: Contradicting incrementality, redundancy can be crucial. E.g., 'the copier in building 2 of the Medical Complex'. Our example: if you can reduce the search space strongly by one extra property, then do it! (Experimentally validated.)

Issues
Issue 3: Mutual identification is not always allowed. E.g., 'the copier in building 2'. Our example: D&H's approach assumes that all referents are highly salient, and all properties/relations are highly transparent.

Ivandré Paraboni's thesis
Documents are structured domains. Generating references to parts of texts or documents, e.g.,
– 'see figure 3 in section 5'
– 'the issues discussed in the Introduction'
When to generate such references, and how to do it.

Back to the issue of complexity
Salience of objects helps reduce the number of distractors. Might properties also be subject to salience (reducing the size of P)? What is the role of incrementality in GRE? (Next lecture.)

Next lecture
Theoretical departure: "What is NLG anyway?" (Shieber 1993). Another way in which referring expressions can go beyond conjunctions of atomic properties: Boolean descriptions (Van Deemter 2002).