Lecture 3: Salience and Relations Reading: Krahmer and Theune (2002), in Van Deemter and Kibble (Eds.) “Information Sharing: Reference and Presupposition in Language Generation and Interpretation”, CSLI Publications
Leftovers from yesterday D&R’s algorithm embodies the assumption that Content Determination can be done before everything else. Alternative account: Lecture 5. Some issues:
Leftovers from yesterday Does CD know which properties can be expressed in the language? Strong form of the assumption: Realization may take any amount of ‘space’, e.g., ‘(The treasure can be found...)’ –‘… at the peak of the hill’ –‘… on the hill; the steep one with lots of green grass’ –(even an entire book)
Leftovers from yesterday Properties can be context-dependent and vague (e.g., ‘steep (hill)’). –In context, the description can ‘nail’ the target –GRE algorithms can be expanded to do this –vague descriptions from crisp input –L now really becomes a list These and other extensions: see web page
1. Salience in GRE Before talking about ‘proper’ GRE, let’s briefly talk about category choice. Let every x_i be a referring expression: [Figure: schematic text in which the referring expressions x_1 and x_2 recur] Definite descriptions are one option among many:
Category choice Choosing between proper names, pronouns, demonstratives, definite descriptions, etc. Theories about category choice are often studied using corpora, via hypothesis testing or learning. Salience is a key concept, which takes a different form in different theories (e.g., centering theory) Related notions: focus, discourse-old/new,... (e.g., McCoy & Strube 1999; Henschel, Cheng & Poesio 2000)
Most research has focussed on the possibility of pronominal reference. ‘Use pronoun if there is an antecedent in the previous clause, and there is no competing referent’ (Dale and Reiter 1995) (K&Th). This undergenerates pronouns. Example of a more generous account:
Henschel, Cheng & Poesio (2000) Choose pronoun if –antecedent is realized as subject or discourse-old & –no competing referent is realized as subject or discourse-old & –no competing referent is ‘amplified’ by appositive or restrictive relative clause Otherwise choose definite description
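The rule above can be sketched in Python. This is only an illustration of the three conditions as stated on the slide: the `Referent` class, its field names and `choose_category` are hypothetical, and the sketch assumes that grammatical role, discourse status and ‘amplification’ have already been annotated for the previous unit.

```python
from dataclasses import dataclass


@dataclass
class Referent:
    """How a referent was realized in the previous unit (hypothetical annotation)."""
    name: str
    subject: bool = False        # realized as subject
    discourse_old: bool = False  # already mentioned earlier in the discourse
    amplified: bool = False      # carries an appositive or restrictive relative clause


def prominent(referent):
    """Realized as subject or discourse-old."""
    return referent.subject or referent.discourse_old


def choose_category(antecedent, competitors):
    """Return 'pronoun' only if all three conditions of the rule hold."""
    if antecedent is None or not prominent(antecedent):
        return "definite description"
    for competitor in competitors:
        if prominent(competitor) or competitor.amplified:
            return "definite description"
    return "pronoun"


chihuahua = Referent("chihuahua", subject=True)
cat = Referent("cat")
print(choose_category(chihuahua, [cat]))  # the subject has no prominent competitor
print(choose_category(cat, [chihuahua]))  # the antecedent itself is not prominent
```

Which referents count as ‘competing’ (e.g., same gender and number) is left to the caller here.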
We will largely ignore category choice, focussing on generation of definite descriptions. So far, we have also ignored salience, arguably at our peril...
Salience in GRE Reiter and Dale (2000) “Building Natural Language Generation Systems”: Domain = { elements that are salient enough } Krahmer and Theune (2002): 1.This disregards different degrees of salience within the Domain 2.This fails to reflect that even the least salient object can be referable
Salience in GRE 1. Suppose D contains many dogs. Still, if my chihuahua is the most salient dog in D then ‘the dog’ refers unambiguously to it. 2. If our chihuahua is the least salient object in D then we might still refer to him (e.g., ‘the small ratty creature that’s trying to hide behind the chair’).
Krahmer and Theune (2002) Abandon D&R’s dichotomy. Assume: ‘the N’ = ‘the most salient N’. Exercise: Get the Incremental Algorithm to say ‘the N’ iff N is the most salient N. Reminder: This is the Incremental Algorithm …
Krahmer and Theune (2002) (My version): re-interpret Domain as
Example situation [figure]: a, £100; b, £150; c, £100; d, £150; e, £?. Figure labels: Swedish, Italian; most salient, least salient.
Sal Max = {a,c}, Sal Mid = {b}, Sal Min = {d,e}
Type: furniture (abcde), desk (ab), chair (cde)
Origin: Sweden (ac), Italy (bde)
Colours: dark (ade), light (bc), brown (a)
Price: 100 (ac), 150 (bd), 250 ({})
Contains: wood ({}), metal (abcde), cotton (d)
Exercise: Describe a; Describe b; Describe d
Answers:
a: Domain = {a,c}; description = {desk}
b: Domain = {a,b,c}; description = {desk, Italy}
d: Domain = {a,b,c,d,e}; description = {chair, Italy, 150}
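A minimal Python sketch of this exercise. It assumes that Domain is re-interpreted as the set of objects at least as salient as the target, which is consistent with the domains listed for a, b and d. The numeric salience levels (2, 1, 0), the names `SALIENCE`, `PROPERTIES`, `domain_for` and `describe`, and the flattening of attributes into one ordered property list are all assumptions made for illustration; the preference order is simply the order in which the attributes are listed.

```python
# Salience levels: Sal Max = 2, Sal Mid = 1, Sal Min = 0 (numbers are an assumption).
SALIENCE = {"a": 2, "c": 2, "b": 1, "d": 0, "e": 0}

# Properties in preference order, each with its extension.
PROPERTIES = [
    ("furniture", set("abcde")), ("desk", set("ab")), ("chair", set("cde")),
    ("Sweden", set("ac")), ("Italy", set("bde")),
    ("dark", set("ade")), ("light", set("bc")), ("brown", set("a")),
    ("100", set("ac")), ("150", set("bd")), ("250", set()),
    ("wood", set()), ("metal", set("abcde")), ("cotton", set("d")),
]


def domain_for(target):
    """Objects that are at least as salient as the target."""
    return {x for x, level in SALIENCE.items() if level >= SALIENCE[target]}


def describe(target):
    """Incremental selection: keep a property if it is true of the target
    and removes at least one remaining distractor."""
    distractors = domain_for(target) - {target}
    description = []
    for name, extension in PROPERTIES:
        if not distractors:
            break
        if target in extension and distractors - extension:
            description.append(name)
            distractors = distractors & extension
    return description if not distractors else None


for target in "abd":
    print(target, sorted(domain_for(target)), describe(target))
```

Because less salient objects never enter the Domain of a more salient target, ‘the desk’ suffices for a even though b is also a desk.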
Krahmer & Theune are noncommittal about how salience is determined. Compare Praguian/centering account. Focus on textual salience: [Figure: schematic text in which the referring expressions x_1 and x_2 recur] Salience has a physical component as well (e.g., ‘the door’ = the nearest door)
Pronouns K&Th explore how their account may be generalized to generate pronouns: –‘it/he/she’ = ‘the object’ (etc.) –Given their account, this means ‘the most salient object’. Predictions look OK, though the account does not seem to allow antecedents beyond the previous clause.
Pronouns Example: ‘The white chihuahua_1 was chasing the cat_2. It_1 / the cat_2 ran fast’. K&Th: Perhaps it is not enough to be only slightly more salient than your competitors: –‘The white chihuahua_1 was chasing the cat_2. The chihuahua_1 / the cat_2 ran fast’. –‘The white chihuahua_1 was eating. It_1 was eating a cat’.
K&Th discuss two other extensions: –Bridging (e.g., ‘the car …. the motor’) –Relational properties Since bridging involves a relation, let us start with relations.
2. Relational properties Tuesday’s lecture: Some properties involve a relation with another object, e.g., Origin: Sweden (ac), Italy (bde) From (a,Sweden) Recursion requires reification: ‘ x comes from the country where y lives’
Dale & Haddock (1991) D&H modelled 2-place relations in GRE Constraint satisfaction perspective, e.g., Constraints: {Orange(a), Orange(b), Table(c), On(a,c)} Problem: construct sets of atoms that have r as the only value of a designated variable: {Orange(x), Table(y), On(x,y)}
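The constraint-satisfaction view can be illustrated with a brute-force Python check. This is not D&H's procedure; it only computes which values the designated variable can take for a given set of atoms, using the facts on the slide. The names `FACTS`, `ENTITIES` and `values_of` are hypothetical, and every argument of an atom in a description is treated as a variable.

```python
from itertools import product

# Facts from the slide: two oranges, one table, and orange a is on table c.
FACTS = {("Orange", "a"), ("Orange", "b"), ("Table", "c"), ("On", "a", "c")}
ENTITIES = {"a", "b", "c"}


def values_of(variable, atoms):
    """All values of `variable` under some assignment that makes every atom a fact."""
    variables = sorted({term for atom in atoms for term in atom[1:]})
    values = set()
    for combo in product(sorted(ENTITIES), repeat=len(variables)):
        binding = dict(zip(variables, combo))
        grounded = [(atom[0], *(binding[t] for t in atom[1:])) for atom in atoms]
        if all(fact in FACTS for fact in grounded):
            values.add(binding[variable])
    return values


print(values_of("x", [("Orange", "x")]))
print(values_of("x", [("Orange", "x"), ("Table", "y"), ("On", "x", "y")]))
```

With only Orange(x), both a and b remain possible; adding Table(y) and On(x,y) leaves a as the only value of x, so the set of atoms identifies a.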
D&H accumulate atoms until the target r is identified. This can be done in any order (cf., Dale and Reiter 1995) D&H choose a ‘greedy’ order: adding atoms that remove maximum number of distractors.
Exercise (relations) Greediness: you always add an atom that removes the maximum number of distractors. Construct an example that shows this approach not to be logically complete.
Many later accounts, e.g., by Horacek, (also Krahmer et al.) Krahmer and Theune’s paper contains an alternative model that we will use for expository purposes –One of the ‘extensions’ in K&Th –Incremental rather than greedy
Krahmer and Theune (2002) K&Th mix Content Determination with Syntactic Realization and Lexical Choice. We will continue to focus on Content Determination. We will make some other simplifications:
Simplifications Unlike K&Th, –We forget about salience –property P instead of –No indefinite descriptions. –Nothing about contrastive stress. (Reminder: NLG is relevant to speech!)
Krahmer and Theune (2002) Preference ordering P contains ordinary properties and relations: x:chair(x), x:from(x,Italy) Properties precede relations. In other respects they are treated alike. (Alternative: Mariet Theune’s thesis)
D&R, simplified:
Changes to incremental algorithm This function, Ref, now needs to become recursive. Whether a property is Useful may depend on the properties already present in L Suppose you want to identify x. This makes properties of y irrelevant …. unless L contains a relation between x and y This leads to the following changes:
Changes to incremental algorithm 1. Make L an argument of Useful and Ref. 2. Record in L - the properties that were found useful - the things of which they were true 3. $\mathrm{Useful}(P, r, \mathcal{P}, L) \;=_{\mathrm{def}}\; \mathrm{Confusables}_r(L \cup \{P\}) \subset \mathrm{Confusables}_r(L)$
Example P = r = d1
Example (steps) Step 1: r = d1 P = x:dog(x) Step 2: r = d1 P = x:in(x,h1) (Success if h1 can be identified) Step 3 (recursion): r = h1 P = x:red(x) (Success)
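Example (details)
Step 1: r = d1, P = dog(x). $d1 \in [[P]]$ and $\mathrm{Conf}_{d1}(L \cup \{P\}) \subset \mathrm{Conf}_{d1}(L)$. (Therefore, P is a useful addition to L)
Step 2: r = d1, P = in(x,h1). $d1 \in [[P]]$ and $\mathrm{Conf}_{d1}(L \cup \{P\}) \subset \mathrm{Conf}_{d1}(L)$.
Step 3 (recursion): r = h1, P = red(x). $h1 \in [[P]]$ and $\mathrm{Conf}_{h1}(L \cup \{P\}) \subset \mathrm{Conf}_{h1}(L)$.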
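The three steps can be traced with a small recursive sketch in Python. It is a simplification, not K&Th's algorithm: the domain (two dogs, two doghouses, one of them red) is hypothetical because the slide's own example domain is not shown, each relatum is described using only its own properties, and a relation is skipped if its other argument is already being described. That loop guard means mutual identification is not supported here. All names (`FACTS`, `PREFERENCE`, `confusables`, `ref`) are assumptions.

```python
# Hypothetical domain: dog d1 is in doghouse h1, and h1 is the only red object.
FACTS = {
    ("dog", "d1"), ("dog", "d2"),
    ("doghouse", "h1"), ("doghouse", "h2"),
    ("red", "h1"),
    ("in", "d1", "h1"),
}
ENTITIES = {"d1", "d2", "h1", "h2"}
PREFERENCE = ["dog", "doghouse", "red", "in"]  # properties precede relations


def confusables(target, atoms):
    """Entities that make every atom true when substituted for the target."""
    def swap(atom, entity):
        return (atom[0],) + tuple(entity if t == target else t for t in atom[1:])
    return {e for e in ENTITIES if all(swap(a, e) in FACTS for a in atoms)}


def ref(target, chosen=None):
    """Return {entity: atoms} identifying the target, or None on failure."""
    chosen = dict(chosen or {})
    chosen[target] = []
    for predicate in PREFERENCE:
        for fact in sorted(f for f in FACTS if f[0] == predicate and target in f[1:]):
            mine = chosen[target]
            if confusables(target, mine) == {target}:
                return chosen
            # Useful: adding the atom must strictly shrink the confusable set.
            if not confusables(target, mine + [fact]) < confusables(target, mine):
                continue
            trial = {**chosen, target: mine + [fact]}
            for other in fact[1:]:
                if other == target:
                    continue
                if other in trial:  # loop guard: relatum is already being described
                    trial = None
                    break
                trial = ref(other, trial)  # recursion: the relatum needs a description
                if trial is None:
                    break
            if trial is not None:
                chosen = trial
    return chosen if confusables(target, chosen[target]) == {target} else None


print(ref("d1"))
```

For d1 the sketch selects dog, then in(x,h1), then recurses on h1 and selects doghouse and red, which corresponds to ‘the dog in the red doghouse’. For d2 it returns None, since no property or relation separates it from d1.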
Example 2 P = r = d1 Failure during REF(h1,P,C,L), where L =
Example 3 P = r = d1 Success through mutual identification: ‘The dog in the doghouse’ (D&H)
Problems with algorithms like this: Not very elegant; easy to make errors. (Worse with relations of larger arity.) Risk of loops: ‘The orange on the table under the orange on the table,...’. Variant proposals: – Krahmer et al. (2001): labelled directed graphs – Gardent (2002): constraint satisfaction – Etc.
A more general problem Any preference order will sometimes have strange results. Exercise: construct example where putting 1-place properties first causes an excessively lengthy description.
Complexity Theoretical worst-case complexity of GRE + relations is exponential. This algorithm: –Number of loops is bounded by number of properties (n-ary). –Whenever a relation is used, another recursive call of Ref may be necessary.
A red thread ‘Simple’ GRE produces plausible descriptions at reasonable speed. But, when relations are added, fairly awful descriptions are generated slowly. This will become worse when other complications are taken into account: more options, more problems (‘embarrassment of riches’)
Combining relations and salience: Bridging {trailer(t1), trailer(t2), car(c1), car(c2), behind(t1,c1)} Sal(c1)>Sal(c2), Sal(t1)>Sal(t2): –‘The trailer behind the car’ –‘The trailer’
Bridging (etc.) But … what if {trailer(t1), trailer(t2), car(c1), car(c2), behind(t1,c1), behind(t2,c2)} Sal(t2) > Sal(t1), Sal(c2) < Sal(c1) Can we still say ‘The trailer behind the car’? ‘The trailer’?
The problem: Relations involve more than one object These objects can have different degrees of salience. It is unclear how this should affect the algorithm. In fact, this is a very common problem: Different extensions of GRE combine in nontrivial ways.
Combining salience and relations: Paraboni and Van Deemter (2002) GRE algorithms tend to be applied to ‘flat’ domains. Let’s see what happens in a hierarchically ordered domain. Before doing this, let us step back...
Making references easy Consider these descriptions: 1.‘the woman with red hair’ (easy to find) 2.‘the woman with green eyes’ (difficult to find) Incremental Algorithm can deal with this by making Hair-Colour more preferred than Eyes-Colour
Making references easy: the case of hierarchically ordered domains Now consider these descriptions: 1.‘... no Lincoln Street, Brighton’ 2.‘... no. 2068, Brighton’ Determining the sense is faster with (2); Determining the reference is faster with (1).
Hierarchically ordered domains can be used to highlight some interesting issues. First issue: Salience can be determined by factors other than discourse structure.
Example: To describe TARGET, it’s enough to distinguish it from distractors in building 1 So: Here ‘the copier’ is specific enough
So far, K&Th’s account applies, provided salience is measured adequately: SAL(tree(parent(d))) = max; SAL(tree(parent(parent(d)))) = max-1; … Given a starting point d, the focus domain is the smallest subtree that contains d and r.
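A small Python sketch of this salience measure and of the focus domain. The hierarchy below (a university with two complexes, three buildings and three copiers), the value of `MAX_SALIENCE` and the function names are all hypothetical; the starting point d is taken to be an office in building 1.

```python
# Hypothetical hierarchy: each node maps to its parent (None for the root).
PARENT = {
    "university": None,
    "medical_complex": "university",
    "arts_complex": "university",
    "building1": "medical_complex",
    "building2": "medical_complex",
    "building3": "arts_complex",
    "office": "building1",
    "copier_a": "building1",
    "copier_b": "building2",
    "copier_c": "building3",
}
MAX_SALIENCE = 10  # arbitrary value standing in for 'max'


def ancestors(node):
    """The node itself, then its parent, grandparent, ... up to the root."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def focus_domain(d, r):
    """Root of the smallest subtree that contains both d and r."""
    above_r = set(ancestors(r))
    return next(a for a in ancestors(d) if a in above_r)


def salience(x, d):
    """max for objects in tree(parent(d)), max-1 in tree(parent(parent(d))), ..."""
    above_x = set(ancestors(x))
    for k, ancestor in enumerate(ancestors(d)[1:]):
        if ancestor in above_x:
            return MAX_SALIENCE - k
    return None  # only reached if d is the root


for copier in ("copier_a", "copier_b", "copier_c"):
    print(copier, salience(copier, "office"), focus_domain("office", copier))
```

From the office, the copier in the same building gets the highest salience, so ‘the copier’ picks it out under K&Th’s reading of ‘the N’ as ‘the most salient N’.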
So far, hierarchy does not pose any big problems. But let’s consider some possible preference orders ….
Exercise: if this is the situation, then which properties will be chosen to identify the TARGET?
1.[Complex preferred over Building]: ‘the copier in the Medical complex’ (-unique) This is not optimally helpful.
2.[Building preferred over Complex]: ‘the copier in building 2’ (-unique) This seems actually infelicitous. No preference ordering gives accurate results.
Issues Issue 1: Salience can be determined by non-textual factors. Our example: structural ‘distance’ between Description and Target Issue 2: Contradicting incrementality, redundancy can be crucial. E.g., ‘the copier in building 2 of the Medical Complex’ Our example: if you can reduce the search space strongly by one extra property then do it! (Experimentally validated.)
Issues Issue 3: Mutual identification is not always allowed. E.g., ‘the copier in building 2’. Our example: D&H’s approach assumes that all referents are highly salient, and all properties/relations are highly transparent.
Ivandré Paraboni’s thesis Documents are structured domains. Generating references to parts of texts or documents. E.g., –‘see figure 3 in section 5’, –‘the issues discussed in the Introduction’ When to generate such references. How to do it.
Back to the issue of complexity Salience of objects helps to reduce the number of distractors. Might properties also be subject to salience (reducing the size of P)? What is the role of incrementality in GRE? (Next lecture)
Next lecture Theoretical departure: “What is NLG anyway” (Shieber 1993). Another way in which referring expressions can go beyond conjunction of atomic properties: Boolean descriptions (Van Deemter 2002).