Generation of Referring Expressions (GRE) The Incremental Algorithm (IA) Dale & Reiter (1995)

The task: GRE
NLG can have different kinds of inputs:
- Flat data (collections of atoms, e.g., in the tables of a database)
- Logically complex data
In both cases, unfamiliar constants may be used, and this is sometimes unavoidable.

No familiar constant available:
1. The referent has a familiar name, but it's not unique, e.g., John Smith
2. The referent has no familiar name: trains, furniture, trees, atomic particles, … (In such cases, databases use database keys, e.g., Smith$73527$, TRAIN-3821)
3. Similar: sets of objects.

Natural languages are too economical to have a proper name for everything.
Names may not even be the most appropriate choice.
So speakers and NLG systems have to invent ways of referring to things, e.g., "the 7:38 Trenton express".

Older work on GRE
- Winograd (1972): the SHRDLU system
- and especially Appelt (1985): the KAMP system
These try to understand reference as part of speech acts: How can REs sometimes add information? Why can RE1 be more relevant than RE2?
Dale and Reiter isolate GRE as a separate task, and focus on simple cases.

Dale & Reiter: the best description fulfils the Gricean maxims.
- (Quality:) list properties truthfully
- (Quantity:) list sufficient properties to allow the hearer to identify the referent, but not more
- (Relevance:) use properties that are of interest in themselves *
- (Manner:) be brief
* Slightly different from D&R 1995

D&R's expectation: violation of a maxim leads to implicatures.
For example, [Quantity] "the pitbull" (when there is only one dog).
There's just one problem: …

… people don't speak this way.
For example,
- [Quantity] "the red chair" (when there is only one red object in the domain)
- [Quantity] "I broke my arm" (when I have two)
General: empirical work shows much redundancy.
Similar for other maxims, e.g., [Quality] "the man with the martini" (Donnellan)

Consider the following formalization:
Full Brevity (FB): never use more than the minimal number of properties required for identification (Dale 1989).
An algorithm:

Dale 1989:
1. Check whether 1 property is enough
2. Check whether 2 properties are enough
…
Etc., until success {minimal description is generated} or failure {no description is possible}
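The size-by-size search can be sketched as a brute-force loop over property combinations. This is only an illustration, not Dale's implementation: the name `full_brevity` and the encoding of each property as a name mapped to its extension (the set of objects it is true of) are assumptions. The data is the furniture domain used elsewhere in these slides.

```python
from itertools import combinations

def full_brevity(target, domain, properties):
    """Return a smallest set of property names that singles out target, or None.

    properties maps each property name to its extension (a set of objects).
    """
    # Only properties that are true of the target can appear in a description.
    usable = [name for name, ext in properties.items() if target in ext]
    for size in range(1, len(usable) + 1):        # 1 property, then 2, ...
        for combo in combinations(usable, size):
            # Objects that satisfy every property in this combination.
            referents = set(domain)
            for name in combo:
                referents &= properties[name]
            if referents == {target}:
                return set(combo)                 # success: minimal description
    return None                                   # failure: no description exists

domain = {"a", "b", "c", "d", "e"}
properties = {
    "desk": {"a", "b"}, "chair": {"c", "d", "e"},
    "Swedish": {"a", "c"}, "Italian": {"b", "d", "e"},
    "dark": {"a", "d", "e"}, "light": {"b", "c"}, "grey": {"a"},
    "100£": {"a", "c"}, "150£": {"b", "d"}, "250£": set(),
    "wooden": set(), "metal": {"a", "b", "c", "d", "e"}, "cotton": {"d"},
}
```

With this data, `full_brevity("a", domain, properties)` returns `{"grey"}`, and `full_brevity("e", domain, properties)` returns `None` because d shares every property of e. The nested loops make the cost visible: in the worst case every subset of the usable properties is tried.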

Problem: exponential complexity
Worst case, this algorithm would have to inspect all combinations of properties: n properties give 2^n combinations.
Some algorithms may be faster, but …
Theoretical result: finding a minimal description is NP-hard (D&R 1995), so any FB algorithm is expected to take time exponential in the number of properties in the worst case.

D&R conclude that Full Brevity cannot be achieved in practice. They designed an algorithm that only approximates Full Brevity: the Incremental Algorithm (IA).

Psycholinguistic inspiration behind IA (e.g. Pechmann 89; overview in Levelt 89)
- Speakers often include "unnecessary" modifiers in their referring expressions
- Speakers often start describing a referent before they have seen all distractors (as shown by eye-tracking experiments)
- Some Attributes (e.g. Colour) seem more likely to be noticed and used than others
- Some Attributes (e.g., Type) contribute strongly to a Gestalt. Gestalts help readers identify referents. ("The red thing" vs. "the red bird")
Let's start with a simplified version of IA, which uses properties rather than <Attribute, Value> pairs. Type and head nouns are ignored, for now.

Incremental Algorithm (informal):
- Properties are considered in a fixed order: P = <P_1, P_2, ..., P_n>
- A property is included if it is "useful": true of the target; false of some distractors
- Stop when done; so earlier properties have a greater chance of being included (e.g., a perceptually salient property)
- P is therefore called the "preference order".

r = individual to be described
P = list of properties, in preference order
P_i = a property in P
L = properties in generated description
(Recall: we're not worried about realization today)

P = < desk (ab), chair (cde), Swedish (ac), Italian (bde), dark (ade), light (bc), grey (a), 100£ (ac), 150£ (bd), 250£ ({}), wooden ({}), metal (abcde), cotton (d) >
Domain = {a,b,c,d,e}.
Now describe: a = ? d = ? e = ?

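P = < desk (ab), chair (cde), Swedish (ac), Italian (bde), dark (ade), light (bc), grey (a), 100£ (ac), 150£ (bd), 250£ ({}), wooden ({}), metal (abcde), cotton (d) >
Domain = {a,b,c,d,e}.
Now describe:
a = <desk, Swedish>
d = <chair, Italian, 150£> (Nonminimal: cotton alone identifies d)
e = <chair, Italian, ...> (Impossible: d has every property that e has)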
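The worked example can be replayed with a short sketch of the simplified IA. The function name `incremental_algorithm`, the list-of-pairs encoding of P, and the returned (description, success) tuple are assumptions made for illustration; they do not come from D&R's paper.

```python
def incremental_algorithm(target, domain, preference_order):
    """Simplified IA over plain properties.

    preference_order is a list of (name, extension) pairs, most preferred first.
    Returns (description, success).
    """
    distractors = set(domain) - {target}
    description = []
    for name, extension in preference_order:
        if not distractors:
            break                                  # target already identified
        removed = distractors - extension
        # "Useful": true of the target and false of at least one distractor.
        if target in extension and removed:
            description.append(name)
            distractors -= removed                 # never undone: no backtracking
    return description, not distractors

domain = {"a", "b", "c", "d", "e"}
P = [
    ("desk", {"a", "b"}), ("chair", {"c", "d", "e"}),
    ("Swedish", {"a", "c"}), ("Italian", {"b", "d", "e"}),
    ("dark", {"a", "d", "e"}), ("light", {"b", "c"}), ("grey", {"a"}),
    ("100£", {"a", "c"}), ("150£", {"b", "d"}), ("250£", set()),
    ("wooden", set()), ("metal", {"a", "b", "c", "d", "e"}), ("cotton", {"d"}),
]

for r in ("a", "d", "e"):
    print(r, incremental_algorithm(r, domain, P))
```

Each property is inspected at most once, which is where the linear worst-case bound in the number of properties comes from. Reordering P changes the output: with `cotton` first, d would be described by a single property.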

Incremental Algorithm
- It's a hill-climbing algorithm: ever better approximations of a successful description.
- "Incremental" implies no backtracking.
- It does not always find the minimal number of properties.

Incremental Algorithm
- Logical completeness: a unique description is found in finite time if there exists one. Question: is IA logically complete?
- Computational complexity: assume that testing for usefulness takes constant time. Then worst-case time complexity is O(n_p), where n_p is the number of properties in P.

Better approximation of Full Brevity (D&R 1995)
- Attribute + Value model: properties grouped together as in the original example: Origin: Sweden, Italy, ... Colour: dark, grey, ...
- Optimization within the set of properties based on the same Attribute

Incremental Algorithm, using Attributes and Values
r = individual to be described
A = list of Attributes, in preference order
Def: V_{i,j} = Value j of Attribute i
L = properties in generated description

FindBestValue(r,A):
- Find Values of A that are true of r, while removing some distractors (If these don't exist, go to next Attribute)
- Within this set, select the Value that removes the largest number of distractors (NB: discriminatory power)
- If there's a tie, select the most general one
- If there's still a tie, select an arbitrary one

Example: D = {a,b,c,d,f,g}
Type: furniture (abcd), desk (ab), chair (cd)
Origin: Europe (bdfg), USA (ac), Italy (bd)
Describe a: {desk, American} (furniture removes fewer distractors than desk)
Describe b: {desk, European} (European is more general than Italian)
N.B. This disregards relevance, etc.
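A sketch of the Attribute/Value variant, run on this example. The names `find_best_value` and `incremental_algorithm_av` are illustrative. Two choices are assumptions rather than part of the slides: "more general" is approximated by a larger extension, and a remaining tie goes to the first Value encountered.

```python
def find_best_value(target, distractors, values):
    """Pick the best Value of one Attribute, or None if no Value is useful.

    values maps each Value name to its extension (a set of objects).
    """
    best = None
    for name, extension in values.items():
        removed = len(distractors - extension)
        if target not in extension or removed == 0:
            continue                               # not true of r, or removes nothing
        # Rank by discriminatory power, then by generality (assumption:
        # a larger extension means a more general Value).
        key = (removed, len(extension))
        if best is None or key > best[0]:
            best = (key, name)                     # remaining ties: first one seen
    return None if best is None else best[1]

def incremental_algorithm_av(target, domain, attributes):
    """attributes: list of (attribute name, values dict), in preference order."""
    distractors = set(domain) - {target}
    description = []
    for _attribute, values in attributes:
        if not distractors:
            break
        value = find_best_value(target, distractors, values)
        if value is not None:
            description.append(value)
            distractors &= values[value]           # keep only objects the Value fits
    return description, not distractors

D = {"a", "b", "c", "d", "f", "g"}
A = [
    ("Type", {"furniture": {"a", "b", "c", "d"}, "desk": {"a", "b"},
              "chair": {"c", "d"}}),
    ("Origin", {"Europe": {"b", "d", "f", "g"}, "USA": {"a", "c"},
                "Italy": {"b", "d"}}),
]
```

Here `incremental_algorithm_av("a", D, A)` yields `(["desk", "USA"], True)` and `incremental_algorithm_av("b", D, A)` yields `(["desk", "Europe"], True)`, matching the two descriptions on the slide.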

This is a better approximation of Full Brevity. But is it a better algorithm?
Question 1: Is it true that all values of an attribute are (roughly) equally preferred? If the colour of a car is pink, this is more notable than if it's white.
Question 2: Doesn't the new algorithm sometimes fail unnecessarily?

About question 2
Exercise: Construct an example where no description is found, although one exists.
Hint: Let an Attribute have Values whose extensions overlap.

Example: D = {a,b,c,d,e,f}
Contains: wood (abe), plastic (acdf)
Colour: grey (ab), yellow (cd)
Describe a: {wood, grey, ...} - Failure (wood removes more distractors than plastic)
Compare: Describe a: {plastic, grey} - Success
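The failure can be reproduced with a compact, self-contained sketch. The helper names `best_value`, `ia_with_values` and `distinguishes` are hypothetical, and final ties are broken alphabetically, which is one arbitrary choice among many.

```python
def best_value(target, distractors, values):
    """Value with the most discriminatory power; ties go to the larger extension."""
    useful = [(len(distractors - ext), len(ext), name)
              for name, ext in values.items()
              if target in ext and distractors - ext]
    return max(useful)[2] if useful else None

def ia_with_values(target, domain, attributes):
    """Run the Attribute/Value IA; returns (description, success)."""
    distractors = set(domain) - {target}
    description = []
    for _attribute, values in attributes:
        value = best_value(target, distractors, values)
        if value is not None:
            description.append(value)
            distractors &= values[value]
    return description, not distractors

def distinguishes(target, domain, extensions):
    """True if the conjunction of the given extensions picks out only target."""
    remaining = set(domain)
    for ext in extensions:
        remaining &= ext
    return remaining == {target}

D = {"a", "b", "c", "d", "e", "f"}
contains = {"wood": {"a", "b", "e"}, "plastic": {"a", "c", "d", "f"}}
colour = {"grey": {"a", "b"}, "yellow": {"c", "d"}}
A = [("Contains", contains), ("Colour", colour)]

print(ia_with_values("a", D, A))
print(distinguishes("a", D, [contains["plastic"], colour["grey"]]))
```

The first call commits to `wood` because it removes three distractors against two for `plastic`, and b can never be ruled out afterwards. The second call confirms that {plastic, grey} would have been a distinguishing description.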

Conclusion
- The version of IA that uses the <Attribute, Value> format allows the use of simple ontological information (e.g., Italian ⊆ European)
- But grouping properties into Attributes makes it difficult to model the "unusualness" of a property
- And the idea of using discriminatory power leads to logical incompleteness.
- IA is therefore (?) often used in its simpler form, without the <Attribute, Value> format

Complexity of the algorithm
n_d = nr. of distractors
n_l = nr. of properties in the description
n_v = nr. of Values (for all Attributes)
According to D&R: O(n_d n_l) (typical running time)
Alternative assessment: O(n_v) (worst-case running time)

Minor complication: Head nouns
Another way in which human descriptions are nonminimal.
A description needs a Noun, but not all properties are expressed as Nouns.
Example: Suppose Colour was the most-preferred Attribute, and suppose target = a

Colours: dark (ade), light (bc), grey (a)
Type: furniture (abcde), desk (ab), chair (cde)
Origin: Sweden (ac), Italy (bde)
Price: 100 (ac), 150 (bd), 250 ({})
Contains: wood ({}), metal (abcde), cotton (d)
target = a
Describe a: {grey}
"The grey"? (Not in English)

D&R's repair: Assume that Values of the Attribute Type can be expressed in a Noun.
After the core algorithm:
- check whether Type is represented.
- if not, then add the best Value of the Type Attribute to the description
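The repair step can be sketched as a post-processing function. The slide does not say what "best" means once no distractors remain; the sketch assumes, purely for illustration, that the best Type Value is the most specific one true of the target (smallest extension). D&R's paper relies on a notion of basic-level value instead, which is not modelled here. The name `add_head_noun` is hypothetical.

```python
def add_head_noun(target, description, type_values):
    """Append a Type value if the description contains none.

    Assumption: the "best" Type value is the most specific one that is true
    of the target, i.e. the one with the smallest extension.
    """
    if any(value in type_values for value in description):
        return list(description)                   # Type already represented
    candidates = [(len(ext), name) for name, ext in type_values.items()
                  if target in ext]
    if not candidates:
        return list(description)                   # no Type value fits the target
    return list(description) + [min(candidates)[1]]

type_values = {
    "furniture": {"a", "b", "c", "d", "e"},
    "desk": {"a", "b"},
    "chair": {"c", "d", "e"},
}

print(add_head_noun("a", ["grey"], type_values))
```

For target a the description {grey} becomes {grey, desk}, which can be realized as "the grey desk". A description that already contains a Type Value is returned unchanged.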

Versions of Dale and Reiter's Incremental Algorithm (IA) have often been implemented.
It is still the starting point for many new algorithms.
But how human-like is the output of the IA really?
The paper does not contain an evaluation of the algorithms discussed.

Comments on the algorithm
1. Redundancy exists, but not for principled reasons, e.g., for
- marking topic changes, etc. (Corpus work by Pam Jordan et al.)
- making it easy to find the referent (Experimental work by Paraboni et al.)

Limitations of the algorithm
2. Targets are individual objects, never sets. What changes when target = {a,b,c}?
3. The Incremental Algorithm uses only conjunctions of atomic properties. No negations, disjunctions, etc.

Limitations of D&R
4. No relations with other objects, e.g., "the orange on the table".
5. Differences in salience are not taken into account. When we say "the dog", does this mean that there is only one dog in the world?
6. Language realization is disregarded.

Limitations of D&R
7. Calculation of complexity is iffy
- Role of "typical" run time and length of description is unclear
- Greedy Algorithm (GA) dismissed even though it has polynomial complexity
- GA: always choose the property that removes the maximum number of distractors
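A minimal sketch of the Greedy Algorithm as characterized in the last bullet. The name `greedy_algorithm`, the dict encoding of properties, and the tie-breaking (first property in dict order) are assumptions for illustration, not details taken from D&R.

```python
def greedy_algorithm(target, domain, properties):
    """Repeatedly add the property that removes the most remaining distractors.

    properties maps each property name to its extension (a set of objects).
    Returns (description, success).
    """
    distractors = set(domain) - {target}
    description = []
    # Only properties that are true of the target may be used.
    candidates = {n: e for n, e in properties.items() if target in e}
    while distractors:
        name = max(candidates,
                   key=lambda n: len(distractors - candidates[n]),
                   default=None)
        if name is None or not (distractors - candidates[name]):
            return description, False              # nothing left that helps
        description.append(name)
        distractors &= candidates.pop(name)
    return description, True

domain = {"a", "b", "c", "d", "e"}
properties = {
    "desk": {"a", "b"}, "chair": {"c", "d", "e"},
    "Swedish": {"a", "c"}, "Italian": {"b", "d", "e"},
    "dark": {"a", "d", "e"}, "light": {"b", "c"}, "grey": {"a"},
    "100£": {"a", "c"}, "150£": {"b", "d"}, "250£": set(),
    "wooden": set(), "metal": {"a", "b", "c", "d", "e"}, "cotton": {"d"},
}

print(greedy_algorithm("a", domain, properties))
print(greedy_algorithm("d", domain, properties))
```

Each round scans every remaining candidate once and there are at most as many rounds as candidates, so the running time is polynomial in the number of properties. On this domain the sketch picks `grey` for a and `cotton` for d, shorter than the IA output under the preference order given in the slides.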

More fundamental features
- Speaker and Hearer have shared knowledge
- This knowledge can be formalised using atomic statements
- Foundations were left unformalised, e.g. the Closed-World Assumption and the Unique Name Assumption
- The aim of GRE is to identify the target referent uniquely. (I.e., the aim is to construct a distinguishing description of the referent.)

Discussion:
How bad is it for a GRE algorithm to take exponential time choosing the best RE?
- How do human speakers cope?
- More complex types of referring expressions ⇒ problem becomes even harder
- Restrict to combinations whose length is less than x ⇒ problem not exponential. Example: descriptions containing at most n properties (Full Brevity)

Linguist's view
- We don't pretend to mirror psychologically correct processes. (It's enough if GRE output is correct.)
- So why worry if our algorithms are slow?

Mathematician's view
- The structure of a problem becomes clear when no restrictions are imposed.
- Practical addition: What if the input does not conform with these restrictions? (GRE does not control its own input!)

A compromise view
Compare with Description Logic:
- Increasingly complex algorithms …
- that tackle larger and larger fragments of logic …
- and whose complexity is "conservative"
When looking at more complex phenomena, take care not to slow down generation of simple cases too much.

A note on the history of the IA
Appelt (1985)
- did not focus on distinguishing descriptions
- did not describe an algorithm in detail
- suggested attempting properties one by one
- cited the Gricean maxims
- suggested that the shortest description may not always be the best one
