1
Microplanning (Sentence planning)
Part 1
Kees van Deemter
2
Natural Language Generation
– taking some computer-readable gibberish
– translating it into proper English
Applications include
– dialogue/chat systems
– on-line help
– summarisation
– document authoring
3
NLG Tasks (as explained by Anja):
1. Content determination: decide what to say; construct a set of messages
2. Discourse planning: ordering, structuring concepts; rhetorical relationships
3. Sentence aggregation: divide content into sentences; construct sentence plans
4. Lexicalisation: map concepts and relations to lexemes (= words)
5. Referring expression generation: decide how to refer to objects
6. Linguistic realisation: put it all together in acceptable words and sentences
4
Modular structure of NLG systems (in theory!):
TEXT PLANNER: content determination, discourse planning
SENTENCE PLANNER / MICROPLANNER: sentence aggregation, lexicalisation, referring expressions
REALISER: linguistic realisation
5
Last week: Input to realisation

message-id: msg02
relation:   C_DEPARTURE
args:       departing-entity:   C_CALEDON-EXPRESS
            departure-location: C_ABERDEEN
            departure-time:     C_1000
            departure-platform: C_7
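As a concrete illustration, such a message can be written down as a nested attribute-value structure. A minimal sketch in Python; the key names simply mirror the slide and stand for no particular system's message format:

    # A minimal sketch: the departure message as a nested Python dict.
    # Key names mirror the attribute-value matrix above; they are
    # illustrative, not any real system's format.
    msg02 = {
        "message-id": "msg02",
        "relation": "C_DEPARTURE",
        "args": {
            "departing-entity": "C_CALEDON-EXPRESS",
            "departure-location": "C_ABERDEEN",
            "departure-time": "C_1000",
            "departure-platform": "C_7",
        },
    }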
6
Microplanning 1: Aggregation
Distributing information over different sentences. Example:
a. The Caledonian express departs Aberdeen at 10:00, from platform 7.
b. The Caledonian express departs Aberdeen at 10:00. The Caledonian express departs from platform 7.
7
Microplanning 2: GRE
GRE = Generation of Referring Expressions
Explaining which objects you're talking about:
a. The Caledonian express departs Aberdeen at 10:00, from platform 7.
b. The Caledonian express departs -- at 10:00. The train departs from this platform.
8
Microplanning 3: Lexical choice
Using different words for the same concept:
a. The Caledonian express departs Aberdeen at ten o'clock, from platform 7.
b. The Caledonian express departs Aberdeen at ten. The Caledonian express leaves from platform 7.
9
In practice: tasks can be performed in a different order.
Example: aggregation can be performed on messages:
10
message-id: msg02
relation:   C_DEPARTURE_1
args:       departing-entity:   C_CALEDON-EXPRESS
            departure-location: C_ABERDEEN
            departure-time:     C_1000

message-id: msg03
relation:   C_DEPARTURE_2
args:       departing-entity:   C_CALEDON-EXPRESS
            departure-platform: C_7
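For illustration, one way such message-level aggregation might be implemented, using the dict encoding sketched earlier; the merging rule (take the union of the args and collapse the two relations) is a hypothetical strategy, not a mechanism prescribed by the slides:

    # Hypothetical aggregation rule: merge two departure messages about
    # the same entity into one whose args are the union of the originals.
    msg02 = {"message-id": "msg02", "relation": "C_DEPARTURE_1",
             "args": {"departing-entity": "C_CALEDON-EXPRESS",
                      "departure-location": "C_ABERDEEN",
                      "departure-time": "C_1000"}}
    msg03 = {"message-id": "msg03", "relation": "C_DEPARTURE_2",
             "args": {"departing-entity": "C_CALEDON-EXPRESS",
                      "departure-platform": "C_7"}}

    def aggregate(m1, m2):
        # Only merge messages that describe the same departing entity.
        assert m1["args"]["departing-entity"] == m2["args"]["departing-entity"]
        return {"message-id": m1["message-id"],
                "relation": "C_DEPARTURE",             # collapse _1/_2
                "args": {**m1["args"], **m2["args"]}}  # union of arguments

    print(aggregate(msg02, msg03))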
11
Aggregation can also be performed later:
[The Caledonian express] departs Aberdeen [at 10:00] [from platform 7]
===>
[The Caledonian express] departs Aberdeen [at 10:00]. [The Caledonian express] departs [from platform 7]
12
Let's focus on GRE, but...
A little detour: NLG systems do not always work as you've been told.
Some practically deployed systems combine canned text with NLG.
One possibility: the system has a library of language templates, with gaps that need to be filled. E.g.,
13
[TRAIN] departs [TOWN] at [TIME]
[TRAIN] departs [TOWN] from [PLATFORM]
We apologise for the fact that [TRAIN] is delayed by [AMOUNT]
Gap filling: using canned text or GRE.
Question: which of the other tasks are still relevant?
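A gap-filling template system of this kind is easy to sketch; the template keys and the use of Python's str.format substitution are assumptions made purely for illustration:

    # A minimal template-filling sketch. Gaps such as {TRAIN} can be
    # filled with canned text, or with a description produced by GRE.
    TEMPLATES = {
        "departs-at": "{TRAIN} departs {TOWN} at {TIME}",
        "departs-from": "{TRAIN} departs {TOWN} from {PLATFORM}",
        "apology": "We apologise for the fact that {TRAIN} is delayed by {AMOUNT}",
    }

    def fill(name, **gaps):
        return TEMPLATES[name].format(**gaps)

    print(fill("departs-at", TRAIN="The Caledonian express",
               TOWN="Aberdeen", TIME="10:00"))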
14
Let's move on to GRE.
Why/when is GRE useful?
15
1. The referent has a familiar name, but it's not unique, e.g., John Smith
2. The referent has no familiar name: trains, furniture, trees, atomic particles, …
   (Databases use keys, e.g., Smith$73527$, TRAIN-3821)
3. Similar: sets of objects
4. NL is too economical to have names for everything
16
Last week: Input to realisation

message-id: msg02
relation:   C_DEPARTURE
args:       departing-entity:   C_CALEDON-EXPRESS
            departure-location: C_ABERDEEN
            departure-time:     C_1000
18
This week: more realistic input

message-id: msg02
relation:   C_DEPARTURE
args:       departing-entity:   C_34435
            departure-location: .....
            departure-time:     .....

Possible descriptions of C_34435: the Caledonian (express), the Aberdeen-Glasgow express, the blue train on your left, the train
19
Communication is about telling the truth... but that's not all there is to it.
Paul Grice (around 1970): principles of rational, cooperative communication.
GRE is a good case study (R. Dale and E. Reiter, Cognitive Science, 1995).
20
Grice: maxims of conversation
– Quality: only say what you know to be true
– Quantity: give enough, but not too much, information
– Relevance: be relevant
– Manner: be clear and brief
(There is overlap between these four.)
21
Maxims are a two-edged sword:
1. They say how one should normally speak/write. Example: "Yes, there's a gasoline station around the corner" (when it's no longer operational)
– Quality: yes, it's true
– Quantity: probably yes
– Relevance: no, not relevant to the hearer's intentions
– Manner: it's brief, clear, etc.
22
Maxims are a two-edged sword:
2. They can also be exploited. Example: asked to write an academic reference: "Kees always came to my lectures and he's a nice guy"
– Quality: yes, it's true (let's assume)
– Quantity: no; how about academic achievements?
– Relevance: yes
– Manner: yes
23
Application to GRE
Dale & Reiter: the best description of an object fulfils the Gricean maxims. E.g.,
– (Quality) list properties truthfully
– (Quantity) use properties that allow identification, without containing more info
– (Relevance) use properties that are of interest in the situation
– (Manner) be brief
24
D&R's expectation: violation of a maxim leads to implicatures. For example,
– [Quantity] "the pitbull" (when there is only one dog)
– [Manner] "Get the cordless drill that's in the toolbox" (Appelt)
There's just one problem: …
25
… people don't always speak this way. For example,
– [Manner] "the red chair" (when there is only one red object in the domain)
– [Manner/Quantity] "I broke my arm" (when I have two)
In general: empirical work shows much redundancy.
Similarly for other maxims, e.g.,
– [Quality] "the man with the martini" (Donnellan)
26
Example Situation
[Figure: five pieces of furniture a–e with price tags (a: £100, b: £150, c: £100, d: £150, e: £?), grouped as Swedish vs. Italian]
27
Formalized in a KB
Type:     furniture {a,b,c,d,e}, desk {a,b}, chair {c,d,e}
Origin:   Sweden {a,c}, Italy {b,d,e}
Colours:  dark {a,d,e}, light {b,c}, grey {a}
Price:    100 {a,c}, 150 {b,d}, 250 {}
Contains: wood {}, metal {a,b,c,d,e}, cotton {d}
Assumption: all this is shared knowledge.
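For later use, the same KB can be encoded as a Python dict mapping each (attribute, value) pair to its extension, i.e. the set of objects it is true of. The encoding is one convenient choice made here; the content is exactly the slide's:

    # The example KB: each property maps to the set of objects it holds of.
    KB = {
        ("type", "furniture"):  {"a", "b", "c", "d", "e"},
        ("type", "desk"):       {"a", "b"},
        ("type", "chair"):      {"c", "d", "e"},
        ("origin", "sweden"):   {"a", "c"},
        ("origin", "italy"):    {"b", "d", "e"},
        ("colour", "dark"):     {"a", "d", "e"},
        ("colour", "light"):    {"b", "c"},
        ("colour", "grey"):     {"a"},
        ("price", "100"):       {"a", "c"},
        ("price", "150"):       {"b", "d"},
        ("price", "250"):       set(),
        ("contains", "wood"):   set(),
        ("contains", "metal"):  {"a", "b", "c", "d", "e"},
        ("contains", "cotton"): {"d"},
    }
    DOMAIN = {"a", "b", "c", "d", "e"}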
28
Game
1. Describe object a.
2. Describe object e.
3. Describe object d.
29
Game
1. Describe object a: {desk, sweden}, {grey}
2. Describe object e: no solution
3. Describe object d: {Italy, 150}
30
Violations of …
– Manner: *"The £100 grey Swedish desk which is made of metal" (description of a)
– Relevance: "The cotton chair is a fire hazard"? ?"Then why not buy the Swedish chair?" (descriptions of d and c respectively)
31
In fact, there is a second problem with Quantity/Manner. Consider the following formalization:
Full Brevity: never use more than the minimal number of properties required for identification (Dale 1989).
An algorithm:
32
Dale 1989:
1. Check whether 1 property is enough
2. Check whether 2 properties are enough
… etc., until success (a minimal description is generated) or failure (no description is possible)
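One straightforward reading of this search in Python, using the KB and DOMAIN encoding given a few slides back; this is a sketch of the idea, not Dale's original implementation:

    # Full Brevity search: try every 1-property combination, then every
    # 2-property combination, and so on; the first hit is minimal by size.
    from itertools import combinations

    def full_brevity(target, kb, domain):
        props = list(kb)
        for size in range(1, len(props) + 1):
            for combo in combinations(props, size):
                referents = set(domain)
                for p in combo:
                    referents &= kb[p]   # objects all chosen properties hold of
                if referents == {target}:
                    return combo         # minimal distinguishing description
        return None                      # target cannot be distinguished

    print(full_brevity("a", KB, DOMAIN))  # (('colour', 'grey'),)
    print(full_brevity("e", KB, DOMAIN))  # None: no solution, as in the game

The loop over all property combinations is exactly where the exponential blow-up discussed on the next slide comes from.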
33
Problem: exponential complexity
Worst case, this algorithm has to inspect all combinations of properties: n properties give 2^n combinations.
Recall: one grain of rice on square one; twice as many on each subsequent square.
Some algorithms may be faster, but …
Theoretical result: the algorithm must be exponential in the number of properties.
34
D&R conclude that Full Brevity cannot be achieved in practice. They designed an algorithm that only approximates Full Brevity: the Incremental Algorithm.
35
Incremental Algorithm (informal):
– Properties are considered in a fixed order: P = <P1, P2, …, Pn>
– A property is included if it is useful: true of the target; false of some distractors
– Stop when done; so earlier properties have a greater chance of being included (e.g., a perceptually salient property)
– This order is therefore called a preference order.
36
r = the individual to be described
P = the list of properties, in preference order
P = a property
L = the properties in the generated description
(Recall: we're not worried about realisation today)
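Putting the informal description together, a minimal sketch of the Incremental Algorithm in Python, assuming the KB and DOMAIN encoding from earlier; PREFERENCE is one hypothetical preference order over attributes, not a fixed part of the algorithm:

    # Incremental Algorithm sketch. PREFERENCE is an assumed ordering;
    # in a real system it would reflect, e.g., perceptual salience.
    PREFERENCE = ["type", "origin", "colour", "price", "contains"]

    def incremental(r, kb, domain, preference=PREFERENCE):
        L = []                                 # properties chosen so far
        distractors = set(domain) - {r}
        for attr in preference:                # fixed preference order
            for (a, value), extension in kb.items():
                if a != attr or r not in extension:
                    continue                   # must be true of the target
                ruled_out = distractors - extension
                if ruled_out:                  # useful: excludes distractors
                    L.append((a, value))
                    distractors -= ruled_out
                if not distractors:
                    return L                   # target uniquely identified
        return None                            # failure: distractors remain

    print(incremental("a", KB, DOMAIN))
    # [('type', 'desk'), ('origin', 'sweden')]: "the Swedish desk",
    # whereas Full Brevity finds the shorter {grey}.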
38
Back to the KB
Type:     furniture {a,b,c,d,e}, desk {a,b}, chair {c,d,e}
Origin:   Sweden {a,c}, Italy {b,d,e}
Colours:  dark {a,d,e}, light {b,c}, grey {a}
Price:    100 {a,c}, 150 {b,d}, 250 {}
Contains: wood {}, metal {a,b,c,d,e}, cotton {d}
Assumption: all this is shared knowledge.
39
Back to our game
1. Describe object a.
2. Describe object e.
3. Describe object d.
Can you see room for improvement?