Natural Language Generation: Discourse Planning Paul Gallagher 28 April 2005 (Material adapted from Ch. 20 of Jurafsky and Martin unless otherwise noted)
Introduction: Programs that generate natural language are very common: “Hello, world” (Kernighan and Ritchie); template filling and mail merge. Simple, but inflexible!
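As a rough illustration of why template filling is simple but inflexible, here is a minimal mail-merge sketch. The template wording and the field names `name` and `item` are invented for this example and do not come from the slides.

```python
from string import Template

# Hypothetical mail-merge template; field names are illustrative only.
LETTER = Template("Dear $name, your order of $item has shipped.")

def fill(record: dict) -> str:
    """Substitute the record's fields into the fixed template."""
    return LETTER.substitute(record)

records = [
    {"name": "Ada", "item": "one book"},
    {"name": "Alan", "item": "two books"},
]
letters = [fill(r) for r in records]
for letter in letters:
    print(letter)
```

Every letter has the same wording and sentence structure; only the slot values change. The program makes none of the choices (content, wording, structure) that an NLG system is expected to make.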
Introduction: What do we really mean when we say “Natural Language Generation”? “the process of constructing natural language outputs from non-linguistic inputs” (Jurafsky and Martin, p. 765). A map from meaning to text (the inverse of natural language understanding).
Introduction: Contrast with NLU. In NLU, the focus is hypothesis management; in NLG, the focus is choice. Characteristics of an NLG system: produce an appropriate range of forms; choose among forms based on internal meaning and context.
Introduction: What kinds of choices? Content selection, lexical selection, sentence structure, aggregation, referring expressions, and discourse structure.
Introduction: NLG examples: generate textual weather forecasts from weather maps; summarize statistical data from a database or spreadsheet; explain medical information; authoring aids. (Reiter and Dale)
An NLG Reference Architecture (a map from meaning to text): Knowledge Base + Communicative Goal -> Discourse Planner -> Text Plan -> Sentence Planning -> Sentence Plans -> Linguistic Realization -> Natural Language Text. Adapted from Reiter and Dale, “Building Applied Natural Language Generation Systems”.
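The three stages of the reference architecture can be read as a chain of functions, each consuming the previous stage's output. The sketch below is a toy under stated assumptions: the knowledge base contents, the dictionary-based sentence plans, and the "first"/"then" markers are all invented for illustration and are far simpler than anything in Reiter and Dale.

```python
def discourse_planner(kb, goal):
    # Select and order the messages relevant to the communicative goal.
    return list(kb[goal])

def sentence_planner(text_plan):
    # Decide how each message is packaged; here, only a discourse marker.
    plans = []
    for i, message in enumerate(text_plan):
        marker = "first" if i == 0 else "then"
        plans.append({"marker": marker, "clause": message})
    return plans

def linguistic_realizer(sentence_plans):
    # Turn each sentence plan into a surface string.
    sentences = [
        f"{p['marker'].capitalize()}, {p['clause']}." for p in sentence_plans
    ]
    return " ".join(sentences)

def generate(kb, goal):
    return linguistic_realizer(sentence_planner(discourse_planner(kb, goal)))

# Hypothetical knowledge base: goal -> ordered messages.
KB = {"save-document": ["choose the save option", "press the save button"]}
text = generate(KB, "save-document")
print(text)
```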
Discourse Planner: “Discourse planning…[imposes] ordering and structure over the set of messages to be conveyed.” (Reiter and Dale) Push or pull? The planner either selects its content from the knowledge base (pull) or receives it (push). (McDonald) It outputs a tree structure defining order and rhetorical structure. (Reiter and Dale)
Text Schemata: Observation: many texts follow consistent structural patterns. Example: instructions. For each step: mention preconditions, describe the step, describe sub-steps, mention side-effects.
Text Schemata Knowledge base representation of a saving procedure (Jurafsky and Martin. Fig. 20.5)
Text Schemata: A schema for representing procedures, implemented as an augmented transition network (ATN). (Jurafsky and Martin, Fig. 20.6)
Text Schemata Sample output of the example: Save the document: First, choose the save option from the file menu. This causes the system to display the Save-As dialog box. Next choose the destination folder and type the filename. Finally, press the save button. This causes the system to save the document.
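A schema-driven generator for this example can be sketched as a fixed traversal over a procedure representation. This is not the ATN of Fig. 20.6: the `Step` class, its fields, and the hand-written loop are assumptions made for illustration, and the precondition slot of the schema is omitted because the sample text has none.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Step:
    action: str                 # imperative phrase describing the step
    side_effect: str = ""       # what the system does in response, if anything
    substeps: List["Step"] = field(default_factory=list)

def describe_procedure(goal: Step) -> str:
    """Apply the fixed schema: the step, then each sub-step and its side-effect."""
    parts = [goal.action.capitalize() + ":"]
    last = len(goal.substeps) - 1
    for i, sub in enumerate(goal.substeps):
        marker = "First, " if i == 0 else "Finally, " if i == last else "Next "
        parts.append(marker + sub.action + ".")
        if sub.side_effect:
            parts.append("This causes the system to " + sub.side_effect + ".")
    return " ".join(parts)

# Hypothetical encoding of the saving procedure.
save = Step("save the document", substeps=[
    Step("choose the save option from the file menu",
         side_effect="display the Save-As dialog box"),
    Step("choose the destination folder and type the filename"),
    Step("press the save button", side_effect="save the document"),
])
text = describe_procedure(save)
print(text)
```

Because the ordering and the discourse markers are hard-coded in the loop, the sketch can only ever produce texts of this one shape.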
Rhetorical Relations: Text schemata are still not very flexible: a schema is essentially a hard-coded text plan. There is an underlying structure to language which we can take advantage of to develop richer expressions: Rhetorical Structure Theory (RST).
Rhetorical Relations: “I love to collect classic automobiles. My favorite car is my 1899 Duryea. However, I prefer to drive my 1999 Toyota.” The first sentence is the nucleus, the second is an Elaboration satellite, and the third is a Contrast satellite.
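One way to hold such an analysis in a program is a tree whose nodes carry a text span and a list of labeled satellites. The `RSTNode` class below is a hypothetical representation, and attaching the Contrast satellite to the Duryea sentence (rather than to the root) is an assumption, since the slide does not show the attachment.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class RSTNode:
    text: str  # the nucleus text of this span
    satellites: List[Tuple[str, "RSTNode"]] = field(default_factory=list)

def linearize(node: RSTNode) -> List[str]:
    """Return the sentences in order: nucleus first, then its satellites."""
    sentences = [node.text]
    for _relation, satellite in node.satellites:
        sentences.extend(linearize(satellite))
    return sentences

def relations(node: RSTNode) -> List[str]:
    """Collect relation names in the order they are encountered."""
    found = []
    for relation, satellite in node.satellites:
        found.append(relation)
        found.extend(relations(satellite))
    return found

toyota = RSTNode("However, I prefer to drive my 1999 Toyota.")
duryea = RSTNode("My favorite car is my 1899 Duryea.", [("Contrast", toyota)])
root = RSTNode("I love to collect classic automobiles.", [("Elaboration", duryea)])

text = " ".join(linearize(root))
print(text)
print(relations(root))
```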
Rhetorical Relations: How do we apply RST to a discourse planner? Post a high-level goal to the planner (e.g., “Make the hearer competent to save a document”). Create plan operators which expand goals into sub-goals, creating a rhetorical structure tree.
Rhetorical Relations

Name: Expand Purpose
Effect: (COMPETENT hearer (DO-ACTION ?action))
Constraints: (AND (c-get-all-substeps ?action ?sub-actions) (NOT singular-list? ?sub-actions))
Nucleus: (COMPETENT hearer (DO-SEQUENCE ?sub-actions))
Satellites: (((RST-PURPOSE (INFORM s hearer (DO ?action))) *required*))

Name: Expand Sub-Actions
Effect: (COMPETENT hearer (DO-SEQUENCE ?actions))
Constraints: NIL
Nucleus: (foreach ?actions (RST-SEQUENCE (COMPETENT hearer (DO-ACTION ?actions))))
Satellites:

(Jurafsky and Martin, pp. 786 and 788)
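The way these two operators expand a posted goal into a rhetorical structure tree can be sketched as a small recursive function. This is a loose approximation, not the planner from Jurafsky and Martin: the `SUBSTEPS` table, the tuple and dictionary encodings, and the treatment of actions without sub-steps as bare INFORM leaves are all assumptions made for illustration.

```python
# Hypothetical knowledge base: action -> ordered sub-actions.
SUBSTEPS = {
    "save-document": [
        "choose-save-option", "choose-folder", "type-filename", "press-save",
    ],
}

def expand(goal):
    """Expand a (COMPETENT hearer ...) goal, given as (kind, argument)."""
    kind, arg = goal
    if kind == "DO-ACTION":
        subs = SUBSTEPS.get(arg, [])
        if len(subs) > 1:
            # Expand Purpose: the action has several sub-steps, so the
            # nucleus is the sequence and the satellite states the purpose.
            return {
                "relation": "RST-PURPOSE",
                "nucleus": expand(("DO-SEQUENCE", subs)),
                "satellite": ("INFORM", arg),
            }
        # Assumed leaf case: no further operator applies in this sketch.
        return ("INFORM", arg)
    if kind == "DO-SEQUENCE":
        # Expand Sub-Actions: one sub-goal per action, joined by RST-SEQUENCE.
        return {
            "relation": "RST-SEQUENCE",
            "members": [expand(("DO-ACTION", a)) for a in arg],
        }
    raise ValueError(f"no operator for goal kind {kind!r}")

tree = expand(("DO-ACTION", "save-document"))
print(tree)
```

Posting the goal for `save-document` yields an RST-PURPOSE node whose nucleus is an RST-SEQUENCE over the four sub-actions. Unlike a schema, the shape of the tree depends on what the knowledge base contains.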
Rhetorical Relations The full rhetorical structure for the example text. Jurafsky and Martin. Fig. 20.7.
References
Jurafsky, D. and Martin, J. H. (2000). Speech and Language Processing.
Reiter, E. and Dale, R. (1997). “Building Applied Natural Language Generation Systems”.
McDonald, D. D. “Natural Language Generation”. (Appeared in Handbook of Natural Language Processing, edited by Dale, R., Moisl, H., and Somers, H.)