1
Incremental Text Structuring with Hierarchical Ranking
Erdong Chen, Benjamin Snyder, Regina Barzilay
2
Incremental Text Structuring
Traditional approach: batch-mode generation; text is viewed as a one-time creation
Alternative: incremental generation, e.g. newsfeeds and Wikipedia
Wikipedia: 3.8 million edits per month; 38 edits per article
3
Barack Obama (Wikipedia Article)
Barack Obama is a Democratic politician from Illinois. He is currently running for the United States Senate, which would be the highest elected office he has held thus far.
Biography
Obama's father is Kenyan; his mother is from Kansas. He himself was born in Hawaii, where his mother and father met at the University of Hawaii. Obama's father left his family early on, and Obama was raised in Hawaii by his mother.
Created in 2004 (5 sentences)
4
Barack Obama (Wikipedia Article)
5907 revisions up to 2007 (more than 400 sentences)
5
Generation Architecture
Content Selection → Structuring → Surface Realization
6
Generation Architecture
Content Selection → Structuring → Surface Realization
Our focus: Structuring
7
Task Definition
Input: an existing article and a new sentence to insert
Output: insertion point
8
Task Definition
Input: the text is organized hierarchically
Output: insertion point
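To make the input and output concrete, here is a minimal Python sketch of one possible representation, assuming a two-level section → paragraph hierarchy; the class names (Article, Section, Paragraph, InsertionPoint) are illustrative and not from the paper.

from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Paragraph:
    sentences: List[str]

@dataclass
class Section:
    title: str
    paragraphs: List[Paragraph]

@dataclass
class Article:
    sections: List[Section]

# An insertion point names a leaf of the hierarchy, i.e. a root-to-leaf
# path: (section index, paragraph index) within the article.
InsertionPoint = Tuple[int, int]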
9
Sample Insertion
He received his B.A. degree in 1983, then worked for one year at Business International Corporation. In 1985, Obama moved to Chicago to direct a non-profit project assisting local churches to organize job training programs. In 1990, The New York Times reported his election as the Harvard Law Review's "first black president in its 104-year history."
Sentence to insert: He entered Harvard Law School in 1988.
10
Sample Features
Topical features: word overlap with section; word overlap with paragraph
Positional features: last paragraph of the article or not? first section of the article or not?
Temporal features: temporal order within the paragraph
11
Sample Features
Topical features: word overlap with section; word overlap with paragraph
Positional features: last paragraph of the article or not? first section of the article or not?
Temporal features: temporal order within the paragraph
Red: section feature; Blue: paragraph feature
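As a rough illustration of how the topical and positional features listed above could be computed, a sketch follows; the exact definitions in the paper (tokenization, normalization, and the temporal-order features, which are omitted here) are not specified, so the helper names and formulas are assumptions.

def word_overlap(sentence, text):
    # Topical feature: fraction of the sentence's words that also occur
    # in the candidate section or paragraph text.
    s_words = set(sentence.lower().split())
    t_words = set(text.lower().split())
    return len(s_words & t_words) / max(len(s_words), 1)

def positional_features(section_idx, paragraph_idx, n_paragraphs):
    # Positional features: binary indicators listed on the slide.
    return {
        "is_first_section": float(section_idx == 0),
        "is_last_paragraph_of_article": float(paragraph_idx == n_paragraphs - 1),
    }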
12
Motivation for Hierarchical Model
Paragraph error vs. section error
Goal: the model should be sensitive to the type of error
If the section has been predicted wrongly, errors at the paragraph level should not be taken into account.
13
Hierarchical Decomposition of Features
s: insertion sentence
Local feature vector (at a single node of the hierarchy)
Φ: aggregate feature vector (along a root path)
Insertion point; root path
Paragraph features and section features are computed at their respective layers.
14
Hierarchical Ranking Model: Decoding
Predicted solution; W: feature weight vector
Each candidate insertion point receives a section feature score and a paragraph feature score (a decoding sketch follows below).
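A minimal decoding sketch, assuming the model scores every root-to-leaf path with a single weight vector W applied to its section-level and paragraph-level features and returns the highest-scoring leaf; it reuses the Article structure from the earlier sketch, the feature extractors are passed in as functions, and the exact procedure in the paper may differ.

def score(w, feats):
    # Dot product of a (sparse) weight dict and a feature dict.
    return sum(w.get(name, 0.0) * value for name, value in feats.items())

def decode(w, article, sentence, section_feats, paragraph_feats):
    # Return the (section index, paragraph index) whose summed
    # section-level and paragraph-level score is highest.
    best, best_score = None, float("-inf")
    for i, section in enumerate(article.sections):
        s_score = score(w, section_feats(sentence, section))
        for j, paragraph in enumerate(section.paragraphs):
            total = s_score + score(w, paragraph_feats(sentence, paragraph))
            if total > best_score:
                best, best_score = (i, j), total
    return best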
15
Hierarchical Ranking Model: Training
a: highest divergent node of the reference solution; b: highest divergent node of the predicted solution
Only update weights at the first divergent layer.
16
Flat Training vs. Hierarchical Training
Φ: aggregate feature vector
Reference solution; predicted solution
Local feature vector
a: highest divergent node of the reference solution
b: highest divergent node of the predicted solution
Flat update vs. hierarchical update (see the sketch below)
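A plausible perceptron-style form of the two updates, based on the legend above; the exact formulation in the paper may differ. Writing y for the reference root path, \hat{y} for the predicted root path, Φ for the aggregate feature vector, φ for the local feature vector, and a, b for the highest divergent nodes:

\text{Flat:} \quad W \leftarrow W + \Phi(y) - \Phi(\hat{y})
\text{Hierarchical:} \quad W \leftarrow W + \phi(a) - \phi(b)

Under the hierarchical update, features below the first divergent layer never enter the weight update, which is exactly the sensitivity to error type motivated earlier.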
17
Previous Work on Text Structuring
Corpus-based approaches (Lapata, 2003; Karamanis et al., 2004; Okazaki et al., 2004; Barzilay and Lapata, 2005; Bollegala et al., 2006; Elsner and Charniak, 2007): focus on relatively short texts; based on a flat text representation
Symbolic approaches (McKeown, 1985; Kasper, 1989; Reiter and Dale, 1990; Hovy, 1993; Maier and Hovy, 1993; Moore and Paris, 1993; Reiter and Dale, 1997): hand-crafted sentence planners; based on tree-like text representations
18
Previous Work on Hierarchical Learning
Hierarchical classification (Cai and Hofmann, 2004; Dekel et al., 2004; Cesa-Bianchi et al., 2006a; Cesa-Bianchi et al., 2006b): input is a flat feature vector; output is a label from a fixed label hierarchy; model parameters are a different weight vector for each label node
Hierarchical ranking (our method): input is a hierarchy with a fixed depth; output is a leaf node within the hierarchy; model parameter is a single weight vector
19
Experimental Set-Up
Task: sentence insertion
Domain: biography; the "Living People" category from Wikipedia
Gold standard: insertion positions from the update logs of Wikipedia entries
Evaluation measures: section accuracy, paragraph accuracy, and tree distance (sketched below)
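Tree distance appears to count the edges between the predicted and reference insertion points in the section/paragraph tree: 0 for the same paragraph, 2 for the same section but a different paragraph, 4 for a different section. This is an inference, but it is consistent with the reported numbers (for the human judges, 2 × (0.66 − 0.53) + 4 × (1 − 0.66) = 1.62). A sketch, assuming insertion points are the (section index, paragraph index) pairs used above:

def tree_distance(pred, ref):
    # pred and ref are (section index, paragraph index) pairs.
    if pred == ref:
        return 0   # same paragraph
    if pred[0] == ref[0]:
        return 2   # same section, different paragraph
    return 4       # different sections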
20
Corpus
Corpus: 4051 sentence/article pairs
Training set: 3240 pairs (80%)
Test set: 811 pairs (20%)
Corpus statistics: average # of sentences 32.9; average # of sections 3.1; average # of paragraphs 10.9
21
Human Evaluation
Randomly selected 80 sentence/article pairs
Four judges; each judge took 40 pairs, and every sentence/article pair was assigned to two judges

                    Section Acc (%)   Paragraph Acc (%)   Tree Dist (# of edges)
Average accuracy    66                53                  1.62
Mutual agreement    65                48                  1.74
22
Baselines
Straw baselines:
RandomIns: pick a random paragraph of the article
FirstIns: pick the first paragraph of the article
LastIns: pick the last paragraph of the article
Pipeline: train two rankers, for section selection and paragraph selection, separately; decode by first choosing the best section and then the best paragraph within the chosen section (sketched below)
Flat: flat training; decode by finding the best path by aggregate score
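For contrast with the joint decoding sketch above, here is a sketch of the pipeline baseline's decoding, assuming two separately trained weight vectors w_sec and w_par; once a section is chosen, paragraph ranking is restricted to it, so a section-level error cannot be recovered. Names and signatures are illustrative, and the Article structure is the one from the earlier sketch.

def score(w, feats):
    return sum(w.get(name, 0.0) * value for name, value in feats.items())

def pipeline_decode(w_sec, w_par, article, sentence, section_feats, paragraph_feats):
    # Step 1: pick the best section with the section ranker.
    sections = article.sections
    best_i = max(range(len(sections)),
                 key=lambda i: score(w_sec, section_feats(sentence, sections[i])))
    # Step 2: pick the best paragraph within the chosen section only.
    paragraphs = sections[best_i].paragraphs
    best_j = max(range(len(paragraphs)),
                 key=lambda j: score(w_par, paragraph_feats(sentence, paragraphs[j])))
    return (best_i, best_j)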
23
Results
* indicates that the difference in accuracy between the given model and Hierarchy is statistically significant.

             Section Acc (%)   Paragraph Acc (%)   Tree Dist (# of edges)
RandomIns    31.8*             13.4*               3.10*
FirstIns     25.0*             13.6*               3.23*
LastIns      30.6*             21.5*               3.00*
Pipeline     59.6              31.4*               2.18*
Flat         58.7              32.3*               2.18*
Hierarchy    59.8              38.3                2.04
Human        66                53                  1.62
24
Results (same table as above)
LastIns outperforms RandomIns and FirstIns.
25
Results (same table as above)
Hierarchy outperforms all baselines.
26
Results (same table as above)
At the paragraph level, the gap between machine and human performance is reduced by 32%.
27
Sentence-level Evaluation
Local model (Lapata, 2003; Bollegala et al., 2006)
Input: a sequence of sentences
Output: the best insertion point, found by examining the two sentences surrounding each candidate point
Method: standard ranking perceptron (Collins, 2002)
Features: lexical, positional, and temporal
28
Sentence-level Evaluation Results
Linear baseline: use the Local model to locate the sentence, treating the article as a flat sequence of sentences. Accuracy: 24%
Hierarchical method: Step 1: use Hierarchy to find the best paragraph for the sentence; Step 2: use the Local model to locate the exact position within the chosen paragraph. Accuracy: 35%
29
Conclusions & Future Work
Conclusions:
Incremental text structuring presents a new perspective on text generation
A hierarchical representation coupled with hierarchically sensitive training improves performance
Future work:
Automatic update of Wikipedia web pages
Combining structure induction with text structuring
Code & Data: http://people.csail.mit.edu/edc/emnlp07