Published by Scarlett Todd. Modified over 9 years ago.

Intelligent Database Systems Lab
Automatic Keyphrase Extraction by Bridging Vocabulary Gap
Authors: Zhiyuan Liu, Xinxiong Chen, Yabin Zheng, Maosong Sun
2011, FCCNLL
Presenter: WU, MIN-CONG

Outline: Motivation, Objectives, Methodology, Experiments, Conclusions, Comments

Motivation
Most methods extract keyphrases according to their statistical properties in the given document. This leaves a large vocabulary gap between a document and its keyphrases.

Approach     Property
TFIDF        statistical frequencies
TextRank     tends toward statistical frequencies
ExpandRank   topic drift
LDA          suggests general words

Objectives
We use word alignment models in statistical machine translation to learn translation probabilities between the words in documents and the words in keyphrases.

Methodology - Bridging Vocabulary Gap Using WAM

Methodology - Preparing Translation Pairs

Methodology - Title-based Pairs

Methodology - Summary-based Pairs

Approach          Property
Sampling method   loses the order
Split method      longer training time of WAM
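
To make the trade-off between the two approaches concrete, here is a minimal Python sketch of how summary-based pairs might be built. It is an illustration under assumptions, not the authors' procedure: the function names `sampling_pair` and `split_pairs`, the frequency-weighted draw, and the toy sentences are all hypothetical.

```python
import random
from collections import Counter

def sampling_pair(doc_tokens, summary_tokens, length, seed=0):
    """Sampling method (assumed form): draw `length` words from the document,
    weighted by term frequency, and pair them with the summary.
    The draw ignores positions, so the original word order is lost."""
    rng = random.Random(seed)
    counts = Counter(doc_tokens)
    words = list(counts)
    weights = [counts[w] for w in words]
    sampled = rng.choices(words, weights=weights, k=length)
    return (sampled, summary_tokens)

def split_pairs(doc_sentences, summary_tokens):
    """Split method (assumed form): pair every document sentence with the summary.
    Order inside each sentence is kept, but one document now yields many pairs,
    so the word alignment model has more data to train on."""
    return [(sentence, summary_tokens) for sentence in doc_sentences]

# Toy document, invented for illustration only.
doc_sentences = [
    ["the", "council", "approved", "the", "budget"],
    ["the", "budget", "funds", "new", "schools"],
    ["schools", "open", "next", "year"],
]
doc_tokens = [w for sentence in doc_sentences for w in sentence]
summary = ["budget", "funds", "schools"]

one_pair = sampling_pair(doc_tokens, summary, length=2 * len(summary))
many_pairs = split_pairs(doc_sentences, summary)
print(len(one_pair[0]), "sampled words;", len(many_pairs), "sentence pairs")
```

The sampled pair is a bag of words, which matches the "loses the order" entry, while the split method multiplies the number of pairs per document, which is consistent with the longer WAM training time.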

Methodology - Training Translation Models
Key terms: translation pair, connection
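
The Conclusions slide names IBM Model-1 as the alignment model. The following is a minimal sketch of IBM Model 1 trained with EM on (document words, title or summary words) pairs. It is a simplified textbook version, assuming no NULL word and a fixed number of iterations; the function name `train_ibm_model1` and the toy pairs are hypothetical and not taken from the slides.

```python
from collections import defaultdict

def train_ibm_model1(pairs, iterations=10):
    """Estimate t(target_word | source_word) with EM.

    pairs: list of (source_tokens, target_tokens), e.g. (document, title).
    Returns a dict mapping (target_word, source_word) -> probability.
    Simplification: no NULL source word is added.
    """
    target_vocab = {t for _, tgt in pairs for t in tgt}
    uniform = 1.0 / len(target_vocab)
    prob = defaultdict(lambda: uniform)  # uniform initialisation

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (target, source) links
        total = defaultdict(float)  # expected count of links per source word
        # E-step: distribute each target word over the source words of its pair.
        for src, tgt in pairs:
            for t in tgt:
                norm = sum(prob[(t, s)] for s in src)
                for s in src:
                    frac = prob[(t, s)] / norm
                    count[(t, s)] += frac
                    total[s] += frac
        # M-step: renormalise expected counts per source word.
        for (t, s), c in count.items():
            prob[(t, s)] = c / total[s]

    return dict(prob)

# Toy translation pairs, invented for illustration only.
pairs = [
    (["apple", "iphone", "launch"], ["apple", "phone"]),
    (["iphone", "price"], ["phone"]),
    (["apple", "fruit"], ["apple"]),
]
trans_prob = train_ibm_model1(pairs)
print(trans_prob[("phone", "iphone")] > trans_prob[("apple", "iphone")])
```

In this toy data "iphone" co-occurs with "phone" more often than with "apple", so the learned probability of "phone" given "iphone" ends up higher, which is the kind of document-to-keyphrase link the method relies on.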

Methodology - Keyphrase Extraction
Key terms: noun phrase, normalized TFIDF scores
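
One plausible way to combine these ingredients is sketched below: each candidate noun phrase is scored by how likely its words are to be "translated" from the document's words, with document words weighted by normalized TFIDF. The scoring rule (summing over the words of a phrase), the function name `rank_candidates`, and all numbers are assumptions made for illustration; the slide itself does not give the formula.

```python
def rank_candidates(candidates, word_weights, trans_prob, top_k=5):
    """Rank candidate phrases for one document.

    candidates:   list of token tuples (candidate noun phrases)
    word_weights: dict word -> normalized TFIDF weight in the document
    trans_prob:   dict (target_word, source_word) -> translation probability
    """
    def word_score(target):
        # Weighted sum of translation probabilities from every document word.
        return sum(trans_prob.get((target, w), 0.0) * weight
                   for w, weight in word_weights.items())

    scored = [(sum(word_score(t) for t in phrase), phrase)
              for phrase in candidates]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]

# Illustrative numbers only; not taken from the paper.
word_weights = {"iphone": 0.5, "price": 0.3, "launch": 0.2}
trans_prob = {
    ("phone", "iphone"): 0.6,
    ("apple", "iphone"): 0.3,
    ("phone", "price"): 0.2,
    ("cost", "price"): 0.5,
}
candidates = [("apple", "phone"), ("cost",), ("launch",)]

ranked = rank_candidates(candidates, word_weights, trans_prob)
for score, phrase in ranked:
    print(" ".join(phrase), round(score, 3))
```

Note that "apple phone" and "cost" can rank above "launch" even though neither "apple", "phone", nor "cost" appears among the document words, which is how translation probabilities help bridge the vocabulary gap. Summing over the words of a phrase favours longer phrases, so a real system might normalize by phrase length.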

Experiment
Dataset:

Name                    Articles   Keyphrases        Number of words
Chinese news articles   13702      website editors   72900

                  Documents   Titles   Summaries
Average length    971.7       11.6     45.8

Evaluation: 5-fold cross validation
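
As a rough illustration of the evaluation protocol, the sketch below splits 13702 article indices into five folds and yields a train/test partition for each. The shuffling, the seed, and the helper name `five_fold_splits` are assumptions; the slide does not say how the folds were formed.

```python
import random

def five_fold_splits(n_items, n_folds=5, seed=0):
    """Yield (train_indices, test_indices) for each fold.
    Shuffling with a fixed seed is an assumption made for reproducibility."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Deal indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

splits = list(five_fold_splits(13702))
for train, test in splits:
    print(len(train), len(test))
```

Each article lands in exactly one test fold, so every document is used for testing once and for training the translation model four times.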

Experiment - Evaluation on Keyphrase Extraction
Performance Comparison and Analysis

Experiment - Influence of Parameters on TPR
Influence of Parameters
When Titles/Summaries Are Unavailable

Experiment - Beyond Extraction: Keyphrase Generation

Conclusions
We use IBM Model-1 to bridge the vocabulary gap between the two "languages", documents and keyphrases, for keyphrase generation.

Comments
Advantages – Our method can capture the semantic relations between words in documents and keyphrases.
Applications – Keyphrase extraction.