Intelligent Database Systems Lab Presenter: WU, MIN-CONG Authors: Zhiyuan Liu, Wenyi Huang, Yabin Zheng and Maosong Sun 2010, ACM Automatic Keyphrase Extraction.

Presentation transcript:

Intelligent Database Systems Lab Presenter: WU, MIN-CONG Authors: Zhiyuan Liu, Wenyi Huang, Yabin Zheng and Maosong Sun 2010, ACM Automatic Keyphrase Extraction via Topic Decomposition

Outline: Motivation, Objectives, Methodology, Experiments, Conclusions, Comments

Motivation: Existing graph-based ranking methods for keyphrase extraction compute only a single importance score for each word via a single random walk. However, both documents and words can be represented as a mixture of semantic topics.

Objectives: We therefore build a Topical PageRank (TPR) on the word graph to measure word importance with respect to different topics. We then calculate the ranking scores of words and extract the top-ranked ones as keyphrases.

Methodology: Building Topic Interpreters. LDA output: the topic-word distribution, Pr(w|z) ∈ ϕ(z) ∈ ϕ, and the document-topic distribution, Pr(z|d) ∈ θ(d) ∈ θ. α, β from, e.g., Gibbs sampling.

Methodology: Topical PageRank for Keyphrase Extraction

Methodology: Constructing Word Graph. The document is regarded as a word sequence. Sliding window size = 3.
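The slide's figure is not preserved, so the following is only a minimal sketch of how a word graph might be built from a word sequence with a sliding window. The function name `build_word_graph`, the edge direction (from the first word in the window to the words that follow it), and the use of co-occurrence counts as link weights are assumptions made for illustration, not details confirmed by the slides.

```python
from collections import defaultdict


def build_word_graph(words, window=3):
    """Return {w_i: {w_j: weight}} built by sliding a window over `words`.

    Assumption: at each position, links go from the first word in the
    window to the other words inside it, and the weight counts how often
    that happens. Windows at the end of the sequence are simply shorter.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for i, first in enumerate(words):
        # The window covers `first` plus the next (window - 1) words.
        for other in words[i + 1 : i + window]:
            if other != first:  # skip self-links
                graph[first][other] += 1
    return {w: dict(neighbours) for w, neighbours in graph.items()}
```

In practice the word sequence would usually be filtered first (for example, keeping only nouns and adjectives), which this sketch leaves to the caller.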

Methodology: Topical PageRank (PageRank). Define the weight of link (wi, wj) as e(wi, wj).

Methodology: Topical PageRank (PageRank). The PageRank formula uses the out-degree of each vertex and assigns equal probabilities of random jump to all vertices.
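The formula on these slides did not survive in the transcript. As a hedged reconstruction, the standard weighted PageRank update fits the pieces that are named (link weight e, out-degree, uniform random jump). The symbols O(w_i) for the out-degree, |V| for the number of vertices, and λ for the damping factor are assumptions here; λ is taken from the parameter slide later in the deck.

```latex
% Out-degree of vertex w_i: total weight of its outgoing links
O(w_i) = \sum_{j \,:\, w_i \to w_j} e(w_i, w_j)

% PageRank score: weighted votes from in-neighbours plus a uniform random jump
R(w_i) = \lambda \sum_{j \,:\, w_j \to w_i} \frac{e(w_j, w_i)}{O(w_j)} \, R(w_j)
         + (1 - \lambda) \, \frac{1}{|V|}
```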

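Methodology: Topical PageRank. The preference values come from LDA. pr(w|z) = pr(w, z)/pr(z) indicates how much topic z focuses on word w. pr(z|w) = pr(w, z)/pr(w) indicates how much word w focuses on topic z. A third option is the product pr(w|z) × pr(z|w) (Cohn and Chang, 2000).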
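To illustrate how topic-specific preference values can replace the uniform random jump, here is a minimal sketch of a biased (personalized) PageRank for a single topic. The function name `topical_pagerank`, the dictionary-based graph layout, the default damping value, and the fixed iteration count are all assumptions for illustration; the slides do not specify an implementation.

```python
def topical_pagerank(graph, preference, damping=0.85, iterations=100):
    """Biased PageRank for one topic z.

    graph:      {w_i: {w_j: weight}} directed word graph.
    preference: {word: p_z(word)}, e.g. pr(w|z) or pr(z|w) from LDA.
    damping:    lambda; 0.85 is a common PageRank default, not a value
                taken from the slides.
    """
    vertices = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    total = sum(preference.get(v, 0.0) for v in vertices)
    if total > 0:
        # Normalise preferences so the random-jump distribution sums to 1.
        pref = {v: preference.get(v, 0.0) / total for v in vertices}
    else:
        # No usable preference values: fall back to a uniform jump.
        pref = {v: 1.0 / len(vertices) for v in vertices}
    out_weight = {v: sum(graph.get(v, {}).values()) for v in vertices}

    scores = dict(pref)
    for _ in range(iterations):
        # Random-jump part, biased towards words preferred by the topic.
        new_scores = {v: (1.0 - damping) * pref[v] for v in vertices}
        # Link-following part; vertices without out-links pass nothing on.
        for src, nbrs in graph.items():
            for dst, weight in nbrs.items():
                new_scores[dst] += damping * scores[src] * weight / out_weight[src]
        scores = new_scores
    return scores
```

Running this once per topic gives one ranking per topic. Passing equal preference values for every word reduces it to ordinary PageRank.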

Methodology: Extract Keyphrases Using Ranking Scores
Step 1. Annotate the document with POS tags.
Step 2. Select noun phrases.
Step 3. Compute the ranking scores of candidate keyphrases separately for each topic.
Step 4. Integrate the topic-specific rankings of candidate keyphrases into a final ranking.
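Steps 3 and 4 can be sketched as follows. The formulas on the slide are not preserved, so this sketch assumes that a phrase's score for a topic is the sum of its words' topic-specific scores, and that the final score weights each topic by the document's topic probability pr(z|d). The function name `rank_phrases` and its parameters are hypothetical.

```python
def rank_phrases(candidates, topic_word_scores, doc_topic_probs, top_m=10):
    """Rank candidate keyphrases across topics.

    candidates:        list of phrases, each a tuple of words.
    topic_word_scores: {z: {word: R_z(word)}} from a per-topic ranking.
    doc_topic_probs:   {z: pr(z|d)} for the document.
    Assumption: phrase score per topic = sum of its word scores; the
    final score = sum over topics of (phrase score * pr(z|d)).
    """
    final = {}
    for phrase in candidates:
        score = 0.0
        for z, pr_z in doc_topic_probs.items():
            word_scores = topic_word_scores.get(z, {})
            phrase_score_z = sum(word_scores.get(w, 0.0) for w in phrase)
            score += phrase_score_z * pr_z
        final[phrase] = score
    # Highest final score first; keep the top M phrases.
    return sorted(final, key=final.get, reverse=True)[:top_m]
```

With this weighting, a phrase that ranks highly only under a topic the document barely covers contributes little to the final ranking.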

Experiment: Datasets. Datasets: NEWS and RESEARCH (table columns: dataset, articles, keyphrases; values not preserved). Topic model: build topic interpreters with LDA. Corpus: Wikipedia snapshot at March (table columns: corpus, web pages, words, topics; only "to 1500" is preserved).

Experiment: Evaluation Metrics. Precision, recall, and F-measure do not take the order of extracted keyphrases into account. For the order-aware metrics, larger values are better, and the values lie between 0 and 1.
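The metric formulas on this slide are not preserved. The sketch below shows precision, recall, and F-measure on keyphrase sets, plus the reciprocal rank of the first correct keyphrase as one common order-aware measure. Choosing reciprocal rank is an assumption made for illustration; the transcript does not name the order-aware metrics that were used. Both function names are hypothetical.

```python
def precision_recall_f1(extracted, gold):
    """Order-insensitive metrics over extracted vs. gold keyphrases."""
    correct = len(set(extracted) & set(gold))
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def reciprocal_rank(extracted, gold):
    """1 / rank of the first correct keyphrase; 0.0 if none is correct.

    Lies between 0 and 1, and larger is better. Averaging it over
    documents gives the mean reciprocal rank.
    """
    gold_set = set(gold)
    for rank, phrase in enumerate(extracted, start=1):
        if phrase in gold_set:
            return 1.0 / rank
    return 0.0
```

Reversing the order of the extracted list leaves precision, recall, and F-measure unchanged but can change the reciprocal rank, which is the limitation the slide points out.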

Experiment: Influence of Parameters on TPR. Window size W. Number of topics K.

Experiment: Influence of Parameters on TPR. Damping factor λ. Preference values: pr(w|z) = pr(w, z)/pr(z) indicates how much topic z focuses on word w; pr(z|w) = pr(w, z)/pr(w) indicates how much word w focuses on topic z. Ex.: he, she.

Experiment: Comparing with Baseline Methods. The baselines TFIDF and PageRank do not use topic information. TPR enjoys the advantages of both LDA and TFIDF/PageRank.

Experiment: Extracting Example

Conclusions: Experiments on two datasets show that TPR achieves better performance than the baseline methods.

Comments. Advantages: TPR incorporates topic information within the random walk for keyphrase extraction. Applications: automatic keyphrase extraction.