Date : 2013/09/17 Source : SIGIR’13 Authors : Zhu, Xingwei

Presentation transcript:

Topic Hierarchy Construction for the Organization of Multi-Source User Generated Contents
Date: 2013/09/17
Source: SIGIR’13
Authors: Zhu, Xingwei; Ming, Zhao-Yan; Zhu, Xiaoyan; Chua, Tat-Seng
Advisor: Dr. Jia-ling Koh
Speaker: Wei Chang

Outline Introduction Approach Experiment Conclusion

iPhone 5s? iPhone 5c?

Multi-Source User Generated Contents

Problem Formulation
Goal: Given a root topic $C$ and its information source set $S_C$, we aim to build and continuously update a topic hierarchy $H$ for $C$ in order to organize the information in $S_C$ according to its relevant topics.
In this paper, $S_C$ = {Blogger, Twitter, community QA site (cQA)}.

Outline
Introduction
Approach: Framework, Topic Term Identification, Topic Relation Identification, Topic Hierarchy Generation, Topic Hierarchy Update
Experiment
Conclusion

Framework

Topic Term Identification
[Figure: pipeline diagram with the stages User Generated Contents, Potential Grounding Topics, Grounding Topic Set, and Final Candidate Topic Set; step labels: Heuristic Rules, TF-IDF, External Sources]

Heuristic Rules

Grounding Topic Set (TF-IDF)
[Figure: example of TF-IDF-based selection of grounding topics from the documents Blog 1, QA 1, QA 2, Tweet 1, and Tweet 2; terms shown include iPhone, Apple Inc., T-Mobile, iOS, Smartphone, 64-bit, and Price]

Grounding Topic Set
Blogs: use the content and the title; double the weights of terms in titles; use the top 5 terms.
cQAs: use the question title, the description, and the best answers.
Tweets: use the content; use the top 1 term.
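The per-source selection rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the whitespace tokenizer, the small stop-word list, the plain `tf * log(N / df)` weighting, and the reading of "double weights" as counting each title term twice are all assumptions, and `top_terms` and `k_by_source` are hypothetical names. The toy documents cover only blogs and tweets, because the slide does not state a cut-off for cQA pages.

```python
import math
from collections import Counter

STOPWORDS = {"the", "is", "and"}  # assumed toy stop-word list


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption, not from the slides)."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def top_terms(docs, k_by_source):
    """Return the top-k TF-IDF terms of each document, with k chosen per source."""
    counts = []
    for doc in docs:
        tf = Counter(tokenize(doc["body"]))
        for term in tokenize(doc.get("title", "")):
            tf[term] += 2  # assumed reading of "double weights of terms in titles"
        counts.append(tf)
    n_docs = len(docs)
    df = Counter(term for tf in counts for term in tf)  # document frequency
    selected = []
    for doc, tf in zip(docs, counts):
        scores = {t: f * math.log(n_docs / df[t]) for t, f in tf.items()}
        ranked = sorted(scores, key=lambda t: (-scores[t], t))  # ties broken alphabetically
        selected.append(ranked[: k_by_source[doc["source"]]])
    return selected


docs = [
    {"source": "blog", "title": "iphone review",
     "body": "the iphone camera is great and the iphone price is high"},
    {"source": "tweet", "body": "ios update is out"},
    {"source": "tweet", "body": "price drop today"},
]
grounding = top_terms(docs, {"blog": 5, "tweet": 1})
print(grounding)
```

Passing the cut-offs as a dictionary keeps the "top 5 for blogs, top 1 for tweets" rule in one place, so a value for cQA pages can be supplied without changing the function.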

Topic Set Extension
What we already have: the grounding topic set $T_G = \{t_{g1}, t_{g2}, \ldots\}$
What it lacks: middle-level topics
How to get middle-level topics:
Search engine: 2 patterns, "* such as <slot>" and "<slot> of *"
WordNet: direct hypernym
Wikipedia: category tags
Final candidate topic set: $T = \{C\} \cup T_G \cup \{\text{middle-level topics}\}$


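Topic Relation Identification
[Figure: graph over the topics Apple Inc., iPhone, and iPhone 5s, in which every ordered pair of topics is connected by an edge labeled with its evidence: $e(r(t_A, t_B))$, $e(r(t_B, t_A))$, $e(r(t_A, t_C))$, $e(r(t_C, t_A))$, $e(r(t_B, t_C))$, $e(r(t_C, t_B))$]
Denote $r(t_A, t_B)$ as a sub-topic relation, which means $t_B$ is a sub-topic of $t_A$.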

Topic Relation Identification

Evidences from the Information Source Set
$e_{distr}^{doc}(t_A, t_B)$, $e_{distr}^{sen}(t_A, t_B)$: the cosine similarity between the corresponding contexts of $t_A$ and $t_B$.
Example: $V$ = (smart phone, price, buy, iOS, Android), $t_A$ = Apple Inc., $t_B$ = T-Mobile, $v_{t_A} = (3, 5, 10, 2, 3)$, $v_{t_B} = (2, 4, 11, 1, 3)$
$$e_{distr}^{doc}(t_A, t_B) = \frac{\langle v_{t_A}, v_{t_B} \rangle}{\lVert v_{t_A} \rVert \, \lVert v_{t_B} \rVert}$$
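As a quick check of the example, the cosine similarity of the two context vectors on this slide can be computed directly. The helper name `cosine` is chosen here for illustration; the vectors are the ones given for Apple Inc. and T-Mobile.

```python
import math


def cosine(u, v):
    """Cosine similarity: inner product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


v_a = (3, 5, 10, 2, 3)  # context vector of t_A = Apple Inc.
v_b = (2, 4, 11, 1, 3)  # context vector of t_B = T-Mobile
print(round(cosine(v_a, v_b), 4))  # 0.9867
```

The inner product is 147 and the squared norms are 147 and 151, so the score is sqrt(147/151), roughly 0.987: the two topics occur in very similar contexts.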

Evidences from Wikipedia
Pointwise Mutual Information (PMI)
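The slide names PMI without showing how it is estimated. A minimal sketch of the standard definition, $\mathrm{PMI}(a, b) = \log \frac{p(a, b)}{p(a)\,p(b)}$, is given below. Estimating the probabilities from article counts (how many articles mention each topic, and how many mention both) is an assumption made here for illustration, and the counts in the example are made up.

```python
import math


def pmi(n_a, n_b, n_ab, n_total):
    """PMI from counts: articles with a, with b, with both, and in total."""
    if n_ab == 0:
        return float("-inf")  # the two topics never co-occur
    p_a = n_a / n_total
    p_b = n_b / n_total
    p_ab = n_ab / n_total
    return math.log(p_ab / (p_a * p_b))


# Hypothetical counts: 1000 articles, 100 mention a, 50 mention b, 20 mention both.
score = pmi(100, 50, 20, 1000)
print(score)
```

With these counts the ratio is 0.02 / (0.1 * 0.05) = 4, so the score is log 4: the pair co-occurs four times as often as it would if the two topics were independent.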

Evidences from WordNet

Evidences from Search Engine Results
Pattern-based evidences
Query = "$t_A$ such as $t_B$ and" root topic
$$e_{spattern_i}(t_A, t_B) = \begin{cases} 1 & \text{if the search engine returns more than } \zeta \text{ results that contain this query} \\ 0 & \text{otherwise} \end{cases}$$
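The pattern-based evidence is a thresholded indicator, which could be sketched as follows. The `hit_count` callable stands in for a real search-engine client and is purely hypothetical, as are the result counts in the example. The query layout (the quoted pattern followed by the root topic) is one reading of the slide.

```python
def build_query(t_a, t_b, root_topic):
    """Quoted 'such as' pattern followed by the root topic."""
    return f'"{t_a} such as {t_b} and" {root_topic}'


def pattern_evidence(t_a, t_b, root_topic, hit_count, zeta):
    """Return 1 if the query yields more than zeta results, else 0."""
    return 1 if hit_count(build_query(t_a, t_b, root_topic)) > zeta else 0


# Hypothetical stand-in for a search engine: a fixed table of result counts.
fake_hits = {'"smartphone such as iphone and" apple': 120}


def fake_hit_count(query):
    return fake_hits.get(query, 0)


e_forward = pattern_evidence("smartphone", "iphone", "apple", fake_hit_count, zeta=50)
e_reverse = pattern_evidence("iphone", "smartphone", "apple", fake_hit_count, zeta=50)
print(e_forward, e_reverse)  # 1 0
```

Because the pattern is directional, evaluating it in both orders gives asymmetric evidence for which topic is the sub-topic of the other.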

Combine Evidences


Topic Hierarchy Generation

Topic Hierarchy Generation

Topic Hierarchy Generation

Topic Hierarchy Generation

Edge Weighting

Hierarchy Pruning
Use the Chu-Liu/Edmonds optimum branching algorithm, so that every non-root node has only one parent and the sum of the edge weights is maximized.
Then remove (1) the nodes that are not reachable from the root topic and (2) the leaf nodes that are not in the grounding topic set.
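The two pruning steps can be illustrated with a compact, pure-Python sketch. It is a textbook-style recursive Chu-Liu/Edmonds contraction followed by the two removal rules, not the authors' code. The function names, the edge weights, and the extra topics "Android" and "Nexus" are made up for the example, and removing non-grounding leaves repeatedly until none remain is an assumption (the slide does not say whether the rule is applied once or iteratively).

```python
def find_cycle(parent):
    """Return the set of nodes on a cycle of the child -> parent map, or None."""
    seen = {}
    for start in parent:
        path, v = [], start
        while v in parent and v not in seen:
            seen[v] = start
            path.append(v)
            v = parent[v]
        if v in parent and seen[v] == start:  # walked back onto the current path
            return set(path[path.index(v):])
    return None


def max_branching(edges, root):
    """edges: {(parent, child): weight}. Returns {child: parent} of maximum total weight."""
    best_in = {}
    for (u, v), w in edges.items():  # greedily keep the heaviest incoming edge per node
        if v != root and u != v and (v not in best_in or w > edges[(best_in[v], v)]):
            best_in[v] = u
    cycle = find_cycle(best_in)
    if cycle is None:
        return best_in
    c = ("contracted", frozenset(cycle))  # super-node replacing the cycle
    new_edges, origin = {}, {}
    for (u, v), w in edges.items():
        if u in cycle and v in cycle:
            continue
        if v in cycle:  # entering the cycle: pay for the cycle edge it displaces
            key, w2 = (u, c), w - edges[(best_in[v], v)]
        elif u in cycle:
            key, w2 = (c, v), w
        else:
            key, w2 = (u, v), w
        if key not in new_edges or w2 > new_edges[key]:
            new_edges[key], origin[key] = w2, (u, v)
    result = {}
    for v, u in max_branching(new_edges, root).items():  # expand the contracted solution
        orig_u, orig_v = origin[(u, v)]
        result[orig_v] = orig_u
    for v in cycle:  # cycle nodes keep their cycle edge unless entered from outside
        result.setdefault(v, best_in[v])
    return result


def prune(parent, root, grounding):
    """Drop nodes unreachable from root, then leaves outside the grounding set."""
    children = {}
    for v, u in parent.items():
        children.setdefault(u, []).append(v)
    reachable, stack = {root}, [root]
    while stack:
        for v in children.get(stack.pop(), []):
            if v not in reachable:
                reachable.add(v)
                stack.append(v)
    tree = {v: u for v, u in parent.items() if v in reachable}
    while True:  # assumed: repeat until no non-grounding leaf is left
        internal = set(tree.values())
        drop = [v for v in tree if v not in internal and v not in grounding]
        if not drop:
            return tree
        for v in drop:
            del tree[v]


# Made-up edge weights for illustration only.
edges = {
    ("Apple Inc.", "iPhone"): 5, ("Apple Inc.", "iPhone 5s"): 1,
    ("iPhone", "iPhone 5s"): 8, ("iPhone 5s", "iPhone"): 9,
    ("iPhone 5s", "64-bit"): 4, ("iPhone", "64-bit"): 3,
    ("Apple Inc.", "Smartphone"): 2, ("Android", "Nexus"): 6,
}
grounding_topics = {"iPhone", "iPhone 5s", "64-bit", "Nexus"}
hierarchy = prune(max_branching(edges, "Apple Inc."), "Apple Inc.", grounding_topics)
print(hierarchy)
```

In this toy graph the greedy choice of heaviest incoming edges creates the cycle iPhone ↔ iPhone 5s, which the contraction step breaks by entering the cycle through Apple Inc. → iPhone. Pruning then drops "Nexus" (not reachable from the root) and "Smartphone" (a leaf outside the grounding set), leaving the chain Apple Inc. → iPhone → iPhone 5s → 64-bit.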

Topic Hierarchy Update

Experiment

Topic Term Identification

Topic Hierarchy Generation

Topic Hierarchy Generation

Hierarchy Update


Conclusion
Given a root topic, we used evidence from multiple UGC sources to identify topic terms and the sub-topic relations between them. A graph-based algorithm was then applied to these topic terms to generate and update the topic hierarchy, on which the UGCs can be organized according to their relevant topics.