Tag Clouds Revisited Date : 2011/12/12 Source : CIKM’11 Speaker : I- Chih Chiu Advisor : Dr. Koh. Jia-ling 1.

Slides:



Advertisements
Similar presentations
Term Level Search Result Diversification DATE : 2013/09/11 SOURCE : SIGIR’13 AUTHORS : VAN DANG, W. BRUCE CROFT ADVISOR : DR.JIA-LING, KOH SPEAKER : SHUN-CHEN,
Advertisements

DQR : A Probabilistic Approach to Diversified Query recommendation Date: 2013/05/20 Author: Ruirui Li, Ben Kao, Bin Bi, Reynold Cheng, Eric Lo Source:
Collaborative Filtering in Social Tagging System on Joint Item-Tag Recommendations Date : 2011/11/7 Source : Jing Peng et. al (CIKM’10) Speaker : Chiu.
Diversity Maximization Under Matroid Constraints Date : 2013/11/06 Source : KDD’13 Authors : Zeinab Abbassi, Vahab S. Mirrokni, Mayur Thakur Advisor :
Date : 2013/05/27 Author : Anish Das Sarma, Lujun Fang, Nitin Gupta, Alon Halevy, Hongrae Lee, Fei Wu, Reynold Xin, Gong Yu Source : SIGMOD’12 Speaker.
Text-Based Measures of Document Diversity Date : 2014/02/12 Source : KDD’13 Authors : Kevin Bache, David Newman, and Padhraic Smyth Advisor : Dr. Jia-Ling,
Sequence Clustering and Labeling for Unsupervised Query Intent Discovery Speaker: Po-Hsien Shih Advisor: Jia-Ling Koh Source: WSDM’12 Date: 1 November,
Time-sensitive Personalized Query Auto-Completion
Bring Order to Your Photos: Event-Driven Classification of Flickr Images Based on Social Knowledge Date: 2011/11/21 Source: Claudiu S. Firan (CIKM’10)
Query Dependent Pseudo-Relevance Feedback based on Wikipedia SIGIR ‘09 Advisor: Dr. Koh Jia-Ling Speaker: Lin, Yi-Jhen Date: 2010/01/24 1.
Searchable Web sites Recommendation Date : 2012/2/20 Source : WSDM’11 Speaker : I- Chih Chiu Advisor : Dr. Koh Jia-ling 1.
GENERATING AUTOMATIC SEMANTIC ANNOTATIONS FOR RESEARCH DATASETS AYUSH SINGHAL AND JAIDEEP SRIVASTAVA CS DEPT., UNIVERSITY OF MINNESOTA, MN, USA.
Active Learning and Collaborative Filtering
Mining Query Subtopics from Search Log Data Date : 2012/12/06 Resource : SIGIR’12 Advisor : Dr. Jia-Ling Koh Speaker : I-Chih Chiu.
1 Entity Ranking Using Wikipedia as a Pivot (CIKM 10’) Rianne Kaptein, Pavel Serdyukov, Arjen de Vries, Jaap Kamps 2010/12/14 Yu-wen,Hsu.
Approaches to automatic summarization Lecture 5. Types of summaries Extracts – Sentences from the original document are displayed together to form a summary.
1 Context-Aware Search Personalization with Concept Preference CIKM’11 Advisor : Jia Ling, Koh Speaker : SHENG HONG, CHUNG.
Leveraging Conceptual Lexicon : Query Disambiguation using Proximity Information for Patent Retrieval Date : 2013/10/30 Author : Parvaz Mahdabi, Shima.
By : Garima Indurkhya Jay Parikh Shraddha Herlekar Vikrant Naik.
Mining High Utility Itemsets without Candidate Generation Date: 2013/05/13 Author: Mengchi Liu, Junfeng Qu Source: CIKM "12 Advisor: Jia-ling Koh Speaker:
Beyond Co-occurrence: Discovering and Visualizing Tag Relationships from Geo-spatial and Temporal Similarities Date : 2012/8/6 Resource : WSDM’12 Advisor.
11 CANTINA: A Content- Based Approach to Detecting Phishing Web Sites Reporter: Gia-Nan Gao Advisor: Chin-Laung Lei 2010/6/7.
Andriy Shepitsen, Jonathan Gemmell, Bamshad Mobasher, and Robin Burke
No Title, yet Hyunwoo Kim SNU IDB Lab. September 11, 2008.
Ruirui Li, Ben Kao, Bin Bi, Reynold Cheng, Eric Lo Speaker: Ruirui Li 1 The University of Hong Kong.
CIKM’09 Date:2010/8/24 Advisor: Dr. Koh, Jia-Ling Speaker: Lin, Yi-Jhen 1.
Personalized Search Cheng Cheng (cc2999) Department of Computer Science Columbia University A Large Scale Evaluation and Analysis of Personalized Search.
Exploring Online Social Activities for Adaptive Search Personalization CIKM’10 Advisor : Jia Ling, Koh Speaker : SHENG HONG, CHUNG.
A Probabilistic Graphical Model for Joint Answer Ranking in Question Answering Jeongwoo Ko, Luo Si, Eric Nyberg (SIGIR ’ 07) Speaker: Cho, Chin Wei Advisor:
Date: 2013/8/27 Author: Shinya Tanaka, Adam Jatowt, Makoto P. Kato, Katsumi Tanaka Source: WSDM’13 Advisor: Jia-ling Koh Speaker: Chen-Yu Huang Estimating.
ON THE SELECTION OF TAGS FOR TAG CLOUDS (WSDM11) Advisor: Dr. Koh. Jia-Ling Speaker: Chiang, Guang-ting Date:2011/06/20 1.
You Are What You Tag Yi-Ching Huang and Chia-Chuan Hung and Jane Yung-jen Hsu Department of Computer Science and Information Engineering Graduate Institute.
Binxing Jiao et. al (SIGIR ’10) Presenter : Lin, Yi-Jhen Advisor: Dr. Koh. Jia-ling Date: 2011/4/25 VISUAL SUMMARIZATION OF WEB PAGES.
NTU Natural Language Processing Lab. 1 An Analysis of Effectiveness of Tagging in Blogs Christopher H. Brooks and Nancy Montanez University of San Francisco.
LOGO Finding High-Quality Content in Social Media Eugene Agichtein, Carlos Castillo, Debora Donato, Aristides Gionis and Gilad Mishne (WSDM 2008) Advisor.
Enhancing Cluster Labeling Using Wikipedia David Carmel, Haggai Roitman, Naama Zwerdling IBM Research Lab (SIGIR’09) Date: 11/09/2009 Speaker: Cho, Chin.
Wikipedia as Sense Inventory to Improve Diversity in Web Search Results Celina SantamariaJulio GonzaloJavier Artiles nlp.uned.es UNED,c/Juan del Rosal,
Templated Search over Relational Databases Date: 2015/01/15 Author: Anastasios Zouzias, Michail Vlachos, Vagelis Hristidis Source: ACM CIKM’14 Advisor:
Date : 2013/03/18 Author : Jeffrey Pound, Alexander K. Hudek, Ihab F. Ilyas, Grant Weddell Source : CIKM’12 Speaker : Er-Gang Liu Advisor : Prof. Jia-Ling.
DOCUMENT UPDATE SUMMARIZATION USING INCREMENTAL HIERARCHICAL CLUSTERING CIKM’10 (DINGDING WANG, TAO LI) Advisor: Koh, Jia-Ling Presenter: Nonhlanhla Shongwe.
Improving Recommendation Lists Through Topic Diversification CaiNicolas Ziegler, Sean M. McNee,Joseph A. Konstan, Georg Lausen WWW '05 報告人 : 謝順宏 1.
Automatic Video Tagging using Content Redundancy Stefan Siersdorfer 1, Jose San Pedro 2, Mark Sanderson 2 1 L3S Research Center, Germany 2 University of.
LOGO Identifying Opinion Leaders in the Blogosphere Xiaodan Song, Yun Chi, Koji Hino, Belle L. Tseng CIKM 2007 Advisor : Dr. Koh Jia-Ling Speaker : Tu.
Exploiting Group Recommendation Functions for Flexible Preferences.
Pairwise Preference Regression for Cold-start Recommendation Speaker: Yuanshuai Sun
1 Generating Comparative Summaries of Contradictory Opinions in Text (CIKM09’)Hyun Duk Kim, ChengXiang Zhai 2010/05/24 Yu-wen,Hsu.
A Classification-based Approach to Question Answering in Discussion Boards Liangjie Hong, Brian D. Davison Lehigh University (SIGIR ’ 09) Speaker: Cho,
Date: 2012/11/29 Author: Chen Wang, Keping Bi, Yunhua Hu, Hang Li, Guihong Cao Source: WSDM’12 Advisor: Jia-ling, Koh Speaker: Shun-Chen, Cheng.
Date: 2011/1/11 Advisor: Dr. Koh. Jia-Ling Speaker: Lin, Yi-Jhen Mr. KNN: Soft Relevance for Multi-label Classification (CIKM’10) 1.
Date: 2013/4/1 Author: Jaime I. Lopez-Veyna, Victor J. Sosa-Sosa, Ivan Lopez-Arevalo Source: KEYS’12 Advisor: Jia-ling Koh Speaker: Chen-Yu Huang KESOSD.
PERSONALIZED DIVERSIFICATION OF SEARCH RESULTS Date: 2013/04/15 Author: David Vallet, Pablo Castells Source: SIGIR’12 Advisor: Dr.Jia-ling, Koh Speaker:
A DDING S TRUCTURE TO T OP -K: F ORM I TEMS TO E XPANSIONS Date : Source : CIKM’ 11 Speaker : I-Chih Chiu Advisor : Dr. Jia-Ling Koh 1.
Search Result Diversification in Resource Selection for Federated Search Date : 2014/06/17 Author : Dzung Hong, Luo Si Source : SIGIR’13 Advisor: Jia-ling.
Hybrid Content and Tag-based Profiles for recommendation in Collaborative Tagging Systems Latin American Web Conference IEEE Computer Society, 2008 Presenter:
Predicting User Interests from Contextual Information R. W. White, P. Bailey, L. Chen Microsoft (SIGIR 2009) Presenter : Jae-won Lee.
1 Blog Cascade Affinity: Analysis and Prediction 2009 ACM Advisor : Dr. Koh Jia-Ling Speaker : Chou-Bin Fan Date :
Predicting Short-Term Interests Using Activity-Based Search Context CIKM’10 Advisor: Jia Ling, Koh Speaker: Yu Cheng, Hsieh.
ENHANCING CLUSTER LABELING USING WIKIPEDIA David Carmel, Haggai Roitman, Naama Zwerdling IBM Research Lab SIGIR’09.
Mining Tag Semantics for Social Tag Recommendation Hsin-Chang Yang Department of Information Management National University of Kaohsiung.
LOGO Comments-Oriented Blog Summarization by Sentence Extraction Meishan Hu, Aixin Sun, Ee-Peng Lim (ACM CIKM’07) Advisor : Dr. Koh Jia-Ling Speaker :
1 Discovering Web Communities in the Blogspace Ying Zhou, Joseph Davis (HICSS 2007)
Finding similar items by leveraging social tag clouds Speaker: Po-Hsien Shih Advisor: Jia-Ling Koh Source: SAC 2012’ Date: October 4, 2012.
CiteData: A New Multi-Faceted Dataset for Evaluating Personalized Search Performance CIKM’10 Advisor : Jia-Ling, Koh Speaker : Po-Hsien, Shih.
A Recommender System based on Tag and Time Information for Social Tagging Systems Nan Zheng and Qiudan Li (Chinese Academy of Sciences) Expert Systems.
Short Text Similarity with Word Embedding Date: 2016/03/28 Author: Tom Kenter, Maarten de Rijke Source: CIKM’15 Advisor: Jia-Ling Koh Speaker: Chih-Hsuan.
QUERY-PERFORMANCE PREDICTION: SETTING THE EXPECTATIONS STRAIGHT Date : 2014/08/18 Author : Fiana Raiber, Oren Kurland Source : SIGIR’14 Advisor : Jia-ling.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 Intelligent Exploration for Genetic Algorithms Using Self-Organizing.
Speaker: Jim-an tsai advisor: professor jia-lin koh
Discovery of Blog Communities based on Mutual Awareness
Preference Based Evaluation Measures for Novelty and Diversity
Presentation transcript:

Tag Clouds Revisited Date : 2011/12/12 Source : CIKM’11 Speaker : I- Chih Chiu Advisor : Dr. Koh. Jia-ling 1

Index Introduction Tag Selection Framework Tag Selection Strategies  Based on Frequency  Based on Diversity  Based on Rank Aggregation Evaluation Methodology Experimental Evaluation Conclusions 2

Introduction Tagging has become a very common feature in Web 2.0 applications, providing a simple and effective way for users to freely annotate resources to facilitate their discovery and management. Tag clouds have become popular as a summarized representation of a collection of tagged resources. 3

Introduction Motivation  How effective is the strategy of ranking tags in item collections based on their frequency?  Are there any better strategies for this task? 4

Tag Selection Framework 5 t 1 t 2 t 3 t 4 t 5 t 1 t 3 t 4 t 1 t 2 t 5 t 1 t 3 t 1 t 3 t 5 u1 u2 u3 u5u4 group G

Tag Selection Framework Define the overall utility value of T G 6 group G t 1 t 2 t 3 t 4 t 5 t 1 t 3 t 4 t 1 t 2 t 5 t 1 t 3 t 1 t 3 t 5 u1 u2 u3 u5u4 ={t 1,t 2,t 3,t 4,t 5 }

Tag Selection Framework The optimal tag cloud for G is the set T G that is a subset of T(G) with size k and maximizes the utility function F Propose different tag selection methods based on different approaches for defining the utility function f for the members of the tag cloud. 7 T G ={t 1,t 2,t 3 } T G ={t 1,t 2,t 4 } T G ={t 1,t 2,t 5 } … T G ={t 3,t 4,t 5 }

Tag Selection Strategies Base on Frequency  Frequency scoring  TF.IDF scoring  Graph-based scoring Based on Diversity  Diversity  Novelty Based on Rank Aggregation 8

Based on Frequency Frequency scoring  The number of objects to which a tag is assigned. 9 t 1 t 2 t 3 t 4 t 5 t 1 t 3 t 4 t 1 t 2 t 5 t 1 t 3 t 1 t 3 t 5 u1 u2 u3 u5u4 group G

Based on Frequency TF.IDF scoring  The computation of the utility score of a tag t with respect to a group G relies not only on the contents of this particular group but also on the contents of the other groups in the collection. 10 t 1 t 2 t 3 t 4 t 5 t 1 t 3 t 4 t 1 t 2 t 5 t 1 t 3 t 1 t 3 t 5 u1 u2 u3 u5u4 group G

Based on Frequency Graph-based scoring  Considering combinations of tags that occur together rather than individual tags may be more informative. 11 Google similarity distance

Based on Frequency Graph-based scoring 12

Based on Diversity Diversity  To select tags that are as dissimilar as possible from each other, in the sense that appear indifferent sets of objects. 13 t={beach} sim(beach,sea) sim(beach,sky) t={forest} sim(forest,sea) sim(forest,sky) sea,sky

Based on Diversity Novelty  To emphasize on the novelty of newly selected tags, while the cloud is constructed. 14 This function can be defined to return 1 if n v,T G = 0, and 0 otherwise. For example, a tag t appears only in a single object u, and there is already another tag of u in the cloud, then the utility score of t is 0. sea,sky sea,beach,sun u TGTG

Based on Rank Aggregation The order in which the tags appear in these objects. Define a utility function based on the Borda Count method. 15 t 1 t 2 t 3 t 4 t 5 t 1 t 3 t 4 t 1 t 2 t 5 t 1 t 3 t 1 t 3 t 5 u1 u2 u3 u5u4 group G

Evaluation Methodology Metrics for Search and Navigation  Coverage  Overlap  Selectivity User Navigation Model Group Recommendation Accuracy 16

Metrics for Search and Navigation Coverage  Since a tag cloud aims at providing an entry point for searching and navigating.  For every object, at least one of its tags should appear in the tag cloud. 17 t 1 t 2 t 3 t 4 t 5 t 1 t 3 t 4 t 1 t 2 t 5 t 1 t 3 t 1 t 3 t 5 u1 u2 u3 u5u4 group G

Metrics for Search and Navigation Overlap  It would like to avoid cases where different tags in the cloud, when selected, lead to the same or very similar subsets of objects. 18 t 1 t 2 t 3 t 4 t 5 t 1 t 3 t 4 t 1 t 2 t 5 t 1 t 3 t 1 t 3 t 5 u1 u2 u3 u5u4 group G

Metrics for Search and Navigation Selectivity  A tag cloud should facilitate users to drill down to specific objects of interest. 19 t 1 t 2 t 3 t 4 t 5 t 1 t 3 t 4 t 1 t 2 t 5 t 1 t 3 t 1 t 3 t 5 u1 u2 u3 u5u4 group G

User Navigation Model 20

Group Recommendation Accuracy They have considered the task of using the tag cloud to find items of interest within a group. Another important and common task is to recommend groups for new items. 21 t 1 t 2 t 3 t 4 t 5 t 1 t 3 t 4 t 1 t 2 t 5 t 1 t 3 t 1 t 3 t 5 u1 u2 u3 u5u4 group G t 2 t 5 t 6 u T G ={t 2,t 4 }

Experimental Evaluation Dataset  Top 60 groups.  2000 photos for each group 22

Results Coverage, Overlap and Selectivity  Increasing the size of the tag cloud improves the performance of all methods in all metrics. 23

Results Navigation Cost  The navigation cost is affected by coverage and selectivity.  The navigation cost decreases for all methods as the tag cloud size increases. 24

Results Recommendation Accuracy  The goal is to recommend groups for this photo based on their tag clouds. 25

Conclusions Methods employing diversification or rank aggregation can improve the performance of tag clouds with respect to these metrics, compared to the traditional frequency-based ranking. There exist several interesting directions for future work, these include extracting semantics of tags and exploiting content-based similarity of objects. 26