Flickr Tag Recommendation based on Collective Knowledge Börkur Sigurbjörnsson, Roelof van Zwol Yahoo! Research WWW 2008 2009. 03. 13. Summarized and presented.

Presentation transcript:

Flickr Tag Recommendation based on Collective Knowledge
Börkur Sigurbjörnsson, Roelof van Zwol
Yahoo! Research
WWW 2008
Summarized and presented by Hwang Inbeom, IDS Lab., Seoul National University

Copyright 2008 by CEBT
Overview
- Recommending tags for an image
  - More tags, more semantic meaning
- Two questions are addressed
  - How effective would tag recommendation be? (Analysis of tagging behavior)
  - How can we recommend tags? (Presentation of several recommendation strategies)

Tagging
- Tagging: the act of adding keywords to objects
- A popular means of annotating various web resources
  - Web page bookmarks
  - Academic publications
  - Multimedia objects
  - …

Advantages of Tagging Images
- Content-based image retrieval is progressing, but it has not yet succeeded in bridging the semantic gap
- Tagging is essential for large-scale image retrieval systems to work in practice
- Extending the tags of a photo
  - Gives a richer semantic description
  - Lets the photo be retrieved for a larger range of keyword queries
- Example: the tags [Sagrada Familia, Barcelona] extended to [Sagrada Familia, Barcelona, Gaudi, Spain, Catalunya, architecture, church]

Analysis of Tagging Behaviors
- How do users tag photos?
  - Distribution of tag frequency
  - Distribution of the number of tags per photo
- What kind of tags do they provide?
  - Tag categorization with WordNet

Tag Frequency
- The distribution of tag frequency can be modeled by a power law
- Tags in the head of the power law are too generic
  - Examples: 2006, 2005, wedding
- Tags in the tail of the power law are incidentally occurring words
  - Examples: ambrose tompkins, ambient vector
[Figure: tag frequency distribution, with head and tail marked]

Number of Tags per Photo
- The distribution of the number of tags per photo can also be modeled by a power law
- Photos in the head of the power law are exhaustively annotated
- Photos in the tail of the power law are where a tag recommendation system could be useful
  - The tail covers 64% of the photos
[Figure: distribution of the number of tags per photo, with head and tail marked]

Number of Tags per Photo (contd.)
- Photos are classified by the number of tags they carry, so that recommendation performance can be analyzed at different annotation levels

             Tags per photo   Photos
  Class I    1                15,500,000
  Class II   2-3              17,500,000
  Class III  4-6              12,000,000
  Class IV   >6                7,000,000
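
The classification above can be sketched in a few lines of Python. The helper name `photo_class` is hypothetical, and the counts are the rounded figures from the table. Assuming the 64% quoted for the tail refers to photos with one to three tags (Classes I and II), the rounded counts reproduce that figure approximately.

```python
# Photo counts per class, copied from the table above (rounded figures).
PHOTOS_PER_CLASS = {
    "Class I": 15_500_000,
    "Class II": 17_500_000,
    "Class III": 12_000_000,
    "Class IV": 7_000_000,
}


def photo_class(num_tags):
    """Map a photo's tag count to its class label (hypothetical helper)."""
    if num_tags < 1:
        raise ValueError("the classes only cover photos with at least one tag")
    if num_tags == 1:
        return "Class I"
    if num_tags <= 3:
        return "Class II"
    if num_tags <= 6:
        return "Class III"
    return "Class IV"


total = sum(PHOTOS_PER_CLASS.values())
# Share of photos with at most three tags (Classes I and II).
sparse_share = (PHOTOS_PER_CLASS["Class I"] + PHOTOS_PER_CLASS["Class II"]) / total
print(f"{total:,} photos in total, {sparse_share:.1%} with at most 3 tags")
```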

Tag Categorization
- 52% of the tags could be categorized with WordNet categories
- Users describe a broader context with their tags, not only the visual contents of the photo
  - Where and when the photo was taken
  - What the people in the photo are doing
  - …

Tag Recommendation System
[Figure: system overview]
- User-defined tags: Sagrada Familia, Barcelona
- Candidate tags for Sagrada Familia: Barcelona, Gaudi, Spain, architecture, Catalunya, church
- Candidate tags for Barcelona: Spain, Gaudi, 2006, Catalunya, Europe, travel
- Recommended tags: Gaudi, Spain, Catalunya, architecture, church

Tag Recommendation Strategies
- Finding candidate tags based on tag co-occurrence
  - Symmetric measures
  - Asymmetric measures
- Aggregation and ranking of candidate tags
  - Voting strategy
  - Summing strategy
- Promotion

Tag Co-occurrence
- Find the tags that co-occur with a given tag
- Co-occurring tags with higher scores become candidate tags
- Co-occurrence can be measured in two ways
  - Symmetric measures
  - Asymmetric measures

Tag Co-occurrence (contd.)
- Symmetric measures
  - Jaccard's coefficient: a statistic for comparing the similarity and diversity of sample sets
  - Useful for identifying equivalent tags
  - Example: for Eiffel tower, the top co-occurring tags are Tour Eiffel, Eiffel, Seine, La tour Eiffel, Paris
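
As a minimal sketch of the symmetric measure: the Jaccard coefficient of two tags is the number of photos carrying both tags divided by the number of photos carrying either one. The function name `jaccard` and the toy photo ids below are assumptions made for illustration, not data from the presentation.

```python
def jaccard(photos_a, photos_b):
    """Jaccard coefficient of two collections of photo ids."""
    a, b = set(photos_a), set(photos_b)
    union = a | b
    if not union:
        return 0.0  # two unused tags: define the similarity as 0
    return len(a & b) / len(union)


# Hypothetical toy index: tag -> ids of the photos carrying that tag.
tag_photos = {
    "eiffel tower": {1, 2, 3, 4},
    "tour eiffel": {2, 3, 4},
    "paris": {1, 2, 3, 4, 5, 6, 7, 8, 9, 10},
}

jaccard_tour = jaccard(tag_photos["eiffel tower"], tag_photos["tour eiffel"])
jaccard_paris = jaccard(tag_photos["eiffel tower"], tag_photos["paris"])
print(jaccard_tour, jaccard_paris)
```

In this toy index the near-synonym "tour eiffel" scores higher than the much more frequent "paris", which is the behavior that makes the symmetric measure suited to finding equivalent tags.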

Tag Co-occurrence (contd.)
- Asymmetric measures
  - Tag co-occurrence is normalized by the frequency of one of the tags
  - Can provide more diverse candidates than the symmetric measure
  - Example: for Eiffel Tower, the top co-occurring tags are Paris, France, Tour Eiffel, Eiffel, Europe
- Asymmetric tag co-occurrence provides a more suitable diversity of candidates
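
A minimal sketch of the asymmetric measure, assuming it is the conditional frequency P(b | a) = (photos with both a and b) / (photos with a). The function name `cooccurrence_given` and the toy photo ids are hypothetical.

```python
def cooccurrence_given(photos_a, photos_b):
    """P(b | a): share of the photos tagged a that are also tagged b."""
    a, b = set(photos_a), set(photos_b)
    if not a:
        return 0.0  # tag a is unused, so nothing co-occurs with it
    return len(a & b) / len(a)


# Hypothetical toy index: tag -> ids of the photos carrying that tag.
tag_photos = {
    "eiffel tower": {1, 2, 3, 4},
    "paris": {1, 2, 3, 4, 5, 6, 7, 8, 9, 10},
}

paris_given_eiffel = cooccurrence_given(tag_photos["eiffel tower"], tag_photos["paris"])
eiffel_given_paris = cooccurrence_given(tag_photos["paris"], tag_photos["eiffel tower"])
print(paris_given_eiffel, eiffel_given_paris)
```

The two directions differ: every "eiffel tower" photo in the toy index is also tagged "paris", but only some "paris" photos show the tower. This asymmetry is what lets broader tags such as Paris or France surface as candidates.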

Tag Aggregation
- Definitions
  - U is the set of user-defined tags
  - C_u is the list of the top-m most co-occurring tags of a tag u in U
  - C is the union of the candidate tags C_u over all user-defined tags u
  - R is the ranked list of recommended tags
[Figure: system overview, from user-defined tags through the candidate tag lists to the recommended tags]

Tag Aggregation (contd.)
- Vote
  - For each candidate tag c in C, a vote is cast whenever c is in C_u for a user-defined tag u
  - R is obtained by sorting the candidate tags by number of votes
- Example
  - Candidates for Barcelona: Spain, Gaudi, 2006, Catalunya, Europe, travel
  - Candidates for Sagrada Familia: Barcelona, Gaudi, Spain, architecture, Catalunya, church

  Tag        Score
  Barcelona  1
  Gaudi      2
  Spain      2
  …          …
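
The voting step can be sketched as follows, using the candidate lists from the example on this slide. The function name `vote_ranking` is hypothetical, and breaking ties alphabetically is an assumption made only to keep the output deterministic.

```python
from collections import Counter


def vote_ranking(candidates_by_tag):
    """Rank candidate tags by the number of user-defined tags that list them."""
    votes = Counter()
    for candidates in candidates_by_tag.values():
        # set() ensures each user-defined tag casts at most one vote per candidate.
        for candidate in set(candidates):
            votes[candidate] += 1
    # Most votes first; alphabetical order breaks ties (an assumption).
    return sorted(votes.items(), key=lambda item: (-item[1], item[0]))


# C_u for each user-defined tag u, as on the slide.
candidates_by_tag = {
    "Barcelona": ["Spain", "Gaudi", "2006", "Catalunya", "Europe", "travel"],
    "Sagrada Familia": ["Barcelona", "Gaudi", "Spain", "architecture",
                        "Catalunya", "church"],
}

ranking = vote_ranking(candidates_by_tag)
for tag, score in ranking:
    print(tag, score)
```

As in the slide's table, Barcelona receives one vote while Gaudi and Spain receive two. A deployed recommender would presumably also drop candidates that the user has already supplied, such as Barcelona here.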

Tag Aggregation (contd.)
- Sum
  - For each candidate tag c, sums the co-occurrence values of c over the user-defined tags u for which c is in C_u
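
A minimal sketch of the summing strategy, assuming each C_u is available as a mapping from candidate tag to its co-occurrence value with u. The function name `sum_ranking` and all numeric values below are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def sum_ranking(cooccurrence_by_tag):
    """Rank candidate tags by their summed co-occurrence values."""
    scores = defaultdict(float)
    for candidates in cooccurrence_by_tag.values():
        for candidate, value in candidates.items():
            scores[candidate] += value
    # Highest total first; alphabetical order breaks ties (an assumption).
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical co-occurrence values for each user-defined tag.
cooccurrence_by_tag = {
    "Sagrada Familia": {"Barcelona": 0.5, "Gaudi": 0.4, "Spain": 0.25},
    "Barcelona": {"Spain": 0.5, "Gaudi": 0.125, "travel": 0.0625},
}

ranking = sum_ranking(cooccurrence_by_tag)
for tag, score in ranking:
    print(tag, round(score, 4))
```

Unlike voting, a single strong co-occurrence can outweigh several weak ones: in this toy input Spain ends up ahead of Gaudi even though both are listed by the same two user-defined tags.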

Promotion
- Stability-promotion
  - Gives less weight to user-defined tags with low frequency, whose co-occurrence statistics are less reliable
- Descriptiveness-promotion
  - Prevents overly general tags from being ranked too highly
[Figure: tag frequency distribution, with head and tail marked]

Promotion (contd.)
- Rank-promotion
  - The co-occurrence values used in the summing strategy decline too fast
  - Rank-promotion compensates for this decline
- Applying promotion
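
The slides do not preserve the promotion formulas, so the following is only a sketch under stated assumptions. The functional forms (a damping factor k / (k + distance)), the product used to combine them, and the default parameter values are all assumptions for illustration; the function names are hypothetical.

```python
import math

# Damping parameters: placeholder values, not taken from the slides.
K_S, K_D, K_R = 9.0, 11.0, 4.0


def stability(user_tag_frequency, k_s=K_S):
    """Weight of a user-defined tag; highest when log(frequency) equals k_s."""
    return k_s / (k_s + abs(k_s - math.log(user_tag_frequency)))


def descriptive(candidate_frequency, k_d=K_D):
    """Weight of a candidate tag; damps tags much more frequent than exp(k_d)."""
    return k_d / (k_d + abs(k_d - math.log(candidate_frequency)))


def rank_promotion(rank, k_r=K_R):
    """Weight based on the 1-based position of a candidate within C_u."""
    return k_r / (k_r + (rank - 1))


def promotion(user_tag_frequency, candidate_frequency, rank):
    """Combined weight, assumed here to be the product of the three factors."""
    return (rank_promotion(rank)
            * stability(user_tag_frequency)
            * descriptive(candidate_frequency))


def promoted_vote(candidates_by_tag, frequency):
    """vote+ sketch: every vote is weighted by its promotion value."""
    scores = {}
    for user_tag, candidates in candidates_by_tag.items():
        for rank, candidate in enumerate(candidates, start=1):
            weight = promotion(frequency[user_tag], frequency[candidate], rank)
            scores[candidate] = scores.get(candidate, 0.0) + weight
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

With these assumed forms, a rarely used user-defined tag (low stability) and a very generic candidate (low descriptiveness) both contribute less, and a candidate's weight falls off gently with its rank instead of following the raw co-occurrence value. Because of the absolute value, tags far more frequent than the peak are damped as well.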

Experimental Setup
- Four strategies are compared

                 Vote    Sum
  No promotion   vote    sum
  Promotion      vote+   sum+

- Assessments
  - The top 10 recommendations from each of the four strategies form a pool
  - Assessors were asked to judge the descriptiveness of each tag as very good, good, not good, or don't know
  - Assessors could view the photo directly on Flickr to find additional context

Experimental Setup (contd.)
- Evaluation metrics
  - Mean Reciprocal Rank (MRR)
    - Evaluates the ability of the system to return a "relevant" tag at the top of the ranking
    - A tag is relevant if its relevance score is above the average relevance score
  - Success at rank k (S@k)
    - Probability of finding a good descriptive tag among the top k recommended tags
  - Precision at rank k (P@k)
    - Proportion of the retrieved tags that are relevant, averaged over all photos
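
The three metrics can be sketched as below, assuming each photo's recommendations have already been reduced to an ordered list of booleans (True where the assessors judged the tag relevant). The function names and the toy judgments are hypothetical.

```python
def reciprocal_rank(relevance):
    """1 / rank of the first relevant tag, or 0.0 if none is relevant."""
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def success_at_k(relevance, k):
    """1.0 if at least one of the top k tags is relevant, else 0.0."""
    return 1.0 if any(relevance[:k]) else 0.0


def precision_at_k(relevance, k):
    """Fraction of the top k tags that are relevant."""
    return sum(1 for is_relevant in relevance[:k] if is_relevant) / k


def mean_over_photos(metric, judgments, *args):
    """Average a per-photo metric over all judged photos."""
    return sum(metric(relevance, *args) for relevance in judgments) / len(judgments)


# Hypothetical judgments for two photos, in recommendation order.
judgments = [
    [False, True, True],
    [True, False, False],
]

mrr = mean_over_photos(reciprocal_rank, judgments)
s_at_1 = mean_over_photos(success_at_k, judgments, 1)
p_at_3 = mean_over_photos(precision_at_k, judgments, 3)
print(mrr, s_at_1, p_at_3)
```

Averaging `success_at_k` over photos gives the empirical probability that a photo has at least one good tag in its top k, which matches the S@k description above.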

Experiment Results
- Promotion worked well
- Without promotion, summing performs better than voting
- With promotion, voting performs better than summing

Experiment Results (contd.)
- Promotion worked better for photos with more user-defined tags (results are broken down by photo classes I to IV, defined by the number of tags per photo)

Experiment Results (contd.)
- Semantic analysis
  - Tags related to the visual contents of the photo are more likely to be accepted
  - The acceptance ratio is higher for the more physical categories

Conclusions
- Tagging behavior in Flickr
  - Tag frequency follows a power law
  - The majority of photos are not annotated well enough
  - Users annotate their photos with tags that span a broad spectrum of the semantic space
- Extending Flickr annotations
  - The co-occurrence model with aggregation and promotion was effective
  - The model can be updated incrementally
- Future work
  - This model could be implemented as a recommendation system

Discussion
- Pros
  - The analysis can be useful for other work
  - Easy to understand and implement
  - Reasonable evaluation strategy
- Cons
  - There should be a comparison with other recommendation models
  - The results are not very impressive
  - Not much technical contribution