Flickr Tag Recommendation based on Collective Knowledge. Börkur Sigurbjörnsson and Roelof van Zwol, Yahoo! Research, WWW 2008. Summarized and presented by Hwang Inbeom, IDS Lab., Seoul National University, 2009. 03. 13.
Overview. The paper recommends tags for a photo: more tags carry more semantic meaning. It addresses two questions: How effective would tag recommendation be? (analysis of tagging behavior) How can we recommend tags? (several recommendation strategies are presented)
Tagging. Tagging is the act of adding keywords to objects, and a popular means to annotate various web resources: web page bookmarks, academic publications, multimedia objects, and so on.
Advantages of Tagging Images. Content-based image retrieval is progressing, but it has not yet succeeded in closing the semantic gap, so tagging is essential for large-scale image retrieval systems to work in practice. Extending the tag set gives a richer semantic description and lets a photo be retrieved for a wider range of keyword queries. Example: a photo tagged only "Sagrada Familia, Barcelona" extended with "Gaudi, Spain, Catalunya, architecture, church".
Analysis of Tagging Behaviors. How do users tag photos? Distribution of tag frequency; distribution of the number of tags per photo. What kind of tags do they provide? Tag categorization with WordNet.
Tag Frequency. The distribution of tag frequency can be modeled by a power law. Tags in the head of the power law are too generic (e.g. 2006, 2005, wedding), while tags in the tail are incidentally occurring words (e.g. ambrose tompkins, ambient vector). (Plot: tag frequency distribution, head vs. tail.)
Number of Tags per Photo. This distribution can also be modeled by a power law. Photos in the head are exhaustively annotated; for photos in the tail, which cover 64% of the photos, a tag recommendation system could be useful. (Plot: number of tags per photo, head vs. tail.)
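The power-law claim is easiest to see on a log-log rank/frequency plot. Below is a minimal sketch, not from the original slides or paper, using a hypothetical tag_counts table:

```python
import math
from collections import Counter

def rank_frequency(tag_counts):
    """Return (rank, frequency) pairs sorted by descending frequency.

    Under a power law, log(frequency) falls roughly linearly in log(rank),
    with generic tags in the head and incidental words in the long tail.
    """
    freqs = sorted(tag_counts.values(), reverse=True)
    return list(enumerate(freqs, start=1))

# Hypothetical toy data; real Flickr counts span many orders of magnitude.
tag_counts = Counter({"2006": 9000, "wedding": 7000, "barcelona": 800,
                      "gaudi": 120, "ambient vector": 1})
for rank, freq in rank_frequency(tag_counts):
    print(rank, freq, round(math.log10(rank), 2), round(math.log10(freq), 2))
```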
Number of Tags per Photo (contd.). Photos are classified by the number of tags annotated; the classes are used to analyze recommendation performance at different annotation levels.

Class     Tags per photo   Photos
I         1                15,500,000
II        2-3              17,500,000
III       4-6              12,000,000
IV        >6               7,000,000
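For reference, a minimal sketch of the class assignment implied by the table above; the class boundaries come from the slide, while the function itself is ours:

```python
def photo_class(num_tags: int) -> str:
    """Map a photo's tag count to the annotation classes from the slide."""
    if num_tags <= 1:
        return "I"      # a single tag
    if num_tags <= 3:
        return "II"     # 2-3 tags
    if num_tags <= 6:
        return "III"    # 4-6 tags
    return "IV"         # more than 6 tags
```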
Tag Categorization. 52% of the tags could be categorized using WordNet categories. Users provide a broader context through tags, not only the visual contents of the photo: where and when the photo was taken, actions people in the photo are doing, and so on.
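The slides do not show how tags were mapped to WordNet categories; one plausible sketch, assuming NLTK with the WordNet corpus downloaded, buckets tags by the lexicographer file of their first synset:

```python
from collections import Counter
from nltk.corpus import wordnet as wn  # requires: nltk.download("wordnet")

def wordnet_category(tag: str) -> str:
    """Return a coarse WordNet category for a tag, or 'unclassified'.

    Uses the lexicographer file of the tag's first synset as a rough proxy
    for categories such as noun.location, noun.time, noun.artifact.
    """
    synsets = wn.synsets(tag.replace(" ", "_"))
    return synsets[0].lexname() if synsets else "unclassified"

tags = ["barcelona", "church", "2006", "wedding", "ambient vector"]
print(Counter(wordnet_category(t) for t in tags))
```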
Tag Recommendation System. (System overview figure: a photo tagged "Sagrada Familia, Barcelona" yields per-tag candidate lists such as Spain, Gaudi, 2006, Catalunya, Europe, travel and Gaudi, Spain, architecture, Catalunya, church, from which the system recommends Gaudi, Spain, Catalunya, architecture, church.)
Tag Recommendation Strategies. Candidate tags are found based on tag co-occurrence (symmetric or asymmetric measures), then aggregated and ranked (voting strategy, summing strategy, promotion).
Tag Co-occurrence. Tags co-occurring with a given tag are found, and co-occurring tags with a higher score become candidate tags. Co-occurrence can be measured in two ways: symmetric measures and asymmetric measures.
Tag Co-occurrence (contd.). Symmetric measure: Jaccard's coefficient, |t_i ∩ t_j| / |t_i ∪ t_j|, a statistic used for comparing the similarity and diversity of sample sets. It is useful for identifying equivalent tags. Example: for "Eiffel Tower" it suggests Tour Eiffel, Eiffel, Seine, La tour Eiffel, Paris.
Tag Co-occurrence (contd.). Asymmetric measure: tag co-occurrence normalized by the frequency of one of the tags, i.e. roughly |t_i ∩ t_j| / |t_i|. This can provide more diverse candidates than the symmetric measure. Example: for "Eiffel Tower" it suggests Paris, France, Tour Eiffel, Eiffel, Europe. Asymmetric tag co-occurrence provides a more suitable diversity for recommendation.
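A minimal sketch of both co-occurrence measures over a toy photo-to-tag index; the helper names and toy data are ours, not the paper's:

```python
def tag_photo_sets(photos):
    """Map each tag to the set of photo ids it appears on."""
    index = {}
    for photo_id, tags in photos.items():
        for tag in tags:
            index.setdefault(tag, set()).add(photo_id)
    return index

def jaccard(index, ti, tj):
    """Symmetric co-occurrence: |ti ∩ tj| / |ti ∪ tj|."""
    a, b = index[ti], index[tj]
    return len(a & b) / len(a | b)

def asymmetric(index, ti, tj):
    """Asymmetric co-occurrence: |ti ∩ tj| / |ti|."""
    a, b = index[ti], index[tj]
    return len(a & b) / len(a)

photos = {  # toy data
    1: {"eiffel tower", "paris", "france"},
    2: {"eiffel tower", "tour eiffel", "paris"},
    3: {"paris", "france", "europe"},
}
index = tag_photo_sets(photos)
print(jaccard(index, "eiffel tower", "paris"))     # 2/3
print(asymmetric(index, "eiffel tower", "paris"))  # 2/2 = 1.0
```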
Tag Aggregation: Definitions. U is the set of user-defined tags; C_u is the list of the top-m tags most co-occurring with a tag u in U; C is the union of the candidate tags C_u over all user-defined tags u; R is the list of recommended tags. (Figure: the Sagrada Familia / Barcelona example with its candidate lists and recommendations.)
Tag Aggregation (contd.): Vote. For each candidate tag c in C, a vote is cast whenever c occurs in a list C_u; R is obtained by sorting the candidate tags on the number of votes. Example from the figure: the candidate lists of "Sagrada Familia" (Barcelona, Spain, Gaudi, 2006, Catalunya, Europe, travel) and "Barcelona" (Sagrada Familia, Gaudi, Spain, architecture, Catalunya, church) give scores Barcelona 1, Gaudi 2, Spain 2, and so on.
Tag Aggregation (contd.): Sum. The score of a candidate tag c is the sum of its co-occurrence values over the lists C_u in which it appears.
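A minimal sketch of the vote and sum aggregation strategies as described on these slides; the data layout and names are ours, and promotion is ignored here:

```python
from collections import defaultdict

def aggregate(candidates, strategy="vote"):
    """Aggregate per-tag candidate lists into one ranked recommendation list.

    candidates: dict mapping each user-defined tag u to its top-m list C_u,
    given as (candidate_tag, co-occurrence value) pairs.
    strategy: "vote" counts in how many C_u a candidate occurs;
              "sum" adds up its co-occurrence values instead.
    """
    scores = defaultdict(float)
    for u, c_u in candidates.items():
        for c, value in c_u:
            scores[c] += 1.0 if strategy == "vote" else value
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

candidates = {  # toy C_u lists
    "sagrada familia": [("barcelona", 0.8), ("gaudi", 0.6), ("spain", 0.5)],
    "barcelona": [("spain", 0.7), ("gaudi", 0.4), ("catalunya", 0.3)],
}
print(aggregate(candidates, "vote"))  # gaudi and spain each get 2 votes
print(aggregate(candidates, "sum"))   # spain leads on summed co-occurrence values
```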
Promotion. Stability-promotion: user-defined tags with a low frequency are less reliable, so their contribution is damped. Descriptiveness-promotion: prevents overly general tags from being ranked too highly. (Plot: head/tail of the tag frequency distribution.)
Promotion (contd.). Rank-promotion: the co-occurrence values used in the summing strategy decline too fast down the candidate list, so a rank-based promotion is applied to make them work better.
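The slides name the three promotion factors but give no formulas, so the sketch below only illustrates one plausible shape: hypothetical damping functions with made-up constants, not the paper's exact definitions:

```python
import math

def rank_promotion(position, k_rank=4.0):
    """Damp candidates that sit low in their C_u list (hypothetical form)."""
    return k_rank / (k_rank + position)

def stability_promotion(user_tag_freq, k_stab=9.0):
    """Damp very rare (unstable) user-defined tags (hypothetical form)."""
    return k_stab / (k_stab + abs(k_stab - math.log(1 + user_tag_freq)))

def descriptiveness_promotion(candidate_freq, k_descr=11.0):
    """Damp overly generic, very frequent candidate tags (hypothetical form)."""
    return k_descr / (k_descr + abs(k_descr - math.log(1 + candidate_freq)))

def promoted_vote(position, user_tag_freq, candidate_freq):
    """One promoted vote: the product of the three damping factors."""
    return (rank_promotion(position)
            * stability_promotion(user_tag_freq)
            * descriptiveness_promotion(candidate_freq))
```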
Experimental Setup. Four strategies are compared: vote and sum without promotion, and vote+ and sum+ with promotion. Assessment: the top 10 recommendations from each of the four strategies form a pool, and assessors were asked to judge the descriptiveness of each tag as very good, good, not good, or don't know. Assessors could access and view the photo directly on Flickr to find additional context.
Experimental Setup (contd.). Evaluation metrics: Mean Reciprocal Rank (MRR) measures how early the first relevant tag appears in the ranking, where a tag counts as relevant if its relevance score is above the average. Success at rank k (S@k) is the probability of finding a good descriptive tag among the top k recommended tags. Precision at rank k (P@k) is the proportion of recommended tags in the top k that are relevant, averaged over all photos.
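A minimal sketch of the three metrics over binary relevance judgments; the input format and function names are ours:

```python
def mrr(rankings):
    """Mean reciprocal rank of the first relevant tag per photo.

    rankings: list of per-photo relevance lists, e.g. [True, False, ...],
    ordered by recommendation rank.
    """
    total = 0.0
    for rels in rankings:
        first = next((i for i, rel in enumerate(rels, start=1) if rel), None)
        total += 1.0 / first if first else 0.0
    return total / len(rankings)

def success_at_k(rankings, k):
    """Fraction of photos with at least one relevant tag in the top k."""
    return sum(any(rels[:k]) for rels in rankings) / len(rankings)

def precision_at_k(rankings, k):
    """Proportion of relevant tags in the top k, averaged over photos."""
    return sum(sum(rels[:k]) / k for rels in rankings) / len(rankings)

judgments = [[True, False, True], [False, False, True]]  # toy assessments
print(mrr(judgments), success_at_k(judgments, 1), precision_at_k(judgments, 3))
```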
Experiment Results. Promotion worked well: without promotion, summing is better; with promotion, voting is better.
Experiment Results (contd.). Promotion was more effective for photos with more user-defined tags (the photo classes defined earlier: Class I, 1 tag; Class II, 2-3; Class III, 4-6; Class IV, more than 6).
Experiment Results (contd.). Semantic analysis: tags related to the visual contents of the photo are more likely to be accepted; the more physical categories have a higher acceptance ratio.
Conclusions. Tagging behavior on Flickr: tag frequency follows a power law, the majority of photos are not annotated well enough, and users annotate their photos with tags covering a broad spectrum of the semantic space. Extending Flickr annotations: the co-occurrence model with aggregation and promotion was effective and can be updated incrementally. Future work: the model could be implemented as a recommendation system.
Discussion. Pros: the analysis can be useful for other work; the approach is easy to understand and implement; the evaluation strategy is reasonable. Cons: there should be a comparison with other recommendation models; the results are not particularly impressive; the technical contribution is limited.