Published by Martin Morrison. Modified over 9 years ago.
1
Christian Körner (1), Dominik Benz (2), Andreas Hotho (3), Markus Strohmaier (1), Gerd Stumme (2)
Stop thinking, start tagging: Tag Semantics arise from Collaborative Verbosity
(1) Knowledge Management Institute and Know Center, Graz University of Technology, Austria
(2) Knowledge and Data Engineering Group (KDE), University of Kassel, Germany
(3) Data Mining and Information Retrieval Group, University of Würzburg, Germany
2
30.04.2010 | Körner, Benz et al.: Tag Semantics arise from Collaborative Verbosity @ WWW2010 | 2 / 20
Where do Semantics come from?
- Semantically annotated content is the "fuel" of the next-generation World Wide Web, but where is the petrol station?
- Expert-built: expensive
- Evidence for emergent semantics in Web 2.0 data: built by the crowd!
- Which factors influence the emergence of semantics?
- Do certain users contribute more than others?
3
The Story
- Emergent Tag Semantics
- Pragmatics of tagging
- Semantic Implications of Tagging Pragmatics
- Conclusions
4
Emergent Tag Semantics
- Tagging is a simple and intuitive way to organize all kinds of resources
- Uncontrolled vocabulary; tags are "just strings"
- Formal model: folksonomy F = (U, T, R, Y)
  - Users U, Tags T, Resources R
  - Tag assignments Y ⊆ (U × T × R)
- Evidence of emergent semantics
  - Tag similarity measures can identify e.g. synonym tags (web2.0, web_two)
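The formal model above can be written down directly as a data structure. The following is a minimal sketch, not taken from the slides: the names `TagAssignment` and `build_folksonomy`, the user names, and the example.org URLs are all hypothetical, and the sets U, T, R are simply derived from the triples in Y.

```python
from typing import NamedTuple


class TagAssignment(NamedTuple):
    """One element of Y: a user assigned a tag to a resource."""
    user: str
    tag: str
    resource: str


def build_folksonomy(assignments):
    """Return (U, T, R, Y) derived from (user, tag, resource) triples."""
    Y = {TagAssignment(*a) for a in assignments}
    U = {y.user for y in Y}
    T = {y.tag for y in Y}
    R = {y.resource for y in Y}
    return U, T, R, Y


# Toy data with hypothetical users and URLs (illustration only).
toy = [
    ("alice", "web2.0", "http://example.org/a"),
    ("alice", "design", "http://example.org/a"),
    ("bob", "web_two", "http://example.org/a"),
    ("bob", "java", "http://example.org/b"),
]
U, T, R, Y = build_folksonomy(toy)
```

Keeping Y as a set of triples mirrors the definition Y ⊆ (U × T × R): duplicates collapse, and a "post" is the set of triples sharing the same user and resource.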
5
Tag Similarity Measures: Tag Context Similarity
- Tag Context Similarity is a scalable and precise tag similarity measure [Cattuto2008, Markines2009]:
  - Describe each tag as a context vector
  - Each dimension of the vector space corresponds to another tag; each entry denotes a co-occurrence count
  - Compute similar tags by cosine similarity
[Figure: example context vector for the tag JAVA over the dimensions design, software, blog, web, programming, …; the entry values were fused in extraction as "53011050"]
- Will be used as an indicator of emergent semantics!
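The three steps on this slide can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: it assumes that two tags co-occur when the same user assigns both to the same resource (one post), and the function names and toy data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations
from math import sqrt


def context_vectors(assignments):
    """Map each tag to a Counter of co-occurrence counts with other tags.

    Assumption: tags co-occur when they appear in the same post,
    i.e. the same (user, resource) pair.
    """
    posts = defaultdict(set)
    for user, tag, resource in assignments:
        posts[(user, resource)].add(tag)
    vectors = defaultdict(Counter)
    for tags in posts.values():
        for a, b in permutations(tags, 2):
            vectors[a][b] += 1
    return vectors


def cosine(v, w):
    """Cosine similarity of two sparse vectors stored as Counters."""
    dot = sum(v[k] * w[k] for k in v.keys() & w.keys())
    norm = sqrt(sum(x * x for x in v.values())) * sqrt(sum(x * x for x in w.values()))
    return dot / norm if norm else 0.0


# Toy data: "web2.0" and "web_two" never co-occur but share their context.
toy = [
    ("u1", "web2.0", "r1"), ("u1", "ajax", "r1"), ("u1", "blog", "r1"),
    ("u2", "web_two", "r2"), ("u2", "ajax", "r2"), ("u2", "blog", "r2"),
    ("u3", "java", "r3"), ("u3", "programming", "r3"),
]
vectors = context_vectors(toy)
sim_synonyms = cosine(vectors["web2.0"], vectors["web_two"])
sim_unrelated = cosine(vectors["web2.0"], vectors["java"])
```

In the toy data the two spelling variants get a similarity close to 1 because they are used with the same neighbouring tags, while "web2.0" and "java" share no context and score 0.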
6
Assessing the Quality of Tag Semantics
[Figure: folksonomy tags are mapped to synsets in the WordNet hierarchy; legend: tag, synset. Example pair: TagCont(t, t_sim) = 0.74, JCN(t, t_sim) = 3.68]
- Average JCN(t, t_sim) over all tags t: "quality of semantics"
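The slide does not spell out JCN. Assuming it denotes the Jiang-Conrath distance between WordNet synsets, it is computed from information content as IC(a) + IC(b) - 2 * IC(lcs(a, b)), where lcs is the least common subsumer in the hierarchy. The sketch below illustrates that formula and the averaging step; the concept probabilities are invented for illustration, and `quality_of_semantics` is a hypothetical helper name.

```python
from math import log


def information_content(probability):
    """IC(c) = -log p(c) for a concept with corpus probability p(c)."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -log(probability)


def jcn_distance(ic_a, ic_b, ic_lcs):
    """Jiang-Conrath distance from the ICs of two concepts and their LCS."""
    return ic_a + ic_b - 2.0 * ic_lcs


def quality_of_semantics(distances):
    """Average JCN distance over all (tag, most similar tag) pairs."""
    distances = list(distances)
    return sum(distances) / len(distances)


# Hypothetical concept probabilities (not taken from WordNet or the talk).
ic_beer = information_content(0.01)
ic_wine = information_content(0.02)
ic_beverage = information_content(0.10)  # assumed common subsumer

d_beer_wine = jcn_distance(ic_beer, ic_wine, ic_beverage)
d_beer_beer = jcn_distance(ic_beer, ic_beer, ic_beer)
average = quality_of_semantics([d_beer_wine, d_beer_beer])
```

Because JCN is a distance, a lower average means that the tags judged similar by the folksonomy are also close in WordNet.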
7
The Story
- Tag similarity measures can capture emergent tag semantics
- Pragmatics of tagging
- Semantic Implications of Tagging Pragmatics
- Conclusions
8
Tagging motivation
- Evidence of different ways HOW users tag (Tagging Pragmatics)
- Broad distinction by tagging motivation [Strohmaier2009]:
[Figure: example tags for the two user types: donuts, duff, marge, beer, bart, barty, Duff-beer; bev, alcnalc, beer, wine]
"Categorizers"…
- use a small controlled tag vocabulary
- goal: "ontology-like" categorization by tags, for later browsing
- tags as a replacement for folders
"Describers"…
- tag "verbosely" with freely chosen words
- vocabulary not necessarily consistent (synonyms, spelling variants, …)
- goal: describe content, ease retrieval
9
Tagging Pragmatics: Measures
- How to distinguish between the two types of taggers?
- Intuition: Describers use an open set of many tags, Categorizers use a small set of controlled tags:
  - Vocabulary size (vocab): [formula not preserved]
  - Tag / resource ratio (trr): [formula not preserved]
  - Average number of tags per post (tpp): [formula not preserved]
[Figure: scale of measure values from high to low]
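The formulas on this slide did not survive extraction. The sketch below reads the three measures literally from their names, which is an assumption: vocabulary size as the number of distinct tags of a user, tag / resource ratio as distinct tags divided by tagged resources, and tags per post as the mean number of tags per (user, resource) post. The function name and toy users are hypothetical.

```python
from collections import defaultdict


def pragmatic_measures(assignments, user):
    """Compute vocab, trr and tpp for one user from (user, tag, resource) triples.

    Assumed definitions (inferred from the measure names):
      vocab = number of distinct tags used by the user
      trr   = vocab / number of resources tagged by the user
      tpp   = mean number of tags per post (one post per resource)
    """
    tags = set()
    posts = defaultdict(set)
    for u, tag, resource in assignments:
        if u == user:
            tags.add(tag)
            posts[resource].add(tag)
    if not posts:
        raise ValueError(f"user {user!r} has no tag assignments")
    n_resources = len(posts)
    return {
        "vocab": len(tags),
        "trr": len(tags) / n_resources,
        "tpp": sum(len(ts) for ts in posts.values()) / n_resources,
    }


# Toy data: "desc" tags verbosely, "cat" reuses two folder-like tags.
toy = [
    ("desc", "a", "r1"), ("desc", "b", "r1"), ("desc", "c", "r1"),
    ("desc", "d", "r2"), ("desc", "e", "r2"), ("desc", "a", "r2"),
    ("cat", "x", "r1"), ("cat", "x", "r2"),
    ("cat", "y", "r3"), ("cat", "y", "r4"),
]
describer = pragmatic_measures(toy, "desc")
categorizer = pragmatic_measures(toy, "cat")
```

On the toy data the verbose user scores higher on all three measures than the folder-like user, which matches the intuition stated on the slide.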
10
Tagging Pragmatics: Measures
- Next intuition: Describers don't care about "abandoned" tags, Categorizers do
  - Orphan ratio (orphan): [formula not preserved]
  - R(t): set of resources tagged by user u with tag t
[Figure: scale of measure values from high to low]
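The orphan ratio formula is missing from the transcript, so the following is only a sketch of the idea: the share of a user's tags that are "abandoned", meaning |R(t)| does not exceed a small threshold. The threshold parameter `max_uses` and its default of 1 are assumptions for illustration; the talk may define the threshold differently.

```python
from collections import defaultdict


def orphan_ratio(assignments, user, max_uses=1):
    """Fraction of the user's tags t with |R(t)| <= max_uses.

    R(t) is the set of resources the user tagged with t.
    max_uses is a hypothetical threshold, not taken from the slides.
    """
    resources_by_tag = defaultdict(set)
    for u, tag, resource in assignments:
        if u == user:
            resources_by_tag[tag].add(resource)
    if not resources_by_tag:
        raise ValueError(f"user {user!r} has no tag assignments")
    orphans = [t for t, rs in resources_by_tag.items() if len(rs) <= max_uses]
    return len(orphans) / len(resources_by_tag)


# Toy data: "desc" leaves most tags used once, "cat" reuses every tag.
toy = [
    ("desc", "a", "r1"), ("desc", "b", "r1"), ("desc", "c", "r1"),
    ("desc", "d", "r2"), ("desc", "e", "r2"), ("desc", "a", "r2"),
    ("cat", "x", "r1"), ("cat", "x", "r2"),
    ("cat", "y", "r3"), ("cat", "y", "r4"),
]
orphan_describer = orphan_ratio(toy, "desc")
orphan_categorizer = orphan_ratio(toy, "cat")
```

With this threshold, four of the verbose user's five tags are orphans, while the folder-like user has none.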
11
Tagging Pragmatics: Limitations of measures
- Real users: no "perfect" Categorizers / Describers, but "mixed" behaviour
- Possibly influenced by user interfaces / recommenders
- Measures are correlated
- But: independent of semantics; measures capture usage patterns
12
The Story
- Tag similarity measures can capture emergent tag semantics
- Measures of tagging pragmatics differentiate users by tagging motivation
- Semantic Implications of Tagging Pragmatics
- Conclusions
13
Influence of Tagging Pragmatics on Emergent Semantics
- Idea: Can we learn the same (or even better) semantics from the folksonomy induced by a subset of describers / categorizers?
[Figure: users on a spectrum between extreme Categorizers and extreme Describers; highlighted: the complete folksonomy and a subset of 30% categorizers]
14
Experimental setup
1. Apply the pragmatic measures vocab, trr, tpp, orphan to each user
2. Systematically create "sub-folksonomies" CF_i / DF_i by successively adding i % of Categorizers / Describers (i = 1, 2, …, 25, 30, …, 100)
3. Compute similar tags based on each subset (Tag Context Similarity)
4. Assess the (semantic) quality of similar tags by average JCN distance
[Figure: example sub-folksonomies CF_5 and DF_20, each evaluated via TagCont(t, t_sim) = … and JCN(t, t_sim) = …]
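Step 2 of the setup can be sketched as a ranking and cut-off over users. This is a hedged sketch, not the authors' code: it assumes that high measure values indicate Describers and low values Categorizers, rounds the cut-off up with `ceil`, and uses hypothetical names and scores.

```python
from math import ceil


def sub_folksonomy(assignments, scores, percent, describers=True):
    """Keep the tag assignments of the i% most extreme users.

    scores maps each user to one pragmatic measure (e.g. trr).
    Assumptions: high score = Describer, low score = Categorizer;
    the number of selected users is rounded up.
    """
    ranked = sorted(scores, key=scores.get, reverse=describers)
    k = ceil(len(ranked) * percent / 100)
    chosen = set(ranked[:k])
    return [a for a in assignments if a[0] in chosen]


# Hypothetical per-user scores and one assignment per user.
scores = {"u1": 0.2, "u2": 1.5, "u3": 3.0, "u4": 0.9}
toy = [(u, "tag", "res") for u in scores]

df_50 = sub_folksonomy(toy, scores, 50, describers=True)   # DF_50
cf_25 = sub_folksonomy(toy, scores, 25, describers=False)  # CF_25
```

Each resulting subset would then be passed through the similarity and evaluation steps (3 and 4) to obtain one quality value per i.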
15
Dataset
- From the social bookmarking site Delicious in 2006: ORIGINAL
- Two filtering steps (to make measures more meaningful):
  - Restrict to top 10,000 tags: FULL
  - Keep only users with > 100 resources: MIN100RES

dataset      |T|          |U|        |R|          |Y|
ORIGINAL     2,454,546    667,128    18,782,132   140,333,714
FULL         10,000       511,348    14,567,465   117,319,016
MIN100RES    9,944        100,363    12,125,176   96,298,409
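The two filtering steps can be sketched as follows. The slide does not say how the "top" tags are ranked, so ranking by number of tag assignments is an assumption here; the function names and toy data are hypothetical, and the toy thresholds are far smaller than the 10,000 tags and 100 resources used in the talk.

```python
from collections import Counter, defaultdict


def restrict_to_top_tags(assignments, k):
    """Keep only assignments whose tag is among the k most frequent tags.

    Assumption: "top" means most frequent by number of tag assignments.
    """
    counts = Counter(tag for _, tag, _ in assignments)
    top = {tag for tag, _ in counts.most_common(k)}
    return [a for a in assignments if a[1] in top]


def keep_active_users(assignments, min_resources):
    """Keep only users who tagged more than min_resources resources."""
    resources = defaultdict(set)
    for user, _, resource in assignments:
        resources[user].add(resource)
    return [a for a in assignments if len(resources[a[0]]) > min_resources]


# Toy data with tiny thresholds (the talk uses k = 10,000 and > 100 resources).
original = [
    ("u1", "a", "r1"), ("u1", "b", "r1"), ("u1", "a", "r2"),
    ("u2", "a", "r3"), ("u2", "c", "r3"),
    ("u3", "b", "r4"), ("u3", "b", "r5"),
]
full = restrict_to_top_tags(original, k=2)
min_res = keep_active_users(full, min_resources=1)
```

Applying the user filter after the tag filter can remove further tags, which is consistent with |T| dropping from 10,000 to 9,944 in the table.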
16
Results: adding Describers (DF_i)
- Almost all sub-folksonomies are better than randomly picked ones
- 40% of describers according to trr outperform the complete data!
- Optimal performance for 70% describers (trr)
[Figure: semantic quality of the sub-folksonomies DF_i; axis annotations: "more describers", "better semantics"]
17
Results: adding Categorizers (CF_i)
- Almost all sub-folksonomies are worse than randomly picked ones
- Global optimum for 90% categorizers (tpp), i.e. removing the 10% most extreme describers! (Spammers?)
[Figure: semantic quality of the sub-folksonomies CF_i; axis annotations: "more categorizers", "better semantics"]
18
The Story
- Tag similarity measures can capture emergent tag semantics
- Measures of tagging pragmatics differentiate users by tagging motivation
- Sub-folksonomies introduced by measures of pragmatics show different semantic qualities
- Conclusions
19
Summary & Conclusions
- Introduction of measures of users' tagging motivation (Categorizers vs. Describers)
- Evidence for a causal link between tagging pragmatics (HOW people use tags) and tag semantics (WHAT tags mean)
- "Mass matters" for the "wisdom of the crowd", but the composition of the crowd makes a difference ("verbosity" of describers is in general better, but with a limitation)
- Relevant for tag recommendation and ontology learning algorithms
20
Guess which of the authors is a Categorizer
21
Thanks for the attention! Questions?
Be verbose
- Tag similarity measures can capture emergent tag semantics
- Measures of tagging pragmatics differentiate users by tagging motivation
- Sub-folksonomies introduced by measures of pragmatics show different semantic qualities
- Evidence of a causal link between pragmatics and semantics of tagging!
christian.koerner@tugraz.at
benz@cs.uni-kassel.de
22
References
[Cattuto2008] Ciro Cattuto, Dominik Benz, Andreas Hotho, Gerd Stumme: Semantic Grounding of Tag Relatedness in Social Bookmarking Systems. In: Proc. 7th Intl. Semantic Web Conference (2008), p. 615-631
[Markines2009] Benjamin Markines, Ciro Cattuto, Filippo Menczer, Dominik Benz, Andreas Hotho, Gerd Stumme: Evaluating Similarity Measures for Emergent Semantics of Social Tagging. In: Proc. 18th Intl. World Wide Web Conference (2009), p. 641-641
[Strohmaier2009] Markus Strohmaier, Christian Körner, Roman Kern: Why do users tag? Detecting users' motivation for tagging in social tagging systems. Technical Report, Knowledge Management Institute, Graz University of Technology (2009)