Today's topic: Social Tagging, by Christoffer Hirsimaa


1 Today's topic: Social Tagging, by Christoffer Hirsimaa

2 Stop thinking, start tagging: Tag Semantics arise from Collaborative Verbosity
Christian Körner, Dominik Benz, Andreas Hotho, Markus Strohmaier, Gerd Stumme
From WWW 2010

3 Where do Semantics come from?
- Semantically annotated content is the „fuel“ of the next generation World Wide Web – but where is the petrol station?
- Expert-built → expensive
- Evidence for emergent semantics in Web2.0 data → built by the crowd!
- Which factors influence emergence of semantics?
- Do certain users contribute more than others?

4 Overview
- Emergent Tag Semantics
- Pragmatics of tagging
- Semantic Implications of Tagging Pragmatics
- Conclusions

5 Emergent Tag Semantics
- Tagging is a simple and intuitive way to organize all kinds of resources
- Formal model: folksonomy F = (U, T, R, Y)
  - Users U, Tags T, Resources R
  - Tag assignments Y ⊆ (U × T × R)
- Evidence of emergent semantics
  - Tag similarity measures can identify e.g. synonym tags (web2.0, web_two)
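
The folksonomy model F = (U, T, R, Y) can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the toy triples, the helper names `folksonomy` and `posts`, and the choice to derive U, T and R from Y are all assumptions made for the example.

```python
from collections import defaultdict

# Toy tag assignments Y as (user, tag, resource) triples; all values are made up.
Y = {
    ("alice", "web2.0", "r1"),
    ("alice", "ajax", "r1"),
    ("bob", "web_two", "r1"),
    ("bob", "java", "r2"),
}

def folksonomy(assignments):
    """Derive U, T, R from Y, so Y is a subset of U x T x R by construction."""
    users = {u for u, _, _ in assignments}
    tags = {t for _, t, _ in assignments}
    resources = {r for _, _, r in assignments}
    return users, tags, resources, set(assignments)

def posts(assignments):
    """Group tags by (user, resource): one post per user/resource pair."""
    grouped = defaultdict(set)
    for u, t, r in assignments:
        grouped[(u, r)].add(t)
    return dict(grouped)

U, T, R, _ = folksonomy(Y)
```

Grouping Y into posts is convenient later on, because co-occurrence counts and tags-per-post statistics are both defined per post.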

6 Tag Similarity Measures: Tag Context Similarity
- Tag Context Similarity is a scalable and precise tag similarity measure [Cattuto2008, Markines2009]:
  - Describe each tag as a context vector
  - Each dimension of the vector space corresponds to another tag; the entry denotes the co-occurrence count
  - Compute similar tags by cosine similarity
- Will be used as indicator of emergent semantics!
[Figure: context vector for the tag JAVA with co-occurrence counts over the dimensions design, software, blog, web, programming, …]
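
A rough sketch of tag context similarity as described on this slide: each tag gets a vector of co-occurrence counts with every other tag, and vectors are compared by cosine similarity. The function names and the toy posts are assumptions for illustration; the cited papers may weight or filter co-occurrences differently.

```python
import math
from collections import Counter, defaultdict
from itertools import permutations

def context_vectors(posts):
    """posts: iterable of tag sets, one per post.
    Returns tag -> Counter mapping each *other* tag to its co-occurrence count."""
    vectors = defaultdict(Counter)
    for tags in posts:
        for a, b in permutations(set(tags), 2):
            vectors[a][b] += 1
    return dict(vectors)

def cosine(v, w):
    """Cosine similarity of two sparse count vectors."""
    dot = sum(v[k] * w[k] for k in v.keys() & w.keys())
    norm = math.sqrt(sum(x * x for x in v.values())) * math.sqrt(sum(x * x for x in w.values()))
    return dot / norm if norm else 0.0

def most_similar(tag, vectors):
    """Return the tag whose context vector is closest to that of `tag`."""
    own = vectors.get(tag, Counter())
    candidates = [(cosine(own, vec), other) for other, vec in vectors.items() if other != tag]
    return max(candidates)[1] if candidates else None

toy_posts = [
    {"web2.0", "ajax", "blog"},
    {"web_two", "ajax", "blog"},
    {"java", "programming"},
]
vectors = context_vectors(toy_posts)
```

In this toy data, `web2.0` and `web_two` never appear in the same post, yet they share the same context (`ajax`, `blog`), so they come out as most similar. That is the synonym-detection effect the slides refer to.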

7 Assessing the Quality of Tag Semantics
- Folksonomy tags are mapped to synsets in the WordNet hierarchy
- Example values from the figure: TagCont(t, t_sim) = 0.74, JCN(t, t_sim) = 3.68
- Average JCN(t, t_sim) over all tags t: „Quality of semantics“
[Figure: mapping of folksonomy tags to WordNet synsets]

8 Tagging motivation
- Evidence of different ways HOW users tag (Tagging Pragmatics)
- Broad distinction by tagging motivation [Strohmaier2009]:
- „Categorizers“…
  - use a small controlled tag vocabulary
  - goal: „ontology-like“ categorization by tags, for later browsing
  - tags as a replacement for folders
- „Describers“…
  - tag „verbosely“ with freely chosen words
  - vocabulary not necessarily consistent (synonyms, spelling variants, …)
  - goal: describe content, ease retrieval
[Figure: example tags. Categorizer: bev, alc, nalc, beer, wine. Describer: donuts, duff, marge, beer, bart, barty, Duff-beer]

9 Tagging Pragmatics: Measures
- How to distinguish between two types of taggers?
- Vocabulary size:
- Tag / Resource ratio:
- Average # tags per post:
[Figure: scale from high to low for each measure]
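
The formulas for these three measures did not survive in the transcript. The sketch below is one plausible reading, stated as assumptions: `vocab` is the number of distinct tags a user has used, `trr` is that number divided by the number of resources the user tagged, and `tpp` is the user's total tag assignments divided by the number of posts (one post per resource). The function name and the input layout are hypothetical.

```python
def pragmatic_measures(user_posts):
    """user_posts: dict mapping resource -> set of tags, for a single user.
    Returns assumed definitions of vocab, trr and tpp (see lead-in)."""
    vocabulary = set().union(*user_posts.values())
    n_resources = len(user_posts)
    n_assignments = sum(len(tags) for tags in user_posts.values())
    return {
        "vocab": len(vocabulary),
        "trr": len(vocabulary) / n_resources if n_resources else 0.0,
        "tpp": n_assignments / n_resources if n_resources else 0.0,
    }

# Made-up users: one reuses a small vocabulary, one tags verbosely.
categorizer = {"r1": {"web"}, "r2": {"web"}, "r3": {"java"}}
describer = {"r1": {"web", "ajax", "blog"}, "r2": {"java", "jvm"}}
```

Under these assumed definitions the verbose user scores higher on all three measures, which matches the intuition that Describers sit at the "high" end and Categorizers at the "low" end.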

10 Tagging Pragmatics: Measures
- Orphan ratio:
- R(t): set of resources tagged by user u with tag t
[Figure: scale from high to low]
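
The orphan ratio formula is also missing from the transcript. As a hedged sketch, assume a tag counts as an "orphan" when the user applied it to at most ceil(|R(t_max)| / 100) resources, where t_max is that user's most frequently used tag, and that the orphan ratio is the share of orphan tags in the user's vocabulary. The threshold, the `divisor` parameter and the function name are assumptions, not taken from the slide.

```python
import math

def orphan_ratio(tag_resources, divisor=100):
    """tag_resources: dict mapping tag t -> set R(t) of resources one user tagged with t.
    Assumed definition: fraction of the user's tags used on at most
    ceil(max_t |R(t)| / divisor) resources."""
    if not tag_resources:
        return 0.0
    max_usage = max(len(resources) for resources in tag_resources.values())
    threshold = math.ceil(max_usage / divisor)
    orphans = [t for t, resources in tag_resources.items() if len(resources) <= threshold]
    return len(orphans) / len(tag_resources)

# Made-up user: two heavily used tags and two rarely used ones.
example = {
    "web": set(range(200)),
    "java": set(range(50)),
    "typo": {1},
    "misc": {2, 3},
}
```

With this reading, a user who coins many one-off tags gets a high orphan ratio, while a user who sticks to a fixed vocabulary gets a low one.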

11 Tagging pragmatics: Limitations of measures
- Real users: no „perfect“ Categorizers / Describers, but „mixed“ behaviour
- Possibly influenced by user interfaces / recommenders
- Measures are correlated
- But: independent of semantics; measures capture usage patterns

12 Influence of Tagging Pragmatics on Emergent Semantics
- Idea: Can we learn the same (or even better) semantics from the folksonomy induced by a subset of describers / categorizers?
[Figure: users ordered from extreme Categorizers to extreme Describers; the complete folksonomy compared with a subset of 30% categorizers]

13 Experimental setup
1. Apply pragmatic measures vocab, trr, tpp, orphan to each user
2. Systematically create „sub-folksonomies“ CF_i / DF_i by successively adding i % of Categorizers / Describers (i = 1,2,…,25,30,…,100)
3. Compute similar tags based on each subset (TagContext Sim.)
4. Assess (semantic) quality of similar tags by avg. JCN distance
[Figure: example sub-folksonomies DF_20 and CF_5 with their TagCont(t, t_sim) and JCN(t, t_sim) values]
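
Step 2 of this setup can be sketched as follows, assuming each user already has a score from one pragmatic measure and that a higher score means more Describer-like behaviour. The function name `sub_folksonomy`, the rounding with `ceil`, and the toy data are assumptions; steps 3 and 4 (tag similarity and JCN evaluation against WordNet) are not covered here.

```python
import math

def sub_folksonomy(assignments, user_scores, percent, describers=True):
    """Keep only the tag assignments of the top `percent` % of users.
    describers=True ranks by highest score first (DF_i),
    describers=False ranks by lowest score first (CF_i)."""
    ranked = sorted(user_scores, key=user_scores.get, reverse=describers)
    keep = set(ranked[: math.ceil(len(ranked) * percent / 100)])
    return {(u, t, r) for (u, t, r) in assignments if u in keep}

# Made-up scores (e.g. a tag/resource ratio) and assignments.
scores = {"a": 0.1, "b": 0.5, "c": 0.9, "d": 2.0}
Y = {("a", "web", "r1"), ("b", "web", "r2"), ("c", "ajax", "r1"), ("d", "blog", "r3")}

df_25 = sub_folksonomy(Y, scores, 25, describers=True)
cf_50 = sub_folksonomy(Y, scores, 50, describers=False)
```

Running the similarity and evaluation steps on each such subset, for increasing `percent`, yields one quality value per sub-folksonomy, which is what the result slides plot.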

14 Dataset
- From Social Bookmarking Site Delicious in 2006
- Two filtering steps (to make measures more meaningful):
  - Restrict to top 10,000 tags → FULL
  - Keep only users with > 100 resources → MIN100RES
dataset     |T|         |U|       |R|          |Y|
ORIGINAL    2,454,546   667,128   18,782,132   140,333,714
FULL        10,000      511,348   14,567,465   117,319,016
MIN100RES   9,944       100,363   12,125,176   96,298,409

15 Results – adding Describers (DF_i)

16 Results – adding Categorizers (CF_i)

17 Summary & Conclusions
- Introduction of measures of users' tagging motivation (Categorizers vs. Describers)
- Evidence for causal link between tagging pragmatics (HOW people use tags) and tag semantics (WHAT tags mean)
- „Mass matters“ for „wisdom of the crowd“, but composition of crowd makes a difference („Verbosity“ of describers in general better, but with a limitation)
- Relevant for tag recommendation and ontology learning algorithms

18 My thoughts and remarks
- Removing spammers is once again confirmed to be useful, but how useful?
- Try to recursively combine the set of describers / categorizers

19 Q&A and discussion!

20 Thank you for your attention!

21 Extras:

22 References
- [Cattuto2008] Ciro Cattuto, Dominik Benz, Andreas Hotho, Gerd Stumme: Semantic Grounding of Tag Relatedness in Social Bookmarking Systems. In: Proc. 7th Intl. Semantic Web Conference (2008), p. 615-631
- [Markines2009] Benjamin Markines, Ciro Cattuto, Filippo Menczer, Dominik Benz, Andreas Hotho, Gerd Stumme: Evaluating Similarity Measures for Emergent Semantics of Social Tagging. In: Proc. 18th Intl. World Wide Web Conference (2009), p. 641-650
- [Strohmaier2009] Markus Strohmaier, Christian Körner, Roman Kern: Why do users tag? Detecting users' motivation for tagging in social tagging systems. Technical Report, Knowledge Management Institute – Graz University of Technology (2009)

