+ User-induced Links in Collaborative Tagging Systems Ching-man Au Yeung, Nicholas Gibbins, Nigel Shadbolt CIKM’09 Speaker: Nonhlanhla Shongwe 18 January 2009
+ 2 Preview Introduction Collaborative tagging User-Induced hyperlinks Similarity of Assigned Tags Association Rule Mining Analysis of User-induced links Tag Prediction Discussion Conclusion 2
+ 3 Introduction: Hyperlinks make navigation through the Web possible, but the author decides which documents to link to. The limited set of links that authors provide has led to user-contributed content on the Web. In social bookmarking sites, e.g. Delicious, users can maintain a collection of documents, and URLs are identified by the tags the users choose.
+ 4 Collaborative tagging (1/2): Popular tagging systems, e.g. Delicious and LibraryThing, allow users to describe their favorite online resources using their own words, e.g. the tags news, tv, sports, weather, travel for http:///. Advantages over traditional methods: the flexibility and freedom offered by these systems; the systems are quick to adapt to changes in the vocabulary among the users.
+ 5 Collaborative tagging (2/2): The collaborative tagging activities of participating users result in a scheme called a folksonomy. A folksonomy has three types of elements. Users: assign tags to Web documents. Tags: keywords chosen by users to describe and categorize a Web document. Documents: the objects tagged by the users.
+ 6 User-Induced hyperlinks: Two types of hyperlinks: for navigation, and for recommendation (directing users to other documents that contain related information). Two different approaches to discovering implicit relations in a folksonomy: calculating the similarity between the sets of tags assigned to the documents; analyzing the collective behavior of the users who have tagged the documents. User-induced links are implicit links in a folksonomy that result from the collaborative tagging activities of users.
+ 7 Similarity of Assigned Tags (1/4) First approach of discovering user-induced links Calculate the pair-wise similarity between documents based on their tags Jaccard Coefficient In IR, Cosine Similarity 7
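A minimal sketch of the Jaccard coefficient on two tag sets, assuming each document is represented simply by the set of distinct tags assigned to it. The function name `jaccard` and the example tags are illustrative, not taken from the paper.

```python
def jaccard(tags_a, tags_b):
    """Jaccard coefficient |A & B| / |A | B| of two tag sets."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        # Convention chosen here: two untagged documents are not similar.
        return 0.0
    return len(a & b) / len(a | b)


doc1 = {"news", "tv", "sports"}
doc2 = {"news", "sports", "weather", "travel"}
print(jaccard(doc1, doc2))  # 2 shared tags out of 5 distinct tags -> 0.4
```

The Jaccard coefficient ignores how often each tag was used, which is one reason a frequency-aware measure such as cosine similarity is also considered.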
+ 8 Similarity of Assigned Tags (2/4) 8 Cosine Similarity
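The formula on this slide did not survive extraction. As a hedged sketch, the standard cosine similarity between two tag-frequency vectors can be computed as below; representing a document as a mapping from tag to the number of users who assigned it is an assumption made here, and `cosine_similarity` is an illustrative name.

```python
import math


def cosine_similarity(tags_a, tags_b):
    """Cosine of the angle between two sparse tag-frequency vectors.

    tags_a and tags_b map each tag to a non-negative weight, assumed here
    to be the number of users who assigned that tag to the document.
    """
    dot = sum(weight * tags_b.get(tag, 0) for tag, weight in tags_a.items())
    norm_a = math.sqrt(sum(w * w for w in tags_a.values()))
    norm_b = math.sqrt(sum(w * w for w in tags_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an untagged document has no direction to compare
    return dot / (norm_a * norm_b)


doc1 = {"news": 3, "sports": 1}
doc2 = {"news": 2, "weather": 2}
print(cosine_similarity(doc1, doc2))
```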
+ 9 Similarity of Assigned Tags (3/4): Second similarity function: the normalized discounted cumulative gain (NDCG), used to evaluate rankings of documents according to their relevance scores. First, list the tags of the two documents. Second, calculate the DCG at position p.
+ 10 Similarity of Assigned Tags (4/4) 10 Thirdly, iDCG Finally, calculate the NDCG Use a function
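The formulas for these steps are missing from the extracted slides. The sketch below uses a common textbook form of DCG, with a log2(i + 1) discount, which may differ from the variant in the paper. The way it is turned into a document similarity is also an assumption: tags of document A are ranked by frequency, and each tag's relevance is its frequency in document B. `tag_ndcg` is a hypothetical helper name.

```python
import math


def dcg(relevances, p):
    """Discounted cumulative gain of the first p relevance scores."""
    return sum(rel / math.log2(i + 1)
               for i, rel in enumerate(relevances[:p], start=1))


def ndcg(relevances, p):
    """DCG at p divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), p)
    return dcg(relevances, p) / ideal if ideal > 0 else 0.0


def tag_ndcg(tags_a, tags_b, p):
    """Assumed similarity: how well A's tag ranking matches B's frequencies."""
    # Rank A's tags by frequency (ties broken alphabetically).
    ranked_a = sorted(tags_a, key=lambda tag: (-tags_a[tag], tag))
    # Relevance of each ranked tag is its frequency in B (0 if absent).
    relevances = [tags_b.get(tag, 0) for tag in ranked_a]
    # The ideal ranking lists B's own tags by descending frequency.
    ideal = dcg(sorted(tags_b.values(), reverse=True), p)
    return dcg(relevances, p) / ideal if ideal > 0 else 0.0


a = {"news": 5, "tv": 3, "sports": 1}
b = {"news": 4, "sports": 2, "weather": 1}
print(tag_ndcg(a, b, 3))
```

Because `tag_ndcg(a, b, p)` and `tag_ndcg(b, a, p)` generally differ, some function is needed to combine the two directions into a single score; which function the authors use is not recoverable from the slide.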
+ 11 Association Rule Mining: Second approach to discovering user-induced links. Finds pairs of Web documents that have both been tagged by the same group of users. Association rule mining aims at identifying implicit patterns within a large database of transactions. Two major concepts: support and confidence.
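A minimal sketch of support and confidence for document pairs, assuming each user's collection of bookmarked documents is one transaction. Support is taken as the absolute number of users who tagged both documents, and confidence of A -> B as support(A, B) / support(A). The function name `induced_links` and the toy data are illustrative.

```python
from collections import Counter
from itertools import combinations


def induced_links(user_docs, min_support, min_confidence):
    """Return {(src, dst): (support, confidence)} for rules src -> dst."""
    doc_count = Counter()   # number of users who tagged each document
    pair_count = Counter()  # number of users who tagged both documents
    for docs in user_docs.values():
        docs = set(docs)
        doc_count.update(docs)
        pair_count.update(combinations(sorted(docs), 2))

    links = {}
    for (a, b), support in pair_count.items():
        if support < min_support:
            continue
        # A pair yields two candidate rules, one in each direction.
        for src, dst in ((a, b), (b, a)):
            confidence = support / doc_count[src]
            if confidence >= min_confidence:
                links[(src, dst)] = (support, confidence)
    return links


users = {
    "u1": {"d1", "d2"},
    "u2": {"d1", "d2"},
    "u3": {"d1"},
    "u4": {"d2", "d3"},
}
print(induced_links(users, min_support=2, min_confidence=0.6))
```

Counting all pairs this way is quadratic in the size of each user's collection, so a real crawl would likely need pruning by support before pairs are enumerated.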
+ 12 Analysis of User-Induced Links (1/3): Using the two methods described, identify user-induced links in data collected from Delicious and compare them with existing hyperlinks in terms of several different aspects. Aspects to compare: whether they connect two documents from the same domain/website; the similarity between the documents on the two ends of a link; whether users are equally interested in the linked documents.
+ 13 Analysis of User-Induced Links (2/3): Data collection. Data collected from Delicious; documents cover a wide range of topics; documents collected on a per-tag basis. First, 130 popular tags were collected at random. For each tag, Delicious was crawled to obtain a set of documents and the users that have tagged each document.
+ 14 Analysis of User-Induced Links (3/3): Results. Identify user-induced links between the documents using the two methods. For similarity, vary the similarity threshold to 0.5. For association rules, set the minimum support to 100 and vary the minimum confidence level. Findings: very few user-induced links had a confidence of 0.5 or above.
+ 15 Results (1/8) 15
+ 16 Results (2/8) on Same Domain: One important function of hyperlinks is to allow users to navigate from one hypertext document to another. Links are more beneficial if they point to documents external to the current website. Check whether the documents at the two ends of a link are from the same domain.
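The same-domain check can be sketched with the standard library as below. How the authors define "domain" is not stated on the slide; comparing full hostnames is an assumption made here, so `www.example.com` and `example.com` count as different. The URLs are placeholders.

```python
from urllib.parse import urlsplit


def same_domain(url_a, url_b):
    """True if both URLs have the same (case-insensitive) hostname."""
    host_a = urlsplit(url_a).hostname  # lowercased, or None if absent
    host_b = urlsplit(url_b).hostname
    return host_a is not None and host_a == host_b


print(same_domain("http://example.com/a", "http://example.com/b"))  # True
print(same_domain("http://example.com/a", "http://example.org/a"))  # False
```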
+ 17 Results (3/8) on Same Domain 17
+ 18 Results (4/8) on Coincidence between existing hyperlinks and user-induced links: See whether such links already exist between the documents. If user-induced links coincide with existing hyperlinks, users are satisfied with the existing hyperlinks. If user-induced links are mostly new, there are user interests and perspectives that existing hyperlinks have not captured.
+ 19 Results (5/8) on Coincidence between existing hyperlinks and user-induced links 19
+ 20 Results (6/8) on similarity and user preferences: Look at documents that are connected by user-induced links: blog posts on highly related topics; news articles on the same topics; websites offering applications with similar functionality; Q&A pages of a portal site. Two different approaches for generating user-induced links. Association rules: a link is generated if enough users are interested in both documents, regardless of the similarity between them. Similarity-based: links are generated based on the tags assigned, regardless of whether many users are interested in the documents.
+ 21 Results (7/8) on similarity and user preferences 21
+ 22 Results (8/8) on similarity and user preferences 22
+ 23 Tags Prediction (1/3): The analysis of user-induced links shows that links generated by association rule mining of user collections usually connect documents that are highly related to each other, as judged by the similarity between their tags. To predict the tags of a document: identify the other documents that have a link to this document, i.e. the set of documents that have a link (d_x).
+ 24 Tags Prediction (2/3) 24 Firstly, consider a simple averaging method
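The averaging formula on this slide was not preserved. One plausible reading, shown here strictly as an assumption, is to normalize the tag counts of each linked document and average the resulting weights over all linked documents. `predict_tags_by_averaging` is a hypothetical name.

```python
from collections import defaultdict


def predict_tags_by_averaging(linked_docs_tags):
    """Average the normalized tag weights of the linked documents.

    linked_docs_tags is a list of {tag: count} mappings, one per document
    that has a user-induced link to the target document (assumed input).
    Returns (tag, weight) pairs sorted by descending weight.
    """
    if not linked_docs_tags:
        return []
    totals = defaultdict(float)
    for tags in linked_docs_tags:
        total = sum(tags.values())
        if total == 0:
            continue  # an untagged neighbour contributes nothing
        for tag, count in tags.items():
            totals[tag] += count / total
    n = len(linked_docs_tags)
    return sorted(((tag, weight / n) for tag, weight in totals.items()),
                  key=lambda item: (-item[1], item[0]))


neighbours = [{"news": 2, "tv": 2}, {"news": 1, "sports": 3}]
for tag, weight in predict_tags_by_averaging(neighbours):
    print(tag, round(weight, 3))
```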
+ 25 Tags Prediction (3/3): Second aggregation method
+ 26 Experiments (1/2): Measure the performance of the predictions using NDCG and precision at the nth term. NDCG was used to investigate whether the predictions are accurate in terms of the ordering of the tags.
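Precision at the nth term can be sketched as the fraction of the top n predicted tags that appear among the tags actually assigned to the document. This is the usual definition; whether the paper treats short prediction lists the same way is an assumption. The names and example tags are illustrative.

```python
def precision_at_n(predicted_tags, actual_tags, n):
    """Fraction of the top-n predicted tags found in the actual tag set."""
    if n <= 0:
        raise ValueError("n must be positive")
    actual = set(actual_tags)
    hits = sum(1 for tag in predicted_tags[:n] if tag in actual)
    # Dividing by n (not by the list length) penalizes short predictions.
    return hits / n


predicted = ["news", "sports", "tv", "weather"]  # ordered by predicted weight
actual = {"news", "tv", "travel"}
print(precision_at_n(predicted, actual, 3))  # news and tv hit -> 2/3
```

Unlike NDCG, this measure ignores the order of the tags within the top n, which is why the two are reported together.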
+ 27 Experiments (2/2) 27
+ 28 Discussion: Implicit relations between Web documents can be discovered by examining the user preferences and document similarity embedded in a folksonomy. User-induced links are different from hyperlinks. The collaborative tagging environment shows the differences between the perspectives of Web authors and Web readers. It is worthwhile considering an open hypermedia structure backed by a collaborative tagging system.
+ 29 Conclusion: User-induced links are a form of implicit relations between documents. We used tag similarity to generate many user-induced links, and association rule mining to generate very high user-induced links.
+ 30 Thank you for your attention 30