Thesis Proposal: Prediction of popular social annotations Abon.

Presentation transcript:

Thesis Proposal: Prediction of popular social annotations Abon

Outline: Background; Related Work; Problem Definition; Possible Solution; Experiment Plan; Evaluation Plan

Background: Prevalence of social web services. What do they have in common? Tags and user-generated content.

Background: What are tags for? According to the del.icio.us founder: "Tags are one-word descriptors that you can assign to your bookmarks on del.icio.us to help you organize and remember them. Tags are a little bit like keywords, but they're chosen by you, and they do not form a hierarchy. You can assign as many tags to a bookmark as you like and rename or delete the tags later. So, tagging can be a lot easier and more flexible than fitting your information into preconceived categories or folders."

Background: A usage example

Why tags are useful: In information retrieval, expanding a query is a common technique for retrieving more related data. Tags act like human-expanded index terms.

Query expansion here

Why tags are useful: Traditional term expansion schemes rely on term-document relations, and each term's importance to a document is often determined by tf-idf. Each time a user applies a tag, it is like a vote for which tag belongs with a document. Thus term-document relations can be measured by tag applications.
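The voting idea above can be sketched in a few lines of Python. This is only an illustration: the function name `tag_weights`, the sample data, and the choice to normalize votes so they sum to 1 are assumptions, not part of the proposal.

```python
from collections import Counter


def tag_weights(tag_applications):
    """Turn per-user tag applications for one document into normalized weights.

    Each user contributes at most one vote per tag (assumption: repeated
    tags from the same user are counted once).
    """
    votes = Counter(
        tag for user_tags in tag_applications for tag in set(user_tags)
    )
    total = sum(votes.values())
    if total == 0:
        return {}
    return {tag: count / total for tag, count in votes.items()}


# Hypothetical data: three users tagging the same URL.
applications = [["python", "tutorial"], ["python", "programming"], ["python"]]
weights = tag_weights(applications)
```

Here `weights` plays the role that a tf-idf weight plays in a traditional term-document matrix, except that it is derived from user votes rather than from the document text.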

Why tags are useful: Tags are a human-expanded query set that enables more complete concept mapping. As more and more people apply tags, the popularity of tags reaches a stable pattern, and the top tags could be used as weighting parameters for search optimization.

Related Work: Golder SA, Huberman BA. Usage patterns of collaborative tagging systems. J. Inf. Sci., Vol. 32, No. 2 (April 2006). With 100+ users, a stable pattern appears. Urn model.

Stable pattern: the top 7 tags stay in place for more than one year.

Related Work: Cattuto C, Loreto V, Pietronero L. Collaborative Tagging and Semiotic Dynamics. A long-term memory version of the classic Yule–Simon process. The memory model is based on a cognitive model.

Yule–Simon process: $Q_t(x) = a(t)/(x + \tau)$, where $a(t)$ is a normalizing factor and $\tau$ is the memory parameter.
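A minimal simulation sketch of such a memory process follows. It rests on assumptions the slide does not state: that $x$ counts how many steps back in the tag stream a tag occurred, and that a brand-new tag is introduced with a fixed probability (called `p_new` here). The function name and default values are hypothetical.

```python
import random


def simulate_tag_stream(steps, p_new=0.1, tau=5.0, seed=0):
    """Simulate a tag stream with a memory kernel proportional to 1 / (x + tau)."""
    rng = random.Random(seed)
    stream = [0]  # tags are integer ids; the stream starts with tag 0
    next_tag = 1
    for _ in range(steps - 1):
        if rng.random() < p_new:
            # Assumption: a never-seen tag enters with probability p_new.
            stream.append(next_tag)
            next_tag += 1
        else:
            t = len(stream)
            # x = 1 is the most recent position, x = t the oldest one.
            # rng.choices normalizes the weights, which plays the role of a(t).
            weights = [1.0 / (x + tau) for x in range(1, t + 1)]
            x = rng.choices(range(1, t + 1), weights=weights, k=1)[0]
            stream.append(stream[t - x])
    return stream


stream = simulate_tag_stream(200)
```

A larger `tau` flattens the kernel, so older tags are copied almost as often as recent ones; a small `tau` favors recently used tags.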

Related Work: H. Halpin, V. Robu, H. Shepherd. The Complex Dynamics of Collaborative Tagging. In Proceedings of WWW 2007.

Empirical Results for Power Law Regression for Popular Sites

$P(x)$: the tag probability distribution at each time point. $Q(x)$: the final tag probability distribution.
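One common way to compare a distribution at a time point with the final one is the Kullback-Leibler divergence. The slide only defines $P$ and $Q$, so using KL divergence here is an assumption, as are the function names and the small floor `eps` for tags missing from $Q$.

```python
import math


def normalize(counts):
    """Convert raw tag counts into a probability distribution."""
    total = sum(counts.values())
    return {tag: c / total for tag, c in counts.items()}


def kl_divergence(p, q, eps=1e-12):
    """Return D_KL(P || Q) in nats; tags with zero mass in P contribute nothing."""
    return sum(
        pv * math.log(pv / max(q.get(tag, 0.0), eps))
        for tag, pv in p.items()
        if pv > 0
    )


# Hypothetical tag counts for one URL at an early time point and at the end.
p_early = normalize({"python": 5, "tutorial": 5})
q_final = normalize({"python": 90, "tutorial": 10})
divergence = kl_divergence(p_early, q_final)
```

A divergence that shrinks toward zero over time would indicate that the tag distribution of a URL is settling into its final shape.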

Problem Definition: In its initial stage, a URL has not yet been sufficiently annotated, so it is hard to retrieve. For an immature URL, predicting its future popular tags could provide a better retrieval experience. Mature URL: borrowed from the empirical results on tag dynamics in [Halpin]; defined as a URL with 3 or more years of history on del.icio.us.

Expanding the tag set
$T_i$: the tag set applied by the $i$-th user to a URL.
$ET_i$: the expanded tag set after the $i$-th user.
$T_0$: the tag set suggested by tf-idf term extraction.
$ST_i = T_0$
$ET_i = ET_{i-1} \cup \mathrm{relevant}_n(T_i)$
$\mathrm{relevant}_n(T_i)$: the $n$ tags with the highest mutual information to each tag in $T_i$.
Mutual information: $f(t_i, t_j) / (f(t_i)\, f(t_j))$
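The expansion step can be sketched as follows. The sketch assumes that $f(t)$ is the number of posts containing tag $t$, that $f(t_i, t_j)$ is the number of posts containing both, and that the expansion starts from $T_0$. All function names and the sample posts are hypothetical.

```python
from collections import Counter
from itertools import combinations


def count_tags(posts):
    """Count single-tag and tag-pair frequencies over a list of tag sets."""
    single, pair = Counter(), Counter()
    for tags in posts:
        unique = sorted(set(tags))
        single.update(unique)
        pair.update(combinations(unique, 2))  # keys are sorted tuples
    return single, pair


def score(a, b, single, pair):
    """f(a, b) / (f(a) * f(b)), the relatedness measure from the slide."""
    joint = pair[tuple(sorted((a, b)))]
    return joint / (single[a] * single[b]) if joint else 0.0


def relevant_n(tags, n, single, pair):
    """For each tag, collect its n highest-scoring co-occurring tags."""
    result = set()
    for t in tags:
        scored = [(score(t, o, single, pair), o) for o in single if o != t]
        scored = [s for s in scored if s[0] > 0]
        scored.sort(key=lambda s: (-s[0], s[1]))  # ties broken alphabetically
        result.update(o for _, o in scored[:n])
    return result


def expand(previous_expanded, user_tags, n, single, pair):
    """ET_i = ET_{i-1} united with relevant_n(T_i)."""
    return set(previous_expanded) | relevant_n(user_tags, n, single, pair)


posts = [
    ["python", "tutorial"],
    ["python", "programming"],
    ["python", "tutorial", "web"],
    ["web", "css"],
]
single, pair = count_tags(posts)
et1 = expand({"python"}, ["web"], 1, single, pair)
```

Because the score divides by both frequencies, a rare tag that always appears with `web` (here `css`) outranks a frequent tag that co-occurs with it only occasionally.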

Cohesivity: Each tag in $ET_i$ has a score that indicates its cohesivity to $ET_i$. The cohesivity of $t_j$ to $ET_i$ is $\sum_{t_k \in ET_i} f(t_k, t_j) / (f(t_j)\, f(t_k))$.
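A small sketch of the cohesivity score, under stated assumptions: frequencies are passed in as plain dictionaries, pair counts are keyed by `frozenset`, and the term $t_k = t_j$ is skipped (the slide does not say whether a tag is compared with itself). The names and counts are illustrative only.

```python
def cohesivity(tag, expanded, single, pair):
    """Sum of f(t_k, tag) / (f(tag) * f(t_k)) over the other tags t_k in the set."""
    total = 0.0
    for other in expanded:
        if other == tag:
            continue  # assumption: a tag is not compared with itself
        joint = pair.get(frozenset((tag, other)), 0)
        if joint:
            total += joint / (single[tag] * single[other])
    return total


# Hypothetical frequency counts.
single_counts = {"python": 3, "tutorial": 2, "css": 1, "web": 2}
pair_counts = {
    frozenset(("python", "tutorial")): 2,
    frozenset(("python", "web")): 1,
    frozenset(("css", "web")): 1,
}
expanded_set = {"python", "tutorial", "css"}
c_python = cohesivity("python", expanded_set, single_counts, pair_counts)
c_css = cohesivity("css", expanded_set, single_counts, pair_counts)
```

In this toy data `css` scores zero because it never co-occurs with the other members of the expanded set, which marks it as a candidate for pruning.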

Pruning $ET_i$ (two alternatives)
1. Sort the tags in $ET_i$ by popularity and take the top 7 as the suggested tag set $ST_i$.
2. Sort the tags in $ET_i$ by popularity × cohesivity and take the top 7 as the suggested tag set $ST_i$.
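Both pruning alternatives can be expressed with one helper. This is a sketch under assumptions: popularity and cohesivity are supplied as precomputed dictionaries, ties are broken alphabetically, and the example uses `k=2` instead of 7 only to keep the data small.

```python
def prune(expanded, popularity, cohesivity=None, k=7):
    """Return the top-k tags by popularity, or by popularity * cohesivity."""
    def sort_key(tag):
        value = popularity.get(tag, 0)
        if cohesivity is not None:
            value *= cohesivity.get(tag, 0.0)
        return (-value, tag)  # highest value first, then alphabetical

    return sorted(expanded, key=sort_key)[:k]


# Hypothetical scores for four candidate tags.
candidates = {"python", "tutorial", "css", "web"}
popularity = {"python": 30, "tutorial": 10, "css": 20, "web": 5}
cohesion = {"python": 0.5, "tutorial": 1.0, "css": 0.1, "web": 0.4}

by_popularity = prune(candidates, popularity, k=2)
by_product = prune(candidates, popularity, cohesion, k=2)
```

With these numbers the popular but poorly connected tag `css` survives the first alternative and is dropped by the second.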

Experiment Plan: Dataset from the del.icio.us RSS API, Mar 28 to April 19: (count missing) URLs, (count missing) taggings, 8392 users.
1. del.icio.us/rss/popular every 30 min; del.icio.us/rss/recent every 2 min
2. del.icio.us/rss/url?url=xxx.com
Suggest tags from no user up to the 10th user.

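Evaluation Plan: For each URL, we have the mature tags and the suggested tags at each iteration, so recall and precision can be calculated. Four configurations, numbered 1 to 4, cross two factors: expanding with relevant tags (with or without) and pruning with cohesivity (with or without). Configuration 1 (without either) is the baseline, configuration 4 uses both, and configurations 2 and 3 each use one of the two.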
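Precision and recall for one URL at one iteration can be computed as below. The function name and the sample tag sets are hypothetical; returning 0.0 for an empty set is an assumption about how to handle the undefined case.

```python
def precision_recall(suggested, mature):
    """Compare suggested tags with the mature tags of the same URL."""
    suggested, mature = set(suggested), set(mature)
    hits = len(suggested & mature)
    precision = hits / len(suggested) if suggested else 0.0
    recall = hits / len(mature) if mature else 0.0
    return precision, recall


# Hypothetical tag sets for one URL.
suggested_tags = ["python", "tutorial", "css", "web"]
mature_tags = ["python", "programming", "tutorial", "howto", "code"]
precision, recall = precision_recall(suggested_tags, mature_tags)
```

Running this for each of the four configurations, from no user up to the 10th user, would give one precision curve and one recall curve per configuration.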