Published by Liberty Miles. Modified over 9 years ago.
1
Center for E-Business Technology
Seoul National University
Seoul, Korea
Socially Filtered Web Search: An approach using social bookmarking tags to personalize web search
Kay-Uwe Schmidt*, Tobias Sarnow*, Ljiljana Stojanovic**
*SAP Research, Vincenz-Prießnitz-Straße 1, 76131 Karlsruhe, Germany
**Forschungszentrum Informatik, Haid-und-Neu-Straße 10-14, 76131 Karlsruhe, Germany
Symposium on Applied Computing (2009)
2009. 08. 13.
Summarized & presented by Babar Tareen, IDS Lab., Seoul National University
2
Copyright 2008 by CEBT
Introduction
- Search engines do not consider current work context
- Static results for all users
- Server-side personalization has limited use
- Client-side search engines rely on additional terms extracted from documents, thus not scalable
- Social bookmarking based search result personalization addresses these issues
3
Related Work
- Google History
- goZone.com
- Mahalo.com
- UCAIR
4
Motivation
- A developer is looking for guidelines for testing DB code
- Visits www.ibm.com/db2 and www.hsqldb.org
- Googles “Test”
Original Results
- Web based certification
- Personality test
- Bandwidth test
Personalized Results
- DB2 training
- DB2 programming test
5
Personalizing Search Results
- Tracking browsing behavior
- Create user model
  - URLs
  - Tags fetched from Delicious
- Issue original query
- Enhance search query by adding tags
- Issue new query
- Display both results
Tags given by a community of users provide a good summary of web page content
URL               Tags (Metadata)
www.youtube.com   video, youtube, entertainment, web2.0
www.amazon.com    shopping, books, amazon, music
www.snu.ac.kr     university, snu, korea, 서울대
www.hsqldb.org    database, java, sql, opensource
www.ibm.com/db2   ibm, db2, database, unix
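The flow on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `TAGS_BY_URL` dictionary stands in for tags fetched from Delicious (values copied from the slide's table), and the names `build_user_model` and `enhance_query`, as well as the rule of appending the most frequent tag, are assumptions made for the example.

```python
from collections import Counter

# Stand-in for tags fetched from Delicious (values taken from the slide's table).
TAGS_BY_URL = {
    "www.hsqldb.org": ["database", "java", "sql", "opensource"],
    "www.ibm.com/db2": ["ibm", "db2", "database", "unix"],
    "www.youtube.com": ["video", "youtube", "entertainment", "web2.0"],
}


def build_user_model(visited_urls):
    """Count how often each tag occurs across the visited URLs."""
    tag_counts = Counter()
    for url in visited_urls:
        tag_counts.update(TAGS_BY_URL.get(url, []))
    return tag_counts


def enhance_query(query, tag_counts, max_tags=1):
    """Append the most frequent tags to the query (hypothetical selection rule)."""
    query_terms = query.lower().split()
    extra = [tag for tag, _ in tag_counts.most_common(max_tags)
             if tag.lower() not in query_terms]
    return " ".join([query] + extra)


visited = ["www.ibm.com/db2", "www.hsqldb.org"]
model = build_user_model(visited)

original_query = "Test"
personalized_query = enhance_query(original_query, model)

print(original_query)      # issued unchanged for the original results
print(personalized_query)  # issued additionally for the personalized results
```

With the two database sites from the Motivation slide as browsing history, "database" is the only tag shared by both pages, so it is the one appended to the query.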
6
Architecture [1]
Search Module
- Carries out original query
- Inserts space ( ) for personalized results
Metric Module
- Includes a metric that delivers a tag for personalized search
Search Enhancer Module
- Combines search string with metric module tags
Metadata Module
- Extracts metadata for a visited website from delicious
7
Architecture [2]
Built as an add-on on top of
- Firefox
- Internet Explorer
8
Metric [1]
- Two datasets
  - Collection of visited websites
  - Tags for each website
- Query the last 20 distinct websites from the user model
  - Format (url, count)
  - Sorted by weight ‘γ’
9
Metric [2]
- Tags assigned to website
  - Format (tag, no. of users)
- t → tags assigned to a website
- T → tags for all websites
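The formula behind the metric did not survive in this transcript, so the following is only a sketch of how the two datasets could be combined. It assumes that the weight γ of a website is its share of the visits in the 20-site window, and that a tag's score is γ times the tag's share of the users who tagged that site, summed over all sites. Both rules, the function names, and the sample numbers are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def site_weights(visits, limit=20):
    """visits: list of (url, count) for distinct sites, most recent first.

    Returns (url, gamma) pairs sorted by gamma, descending.
    Assumption: gamma is the site's share of all visits in the window.
    """
    window = visits[:limit]
    total = sum(count for _, count in window)
    weights = [(url, count / total) for url, count in window]
    return sorted(weights, key=lambda pair: pair[1], reverse=True)


def score_tags(weights, site_tags):
    """site_tags maps url -> list of (tag, number_of_users), i.e. t per website.

    Returns a score for every tag in T (the tags of all websites).
    Assumption: score = sum over sites of gamma * (tag's share of taggers).
    """
    scores = defaultdict(float)
    for url, gamma in weights:
        tags = site_tags.get(url, [])
        users_total = sum(users for _, users in tags)
        if users_total == 0:
            continue  # no usable tag data for this site
        for tag, users in tags:
            scores[tag] += gamma * users / users_total
    return dict(scores)


# Made-up sample data, for illustration only.
visits = [("www.ibm.com/db2", 6), ("www.hsqldb.org", 3), ("www.youtube.com", 1)]
site_tags = {
    "www.ibm.com/db2": [("db2", 50), ("database", 30), ("ibm", 20)],
    "www.hsqldb.org": [("database", 60), ("java", 40)],
    "www.youtube.com": [("video", 100)],
}

weights = site_weights(visits)
scores = score_tags(weights, site_tags)
best_tag = max(scores, key=scores.get)
print(best_tag)
```

Under these assumptions a tag that appears on several frequently visited sites can outrank a tag that dominates a single site, which is the behavior the Motivation slide relies on.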
10
Algorithm
11
Result
12
Evaluation
How effective can this be?
13
Center for E-Business Technology
Seoul National University
Seoul, Korea
Can Social Bookmarking Improve Web Search?
Paul Heymann, Georgia Koutrika, Hector Garcia-Molina
Dept. of Computer Science, Stanford University, USA
Web Search and Data Mining 2008
14
Positive Factors [1]
URLs
- Pages posted on delicious are often recently modified
  - Delicious users post interesting pages that are actively updated or have been recently created
- Approximately 25% of URLs posted by users are new, unindexed pages
  - Delicious can serve as a small data source for new web pages and can help with crawl ordering
- Roughly 9% of results for search queries are URLs present in delicious
  - Delicious URLs are disproportionately common in search results compared to their coverage
- While some users are more prolific than others, the top 10% of users only account for 56% of the posts
  - Delicious is not highly reliant on a relatively small group of users
15
Positive Factors [2]
URLs
- 30-40% of URLs and approximately one in eight domains posted were not previously in delicious
  - Delicious has relatively little redundancy in page information
Tags
- Popular query terms and tags overlap significantly
  - Delicious may be able to help with queries where tags overlap with query terms
- In this study, most tags were deemed relevant and objective by users
  - Tags are on the whole accurate
16
Negative Factors
URLs
- Approximately 120,000 URLs are posted to delicious each day
  - The number of posts per day is relatively small; for instance, it represents 1/10 of the number of blog posts per day
- There are roughly 115 million public posts, coinciding with about 30-50 million unique URLs
  - The number of total posts is relatively small; for instance, this is a small portion of the web as a whole (perhaps 1/1000)
Tags
- Tags are present in the page text of 50% of the pages they annotate
  - A substantial proportion of tags are obvious in context, and many tagged pages would be discovered by a search engine
- Domains are often highly correlated with particular tags and vice versa
  - It may be more efficient to train librarians to label domains than to ask users to tag pages
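The ratios quoted on this slide imply a few further numbers, which the short calculation below makes explicit. It uses only the figures stated above. The variable names are illustrative, and the derived values are rough implications of the slide's own estimates, not measurements.

```python
# Figures quoted on the slide.
urls_posted_per_day = 120_000
public_posts = 115_000_000
unique_urls_low, unique_urls_high = 30_000_000, 50_000_000

# "1/10 of the number of blog posts per day" implies this many blog posts per day.
implied_blog_posts_per_day = urls_posted_per_day * 10

# "Perhaps 1/1000" of the web implies a web of roughly this many pages.
implied_web_pages_low = unique_urls_low * 1000
implied_web_pages_high = unique_urls_high * 1000

# Average number of posts per unique URL (fewer unique URLs means more posts each).
posts_per_url_low = public_posts / unique_urls_high
posts_per_url_high = public_posts / unique_urls_low

print(f"Implied blog posts per day: {implied_blog_posts_per_day:,}")
print(f"Implied web size: {implied_web_pages_low:,} to {implied_web_pages_high:,} pages")
print(f"Posts per unique URL: {posts_per_url_low:.1f} to {posts_per_url_high:.1f}")
```

Taken at face value, the 1/1000 estimate means a uniformly random web page would be found in delicious only about 0.1% of the time, whereas the roughly 9% overlap with search results reported earlier suggests coverage is much better for pages that users actually search for.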
17
Discussion
- Query expansion model based on social tagging
- What is the probability of finding tags for a random URL in delicious.com?
- Generalization vs. specialization