Explorations in Tag Suggestion and Query Expansion Jian Wang and Brian D. Davison Lehigh University, USA SSM 2008 (Workshop on Search in Social Media)

2 Outline
Introduction
Auto-tagging System
Tag Suggestion for Query Expansion
User Study
Conclusion

4 Introduction
Factors that reduce the quality of search results
–Query ambiguity: different needs result in the same query (e.g., “java” may imply a Java tutorial or Java software)
–Vocabulary mismatch
–Lack of knowledge regarding document contents

5 Motivation
Query expansion / suggestion
–Assist users in issuing better queries
Using tags for query expansion
–Tags reflect various users’ perspectives on the same article.
–Unlike traditional topical categorization, such as ODP or Yahoo!, tags are much easier to understand.
Automatic tagging
–Assign tags to documents that have not been tagged

7 Auto-tagging System
SVM for multi-class classification
–A set of binary classifiers
–Features: webpage text
Data
–500 pages for each of the 140 most popular tags in Delicious
–Training: for each tag, M positive and K negative documents
–Testing: 503 webpages
–All documents are larger than 800 bytes
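The "set of binary classifiers" idea can be illustrated with a small one-vs-rest sketch. This is not the authors' implementation: the bag-of-words features, the Pegasos-style training loop, the deterministic choice of the first `m` positives and `k` negatives, and all function names and toy pages are assumptions made for illustration only.

```python
from collections import Counter

def featurize(text):
    """Bag-of-words term counts from the page text (an assumed, simple feature choice)."""
    return Counter(text.lower().split())

def score(weights, feats):
    """Linear decision value w . x for one document."""
    return sum(weights.get(term, 0.0) * count for term, count in feats.items())

def train_binary_svm(examples, epochs=20, lam=0.01):
    """Fit a linear SVM (no bias term) with Pegasos-style subgradient steps.

    examples: list of (features, label) pairs, where label is +1 or -1.
    """
    weights = {}
    step = 0
    for _ in range(epochs):
        for feats, label in examples:
            step += 1
            eta = 1.0 / (lam * step)
            violated = label * score(weights, feats) < 1.0
            # Regularization shrink, applied on every step.
            for term in weights:
                weights[term] *= 1.0 - eta * lam
            # Hinge-loss update only when the margin is violated.
            if violated:
                for term, count in feats.items():
                    weights[term] = weights.get(term, 0.0) + eta * label * count
    return weights

def train_one_vs_rest(docs, m, k):
    """Train one binary classifier per tag from (text, tags) pairs.

    For each tag, the first m pages carrying the tag are positives and the
    first k pages without it are negatives (this selection rule is assumed).
    """
    all_tags = sorted({tag for _, tags in docs for tag in tags})
    models = {}
    for tag in all_tags:
        pos = [(featurize(text), +1) for text, tags in docs if tag in tags][:m]
        neg = [(featurize(text), -1) for text, tags in docs if tag not in tags][:k]
        models[tag] = train_binary_svm(pos + neg)
    return models

def predict_tags(models, text):
    """Assign every tag whose classifier gives a positive decision value."""
    feats = featurize(text)
    return sorted(tag for tag, w in models.items() if score(w, feats) > 0.0)

# Hypothetical toy pages standing in for tagged Delicious pages.
TOY_DOCS = [
    ("python code tutorial loops", {"programming"}),
    ("python functions code examples", {"programming"}),
    ("pasta recipe tomato sauce", {"cooking"}),
    ("bread recipe oven baking", {"cooking"}),
]
```

Because each tag has its own classifier, a page can receive several tags or none, which fits tagging better than a single forced class label.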

8 Classification Results

9 Precision > Recall
“String distance” reduces the effect of minor differences between tags (e.g., blogs and blogging)
Only 328 webpages are evaluated by “String distance with min3”
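The difference between exact matching and a string-distance match can be sketched as follows. The transcript does not define the authors' “String distance” measure or the “min3” condition, so the relative Levenshtein threshold used here (`max_relative_distance`) is purely a hypothetical stand-in, as are the function names.

```python
def edit_distance(a, b):
    """Levenshtein distance via the standard row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]

def exact_match(a, b):
    return a == b

def fuzzy_match(a, b, max_relative_distance=0.5):
    """Hypothetical rule: tags match if their edit distance is small
    relative to the longer tag. The threshold is an assumption."""
    if a == b:
        return True
    return edit_distance(a, b) / max(len(a), len(b)) <= max_relative_distance

def precision_recall(predicted, actual, match):
    """Precision: share of predicted tags matching some actual tag.
    Recall: share of actual tags matched by some predicted tag."""
    if not predicted or not actual:
        return 0.0, 0.0
    hit_pred = sum(any(match(p, a) for a in actual) for p in predicted)
    hit_actual = sum(any(match(p, a) for p in predicted) for a in actual)
    return hit_pred / len(predicted), hit_actual / len(actual)
```

With predicted tags `["blogging", "python"]` and user tags `["blogs", "design"]`, exact matching credits nothing, while the fuzzy rule counts blogging and blogs as the same tag.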

10 Comparison with Related Work
AutoTag (WWW 2006)
–A collaborative filtering approach for tagging weblog posts
–Evaluated with “String distance with min3”
–Boosted by weblog author’s tags
–0.40 in Precision@10, 0.49 in Recall@10
TagAssist (ICWSM 2007)
–The approach is similar to AutoTag
–Evaluated with “Exact word match”
–Manual evaluation: 42.10% accuracy
–Automatic evaluation: 13.11% precision, 22.83% recall

12 The Steps
1. Submit the initial query to the search engine
2. Auto-tag the top documents of the initial result
3. Use popular tags to expand the initial query
4. a. The user selects query suggestions manually (Tag Suggestion System)
   b. Automatically combine the retrieval results of the various suggestions (Tag Auto-combine System)
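Steps 2 and 3 can be sketched as a tag-counting pass over the already-tagged top results. The function name, the tie-breaking order, and the choice to skip tags that already appear in the query are assumptions for illustration. How step 4b combines result lists is not described in this transcript and is not modeled here.

```python
from collections import Counter

def suggest_queries(query, tagged_pages, num_suggestions=8):
    """Expand a query with the most popular tags among its top results.

    tagged_pages: one list of tags per top-ranked page, as an auto-tagger
    might produce them. Each page votes at most once per tag.
    """
    counts = Counter()
    for tags in tagged_pages:
        counts.update(set(tags))

    query_terms = set(query.lower().split())
    suggestions = []
    # Most frequent tags first; ties broken alphabetically (an assumption).
    for tag, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if tag in query_terms:
            continue  # assumed: a tag already in the query adds nothing
        suggestions.append(f"{query} {tag}")
        if len(suggestions) == num_suggestions:
            break
    return suggestions

# Hypothetical tags for four top-ranked pages of the query "java".
pages = [
    ["java", "tutorial"],
    ["java", "software"],
    ["tutorial", "programming"],
    ["software", "tutorial"],
]
print(suggest_queries("java", pages, num_suggestions=2))
```

For the ambiguous query "java", the two suggestions separate the tutorial need from the software need, which is the disambiguation the introduction describes.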

14 First Step of User Study
Use Google as the search engine
20 unique queries from 4 graduate students
Top 50 returned pages are tagged automatically
Expand the initial query with the top 8 popular tags to generate 8 suggestions
Compare with suggestions from Google and Yahoo!
Users select their preferred suggestions in a random, anonymous manner

15 Results of User Study

16 Suggestion Examples

17 Second Step of User Study
Submit the modified query to Google
Users score the top-10 lists on a 0-5 scale, for all systems
Metrics for comparison
–Metric 1 (average relative improvement)
–Metric 2 (improvement for average relevance score)
–Metric 3 (average relative improvement)
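The slide's metric formulas did not survive in this transcript, so the sketch below shows only one plausible reading of two of the named quantities. It averages per-query relative gains for "average relative improvement" and takes the relative gain of the mean scores for "improvement for average relevance score". These definitions, the function names, and the handling of zero baseline scores are all assumptions. Metric 3 carries the same label as Metric 1 here and cannot be told apart, so it is left out.

```python
def average_relative_improvement(baseline, system):
    """Mean over queries of (system - baseline) / baseline.

    Queries with a baseline score of 0 are skipped to avoid division by
    zero (an assumption; the slides do not say how these are handled).
    """
    gains = [(s - b) / b for b, s in zip(baseline, system) if b != 0]
    return sum(gains) / len(gains) if gains else 0.0

def improvement_of_average(baseline, system):
    """Relative gain of the mean system score over the mean baseline score."""
    mean_b = sum(baseline) / len(baseline)
    mean_s = sum(system) / len(system)
    return (mean_s - mean_b) / mean_b

# Hypothetical 0-5 user scores for two queries.
google_scores = [2, 4]
expanded_scores = [3, 4]
```

The two readings can disagree: on the toy scores the first gives 0.25 and the second about 0.167, because the first weights every query equally while the second is dominated by queries with high scores.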

18 Results of User Study
Significant improvement over Google’s initial results

20 Conclusion
We build an auto-tagging system to assign tags to webpages.
Our approach focuses exclusively on the textual content, and thus is applicable when no usage information is available.
Our system utilizes characteristics of tags to expand queries and provide options for users.
A user study shows better performance than Google and Yahoo! suggestions.
