Tagging with Queries: How and Why?

Presentation on theme: "Tagging with Queries: How and Why?"— Presentation transcript:

1 Tagging with Queries: How and Why?
Ioannis Antonellis Hector Garcia-Molina Jawed Karim

2 Content on the Web
Back Link Text, Search queries, Page Text, Forward Link Text
Cnn, Obama, Critics, news
Stanford Infolab

3 How?
Basic observation: the HTTP referrer (Referer) field contains the search query

4 How?

5 How?
Basic observation: the HTTP referrer (Referer) field contains the search query
1) Extract queries from the web access log
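
The basic observation can be illustrated with a minimal Python sketch that pulls the search query out of a referrer URL. The host-to-parameter table (QUERY_PARAMS) and the function name extract_query are assumptions made for illustration; the slides do not say which search engines or parameters were handled.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical table: search-engine host suffix -> name of its query parameter.
QUERY_PARAMS = {"google.com": "q", "search.yahoo.com": "p"}


def extract_query(referrer):
    """Return the normalized search query found in a referrer URL, or None."""
    parts = urlsplit(referrer)
    host = (parts.hostname or "").lower()
    for suffix, param in QUERY_PARAMS.items():
        # Accept the suffix itself or any subdomain of it (e.g. www.google.com).
        if host == suffix or host.endswith("." + suffix):
            values = parse_qs(parts.query).get(param)
            if values and values[0].strip():
                return values[0].strip().lower()
    return None


if __name__ == "__main__":
    print(extract_query("http://www.google.com/search?q=stanford+infolab&hl=en"))
```

Referrers that do not come from a listed search engine, or that carry no query, yield None and would simply be skipped.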

6 Web Access Log
a997c d75c03f22ca8715e50b3 [28/Feb/2007:23:45: ] /group/svsa/cgi-bin/www/officers.php
a64344ffd6638d0f6fb2a0284f98b28b [28/Feb/2007:23:45: ] /group/King/ "
413fa663474b2288c e7e62aea [28/Feb/2007:23:46: ] /group/pandegroup/folding/results.html "
3d2edd4dfa7778da92875ee67a [28/Feb/2007:23:46: ] /group/vpge/sgsi/entrepreneurship/ "
ac a6c490023e460fd4863a48 [28/Feb/2007:23:46: ] / "
1c
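
As a rough sketch of how such a log could be turned into query tags, the Python below groups queries by requested path. The line layout it expects (visitor hash, bracketed timestamp, path, quoted referrer) is an assumption, since the log excerpt in the transcript is truncated; the "q" parameter name and the choice to treat each whole query as one tag are also assumptions.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit, parse_qs

# Assumed layout: <visitor-hash> [<timestamp>] <path> "<referrer>"
LINE_RE = re.compile(
    r'^(?P<visitor>\S+) \[(?P<time>[^\]]*)\] (?P<path>\S+) "(?P<referrer>[^"]*)"$'
)


def query_tags_by_path(lines, param="q"):
    """Map each requested path to the set of search queries that led to it."""
    tags = defaultdict(set)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not fit the assumed layout
        referrer_query = urlsplit(match["referrer"]).query
        query = parse_qs(referrer_query).get(param, [""])[0].strip().lower()
        if query:
            tags[match["path"]].add(query)
    return dict(tags)


if __name__ == "__main__":
    sample = [  # made-up lines in the assumed layout, not real log data
        'abc123 [28/Feb/2007:23:45:01 -0800] /group/King/ "http://www.google.com/search?q=King+Papers"',
        'def456 [28/Feb/2007:23:46:10 -0800] / "-"',
    ]
    print(query_tags_by_path(sample))
```

Splitting each query into individual terms instead of keeping it whole would be an equally plausible tagging choice.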

7 How?
Basic observation: the HTTP referrer (Referer) field contains the search query
1) Extract queries from the web access log
2) Embed JavaScript code in web pages that captures search queries

8 Embeddable code

9 How?
Basic observation: the HTTP referrer (Referer) field contains the search query
1) Extract queries from the web access log
2) Embed JavaScript code in web pages and capture search queries
Convince the server administrator/page owner

11 Query tags

12 Information value of Query Tags
Datasets:
Stanford Query Logs: 360,000 URLs, 900,000 query tags
Delicious: 3,000 URLs, 5,500 tags
WebBase

13 Experiments - Summary
URL coverage
Query vs Delicious Tags
Query/Delicious Tags vs Page Text

14 URL coverage
Query logs provide tags for ~110 times more URLs than Delicious
13% of Delicious URLs (380 URLs) are tagged only by Delicious
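
A coverage comparison of this kind can be sketched with set operations in Python. The function name coverage and the toy URL sets are hypothetical; the sketch only shows how the two reported quantities (the URL ratio and the share of Delicious URLs with no query tags) could be computed, not how the authors computed them.

```python
def coverage(query_urls, delicious_urls):
    """Compare the URL sets covered by query tags and by Delicious tags."""
    query_urls, delicious_urls = set(query_urls), set(delicious_urls)
    if not delicious_urls:
        raise ValueError("delicious_urls must not be empty")
    only_delicious = delicious_urls - query_urls
    return {
        # How many times more URLs the query logs cover.
        "url_ratio": len(query_urls) / len(delicious_urls),
        # Share of Delicious URLs that have no query tags at all.
        "only_delicious_fraction": len(only_delicious) / len(delicious_urls),
        # URLs tagged by both sources ("common URLs").
        "common": len(query_urls & delicious_urls),
    }


if __name__ == "__main__":
    # Toy data for illustration only.
    print(coverage({"a", "b", "c", "d"}, {"c", "x"}))
```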

15 Query Tags
Query logs provide 42 query tags per URL on average

16 Delicious Tags
Delicious provides 3 tags per URL on average

17 Tags for common URLs
Query logs provide 250 query tags per URL on average for common URLs
Delicious provides 5 tags per URL on average for common URLs
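
The per-URL averages above, overall and restricted to common URLs, can be sketched as follows. The helper name mean_tags_per_url and the toy mapping are hypothetical, and the sketch assumes tags are stored as a set per URL.

```python
def mean_tags_per_url(tags_by_url, urls=None):
    """Average number of tags per URL, optionally restricted to `urls`."""
    if urls is None:
        selected = tags_by_url
    else:
        selected = {u: t for u, t in tags_by_url.items() if u in urls}
    if not selected:
        return 0.0
    return sum(len(tags) for tags in selected.values()) / len(selected)


if __name__ == "__main__":
    # Toy data for illustration only.
    query_tags = {"a": {"x", "y", "z"}, "b": {"x"}}
    common_urls = {"a"}  # URLs tagged by both sources
    print(mean_tags_per_url(query_tags))
    print(mean_tags_per_url(query_tags, common_urls))
```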

18 Query Tags vs Page Text
For every URL, 1 out of 3 query tags is not present in the page text

19 Delicious Tags vs Page Text
For every URL, 1 out of 2 Delicious tags is not present in the page text

20 Tags for common URLs
For common URLs, 1 out of 2 query/Delicious tags is not present in the page text
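
The tag versus page text comparison can be sketched in Python as the fraction of a URL's tags that do not occur in its page text. The matching rule used here (a tag counts as present when every one of its words appears in the page text, case-insensitively) is an assumption; the slides do not state the rule that was applied.

```python
import re

WORD_RE = re.compile(r"[a-z0-9]+")


def missing_fraction(tags, page_text):
    """Fraction of tags whose words are not all found in the page text."""
    tags = list(tags)
    if not tags:
        return 0.0
    page_words = set(WORD_RE.findall(page_text.lower()))
    missing = [
        tag for tag in tags
        if not all(word in page_words for word in WORD_RE.findall(tag.lower()))
    ]
    return len(missing) / len(tags)


if __name__ == "__main__":
    # Toy data for illustration only.
    text = "Stanford InfoLab: the database group"
    print(missing_fraction(["infolab", "stanford database", "wiki"], text))
```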

21 Conclusions
Query tags:
Can be extracted in a distributed fashion
Are a promising new source of information
Can provide many new tags for a large fraction of the Web

22 Thank You! (DEMO)
