Web queries classification. Nguyen Viet Bang, WING group meeting, June 9th, 2006.


1 Web queries classification. Nguyen Viet Bang, WING group meeting, June 9th, 2006

2 What is meant by “classification of queries by their goals”?
A taxonomy by [Rose and Levinson]:
– Navigational: locate a specific website. Example: “Stanford University”
– Informational: find out about a topic. Example: “European history”
– Resource: find a resource. Example: “download Beatles lyrics”
Note: there are further sub-categories. [Broder] proposes a similar taxonomy.

3 What is this research about?
An outline (after [Rose and Levinson]):
(i) Determine a framework to classify queries according to their goals.
(ii) Given queries, find a way to associate the goals determined in (i) with the queries.
(iii) With the queries classified in (ii), exploit that information to enhance current search engines.

4 Outline: Problem (i)
(i) Determine a classification framework according to the goals of users’ queries (a taxonomy by [Rose and Levinson]).
(ii) Given queries, find a way to associate the goals determined in (i) with the queries.
(iii) With the queries classified in (ii), exploit that information to enhance current search engines.

5 Outline: Problem (ii). Associate the goals with the queries
(i) Determine a classification framework according to the goals of users’ queries.
(ii) Given queries, find a way to associate the goals determined in (i) with the queries.
(iii) With the queries classified in (ii), exploit that information to enhance current search engines.

6 Outline: Problem (ii). Associate the goals with the queries
(1) Manually: ask users (present a user interface)
(2) Automated classification
2.1. Use extra information (other than the queries themselves)
– Clickthrough data (user click history) [Lee, Liu and Cho]
– Link information (anchor text distribution) [Lee, Liu and Cho]
– Many other features: distribution of queries, PageRank, mutual information
2.2. Machine learning
2.3. How about looking at the queries only?

7 An example: click distribution
Intuition: for “navigational” queries, users tend to click on one single result.
Algorithm:
– Sort the results of a search in descending order of the number of clicks (this yields a distribution)
– Calculate a descriptive statistic of the distribution (e.g., the mean)
– If the mean value > some threshold, classify the query as “navigational”
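The click-distribution idea on this slide can be sketched roughly as follows. This is a minimal illustration, not the method of [Lee, Liu and Cho]: the slide names the mean as an example statistic, whereas the sketch swaps in the share of clicks received by the most-clicked result, so that a larger value clearly means more concentrated clicks. The function names and the default threshold of 0.8 are assumptions made for the example.

```python
def click_distribution(clicks_per_result):
    """Return click fractions sorted from most-clicked to least-clicked result."""
    total = sum(clicks_per_result)
    if total == 0:
        return []  # no clicks recorded for this query
    ordered = sorted(clicks_per_result, reverse=True)
    return [count / total for count in ordered]


def top_click_share(clicks_per_result):
    """Fraction of all clicks that went to the single most-clicked result."""
    distribution = click_distribution(clicks_per_result)
    return distribution[0] if distribution else 0.0


def looks_navigational(clicks_per_result, threshold=0.8):
    """Guess 'navigational' when clicks concentrate on one result.

    The threshold is an assumed value; in practice it would be tuned on
    labelled queries.
    """
    return top_click_share(clicks_per_result) > threshold
```

For instance, `looks_navigational([95, 3, 2])` treats a query whose clicks fall almost entirely on one result as navigational, while a flat distribution such as `[30, 25, 25, 20]` is not.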

8 Automated classification (contd.)
Combination of features: yields higher accuracy [Lee, Liu and Cho]
Machine learning:
– Unsupervised (clustering)
– Supervised (training data may be lacking)

9 Problem (iii): retrieve results after classification
Need different strategies for each category [Kang and Kim]
Information to analyze:
– Content information (the webpage itself)
– Link information (topology of links in the web)
– URL information (e.g., to decide whether a webpage is a “root”, i.e. a site entry page)
More techniques: boolean combination (“and” or “or”)
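As a rough illustration of the URL information mentioned on this slide, the sketch below checks whether a URL looks like a site entry page. It is an assumption-laden heuristic, not the approach of [Kang and Kim]: the function name `is_site_root` and the list of index file names are hypothetical, and the URL is assumed to include a scheme such as `http://`.

```python
from urllib.parse import urlparse

# Hypothetical list of file names treated as equivalent to the site root.
INDEX_PAGES = {"index.html", "index.htm"}


def is_site_root(url):
    """Return True when the URL path is empty, '/', or a bare index page."""
    parsed = urlparse(url)
    if parsed.query:
        return False  # URLs carrying a query string are not treated as roots
    path = parsed.path.strip("/")
    return path == "" or path in INDEX_PAGES
```

Under these assumptions, `is_site_root("http://www.stanford.edu/")` is true, while a deeper page such as `http://www.stanford.edu/about/history.html` is not.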

10 Our challenge
Try to achieve accurate classification by looking at features of the queries only:
– Part-of-speech (POS) tags
– Relationship between queries
– Features of URLs returned by search engines (Meurlin?)
Enhance search retrieval
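To make “features of the queries only” concrete, here is a small sketch that derives a few surface features from the query string alone. These features are illustrative assumptions and are not the ones listed on the slide (POS tags, relationships between queries, URL features); the cue-word list and the function name `query_features` are hypothetical.

```python
import re

# Hypothetical cue words suggesting the user wants to obtain a resource.
RESOURCE_VERBS = {"download", "buy", "get", "watch"}

# Matches tokens ending in a few common top-level domains (an assumed list).
DOMAIN_SUFFIX = re.compile(r"\.(com|org|edu|net)$")


def query_features(query):
    """Extract simple surface features from the query text alone."""
    tokens = query.lower().split()
    return {
        "num_tokens": len(tokens),
        "has_domain_like_token": any(DOMAIN_SUFFIX.search(t) for t in tokens),
        "has_resource_verb": any(t in RESOURCE_VERBS for t in tokens),
    }
```

A feature dictionary like this could be fed to any of the classifiers discussed earlier; for the slide's example query “download Beatles lyrics”, the `has_resource_verb` feature would be set.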

