Query Type Classification for Web Document Retrieval


1 Query Type Classification for Web Document Retrieval
In-Ho Kang, GilChang Kim
Presented by: Xiaoguang Qi

2 Introduction
Combine multiple sources of evidence to compensate for the weakness of any single source
  Content-based and link-based
Three types of user queries
  Topic relevance task (informational), e.g. “What is a prime factor?”
  Homepage finding task (navigational), e.g. “John Hopkins Medical Institutions”
  Service finding task (transactional), e.g. “Where can I buy concert ticket?”

3 Multiple Sources of Information
Content information: TF-IDF
Link information: PageRank
URL information: URLprior (Kraaij et al.), an estimated prior probability that a page is an entry page, based on its URL
  Root (71.7%)
  Subroot (13.2%), e.g. trec.nist.gov/pubs/
  Path (5.7%), e.g. trec.nist.gov/pubs/trec9/papers/
  File (5.7%)
Combination: linear combination
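
The URL typing and the linear combination on this slide can be sketched as follows. This is only an illustration under stated assumptions: the rules in `url_type` are a simplified reading of the four URL classes, the function names and the weights `w_content`, `w_link`, and `w_url` are hypothetical, and the slide percentages are reused directly as stand-in prior values.

```python
from urllib.parse import urlparse

# Percentages as listed on the slide, reused here as stand-in prior values
# (an assumption made for illustration).
URL_PRIOR = {"root": 0.717, "subroot": 0.132, "path": 0.057, "file": 0.057}

# Assumed: a trailing index page does not change the URL type.
INDEX_FILES = {"index.html", "index.htm"}


def url_type(url: str) -> str:
    """Classify a URL as root, subroot, path, or file (simplified rules)."""
    # Prepend "//" to scheme-less URLs so urlparse sees the host as netloc.
    parsed = urlparse(url if "//" in url else "//" + url)
    segments = [s for s in parsed.path.split("/") if s]
    if segments and segments[-1].lower() in INDEX_FILES:
        segments.pop()
    if not segments:
        return "root"
    if "." in segments[-1]:
        return "file"
    return "subroot" if len(segments) == 1 else "path"


def combined_score(content: float, link: float, url: str,
                   w_content: float = 0.6, w_link: float = 0.2,
                   w_url: float = 0.2) -> float:
    """Linear combination of content, link, and URL evidence.

    The weights are hypothetical placeholders, not values from the slides.
    """
    return (w_content * content
            + w_link * link
            + w_url * URL_PRIOR[url_type(url)])


if __name__ == "__main__":
    for u in ("trec.nist.gov/", "trec.nist.gov/pubs/",
              "trec.nist.gov/pubs/trec9/papers/"):
        print(u, url_type(u))
```

In practice the three scores would need to be normalized to comparable ranges before a weighted sum is meaningful; that step is omitted here.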

4 Topic Relevance Task & Homepage Finding Task
Text representation
  Anchor text
  Common content
Query term matching
  “and”
  “sum”
Combination of information

5 Topic Relevance Task & Homepage Finding Task (Cont.)
Some conclusions

                                      Topic relevance task    Homepage finding task
Text representation                   Full text               Anchor text and title
Query term matching                   “sum”                   “and”
Combining URL and link information    Useless                 Useful
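
One way to read these conclusions is as a per-task retrieval configuration. The sketch below assumes that “sum” adds the scores of whichever query terms match and that “and” scores only documents containing every query term; that interpretation, the `STRATEGY` table layout, and all function names are assumptions made for illustration, not definitions taken from the slides.

```python
def score_sum(query_terms, term_scores):
    """'sum' matching: add up the scores of whichever query terms match."""
    return sum(term_scores.get(t, 0.0) for t in query_terms)


def score_and(query_terms, term_scores):
    """'and' matching: score only documents containing every query term."""
    if all(term_scores.get(t, 0.0) > 0.0 for t in query_terms):
        return score_sum(query_terms, term_scores)
    return 0.0


# Per-task settings mirroring the conclusions on the slide.
STRATEGY = {
    "topic_relevance": {
        "text": "full_text",
        "matching": score_sum,
        "use_url_and_link": False,
    },
    "homepage_finding": {
        "text": "anchor_text_and_title",
        "matching": score_and,
        "use_url_and_link": True,
    },
}


def score_document(query_type, query_terms, term_scores):
    """Score one document with the matching rule chosen for the query type."""
    return STRATEGY[query_type]["matching"](query_terms, term_scores)


if __name__ == "__main__":
    # Hypothetical per-term scores for a document that lacks "factor".
    doc = {"prime": 1.5}
    query = ["prime", "factor"]
    print(score_document("topic_relevance", query, doc))
    print(score_document("homepage_finding", query, doc))
```

A document that matches only some query terms still receives a score under “sum” but is dropped under “and”, which is the behavioral difference the table relies on.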

6 User Query Classification
Preparation of language model
Distribution of query terms
  Single-term query: calculate chi-square
  Multi-term query
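
The slide does not show the exact statistic used for single-term queries, so the following is only a sketch of a standard Pearson chi-square test on a 2x2 table: how often a term occurs in a topic-oriented collection versus a homepage-oriented collection. The two collections, the count dictionaries, and the function names are all assumptions introduced for illustration.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def term_chi_square(term, topic_counts, home_counts):
    """Compare a term's frequency in two hypothetical collections.

    topic_counts / home_counts map each term to its occurrence count in a
    topic-oriented and a homepage-oriented collection (assumed inputs).
    """
    a = topic_counts.get(term, 0)
    c = home_counts.get(term, 0)
    b = sum(topic_counts.values()) - a  # other terms, topic collection
    d = sum(home_counts.values()) - c   # other terms, homepage collection
    return chi_square_2x2(a, b, c, d)


if __name__ == "__main__":
    topic = {"factor": 10, "other": 90}
    home = {"factor": 30, "other": 70}
    print(term_chi_square("factor", topic, home))
```

A large statistic indicates that the term is distributed differently across the two collections; the side with the higher relative frequency then suggests the query type.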

7 User Query Classification (Cont.)
Mutual information
  Discriminate the two types of queries by using term co-occurrence frequency
Usage rate as an anchor text
  If a query term appears frequently in titles and anchor texts, the query is classified as a homepage finding task
Part-of-speech information
  If a query contains a verb other than the “be” verb, it is classified as a topic relevance task
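
The three signals on this slide can be sketched as small helper functions. This is a minimal illustration, not the authors' implementation: pointwise mutual information is used as one common formulation of mutual information over co-occurrence counts, the query is assumed to arrive already tagged with Penn Treebank style part-of-speech tags, and every function name and input here is hypothetical.

```python
import math

BE_FORMS = {"be", "am", "is", "are", "was", "were", "been", "being"}


def pmi(pair_count: int, x_count: int, y_count: int, total: int) -> float:
    """Pointwise mutual information of two terms from co-occurrence counts."""
    if min(pair_count, x_count, y_count) == 0:
        return 0.0
    return math.log((pair_count * total) / (x_count * y_count))


def anchor_usage_rate(anchor_title_count: int, total_count: int) -> float:
    """Fraction of a term's occurrences that fall in titles or anchor texts."""
    return anchor_title_count / total_count if total_count else 0.0


def has_non_be_verb(tagged_query) -> bool:
    """True if the query contains a verb other than a form of 'be'.

    tagged_query is a list of (word, tag) pairs; tags starting with "VB"
    are treated as verbs (Penn Treebank convention, assumed here).
    """
    return any(tag.startswith("VB") and word.lower() not in BE_FORMS
               for word, tag in tagged_query)


if __name__ == "__main__":
    query = [("What", "WP"), ("is", "VBZ"), ("a", "DT"),
             ("prime", "JJ"), ("factor", "NN")]
    print(has_non_be_verb(query))
    print(anchor_usage_rate(30, 40))
```

Under the slide's rules, a high anchor usage rate points toward the homepage finding task, while a non-“be” verb points toward the topic relevance task; how the signals are weighed against each other is not specified on the slide.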

8 Experiments
Dataset: WT10g
Query classification

9 Experiments (Cont.)
The improvement of IR performance

10 Discussion & Conclusion
Identifying service finding queries needs more sophisticated analysis of anchor texts
User query classification can be applied to various areas (e.g. MetaSearch)
Conclusion
  Classify user queries into different categories
  Use different strategies for different types of query
  Experiments show improved IR performance

