University Of Seoul Ubiquitous Sensor Network Lab Query Dependent Pseudo-Relevance Feedback based on Wikipedia 2010.05.04 School of Electrical and Computer Engineering, USN Lab G201049005.


1 University Of Seoul Ubiquitous Sensor Network Lab
Query Dependent Pseudo-Relevance Feedback based on Wikipedia
2010.05.04
School of Electrical and Computer Engineering, USN Lab
G201049005 김 은 환

2 Contents
Ubiquitous Sensor Network Lab, University Of Seoul, Electrical and Computer Engineering (1/14)
- Introduction
- Query Categorization
  - Wikipedia Data Set
  - Query Categorization
- Query Expansion Methods
  - Relevance Model
  - Strategy for Entity/Ambiguous Queries
  - Field Evidence for Query Expansion
- Experiments
  - Experiment Settings
  - Baselines
  - Using Entity Pages for Relevance Feedback
  - Field Based Expansion
- Conclusion

3 Introduction
- The aim of this study is to explore the utility of Wikipedia as a resource for improving IR through pseudo-relevance feedback (PRF)
- Query expansion has long been a focus for researchers because it has the potential to enhance IR effectiveness
  - Relevant
  - Irrelevant
- Supervised and unsupervised methods
- Three types of query
  - Queries about a specific entity (EQ)
  - Ambiguous queries (AQ)
  - Broader queries (BQ)

4 Introduction
- Among query expansion methods, pseudo-relevance feedback (PRF) is attractive because it requires no user input
- PRF assumes that the top-ranked documents in the initial retrieval are relevant
- However, this assumption is often invalid, which can hurt PRF performance
- Meanwhile, as the volume of data on the web grows, other resources have emerged that can potentially supplement the initial search in PRF
  - e.g. Wikipedia

5 Query Categorization
- Wikipedia Data Set
  - A topic in Wikipedia has a distinct
    - Person
    - Place
    - Organization or miscellaneous
  - In addition, important information about the topic of a given article may also be found in other Wikipedia articles
  - With the help of this enriched text, we can expect to bridge the gap between the large volume of information on the web and the simple queries issued by users

6 Query Categorization
- Queries about a specific entity (EQ)
  - These are queries that have a specific meaning and cover a narrow topic
  - The corresponding entity page is the page whose title field is the same as the query
  - Queries that exactly match the title of an entity page or a redirect page are classified as EQ
    - e.g. “Seoul”
  - Thus an EQ can be mapped directly to the entity page with the same title

7 Query Categorization
- Ambiguous Queries (AQ)
  - These are queries whose terms have more than one potential meaning
    - e.g. “Apple”
  - A disambiguation process is needed to determine the query's sense
- Broader Queries (BQ)
  - The remaining queries are denoted BQ because they are neither ambiguous nor focused on a specific entity
    - e.g. “Orange”
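
The three-way categorization above can be sketched as a title lookup. This is only an illustrative sketch, not the presenter's implementation: the function name `classify_query`, the three title sets, and the toy data are hypothetical. The slides state the exact-title rule for EQ; matching AQ against disambiguation page titles, and checking EQ before AQ, are assumptions made here.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so title matching is exact but case-insensitive."""
    return " ".join(text.lower().split())


def classify_query(query, entity_titles, redirect_titles, disambiguation_titles):
    """Return "EQ", "AQ", or "BQ" for a query string.

    EQ: exact match with an entity page title or a redirect page title (rule from the slides).
    AQ: exact match with a disambiguation page title (assumption, not stated in the slides).
    BQ: everything else.
    """
    q = normalize(query)
    entities = {normalize(t) for t in entity_titles}
    redirects = {normalize(t) for t in redirect_titles}
    ambiguous = {normalize(t) for t in disambiguation_titles}

    if q in entities or q in redirects:
        return "EQ"
    if q in ambiguous:
        return "AQ"
    return "BQ"


# Toy title sets chosen to mirror the slide examples (hypothetical data).
ENTITY_TITLES = {"Seoul"}
REDIRECT_TITLES = {"Seoul City"}
DISAMBIGUATION_TITLES = {"Apple"}

for example in ("Seoul", "Apple", "Orange"):
    label = classify_query(example, ENTITY_TITLES, REDIRECT_TITLES, DISAMBIGUATION_TITLES)
    print(example, "->", label)
```

With real Wikipedia data the outcome depends entirely on which titles land in which set, so the labels for “Apple” and “Orange” here only reflect the toy sets.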

8 Query Expansion Methods
- Strategy for Ambiguous Queries
  - Lee et al.

9 Query Expansion Methods
- Relevance model
  - Language modeling framework

10 Query Expansion Methods
- Field Evidence For Query Expansion
  - Supervised method: training data, machine learning, regression, classification
  - Unsupervised method: machine learning, random variable, Bayesian inference, clustering

11 Experiments
- Experiment Settings
  - In our experiments, documents are retrieved for a given query by the query likelihood language model with Dirichlet smoothing
  - Experiments were conducted using four standard Text Retrieval Conference (TREC) collections
    - AP
    - Robust2004
    - WT10G
    - Gov2
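
As a rough illustration of the retrieval model named above, the following sketch scores a document with the standard Dirichlet-smoothed query likelihood, log P(Q|D) = sum over query terms of log((c(q, D) + mu * P(q|C)) / (|D| + mu)). The function name, the toy documents, and the value `mu=1000.0` are assumptions for illustration; the slides do not state the smoothing parameter used in the experiments.

```python
import math
from collections import Counter


def ql_dirichlet_score(query_terms, doc_terms, collection_tf, collection_len, mu=1000.0):
    """Log query likelihood of a document under Dirichlet smoothing.

    mu is the smoothing parameter (assumed value, not taken from the slides).
    """
    doc_tf = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        p_collection = collection_tf.get(term, 0) / collection_len
        if p_collection == 0.0:
            # Term never occurs in the collection: skip it rather than take log(0).
            continue
        score += math.log((doc_tf[term] + mu * p_collection) / (doc_len + mu))
    return score


# Toy collection (hypothetical data).
docs = {
    "d1": "seoul is the capital of korea".split(),
    "d2": "apple makes phones and computers".split(),
}
collection_tf = Counter(term for terms in docs.values() for term in terms)
collection_len = sum(len(terms) for terms in docs.values())

query = ["seoul", "capital"]
scores = {
    doc_id: ql_dirichlet_score(query, terms, collection_tf, collection_len)
    for doc_id, terms in docs.items()
}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Documents are ranked by descending score; a larger `mu` pulls every document model closer to the collection model.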

12 Experiments
- Baselines
  - Query likelihood language model (QL)
  - Relevance model (RMC)
  - Relevance model based on Wikipedia (RMW)
- All test collections are evaluated on their test topics using Mean Average Precision (MAP)
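
For reference, MAP can be computed as in the minimal sketch below. The function names and the toy rankings are hypothetical; the definition used is the usual one, where average precision is the mean of the precision values at the rank of each relevant document, divided by the total number of relevant documents for the topic.

```python
def average_precision(ranked_doc_ids, relevant_ids):
    """Average precision of one ranked list against a set of relevant document ids."""
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids)


def mean_average_precision(runs, qrels):
    """Mean of average precision over all topics in `runs` (topic id -> ranked list)."""
    if not runs:
        return 0.0
    values = [average_precision(ranking, qrels.get(topic, set())) for topic, ranking in runs.items()]
    return sum(values) / len(values)


# Toy run and relevance judgments (hypothetical data).
runs = {"q1": ["d1", "d2", "d3"], "q2": ["d4", "d5"]}
qrels = {"q1": {"d1", "d3"}, "q2": {"d5"}}
print(mean_average_precision(runs, qrels))
```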

13 Experiments
- Using Entity Pages For Relevance Feedback
  - Note that in our proposed method, not all queries can be mapped to a specific Wikipedia entity page, so the method is only applicable to
    - EQ
    - AQ

14 Experiments
- Field Based Expansion
  - The first is to add the top ranked 100 good terms (SL)
  - The second is to add the top ranked 10 good terms, each given the classification probability as weight (SLW)
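
The difference between the two variants can be sketched as follows, assuming a classifier has already assigned each candidate term a probability of being a good expansion term. The function name `select_expansion_terms`, the toy probabilities, and the uniform weight of 1.0 for the unweighted variant are assumptions, not details given in the slides.

```python
def select_expansion_terms(term_probs, k, weighted):
    """Pick the k candidate terms with the highest "good term" probability.

    weighted=False: every selected term gets weight 1.0 (SL-style, uniform weight assumed).
    weighted=True:  every selected term keeps its classification probability as weight (SLW-style).
    """
    # Sort by descending probability; break ties alphabetically for determinism.
    ranked = sorted(term_probs.items(), key=lambda item: (-item[1], item[0]))[:k]
    return {term: (prob if weighted else 1.0) for term, prob in ranked}


# Hypothetical classifier output for candidate expansion terms.
term_probs = {"korea": 0.9, "capital": 0.8, "city": 0.6, "the": 0.1}

sl_terms = select_expansion_terms(term_probs, k=2, weighted=False)
slw_terms = select_expansion_terms(term_probs, k=2, weighted=True)
print(sl_terms)
print(slw_terms)
```

In the setting described on the slide, `k` would be 100 for SL and 10 for SLW; the toy call uses `k=2` only to keep the example small.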

15 Conclusion
- We have explored the use of Wikipedia in PRF
  - Three query types based on Wikipedia
  - Four TREC collections and their topics
- In this paper, we focused on using Wikipedia as the sole source of PRF information
- However, we believe that the initial results from the test collection and Wikipedia each have their own advantages for PRF
- By combining them, one may be able to develop an expansion strategy that is robust to the query being degraded by either resource

