Query Caching in Agent-based Distributed Information Retrieval

Presentation on theme: "Query Caching in Agent-based Distributed Information Retrieval"— Presentation transcript:

1 Query Caching in Agent-based Distributed Information Retrieval
Query Caching in Agent-based Distributed Information Retrieval
Hemali Majithia, CADIP, UMBC
October 24, 2002
(Slide header: Query Caching in Agent Based DIR, September 27, 2002)

2 Problem Definition
DIR (IR) systems access their collections to perform searches and answer queries.
Query resolution on large corpora is expensive in terms of time and resources.
Similar queries produce similar results.
Repetitive and redundant searching of the collections leads to resource wastage and inefficiency.
Solution: "caching queries"

3 Solution
Caching Mechanism
  Cache new queries along with their results.
  Answer future similar queries using the cached queries.
New Query
  A query that has not been answered before.
Similar Query
  A query that is identical or similar to a query already in the cache.
Emphasis
  If similar queries exist in the cache, results can be taken from those previously searched queries; an exact match is not required.
  Retrieval time is linear in the collection size.

4 Caching Mechanism
Two-level Caching Mechanism
  First level: exact match
  Second level: inverted index of the queries
Caching Algorithm
  Least Recently Used (LRU)
  Least Frequently Used (LFU)
  Lowest Relative Value (LRV)
Similarity Metric
  Cosine similarity
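The two-level lookup described on this slide can be sketched roughly as follows. This is a minimal illustration, not the CARROT-II implementation: the class name `TwoLevelQueryCache`, the whitespace tokenizer, the raw term-frequency vectors, the similarity `threshold`, and the choice of LRU (one of the three listed algorithms) for eviction are all assumptions made for the example.

```python
import math
from collections import Counter, OrderedDict, defaultdict


def tokenize(query):
    """Lowercase whitespace tokenization (an assumption; the slides name no tokenizer)."""
    return query.lower().split()


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class TwoLevelQueryCache:
    """Hypothetical query cache: exact match first, then an inverted index of queries."""

    def __init__(self, capacity=100, threshold=0.8):
        self.capacity = capacity        # maximum number of cached queries
        self.threshold = threshold      # assumed minimum cosine similarity for a hit
        self.entries = OrderedDict()    # normalized query -> results, oldest first (LRU order)
        self.vectors = {}               # normalized query -> term-frequency vector
        self.index = defaultdict(set)   # term -> cached queries containing that term

    def get(self, query):
        terms = tokenize(query)
        key = " ".join(terms)
        # First level: exact match on the normalized query string.
        if key in self.entries:
            self.entries.move_to_end(key)
            return self.entries[key]
        # Second level: collect cached queries sharing at least one term,
        # then keep the one with the highest cosine similarity.
        vec = Counter(terms)
        candidates = set()
        for term in vec:
            candidates |= self.index.get(term, set())
        best, best_sim = None, 0.0
        for cand in candidates:
            sim = cosine(vec, self.vectors[cand])
            if sim > best_sim:
                best, best_sim = cand, sim
        if best is not None and best_sim >= self.threshold:
            self.entries.move_to_end(best)
            return self.entries[best]
        return None  # miss: the caller searches the collection and calls put()

    def put(self, query, results):
        key = " ".join(tokenize(query))
        if key in self.entries:
            self.entries[key] = results
            self.entries.move_to_end(key)
            return
        if len(self.entries) >= self.capacity:
            # LRU eviction: drop the least recently used query and unindex its terms.
            old, _ = self.entries.popitem(last=False)
            for term in self.vectors.pop(old):
                self.index[term].discard(old)
        self.entries[key] = results
        self.vectors[key] = Counter(key.split())
        for term in self.vectors[key]:
            self.index[term].add(key)
```

With a threshold of 0.6, a cache holding "distributed information retrieval" would also answer "information retrieval" (cosine similarity about 0.82), while an unrelated query such as "mobile agents" would miss and fall through to a real search. Swapping in LFU or LRV would only change the eviction step in `put`.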

5 Caching in CARROT–II
Components shown in the diagram: Node I, Node II, Query Agent, C2 Agents, Primary cache, Secondary cache
Message flow, in the order numbered on the diagram:
  1. User query
  2. Lookup
  3. MISS
  4. Query forwarded
  5. Miss
  6. Query forwarded to best C2
  7. Lookup
  8. HIT
  9. Update cache
  10. Results returned
  11. Response

6 Metrics for Evaluation of Caching Mechanism
Efficiency
  Round Trip Time (RTT) = total time to answer the queries fired at the system
  Hit Rate = measured for each agent cache, plus the total hit rate
  Cost of caching = the overhead caused by caching (assuming that the hit rate is 0)
Effectiveness
  Precision = fraction of retrieved documents that are relevant
  Recall = fraction of relevant documents that are retrieved
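The effectiveness metrics above, together with hit rate, can be computed as in the short sketch below. The function names and the document identifiers are illustrative assumptions; hit rate is taken here in its usual sense of cache hits divided by cache lookups, which the slide does not spell out.

```python
def precision(retrieved, relevant):
    """Fraction of retrieved documents that are relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0


def recall(retrieved, relevant):
    """Fraction of relevant documents that are retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0


def hit_rate(hits, lookups):
    """Cache hits divided by cache lookups (assumed definition)."""
    return hits / lookups if lookups else 0.0


# Hypothetical result list and relevance judgments for one query.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d5"]

print(precision(retrieved, relevant))  # 2 of 4 retrieved documents are relevant
print(recall(retrieved, relevant))     # 2 of 3 relevant documents were retrieved
print(hit_rate(3, 12))                 # 3 hits out of 12 lookups
```

Because a similarity-based hit returns the results of a different cached query, comparing precision and recall with and without the cache is one way to check what the cache costs in answer quality.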
