Published by Vernon Hawkins. Modified over 8 years ago.
1
Adaptive Processing of Top-k Queries in XML. Amelie Marian, Sihem Amer-Yahia, Nick Koudas, Divesh Srivastava. Proceedings of the 21st International Conference on Data Engineering (ICDE 2005).
2
XML wodehouse psmith london 1234 48.95 wodehouse psmith london 1234
3
XML
5
XML XPath: pc = parent-child axis; ad = ancestor-descendant axis.
6
Scoring Function: The traditional tf*idf function is defined in IR. tf (term frequency) quantifies the relative importance of a keyword in an individual document. idf (inverse document frequency) quantifies the relative importance of an individual keyword in the collection of documents.
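The two quantities above can be illustrated with a small Python sketch. It assumes one common tf*idf variant (relative term frequency multiplied by log(N/df)); the slide does not say which variant is meant, and the function name `tf_idf` and the toy collection `docs` are hypothetical.

```python
import math
from collections import Counter

def tf_idf(term, doc, docs):
    """Score `term` in `doc` (a list of tokens) against the collection `docs`."""
    # tf: relative frequency of the term inside this one document.
    tf = Counter(doc)[term] / len(doc)
    # df: number of documents in the collection that contain the term.
    df = sum(1 for d in docs if term in d)
    # idf: rarer terms get a larger weight (assumed variant: log(N / df)).
    idf = math.log(len(docs) / df) if df else 0.0
    return tf * idf

# Toy collection of three tokenized documents (made up for illustration).
docs = [
    ["xml", "query", "xml"],
    ["top", "k", "query"],
    ["xml", "score"],
]

xml_score = tf_idf("xml", docs[0], docs)      # frequent in docs[0], present in 2 of 3 documents
query_score = tf_idf("query", docs[0], docs)  # appears once in docs[0]
```

Because "xml" occurs twice in the first document and "query" only once, while both appear in two of the three documents, "xml" receives twice the score of "query" there.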
7
Scoring Function: XML, unlike traditional IR. An answer to an XPath query need not be an entire document; it can be any node in a document. An XPath query consists of several predicates linking the returned node to other query nodes, rather than simple "keyword containment in the document" (as in IR).
8
Scoring Function: XPath Component Predicates. For an XPath query Q: q0 is the query answer node; qi, 1 <= i <= l, are the other query nodes; p(q0, qi) is the XPath axis between query nodes q0 and qi, i >= 1; P_Q (the component predicates of Q) is the set of predicates {p(q0, qi)}, 1 <= i <= l.
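As a rough illustration of the decomposition above, the following Python sketch derives one predicate p(q0, qi) for every query node below the answer node. The `QueryNode` class, the example query, and the rule that any path longer than a single pc edge is summarized as ad are all assumptions made for this sketch, not definitions taken from the slides. It also assumes that tags are unique within the query.

```python
from dataclasses import dataclass, field

@dataclass
class QueryNode:
    tag: str
    # Each entry is (axis, child) with axis "pc" (parent-child)
    # or "ad" (ancestor-descendant).
    children: list = field(default_factory=list)

def component_predicates(q0):
    """Return {(q0.tag, qi.tag): axis} for every query node qi below q0."""
    preds = {}

    def visit(node, depth):
        for axis, child in node.children:
            # Only a direct pc edge out of q0 stays pc; every longer or
            # looser path is summarized as ad (simplifying assumption).
            direct = depth == 0 and axis == "pc"
            preds[(q0.tag, child.tag)] = "pc" if direct else "ad"
            visit(child, depth + 1)

    visit(q0, 0)
    return preds

# Hypothetical query: book[./title and ./info[.//name]]
query = QueryNode("book", [
    ("pc", QueryNode("title")),
    ("pc", QueryNode("info", [("ad", QueryNode("name"))])),
])
preds = component_predicates(query)
```

Here `preds` maps ("book", "title") and ("book", "info") to "pc" and ("book", "name") to "ad", so each predicate relates the answer node directly to one other query node.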
9
Scoring Function XML idf
10
Scoring Function XML tf
11
Scoring Function XML tf*idf Score
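The formulas for these three slides did not survive in the text, so the following Python sketch is only one plausible reading, built by analogy with the IR definitions earlier in the deck. It assumes that the idf of a predicate p(q0, qi) is log(1 + (number of nodes tagged q0) / (number of those nodes that satisfy the predicate)), that the tf of a predicate for a node n is the number of qi-tagged nodes related to n by that axis, and that the score of n is the sum of tf * idf over all component predicates. The `Node` class, the helper names, and the toy document are hypothetical.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    tag: str
    children: list = field(default_factory=list)

def descendants(n):
    """Yield every node strictly below n."""
    for c in n.children:
        yield c
        yield from descendants(c)

def matches(n, axis, tag):
    """Nodes with the given tag related to n by axis ("pc" or "ad")."""
    pool = n.children if axis == "pc" else descendants(n)
    return [m for m in pool if m.tag == tag]

def score(n, preds, root):
    """Sum of tf * idf over component predicates, given as (axis, tag) pairs."""
    # Candidate answers: all nodes in the document with the same tag as n.
    candidates = [m for m in [root, *descendants(root)] if m.tag == n.tag]
    total = 0.0
    for axis, tag in preds:
        tf = len(matches(n, axis, tag))  # number of ways n satisfies the predicate
        satisfying = sum(1 for m in candidates if matches(m, axis, tag))
        # Assumed idf form: predicates satisfied by fewer candidates weigh more.
        idf = math.log(1 + len(candidates) / satisfying) if satisfying else 0.0
        total += tf * idf
    return total

# Toy document: two books, only the first has a name under info.
book1 = Node("book", [Node("title"), Node("info", [Node("name")])])
book2 = Node("book", [Node("title")])
root = Node("library", [book1, book2])
preds = [("pc", "title"), ("ad", "name")]

score1 = score(book1, preds, root)
score2 = score(book2, preds, root)
```

Under these assumptions the first book scores higher, because it also satisfies the rarer predicate on name, while both books satisfy the predicate on title.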
12
Whirlpool Architecture
13
Servers and Server Queues Top-k Set Router and Router Queue
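Of the components listed above, the top-k set is the easiest to sketch in isolation. The Python class below is a minimal sketch, assuming the set keeps the k best scores seen so far and that a partial match can be discarded once its best possible final score cannot beat the current k-th score. The class name `TopKSet`, its methods, and the sample scores are hypothetical. It also treats every offered match as new, whereas a real system would presumably update a match as it moves between servers.

```python
import heapq

class TopKSet:
    def __init__(self, k):
        self.k = k
        self._heap = []  # min-heap of (score, match_id); the root is the k-th best

    def threshold(self):
        """Current k-th best score, or -inf while fewer than k matches are held."""
        return self._heap[0][0] if len(self._heap) == self.k else float("-inf")

    def offer(self, score, match_id):
        """Insert a match if there is room or if it beats the current k-th score."""
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, (score, match_id))
        elif score > self._heap[0][0]:
            heapq.heapreplace(self._heap, (score, match_id))

    def can_prune(self, max_possible_score):
        """True if a partial match can no longer enter the top-k set."""
        return max_possible_score <= self.threshold()

    def results(self):
        """Matches held so far, best score first."""
        return sorted(self._heap, reverse=True)

top = TopKSet(2)
for s, m in [(0.5, "a"), (0.9, "b"), (0.7, "c")]:
    top.offer(s, m)
```

After the three offers, `top.results()` holds "b" and "c", and a partial match whose best possible score is 0.6 can be pruned, since the current threshold is 0.7.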
14
Server Predicates Generation
15
Whirlpool
16
Scheduling between components Single-threaded Multi-threaded
17
Experimental
18
Conclusion: Whirlpool is an adaptive evaluation strategy for computing exact and approximate top-k answers to XPath queries. We are investigating new directions, such as increasing the number of threads per server for maximal parallelism.