Advanced Database Aggregation Query Processing

Presentation on theme: "Advanced Database Aggregation Query Processing"— Presentation transcript:

1 Advanced Database Aggregation Query Processing
Donghui Zhang Computer Science Department University of California, Riverside 3/28/2002 EDBT Ph.D. Workshop 2002

2 Aggregation Problem
Maintain a set of objects, each having a value. Given a condition that holds for a subset of the objects, compute the total value of the objects in this subset. E.g. “find the total salary of employees who joined the company less than a year ago”.

3 Aggregation over Objects with Extent
Objects with extent, as opposed to point objects. Real-life applications: temporal, spatial, etc. An employee works for the company during a certain period of time; “find the total salary of employees who worked for the company during 1999”. A rainfall record occurs within a spatial region; “find the total volume of rainfall in Los Angeles”.

4 Functional Box-Sum
Maintain a set of objects, each having a box and a value function; given query box q, compute the total value of objects intersecting q, where the contribution of an object is the integral of its value function over its intersection with q.

5 Functional Box-Sum
Example: functional box-sum = 4*50 + 3*12 = 236.
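The arithmetic of this example can be reproduced with a brute-force scan. The sketch below assumes constant value functions, so an object's contribution is its value times the area of its intersection with q. The rectangle coordinates and the helper names (`overlap_area`, `functional_box_sum`) are hypothetical, chosen only so that the intersection areas come out to 50 and 12.

```python
def overlap_area(a, b):
    """Area of the intersection of two boxes given as (x1, y1, x2, y2)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)


def functional_box_sum(objects, q):
    """Sum of value * overlap area, assuming each value function is constant."""
    return sum(v * overlap_area(box, q) for box, v in objects)


# Hypothetical data: coordinates are NOT from the slides.
q = (0, 0, 20, 10)
objects = [
    ((10, 0, 30, 5), 4),    # overlaps q in [10,20] x [0,5], area 50
    ((-5, 6, 3, 12), 3),    # overlaps q in [0,3] x [6,10], area 12
    ((25, 20, 30, 30), 9),  # disjoint from q, contributes 0
]
result = functional_box_sum(objects, q)  # 4*50 + 3*12
```

The scan touches every object, which is the O(n) behaviour the specialized index is meant to avoid.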

6 Functional Box-Sum
Moreover, the object value can be a function; e.g. FBS = $\int_{15}^{20}\int_{7}^{11} (x-2)\,dy\,dx = 310$.
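The stated result of 310 can be checked by hand. The derivation below assumes the value function is f(x, y) = x − 2 and that the object's intersection with q is [15, 20] × [7, 11]; this is one reading of the slide's example, adopted here because it reproduces the value 310.

```latex
\begin{aligned}
\int_{15}^{20}\int_{7}^{11} (x-2)\,dy\,dx
  &= (11-7)\int_{15}^{20} (x-2)\,dx \\
  &= 4\left[\frac{(x-2)^2}{2}\right]_{15}^{20} \\
  &= 4\cdot\frac{18^2-13^2}{2} = 4\cdot 77.5 = 310 .
\end{aligned}
```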

7 Straightforward Approaches
No index: for each query, scan through all records, which is inefficient. Alternatively, maintain the objects in an R-tree (which speeds up the selection query); to compute an aggregate, select the objects and aggregate their values on the fly. Query time: O(n).

8 Our Solution
We reduce the functional box-sum problem to a simpler problem (the dominance-sum problem), and we build an index specialized for computing dominance-sums. Instead of storing the original data, the specialized index stores specially aggregated information, which leads to O(log2n) query time.

9 Functional Box-Sum → OIFBS
A special case of functional box-sum is OIFBS (Origin-Involved Functional Box-Sum), where the query box contains the origin of the space. A functional box-sum query can be reduced to OIFBS queries: we compute the OIFBS from the origin to the upper right corner of the query box, then subtract the parts to the left of and below the query box (which are also OIFBS queries).
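This reduction can be sketched in its usual inclusion-exclusion form: four OIFBS queries, where the fourth term adds back the lower-left region that the two subtractions remove twice. The sketch assumes constant value functions, all coordinates in the non-negative quadrant, and a brute-force `oifbs` helper standing in for the index; the function names and the sample data are hypothetical.

```python
def oifbs(objects, px, py):
    """Origin-involved sum over the box [0, px] x [0, py] (constant values)."""
    total = 0
    for (x1, y1, x2, y2), v in objects:
        w = min(x2, px) - max(x1, 0)
        h = min(y2, py) - max(y1, 0)
        total += v * max(w, 0) * max(h, 0)
    return total


def box_sum_via_oifbs(objects, q):
    """Answer a general box query using only origin-involved queries."""
    qx1, qy1, qx2, qy2 = q
    return (oifbs(objects, qx2, qy2)      # origin to upper-right corner
            - oifbs(objects, qx1, qy2)    # strip to the left of q
            - oifbs(objects, qx2, qy1)    # strip below q
            + oifbs(objects, qx1, qy1))   # lower-left part, subtracted twice


# Hypothetical data: ((x1, y1, x2, y2), value)
objects = [((10, 0, 30, 5), 4), ((1, 6, 3, 12), 3)]
answer = box_sum_via_oifbs(objects, (2, 1, 20, 10))  # 224 - 12 - 40 + 0
```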

10 Dominance-Sum
Maintain a set of weighted points; given query point p, compute the total weight of the points dominated by p (i.e. to the lower left of p). In the slide's example figure, dominance-sum = 18.
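The definition can be made concrete with a brute-force version. The sketch assumes non-strict dominance (a point on the boundary counts as dominated), which the slides do not specify; the sample points are hypothetical and were chosen so that the answer matches the figure's value of 18.

```python
def dominance_sum(points, p):
    """Total weight of points (x, y, w) lying to the lower left of p."""
    px, py = p
    return sum(w for (x, y, w) in points if x <= px and y <= py)


# Hypothetical weighted points: (x, y, weight)
points = [(1, 1, 5), (2, 4, 6), (4, 2, 7), (6, 6, 9), (3, 8, 2)]
total = dominance_sum(points, (5, 5))  # 5 + 6 + 7
```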

11 OIFBS → Dominance-Sum
Idea: to insert an object (with a rectangular region), insert its corner points, associating a function with each corner. To compute an OIFBS regarding box [origin, p], compute the dominance-sum regarding p, i.e. the summation of the functions associated with the points dominated by p.
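One way to realize this idea for constant-valued rectangles is sketched below. Each corner (cx, cy) carries the polynomial ±v·(x − cx)·(y − cy), stored as coefficients of x·y, x, y and 1, so that summing the coefficients of the dominated corners and evaluating at p yields the OIFBS. This choice of corner functions is our own derivation for the constant case, not something the slides spell out, and all names are hypothetical.

```python
def corner_functions(box, v):
    """Four (corner, coefficients) pairs for a box with constant value v.

    Coefficients (a, b, c, d) encode the polynomial a*x*y + b*x + c*y + d.
    """
    x1, y1, x2, y2 = box
    out = []
    for cx, cy, sign in ((x1, y1, 1), (x2, y1, -1), (x1, y2, -1), (x2, y2, 1)):
        s = sign * v
        # Expansion of s * (x - cx) * (y - cy)
        out.append(((cx, cy), (s, -s * cy, -s * cx, s * cx * cy)))
    return out


def oifbs_via_dominance(corner_points, p):
    """Sum the coefficients of corners dominated by p, then evaluate at p."""
    px, py = p
    a = b = c = d = 0
    for (cx, cy), (ca, cb, cc, cd) in corner_points:
        if cx <= px and cy <= py:
            a, b, c, d = a + ca, b + cb, c + cc, d + cd
    return a * px * py + b * px + c * py + d


# Hypothetical data: ((x1, y1, x2, y2), value)
objects = [((10, 0, 30, 5), 4), ((1, 6, 3, 12), 3)]
corners = [cf for box, v in objects for cf in corner_functions(box, v)]
value = oifbs_via_dominance(corners, (20, 10))
```

Because the coefficients simply add, an index that maintains partial sums of coefficients can answer the query without visiting individual objects.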

12 New Dominance-Sum Index
For the dominance-sum problem, we propose the BA-tree: a k-d-B-tree augmented with additional information at index records. O(log2n) query time, when balanced.
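The slides do not give the BA-tree's internals, so the sketch below does not implement it. It uses a different, well-known structure, a two-dimensional Fenwick (binary indexed) tree over a small integer grid, purely to illustrate the same principle: storing pre-aggregated partial sums instead of raw points lets a dominance-sum be answered in polylogarithmic time. The class name, grid size and sample points are assumptions.

```python
class Fenwick2D:
    """2-d Fenwick tree over integer grid cells 0..nx-1 by 0..ny-1.

    NOT the BA-tree: a stand-in that also stores aggregated partial sums.
    """

    def __init__(self, nx, ny):
        self.nx, self.ny = nx, ny
        self.t = [[0] * (ny + 1) for _ in range(nx + 1)]

    def add(self, x, y, w):
        """Insert a point of weight w at grid cell (x, y)."""
        i = x + 1
        while i <= self.nx:
            j = y + 1
            while j <= self.ny:
                self.t[i][j] += w
                j += j & -j
            i += i & -i

    def dominance_sum(self, x, y):
        """Total weight of points with px <= x and py <= y."""
        s = 0
        i = x + 1
        while i > 0:
            j = y + 1
            while j > 0:
                s += self.t[i][j]
                j -= j & -j
            i -= i & -i
        return s


# Hypothetical weighted points on a 10 x 10 grid
index = Fenwick2D(10, 10)
for x, y, w in [(1, 1, 5), (2, 4, 6), (4, 2, 7), (6, 6, 9), (3, 8, 2)]:
    index.add(x, y, w)
answer = index.dominance_sum(5, 5)
```

Each update and query touches O(log nx · log ny) cells, in contrast to the O(n) scan of the straightforward approaches.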

13 Performance
[Figure: functional box-sum query cost]

14 Summary of Our Aggregation Work
The functional box-sum solution described here is to appear in [PODS’02]. Also in [PODS’02], we solve a variation: the simple box-sum aggregation problem, which is to find the total value of the objects intersecting the query rectangle. We have also solved some other aggregation problems...

15 Range-Temporal Aggregation
Maintain a set of temporal records, each having a key, a value and a time interval. Given a key range r and time interval i, compute the total value of records whose keys are in r and whose intervals intersect i. Appeared in [PODS’01].
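The query semantics can be pinned down with a brute-force reference version. The sketch assumes closed key ranges and closed time intervals, which the slides do not specify; the record layout and sample data are hypothetical.

```python
def range_temporal_sum(records, key_range, interval):
    """Total value of records with key in key_range and interval meeting interval."""
    k_lo, k_hi = key_range
    t_lo, t_hi = interval
    return sum(
        value
        for key, value, (start, end) in records
        # Two closed intervals intersect iff each starts before the other ends.
        if k_lo <= key <= k_hi and start <= t_hi and t_lo <= end
    )


# Hypothetical records: (key, value, (start, end))
records = [
    (10, 100, (1, 5)),
    (20, 200, (4, 9)),
    (30, 300, (6, 8)),  # key in range, but interval misses the query
    (40, 400, (2, 3)),  # interval intersects, but key out of range
]
total = range_temporal_sum(records, (10, 30), (3, 5))
```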

16 Temporal Aggregation over Data Streams
Temporal aggregation when records arrive as a stream. Storage is limited, but we want to answer aggregation queries over both recent and older data. To appear in [EDBT’02].

17 Box-Max Aggregation
Maintain a set of spatial objects, each having a spatial region and a value. Given a query region r, find the Min/Max value over all objects intersecting r. Appeared in [GIS’01].
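A brute-force reference version of this query is sketched below. It assumes axis-aligned rectangular regions with closed boundaries (touching boxes count as intersecting); the names and sample data are hypothetical.

```python
def intersects(a, b):
    """True if closed boxes a and b, given as (x1, y1, x2, y2), share a point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def box_min_max(objects, r):
    """(min, max) value over objects intersecting r, or None if there are none."""
    values = [v for box, v in objects if intersects(box, r)]
    return (min(values), max(values)) if values else None


# Hypothetical objects: ((x1, y1, x2, y2), value)
objects = [((0, 0, 2, 2), 5), ((1, 1, 4, 4), 9), ((5, 5, 6, 6), 1)]
result = box_min_max(objects, (1, 1, 3, 3))
```

Unlike sums, Min/Max cannot be obtained by subtracting partial aggregates, which is why this problem needs its own index rather than the dominance-sum reduction.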

18 Conclusions
We have proposed specialized index structures for various complex aggregation problems. In all cases, our proposed methods have much better query performance than the existing approaches, sometimes over 100 times faster. We recommend implementing these indices in commercial DBMSs for cases where aggregates must be computed very fast.

