1
Selectivity Estimation for Optimizing Similarity Query in Multimedia Databases
IDEAL 2003 paper review
2
Query optimization in traditional databases
Query: find the employees whose age is between 30 and 40 and who work for the Engineering Faculty
The running time of different execution plans depends on:
- the number of employees aged between 30 and 40
- the number of employees who work for the Engineering Faculty
Task: estimate these numbers in advance and select the best execution plan (selectivity estimation)
Statistics are stored in the database (metadata)
3
Techniques: one dimension
- Parametric: unrealistic
- Curve fitting: negative value problem
- Sampling: large overhead
- Non-parametric (histogram technique): widely used
[Figure: histogram over age, bucket boundaries at 10, 15, 20, 25, 30, 35, 40, 45, 50]
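A minimal sketch of how a one-dimensional histogram can answer the age query from the earlier slide. The bucket boundaries follow the age axis of the figure; the per-bucket counts and the function names are hypothetical, and rows are assumed to be spread uniformly within each bucket.

```python
# Bucket boundaries follow the age axis on the slide.
EDGES = [10, 15, 20, 25, 30, 35, 40, 45, 50]
# Hypothetical number of employees per bucket (not from the slides).
COUNTS = [5, 20, 60, 90, 80, 70, 40, 15]


def estimate_range_count(lo, hi, edges=EDGES, counts=COUNTS):
    """Estimate how many rows fall in [lo, hi) from an equi-width histogram.

    Each bucket contributes its count scaled by the fraction of the bucket
    that the query range covers (uniformity assumption inside a bucket).
    """
    total = 0.0
    for left, right, count in zip(edges, edges[1:], counts):
        overlap = min(hi, right) - max(lo, left)
        if overlap > 0:
            total += count * overlap / (right - left)
    return total


def selectivity(lo, hi, edges=EDGES, counts=COUNTS):
    """Fraction of all rows estimated to satisfy lo <= age < hi."""
    return estimate_range_count(lo, hi, edges, counts) / sum(counts)


if __name__ == "__main__":
    print(estimate_range_count(30, 40))
    print(selectivity(30, 40))
```

An optimizer would compare this estimate with the estimate for the second predicate and evaluate the more selective predicate first.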
4
Problem in multimedia databases
Query: (Color = ‘red’) ∧ (Shape = ‘round’)
- Color and shape are feature vectors, so the data are multi-dimensional
- The number of buckets increases exponentially with the dimension, so the histogram technique fails
Buckets needed with 5 buckets per dimension: 1-D: 5, 2-D: 25, 3-D: 125, 4-D: 625
5
Previous work: SIGMOD 99
- Use the DCT to compress the information in the histogram
- Store the DCT coefficients
2-D example
Histogram values:
10 15 13
14 20 16
 9 13 11
DCT coefficients:
40.33 -2.86 -5.42
 2.04 -0.50 -0.29
-6.84 -0.29 1.167
Second example on the slide (grid layout lost): 0000-2 200-0.331.630.47-1.230.00 1.18-0.581.33
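The 3 x 3 example on this slide is consistent with an orthonormal two-dimensional DCT-II (the top-left coefficient equals the sum of the histogram values divided by 3). A minimal sketch under that assumption, using only the standard library; the helper names are illustrative and not taken from the paper.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis: row k holds the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * j + 1) * k * math.pi / (2 * n))
                     for j in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dct2(block):
    """2-D DCT of a square block, computed as C * block * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


# Histogram values from the 2-D example on the slide.
histogram = [[10, 15, 13],
             [14, 20, 16],
             [9, 13, 11]]
coeffs = dct2(histogram)

if __name__ == "__main__":
    for row in coeffs:
        print(" ".join(f"{v:7.2f}" for v in row))
```

The printed grid can be compared with the DCT coefficients listed on the slide.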
6
Reconstruction of histogram values
Pipeline: histogram values -> DCT -> zone sampling -> IDCT -> reconstructed values
Histogram values:
10 12 30 15 15
24 36 81 10 30
 9 40 42 23 20
18 13 35 60 70
10 15 34 43 60
DCT coefficients:
151.0000 -39.1747 -25.2604 -11.2001  24.9442
-24.0137  42.2456 -24.0044  -8.9490  16.2098
-15.0187 -14.2921  15.5469   9.8779   0.0979
 -9.2651 -19.4490  19.4228  16.7544 -20.1256
-27.0396  12.9394  -4.5979  -4.8360 -17.5469
Coefficients after zone sampling:
151.0000 -39.1747 -25.2604 -11.2001  24.9442
-24.0137  42.2456 -24.0044  -8.9490   0
-15.0187 -14.2921  15.5469   0        0
 -9.2651 -19.4490   0        0        0
-27.0396   0        0        0        0
Reconstructed values (after IDCT):
 1.6184 17.9451 34.7007 10.3893 17.3465
31.0059 44.7644 57.9655 25.3113 21.9529
11.2059 25.0820 47.2188 25.3614 25.1319
12.6449 19.9000 49.6779 49.1996 64.5775
14.5248  8.3085 32.4371 40.7383 65.9913
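The zero pattern on this slide keeps the coefficients whose row index plus column index is at most 4 (a triangular zone). A minimal sketch of the DCT, zone sampling, and IDCT round trip under that reading, again assuming an orthonormal DCT-II; the function names and the `zone` parameter are illustrative.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis: row k holds the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * j + 1) * k * math.pi / (2 * n))
                     for j in range(n)])
    return rows


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dct2(block):
    """Forward 2-D DCT: C * block * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs):
    """Inverse 2-D DCT: C^T * coeffs * C (C is orthonormal)."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


def zone_sample(coeffs, zone):
    """Keep coefficient (u, v) only if u + v <= zone; zero the rest."""
    return [[value if u + v <= zone else 0.0
             for v, value in enumerate(row)]
            for u, row in enumerate(coeffs)]


# Histogram values from the slide.
histogram = [[10, 12, 30, 15, 15],
             [24, 36, 81, 10, 30],
             [9, 40, 42, 23, 20],
             [18, 13, 35, 60, 70],
             [10, 15, 34, 43, 60]]

coeffs = dct2(histogram)
kept = zone_sample(coeffs, 4)   # 15 of the 25 coefficients survive
approx = idct2(kept)

if __name__ == "__main__":
    for row in approx:
        print(" ".join(f"{v:8.4f}" for v in row))
```

Only the retained coefficients need to be stored, so the storage cost is set by the zone size rather than by the number of buckets.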
7
Selectivity estimation
Example 2-D histogram (bucket counts):
25  9 13  2 10
23  6 19 14 10
28 10  3 17  8
22 26 13 14 21
30 16  2 20 19
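One way such a bucket grid can answer a rectangular range query is to sum the counts of the covered buckets and weight partially covered buckets by the covered fraction. This is a sketch only: the unit-sized buckets, the coordinate convention (x indexes columns, y indexes rows), and the uniformity assumption inside each bucket are assumptions, not details from the slides.

```python
# Bucket counts from the slide; bucket (row, col) is assumed to cover the
# unit square [col, col + 1] x [row, row + 1].
GRID = [
    [25, 9, 13, 2, 10],
    [23, 6, 19, 14, 10],
    [28, 10, 3, 17, 8],
    [22, 26, 13, 14, 21],
    [30, 16, 2, 20, 19],
]


def overlap(lo, hi, cell):
    """Length of the intersection of [lo, hi] with [cell, cell + 1]."""
    return max(0.0, min(hi, cell + 1) - max(lo, cell))


def estimate_count(x_lo, x_hi, y_lo, y_hi, grid=GRID):
    """Estimated number of points inside the query rectangle."""
    total = 0.0
    for row, counts in enumerate(grid):
        frac_y = overlap(y_lo, y_hi, row)
        if frac_y == 0.0:
            continue
        for col, count in enumerate(counts):
            total += count * frac_y * overlap(x_lo, x_hi, col)
    return total


def estimate_selectivity(x_lo, x_hi, y_lo, y_hi, grid=GRID):
    """Estimated fraction of all points inside the query rectangle."""
    n_total = sum(sum(row) for row in grid)
    return estimate_count(x_lo, x_hi, y_lo, y_hi, grid) / n_total


if __name__ == "__main__":
    print(estimate_count(1, 3, 1, 3))
    print(estimate_selectivity(1, 3, 1, 3))
```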
8
Current work: IDEAL 2003
- Extend the range query from a hyper-cube to a hyper-sphere
- Model the hyper-sphere as a combination of hyper-cubes
Tasks:
- Find the combination of hyper-cubes that represents the hyper-sphere
- Find the area of overlap
9
Generate the combination of hyper-cubes
16
Overlap of a hyper-cube with a hyper-sphere
Monte Carlo method:
- Generate uniformly distributed random points inside the hyper-cube
- Count the number of points that fall within the hyper-sphere
- Use the ratio to estimate the area of overlap
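The three Monte Carlo steps above can be sketched as follows. The function name, the default sample size, and the fixed seed are illustrative choices, and the cube is assumed to be axis-aligned.

```python
import random


def overlap_volume(cube_lo, cube_hi, center, radius,
                   n_points=100_000, seed=0):
    """Monte Carlo estimate of the volume shared by an axis-aligned
    hyper-cube [cube_lo, cube_hi] and a hyper-sphere (center, radius)."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_points):
        # Step 1: uniform random point inside the hyper-cube.
        point = [rng.uniform(lo, hi) for lo, hi in zip(cube_lo, cube_hi)]
        # Step 2: count it if it also lies within the hyper-sphere.
        dist_sq = sum((p - c) ** 2 for p, c in zip(point, center))
        if dist_sq <= radius ** 2:
            inside += 1
    # Step 3: the hit ratio scales the cube volume to the overlap volume.
    cube_volume = 1.0
    for lo, hi in zip(cube_lo, cube_hi):
        cube_volume *= hi - lo
    return cube_volume * inside / n_points


if __name__ == "__main__":
    # Unit square against a unit circle centred on one corner:
    # the exact overlap is a quarter disk of area pi / 4.
    print(overlap_volume([0.0, 0.0], [1.0, 1.0], [0.0, 0.0], 1.0))
```

The error of the estimate shrinks roughly with the square root of the number of sampled points, so the sample size trades accuracy against estimation time.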
17
Generate uniformly distributed points inside a hyper-sphere
Accept/reject method:
- Generate points within the hyper-cube
- Accept those that fall within the hyper-sphere
Greedy method:
- Generate θ uniformly in [0, 2π]
- Generate r according to F^-1(U(0,1)), the inverse of the cumulative distribution of r
[Figure: a point in polar coordinates (r, θ)]
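Both methods can be sketched for the two-dimensional case (a disk). For a disk of radius R, the radius of a uniformly distributed point has cumulative distribution F(r) = (r/R)^2, so F^-1(u) = R * sqrt(u); that closed form is specific to 2-D and is an assumption of this sketch, as are the function names.

```python
import math
import random


def sample_disk_reject(radius, rng):
    """Accept/reject: draw from the bounding square until a point
    lands inside the disk."""
    while True:
        x = rng.uniform(-radius, radius)
        y = rng.uniform(-radius, radius)
        if x * x + y * y <= radius * radius:
            return x, y


def sample_disk_polar(radius, rng):
    """Direct method: theta uniform in [0, 2*pi], r = F^-1(U(0,1))
    with F(r) = (r / radius)^2 for a 2-D disk."""
    theta = rng.uniform(0.0, 2.0 * math.pi)
    r = radius * math.sqrt(rng.random())
    return r * math.cos(theta), r * math.sin(theta)


if __name__ == "__main__":
    rng = random.Random(42)
    print(sample_disk_reject(1.0, rng))
    print(sample_disk_polar(1.0, rng))
```

Accept/reject wastes a growing share of samples as the dimension increases, because the sphere fills less and less of its bounding cube; the direct method never discards a point.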
18
Experiment