BMQ-Index: Shared and Incremental Processing of Border Monitoring Queries over Data Streams Jinwon Lee Y. Lee, S. Kang, S. Lee, H. Jin, B. Kim and J. Song.

BMQ-Index: Shared and Incremental Processing of Border Monitoring Queries over Data Streams Jinwon Lee Y. Lee, S. Kang, S. Lee, H. Jin, B. Kim and J. Song (Korea Advanced Institute of Science and Technology)

2 Outline
• Border Monitoring Query (BMQ)
• BMQ-Index
• Experiments
• Related work
• Conclusion

3 Emerging Computing Environment
• Data sources: GPSs, sensors
• Data stream monitoring
  – [Figure: a data stream (11, 10, 12, 13, 12, 14) evaluated against continuous range queries]
  – Q1: 10 < value
  – Q2: 11 < value < 13
  – …
• Application areas
  – Remote Medical Service
  – Disaster Prevention: Flood Warning, Earthquake Prediction, Building Monitoring, Traffic light control
  – Automatic Home: Automatic Ventilation, Automatic Temperature Control, Automatic Humidity Control
  – Logistics Management: Thief-proofing, Catalog Advertisement
  – Location-based Service: Tracking (Friends, Employee), Vehicle Monitoring, Intelligent Transportation

4 Motivating Service Scenario #1
• Stock trading
  – [Figure: SAMSUNG stock price over time, during 23 days from Nov. 16th to Dec. 23rd, 2005, with "buy" and "sell" points marked]
  – Expensive !! ( > $640)
  – Cheap !! ( < $600)
• Monitor stock data streams crossing the borders !!

5 Motivating Service Scenario #2
• Location-based advertisement
  – Send a special lunch menu to people within 1 km during lunch time !!
  – [Figure: people coming into and going out of the advertised area; labels: Coupon, Pet-Care]
• Monitor location data streams crossing the borders !!

6 Border Monitoring Query
• To monitor data streams crossing the borders
  – Essential concern in many practical applications
    • Users' main interest
    • Useful to automatically trigger or stop relevant actions
• BMQ (Border Monitoring Query)
  – A new type of continuous range query !!
  – It reports only data crossing the borders of a query range (= coming into or going out from the query range)
• RMQ (Region Monitoring Query)
  – Conventional continuous range query
  – It reports all matching data within a query range
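
The difference between the two query types can be illustrated with a minimal Python sketch. It evaluates the slide's example predicate Q2 (11 < value < 13) both ways over a short stream. The function names, the stream values and the assumption that the stream starts outside the range are illustrative and are not taken from the presentation.

```python
def rmq(stream, lo, hi):
    """Region monitoring: report every tuple that lies inside (lo, hi)."""
    return [(t, v) for t, v in enumerate(stream) if lo < v < hi]


def bmq(stream, lo, hi):
    """Border monitoring: report only tuples that enter or leave (lo, hi).

    Assumption: the stream is treated as starting outside the range.
    """
    events = []
    was_inside = False
    for t, v in enumerate(stream):
        is_inside = lo < v < hi
        if is_inside != was_inside:
            events.append((t, v, "in" if is_inside else "out"))
        was_inside = is_inside
    return events


stream = [11, 10, 12, 13, 12, 14]      # illustrative values
rmq_result = rmq(stream, 11, 13)       # all matching tuples
bmq_result = bmq(stream, 11, 13)       # only border crossings
print(rmq_result)
print(bmq_result)
```

This per-query loop corresponds to the naive evaluation; it only shows what each query type reports, not how to evaluate many queries efficiently.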

7 Problem: Scalability !!
• A large number of BMQs can be issued
  – Millions of stock investors will register their own queries
  – Millions of stores will register their own queries
  – In addition, a huge volume of data streams arrives rapidly
  – In addition, fast response is essential for users
• How can we process BMQs over data streams efficiently?
  – (1) Naïve approach
    • Individual BMQ processing at each data update
    • Lack of scalability !!
  – (2) Based on existing mechanisms for RMQ evaluation
    • Shared RMQ processing by indexing queries
    • Costly post-processing !!

8 Solution Approach: BMQ-Index
• Shared processing
  – By query indexing approach
    • BMQ-Index is built on registered BMQs
    • Upon a data arrival, only border-crossed queries are quickly searched for
  – Achieves a high level of scalability !!
• [Figure: data tuple (14) → BMQ-Index, built on the registered BMQs (Q1: 10 < value; Q2: 11 < value < 13; …) → Q1, Q2 (border-crossed queries)]

9 Solution Approach: BMQ-Index
• Incremental processing
  – By incremental access method
    • Use previous search step for the next search: successive searches are significantly accelerated !!
    • Keep only the information needed for incremental search: low storage cost !!
  – Locality of data streams !!
• [Figure: series of data tuples (10, 12, 13, 12, 14) → BMQ-Index, built on the registered BMQs (Q1: 10 < value; Q2: 11 < value < 13; …) → Q1, Q2 (border-crossed queries)]

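10 One-dimensional BMQ-Index (Example)
• User request: "Notify me whenever the IBM stock price is coming into or going out from my reasonable price range !!" (reasonable price range: $10 to $30)
• Registered BMQs: Q1 to Q5, each a price range (unit: $)
• Stream Table: columns Stream_ID and Node pointer (e.g., IBM → a node of the linked list)
• Linked list: nodes ordered by border value (0, 5, 10, 15, 20, 25, 30, 35, 45, ∞), annotated with the queries entered (+Qi) and left (−Qi) at each border
• [Figure: the ranges of Q1 to Q5 on the price axis and the resulting linked list]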

11 Search Operation in One-dimension (Example)
• Notation: v_{t-1} = previous data value, v_t = current data value
• Case 1) 21 → 23
  – No border-crossed query
  – No node traversal
• Case 2) 21 → 37
  – −Q2, −Q4, +Q5
  – Traverse BMQ-Index to the right
• Case 3) 21 → 8
  – +Q3, −Q4, +Q1
  – Traverse BMQ-Index to the left
• [Figure: the linked list from the previous slide (borders 0, 5, 10, 15, 20, 25, 30, 35, 45, ∞) with the Stream Table entry for IBM and the values 8, 21, 23 and 37 marked]
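
The incremental search above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it uses a sorted array of borders instead of a linked list, treats ranges as half-open [lo, hi), cancels a query that is both entered and left within one jump, and reports the queries containing the first tuple of a stream as entered. The five query ranges are hypothetical; they were chosen only so that the three cases on the slide are reproduced, because the exact ranges are not recoverable from the transcript.

```python
from collections import defaultdict


class BMQIndex1D:
    """Sketch of a one-dimensional border index (array-based, hypothetical API)."""

    def __init__(self, queries):
        # queries: {name: (lo, hi)}, interpreted as half-open ranges [lo, hi)
        enter, leave = defaultdict(set), defaultdict(set)
        for name, (lo, hi) in queries.items():
            enter[lo].add(name)   # crossing lo rightwards enters the query
            leave[hi].add(name)   # crossing hi rightwards leaves the query
        self.borders = sorted(set(enter) | set(leave))
        self.plus = [enter[b] for b in self.borders]
        self.minus = [leave[b] for b in self.borders]
        self.pos = {}  # stream table: stream_id -> number of borders <= last value

    @staticmethod
    def _apply(gained, lost, entered, left):
        for q in gained:
            if q in left:
                left.discard(q)      # left and re-entered: no net crossing
            else:
                entered.add(q)
        for q in lost:
            if q in entered:
                entered.discard(q)   # entered and left again: no net crossing
            else:
                left.add(q)

    def update(self, stream_id, value):
        """Return (entered, left) query sets for the new value of a stream."""
        i = self.pos.get(stream_id, 0)   # resume from the previous position
        entered, left = set(), set()
        while i < len(self.borders) and self.borders[i] <= value:   # walk right
            self._apply(self.plus[i], self.minus[i], entered, left)
            i += 1
        while i > 0 and self.borders[i - 1] > value:                # walk left
            i -= 1
            self._apply(self.minus[i], self.plus[i], entered, left)
        self.pos[stream_id] = i
        return entered, left


# Hypothetical ranges that reproduce the three cases on the slide.
QUERIES = {"Q1": (0, 10), "Q2": (5, 25), "Q3": (5, 15), "Q4": (20, 30), "Q5": (35, 45)}

index = BMQIndex1D(QUERIES)
index.update("IBM", 21)                 # initial position
case1 = index.update("IBM", 23)         # Case 1: no border crossed
index.update("IBM", 21)
case2 = index.update("IBM", 37)         # Case 2: walk to the right
index.update("IBM", 21)
case3 = index.update("IBM", 8)          # Case 3: walk to the left
print(case1, case2, case3)
```

The work per update is proportional to the number of borders between the previous and the current value, which is why locality in the stream keeps successive searches cheap.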

12 Multi-dimensional BMQ-Index
• Query Table
    QueryID   Range
    Q1        (b_X1, b_X3, b_Y1, b_Y4)
    Q2        (b_X2, b_X6, b_Y2, b_Y6)
    Q3        (b_X4, b_X5, b_Y3, b_Y5)
• Stream Table
    StreamID  V               P_X     P_Y
    s1        (v_X1, v_Y1)    RS-X2   RS-Y2
    s2        (v_X2, v_Y2)    RS-X3   RS-Y5
    s3        (v_X3, v_Y3)    RS-X5   RS-Y4
• RS-X List: nodes RS-X1 to RS-X7 over the borders b_X0 to b_X7; each node RS-Xi carries +DQSet-Xi and −DQSet-Xi
• RS-Y List: nodes RS-Y1 to RS-Y7 over the borders b_Y0 to b_Y7; each node RS-Yi carries +DQSet-Yi and −DQSet-Yi
• [Figure: Q1, Q2 and Q3 as rectangles in the X-Y plane, with the data points v(s1), v(s2) and v1(s3), v2(s3), v3(s3)]

13 Search Operation in Multi-dimension
• Overall flow, for a data tuple (xc, yc)
  – Per-dimension search: RS-X list.search() on xc yields ±XQSet; RS-Y list.search() on yc yields ±YQSet
  – Validation through cross-check: ±XQSet is cross-checked with the Y-dimension, yielding ±XBMQSet; ±YQSet is cross-checked with the X-dimension, yielding ±YBMQSet
  – Union of per-dimension results: ±QSet is the union of ±XBMQSet and ±YBMQSet
• Performance Analysis (d-dimension)
  – Search performance: ((d–1)d × one-dimensional search time)
  – Storage cost: (d × one-dimensional storage cost)

14 Experiments
• Workload generation
  – Stock trading scenario (one-dimensional case)
    • Data stream generation (Korea stock market [9])
      – Fluctuation level (FL): 0.01% ~ 0.1%
      – 2000 stream sources, 1000 tuples in each stream
    • Query generation
      – Lower bound: randomly chosen (1 ~ 10^6)
      – Width of queries: 1 ~ 10 times larger than FL
      – Number of queries: 10,000 ~ 100,000
• Comparisons
  – An approach based on state-of-the-art RMQ indexes (CEI [CIKM'05] and IS-list [Information Systems'96])
• Performance metrics
  – Average search time per data tuple (milliseconds)
  – Index storage size (MB)

15 Search performance
• [Plot: Effects of the number of queries (W=0.1%, FL=0.01%)]
• [Plot: Effects of the widths of queries (N=100000, FL=0.01%)]

16 Storage cost
• [Plot: Effects of the number of queries (W=0.1%)]
• [Plot: Effects of the widths of queries (N=100000)]
• Storage relative to the number of queries
  – BMQ-Index: twice
  – IS-list: log (# of queries) times
  – CEI: all grids covered by a query range

17 Related Work
• Semantics
  – CQL (Continuous Query Language developed by STREAM project)
    • General concept to transform a Relation to a Stream
    • BMQ is a specific class of continuous range query
• Shared and Incremental Processing
    Area                      Previous research                        Difference
    Data stream processing    Tree-based (1-D: [2][4][5][14])          O(log N) search performance; O(N log N) storage cost
    Data stream processing    Grid-based (1-D: [17], 2-D: [6][13])     Better search performance than tree-based; require more storage cost
    Spatio-temporal database  SINA [11] (shared and incremental)       Disk-based algorithm; not purely incremental access method
    Spatio-temporal database  GPAC [12] (incremental)                  Not for shared processing
  – Generally not feasible for BMQs !!

18 Conclusion
• Summary
  – Characterize a new type of continuous range query
    • Border Monitoring Query (BMQ)
    • Useful and practical in many emerging applications
  – One- and multi-dimensional BMQ-Index
    • Evaluates a large number of BMQs in a shared and incremental manner, thereby achieving excellent search performance and low storage cost

19 Thank you. Questions?

Backup slide

21 Performance Analysis
• 1-dimensional BMQ-Index
  – Search performance: (2 × N_q × FL)
  – Storage cost: (2N_q + N_d)
• d-dimensional BMQ-Index
  – Search performance: ((d–1)d × 2N_q × FL), only 2 times when d = 2
  – Storage cost: (d(2N_q + N_d) + N_q)
• N_q = number of queries, N_d = number of data streams
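
As a worked example of the storage and search expressions above, the following Python sketch plugs in the largest workload from the experiments (100,000 queries, 2000 streams). The function names are hypothetical. The results are abstract entry counts taken directly from the expressions, not bytes, and the leading symbol of each expression was lost in the transcript, so the numbers are best read as relative costs.

```python
def storage_1d(n_q, n_d):
    """Storage expression for the 1-dimensional index: 2*N_q + N_d."""
    return 2 * n_q + n_d


def storage_nd(d, n_q, n_d):
    """Storage expression for the d-dimensional index: d*(2*N_q + N_d) + N_q."""
    return d * (2 * n_q + n_d) + n_q


def search_factor(d):
    """Multiplier on the one-dimensional search cost, (d-1)*d, for d >= 2."""
    return (d - 1) * d


n_q, n_d = 100_000, 2_000
print(storage_1d(n_q, n_d))       # entries for one dimension
print(storage_nd(2, n_q, n_d))    # entries for two dimensions
print(search_factor(2), search_factor(3))
```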

22 Cross checking
• Algorithm
  – For +XQSet: check whether v_t is located between the Y predicates
  – For –XQSet: check whether v_{t-1} is located between the Y predicates
  – ±YQSet is checked against the X-dimension in a similar manner
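
The cross-check can be sketched for the two-dimensional case as follows. This is a minimal sketch under stated assumptions: the per-dimension step is a brute-force scan over all queries standing in for the RS-X and RS-Y list searches, the ranges are open intervals, and the function names and query rectangles are hypothetical.

```python
X, Y = 0, 1


def inside(rng, v):
    lo, hi = rng
    return lo < v < hi


def crossings_1d(queries, dim, prev, cur):
    """Stand-in for the per-dimension search: (+QSet, -QSet) in one dimension."""
    plus, minus = set(), set()
    for q, ranges in queries.items():
        was, now = inside(ranges[dim], prev[dim]), inside(ranges[dim], cur[dim])
        if now and not was:
            plus.add(q)
        elif was and not now:
            minus.add(q)
    return plus, minus


def cross_check(plus, minus, queries, other, prev, cur):
    """Validate one dimension's candidates against the other dimension."""
    # +QSet: the current value v_t must satisfy the other dimension's predicate
    plus_ok = {q for q in plus if inside(queries[q][other], cur[other])}
    # -QSet: the previous value v_{t-1} must have satisfied it
    minus_ok = {q for q in minus if inside(queries[q][other], prev[other])}
    return plus_ok, minus_ok


def search_2d(queries, prev, cur):
    x_plus, x_minus = crossings_1d(queries, X, prev, cur)
    y_plus, y_minus = crossings_1d(queries, Y, prev, cur)
    xb_plus, xb_minus = cross_check(x_plus, x_minus, queries, Y, prev, cur)
    yb_plus, yb_minus = cross_check(y_plus, y_minus, queries, X, prev, cur)
    return xb_plus | yb_plus, xb_minus | yb_minus   # union of per-dimension results


# Hypothetical rectangles: name -> ((x_lo, x_hi), (y_lo, y_hi))
RECTS = {
    "Q1": ((0, 10), (0, 10)),
    "Q2": ((5, 15), (5, 15)),
    "Q3": ((20, 30), (0, 10)),
}

moved_in = search_2d(RECTS, (2, 2), (7, 7))        # enters Q2, stays in Q1
moved_out = search_2d(RECTS, (7, 7), (12, 2))      # leaves Q1 (via X) and Q2 (via Y)
rejected = search_2d(RECTS, (17, 20), (25, 20))    # crosses Q3's X border only
print(moved_in, moved_out, rejected)
```

In the last call the X-dimension step proposes +Q3, but the cross-check discards it because the current Y value lies outside Q3's Y predicate.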
