Elke A. Rundensteiner Database Systems Research Group Office: Fuller 238 Phone: Ext. – 5815 WebPages:


1 Elke A. Rundensteiner
Database Systems Research Group
Email: rundenst@cs.wpi.edu
Office: Fuller 238
Phone: Ext. 5815
WebPages: http://www.cs.wpi.edu/~rundenst and http://davis.wpi.edu/dsrg

2 Elke A. Rundensteiner
Topics: projects in database and information systems, such as web information systems, distributed databases, etc.
Database Systems Research Lab
Email: rundenst@cs.wpi.edu
Office: Fuller 238
Phone: x5815
WebPages: http://www.cs.wpi.edu/~rundenst and http://davis.wpi.edu/dsrg

3 Project Topics in a Nutshell:
- Distributed Data Sources:
  - EVE: Data Warehousing over Distributed Data
  - TOTAL-ETL: Distributed Extract Transform Load
  [NSF’96, NSF02, NSF05?]
- XML/Web Data Systems:
  - RAINBOW: XML to Relational Databases
  - MASS: Native XQuery Processing System
  [Verizon, IBM, NSF05, NSF05?]
- Databases & Visualization:
  - Scalable Visual High-Dim. Data Exploration
  - Data and Visual Quality Support in XMDV
  [NSF’97, NSF01, NSF05]
- Stream Monitoring System:
  - Scalable Query Engine for Data Streams
  - Fire Prediction and Monitoring Appl.
  [NSF05a?, NSF05b?]

4 CAPE: Engine for Querying and Monitoring Streaming Data
Examples of Stream Data Applications:
- Market Analysis: streams of stock exchange data (get rich)
- Critical Care: streams of vital sign measurements (save lives)
- Physical Plant Monitoring: streams of environmental readings (protect the environment)

5 Databases Upside Down
[Figure: a traditional database, where one-time queries run over static data, contrasted with a stream system, where streams of data flow through standing queries.]
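The contrast on this slide, one-time queries over static data versus standing queries over streams of data, can be sketched in a few lines of Python. This is only an illustration of the idea, not code from CAPE; the function names `one_time_query` and `standing_query` and the stock-tick records are hypothetical.

```python
from typing import Callable, Iterable, Iterator, List


def one_time_query(table: List[dict], predicate: Callable[[dict], bool]) -> List[dict]:
    """Traditional model: the data is static, the query runs once and returns everything."""
    return [row for row in table if predicate(row)]


def standing_query(stream: Iterable[dict], predicate: Callable[[dict], bool]) -> Iterator[dict]:
    """Stream model: the query stays registered and emits each match as the tuple arrives."""
    for tup in stream:
        if predicate(tup):
            yield tup


# Hypothetical stock ticks, used both as a stored table and as an arriving stream.
ticks = [
    {"symbol": "AAA", "price": 10},
    {"symbol": "BBB", "price": 30},
    {"symbol": "CCC", "price": 25},
]


def is_expensive(tick: dict) -> bool:
    return tick["price"] > 20


stored_answer = one_time_query(ticks, is_expensive)
streamed_answer = list(standing_query(iter(ticks), is_expensive))
```

Both calls select the same tuples here; the difference is that the standing query produces its answers incrementally and never needs the whole input to be stored first.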

6 Stream Query Processing
[Figure: users register continuous queries with a distributed stream query engine and receive answers; streaming data flows in and streaming results flow out.]
- Streaming data may have time-varying rates and high volumes.
- Real-time and accurate responses are required.
- High workload of queries.
- Memory and CPU resource limitations.
- Available resources for executing each operator may vary over time.
- Run-time distribution and adaptations are required.

7 Good news … for a research student
- We can lean on the oldies but goodies,
- Yet so many new and unsolved problems are at our fingertips, now seen in a new light!
- Interesting (yet doable) research challenges
- Even possibilities for a start-up (if you are so inclined)

8 Research Contributions
- Scalable Query Operators (Punctuations)
  - Adapt and select among tasks such as memory purging, stream reading, memory-to-disk shuffling, punctuation propagation, index selection, etc.
- Synchronized Plan Spilling
  - Operators selectively spill data to disk to offset system overload, with adaptive reloading to improve performance.
- Adaptive Operator Scheduling
  - A selector scores alternative scheduling algorithms based on their effect on QoS requirements and selects a candidate.
- On-line Query Plan Migration
  - On-line plan restructuring, followed by on-line migration to the new plan, even for stateful operators.
- Distributed Plan Execution
  - Adaptively distribute computations across multiple machines to optimize QoS requirements without information loss.
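Of the contributions on this slide, punctuation-driven memory purging is the easiest to illustrate in isolation. The sketch below assumes a punctuation means "no more tuples with this key will arrive on this input" and shows how a symmetric hash join could use that to drop state. It is not CAPE's implementation; the class `PunctuatedJoin` and its method names are hypothetical.

```python
from collections import defaultdict


class PunctuatedJoin:
    """Toy symmetric hash join that purges state when punctuations arrive."""

    def __init__(self):
        # Per-input join state: key -> list of stored values.
        self.state = {"left": defaultdict(list), "right": defaultdict(list)}
        # Keys for which each input has promised to send no more tuples.
        self.closed = {"left": set(), "right": set()}

    @staticmethod
    def _other(side):
        return "right" if side == "left" else "left"

    def on_tuple(self, side, key, value):
        """Probe the opposite state, then store the tuple only if it can still match later."""
        other = self._other(side)
        matches = list(self.state[other].get(key, []))
        if key not in self.closed[other]:
            self.state[side][key].append(value)
        if side == "left":
            return [(key, value, m) for m in matches]
        return [(key, m, value) for m in matches]

    def on_punctuation(self, side, key):
        """`side` will send no more tuples with `key`: purge the opposite state for it."""
        self.closed[side].add(key)
        purged = self.state[self._other(side)].pop(key, [])
        return len(purged)

    def state_size(self):
        return sum(len(vals) for s in self.state.values() for vals in s.values())


join = PunctuatedJoin()
first = join.on_tuple("left", 1, "a")       # nothing to join with yet
second = join.on_tuple("right", 1, "x")     # joins with the stored left tuple
size_before = join.state_size()
purged = join.on_punctuation("left", 1)     # right-side state for key 1 is now useless
size_after = join.state_size()
third = join.on_tuple("right", 1, "y")      # still joins, but is not stored
```

The design choice here is that a punctuation on one input frees state on the opposite input, because those stored tuples were only waiting for future partners that can no longer arrive.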

9 We got it all... and more
- If you like theory: algorithms for NP-complete optimization, graph theory
- If you like systems: distributed allocation, scheduling, and parallelism of query execution
- If you like networking: quality-of-query, load shedding, grid computing
- If you like AI: learning of scheduling selection, run-time adaptation
- If you like software engineering: huge query engine code base, we really need you
So where is the database in this stuff?

10
- One answer: Who cares? If it’s fun, it’s database stuff.
- Second answer: Development of a new generation of “data query engine”.

11 A driving application: FIRE

12 Sensors in Rooms

13 Engineering Data for Fire Science

14 Futuristic Monitoring Queries?
- Track a smoke cloud (moving cluster) in terms of its speed and severity?
- Find the scope and direction of the fire's spread?
- Match given sensor readings of a fire with a fire stream simulation to determine similarity?
- Is this a prank (outlier), or are we dealing with an actual fire?
- What path should people take to leave this building?
- Are any sensor readings faulty, so that they should be ignored?
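The "prank (outlier)" and "faulty reading" questions above can be illustrated with a small sliding-window check. The sketch below flags a reading that deviates strongly from the median of the most recent readings of one sensor. It swaps in a generic median-absolute-deviation test as an assumption; the slides do not say which technique the project uses, and the class `SlidingOutlierDetector`, its parameters, and the temperature values are hypothetical.

```python
from collections import deque
from statistics import median


class SlidingOutlierDetector:
    """Flags readings far from the median of the last `window` readings."""

    def __init__(self, window=5, threshold=3.0, min_scale=0.5):
        self.recent = deque(maxlen=window)
        self.threshold = threshold
        self.min_scale = min_scale  # floor on the spread, so a flat window is not oversensitive

    def check(self, reading):
        is_outlier = False
        if len(self.recent) == self.recent.maxlen:
            med = median(self.recent)
            mad = median([abs(x - med) for x in self.recent])
            scale = max(mad, self.min_scale)
            is_outlier = abs(reading - med) > self.threshold * scale
        # Every reading enters the window, so a sustained change (a real fire)
        # eventually becomes the new normal, while an isolated spike does not.
        self.recent.append(reading)
        return is_outlier


detector = SlidingOutlierDetector(window=5, threshold=3.0)
temperatures = [20, 21, 20, 22, 21, 80, 21]
flags = [detector.check(t) for t in temperatures]
```

With these values only the isolated reading of 80 is flagged; telling a prank from an actual fire would additionally need readings from neighboring sensors, which this sketch does not model.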

15 FireEngine : Fire Stream Processing

16 Questions? Email me: rundenst@cs.wpi.edu
Better, drop by the DSRG Labs: Fuller 319 & 318
My office: Fuller 238

