Aurora
Proponent Team: Wei, Mingrui Liu, Mo
Rebuttal Team: Joshua M Lee Raghavan, Venkatesh
AURORA Proponent
– Introduction
– Motivation
– Aurora System Architecture
– Aurora System Query model
– Conclusion
Introduction: Traditional DBMSs
– Passive repository: Human-Active, DBMS-Passive (HADP) model
– Only the current state of the data is important: previous data needs to be extracted from the log
– Triggers and alerts are second-class citizens
– Perfect synchronization of data elements and exact query answers
– No real-time services for applications
Introduction: Monitoring apps
– Monitoring applications are applications that monitor continuous streams of data
– Active repository: DBMS-Active, Human-Passive (DAHP) model
– History of the data is important: not only the current state but also the previous history
– Triggers and alerts are first-class citizens
– Missing or imprecise data, and approximate query answers
– Real-time services required by applications
Introduction: Monitoring apps
Target applications:
– military
– financial analysis
– tracking
– other real-time applications
Car Navigation System
– Data (e.g., the location of the car) comes from external sources
– History of the data is required (e.g., display the trajectory of your car over the past 20 minutes)
– Trigger and alert oriented: an alert for the driver when the car is approaching an intersection
– The location of the car is not always transmitted perfectly, due to interference etc.
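The car-navigation requirements above (push-based input, a 20-minute history, an alert near an intersection) can be illustrated with a minimal Python sketch. This is not Aurora code: the class `TrajectoryWindow`, the function `near_intersection`, and the coordinates are hypothetical names and values chosen for illustration. A lost position report simply never arrives, so the window tolerates gaps.

```python
from collections import deque


class TrajectoryWindow:
    """Hypothetical helper: keeps position reports from the last `span_s` seconds."""

    def __init__(self, span_s=20 * 60):
        self.span_s = span_s
        self.points = deque()  # (timestamp_s, x, y), oldest first

    def push(self, t, x, y):
        # A new report is pushed by the external source.
        self.points.append((t, x, y))
        # Evict reports that have fallen out of the time window.
        while self.points and self.points[0][0] < t - self.span_s:
            self.points.popleft()

    def trajectory(self):
        return [(x, y) for _, x, y in self.points]


def near_intersection(pos, intersection, radius):
    """Alert condition (assumed): the car is within `radius` of the intersection."""
    dx = pos[0] - intersection[0]
    dy = pos[1] - intersection[1]
    return dx * dx + dy * dy <= radius * radius


w = TrajectoryWindow(span_s=1200)
w.push(0, 0.0, 0.0)
w.push(600, 1.0, 0.0)
w.push(1500, 2.0, 0.0)  # the report at t=0 is now older than 20 minutes
```

After the third report, `w.trajectory()` holds only the two most recent positions, and the alert check can be run on each new report as it arrives.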
Motivation: DSMS
– Data Stream Management Systems
– Streams: continuous data feeds from sources, e.g., sensors and satellites
– Monitoring applications track the data from numerous streams, filtering them for signs of abnormal activity and processing them for purposes of aggregation, reduction and correlation
Management Requirements
DBMS
1. Data processing results from humans issuing transactions and queries
2. Manages data in its tables
3. Provides exact answers to exact queries and is blind to real-time deadlines
4. Optimizes all queries in the same way
5. The norm is pull-based queries
DSMS
1. Monitors streams and alerts humans to abnormal activities
2. Processes data bounded by a finite window of values, not the unbounded past
3. Responds to real-time deadlines and provides reasonable approximations to queries
4. Benefits from application-specific optimization criteria (QoS)
5. The norm is push-based data processing
What is Aurora
– Deals with large numbers of data streams
– Users build queries out of a small set of operators (boxes)
– Supports combining boxes for better answers
– Better support for monitoring applications: stream processing, QoS functions
– Operators: filtering, mapping, windowed aggregate, join
– Timeout
Aurora Architecture
– Continuous stream data comes in
– Data flows through a set of operators
– Output goes to an application or is materialized
– Multiple streams can be merged
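The flow of tuples through a set of boxes can be sketched with Python generators. This is only an illustration of the boxes-and-arrows idea under simplifying assumptions, not Aurora's actual operator interface: the names `filter_box`, `map_box`, `window_avg_box`, and `connect` are hypothetical, and the window here is a simple count-based tumbling window.

```python
def filter_box(pred):
    """Box that passes only tuples satisfying `pred`."""
    def box(stream):
        return (t for t in stream if pred(t))
    return box


def map_box(fn):
    """Box that applies `fn` to every tuple."""
    def box(stream):
        return (fn(t) for t in stream)
    return box


def window_avg_box(size):
    """Box that emits the average of each run of `size` consecutive values."""
    def box(stream):
        buf = []
        for v in stream:
            buf.append(v)
            if len(buf) == size:
                yield sum(buf) / size
                buf.clear()
    return box


def connect(*boxes):
    """Chain boxes so the output of one feeds the next."""
    def network(stream):
        for b in boxes:
            stream = b(stream)
        return stream
    return network


# Illustrative input: -999.0 stands for an invalid sensor reading.
readings = [
    {"sensor": "a", "temp": 20.0},
    {"sensor": "a", "temp": 22.0},
    {"sensor": "a", "temp": -999.0},
    {"sensor": "a", "temp": 24.0},
    {"sensor": "a", "temp": 26.0},
]

network = connect(
    filter_box(lambda t: t["temp"] > -100.0),  # drop invalid readings
    map_box(lambda t: t["temp"]),              # keep only the temperature
    window_avg_box(2),                         # average every two readings
)
result = list(network(iter(readings)))
```

Because each box is lazy, tuples are processed as they arrive rather than after the whole input is stored, which mirrors the push-based style of the slides.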
Aurora System: Query model
– Supports continuous queries
– Supports ad-hoc queries
Aurora QoS Model
– Each output is supplied with a QoS specification
– QoS is captured by three functions:
  – A latency graph
  – A loss-tolerance graph
  – A value-based graph
QoS
Aurora optimization
– Dynamic continuous query optimization
– The optimizer selects a portion of the network to optimize by:
  – Inserting projections
  – Combining boxes
  – Reordering boxes
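As one hedged illustration of "combining boxes", the sketch below merges adjacent filter boxes into a single filter and checks that the output is unchanged. The network representation (a list of `(kind, function)` pairs) and the names `combine_filters` and `run` are assumptions made for this example, not Aurora's internal structures.

```python
def combine_filters(boxes):
    """Merge runs of adjacent ("filter", pred) boxes into one filter box."""
    out = []
    for kind, fn in boxes:
        if kind == "filter" and out and out[-1][0] == "filter":
            prev = out[-1][1]
            # Bind both predicates as defaults so each merged box keeps its own pair.
            out[-1] = ("filter", lambda t, p=prev, q=fn: p(t) and q(t))
        else:
            out.append((kind, fn))
    return out


def run(boxes, stream):
    """Evaluate a linear network of filter and map boxes over a list of tuples."""
    for kind, fn in boxes:
        if kind == "filter":
            stream = [t for t in stream if fn(t)]
        elif kind == "map":
            stream = [fn(t) for t in stream]
    return stream


network = [
    ("filter", lambda t: t > 0),
    ("filter", lambda t: t % 2 == 0),
    ("map", lambda t: t * 10),
]
optimized = combine_filters(network)
data = [-2, 1, 2, 3, 4]
```

The optimized network has two boxes instead of three, and `run(optimized, data)` yields the same tuples as `run(network, data)`; fewer boxes means fewer hand-offs between operators.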
QoS data structures
– Response times: output tuples should be produced in a timely fashion, as otherwise QoS will degrade as delays get longer.
– Tuple drops: if tuples are dropped to shed load, then the QoS of the affected outputs will deteriorate.
– Values produced: QoS clearly depends on whether or not important values are being produced.
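A latency-based QoS function of the kind described above can be modeled as a piecewise-linear graph mapping delay to utility. The sketch below is illustrative only: the helper `piecewise_linear` is hypothetical, and the breakpoints (full utility up to 2 seconds, falling to zero at 4 seconds) are assumed numbers, not values from Aurora.

```python
def piecewise_linear(points):
    """Return f(x) that interpolates linearly between sorted (x, y) points."""
    def f(x):
        if x <= points[0][0]:
            return points[0][1]
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            if x <= x1:
                return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
        # Beyond the last breakpoint the utility stays at its final value.
        return points[-1][1]
    return f


# Assumed shape: utility 1.0 up to a 2 s delay, then a linear drop to 0.0 at 4 s.
latency_qos = piecewise_linear([(0.0, 1.0), (2.0, 1.0), (4.0, 0.0)])
```

With these assumed breakpoints, a tuple delivered after 1 second keeps full utility, one delivered after 3 seconds is worth half, and anything later than 4 seconds contributes nothing. The loss-tolerance and value-based graphs could be represented the same way, with the fraction of tuples delivered or the output value on the x-axis.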
QoS
Load Shedding
– Load shedding by dropping tuples
  – Minimizes the overall performance degradation, based on static analysis
– Semantic load shedding by filtering tuples
  – Based on value-based QoS information, if available
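The two shedding styles can be contrasted in a short sketch. It is a simplification under stated assumptions, not Aurora's algorithm: `random_shed` drops each tuple with a fixed probability, while `semantic_shed` drops the tuples that a caller-supplied value-based QoS function rates as least important. The function names, the readings, and the QoS function `abs(v - 50)` are all hypothetical.

```python
import random


def random_shed(tuples, drop_fraction, rng):
    """Drop each tuple independently with probability `drop_fraction`."""
    return [t for t in tuples if rng.random() >= drop_fraction]


def semantic_shed(tuples, drop_fraction, value_qos):
    """Drop the `drop_fraction` of tuples with the lowest value-based utility."""
    n_drop = int(len(tuples) * drop_fraction)
    # Rank tuple positions from least to most important.
    order = sorted(range(len(tuples)), key=lambda i: value_qos(tuples[i]))
    dropped = set(order[:n_drop])
    # Keep the survivors in their original arrival order.
    return [t for i, t in enumerate(tuples) if i not in dropped]


# Assumption for illustration: readings far from 50 are the abnormal, important ones.
readings = [50, 98, 51, 120, 49, 101]
kept = semantic_shed(readings, 0.5, lambda v: abs(v - 50))
sampled = random_shed(readings, 0.5, random.Random(0))
```

Both calls aim to shed about half the load, but the semantic version keeps the three outlying readings, whereas the random version may discard them.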
Conclusion
– Aurora is a rising star in the DBMS field
– More demand for monitoring applications
– Future directions:
  – Aurora* for distributed processing
  – More efficient data-handling algorithms for the missing and/or imprecise data that is common in sensor networks
AURORA Rebuttal
User Issues
– The graphical workflow-style specification could be cumbersome in real-life use
– An extended SQL may be easier for users