Eddies for Continuous Queries


1 Eddies for Continuous Queries
Sam Madden CS286 Project S01

2 Motivation
- Want many queries over continuous streams of data
- Current Eddies
  - Thread per query
  - Scanner per source
- Share common work between modules
  - Reduce memory burden
  - Intra-query scheduling
- (Not focusing on joins; need new operators to deal with endless streams)

3 Data Structures
- One Eddy per Telegraph instance
- Only one Source module for each source (over all queries)
- One Filter per source field (over all queries)
- Per-source state
  - Source -> Reachable modules
  - Query -> Completion bitmask
- Per-tuple state
  - Output query mask
- Per-query state
  - Output queues
  - Aggregate information
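The state listed on this slide could be laid out roughly as follows. This is a minimal sketch, not the Telegraph implementation: the class names, the `register_query` helper, and the `done` field are hypothetical, and it assumes that the completion bitmask records which filter modules a tuple must pass through before it can be output to a query.

```python
from dataclasses import dataclass, field

@dataclass
class SourceState:
    # Bitmask of filter modules reachable from this source (bit i = module i).
    reachable: int = 0
    # Query id -> bitmask of modules a tuple must visit before it may be
    # output to that query (assumed meaning of "completion bitmask").
    completion: dict = field(default_factory=dict)

@dataclass
class TupleState:
    source: str
    values: dict
    done: int = 0         # modules already visited (hypothetical extra field)
    output_mask: int = 0  # queries that already received or rejected the tuple

def register_query(state, query_id, module_ids):
    """Record which modules a query needs and extend the source's reachability."""
    mask = 0
    for m in module_ids:
        mask |= 1 << m
    state.completion[query_id] = mask
    state.reachable |= mask

s = SourceState()
register_query(s, 0, [0])      # query 0 filters on one field (module 0)
register_query(s, 1, [0, 1])   # query 1 filters on two fields (modules 0 and 1)
```

Per-query state (output queues, aggregates) is omitted here to keep the sketch short.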

4 Tuple Flow
- Tuple arrives
  - Tagged with source id
  - Routing policy chooses a filter to route to, based on modules reachable from source
  - Filter marks query state as “output” for tuples which don’t pass
  - Tuple output to queries which have completed, using source
  - If more filters to check, tuple re-inserted into eddy
- Works for joins too (somewhat inefficiently?)
  - Extend reachability graph across joins
  - Project out unused sources when tuples are output
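The single-source flow above can be sketched as a small loop. This is an illustration under stated assumptions, not the project's code: `FILTERS`, `QUERIES`, and `run_tuple` are hypothetical names, Python sets stand in for the bitmasks, and routing is random. A query is treated as "finished" for a tuple once the tuple was either delivered to it or rejected by one of its predicates.

```python
import random

# Hypothetical setup: one filter module per field, holding one predicate
# for every query that references that field.
FILTERS = {
    "a": {0: lambda v: v > 30, 1: lambda v: v > 30},
    "b": {1: lambda v: v > 30},
}
QUERIES = {0: {"a"}, 1: {"a", "b"}}  # query id -> fields it filters on

def run_tuple(values, rng=random):
    """Route one tuple through the filters; return the queries that receive it."""
    done = set()       # filters already applied to this tuple
    finished = set()   # queries marked "output" (delivered or rejected)
    delivered = []
    while True:
        # Deliver to every live query whose filters have all been applied.
        for q, needed in QUERIES.items():
            if q not in finished and needed <= done:
                delivered.append(q)
                finished.add(q)
        remaining = [f for f in FILTERS if f not in done]
        if len(finished) == len(QUERIES) or not remaining:
            return sorted(delivered)
        fld = rng.choice(remaining)          # routing policy (random here)
        for q, pred in FILTERS[fld].items():
            if not pred(values[fld]):
                finished.add(q)              # rejected: never output to q
        done.add(fld)                        # tuple goes back to the eddy
```

For example, `run_tuple({"a": 40, "b": 10})` delivers the tuple only to query 0, whichever filter happens to be visited first.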

5 Combining Filters
- Given a Filter F over some field S.a, with n predicates generalized to be over ranges [a, b] (plus not-equals)
- Interval tree for >, >=, <, <= predicates, inserting one interval per predicate: (a, ∞), [a, ∞), (-∞, b), or (-∞, b]. (O(log n))
- When a tuple arrives, find the intervals which it intersects. (O(n))
- For = and ≠, use a hash table
  - For ≠, output all tuples except those in table
- Saves routing, tuple parsing cost
- Simplifies optimization space
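A grouped filter along these lines might look like the sketch below. Note the substitution: because every range predicate here is one-sided, sorted lists with binary search stand in for the interval tree named on the slide. `GroupedFilter` and its methods are hypothetical names, and the sketch assumes each query has at most one predicate on the field.

```python
import bisect

RANGE_OPS = (">", ">=", "<", "<=")

class GroupedFilter:
    """All predicates on one field, across all queries."""

    def __init__(self):
        self.consts = {op: [] for op in RANGE_OPS}   # sorted constants per operator
        self.queries = {op: [] for op in RANGE_OPS}  # query ids, parallel to consts
        self.eq = {}          # constant -> queries with "field = constant"
        self.ne = {}          # constant -> queries with "field != constant"
        self.ne_all = set()   # every query that has a != predicate

    def add(self, query, op, const):
        if op in self.consts:
            i = bisect.bisect_left(self.consts[op], const)
            self.consts[op].insert(i, const)
            self.queries[op].insert(i, query)
        elif op == "=":
            self.eq.setdefault(const, set()).add(query)
        elif op == "!=":
            self.ne.setdefault(const, set()).add(query)
            self.ne_all.add(query)
        else:
            raise ValueError(f"unsupported operator: {op}")

    def matches(self, v):
        """Return the ids of all queries whose predicate is satisfied by v."""
        lo, hi = bisect.bisect_left, bisect.bisect_right
        c, q = self.consts, self.queries
        out = set()
        out.update(q[">"][:lo(c[">"], v)])     # constants strictly below v
        out.update(q[">="][:hi(c[">="], v)])   # constants at or below v
        out.update(q["<"][hi(c["<"], v):])     # constants strictly above v
        out.update(q["<="][lo(c["<="], v):])   # constants at or above v
        out |= self.eq.get(v, set())
        out |= self.ne_all - self.ne.get(v, set())  # all != queries except those in the table
        return out
```

One call to `matches` evaluates every query's predicate on the field, which is where the savings in routing and tuple parsing would come from.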

6 Routing Policy
- Random policy routes to each module with equal probability
- Ticket policy: from Eddy paper
  - Route to modules with highest selectivity
  - Estimate selectivity based on ratio of in/out tuples
  - Use back-pressure to adjust delivery rates
- Multi-query Ticket policy
  - Estimate selectivity based on ratio of (number of applied predicates / number of passed predicates)
  - Based on Shankar’s implementation: back pressure not applied properly
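One way to picture the ticket policies is as a weighted lottery. The sketch below is an interpretation, not the implementation the slide refers to: `TicketRouter` is a hypothetical name, the multi-query variant is assumed to credit one ticket per predicate applied and debit one per predicate passed, and the floor of one ticket is an added assumption so that no module is starved. Back-pressure is not modeled.

```python
import random

class TicketRouter:
    def __init__(self, modules, rng=None):
        # Every module starts with one ticket so it can always be chosen.
        self.tickets = {m: 1 for m in modules}
        self.rng = rng or random.Random()

    def choose(self, candidates):
        """Hold a lottery among candidate modules, weighted by ticket count."""
        weights = [self.tickets[m] for m in candidates]
        return self.rng.choices(candidates, weights=weights, k=1)[0]

    def report(self, module, applied, passed):
        """Credit tickets for predicates applied, debit for predicates passed.

        With a single query this reduces to applied=1 and passed in {0, 1},
        i.e. the in/out tuple ratio.
        """
        self.tickets[module] = max(1, self.tickets[module] + applied - passed)

router = TicketRouter(["a", "b"], rng=random.Random(0))
for _ in range(100):
    router.report("a", applied=4, passed=1)  # selective: most predicates fail
    router.report("b", applied=4, passed=4)  # unselective: everything passes
```

After these reports module `a` holds far more tickets than `b`, so most tuples are routed to the more selective module first.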

7 Preliminary Results
Simple, four query test:
  from s select s.index where s.a > 30
  from s select s.index where s.b > 30 and s.a > 30
  from s select s.index where s.c > 30 and s.b > 30 and s.a > 30
  from s select s.index where s.d > 30 and s.c > 30 and s.b > 30 and s.a > 30
Becomes five modules: one scanner and four filters
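To see why the four queries collapse into four filters, the sketch below groups their predicates by field. The dictionary encoding of the queries and the helper names `build_filters` and `evaluate` are hypothetical; every predicate is assumed to be of the form `field > bound`, as in the test.

```python
from collections import defaultdict

# The four test queries, encoded as field -> lower bound.
QUERIES = [
    {"a": 30},
    {"a": 30, "b": 30},
    {"a": 30, "b": 30, "c": 30},
    {"a": 30, "b": 30, "c": 30, "d": 30},
]

def build_filters(queries):
    """Group predicates by field: one filter module per field, over all queries."""
    filters = defaultdict(dict)
    for qid, preds in enumerate(queries):
        for fld, bound in preds.items():
            filters[fld][qid] = bound
    return dict(filters)

def evaluate(filters, n_queries, row):
    """Return the ids of the queries whose predicates all hold for this row."""
    alive = set(range(n_queries))
    for fld, preds in filters.items():
        alive -= {q for q, bound in preds.items() if not row[fld] > bound}
    return sorted(alive)

filters = build_filters(QUERIES)
```

The filter on `a` carries a predicate for all four queries, while the filter on `d` carries one; together with the single scanner on `s` that gives the five modules.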

