
1 CS 410/510 Data Streams
Lecture 15: How Soccer Players Would do Stream Joins & Query-Aware Partitioning for Monitoring Massive Network Data Streams
Kristin Tufte, David Maier
3/8/2012

2 How Soccer Players Would do Stream Joins
Handshake Join
- Evaluates window-based stream joins
- Highly parallelizable
- Implementation on a multi-core machine and an FPGA
Previous stream join execution strategies
- Sequential execution based on operational semantics

3 Let's talk about stream joins
Join a window of R with a window of S
- Focus here is on sliding windows
Per-tuple steps: Scan, Insert, Invalidate (see the sketch below)
How might I parallelize?
- Partition and replicate
- Time-based windows vs. tuple-based windows
Figure Credit: How Soccer Players Would do Stream Joins – Teubner, Mueller, SIGMOD 2011
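As a reference point before the handshake join, here is a minimal Python sketch of the scan/insert/invalidate pattern for a time-based sliding-window join. The tuple layout, window length, and join predicate are illustrative assumptions, not the paper's implementation.

from collections import deque

WINDOW = 60.0  # seconds; illustrative time-based window size

def process_arrival(new_tuple, own_window, other_window, predicate, now):
    """Classic per-tuple handling: scan, insert, invalidate."""
    # 1. Scan: probe the opposite window for join partners.
    matches = [(new_tuple, other) for other in other_window
               if predicate(new_tuple, other)]
    # 2. Insert: append the new tuple to its own window.
    own_window.append(new_tuple)
    # 3. Invalidate: expire tuples older than the window bound.
    for w in (own_window, other_window):
        while w and w[0]["ts"] < now - WINDOW:
            w.popleft()
    return matches

# Example: join R and S on a hypothetical "key" attribute.
r_win, s_win = deque(), deque()
pred = lambda a, b: a["key"] == b["key"]
out = process_arrival({"ts": 100.0, "key": 7}, r_win, s_win, pred, now=100.0)

In the sequential strategy this loop runs on a single core; the rest of the lecture is about distributing exactly this work.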

4 So, Handshake Join…
Handshake Join
- Entering tuple pushes the oldest tuple out
- No central coordination
- Same semantics as a traditional stream join
- May introduce disorder
Traditional Stream Join
- Parallelization needs partitioning, and possibly replication
- Needs central coordination
Figure Credit: How Soccer Players Would do Stream Joins – Teubner, Mueller, SIGMOD 2011

5 Parallelization
Each core gets a segment of each window
Data flow: act locally on newly arriving data and pass data on
Good for shared-nothing setups
Simple communication: interact only with neighbors; avoids bottlenecks
Figure Credit: How Soccer Players Would do Stream Joins – Teubner, Mueller, SIGMOD 2011
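A simplified, single-threaded Python sketch of this data-flow idea: each "core" holds one segment of each window, joins new arrivals against its local segment of the opposite stream, and forwards its oldest tuple to its neighbor when the segment is full. The Segment class, the capacity parameter, and the flow directions are my illustrative assumptions; the actual multi-core and FPGA implementations handle the concurrency and synchronization this sketch ignores.

from collections import deque

class Segment:
    """One core's slice of the handshake join pipeline (simplified)."""
    def __init__(self, capacity, predicate):
        self.r = deque()              # local piece of the R window
        self.s = deque()              # local piece of the S window
        self.capacity = capacity      # tuples held per stream per core
        self.predicate = predicate
        self.right = None             # neighbor in R's flow direction
        self.left = None              # neighbor in S's flow direction
        self.results = []

    def accept_r(self, t):
        # Join the arriving R tuple against the local S segment only.
        self.results += [(t, s) for s in self.s if self.predicate(t, s)]
        self.r.append(t)
        if len(self.r) > self.capacity:          # pass the oldest tuple onward
            old = self.r.popleft()
            if self.right:                       # otherwise it leaves the window
                self.right.accept_r(old)

    def accept_s(self, t):
        self.results += [(r, t) for r in self.r if self.predicate(r, t)]
        self.s.append(t)
        if len(self.s) > self.capacity:
            old = self.s.popleft()
            if self.left:
                self.left.accept_s(old)

# Wire three "cores" into a pipeline: R enters on the left, S on the right.
pred = lambda a, b: a % 10 == b % 10
cores = [Segment(capacity=4, predicate=pred) for _ in range(3)]
for a, b in zip(cores, cores[1:]):
    a.right, b.left = b, a
for i in range(20):
    cores[0].accept_r(i)       # R stream enters the leftmost core
    cores[-1].accept_s(i + 1)  # S stream enters the rightmost core

Because each core only talks to its neighbors, the communication pattern matches the "simple communication" bullet above.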

6 Parallelization - Observations
Parallelizes joins with tuple-based windows and non-equi-join predicates
As written, compares all tuple pairs; could hash at each node to optimize
Note the data transfer costs between cores, and that each tuple is processed at every core
Soccer players have short arms; hardware is NUMA
Figure Credit: How Soccer Players Would do Stream Joins – Teubner, Mueller, SIGMOD 2011

7 Scalability
Data flow + point-to-point communication
Additional cores allow larger window sizes or reduce the workload per core
"directly turn any degree of parallelism into higher throughput or larger supported window sizes"
"can trivially be scaled up to handle larger join windows, higher throughput rates, or more compute-intensive join predicates"
Figure Credit: How Soccer Players Would do Stream Joins – Teubner, Mueller, SIGMOD 2011

8 Encountering Tuples
An item in either window encounters all current tuples in the other window
Immediate scan strategy
Flexible segment boundaries (cores)
Other local join implementations are possible
Figure Credit: How Soccer Players Would do Stream Joins – Teubner, Mueller, SIGMOD 2011

9 Handshake Join with Message Passing
Lock-step processing (tuple-based windows)
FIFO queues with message passing
Problem: a join pair can be missed when tuples pass each other in flight

10 Two-phase forwarding
Asymmetric synchronization (replication on one core only)
Keep copies of forwarded tuples until an ack is received
The ack for s4 must be processed between r5 and r6
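A single-threaded Python sketch of the bookkeeping this slide describes, assuming two neighboring cores, S tuples flowing left-to-right, and R tuples plus acks flowing right-to-left. Names, message formats, and the tuple layout are mine, not the paper's; the real implementation runs these steps concurrently on separate cores.

from collections import deque

left_to_right = deque()    # FIFO carrying ("s", s_tuple) messages
right_to_left = deque()    # FIFO carrying ("r", r_tuple) and ("ack", s_id) messages

left_s_window = []         # left core's local S segment
left_pending = {}          # phase 1: forwarded S tuples retained until acked
results = []

def left_forward_s(s_tuple):
    """Left core hands its oldest S tuple to the right core, keeping a copy."""
    if s_tuple in left_s_window:
        left_s_window.remove(s_tuple)
    left_pending[s_tuple["id"]] = s_tuple
    left_to_right.append(("s", s_tuple))

def right_accept_s():
    """Right core takes ownership of the S tuple and acknowledges it.
    The ack is enqueued before any R tuples the right core forwards later,
    so FIFO ordering is what rules out missed and duplicated join pairs."""
    kind, s_tuple = left_to_right.popleft()
    assert kind == "s"
    right_to_left.append(("ack", s_tuple["id"]))
    return s_tuple          # now part of the right core's local S segment

def left_drain():
    """Left core processes its incoming FIFO strictly in order."""
    while right_to_left:
        kind, payload = right_to_left.popleft()
        if kind == "r":
            # Probe the local S segment plus the not-yet-acked copies, so an
            # R tuple cannot miss an S tuple that is still in flight.
            for s in left_s_window + list(left_pending.values()):
                if s["key"] == payload["key"]:
                    results.append((payload, s))
        else:  # "ack": phase 2, the retained copy is no longer needed
            left_pending.pop(payload, None)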

11 Load Balancing & Synchronization
Even distribution is not needed for correctness
Maintain roughly even-sized local S windows
Synchronize at the pipeline ends to manage the windows

12 FPGA Implementation
Tuple-based windows that fit into memory
Common clock signal; lock-step processing
Nested-loops join processing

13 Performance
Scalability on a multi-core CPU
Scalability on FPGAs (8 tuples/window)

14 Before we move on…
The soccer-join work focuses on sliding windows
How would their algorithm and implementation work for tumbling windows?
What if we did tumbling windows only?

15 Query-Aware Partitioning for Monitoring Massive Network Data Streams
OC-768 networks
- 100 million packets/sec
- 2 x 40 Gbit/sec
Query plan partitioning
- Issues: "heavy" operators, non-uniform resource consumption
Data stream partitioning

16 Let's partition the data…
This query computes per-flow packet summaries (by source/destination IP and port) for network monitoring
Round-robin partitioning: in the worst case, a single flow results in n partial flows

SELECT time, srcIP, destIP, srcPort, destPort,
       COUNT(*), SUM(len), MIN(timestamp), MAX(timestamp), ...
FROM TCP
GROUP BY time, srcIP, destIP, srcPort, destPort

17 And, we might want a HAVING…
With round-robin partitioning, no node can apply the HAVING clause locally
CPU and network load on the final aggregator is high

SELECT time, srcIP, destIP, srcPort, destPort,
       COUNT(*), SUM(len), MIN(timestamp), MAX(timestamp), ...
FROM TCP
GROUP BY time, srcIP, destIP, srcPort, destPort
HAVING OR_AGGR(flags) = ATTACK_PATTERN

18 So, let's partition better…
What about partitioning on srcIP, destIP, srcPort, destPort (i.e., partitioning flows)?
- Yeah! Nodes can compute the aggregates and apply the HAVING locally (see the sketch below)
- But what if I have more than one query?

SELECT time, srcIP, destIP, srcPort, destPort,
       COUNT(*), SUM(len), MIN(timestamp), MAX(timestamp), ...
FROM TCP
GROUP BY time, srcIP, destIP, srcPort, destPort
HAVING OR_AGGR(flags) = ATTACK_PATTERN
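A short Python sketch of why flow-key partitioning helps. The node count and hash choice are illustrative assumptions; the point is only that hashing the flow key sends every packet of a flow to the same node, while round robin scatters it.

import hashlib

NUM_NODES = 4  # illustrative cluster size

def flow_partition(pkt):
    """Hash the flow key so all packets of one flow land on one node."""
    key = f'{pkt["srcIP"]}|{pkt["destIP"]}|{pkt["srcPort"]}|{pkt["destPort"]}'
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % NUM_NODES

def round_robin_partition(pkt, seq):
    """Round robin ignores the flow key: one flow ends up on many nodes."""
    return seq % NUM_NODES

# With flow_partition, each node sees whole flows, so it can evaluate
# COUNT(*), SUM(len), MIN/MAX(timestamp) and the HAVING predicate locally
# and ship only the qualifying groups to the final aggregator.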

19 But I need to run lots of queries…
Large numbers of simultaneous queries are common (e.g., 50)
Subqueries place different requirements on partitioning
Dynamic repartitioning for each query?
- That's what parallel DBs do…
- But splitting 80 Gbit/sec requires specialized network hardware
- So: partition the stream once and only once…

20 Partitioning Limitations
The partitioning program runs in FPGAs (specialized hardware)
- TCP fields (src, dest IP): OK
- Fields from HTTP: not OK
Can't re-partition every time the workload changes

21 Query-Aware Partitioning
Analysis framework
- Determines an optimal partitioning
Partition-aware distributed query optimizer
- Takes advantage of existing partitions

22 Query-Aware Partitioning
Analysis framework
- Determines an optimal partitioning
Partition-aware distributed query optimizer
- Takes advantage of existing partitions
Compatible partitioning
- Maximizes the amount of data reduction done locally
- Formal definition of compatible partitioning
- Compatible partitioning for aggregations & joins

23 GS Uses Tumbling Windows (only)
The time attribute is ordered (increasing)

SELECT tb, srcIP, destIP, sum(len)
FROM PKT
GROUP BY time/60 as tb, srcIP, destIP

SELECT time, PKT1.srcIP, PKT1.destIP, PKT1.len + PKT2.len
FROM PKT1 JOIN PKT2
WHERE PKT1.time = PKT2.time and PKT1.srcIP = PKT2.srcIP and PKT1.destIP = PKT2.destIP

24 Query Example
flows: SELECT tb, srcIP, destIP, COUNT(*) as cnt
       FROM TCP
       GROUP BY time/60 as tb, srcIP, destIP
heavy_flows: SELECT tb, srcIP, max(cnt) as max_cnt
             FROM flows
             GROUP BY tb, srcIP
flow_pairs: SELECT S1.tb, S1.srcIP, S1.max_cnt, S2.max_cnt
            FROM heavy_flows S1, heavy_flows S2
            WHERE S1.srcIP = S2.srcIP and S1.tb = S2.tb+1
Figure Credit: Query-Aware Partitioning for Monitoring Massive Network Data Streams, Johnson, et al. SIGMOD 2008

25 Query Example
flows: SELECT tb, srcIP, destIP, COUNT(*) as cnt
       FROM TCP
       GROUP BY time/60 as tb, srcIP, destIP
heavy_flows: SELECT tb, srcIP, max(cnt) as max_cnt
             FROM flows
             GROUP BY tb, srcIP
flow_pairs: SELECT S1.tb, S1.srcIP, S1.max_cnt, S2.max_cnt
            FROM heavy_flows S1, heavy_flows S2
            WHERE S1.srcIP = S2.srcIP and S1.tb = S2.tb+1
Which partitioning scheme is optimal for each of the queries?
Figure Credit: Query-Aware Partitioning for Monitoring Massive Network Data Streams, Johnson, et al. SIGMOD 2008

26 Query Example
flows: SELECT tb, srcIP, destIP, COUNT(*) as cnt
       FROM TCP
       GROUP BY time/60 as tb, srcIP, destIP
heavy_flows: SELECT tb, srcIP, max(cnt) as max_cnt
             FROM flows
             GROUP BY tb, srcIP
flow_pairs: SELECT S1.tb, S1.srcIP, S1.max_cnt, S2.max_cnt
            FROM heavy_flows S1, heavy_flows S2
            WHERE S1.srcIP = S2.srcIP and S1.tb = S2.tb+1
How to reconcile potentially conflicting partitioning requirements?
Figure Credit: Query-Aware Partitioning for Monitoring Massive Network Data Streams, Johnson, et al. SIGMOD 2008

27 Query Example
flows: SELECT tb, srcIP, destIP, COUNT(*) as cnt
       FROM TCP
       GROUP BY time/60 as tb, srcIP, destIP
heavy_flows: SELECT tb, srcIP, max(cnt) as max_cnt
             FROM flows
             GROUP BY tb, srcIP
flow_pairs: SELECT S1.tb, S1.srcIP, S1.max_cnt, S2.max_cnt
            FROM heavy_flows S1, heavy_flows S2
            WHERE S1.srcIP = S2.srcIP and S1.tb = S2.tb+1
How can we use information about existing partitioning in a distributed query optimizer?
Figure Credit: Query-Aware Partitioning for Monitoring Massive Network Data Streams, Johnson, et al. SIGMOD 2008

28 What if we could only partition on destIP?
Figure Credit: Query-Aware Partitioning for Monitoring Massive Network Data Streams, Johnson, et al. SIGMOD 2008

29 Partition compatibility
Partitioning on (time/60, srcIP, destIP): the aggregation can be executed locally, then unioned
Partitioning on (srcIP, destIP, srcPort, destPort): can't aggregate locally

SELECT tb, srcIP, destIP, sum(len)
FROM PKT
GROUP BY time/60 as tb, srcIP, destIP

30 Partition compatibility
Partitioning on (time/60, srcIP, destIP): the aggregation can be executed locally, then unioned
Partitioning on (srcIP, destIP, srcPort, destPort): can't aggregate locally
Definition: P is compatible with Q if, for every time window, the output of Q equals the stream union of the outputs of Q running on the partitions produced by P

SELECT tb, srcIP, destIP, sum(len)
FROM PKT
GROUP BY time/60 as tb, srcIP, destIP

31 Should we partition on temporal attributes?
If we partition on temporal attributes:
- Processor allocation changes with time epochs
- May help avoid bad hash functions
- Might lead to incorrect results if using panes
- Tuples correlated in time tend to be correlated on the temporal attribute, which is bad for load balancing
Conclusion: exclude temporal attributes from the partitioning

32 What partitionings work for aggregation queries?
Group-bys on scalar expressions of source input attributes
- Ignore grouping on aggregates computed in lower-level queries
- Any subset of a compatible partitioning is also compatible

SELECT expr_1, expr_2, ..., expr_n
FROM STREAM_NAME
WHERE tup_predicate
GROUP BY temp_var, gb_var_1, ..., gb_var_m
HAVING group_predicate

33 What partitionings work for join queries?
Equality predicates on scalar expressions of source stream attributes
Any non-empty subset of a compatible partitioning is also compatible
Need to reconcile the partitioning of S and R (see the sketch below)

SELECT expr_1, expr_2, ..., expr_n
FROM STREAM1 AS S {LEFT|RIGHT|FULL} [OUTER] JOIN STREAM2 AS R
WHERE STREAM1.ts = STREAM2.ts
  and STREAM1.var_11 = STREAM2.var_21
  and ...
  and STREAM1.var_1k = STREAM2.var_2k
  and other_predicates
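A sketch of the compatibility test these two slides describe, assuming queries and partitionings are represented as plain Python sets of attribute (scalar-expression) names: a partitioning is treated as compatible with an aggregation when it is a non-empty subset of the query's non-temporal group-by expressions, and with a join when it is a non-empty subset of the non-temporal equi-join attributes. The representation and function names are illustrative, not the paper's framework.

def compatible_with_aggregation(partition_attrs, groupby_attrs,
                                temporal=frozenset({"time", "tb"})):
    """Partitioning must use only (non-temporal) group-by scalar expressions."""
    usable = set(groupby_attrs) - set(temporal)
    return bool(partition_attrs) and set(partition_attrs) <= usable

def compatible_with_join(partition_attrs, equijoin_attrs,
                         temporal=frozenset({"time", "tb", "ts"})):
    """Partitioning must use only the (non-temporal) equi-join attributes,
    applied to both inputs so matching tuples meet on the same node."""
    usable = set(equijoin_attrs) - set(temporal)
    return bool(partition_attrs) and set(partition_attrs) <= usable

# Example from the earlier query-example slides (names only):
flows_groupby = {"tb", "srcIP", "destIP"}
heavy_flows_groupby = {"tb", "srcIP"}
flow_pairs_join = {"tb", "srcIP"}

print(compatible_with_aggregation({"srcIP"}, flows_groupby))                   # True
print(compatible_with_aggregation({"srcIP", "destIP"}, heavy_flows_groupby))   # False
print(compatible_with_join({"srcIP"}, flow_pairs_join))                        # True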

34 Now, multiple queries…
tcp_flows: SELECT tb, srcIP, destIP, srcPort, destPort, COUNT(*), sum(len)
           FROM TCP
           GROUP BY time/60 as tb, srcIP, destIP, srcPort, destPort
flow_cnt: SELECT tb, srcIP, destIP, count(*)
          FROM tcp_flows
          GROUP BY tb, srcIP, destIP
Compatible partitioning for tcp_flows: {sc_exp(srcIP), sc_exp(destIP), sc_exp(srcPort), sc_exp(destPort)}
Compatible partitioning for flow_cnt: {sc_exp(srcIP), sc_exp(destIP)}
Result: {sc_exp(srcIP), sc_exp(destIP)}

35 Now, multiple queries…
tcp_flows: SELECT tb, srcIP, destIP, srcPort, destPort, COUNT(*), sum(len)
           FROM TCP
           GROUP BY time/60 as tb, srcIP, destIP, srcPort, destPort
flow_cnt: SELECT tb, srcIP, destIP, count(*)
          FROM tcp_flows
          GROUP BY tb, srcIP, destIP
Compatible partitioning for tcp_flows: {sc_exp(srcIP), sc_exp(destIP), sc_exp(srcPort), sc_exp(destPort)}
Compatible partitioning for flow_cnt: {sc_exp(srcIP), sc_exp(destIP)}
With many queries, the fully compatible partitioning set is likely to be empty
In that case, partition to minimize the cost of execution (see the sketch below)
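A toy sketch of reconciling per-query requirements: intersect the maximal compatible attribute sets (any subset of a maximal set is also compatible), and if the intersection is empty, fall back to the candidate with the lowest estimated cost. The cost model here is a placeholder, not the paper's optimizer.

def reconcile_partitionings(maximal_sets, cost):
    """maximal_sets: one maximal compatible attribute set per query."""
    common = set.intersection(*maximal_sets)
    if common:
        # Largest partitioning compatible with every query.
        return common
    # No fully compatible partitioning exists: pick the candidate with the
    # lowest estimated execution cost (placeholder cost function).
    return min(maximal_sets, key=cost)

tcp_flows = {"srcIP", "destIP", "srcPort", "destPort"}
flow_cnt = {"srcIP", "destIP"}
print(reconcile_partitionings([tcp_flows, flow_cnt], cost=len))  # {'srcIP', 'destIP'}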

36 Query Plan Transformation
Main idea: push the aggregation operator below the merge so aggregations execute independently on the partitions
Main idea: partial aggregates (think panes); see the sketch below
Figure Credit: Query-Aware Partitioning for Monitoring Massive Network Data Streams, Johnson, et al. SIGMOD 2008
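A sketch of the transformation's effect for algebraic aggregates such as COUNT and SUM: each partition computes partial aggregates independently (below the merge), and the merge step combines them, much like panes. Field names follow the flows query; the function names are illustrative.

from collections import defaultdict

def partial_aggregate(tuples):
    """Runs independently on each partition, below the merge."""
    acc = defaultdict(lambda: {"cnt": 0, "sum_len": 0})
    for t in tuples:
        g = acc[(t["tb"], t["srcIP"], t["destIP"])]
        g["cnt"] += 1
        g["sum_len"] += t["len"]
    return acc

def final_merge(partials):
    """Combines the per-partition partial aggregates at the merge operator."""
    out = defaultdict(lambda: {"cnt": 0, "sum_len": 0})
    for p in partials:
        for key, g in p.items():
            out[key]["cnt"] += g["cnt"]
            out[key]["sum_len"] += g["sum_len"]
    return out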

37 Performance
Figure Credit: Query-Aware Partitioning for Monitoring Massive Network Data Streams, Johnson, et al. SIGMOD 2008

