Engine Design: Stream Operators Everywhere Theodore Johnson AT&T Labs – Research Contributors: Chuck Cranor Vladislav Shkapenyuk.


1 Engine Design: Stream Operators Everywhere Theodore Johnson AT&T Labs – Research johnsont@research.att.com Contributors: Chuck Cranor Vladislav Shkapenyuk Oliver Spatscheck

2 Early Data Reduction
Goal : Query high-speed links using inexpensive off-the-shelf servers.
–OC48 : 2 x 2.4 Gb/sec., 7 million packets/sec.
–OC192 : 2 x 7.2 Gb/sec., 21 million packets/sec.
Goal : Evaluate queries over every bit of every packet.
Problem : Not enough cycles in a second.
–3 GHz / 21 Mpackets/sec ≈ 142 cycles / packet
Solution : Push data reduction operators as far down the protocol stack as possible.
–Into the hardware if possible.
–View hardware bit twiddling as stream operators.
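The cycle budget on this slide can be reproduced with a few lines of arithmetic. This is only a sketch: the function name is illustrative, and it assumes a single 3 GHz core handling every packet, with the packet rates taken from the slide.

```python
def cycles_per_packet(clock_hz: float, packets_per_sec: float) -> float:
    """Average CPU cycles available per packet on a single core."""
    return clock_hz / packets_per_sec


CLOCK_HZ = 3e9  # 3 GHz, the clock rate quoted on the slide

# Packet rates quoted on the slide for each link type.
budget_oc48 = cycles_per_packet(CLOCK_HZ, 7e6)
budget_oc192 = cycles_per_packet(CLOCK_HZ, 21e6)

print(f"OC48:  {budget_oc48:.1f} cycles/packet")
print(f"OC192: {budget_oc192:.1f} cycles/packet")
```

Any per-packet work, including copying the packet out of the NIC, has to fit inside that budget, which is the motivation for discarding data as early as possible.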

3 Early Data Reduction in Gigascope
Gigascope was designed to monitor very high speed (optical) links using complex query sets.
Multiple levels of data reduction:
–Data reduction in the NIC : depends on NIC capabilities
  Snap length (projection)
  BPF filters
  Approximate filtering (bitmasks)
  Data reduction queries (replace the NIC run time system)
–Low level queries
  Run queries on kernel input buffers
  Preliminary filter for the query set
–Other possibilities ….
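Two of the NIC-level reductions listed above, snap length and bitmask filtering, can be illustrated in software. The sketch below is not Gigascope code. The helper names are hypothetical, and the example assumes a raw IPv4 header starting at byte 0 (no link-layer header), so the protocol field sits at byte 9. The mask test only inspects fixed byte offsets, so a later query is assumed to recheck the exact predicate.

```python
def snap(packet: bytes, snap_len: int) -> bytes:
    """Projection: keep only the first snap_len bytes of the packet."""
    return packet[:snap_len]


def bitmask_match(packet: bytes, offset: int, mask: bytes, value: bytes) -> bool:
    """Cheap filter: compare masked bytes at a fixed offset against a value."""
    window = packet[offset:offset + len(mask)]
    if len(window) < len(mask):
        return False  # packet too short to contain the tested bytes
    return all((b & m) == v for b, m, v in zip(window, mask, value))


def make_ipv4_header(protocol: int) -> bytes:
    """Hypothetical test helper: a 20-byte IPv4 header with only two fields set."""
    header = bytearray(20)
    header[0] = 0x45       # version 4, header length 5 words
    header[9] = protocol   # protocol field (17 = UDP, 6 = TCP)
    return bytes(header)


udp_packet = make_ipv4_header(17) + b"payload" * 40
tcp_packet = make_ipv4_header(6) + b"payload" * 40

# Keep UDP packets only, truncated to 134 bytes.
kept = [
    snap(p, 134)
    for p in (udp_packet, tcp_packet)
    if bitmask_match(p, offset=9, mask=b"\xff", value=b"\x11")
]
print(len(kept), len(kept[0]))
```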

4 Example: Router Monitoring
[Diagram. Labels, in transcript order:]
Router, Network Tap
Select Stream
Network Interface Card : snap length (projection), approximate filter (selection), selection/projection/aggregation queries (replace run time system)
Circular Buffer
Kernel : Libpcap / BPF filters
Low Level Queries : selection/projection/aggregation, pre-filter
High Level Queries

5 Stream Operators
Problem : Great heterogeneity in the specifics of manipulating the hardware mechanism
–Stream selection vs. NIC filters vs. kernel filters, etc.
–Programmable NIC vs. bit-twiddling NIC vs. non-programmable NIC, etc.
Solution :
–Define a set of stream operators to evaluate the stream query.
  Selection, projection, (partial) aggregation
  Merge, join, reorder ?
–Define hardware capabilities as the types of queries they can execute
–Multiple query optimization over the query set
  Low level query nodes feed multiple user queries
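One way to picture selection, projection, and partial aggregation as uniform stream operators is as composable Python generators. This is a minimal sketch, not the Gigascope implementation: the record layout, the function names, and the evict-everything policy for the bounded group table are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Iterator, Sequence, Tuple

Record = Dict[str, int]


def select(stream: Iterable[Record], pred: Callable[[Record], bool]) -> Iterator[Record]:
    """Selection: pass through only the records that satisfy pred."""
    return (r for r in stream if pred(r))


def project(stream: Iterable[Record], fields: Sequence[str]) -> Iterator[Record]:
    """Projection: keep only the named fields of each record."""
    return ({f: r[f] for f in fields} for r in stream)


def partial_aggregate(stream: Iterable[Record], key: str, value: str,
                      max_groups: int) -> Iterator[Tuple[int, int]]:
    """Partial aggregation: sum value per key in a bounded table.

    When the table is full and a new key arrives, all partial sums are
    flushed downstream (an assumed, deliberately simple eviction policy).
    """
    table: Dict[int, int] = {}
    for r in stream:
        k = r[key]
        if k not in table and len(table) >= max_groups:
            yield from table.items()
            table.clear()
        table[k] = table.get(k, 0) + r[value]
    yield from table.items()


def final_aggregate(partials: Iterable[Tuple[int, int]]) -> Dict[int, int]:
    """High-level query: combine partial sums into exact totals."""
    totals: Counter = Counter()
    for k, v in partials:
        totals[k] += v
    return dict(totals)


packets = [
    {"protocol": 17, "destIP": 1, "len": 100},
    {"protocol": 6, "destIP": 1, "len": 40},
    {"protocol": 17, "destIP": 2, "len": 50},
    {"protocol": 17, "destIP": 3, "len": 10},
    {"protocol": 17, "destIP": 1, "len": 5},
]

udp = select(packets, lambda r: r["protocol"] == 17)
slim = project(udp, ["destIP", "len"])
totals = final_aggregate(partial_aggregate(slim, "destIP", "len", max_groups=2))
print(totals)
```

Because the low-level operator emits partial sums rather than exact totals, it can run with a small fixed-size table, and the high-level query still produces the exact per-key result.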

6 Example (network monitoring)
select timestamp, sourceIP, destIP, source_port, dest_port, len, total_length, gp_header from GAMEPROTOCOL where sample_hash[50, sourceIP, destIP] and protocol=17 and offset=0
NIC : snap_len = 134 (projection)
Pre-filter : protocol=17 and offset=0
Low-level query : select timestamp, sourceIP, destIP, source_port, dest_port, len, total_length, gp_header from GAMEPROTOCOL where sample_hash[50, sourceIP, destIP] and protocol=17 and offset=0
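The three-way split of this query (NIC snap length, pre-filter, low-level query) can be mimicked in a few functions. This is a sketch under stated assumptions: packets are modeled as dictionaries with already-parsed fields plus a `raw` byte string, and the slide does not define the semantics of `sample_hash`, so the version here (keep a flow when a CRC32 of its fields, modulo 100, falls below the threshold) is purely a guess for illustration.

```python
import zlib
from typing import Any, Dict, Iterable, Iterator, Optional

Packet = Dict[str, Any]

SNAP_LEN = 134  # from the slide: NIC projection
FIELDS = ("timestamp", "sourceIP", "destIP", "source_port", "dest_port",
          "len", "total_length", "gp_header")


def nic_snap(pkt: Packet) -> Packet:
    """NIC stage: truncate the raw bytes to the snap length."""
    out = dict(pkt)
    out["raw"] = pkt["raw"][:SNAP_LEN]
    return out


def prefilter(pkt: Packet) -> bool:
    """Pre-filter stage: the cheap predicates from the where clause."""
    return pkt["protocol"] == 17 and pkt["offset"] == 0


def sample_hash(threshold: int, *fields: Any) -> bool:
    """ASSUMED semantics: deterministic hash-based sampling per flow."""
    key = ",".join(str(f) for f in fields).encode()
    return zlib.crc32(key) % 100 < threshold


def low_level_query(pkt: Packet, threshold: int = 50) -> Optional[Packet]:
    """Low-level query: full predicate, then the select list."""
    if not (sample_hash(threshold, pkt["sourceIP"], pkt["destIP"])
            and pkt["protocol"] == 17 and pkt["offset"] == 0):
        return None
    return {f: pkt[f] for f in FIELDS}


def run(packets: Iterable[Packet], threshold: int = 50) -> Iterator[Packet]:
    """Chain the three stages in the order shown on the slide."""
    for pkt in packets:
        pkt = nic_snap(pkt)
        if not prefilter(pkt):
            continue
        row = low_level_query(pkt, threshold)
        if row is not None:
            yield row
```

Hashing on `sourceIP` and `destIP` means every packet of a given address pair gets the same keep-or-drop decision, which is one plausible reason for sampling by hash rather than at random.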

7 Other Operators?
Merge : Some NICs deliver packets out of order …
–Optical links are not duplex
[Diagram: two NICs, each with an In Buffer and an Out Buffer, feed a Stream Merge operator that turns almost ordered streams into an ordered stream, by timestamp.]
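A merge operator of this kind can be sketched with the standard `heapq` module. The slide does not say how Gigascope bounds the disorder, so this example assumes that no tuple arrives more than `window` positions late within its own stream. The function names and the `(timestamp, label)` tuple layout are illustrative only.

```python
import heapq
from typing import Iterable, Iterator, Tuple

Item = Tuple[int, str]  # (timestamp, label)


def reorder(stream: Iterable[Item], window: int) -> Iterator[Item]:
    """Turn an almost ordered stream into an ordered one.

    Assumes no item is more than `window` positions out of place, so a
    heap holding window + 1 items is enough to emit in timestamp order.
    """
    heap: list = []
    for item in stream:
        heapq.heappush(heap, item)
        if len(heap) > window:
            yield heapq.heappop(heap)
    while heap:
        yield heapq.heappop(heap)


def merge_by_timestamp(*streams: Iterable[Item]) -> Iterator[Item]:
    """Merge already-ordered streams into one timestamp-ordered stream."""
    return heapq.merge(*streams)


# One stream per direction of the link, since each NIC sees one direction.
inbound = [(1, "in"), (4, "in"), (3, "in"), (7, "in")]   # slightly out of order
outbound = [(2, "out"), (5, "out"), (6, "out")]

merged = list(merge_by_timestamp(reorder(inbound, window=1),
                                 reorder(outbound, window=1)))
print([ts for ts, _ in merged])
```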

8 Summary
Early data reduction is critical for monitoring very high-speed streams
–Selection, projection, aggregation.
Use stream operators to mask the complexity and heterogeneity of hardware / kernel data reduction.
Issues :
–Multiple query optimization
–Push more complex operators down the stack?
  Join? Stratified sampling? Sketches?
–Optimization at low level / hardware level
  Approximate filters
  Avoid duplicate filters. Where to place them?
  Re-organization when the query set changes.
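The "avoid duplicate filters" issue can be made concrete with a toy factoring step: predicates shared by every query in the set are evaluated once in a common pre-filter, and each query keeps only its residual predicates. This is a simplified sketch, not the optimizer described in the talk. It treats each where clause as a set of conjuncts represented as strings, and the second query (`q2`) is hypothetical.

```python
from typing import Dict, Set, Tuple


def factor_common_filter(
    queries: Dict[str, Set[str]]
) -> Tuple[Set[str], Dict[str, Set[str]]]:
    """Split each query's conjuncts into a shared pre-filter and a residual."""
    common = set.intersection(*queries.values()) if queries else set()
    residual = {name: preds - common for name, preds in queries.items()}
    return common, residual


queries = {
    # Conjuncts of the example query from slide 6.
    "q1": {"protocol=17", "offset=0", "sample_hash[50, sourceIP, destIP]"},
    # Hypothetical second query over the same stream.
    "q2": {"protocol=17", "offset=0", "dest_port=53"},
}

prefilter_preds, residual_preds = factor_common_filter(queries)
print(sorted(prefilter_preds))
print({name: sorted(preds) for name, preds in residual_preds.items()})
```

When the query set changes, the shared set has to be recomputed, which is one face of the re-organization issue listed above.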

