Sonata: Query-Driven Streaming Network Telemetry

Presentation transcript:

Sonata: Query-Driven Streaming Network Telemetry. Arpit Gupta (Princeton University); Rob Harrison, Marco Canini, Nick Feamster, Jennifer Rexford, Walter Willinger

Network Management. Detect network events in real time: outages, cyberattacks, congestion. [Diagram labels: Network Operator, Princeton, Google, Level3, Cogent]

Network Monitoring Requirements. Example: the attacker sends packets with Src: Victim, Dst: DNS; the DNS servers reply with Src: DNS, Dst: Victim; the victim receives DNS responses from many distinct sources. Flexible network monitoring is desired. Traffic: address, protocol, payload, device, location, … Metrics: jitter, distinct hosts, volume, delay, loss, …

Network Monitoring with Sonata. Tasks: Performance Diag.., Malware Detection, Fault Localization, DDoS Detection. Sonata: Flexibility (abstractions); Scalability (system, algorithms).

Building Sonata is Challenging. Programming abstractions: How to let network operators express queries for a wide range of monitoring tasks? Scalability: How to execute multiple queries for high-volume traffic in real time?

Packet as Tuple. Treat a packet as a tuple. Metadata: traversed path, queue size, number of bytes, … Header: source/destination address, protocol, ports, … Payload. Packet = (path, qsize, nbytes, …, sIP, dIP, proto, sPort, dPort, …, payload)
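
As a minimal sketch of the packet-as-tuple idea, the fields named on the slide can be written as a Python named tuple. The field types and the sample values are assumptions made for illustration; the slide only lists the field names, and the "…" fields are left out.

```python
from typing import NamedTuple, Tuple


class Packet(NamedTuple):
    # Metadata (types are assumed for illustration)
    path: Tuple[str, ...]   # traversed path
    qsize: int              # queue size
    nbytes: int             # number of bytes
    # Header fields
    sIP: str
    dIP: str
    proto: int
    sPort: int
    dPort: int
    # Payload
    payload: bytes


# Hypothetical DNS response packet (all values invented for the example)
pkt = Packet(path=("s1", "s2"), qsize=3, nbytes=120,
             sIP="10.0.0.53", dIP="192.0.2.7", proto=17,
             sPort=53, dPort=40000, payload=b"")
```

Because the packet is an ordinary tuple, a query operator can be written as a plain function over its fields, for example `pkt.sPort == 53`.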

Monitoring Tasks as Dataflow Queries. Detecting a DNS reflection attack: identify whether the number of DNS response messages from unique DNS servers to a single host exceeds a threshold (Th).

    victimIPs = pktStream
      .filter(p => p.udp.sport == 53)
      .map(p => (p.dstIP, p.srcIP))
      .distinct()
      .map((dstIP, srcIP) => (dstIP, 1))
      .reduce(keys=(dstIP,), sum)
      .filter((dstIP, count) => count > Th)

Express a wide range of network monitoring tasks in fewer than 20 lines of code.
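
To make the semantics of the query concrete, here is a rough Python equivalent that evaluates it over an in-memory list of packets. This is a sketch and not Sonata's API: the `Pkt` type, the function name `dns_reflection_victims`, and the sample packets are hypothetical.

```python
from collections import Counter, namedtuple

# Hypothetical packet record with only the fields the query touches
Pkt = namedtuple("Pkt", ["srcIP", "dstIP", "udp_sport"])


def dns_reflection_victims(pkt_stream, th):
    # .filter(p => p.udp.sport == 53)
    dns = (p for p in pkt_stream if p.udp_sport == 53)
    # .map(p => (p.dstIP, p.srcIP)).distinct()
    pairs = {(p.dstIP, p.srcIP) for p in dns}
    # .map((dstIP, srcIP) => (dstIP, 1)).reduce(keys=(dstIP,), sum)
    counts = Counter(dst for dst, _src in pairs)
    # .filter((dstIP, count) => count > Th)
    return {dst: c for dst, c in counts.items() if c > th}


pkts = [
    Pkt("8.8.8.8", "192.0.2.7", 53),
    Pkt("8.8.4.4", "192.0.2.7", 53),
    Pkt("1.1.1.1", "192.0.2.7", 53),
    Pkt("8.8.8.8", "192.0.2.7", 53),   # repeated source, counted once
    Pkt("8.8.8.8", "192.0.2.9", 53),
    Pkt("10.0.0.1", "192.0.2.9", 80),  # not a DNS response
]
victims = dns_reflection_victims(pkts, th=2)
```

With these sample packets only 192.0.2.7 receives responses from more than two distinct sources, so it is the only host reported.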

Where to Execute Monitoring Queries?

                CPUs                 Switches
    Match       Headers + Payload    Headers++
    Actions     Any                  add, subtract, bit operations
    State       O(Gb)                O(Mb)
    Speed       O(μs)                O(ns)

Prior systems: Gigascope [SIGMOD'03] and NetQRE [SIGCOMM'17] (CPUs); Univmon [SIGCOMM'16] and Marple [SIGCOMM'17] (switches). Can we use both switches and CPUs?

PISA* Processing Model. Programmable Parser -> Stages (each with Memory and ALU, holding Persistent State) -> Programmable Deparser. Packet Header Vector example: ip.src=1.1.1.1, ip.dst=2.2.2.2, ... (*RMT [SIGCOMM'13])

Mapping Dataflow to Data Plane Model. Pipeline: the processing units are Operators (dataflow) and Match-Action Tables (data plane); the structured data are Tuples (dataflow) and Packets (data plane). Which dataflow operators can be compiled to match-action tables?

Compiling Individual Operators: filter(p). Input: stream of elements. Output: elements satisfying predicate (p). Example from the DNS reflection query: .filter(p => p.udp.sport == 53) compiles to a match-action table with Match: udp.sport == 53.

Compiling Individual Operators: reduce(f). Input: stream of elements. Output: result of applying function f over all elements. Requires memory. Example from the DNS reflection query: .reduce(keys=(dstIP,), sum) compiles to two match-action tables. Table 1: Match *, Action idx = hash(m.dstIP). Table 2: Match *, Action stateful[idx] += 1.
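
The two-table pattern for reduce can be imitated in Python to show what the switch state looks like. This is only a sketch: the register-array size `NUM_SLOTS` is an assumed value, and CRC32 stands in for whatever hash function the switch provides.

```python
import zlib

NUM_SLOTS = 1024  # assumed size of the register array


def slot_index(dst_ip: str) -> int:
    # Table 1: idx = hash(m.dstIP); CRC32 is a stand-in hash (assumption)
    return zlib.crc32(dst_ip.encode()) % NUM_SLOTS


def run_reduce(dst_ips):
    stateful = [0] * NUM_SLOTS      # persistent state in the stage's memory
    for ip in dst_ips:
        idx = slot_index(ip)        # Table 1
        stateful[idx] += 1          # Table 2: stateful[idx] += 1
    return stateful


registers = run_reduce(["192.0.2.7", "192.0.2.7", "192.0.2.9"])
```

Two keys that hash to the same slot share a counter, so a fixed-size array can overcount. That is one reason the memory given to stateful operators matters in the slides that follow.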

Compiling a Query. Programmable Parser -> Stages: Filter -> Map -> D1 -> D2 -> Map -> R1 -> R2 -> Filter (with State) -> Programmable Deparser.

Query Partitioning Decisions. The slide repeats the DNS reflection query (victimIPs = pktStream.filter(p => p.udp.sport == 53) … .filter((dstIP, count) => count > Th)) four times, once per candidate partitioning. Query Planner: Resources? Reduce Load? (Tuples)

Query Partitioning ILP. Goal: minimize tuples sent to the stream processor. Constraints: PHV (Packet Header Vector) size, number of actions, stateful memory, total stages. [Diagram labels: Programmable Parser, Persistent State, Memory, ALU, Stages, Programmable Deparser]
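
The shape of the partitioning decision can be illustrated with a toy brute-force search over split points. This is not the ILP from the talk: it models only two of the listed constraints (total stages and stateful memory), and every per-operator cost and tuple count below is a hypothetical number chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Op:
    name: str
    stages: int       # pipeline stages used (hypothetical)
    state_kb: int     # stateful memory used (hypothetical)
    tuples_out: int   # tuples emitted per window (hypothetical)


def best_partition(ops, tuples_in, max_stages, max_state_kb):
    """Return (k, tuples_sent): ops[:k] run on the switch, the rest on the stream processor."""
    best = (0, tuples_in)           # k = 0: send every packet tuple
    stages = state = 0
    for k, op in enumerate(ops, start=1):
        stages += op.stages
        state += op.state_kb
        if stages > max_stages or state > max_state_kb:
            break                   # a longer prefix cannot fit either
        if op.tuples_out < best[1]:
            best = (k, op.tuples_out)
    return best


query = [
    Op("filter",   1, 0,    100_000),
    Op("map",      1, 0,    100_000),
    Op("distinct", 2, 4000,  20_000),
    Op("map",      1, 0,     20_000),
    Op("reduce",   2, 4000,   5_000),
    Op("filter",   1, 0,         10),
]
plan_small = best_partition(query, 1_000_000, max_stages=8, max_state_kb=5000)
plan_large = best_partition(query, 1_000_000, max_stages=8, max_state_kb=9000)
```

With the smaller memory budget only the operators up to distinct fit on the switch; with the larger budget the whole query fits and only the final results leave the switch.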

How Effective is Query Partitioning? 8 tasks, 100 Gbps workload. [Plot, log scale: O(1 B) vs. O(100 M)] Only one order of magnitude reduction.

Query Partitioning Limitations. Filter -> Map -> D1 -> D2 (distinct) -> Map -> R1 -> R2 (reduce) -> Filter. How can we reduce the memory footprint of stateful operators?

Observations: Nature of Monitoring Tasks. Most monitoring tasks are looking for needles in a haystack. [Diagram: DNS Reflection Attack Victims within All Hosts]

Observations: Possible to Reduce Memory Footprint. Detecting a DNS reflection attack, only considering the first 8 bits:

    victim = pktStream
      .map(dIP => dIP/8)
      .filter(p => p.udp.sPort == 53)
      .map(p => (p.dIP, p.sIP))
      .distinct()
      …

Queries at coarser levels have a smaller memory footprint.
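
A small Python sketch shows why the coarser query needs less state: keeping only the first 8 bits of the destination address collapses many keys into a few. The helper name `coarsen` and the sample addresses are assumptions made for illustration.

```python
import ipaddress


def coarsen(ip: str, prefix_len: int) -> str:
    """Keep only the first prefix_len bits of an IPv4 address (dIP/prefix_len)."""
    net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return str(net.network_address)


# Hypothetical destination addresses seen in one window
dst_ips = ["192.0.2.7", "192.0.2.9", "192.168.1.1", "10.1.2.3", "10.9.8.7"]
keys_32 = {coarsen(ip, 32) for ip in dst_ips}   # full addresses
keys_8 = {coarsen(ip, 8) for ip in dst_ips}     # dIP/8
```

Here five distinct /32 keys shrink to two /8 keys, so the stateful operators that follow the map have fewer entries to track.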

Observations: Possible to Preserve Query Accuracy. In the coarsened DNS reflection query (victim = pktStream.map(dIP => dIP/8) …), dIP is a hierarchical packet field. Query accuracy is preserved if the query is refined with hierarchical packet fields.

Iterative Query Refinement. First, execute the query at a coarser level: in the window ending at t+W, the PISA target runs Map (map(dIP=>dIP/8)) -> Filter -> Map -> D1 -> D2 -> Map -> R1 -> R2 -> Filter over the packet stream.

Iterative Query Refinement. Then, execute the query at finer level(s): in the window ending at t+2W, the PISA target runs Filter -> Filter -> Map -> D1 -> D2 -> Map -> R1 -> R2 -> Filter over the filtered packet stream. Smaller memory footprint at the cost of additional detection delay.
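
The two-step refinement can be sketched in Python as one coarse pass followed by one fine pass restricted to the prefixes that survived. This is an illustration under assumptions and not Sonata's implementation: the function names, the fixed /8 then /32 levels, the reuse of one threshold at both levels, and the sample traffic are all hypothetical.

```python
import ipaddress
from collections import Counter


def coarsen(ip, prefix_len):
    net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return str(net.network_address)


def victims_at_level(pairs, prefix_len, th, allowed=None, parent_len=None):
    """pairs: (dstIP, srcIP) of DNS responses seen in one window."""
    distinct_pairs = set()
    for dst, src in pairs:
        if allowed is not None and coarsen(dst, parent_len) not in allowed:
            continue  # dropped by the filter built from the coarser result
        distinct_pairs.add((coarsen(dst, prefix_len), src))
    counts = Counter(dst for dst, _src in distinct_pairs)
    return {dst for dst, c in counts.items() if c > th}


def refine(window1, window2, th):
    coarse = victims_at_level(window1, 8, th)                   # window ending at t+W
    return victims_at_level(window2, 32, th, allowed=coarse,    # window ending at t+2W
                            parent_len=8)


# Hypothetical traffic, identical in both windows
window = [("192.0.2.7", "8.8.8.8"), ("192.0.2.7", "8.8.4.4"),
          ("192.0.2.7", "1.1.1.1"), ("10.1.1.1", "8.8.8.8")]
result = refine(window, window, th=2)
```

The second pass only keeps state for destinations inside 192.0.0.0/8, and the answer arrives one window later than a single full-resolution query would give it. Reusing the threshold is safe in this sketch because a /8 sees at least as many distinct sources as any single host inside it.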

Query Planning Problem. Goal: minimize tuples sent to the stream processor. Given: queries, packet traces. Determine: Which packet field to use for iterative refinement? What levels to use for iterative refinement? What is the partitioning plan for each refined query? Approach: augment the partitioning ILP to compute both refinement and partitioning plans.

Sonata's Performance. 8 tasks, 100 Gbps workload. [Plot, log scale: O(1 B), O(100 M), O(100 K)] Up to 4 orders of magnitude reduction.

Summary. Key takeaways: Flexible (dataflow queries over packet tuples; fewer than 20 lines of code). Scalable (query refinement and partitioning algorithms; 4 orders of magnitude workload reduction). Future directions: monitor network-wide events; handle traffic dynamics. http://sonata.cs.princeton.edu https://github.com/sonata-princeton