EPFL 25Apr ’12. Software-Defined Networking (SDN) 2 Enables new functionality through programmability … Third-party program 25 Apr 2012NSDI'12.

Slides:



Advertisements
Similar presentations
Openflow App Testing Chao SHI, Stephen Duraski. Motivation Network is still a complex stuff ! o Distributed mechanism o Complex protocol o Large state.
Advertisements

SDN Applications Jennifer Rexford Princeton University.
Frenetic: A High-Level Language for OpenFlow Networks Nate Foster, Rob Harrison, Matthew L. Meola, Michael J. Freedman, Jennifer Rexford, David Walker.
Composing Software-Defined Networks Princeton*Cornell^ Chris Monsanto*, Joshua Reich* Nate Foster^, Jen Rexford*, David Walker*
Ryu Book Chapter 1 Speaker: Chang, Cheng-Yu Date: 25/Nov./
Slick: A control plane for middleboxes Bilal Anwer, Theophilus Benson, Dave Levin, Nick Feamster, Jennifer Rexford Supported by DARPA through the U.S.
A SOFT Way for OpenFlow Interoperability Testing Marco Canini TU Berlin / T-Labs [CoNEXT’12]
A SOFT Way for OpenFlow Interoperability Testing Maciej Kuźniar, Peter Perešini, Marco Canini†, Daniele Venzano, Dejan Kostić‡ EPFL †TU Berlin/T-Labs ‡IMDEA.
An Overview of Software-Defined Network Presenter: Xitao Wen.
OpenFlow Costin Raiciu Using slides from Brandon Heller and Nick McKeown.
Leveraging SDN Layering to Systematically Troubleshoot Networks Brandon Heller ★ Colin Scott  Nick McKeown ⌘ Scott Shenker  Andreas Wundsam § Hongyi.
VeriCon: Towards Verifying Controller Programs in SDNs (PLDI 2014) Thomas Ball, Nikolaj Bjorner, Aaron Gember, Shachar Itzhaky, Aleksandr Karbyshev, Mooly.
OpenFlow-Based Server Load Balancing GoneWild
Consensus Routing: The Internet as a Distributed System John P. John, Ethan Katz-Bassett, Arvind Krishnamurthy, and Thomas Anderson Presented.
Troubleshooting SDNs Peyman Kazemian. Why SDN Troubleshooting SDN decouples software (control plane) from hardware (data plane). Opens doors for innovation.
Scalable Network Virtualization in Software-Defined Networks
Scalable Flow-Based Networking with DIFANE 1 Minlan Yu Princeton University Joint work with Mike Freedman, Jennifer Rexford and Jia Wang.
An Overview of Software-Defined Network
Data Plane Verification. Background: What are network policies Alice can talk to Bob Skype traffic must go through a VoIP transcoder All traffic must.
Languages for Software-Defined Networks Nate Foster, Arjun Guha, Mark Reitblatt, and Alec Story, Cornell University Michael J. Freedman, Naga Praveen Katta,
OpenFlow Switch Limitations. Background: Current Applications Traffic Engineering application (performance) – Fine grained rules and short time scales.
An Overview of Software-Defined Network Presenter: Xitao Wen.
CN2668 Routers and Switches Kemtis Kunanuraksapong MSIS with Distinction MCTS, MCDST, MCP, A+
Formal checkings in networks James Hongyi Zeng with Peyman Kazemian, George Varghese, Nick McKeown.
Enabling Innovation Inside the Network Jennifer Rexford Princeton University
Software Defined Networking COMS , Fall 2013 Instructor: Li Erran Li SDNFall2013/
Network Redundancy Multiple paths may exist between systems. Redundancy is not a requirement of a packet switching network. Redundancy was part of the.
Composing Software Defined Networks Jennifer Rexford Princeton University With Joshua Reich, Chris Monsanto, Nate Foster, and.
OpenFlow-Based Server Load Balancing GoneWild Author : Richard Wang, Dana Butnariu, Jennifer Rexford Publisher : Hot-ICE'11 Proceedings of the 11th USENIX.
Computer Security: Principles and Practice First Edition by William Stallings and Lawrie Brown Lecture slides by Lawrie Brown Chapter 8 – Denial of Service.
Frenetic: A Programming Language for Software Defined Networks Jennifer Rexford Princeton University Joint work with Nate.
Software-Defined Networks Jennifer Rexford Princeton University.
PA3: Router Junxian (Jim) Huang EECS 489 W11 /
VeriFlow: Verifying Network-Wide Invariants in Real Time
Languages for Software-Defined Networks Nate Foster, Michael J. Freedman, Arjun Guha, Rob Harrison, Naga Praveen Katta, Christopher Monsanto, Joshua Reich,
Reasoning about Software Defined Networks Mooly Sagiv Tel Aviv University Thursday (Physics 105) Monday Schrieber.
Copyright 2013 Open Networking User Group. All Rights Reserved Confidential Not For Distribution Programming Abstractions for Software-Defined Networks.
Central Control over Distributed Routing fibbing.net SIGCOMM Stefano Vissicchio 18th August 2015 UCLouvain Joint work with O. Tilmans (UCLouvain), L. Vanbever.
4061 Session 25 (4/17). Today Briefly: Select and Poll Layered Protocols and the Internets Intro to Network Programming.
Jennifer Rexford Princeton University MW 11:00am-12:20pm Measurement COS 597E: Software Defined Networking.
Programming Languages for Software Defined Networks Jennifer Rexford and David Walker Princeton University Joint work with the.
Pyretic Programming.
Jennifer Rexford Princeton University MW 11:00am-12:20pm SDN Programming Languages COS 597E: Software Defined Networking.
Jennifer Rexford Princeton University MW 11:00am-12:20pm Data-Plane Verification COS 597E: Software Defined Networking.
NetEgg: Scenario-based Programming for SDN Policies Yifei Yuan, Dong Lin, Rajeev Alur, Boon Thau Loo University of Pennsylvania 1.
Jennifer Rexford Princeton University MW 11:00am-12:20pm Testing and Debugging COS 597E: Software Defined Networking.
Header Space Analysis: Static Checking for Networks Broadband Network Technology Integrated M.S. and Ph.D. Eun-Do Kim Network Standards Research Section.
Authors: Mark Reitblatt, Nate Foster, Jennifer Rexford, Cole Schlesinger, David Walker Presenter: Byungkwon Choi Abstractions for Network Update INA.
SDN controllers App Network elements has two components: OpenFlow client, forwarding hardware with flow tables. The SDN controller must implement the network.
Programming SDN 1 Problems with programming with POX.
Xin Li, Chen Qian University of Kentucky
Multi Node Label Routing – A layer 2.5 routing protocol
CIS 700-5: The Design and Implementation of Cloud Networks
SDN Network Updates Minimum updates within a single switch
Programming SDN Newer proposals Frenetic (ICFP’11) Maple (SIGCOMM’13)
What I Learned From Mininet
Software defined networking: Experimental research on QoS
University of Maryland College Park
The DPIaaS Controller Prototype
Martin Casado, Nate Foster, and Arjun Guha CACM, October 2014
NOX: Towards an Operating System for Networks
Abstractions for Model Checking SDN Controllers
Software Defined Networking (SDN)
DDoS Attack Detection under SDN Context
Software Defined Networking
Languages for Software-Defined Networks
Programmable Networks
Delivery, Forwarding, and Routing of IP Packets
Elmo Muhammad Shahbaz Lalith Suresh, Jennifer Rexford, Nick Feamster,
Control-Data Plane Separation
Presentation transcript:

EPFL 25Apr ’12

Software-Defined Networking (SDN) 2 Enables new functionality through programmability … Third-party program 25 Apr 2012NSDI'12

… at the risk of bugs 325 Apr 2012NSDI'12

Software Faults Will make communication unreliable Major hurdle for success of SDN 25 Apr 2012NSDI'124 We need effective ways to test SDN networks This talk: automatically testing OpenFlow Apps

OpenFlow Quick OpenFlow 101 Host BHost A Switch 2 Flow Table Rule 1 Rule 2 Rule N Switch 1 Packet OpenFlow program Controller Install rule; forward packet Default: forward to controller 5 MatchActionsCounters Dst: Host BFwd: Switch 2pkts / bytes System is distributed and asynchronous  can misbehave under corner cases Execute packet_in event handler 25 Apr 2012NSDI'12

Bugs in OpenFlow Apps OpenFlow program Host BHost A Switch 2 Controller Switch 1 Packet Install rule ? 6 Goal: systematically test possible behaviors to detect bugs Install rule Delayed! 25 Apr 2012NSDI'12 Drop packet Inconsistent distributed state!

State-space exploration via Model Checking (MC) Systematically Testing OpenFlow Apps 7 Target system Unmodified OpenFlow program Complex environment Environment model Switch 1 Switch 2 Host AHost B Carefully-crafted streams of packets Many orderings of packet arrivals and events 25 Apr 2012NSDI'12

Scalability Challenges Huge space of possible packets Huge space of possible event orderings Data-plane drivenComplex network behavior Enumerating all inputs and event orderings is intractable Equivalence classes of packets Domain-specific search strategies 825 Apr 2012NSDI'12

Network topology Correctness properties (e.g., no loops) Traces of property violations InputOutput NICE State-space search 9 N o bugs I n C ontroller E xecution NICE found 11 bugs in 3 real OpenFlow Apps Unmodified OpenFlow program 25 Apr 2012NSDI'12

Network topology Correctness properties (e.g., no loops) Traces of property violations InputOutput NICE N o bugs I n C ontroller E xecution Unmodified OpenFlow program State-space search 1025 Apr 2012NSDI'12

Model Checking State-Space Model 11 State 0 State 2 State 6 State 7 State 4 State 9 State 1 State 3 State 5 State 8 25 Apr 2012NSDI'12

System State 25 Apr 2012NSDI'1212 State Controller (global variables) Environment: Switches (flow table, OpenFlow agent) Simplified switch model End-hosts (network stack) Simple clients/servers Communication channels (in-flight pkts)

Transition System 13 State 0 State 2 State 6 State 7 State 4 State 9 State 1 State 3 ctrl packet_in(pkt A) host send switch process_of switch process_pkt ctrl packet_in(pkt B) Run actual packet_in handler State 5 State 8 25 Apr 2012NSDI'12 Data-dependent transitions!

Combating Huge Space of Packets 14 Packet arrival handler is dst broadcast? Flood packet Install rule and forward packet dst in mactable? Equivalence classes of packets: 1.Broadcast destination 2.Unknown unicast destination 3.Known unicast destination yes no yes Code itself reveals equivalence classes of packets pkt 25 Apr 2012NSDI'12

Code Analysis: Symbolic Execution (SE) 15 Packet arrival handler is λ.dst broadcast? yesno Symbolic packet λ Flood packet λ.dst ∈ {Broadcast} λ.dst in mactable? no yes λ.dst ∉ {Broadcast} Install rule and forward packet λ.dst ∉ {Broadcast} ∧ λ.dst ∉ mactable λ.dst ∉ {Broadcast} ∧ λ.dst ∈ mactable 1 path = 1 equivalence class of packets = 1 packet to inject Infeasible from initial state 25 Apr 2012NSDI'12

New packets Enable new transitions: host / send(pkt B) host / send(pkt C) Symbolic execution of packet_in handler State 0 State 1 Controller state 1 State 2 host discover_packets State 3 host send(pkt B) State 4 host send(pkt C) discover_packets transition: Combining SE with Model Checking 16 Controller state changes host send(pkt A) 25 Apr 2012NSDI'12

Combating Huge Space of Orderings 17 MC + SE PKT-SEQ FLOW-IR NO-DELAY UNUSUAL OpenFlow-specific search strategies for up to 20x state-space reduction: 25 Apr 2012NSDI'12

Network topology Traces of property violations InputOutput NICE N o bugs I n C ontroller E xecution Unmodified OpenFlow program State-space search Correctness properties (e.g., no loops) 1825 Apr 2012NSDI'12

Specifying App Correctness Library of common properties – No forwarding loops – No black holes – Direct paths (no unnecessary flooding) – Etc… Correctness is app-specific in nature 1925 Apr 2012NSDI'12

API to Define App-Specific Properties 25 Apr 2012NSDI'1220 State 0 State 1 ctrl packet_in(pkt A) def init(): init local vars register(“packet_in”) def on_packet_in(): check system-wide state Register callbacks to observe transitions Execute after transitions

Prototype Implementation Built a NICE prototype in Python Target the Python API of NOX 2125 Apr 2012NSDI'12 Unmodified OpenFlow program Stub NOX API NICE Controller state & transitions

Experiences Tested 3 unmodified NOX OpenFlow Apps – MAC-learning switch – LB: Web server load balancer [Wang et al., HotICE’11] – TE: Energy-aware traffic engineering [CoNEXT’11] Setup – Iterated with 1, 2 or 3-switch topologies; 1,2,… pkts – App-specific properties LB: All packets of same request go to same server replica TE: Use appropriate path based on network load 2225 Apr 2012NSDI'12

Results NICE found 11 property violations  bugs – Few secs to find 1 st violation of each bug (max 30m) – Few simple mistakes (not freeing buffered packets) – 3 insidious bugs due to network race conditions NICE makes corner cases as likely as normal cases 25 Apr 2012NSDI'1223

Conclusions 24 NICE automates the testing of OpenFlow Apps Explores state-space efficiently Tests unmodified NOX applications Helps to specify correctness Finds bugs in real applications SDN: a new role for software tool chains to make networks more dependable. 25 Apr 2012NSDI'12 Thank you! Questions? NICE is a step in this direction!

Backup slides 2525 Apr 2012NSDI'12

Related Work (1/2) Model Checking – SPIN [Holzmann’04], Verisoft [Godefroid’97], JPF [Visser’03] – Musuvathi’04, MaceMC [Killian’07], MODIST [Yang’09] Symbolic Execution – DART [Godefroid’05], Klee [Cadar’08], Cloud9 [Bucur’11] MC+SE: Khurshid’03 25 Apr 2012NSDI'1226

Related Work (2/2) OpenFlow programming – Frenetic [Foster’11], NetCore [Monsanto’12] Network testing – FlowChecker [Al-Shaer’10] – OFRewind [Wundsam’11] – Anteater [Mai’11] – Header Space Analysis [Kazemian’12] 25 Apr 2012NSDI'1227

Micro-benchmark of full state-space search 25 Apr 2012NSDI'1228 Pyswitch MAC-learning switch Switch 1 Switch 2 Host A Host B Single 2.6 GHz core 64 GB RAM PingsTransitionsUnique statesTime [s] 312,8015, [s] 4391,091131,51536 [m] 514,052,8534,161,33530 [h] Concurrent “Layer-2 ping” Compared with SPIN: 7 pings  out of memory JPF is 5.5 x slower

State space reduction by heuristics 25 Apr 2012NSDI'1229 Pyswitch MAC-learning switch Switch 1 Switch 2 Host A Host B Single 2.6 GHz core 64 GB RAM Compared to base model checking

Transitions # / run time [s] to 1 st property violation of each bug 25 Apr 2012NSDI'1230

OpenFlow Switch Model 25 Apr 2012NSDI'1231 State 1 State 2 install Rule 1 Flow Table Rule 2 Rule 1 1) Rule 1 Flow Table 2) Rule 2 Rule 1 Flow Table 3) ≠ Rule 2 Flow Table (4 (5 switch process_of State 4 install Rule 2 State 5 install Rule 1 State 3 install Rule 2 Example: adding Rule 1 and Rule 2

MAC-learning switch (3 bugs) 32 OpenFlow program Host A A->B | port A->B | port 1 Host B BUG-I: Host unreachable after moving 3 25 Apr 2012NSDI'12

MAC-learning switch (3 bugs) 33 OpenFlow program Host A B->A | port B->A | port 2 Host B BUG-I: Host unreachable after moving 3 BUG-II: Delayed direct path A->B | port 2A->B | port 1 25 Apr 2012NSDI'12

MAC-learning switch (3 bugs) 34 OpenFlow program Host A 1221 BUG-I: Host unreachable after moving 3 BUG-II: Delayed direct path BUG-III: Excess flooding Apr 2012NSDI'12

Web Server Load Balancer (4 bugs) 35 OpenFlow program Host A 13 Host B 24 Server 1 Server 2 BUG-IV: Next TCP packet always dropped after reconfiguration BUG-V: Some TCP packets dropped after reconfiguration BUG-VI: ARP packets forgotten during address resolution BUG-VII: Duplicate SYN packets during transitions Custom property: all packets of same request go to same server replica 25 Apr 2012NSDI'12

Energy-Efficient TE (4 bugs) Precompute 2 paths per – Always-on and on-demand Make online decision: – Use the smallest subset of network elements that satisfies current demand 36 BUG-VIII: The first packet of a new flow is dropped BUG-IX: The first few packets of a new flow can be dropped BUG-X: Only on-demand routes used under high load BUG-XI: Packets can be dropped when the load reduces 25 Apr 2012NSDI'12

Results Why were mistakes easy to make? – Centralized programming model only an abstraction Why the programmer could not detect them? – Bugs don’t always manifest – TCP masks transient packet loss – Platform lacks runtime checks Why NICE easily found them? – Makes corner cases as likely as normal cases 25 Apr 2012NSDI'1237

Example: MAC-learning Switch 1 ctrl_state = {} # State of the controller is a global variable (a hashtable) 2 def packet_in(sw_id, inport, pkt, bufid): # Handles packet arrivals 3 mactable = ctrl_state[sw_id] 4 is_bcast_src = pkt.src[0] & 1 5 is_bcast_dst = pkt.dst[0] & 1 6 if not is_bcast_src: 7 mactable[pkt.src] = inport 8 if (not is_bcast_dst) and (mactable.has_key(pkt.dst)): 9 outport = mactable[pkt.dst] 10 if outport != inport: 11 match = {DL SRC: pkt.src, DL DST: pkt.dst, DL TYPE: pkt.type, IN PORT: inport} 12 actions = [OUTPUT, outport] 13 install_rule(sw_id, match, actions, soft_timer=5, hard_timer=PERMANENT) 14 send_packet_out(sw_id, pkt, bufid) 15 return 16 flood_packet(sw_id, pkt, bufid) 3825 Apr 2012NSDI'12

Causes of Corner Cases (Examples) Multiple packets of a flow reach controller No atomic update across multiple switches Previously-installed rules limit visibility Composing functions that affect same packets Assumptions about end-host protocols & SW 3925 Apr 2012NSDI'12