Troubleshooting SDNs
Peyman Kazemian


Why SDN Troubleshooting
- SDN decouples software (control plane) from hardware (data plane).
- Opens doors for innovation in networks.
- More competition in the industry. Brings down capex.
- Makes network management easier and hence reduces opex?
- The SDN software stack is a complex, asynchronous distributed system, which introduces new troubleshooting challenges.
- Software comes from one vendor and hardware from another. What happens when things break? Who is to blame?

Why SDN Troubleshooting
SDN gives us a unique opportunity for systematic troubleshooting:
- It decouples the control plane from the data plane.
- State changes are pushed from a logically centralized location.
- The state of the network is easier to access.
- It provides a clear abstraction for control plane functionality.
Result: richer troubleshooting techniques.

SDN Architecture
[Figure: the SDN stack as alternating state and code layers. Policy -> App -> Logical View -> Network Hypervisor -> Physical View -> Network OS -> Device State -> Firmware -> Hardware.]
Bug = mistranslation between different layers.

Reactive Troubleshooting of SDNs
One possible binary search to detect reactively where the error happens.
[Figure: decision tree of "= ?" comparisons with Yes/No branches over the stack from [Operator Intent] to [Actual Behavior]: Policy, “Apps”, Logical View, NetHypervisor, Physical View, NetOS, Device State, Firmware, Hardware.]
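The binary search over layers can be sketched in a few lines. This is only an illustration, not code from the talk: the function name `first_bad_layer` and the `matches_intent` callback are hypothetical, and the sketch assumes that a mistranslation propagates downward, so that once one state layer is wrong every layer below it is wrong too.

```python
from typing import Callable, Optional, Sequence

# State layers named on the slides, ordered from operator intent to actual behavior.
LAYERS = ["Policy", "Logical View", "Physical View", "Device State", "Hardware"]


def first_bad_layer(
    layers: Sequence[str],
    matches_intent: Callable[[str], bool],
) -> Optional[str]:
    """Return the first layer whose state no longer matches the operator's intent.

    Assumption (hypothetical): if a layer is wrong, all layers below it are wrong,
    so the "matches" answers look like True, ..., True, False, ..., False.
    """
    lo, hi = 0, len(layers)  # the first bad layer, if any, is in layers[lo:hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if matches_intent(layers[mid]):
            lo = mid + 1  # everything up to mid is fine; look further down
        else:
            hi = mid  # mid is bad; the first bad layer is mid or above
    return layers[lo] if lo < len(layers) else None


# Toy scenario: the translation into Device State went wrong.
broken = {"Device State", "Hardware"}
print(first_bad_layer(LAYERS, lambda layer: layer not in broken))  # Device State
```

With five state layers the search needs at most three comparisons. The code layer directly above the reported state layer is then the first suspect.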

Proactive Troubleshooting of SDNs
One possible binary search to detect proactively where the error happens.
[Figure: decision tree of "= ?" comparisons with Yes/No branches over the stack from [Operator Intent] to [Actual Behavior]: Policy, “Apps”, Logical View, NetHypervisor, Physical View, NetOS, Device State, Firmware, Hardware.]

RESEARCH WORKS ON SDN TROUBLESHOOTING

Troubleshooting SDNs
[Figure: the stack from [Operator Intent] to [Actual Behavior], with "= ?" comparisons between layers.]
- NDB (Where is the debugger for my software-defined network?, HotSDN’12)
- ATPG (Automatic Test Packet Generation, CoNEXT’12)

Troubleshooting SDNs
[Figure: the stack from [Operator Intent] to [Actual Behavior], with "= ?" comparisons between layers.]
- Anteater (Debugging the data plane with Anteater, SIGCOMM’11)
- HSA (Header Space Analysis: static checking for networks, NSDI’12)
- VeriFlow (Verifying network-wide invariants in real time, HotSDN’12)
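To give a feel for what a static data-plane check does, here is a deliberately tiny sketch. It is not the algorithm used by Anteater, HSA, or VeriFlow: it assumes exact-match forwarding on a single destination, and the table layout, the `"deliver"` marker, and the function name `trace` are all made up for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical table layout: switch -> {destination -> next hop or "deliver"}.
Tables = Dict[str, Dict[str, str]]


def trace(tables: Tables, start: str, dst: str) -> Tuple[str, List[str]]:
    """Follow traffic for dst from start and classify what happens to it."""
    path, node = [start], start
    while True:
        next_hop = tables.get(node, {}).get(dst)
        if next_hop is None:
            return "blackhole", path  # no rule for dst at this switch
        if next_hop == "deliver":
            return "delivered", path  # this switch hands the packet to the host
        if next_hop in path:
            return "loop", path + [next_hop]  # revisiting a switch: forwarding loop
        path.append(next_hop)
        node = next_hop


good = {
    "s1": {"10.0.0.2": "s2"},
    "s2": {"10.0.0.2": "s3"},
    "s3": {"10.0.0.2": "deliver"},
}
bad = {**good, "s3": {"10.0.0.2": "s1"}}  # s3 now sends traffic back to s1

print(trace(good, "s1", "10.0.0.2"))  # ('delivered', ['s1', 's2', 's3'])
print(trace(bad, "s1", "10.0.0.2"))   # ('loop', ['s1', 's2', 's3', 's1'])
```

The check works purely on a snapshot of the forwarding state, without sending any packets.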

Troubleshooting SDNs
[Figure: the stack from [Operator Intent] to [Actual Behavior], with "= ?" comparisons between layers.]
- OFRewind (Enabling record and replay troubleshooting for networks, ATC’11)
- NICE (A NICE way to test OpenFlow applications, NSDI’12)

Troubleshooting SDNs
[Figure: the stack from [Operator Intent] to [Actual Behavior], with "= ?" comparisons between layers.]
- Bi-Simulation (What, Where and When: Software fault localization for SDN, UC Berkeley tech report)

Troubleshooting SDNs
[Figure: the stack from [Operator Intent] to [Actual Behavior], with "= ?" comparisons between layers.]
- RIB == FIB? Compare the reported device state against the actual bits and bytes in TCAMs, etc.
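At its simplest, this comparison is a set difference between two rule lists. The sketch below assumes both lists are already available as `(priority, match, action)` tuples; how to obtain them from a real device is not covered here, and the rule format and the name `diff_rules` are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical rule format: (priority, match, action).
Rule = Tuple[int, str, str]


def diff_rules(reported: Iterable[Rule], installed: Iterable[Rule]) -> Dict[str, List[Rule]]:
    """Report rules that appear on only one side of the comparison."""
    reported_set, installed_set = set(reported), set(installed)
    return {
        "missing_in_hardware": sorted(reported_set - installed_set),
        "unexpected_in_hardware": sorted(installed_set - reported_set),
    }


reported = [(100, "dst=10.0.0.1", "port 1"), (100, "dst=10.0.0.2", "port 2")]
installed = [(100, "dst=10.0.0.1", "port 1"), (100, "dst=10.0.0.2", "port 3")]

result = diff_rules(reported, installed)
print(result["missing_in_hardware"])     # [(100, 'dst=10.0.0.2', 'port 2')]
print(result["unexpected_in_hardware"])  # [(100, 'dst=10.0.0.2', 'port 3')]
```

An empty result on both sides means the two views agree for the rules that were compared.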

WHAT ELSE IS NEEDED?

Policy Expression Language
- Policies are rarely maintained anywhere, except in the minds of network admins!
- Systematic troubleshooting requires a clear policy description.
- This calls for an easy-to-use, expressive, and standard network policy description language.
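As a rough illustration of what a machine-checkable policy description could look like, the sketch below encodes three kinds of policy as plain data and checks each one against an observed path. The `Policy` class, its three kinds, and the `satisfied` function are invented for this example and do not correspond to any standard policy language.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Policy:
    kind: str  # "reach", "isolate", or "waypoint" (hypothetical vocabulary)
    src: str
    dst: str
    via: Optional[str] = None  # required middle node, used by "waypoint" only


def satisfied(policy: Policy, path: Sequence[str]) -> bool:
    """Check one policy against the path traffic from src actually takes.

    path lists the nodes traversed in order; it is empty if traffic is dropped.
    """
    delivered = bool(path) and path[0] == policy.src and path[-1] == policy.dst
    if policy.kind == "reach":
        return delivered
    if policy.kind == "isolate":
        return not delivered
    if policy.kind == "waypoint":
        return delivered and policy.via in path[1:-1]
    raise ValueError(f"unknown policy kind: {policy.kind}")


path = ["alice", "s1", "firewall", "s2", "bob"]
print(satisfied(Policy("reach", "alice", "bob"), path))                # True
print(satisfied(Policy("waypoint", "alice", "bob", "firewall"), path))  # True
print(satisfied(Policy("isolate", "alice", "bob"), path))              # False
```

Once policies exist as data rather than in someone's head, every comparison in the troubleshooting stack has something concrete to compare against.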

Better Troubleshooting Tools
Not just detect where the problem is, but also find its root cause, automatically.
- Some of these tools can partially do that.
Challenges:
- What information is needed?
  - Packet history (NDB)?
  - Control message history (OFRewind)?
  - The “logic” behind the control/data plane?
  - …
- What is the expected output?
  - The sequence of events that led to the error?
  - The exact (relevant) state of control software and hardware?
Looks like a mix of networking, symbolic execution, and formal verification.

Automated Troubleshooting
Automatically run the search through the different layers to pinpoint the error.
- Example: a complete system could do
  - real-time monitoring of the data plane with test packets, and
  - real-time checking of network policy against control messages.
- Problem in the data plane (e.g. link down, congestion): report it to a control application, which reroutes traffic around the troubled area.
- Problem in the control plane: prevent the change from hitting the data plane.
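The control-plane half of such a system can be pictured as a guard that sits between the controller and the switches. The sketch below is a toy under strong assumptions: forwarding is exact-match on one destination, the only invariant checked is loop freedom, and the names `has_loop` and `guarded_install` are hypothetical.

```python
import copy
from typing import Dict

# Hypothetical table layout: switch -> {destination -> next hop}.
Tables = Dict[str, Dict[str, str]]


def has_loop(tables: Tables, dst: str) -> bool:
    """Return True if traffic for dst can revisit a switch."""
    for start in tables:
        seen, node = set(), start
        # Keep walking while the current node is a switch with a rule for dst.
        while node in tables and dst in tables[node]:
            if node in seen:
                return True
            seen.add(node)
            node = tables[node][dst]
    return False


def guarded_install(tables: Tables, switch: str, dst: str, next_hop: str) -> bool:
    """Apply a rule change only if the resulting state is loop free."""
    candidate = copy.deepcopy(tables)
    candidate.setdefault(switch, {})[dst] = next_hop
    if has_loop(candidate, dst):
        return False  # reject: the change never reaches the data plane
    tables.clear()
    tables.update(candidate)
    return True


tables = {"s1": {"h2": "s2"}, "s2": {"h2": "deliver"}}
print(guarded_install(tables, "s2", "h2", "s1"))  # False, would create s1 -> s2 -> s1
print(guarded_install(tables, "s3", "h2", "s1"))  # True
```

Checking a copy first means a rejected change leaves the live tables untouched.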

Policy Driven SDN
Use these techniques in reverse: try to derive the correct state/configurations from the policy.
Challenges:
- A policy can be implemented in a zillion ways. How do we reduce the search space?
- Avoid conflicting implementations.
- What is the correct level of human involvement?
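One crude way to cut down the search space is to fix a selection rule up front, for example "always use a shortest path". The sketch below derives per-node next hops for a single reachability policy with breadth-first search. It is only an illustration of going from policy to state: the topology format and the name `compile_reach` are hypothetical, and a real system would also have to handle conflicts between policies.

```python
from collections import deque
from typing import Dict, List, Optional


def compile_reach(topology: Dict[str, List[str]], src: str, dst: str) -> Optional[Dict[str, str]]:
    """Return {node: next_hop} along one shortest path from src to dst, or None."""
    parent: Dict[str, Optional[str]] = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for neighbor in topology.get(node, ()):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    if dst not in parent:
        return None  # the policy cannot be implemented on this topology
    rules: Dict[str, str] = {}
    node = dst
    while parent[node] is not None:  # walk back from dst to src
        rules[parent[node]] = node
        node = parent[node]
    return rules


topology = {
    "a": ["s1"],
    "s1": ["a", "s2", "s3"],
    "s2": ["s1", "b"],
    "s3": ["s1", "s2"],
    "b": ["s2"],
}
print(compile_reach(topology, "a", "b") == {"a": "s1", "s1": "s2", "s2": "b"})  # True
```

The longer route through s3 would satisfy the same policy. Picking the shortest path is just one of many possible tie-breakers.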

References
- A. Wundsam, D. Levin, S. Seetharaman, and A. Feldmann. OFRewind: enabling record and replay troubleshooting for networks. In Proceedings of USENIX ATC.
- H. Zeng, P. Kazemian, G. Varghese, and N. McKeown. Automatic Test Packet Generation. In Proceedings of CoNEXT 2012, Nice, France, December.
- A. Khurshid, W. Zhou, M. Caesar, and P. B. Godfrey. VeriFlow: verifying network-wide invariants in real time. In Proceedings of HotSDN.
- P. Kazemian, G. Varghese, and N. McKeown. Header space analysis: static checking for networks. In Proceedings of NSDI’12.
- N. Handigol, B. Heller, V. Jeyakumar, D. Mazières, and N. McKeown. Where is the debugger for my software-defined network? In Proceedings of HotSDN.
- M. Canini, D. Venzano, P. Perešini, D. Kostić, and J. Rexford. A NICE way to test OpenFlow applications. In Proceedings of NSDI.
- H. Mai, A. Khurshid, R. Agarwal, M. Caesar, P. B. Godfrey, and S. T. King. Debugging the data plane with Anteater. In Proceedings of SIGCOMM 2011.
- C. Scott, A. Wundsam, K. Zarifis, and S. Shenker. What, where and when: Software fault localization for SDN. UC Berkeley technical report.