OpenNF
Aaron Gember, Chaithan Prakash, Raajay Viswanathan, Robert Grandl, Junaid Khalid, Sourav Das, Aditya Akella

SDN + software NFs
NFs examine/modify packets at layers 3-7
Software NFs are replacing physical appliances
SDN applications (PLayer, SIMPLE, Stratos, etc.) steer flows through NFs
[Figure: Home Users, Firewall, Intrusion Prevention, Caching Proxy, Web Server]
Enables new applications that control the packet processing happening across instances of an NF

Example: scaling & load balancing
[Figure: Home Users, Firewall, Intrusion Prevention, Caching Proxy, Web Server]
Not moving flows => bottleneck persists
Naively moving flows => incorrect NF behavior
Requires a control plane that enables management of both internal NF state and network forwarding state

Challenges
1. Dealing with race conditions
   – Packets may arrive while state is being moved, causing state updates to be lost or re-ordered
2. Giving applications flexibility
   – May need to move state at different granularities
3. Supporting many NFs with minimal changes
   – Undesirable to force NFs to conform to certain state structures or allocation/access strategies

OpenNF
[Figure: Control Application, Northbound API, OpenNF Controller, SDN Controller, Southbound API]

Outline
Overview
Requirements
Design
  – Southbound API (addresses NF diversity)
  – Northbound API (addresses race conditions)
Evaluation

Requirements
Move flow-specific NF state at various granularities
Copy and combine, or share, NF state pertaining to multiple flows
Support key guarantees (no loss, order preserved) when needed
Track when/how state is updated

Existing approaches
Control over routing (PLayer, SIMPLE, Stratos)
Virtual machine replication
  – Unneeded state => incorrect actions
  – Cannot combine => limited rebalancing
Split/Merge and Pico Replication
  – Address specific problems => limited suitability
  – Require NFs to create/access state in specific ways => significant NF changes

NF state taxonomy
State created or updated by an NF applies to either a single flow or a collection of flows
Classify state based on scope
Flow provides a natural way for reasoning about which state to move, copy, or share
[Figure: per-flow state (Connection, TcpAnalyzer, HttpAnalyzer); multi-flow state (ConnCount); all-flows state (Statistics)]

API to export/import state
Three simple functions: get, put, delete
  – Version for each scope (per-, multi-, all-flows)
  – Filter defined over packet header fields
NFs responsible for
  – Identifying and providing all state matching a filter
  – Combining provided state with existing state
No need to expose internal state organization
No changes to conform to a specific allocation strategy

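The get/put/delete functions above can be sketched with a toy NF. This is a minimal illustration, not OpenNF's library code: the class `ToyNF`, the helper `matches`, the snake_case method names, and the three-field flow tuple are all assumptions made for the example.

```python
# Hypothetical toy NF for illustration; not OpenNF's actual library code.
HEADER_FIELDS = ("nw_src", "nw_dst", "tp_dst")


def matches(flow, flt):
    """True if the flow's header fields satisfy the filter (absent fields are wildcards)."""
    headers = dict(zip(HEADER_FIELDS, flow))
    return all(headers[name] == value for name, value in flt.items())


class ToyNF:
    """Counts packets per flow; the dict layout stays private to the NF."""

    def __init__(self):
        self._counts = {}

    def process(self, flow):
        self._counts[flow] = self._counts.get(flow, 0) + 1

    def get_perflow(self, flt):
        # Identify and provide all state matching the filter, one chunk per flow.
        return [(flow, n) for flow, n in self._counts.items() if matches(flow, flt)]

    def put_perflow(self, chunks):
        # Combine provided state with whatever state already exists.
        for flow, n in chunks:
            self._counts[flow] = self._counts.get(flow, 0) + n

    def del_perflow(self, flt):
        for flow in [f for f in self._counts if matches(f, flt)]:
            del self._counts[flow]


inst1, inst2 = ToyNF(), ToyNF()
web = ("10.0.0.1", "10.0.0.9", 80)
ssh = ("10.0.0.2", "10.0.0.9", 22)
for flow in (web, web, ssh):
    inst1.process(flow)

chunks = inst1.get_perflow({"tp_dst": 80})   # export only port-80 state
inst2.put_perflow(chunks)                    # import at the other instance
inst1.del_perflow({"tp_dst": 80})            # remove the exported state
```

The caller only ever sees chunks selected by a header-field filter, so the NF is free to keep whatever internal layout it already has.
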
API to observe/prevent updates
Problem: need to prevent (e.g., during move) or observe (e.g., to trigger copy) state updates
Solution: event abstraction
  – Functions: enableEvents and disableEvents
  – Instruct NF to raise an event and process, buffer, or drop packets matching a filter
Only need to change an NF’s receive packet function

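A rough sketch of how a receive function might honor the event abstraction follows. The class `EventNF`, the `raise_event` callback, and the dict-based packets are hypothetical. The choice to release buffered packets when events are disabled is an assumption of this sketch.

```python
class EventNF:
    """Toy NF whose receive function honors enableEvents/disableEvents."""

    def __init__(self, raise_event):
        self.raise_event = raise_event   # hypothetical callback to the controller
        self.rules = []                  # (filter, action) pairs
        self.buffered = []
        self.processed = []

    def enable_events(self, flt, action):
        if action not in ("process", "buffer", "drop"):
            raise ValueError(action)
        self.rules.append((flt, action))

    def disable_events(self, flt):
        self.rules = [(f, a) for f, a in self.rules if f != flt]
        # Assumption: disabling events releases buffered packets in arrival order.
        released, self.buffered = self.buffered, []
        for pkt in released:
            self.receive(pkt)

    def receive(self, pkt):
        for flt, action in self.rules:
            if all(pkt.get(k) == v for k, v in flt.items()):
                self.raise_event(pkt)
                if action == "buffer":
                    self.buffered.append(pkt)
                    return
                if action == "drop":
                    return
                break                    # "process": raise the event, then process
        self.processed.append(pkt)


events = []
nf = EventNF(events.append)
nf.enable_events({"tp_dst": 80}, "buffer")
nf.receive({"tp_dst": 80, "seq": 1})     # matches: event raised, packet buffered
nf.receive({"tp_dst": 22, "seq": 2})     # no match: processed normally
nf.disable_events({"tp_dst": 80})        # buffered packet is now processed
```

All of the event logic sits in `receive`, which mirrors the slide's point that only the NF's receive packet function needs to change.
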
Move operation
[Figure: message sequence among Control Application, OpenNF Controller, SDN Controller, Inst 1, Inst 2; labels in slide order:]
move(port=80, Inst 1, Inst 2)
getPerflow(port=80)
[Chunk1]
putPerflow(Chunk1)
delPerflow(port=80)
[Chunk2]
putPerflow(Chunk2)
forward(port=80, Inst 2)

Packet arrivals during move
Packets may arrive during a move operation
[Figure: move(yellow, Inst 1, Inst 2); Inst 2 is missing updates]
Fix: suspend traffic flow and buffer packets
  – May last 100s of ms => connection timeouts
  – Packets in-transit when buffering starts are dropped
Loss-free: All state updates due to packet processing should be reflected in the transferred state, and all packets the switch receives should be processed

Use events for loss-free move
enableEvents(blue, drop) on Inst 1; get/delete on Inst 1; put on Inst 2
Buffer events at controller
Flush packets in events to Inst 2
Update forwarding
[Figure: switch, Inst 1, Inst 2; packets S, S+A, A]

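The loss-free steps can be sketched as a small simulation. Everything here is an assumption for illustration: `CountingNF`, `loss_free_move`, the dict standing in for the switch, and the use of a set of flow names as the filter. The sketch also simplifies timing by replaying all in-flight packets at a single point during the transfer.

```python
class CountingNF:
    """Toy NF: per-flow packet counts, plus a drop-and-raise-event mode."""

    def __init__(self):
        self.counts = {}
        self.event_filter = None
        self.event_sink = None   # set while events are enabled with action "drop"

    def enable_events_drop(self, flt, sink):
        self.event_filter, self.event_sink = flt, sink

    def receive(self, pkt):
        if self.event_sink is not None and pkt["flow"] in self.event_filter:
            self.event_sink(pkt)   # raise an event carrying the packet, then drop it
            return
        self.counts[pkt["flow"]] = self.counts.get(pkt["flow"], 0) + 1

    def get_and_delete(self, flt):
        return {f: self.counts.pop(f) for f in list(self.counts) if f in flt}

    def put(self, chunks):
        for f, n in chunks.items():
            self.counts[f] = self.counts.get(f, 0) + n


def loss_free_move(flt, src, dst, switch, arrivals_during_move=()):
    events = []                                  # events buffered at the controller
    src.enable_events_drop(flt, events.append)   # enableEvents(flt, drop) on src
    chunks = src.get_and_delete(flt)             # get/delete on src
    for pkt in arrivals_during_move:             # switch still forwards to src
        switch["next_hop"].receive(pkt)
    dst.put(chunks)                              # put on dst
    for pkt in events:                           # flush packets in events to dst
        dst.receive(pkt)
    switch["next_hop"] = dst                     # update forwarding


inst1, inst2 = CountingNF(), CountingNF()
switch = {"next_hop": inst1}
for _ in range(3):
    switch["next_hop"].receive({"flow": "blue"})
loss_free_move({"blue"}, inst1, inst2, switch,
               arrivals_during_move=[{"flow": "blue"}] * 2)
```

The two packets that reach Inst 1 after its state was exported are not lost: they travel to the controller inside events and are replayed at Inst 2, so the transferred count reflects all five packets.
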
Re-ordering of packets
[Figure: Controller, Switch, Inst 1, Inst 2; steps: flush buffer, request forwarding update; packets S+A, A, D1, D2]
Order-preserving: All packets should be processed in the order they were forwarded to the NF instances by the switch

Order-preserving move
Flush packets in events to Inst 2
enableEvents(blue, buffer) on Inst 2
Forwarding update: send to Inst 1 & controller
Wait for packet from switch (remember last)
Forwarding update: send to Inst 2
Wait for event for last packet from Inst 2
Release buffer of packets on Inst 2
[Figure: switch, controller, Inst 1, Inst 2; packets S, S+A, A, D1]

Copy and share operations
Used when multiple instances need to access a particular piece of state
Copy – no or eventual consistency
  – Issue once, periodically, based on events, etc.
Share – strong or strict consistency
  – All packets reaching NF instances trigger an event
  – Packets in events are released one at a time
  – State is copied between packets

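The share steps can be approximated in a few lines. This is a sketch under stated assumptions, not the controller's implementation: `HostCounter`, `share_strong`, and the (instance index, packet) arrival list are hypothetical, and the shared state is overwritten wholesale rather than merged.

```python
from collections import deque


class HostCounter:
    """Toy NF with multi-flow state: packets seen per source host."""

    def __init__(self):
        self.conn_count = {}

    def process(self, pkt):
        src = pkt["nw_src"]
        self.conn_count[src] = self.conn_count.get(src, 0) + 1

    def get_multiflow(self):
        return dict(self.conn_count)

    def put_multiflow(self, state):
        self.conn_count = dict(state)   # shared state: overwrite, do not merge


def share_strong(instances, arrivals):
    """arrivals: (instance index, packet) pairs in the order events reach the controller."""
    pending = deque(arrivals)
    while pending:
        idx, pkt = pending.popleft()            # release one packet at a time
        instances[idx].process(pkt)
        state = instances[idx].get_multiflow()
        for j, inst in enumerate(instances):    # copy state before the next packet
            if j != idx:
                inst.put_multiflow(state)


a, b = HostCounter(), HostCounter()
share_strong([a, b], [(0, {"nw_src": "h1"}),
                      (1, {"nw_src": "h1"}),
                      (1, {"nw_src": "h2"})])
```

Because every packet is serialized through the controller and followed by a state copy, both instances end with identical counters, at the price of a per-packet round trip.
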
Example app: Load balanced network monitoring
movePrefix(prefix,oldInst,newInst):
  copy(oldInst,newInst,{nw_src:prefix},multi)
  move(oldInst,newInst,{nw_src:prefix},per,LF+OP)
  while (true):
    sleep(60)
    copy(oldInst,newInst,{nw_src:prefix},multi)
    copy(newInst,oldInst,{nw_src:prefix},multi)
[Figure: two Bro instances, each running scan.bro, vulnerable.bro, weird.bro]

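A runnable rendering of the movePrefix pseudocode, for readers who want to trace the call order. `RecordingController` is a hypothetical stand-in that only logs northbound calls. The `rounds`, `interval`, and `sleep` parameters are additions so the loop can terminate in a test, whereas the slide's loop runs forever.

```python
import time


class RecordingController:
    """Hypothetical stand-in for the northbound API; it only logs calls."""

    def __init__(self):
        self.calls = []

    def copy(self, src, dst, flt, scope):
        self.calls.append(("copy", src, dst, scope))

    def move(self, src, dst, flt, scope, guarantees):
        self.calls.append(("move", src, dst, scope, guarantees))


def move_prefix(ctrl, prefix, old_inst, new_inst,
                rounds=2, interval=60, sleep=time.sleep):
    flt = {"nw_src": prefix}
    ctrl.copy(old_inst, new_inst, flt, "multi")          # seed multi-flow state
    ctrl.move(old_inst, new_inst, flt, "per", "LF+OP")   # loss-free + order-preserving
    for _ in range(rounds):       # the slide loops forever; bounded here for testing
        sleep(interval)
        ctrl.copy(old_inst, new_inst, flt, "multi")      # resync in both directions
        ctrl.copy(new_inst, old_inst, flt, "multi")


ctrl = RecordingController()
move_prefix(ctrl, "10.0.0.0/8", "bro1", "bro2", rounds=1, sleep=lambda s: None)
```

Per-flow state is moved once with both guarantees, while multi-flow state is only copied periodically, which matches the eventual consistency that the copy operation offers.
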
Example app: Selectively invoking advanced remote processing
enhanceProcessing(flowid,locInst):
  move(locInst,cloudInst,flowid,per,LF)
[Figure: Bro instances running scan.bro, vulnerable.bro, weird.bro; some also run detect-MHR.bro]

Implementation
OpenNF Controller (≈3.8K lines of Java)
  – Written atop Floodlight
Shared NF library (≈2.6K lines of C)
Modified NFs (3-8% increase in code)
  – Bro (intrusion detection)
  – PRADS (service/asset detection)
  – iptables (firewall and NAT)
  – Squid (caching proxy)

End-to-end benefits
Load balanced monitoring with Bro IDS
  – Load: 10K pkts/sec cloud trace
  – After 180 sec: move HTTP flows (489) to new Bro
OpenNF: 260ms to move (optimized, loss-free)
  – Log entries equivalent to using one instance
VM replication: 3889 incorrect log entries
Forwarding control only: scale down delayed by > 1500 seconds

Southbound API call processing
Serialization/deserialization costs dominate
Cost grows with state complexity

Efficiency with guarantees
State: 500 flows in PRADS; Load: 1000 pkts/s
Move
  [Figure annotations: pkts dropped!; 130 pkts buffered at dstInst; 230 pkts in events]
Copy – 176ms
Share – 7ms (or more) for every packet
Guarantees come at a cost!

Controller performance
Improve scalability with P2P state transfers

Conclusion
Systematically engineered APIs implemented by NFs and used by control applications
Enables rich control of the packet processing happening across instances of an NF
Provides key guarantees and requires minimal NF modifications