Published by Zane Walmsley. Modified over 9 years ago.
1
A Network-State Management Service
Peng Sun, Ratul Mahajan, Jennifer Rexford, Lihua Yuan, Ming Zhang, Ahsan Arefin
Princeton & Microsoft
2
Complex Infrastructure (Microsoft Azure)
Variety of vendors/models/time

Number of           2010          2014
Data Center         A few         10s
Network Device      1,000s        10s of 1,000s
Network Capacity    10s of Tbps   Pbps
3
Management Applications
- Traffic Engineering
- Load Balancing
- Link Corruption Mitigation
- Device Firmware Upgrade
- …
4
Our Question
How can we safely run multiple management applications on shared infrastructure?
5
Naïve Solution: Run Independently
[Diagram: Traffic Engineering, Link Corruption Mitigation, and Firmware Upgrade, each connected to the Network Devices]
It does not work, due to 2 problems.
6
Problem #1: Conflict
[Diagram: ToRs, Agg A, Agg B, Core 1, Core 2]
- Link-corruption-mitigation adjusts traffic away from Core 1
- TE tunes traffic among the links to Core 1 and Core 2
7
Problem #2: Safety Violation
[Diagram: ToRs, Agg A, Agg B, Core 1, Core 2]
- Link-corruption-mitigation shuts down the faulty Agg A
- Firmware-upgrade schedules Agg B to upgrade
8
Potential Solution #1
One monolithic application
- Central control of all actions
[Diagram: Traffic Engineering, Firmware Upgrade, and Link Corruption Mitigation inside one application]
9
Too Complex to Build
- Difficult to develop: it combines applications that are already individually complicated
- High maintenance cost for such a huge piece of software in practice
10
Potential Solution #2
Explicit coordination among applications
- Consensus over network changes
[Diagram: Traffic Engineering, Firmware Upgrade, and Link Corruption Mitigation coordinating with one another]
11
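Still Too Complex
- Hard for applications to understand each other
- Diverse network interactions
[Table: applications (Traffic Engineering, Firmware upgrade) against the state they interact with (Routing, Device Config); cell contents not recoverable]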
12
Main Enemy: Complexity
[Diagram: the monolithic, independent, and explicitly coordinated designs placed on simple-to-complex axes for application development and application coordination]
13
What We Advocate
- Loose coupling of applications
- Design principle: simplicity with safety guarantees
- Forgo joint optimization
  - Worthwhile tradeoff for simplicity
  - Applications could do it out-of-band
14
Overview of Statesman
Network operating system for safe multi-application operation
- Uses network state abstraction
- Three views of network state
- Dependency model of states
15
The “State” in Statesman
- Complexity of dealing with devices
  - Heterogeneity
  - Device-specific commands
[Diagram: Network Devices and Network State]
16
State Variable Examples

State Variable           Value
Device Power Status      Up, down
Device Firmware          Version number
Device SDN Agent Boot    Up, down
Device Routing State     Routing rules
Link Admin Status        Up, down
Link Control Plane       BGP, OpenFlow, …
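One way to picture the rows above is as entity-variable-value records. The sketch below is illustrative only: the class name `StateVar`, the field names, and the device and link identifiers are assumptions, not Statesman's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StateVar:
    """One network-state entry: (entity, variable) -> value. Hypothetical schema."""
    entity: str    # device or link name (made-up identifiers below)
    variable: str  # state variable name, following the table above
    value: str     # e.g. "up", "down", or a version string


# A tiny network state keyed by (entity, variable).
network_state = {
    (v.entity, v.variable): v
    for v in [
        StateVar("agg-a", "DevicePowerStatus", "up"),
        StateVar("agg-a", "DeviceFirmware", "1.0"),
        StateVar("agg-a--core-1", "LinkAdminStatus", "up"),
    ]
}

# Applications read and write values by key instead of issuing
# device-specific commands.
power = network_state[("agg-a", "DevicePowerStatus")].value
```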
17
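Simplify Device Interaction
- Past: the application collects device statistics from the network devices and sends them device-specific commands
- Now: the application reads and writes network state; SNMP, OF, vendor APIs, … sit between the network state and the network devices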
18
Views of Network State
- Observed State: actual state of the whole network
- Target State: desired state to be updated on the whole network
[Diagram: Application, Network State (Target State), Network Devices]
19
Two Views Are Not Enough
One more view
- Proposed State: a group of entity-variable-values desired by an application
[Diagram: Application, Proposed State, Target State, Observed State, Network Devices]
20
How Merging Works
Combine multiple proposed states into a safe target state
- Conflict resolution
  - Last-writer-wins
  - Priority-based locking
  - Sufficient for current deployment
- Safety invariant checking
  - Partial rejection & skip update
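The merging steps on this slide can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Statesman's implementation: the function `merge`, its arguments, and the toy invariant are all hypothetical. Locks are modeled only as "already held by some application", later writes overwrite earlier ones (last-writer-wins), and any single write that would break the invariant is rejected while the rest of the proposal still applies (partial rejection).

```python
def merge(target, proposals, locks, is_safe):
    """Merge proposed states into a new target state (hypothetical sketch).

    target:    dict mapping (entity, variable) -> value
    proposals: list of (app, {(entity, variable): value}) in arrival order
    locks:     dict mapping entity -> app currently holding its lock
    is_safe:   callable(candidate_target) -> bool, the invariant check
    Returns (new_target, rejected), where rejected lists the refused writes.
    """
    new_target = dict(target)
    rejected = []
    for app, changes in proposals:
        for key, value in changes.items():
            entity = key[0]
            holder = locks.get(entity)
            if holder is not None and holder != app:
                rejected.append((app, key, "locked by " + holder))
                continue
            candidate = dict(new_target)
            candidate[key] = value  # a later writer overwrites an earlier one
            if is_safe(candidate):
                new_target = candidate
            else:
                rejected.append((app, key, "violates invariant"))
    return new_target, rejected


# Toy replay of Problem #2: both aggregation switches are asked to go down.
target = {("agg-a", "power"): "up", ("agg-b", "power"): "up"}
proposals = [
    ("link-mitigation", {("agg-a", "power"): "down"}),
    ("firmware-upgrade", {("agg-b", "power"): "down"}),
]


def at_least_one_up(state):
    """Toy invariant (an assumption): some aggregation switch stays up."""
    return any(value == "up" for value in state.values())


new_target, rejected = merge(target, proposals, {}, at_least_one_up)
```

In this toy run the first write is accepted and the second is rejected, so the upgrade application would have to retry later.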
21
Choose Safety Invariants
- Tradeoff
  - Too tight: hinders applications too frequently
  - Too loose: cannot protect network operation
- Our current choice
  - Connectivity: every pair of ToRs in one DC is connected
  - Capacity: 99% of ToR pairs have at least 50% capacity
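The two invariants can be written down as simple checks. The sketch below assumes an undirected topology given as an adjacency dict, and it assumes the per-pair capacity fractions have already been computed elsewhere; how Statesman derives them is not stated on the slide. The function names and the example topology are hypothetical.

```python
from collections import deque


def reachable(adj, src):
    """Breadth-first search: the set of nodes reachable from src."""
    seen = {src}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for nbr in adj.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return seen


def connectivity_ok(adj, tors):
    """Connectivity invariant: every pair of ToRs is connected."""
    if not tors:
        return True
    seen = reachable(adj, tors[0])
    return all(tor in seen for tor in tors)


def capacity_ok(pair_fraction, min_fraction=0.5, min_share=0.99):
    """Capacity invariant: at least min_share of the ToR pairs keep
    at least min_fraction of their capacity.

    pair_fraction: dict mapping (tor_a, tor_b) -> remaining capacity fraction.
    """
    if not pair_fraction:
        return True
    good = sum(1 for f in pair_fraction.values() if f >= min_fraction)
    return good / len(pair_fraction) >= min_share


# Two ToRs joined through two aggregation switches (made-up names).
adj = {
    "tor1": ["agg-a", "agg-b"],
    "tor2": ["agg-a", "agg-b"],
    "agg-a": ["tor1", "tor2"],
    "agg-b": ["tor1", "tor2"],
}
both_up = connectivity_ok(adj, ["tor1", "tor2"])
both_down = connectivity_ok({"tor1": [], "tor2": []}, ["tor1", "tor2"])
```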
22
Recap of Three-View Model
Simplify network management
- Observed State: what we see from the network
- Proposed State: what we want the network to be
- Target State: what can actually be done on the network
[Diagram: Application and Statesman]
23
Yet Another Problem
- What’s in the Proposed State
  - A small number of state variables that the application cares about
- Implicit conflicts arise
  - Caused by state dependency
24
Implicit Conflict
[Diagram: devices A, B, C, D]
- TE writes a new value of the routing state of B for tunneling traffic
- Firmware-upgrade writes a new value of the firmware state of B
25
Dependency Relations
[Diagram: dependency graph over the state variables PowerState, FirmwareVersion, ConfigurationState, RoutingState, AdminState, ConfigurationState, and PathState; group labels: Device, Link; other labels: bgpd, SDN]
26
Build in Dependency Model
- Statesman calculates it internally
- Only exposes the result for each state variable
  - Whether the variable is controllable
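A controllability check of this kind might look like the sketch below. It assumes a simple device-level chain in which firmware depends on power, configuration on firmware, and routing on configuration; that ordering is a reading of the dependency diagram, and the names `DEPENDS_ON` and `is_controllable` are hypothetical.

```python
# Assumed device-level chain: each variable depends on exactly one parent.
DEPENDS_ON = {
    "FirmwareVersion": "PowerState",
    "ConfigurationState": "FirmwareVersion",
    "RoutingState": "ConfigurationState",
}


def is_controllable(variable, healthy):
    """True if every variable that `variable` transitively depends on is healthy.

    healthy: dict mapping variable -> bool, for example False for
             FirmwareVersion while an upgrade is in progress.
    """
    parent = DEPENDS_ON.get(variable)
    while parent is not None:
        if not healthy.get(parent, False):
            return False
        parent = DEPENDS_ON.get(parent)
    return True


# Device B is mid-upgrade, so its firmware state is not healthy.
healthy_b = {
    "PowerState": True,
    "FirmwareVersion": False,
    "ConfigurationState": True,
}
routing_controllable = is_controllable("RoutingState", healthy_b)
firmware_controllable = is_controllable("FirmwareVersion", healthy_b)
```

Under these assumptions the routing state of B is reported as not controllable during the upgrade, which is how the implicit conflict between TE and firmware-upgrade would surface to the applications.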
27
Statesman System
[Diagram: Monitor, Checker, and Updater around a Storage Service that holds the Observed State, Proposed State, and Target State]
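To make the roles of the views concrete, here is a hedged sketch of what an updater-style component could compute: the target-state entries that differ from what the monitor last observed. The function name `pending_updates` and the dict layout are assumptions for illustration.

```python
def pending_updates(observed, target):
    """Return the target entries whose value differs from the observed state.

    Both arguments map (entity, variable) -> value.
    """
    return {
        key: value
        for key, value in target.items()
        if observed.get(key) != value
    }


# Made-up device name and firmware versions.
observed = {("br1", "FirmwareVersion"): "1.0", ("br1", "PowerState"): "up"}
target = {("br1", "FirmwareVersion"): "2.0", ("br1", "PowerState"): "up"}
updates = pending_updates(observed, target)
```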
28
Deployment Overview
- Operational in Microsoft Azure for 10 months
- Covers 10 DCs with 20K devices
29
Production Applications
- 3 diverse applications built
  - Device firmware upgrade
  - Link corruption mitigation
  - Traffic engineering
- Finished within months
- Only thousands of lines of code
30
Case #1: Resolve Conflict
Inter-DC TE & Firmware-upgrade
[Diagram: four data centers and their border routers: DC 1 (BR 1, BR 2), DC 2 (BR 3, BR 4), DC 3 (BR 5, BR 6), DC 4 (BR 7, BR 8)]
DC = Data Center, BR = Border Router
31
[Timeline plot annotations]
- Firmware-upgrade acquires the lock of BR1
- TE fails to acquire the lock, and moves traffic away
- BR1 firmware upgrade starts
- BR1 firmware upgrade ends. Lock released.
- TE re-acquires the lock, and moves traffic back
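The lock hand-off in this timeline can be replayed with a small lock table. This is a sketch under assumptions: the class `LockTable`, the numeric priorities, and the rule that a strictly higher priority preempts the current holder are illustrative choices, not details given in the talk.

```python
class LockTable:
    """Per-entity locks with priorities (hypothetical sketch)."""

    def __init__(self):
        self._holders = {}  # entity -> (app, priority)

    def acquire(self, entity, app, priority):
        """Grant the lock if it is free, already held by app,
        or held by an application with strictly lower priority."""
        holder = self._holders.get(entity)
        if holder is None or holder[0] == app or priority > holder[1]:
            self._holders[entity] = (app, priority)
            return True
        return False

    def release(self, entity, app):
        """Release the lock only if app is the current holder."""
        holder = self._holders.get(entity)
        if holder is not None and holder[0] == app:
            del self._holders[entity]


locks = LockTable()
# Firmware-upgrade is given the higher priority here (an assumption).
step1 = locks.acquire("BR1", "firmware-upgrade", priority=2)
step2 = locks.acquire("BR1", "te", priority=1)  # TE must move traffic away
locks.release("BR1", "firmware-upgrade")        # upgrade has finished
step3 = locks.acquire("BR1", "te", priority=1)  # TE can move traffic back
```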
32
Case #1 Summary
- Each application:
  - Simple logic
  - Unaware of the other
- Statesman enables:
  - Conflict resolution
  - Necessary coordination
33
Case #2: Maintain Capacity Invariant
Firmware-upgrade & Link-corruption-mitigation
[Diagram: Core, Agg, and ToR layers across Pod 1 through Pod 10; one link is marked as corrupting packets]
34
[Timeline plot annotations]
- Upgrade proceeds at normal speed in Pod 3 and Pod 5
- Upgrade in Pod 4 is slowed down by the checker due to lost capacity
35
Case #2 Summary
Statesman:
- Automatically adjusts application progress
- Keeps the network within safety requirements
36
Conclusion
- Need a network operating system for multiple management applications
- Statesman
  - Loose coupling of applications
  - Network state abstraction
  - Deployed and operational in Azure
37
Thanks! Questions?
Check the paper for related work.