A Network-State Management Service


1 A Network-State Management Service
Paper 1: A Network-State Management Service. Peng Sun (Princeton), Ratul Mahajan (Microsoft), Jennifer Rexford (Princeton), Lihua Yuan, Ming Zhang, Ahsan Arefin (Microsoft). In Proc. of ACM SIGCOMM 2014. Presented by: Suman Maroju, EECS, Northwestern University

2 Paper Overview The paper presents Statesman, a network-state management service for data center networks (DCNs). It allows multiple management applications to operate independently while maintaining network-wide safety and performance invariants. View-based architecture. Tested in a Microsoft Azure data center for 7 months with 3 applications.

3 DCN A modern data center is home to tens of thousands of hosts, each consisting of one or more processors, memory, network interface, and local high-speed I/O (disk or flash). Compute resources are packaged into racks and allocated as clusters consisting of thousands of hosts that are tightly connected with a high-bandwidth network.

9 Problem 1: Application conflict
Whichever application acts first takes control.

10 Problem 2: Safety violation
The applications' joint actions disconnect the ToR (top-of-rack) switch.

13 (Corybantic)

23 Three views of the network state:
Statesman uses three views of the network state: observed, proposed, and target. The design is inspired by the version-control system Git: each application corresponds to a different Git user. The observed state is pulled, proposed states are pushed, and the target state is the merged result.
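The Git analogy can be made concrete with a small sketch. This is a toy model, not Statesman's implementation: the class `StateStore`, its method names, the variable keys, and the "at most one switch off the baseline version" invariant are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StateStore:
    """Toy model of the three views; every name here is illustrative."""
    observed: dict = field(default_factory=dict)  # latest measured state
    proposed: dict = field(default_factory=dict)  # app name -> {variable: value}
    target: dict = field(default_factory=dict)    # merged, accepted state

    def pull(self):
        """An application reads a copy of the observed state (like git pull)."""
        return dict(self.observed)

    def push(self, app, changes):
        """An application writes its proposed changes (like git push)."""
        self.proposed.setdefault(app, {}).update(changes)

    def merge(self, is_safe):
        """Merge proposals into the target state, skipping unsafe ones."""
        accepted = []
        for app, changes in self.proposed.items():
            candidate = {**self.target, **changes}
            if is_safe(candidate):
                self.target = candidate
                accepted.append(app)
        self.proposed.clear()
        return accepted


store = StateStore(observed={"sw1.firmware": "6.0", "sw2.firmware": "6.0"})
store.target = store.pull()
store.push("upgrade-app", {"sw1.firmware": "6.1"})
store.push("other-app", {"sw2.firmware": "6.1"})


def is_safe(state):
    # Hypothetical invariant: at most one switch may leave version 6.0.
    return sum(version != "6.0" for version in state.values()) <= 1


accepted = store.merge(is_safe)
```

In this run the first proposal is merged and the second is rejected because accepting it would violate the assumed invariant, which mirrors how a merge step can keep applications independent while still guarding network-wide safety.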

27 Dependency Model of State Variable
Prior work treated network state as independent variable-value pairs, which does not capture enough semantic knowledge about how the state variables are related. A dependency model captures the domain-specific, cross-variable dependencies among the state variables.
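A minimal sketch of what a dependency model buys over independent variable-value pairs is given here. The dependency chain itself is an assumption for illustration (`DevicePower` in particular is a made-up name); only `DeviceFirmwareVersion` and `LinkAdminPower` are variable names taken from the slides.

```python
# Hypothetical dependency chain: each variable lists the variables it depends on.
DEPENDENCIES = {
    "DevicePower": [],
    "DeviceFirmwareVersion": ["DevicePower"],
    "DeviceConfig": ["DeviceFirmwareVersion"],
    "LinkAdminPower": ["DeviceConfig"],
}


def is_controllable(variable, healthy):
    """A variable can be changed only if everything it depends on is healthy."""
    return all(
        dep in healthy and is_controllable(dep, healthy)
        for dep in DEPENDENCIES[variable]
    )


# Only the power state is known to be good, so firmware can be written,
# but the config and link state that sit above the firmware cannot.
healthy = {"DevicePower"}
firmware_ok = is_controllable("DeviceFirmwareVersion", healthy)
config_ok = is_controllable("DeviceConfig", healthy)
link_ok = is_controllable("LinkAdminPower", healthy)
```

With plain variable-value pairs, a write to `LinkAdminPower` would look valid in isolation; the dependency walk is what reveals that it cannot take effect yet.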

30 Detailed Architecture:

31 Input and Output in Statesman

34 Network State Variables & Controllability
Firmware upgrade: DeviceFirmwareVersion, DeviceFirmwareVersionIsControllable. Switch configuration: DeviceConfigIsControllable, LinkAdminPowerIsControllable.

35 Checking Network State
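Resolving conflicts among the observed state (OS), proposed state (PS), and target state (TS): 1. TS-OS: the OpenFlow agent reports DeviceAgentBootStatus=Down, so the TS cannot be applied. 2. PS-OS: LinkEndAddress=Down, so the PS cannot be applied. 3. PS-TS: for example, while a switch is being upgraded. To resolve TS-OS conflicts, read the controllability values from the OS, mark the affected variables as uncontrollable in the PS or TS, and use SkipUpdate, or reject the proposal partially. PS-TS conflicts are resolved by last-write-wins or by priority-based locking.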
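The two PS-TS policies named above can be sketched as follows. This is an illustration under assumptions, not the paper's code: the tuple layout, the `PriorityLocks` class, the application names, and the numeric priorities are all hypothetical.

```python
def resolve_last_write_wins(proposals):
    """proposals: iterable of (timestamp, app, variable, value) tuples.

    The proposal with the latest timestamp for each variable wins.
    """
    target = {}
    for _, app, variable, value in sorted(proposals, key=lambda p: p[0]):
        target[variable] = (app, value)
    return target


class PriorityLocks:
    """Per-variable locks that a strictly higher-priority app can take over."""

    def __init__(self):
        self.holders = {}  # variable -> (priority, app)

    def acquire(self, app, priority, variable):
        holder = self.holders.get(variable)
        if holder is None or holder[1] == app or priority > holder[0]:
            self.holders[variable] = (priority, app)
            return True
        return False


proposals = [
    (2, "te-app", "link7.LinkAdminPower", "Up"),
    (1, "mitigation-app", "link7.LinkAdminPower", "Down"),
]
winner = resolve_last_write_wins(proposals)

locks = PriorityLocks()
first = locks.acquire("upgrade-app", 1, "sw1")      # lock is free
second = locks.acquire("mitigation-app", 5, "sw1")  # higher priority takes over
third = locks.acquire("upgrade-app", 1, "sw1")      # lower priority is refused
```

Last-write-wins needs no coordination but lets any application override another; locking gives an application exclusive control of a switch or link for the duration of a multi-step operation such as an upgrade.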

37 Statesman System Design and Implementation
Implementation: 50,000 lines of C# and C++ code. Storage: a RESTful web service; storage instances run in multiple locations, each on its own smaller Paxos ring, with a proxy layer for uniform access. Updater: command templates (OpenFlow, BGP, etc.). Monitor: SNMP, OpenFlow.

38 Read-write APIs of Statesman
Implemented as an HTTP web service with RESTful APIs. Reads include a freshness parameter that bounds how stale the returned state may be. Examples: link failure mitigation, DeviceFirmwareVersion.
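To show how a freshness bound might travel with a read request, here is a hedged sketch. The host name, the URL path, and the query parameter names are assumptions made up for this example and are not Statesman's actual API; only the idea of a freshness parameter comes from the slide.

```python
import time
from urllib.parse import urlencode


def build_read_url(base, entity, attribute, freshness_s):
    """Build a read URL; the path and parameter names are hypothetical."""
    query = urlencode(
        {"Entity": entity, "Attribute": attribute, "Freshness": freshness_s}
    )
    return f"{base}/NetState/Read?{query}"


def is_fresh_enough(sample_time, freshness_s, now=None):
    """True if a cached sample is no older than the caller's freshness bound."""
    now = time.time() if now is None else now
    return now - sample_time <= freshness_s


url = build_read_url(
    "http://statesman.example", "sw1", "DeviceFirmwareVersion", 300
)
cached_ok = is_fresh_enough(sample_time=1000.0, freshness_s=300, now=1200.0)
too_old = is_fresh_enough(sample_time=1000.0, freshness_s=300, now=1400.0)
```

A freshness bound lets a caller that tolerates staleness be served from a nearby cache, while a caller that needs up-to-date state pays for a read from the authoritative store.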

41 Application experiences:
1. Switch upgrade: uses DeviceFirmwareVersion. 2. Failure mitigation: monitors Frame-Check-Sequence (FCS) error rates, shuts down faulty links through LinkAdminPower, and generates a repair ticket. 3. Inter-DC TE: takes bandwidth demands from the bandwidth broker and manages tunnel status and flow-matching rules. Invariant: 99% of the ToR pairs in the DC should have at least 50% of their baseline capacity.
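The capacity invariant above can be checked with a few lines of arithmetic. This is only a sketch: the function name, the dictionary layout keyed by ToR pair, and the capacity numbers are assumptions, and a real checker would derive per-pair capacity from the topology rather than take it as input.

```python
def capacity_invariant_holds(baseline, current,
                             pair_fraction=0.99, capacity_fraction=0.5):
    """Check that enough ToR pairs keep enough of their baseline capacity.

    baseline and current map a ToR pair to its capacity (any consistent unit).
    A pair missing from current is treated as having zero capacity.
    """
    if not baseline:
        return True
    healthy_pairs = sum(
        1
        for pair, base in baseline.items()
        if current.get(pair, 0) >= capacity_fraction * base
    )
    return healthy_pairs / len(baseline) >= pair_fraction


# 100 ToR pairs with a baseline capacity of 10 units each.
baseline = {("tor0", f"tor{i}"): 10 for i in range(1, 101)}

one_degraded = dict(baseline)
one_degraded[("tor0", "tor1")] = 4   # below 50% of baseline

two_degraded = dict(one_degraded)
two_degraded[("tor0", "tor2")] = 4   # a second pair drops below 50%

holds_with_one = capacity_invariant_holds(baseline, one_degraded)
holds_with_two = capacity_invariant_holds(baseline, two_degraded)
```

With 100 pairs, one degraded pair still leaves 99% of pairs healthy, so the invariant holds; a second degraded pair drops the share to 98% and a proposal causing it would be rejected.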

52 Conflict resolution in Statesman

62 Handling Operational Failures
Switch-upgrade application run on 250 switches. A. A straggling switch takes 4 hours to upgrade; it cannot download the new firmware image. B. Unstable switches. C. Failure case (requires human intervention).

63 System Performance Latency

64 Checker performance

65 Read-write performance

66 Related Work Most previous work enables centralized control of traffic flows by directly manipulating the forwarding state of switches. Like Statesman, Onix and Hercules provide a shared network-state platform for all applications, but they are not designed to resolve conflicts. Pyretic, PANE, and Maple are recent proposals for supporting multiple applications, but they focus only on traffic management. Corybantic resolves conflicts explicitly by having each application evaluate the other applications' proposals, which adds complexity. Other approaches partition the network into multiple isolated virtual slices.

68 Thanks! Questions?

