1 Experiences with setting up the CHEETAH network
Xuan Zheng, Xiangfei Zhu, Malathi Veeraraghavan, University of Virginia

2 CHEETAH network overview

3 Wide-area circuits
- OC-192 lambda from MCNC between MCNC and the NLR Raleigh PoP (Cisco 15454 MSTPs)
- OC-192 lambda from NLR between the NLR Raleigh PoP and the NLR Atlanta PoP (Cisco 15808 MSTPs)
- OC-192 lambda from SLR between the NLR Atlanta PoP and the SLR/SoX/Telx Atlanta PoP (Movaz)
- 2x1GbE MPLS tunnels from ORNL between the SLR/Telx Atlanta PoP and ORNL; will be upgraded to an OC-192 lambda (Ciena CoreStream)
- Purchasing issue: wide-area circuits are not "capital equipment"

4 CHEETAH nodes (circuit gateways)
- Cisco ONS 15454 MSPP in the original proposal
  - Bought 10/100 Ethernet, GbE, and OC-3 interface cards
  - Watch out for the differences between the three types of Ethernet cards (E-series, ML-series, G-series)
  - Powerful TL1/CTC interfaces
  - Does not support dynamic circuit setup through GMPLS; needs external signaling software to be deployed in the CHEETAH network (UNI support was claimed, but only from the client side)
  - Now used in our lab with two Cisco GSR 12008 routers for local-area CHEETAH research

5 The circuit gateway
- SDM to TDM: more specifically, Ethernet/VLAN on the user side to SONET on the CHEETAH network side
- Implements GFP and VCAT to map Ethernet signals into SONET signals in an efficient manner (for example, 1 Gbps -> 21 x OC-1)
- Why GMPLS?
  - The value of a network grows quadratically with the number of endpoints (Metcalfe's law)
  - The CHEETAH network targets a wide range of applications, not just scientific applications
  - A distributed, dynamic control plane is needed to build a scalable network
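As a rough check of the 1 Gbps -> 21 x OC-1 figure, a quick calculation, assuming roughly 48.4 Mbit/s of usable payload per STS-1 member of a virtual concatenation group (this payload value is an assumption, not stated on the slide):

```cpp
#include <cmath>
#include <cstdio>

int main() {
    // Assumed usable payload per STS-1 (OC-1) member of a VCAT group, in Mbit/s.
    const double sts1_payload_mbps = 48.384;
    const double gbe_rate_mbps = 1000.0;   // GbE rate to be carried over SONET

    // Number of OC-1 members needed to carry a full GbE signal.
    int members = static_cast<int>(std::ceil(gbe_rate_mbps / sts1_payload_mbps));
    std::printf("STS-1-%dv (~%.1f Mbit/s) carries 1 GbE\n",
                members, members * sts1_payload_mbps);   // prints 21 members
    return 0;
}
```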

6 CHEETAH nodes: Sycamore SN16000 intelligent optical switch
- GbE, 10GbE, and SONET interface cards (we have OC-192s)
- Networks/nodes are managed through the SILVX NMS; limited support for CLI and TL1 interfaces
- Switch OS (BroadLeaf) implements GMPLS protocols since Release 7.0
- Pure SONET circuits: excellent GMPLS signaling and routing support
- Ethernet-to-SONET GMPLS is not officially released yet, but a proprietary solution works!

7 CHEETAH nodes: Sycamore SN16000 intelligent optical switch
- Before purchasing, signaling interoperability testing was conducted between the SN16000 and our RSVP-TE client software in Chelmsford, MA
- Three SN16000s were purchased without the SILVX management software (we would recommend getting the SILVX NMS)
- Promising future upgrades of signaling capability
  - Ethernet-to-SONET is already available in unofficial patches
  - LMP, L2SC, unidirectional circuits, CSPF, UNI, etc.
- Purchasing issues: SONET switch equipment vendors' contracts are quite elaborate
  - Very different from data equipment
  - Agreement could not be reached between UVA (state regulations) and Sycamore Networks!

8 CHEETAH network connections
- Three SN16000s were purchased; they are located at MCNC, the SLR/SoX/Telx Atlanta PoP, and ORNL
- The SN16000s are interconnected by OC-192 circuits
- End hosts are connected to the SN16000s' GbE interfaces by direct fiber, VLANs, or MPLS tunnels
  - VLAN and MPLS tunnels were not in the original proposal, but there is a clear need for them for cost reasons
  - Testing shows packet loss and out-of-sequence delivery on these segments!

9 NC configuration – User Plane
[Diagram: user-plane topology at NCSU/MCNC. The Orbitty compute nodes compute-0-0 through compute-0-4 (152.48.249.2-.6) and wukong (152.48.249.102) connect over 1G links, 10G Ethernet switches, direct fibers, and a 5 Gbps VLAN to the GbE ports of cheetah-nc, which has an OC-192 link toward Atlanta.]

10 Atlanta configuration – User Plane
[Diagram: user-plane topology in Atlanta. End hosts zelda1 through zelda5 (10.0.0.11-.15) connect to the GbE ports of cheetah-atl by direct fibers, 1 GbE MPLS tunnels to GaTech, and 2x1GbE MPLS tunnels through a Juniper router to ORNL; an OC-192 link goes to NC.]

11 Things learned during node installation and maintenance
- Rack installation
  - Physical dimensions: 19" or 23" width? height? depth? two-post or four-post rack?
  - Not just the SONET gateway: need space for other equipment such as PCs, Ethernet switches, console servers, PDUs, etc.
- Power supply
  - Need careful power calculations; allow for growth
  - DC power for the switch: voltage, current, etc.
  - AC for other equipment: current, connector type, etc.
- Remote power management
  - Need the capability to remotely (over the Internet) power-cycle equipment
  - Switched PDU (Power Distribution Unit) for AC power

12 Things learned during node installation and maintenance (continued)
- Remote management
  - Internet access
  - Console server: connect the serial port on the SN16000s to PCs to allow remote management when the Ethernet management port on an SN16000 is down
  - Need a better specification of remote manual support from collocation service providers
- Network security
  - Protect the switch from Internet attacks
  - Provide integrity and authentication for control-plane traffic
  - Our solution: a Juniper NetScreen-5XT firewall/VPN appliance for the SN16000s and Openswan software for the Linux end hosts; CHEETAH control-plane traffic is protected by IPsec tunnels
  - These tunnels allow the use of private IP addresses behind the firewall device
- Other considerations
  - Network measurement: need an (old-style) Ethernet hub or a high-end Ethernet switch

13 Things learned for CHEETAH end hosts
- End hosts
  - CPU: frequency, single or dual, cache size
  - Bus speed
  - Memory: size, speed
  - Disk: capacity, speed, SATA or SCSI, RAID
- GbE network interface card (NIC)
  - Optical or copper
  - Bus type: PCI, PCI-X
  - Connector: SC, LC
  - Protocol: 1000Base-T, SX, LX (must match the Ethernet interfaces on the CHEETAH nodes)
- Operating system: Windows, Linux, Unix, etc.; Linux kernel version
- Software
  - Security software: VPN client, ssh/ssl libraries
  - Development tools: gcc compiler
  - Network tools: Iperf

14 Control-plane design
- More difficult than user-plane design
- Must consider the requirements of GMPLS protocols, security, and robustness
- DCC in-band or out-of-band signaling? Must use out-of-band signaling because end hosts are involved
- Security: control-plane traffic must be protected
- Private or public IP addresses?
- Part of the Internet routing domain or endpoints?

15 NC configuration – Control Plane
[Diagram: control-plane topology at NC. The Orbitty compute nodes compute-0-0 through compute-0-4 (128.109.45.160-.164) and wukong (128.109.34.20) reach cheetah-nc (router ID = switch IP = 192.168.5.1) through a control NetScreen-5 (128.109.34.18/.19, 192.168.4.2) and the Internet; TE links: 5 x 1 Gbps to orbitty, a 1 Gbps TE link, and an OC-192 TE link to ORNL16K1.]

16 Atlanta configuration – Control Plane
[Diagram: control-plane topology in Atlanta. End hosts zelda1-zelda3 (130.207.252.131-.133) and zelda4/zelda5 (10.1.1.4/.5, behind an NS-5 at ORNL, 198.124.42.3) reach cheetah-atl (router ID = switch IP = 192.168.3.1) through a control NetScreen-5 (130.207.252.136/.138, 192.168.2.2) and the Internet; TE links: 3 x 1 Gbps, a 1 Gbps link, and an OC-192 TE link to cheetah-nc.]

17 End-to-End GMPLS Signaling in the CHEETAH Project
Xiangfei Zhu, xzhu@cs.virginia.edu, 9/1/2005

18 Outline
- Optical network signaling overview
- Sycamore SN16000 and Cisco 15454
- End-host software for GMPLS signaling
- External GMPLS signaling engine for the Cisco 15454
- Demo
- Conclusion and future work

19 Optical network signaling
- GMPLS: work in progress at the IETF
  - RFC 2205: RSVP for IP networks
  - RFC 3209: RSVP-TE for MPLS
  - RFC 3471 & RFC 3473: RSVP-TE for GMPLS
  - RFC 3946: SONET and SDH extensions
- Optical Internetworking Forum (OIF): UNI, I-NNI, E-NNI
- International Telecommunication Union (ITU): G.8080 (ASON)

20 Vendor support for GMPLS
- Some vendors provide varying levels of GMPLS support in their products, e.g., the Ciena CoreDirector and the Sycamore SN16000
- Successful multi-vendor GMPLS interoperability demos at Isocore and Supercomputing 2005
- The GMPLS signaling implementations of different vendors are basically compatible

21 University GMPLS code
- KOM RSVP Engine, Technische Universitat Darmstadt [7]
  - Partial support for RFC 2205, 2210, and 3209
- DRAGON RSVP-TE code, MAX/ISI [4]
  - Partial support for RFC 3471, 3473, and 3946
  - Integrates with OSPF-TE

22 Work at UVA
- Implemented GMPLS software for end hosts: the end-host RSVP-TE client
  - Integrates GMPLS signaling with admission control
  - OCS (Optical Connectivity Service) (not fully done)
- Integrated with the 15454 control software
- Interoperability testing with the Sycamore SN16000
- Added support for CAC and route computation to the VLSR (work at CUNY)

23 SN16000: optical control-plane features
- GMPLS RFCs and drafts
  - RFC 3471: GMPLS Signaling Functional Description
  - RFC 3473: GMPLS Signaling RSVP-TE Extensions
  - Drafts: Routing Extensions in Support of GMPLS; OSPF Extensions in Support of GMPLS; Generalized Multi-Protocol Label Switching Architecture; GMPLS Extensions for SONET and SDH Control; Framework for GMPLS-based Control of SDH/SONET Networks
- In-fiber (in-band) and out-of-fiber (out-of-band) control plane
- Fiber-Switch Capable support enables communication with FSC devices (in addition to TDM devices)
- OIF UNI 1.0
(Slide from Sycamore)

24 Cisco 15454
- Cisco SONET Multiservice Provisioning Platform (MSPP)
- Does not support GMPLS signaling

25 End-host context
- Routing decision: decide whether to use a CHEETAH circuit or the Internet to transfer a file, based on the Internet congestion status and the file size
- FRTP: Fixed-Rate Transport Protocol, designed for circuit-switched networks [6]
- RSVP-TE client: dynamic provisioning of the circuit
[Diagram: the end-host PC runs FTP over TCP/IP and FRTP, with a routing-decision module and an RSVP-TE client [3]; NIC I faces the Internet and NIC II faces the CHEETAH network.]
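As a rough illustration of the kind of routing decision involved (the comparison below, the way Internet throughput is estimated, and all numeric values are assumptions for this sketch, not the CHEETAH decision algorithm itself):

```cpp
#include <cstdio>

// Pick CHEETAH circuit vs. Internet path for a file transfer.
// The real decision also factors in measured Internet congestion and circuit availability.
bool use_circuit(double file_bytes, double tcp_throughput_bps,
                 double circuit_rate_bps, double setup_delay_s) {
    double t_internet = 8.0 * file_bytes / tcp_throughput_bps;               // estimated TCP transfer time
    double t_circuit  = setup_delay_s + 8.0 * file_bytes / circuit_rate_bps; // setup plus transfer at circuit rate
    return t_circuit < t_internet;
}

int main() {
    // 1 GB file, 200 Mbit/s estimated TCP throughput, 1 Gbit/s circuit, ~4.5 s setup delay
    std::printf("use circuit: %s\n",
                use_circuit(1e9, 200e6, 1e9, 4.5) ? "yes" : "no");
    return 0;
}
```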

26 End-host RSVP-TE client software architecture

27 bwrequestor
- Command for end users to request a circuit:
  bwrequestor DESTINATION-DOMAIN-NAME BANDWIDTH
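For example, requesting a 1000 Mbit/s circuit to zelda3 might look like the line below (the fully qualified host name is hypothetical, and bandwidth is assumed to be in Mbit/s, matching the unit used in the bwmgr configuration file):
  bwrequestor zelda3.cheetah.example.net 1000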

28 bwmgr daemon
- Reads its configuration from a configuration file
- Initiating circuit setup (sketched below):
  - Accept the circuit request
  - Check whether the destination is in the CHEETAH network (OCS)
  - Bandwidth management (CAC)
  - Create an RSVP session and send out a PATH message
  - Update the ARP/IP tables for the data plane
- Accepting circuit-setup requests:
  - Register a default session with the RSVP daemon and listen for PATH messages
  - If it is a new session: fork a new process, create a new session, update the CAC table, and send back a RESV message
  - Update the ARP/IP tables for the data plane
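A minimal sketch of the setup-initiation flow described above. The helper functions are hypothetical stand-ins for the OCS lookup, CAC check, RSVP-TE signaling, and data-plane table updates that the real bwmgr performs:

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-ins; the real implementations live in bwmgr/OCS/CAC/RSVPD.
static bool ocs_lookup(const std::string& dest)        { return !dest.empty(); }  // dest reachable via CHEETAH?
static bool cac_reserve(double mbps)                   { return mbps <= 1000.0; } // local call admission control
static bool rsvp_send_path(const std::string&, double) { return true; }           // send PATH, wait for RESV
static void update_arp_ip_tables(const std::string&)   {}                         // steer routes to the circuit NIC

// Handle one bwrequestor request: returns true if the circuit came up.
bool handle_request(const std::string& dest, double mbps) {
    if (!ocs_lookup(dest))           return false;  // destination not on CHEETAH: use the Internet path
    if (!cac_reserve(mbps))          return false;  // not enough local TE-link bandwidth
    if (!rsvp_send_path(dest, mbps)) return false;  // signaling failed
    update_arp_ip_tables(dest);                     // point data-plane traffic at the circuit
    return true;
}

int main() {
    std::printf("circuit %s\n", handle_request("zelda3", 1000.0) ? "up" : "not set up");
    return 0;
}
```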

29 bwmgr configuration file
- The configuration file includes:
  - The control-plane address of the node
  - The address of the edge switch the node is connected to
  - TE-link information:
    - Local data-plane interface
    - Link type (Ethernet/SONET) and bandwidth
    - Interface types (numbered or unnumbered) of the two interfaces (local and remote)
    - IP (numbered interface) / interface ID (unnumbered) of each interface
- A sample configuration file:
  CTRL-PLANE-IP = 130.207.252.131
  EDGE-ROUTER-IP = 192.168.2.2
  # TE-Links
  # TELink <data-plane interface> <link type: 0-Ethernet, 1-SONET> <bandwidth, Mbit>
  #        <local interface type: 0-unnumbered, 1-numbered> <local interface IP/ID>
  #        <remote interface type> <remote interface IP/ID>
  TELink eth2 0 1000 0 1 0 1

30 bwmgr library
- Provides two interfaces:
  - BWRequest(): set up a circuit
  - BWTeardown(): tear down a circuit
- Can easily be integrated with user applications
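A sketch of how an application might wrap a transfer with these two calls. The slide only names the entry points, so the argument lists and return conventions below are assumptions; the stubs stand in for the real bwmgr library:

```cpp
#include <cstdio>

// Stand-in stubs for the bwmgr library entry points (signatures assumed for illustration).
static int BWRequest(const char* /*dest*/, int /*bandwidth_mbps*/) { return 0; }  // 0 = success (assumed)
static int BWTeardown(const char* /*dest*/)                        { return 0; }

void transfer_file_over_circuit(const char* dest) {
    if (BWRequest(dest, 1000) != 0) {    // ask bwmgr to signal a 1 Gbps circuit
        std::printf("circuit setup failed; falling back to the Internet path\n");
        return;
    }
    // ... run the file transfer (e.g., FRTP) over the secondary NIC here ...
    BWTeardown(dest);                     // release the circuit when the transfer completes
}

int main() { transfer_file_over_circuit("zelda3"); return 0; }
```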

31 External GMPLS engine for equipment without GMPLS capability
- DRAGON's VLSR (Virtual Label Switching Router) [4] as an external GMPLS engine
  - RSVP-TE message parsing and construction
  - Fabric-programming module for some Ethernet switches, via SNMP
- Adapting the VLSR for the Cisco 15454
  - Monfox TL1 library: provides an interface for an external program to provision circuits by issuing TL1 commands to the 15454
  - Difficulty: the library is in Java while the DRAGON code is in C++
    - Figured out how to integrate Java code with C++ through CNI, the Cygnus Native Interface (by Lingling Cui); see the sketch below
  - Integrated DRAGON's RSVP-TE software with the 15454 control software
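A minimal sketch of the CNI approach, assuming the Java TL1 library is compiled to native code with GCJ so its classes can be called directly from C++. The Monfox class and the TL1 command string shown here are placeholders, not the actual Monfox API:

```cpp
// Build sketch (GCJ/CNI): compile the Java library with gcj, generate C++ headers
// with gcjh, then link this file against the compiled library and libgcj.
#include <gcj/cni.h>
#include <java/lang/String.h>
// #include <com/monfox/dtl1/Tl1Session.h>   // hypothetical gcjh-generated header

int main() {
    JvCreateJavaVM(NULL);                  // start the embedded libgcj runtime
    JvAttachCurrentThread(NULL, NULL);     // attach this C++ thread to it

    // Java strings/objects are used as ordinary C++ pointers under CNI.
    java::lang::String* cmd =
        JvNewStringLatin1("ENT-CRS-STS1::FROM-AID,TO-AID:100;");   // illustrative TL1 cross-connect
    (void)cmd;
    // Tl1Session* s = new Tl1Session(...);   // hypothetical Monfox session object
    // s->send(cmd);                          // issue the command to the 15454

    JvDetachCurrentThread();
    return 0;
}
```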

32 External GMPLS engine for the Cisco 15454
[Diagram: PATH and RESV message flow through the external GMPLS engine and the 15454.]

33 Demo
[Diagram: demo setup. wukong at MCNC (control plane 128.109.34.20, data plane 152.48.249.102) and zelda1 in Atlanta (control plane 130.207.252.131, data plane 10.0.0.11), each behind a control NetScreen-5, are connected over the OC-192/GbE circuit between cheetah-nc and cheetah-atl.]

34 Performance
- The average end-to-end setup delay is around 4.5 seconds
- A detailed delay breakdown has not been measured yet

35 Experiment with the Cisco 15454

36 Performance
- Performance of the external GMPLS engine for the MSPP [5]
  - Time for cross-connection setup: STS-1: 17.833 ± 0.184 ms; STS-3: 18.000 ± 0.081 ms
  - Time for cross-connection deletion: STS-1: 16.400 ± 0.175 ms; STS-3: 16.300 ± 0.145 ms

37 Conclusions
- It is feasible to extend dynamic circuits to end hosts by running RSVP-TE software on the end hosts
- It is feasible to add GMPLS signaling capability to devices without built-in GMPLS support
- The standards are mature and the vendor implementations are good

38 Future work
- Development
  - Alpha and beta testing
  - Hand the software to scientists to use
  - Finish the development of the VLSR at CUNY
- Research
  - Bandwidth scheduling for circuit-switched networks
  - Immediate calls vs. scheduled calls
  - Distributed bandwidth scheduling

39 Thank you!

40 References
[1] http://cheetah.cs.virginia.edu/
[2] John H. Moore, Xuan Zheng, Malathi Veeraraghavan, "CHEETAH overview," http://cheetah.cs.virginia.edu/networks/Cheetah%20Overview.jpg
[3] Malathi Veeraraghavan, Nagi Rao, "CHEETAH network," July 7, 2004
[4] http://dragon.east.isi.edu/
[5] Lingling Cui, "External Switch Control Software," CHEETAH project year 1 demo, September 1, 2004
[6] X. Zheng, A. P. Mudambi, and M. Veeraraghavan, "FRTP: Fixed Rate Transport Protocol -- A modified version of SABUL for end-to-end circuits," PATHNets 2004 at BROADNETS 2004, September 2004, San Jose, CA
[7] KOM RSVP Engine, http://www.kom.e-technik.tu-darmstadt.de/rsvp/
[8] Monfox DynamicTL1 SDK, http://www.monfox.com/dtl1_sdk.html

41 Acronyms
- CHEETAH: Circuit-switched High-speed End-to-End Transport ArcHitecture
- RSVP: Resource Reservation Protocol
- RSVP-TE: RSVP with Traffic Engineering extensions
- GMPLS: Generalized Multi-Protocol Label Switching
- SONET: Synchronous Optical NETwork
- SDH: Synchronous Digital Hierarchy
- IETF: Internet Engineering Task Force
- RFC: Request for Comments
- UNI: User-Network Interface
- I-NNI: Internal Network-Network Interface
- E-NNI: External Network-Network Interface

42 Transport protocol for dedicated circuits

43 Outline
- Transport protocol functionality
- Requirements for dedicated circuits
- User-space implementation
- Kernel-space implementation

44 Transport protocols
- End-to-end functionality falls in the transport layer's domain:
  - Error control
  - Congestion control
  - Flow control

45 Error control
- End-to-end reliability: no missing data, no reordering, no duplicates
- Data packets get lost (dropped due to lack of buffer space)
- Bit errors corrupt packets in transit
- Error control: infer that something is wrong and take steps to correct it
  - Retransmit missing packets, buffer out-of-order data until the missing data arrives, suppress duplicates

46 Congestion control
- Avoid or reduce the damage caused by buffers in the network overflowing
- Inferring that the network is congested:
  - Explicit signals from the node where the buffer overflows
  - Guesses from indirect signals such as packet loss and RTT variation
- To reduce congestion, slow the rate at which packets are put into the network

47 Flow control
- Avoid overrunning the receiver
- If the receiver is the bottleneck, maintain a sending rate that matches the receiver's rate
- The receiver signals its willingness to accept more data
- The sender sends data only when the receiver is ready

48 Transport protocol for dedicated circuits
- Error control: errors can still occur, so error control is needed
- Congestion control: bandwidth is reserved in the network, so there is no congestion as long as the sender stays below the reserved circuit rate
  - Explicit congestion signals will never come, so they pose no problem; inferring congestion from packet loss is risky, since loss due to other causes would wrongly be taken to signify congestion
- Flow control: the receiver can still be overrun, so flow control is required

49 User-space implementation
- UDP: a minimal transport protocol with no error, congestion, or flow control
- UDP is essentially an interface to the IP layer, with low overhead
  - Perfect for adding extra functionality as needed
- Many variations of UDP-based protocols exist: SABUL, Hurricane, Tsunami, UDT, ...

50 User-space implementation
- We took SABUL and modified it for our needs
  - SABUL: Simple Available Bandwidth Utilization Library
  - Uses UDP for the data transfer
  - Adds reliability and congestion/flow control using a control channel that runs over TCP

51 SABUL protocol: error control
- Sequence numbers are added to the UDP packets
- On receiving out-of-order packets, the receiver sends a NAK control packet
- The sender keeps unacknowledged packets in memory and retransmits missing packets on receiving a NAK
- The receiver sends periodic ACKs to allow the sender to free buffer space
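A minimal sketch of the receiver-side gap detection described above. The data structures, packet formats, and NAK contents are simplified assumptions, not SABUL's actual implementation:

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// When a packet arrives beyond the next expected sequence number, report the
// missing range in a NAK so the sender retransmits from its in-memory buffer.
struct Receiver {
    uint32_t next_expected = 0;

    // Returns the sequence numbers to NAK for this arrival (empty if in order).
    std::vector<uint32_t> on_data(uint32_t seq) {
        std::vector<uint32_t> nak;
        if (seq > next_expected) {
            for (uint32_t s = next_expected; s < seq; ++s) nak.push_back(s);  // gap: request these
        }
        if (seq >= next_expected) next_expected = seq + 1;
        // Retransmissions with seq < next_expected fill the gap; duplicates are suppressed.
        return nak;
    }
};

int main() {
    Receiver r;
    for (uint32_t seq : {0u, 1u, 4u, 2u, 3u, 5u}) {   // packets 2 and 3 arrive late
        for (uint32_t s : r.on_data(seq))
            std::printf("NAK %u (on arrival of %u)\n", s, seq);
    }
    return 0;
}
```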

52 SABUL protocol: congestion control / flow control
- The receiver periodically sends a SYN control packet carrying the number of packets received since the last SYN
- The sender uses this information to estimate the loss rate and adjusts the sending rate based on it
- The sending rate is adjusted by adjusting the inter-packet gap

53 SABUL implementation
- Implemented in C++
- Provides an interface similar to the socket interface: connect, send, receive
- One thread for the application generating the data to send (or consuming received data)
- One thread to send (or receive) the data and handle the control channel
- An accurate inter-packet gap is maintained by busy-waiting
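A minimal sketch of rate pacing by busy-waiting, in the spirit of the two slides above: the inter-packet gap is derived from the target rate, and the sender spins until the gap has elapsed. The clock source, packet size, and rate are assumptions; SABUL's own sender is more involved:

```cpp
#include <chrono>
#include <cstdio>

// Pace packet transmissions to a target rate by busy-waiting out the inter-packet gap.
int main() {
    using clock = std::chrono::steady_clock;
    const double rate_bps = 800e6;            // target sending rate (assumed; adjusted from loss feedback)
    const double pkt_bits = 1470.0 * 8.0;     // payload size per UDP packet (assumed)
    const auto   gap      = std::chrono::duration<double>(pkt_bits / rate_bps);

    auto next_send = clock::now();
    for (int i = 0; i < 5; ++i) {
        while (clock::now() < next_send) { /* busy-wait: precise, but burns CPU cycles */ }
        // sendto(sock, buf, len, ...) would go here in the real sender
        std::printf("packet %d sent, gap = %.1f us\n", i,
                    std::chrono::duration<double, std::micro>(gap).count());
        next_send += std::chrono::duration_cast<clock::duration>(gap);
    }
    return 0;
}
```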

54 SABUL -> FRTP
- Stripped out the rate-alteration part, to avoid leaving the reserved circuit idle
- Added support for using the secondary NIC for the data channel
- Ran experiments to find optimal values for the many tunable parameters

55 Drawbacks of SABUL
- Flow control: no way to prevent receiver buffer overflow
- Busy-waiting to maintain the inter-packet gap uses up CPU cycles and makes the sending process sensitive to other processes
- Error-control efficiency: many packets were retransmitted unnecessarily

56 Kernel-space implementation
- Flow control is essential because the receiver cannot guarantee that its buffer will be emptied at the rate it is filled
- TCP has window-based flow control
- TCP's self-clocking can provide a more efficient way of maintaining a sending rate
  - Interrupts from the NIC allow new packets to be put on the network; the scheduler has less impact on packet transmission

57 Kernel-space implementation
- TCP congestion control is not required, though
  - Congestion window cwnd: the amount of data the network can sustain
  - Receiver advertised window rwnd: the amount of data the receiver has buffer space for
  - A TCP sender tries to keep min(cwnd, rwnd) of outstanding data in the network
  - Congestion control is a way to estimate the value of cwnd

58 TCP congestion control
- Slow start: start with cwnd set to a small value (2-4 segments) and increase it by one segment for every ACK received (doubling cwnd per RTT)
- Congestion avoidance: once cwnd crosses a threshold, increase it by one segment every RTT
- On inferring loss from triple duplicate ACKs, cut cwnd in half; a timeout drops it back to one segment
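A compact sketch of these update rules, counted in segments with a fixed illustrative slow-start threshold; real TCP stacks track this in bytes and with many refinements:

```cpp
#include <algorithm>
#include <cstdio>

// Per-ACK and per-loss cwnd updates, in units of segments.
struct CongestionState {
    double cwnd     = 2.0;    // initial window (2-4 segments)
    double ssthresh = 64.0;   // slow-start threshold (illustrative value)

    void on_ack() {
        if (cwnd < ssthresh) cwnd += 1.0;         // slow start: +1 segment per ACK (doubles per RTT)
        else                 cwnd += 1.0 / cwnd;  // congestion avoidance: ~+1 segment per RTT
    }
    void on_triple_dupack() {                     // fast retransmit / fast recovery
        ssthresh = std::max(cwnd / 2.0, 2.0);
        cwnd     = ssthresh;
    }
    void on_timeout() {                           // retransmission timeout
        ssthresh = std::max(cwnd / 2.0, 2.0);
        cwnd     = 1.0;
    }
};

int main() {
    CongestionState c;
    for (int i = 0; i < 100; ++i) c.on_ack();
    std::printf("cwnd after 100 ACKs: %.1f segments\n", c.cwnd);
    c.on_triple_dupack();
    std::printf("cwnd after loss: %.1f segments\n", c.cwnd);
    return 0;
}
```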

59 FRTP: TCP-based version
- With a dedicated circuit, the amount of data the network can sustain (what cwnd represents) is fixed and known
- Slow start only serves to bring cwnd close to the actual network capacity, so it is not required
- The additive increase of congestion avoidance and the multiplicative decrease of loss recovery are not required either

60 FRTP implementation
- We use the Web100 TCP stack, which provides an interface to modify/set some TCP parameters
- Added two new control parameters:
  - One to select whether the TCP socket is connected to a CHEETAH circuit
  - One to provide the value to which cwnd is to be set; this value is the bandwidth-delay product (BDP)
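For reference, the BDP is simply the circuit rate times the round-trip time. A quick calculation with illustrative numbers (the RTT and segment size below are assumptions, not measured CHEETAH values):

```cpp
#include <cstdio>

int main() {
    const double rate_bps = 1e9;     // 1 Gbit/s circuit
    const double rtt_s    = 0.030;   // assumed 30 ms round-trip time
    const double mss      = 1460.0;  // typical TCP segment payload in bytes

    double bdp_bytes = rate_bps / 8.0 * rtt_s;   // bytes in flight needed to keep the circuit full
    std::printf("BDP = %.2f MB (~%.0f segments)\n",
                bdp_bytes / 1e6, bdp_bytes / mss);
    return 0;
}
```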

61 FRTP utility
- For small to medium-sized transfers, the gains from avoiding slow start are substantial
- The advantage shrinks as the transfer size increases
- By keeping only a BDP's worth of data outstanding, FRTP might get lower throughput if the sending process goes silent
  - The switches may not have enough extra data in their buffers to cover these silences

62 File transfer experiment
- Tested different file transfer applications over the CHEETAH testbed: ORNL, TN -> SoX/SLR PoP, Atlanta, GA -> Centaur lab, NCSU, NC

  Host         compute-0-0            zelda3                 zelda4                 wukong
  Location     NCSU                   Atlanta                ORNL                   MCNC
  CPU          Dual 2.4 GHz Xeon      Dual 2.8 GHz Xeon      Dual 2.8 GHz Xeon      2.8 GHz Xeon
  Memory       2 GB                   2 GB                   2 GB                   1 GB
  Disk         SCSI 3x10K rpm RAID0   SCSI 2x10K rpm RAID0   SCSI 2x10K rpm RAID0   SCSI 2x15K rpm RAID0
  Kernel       2.6.11                 2.4.21                 2.4.21                 2.6.9
  File system  XFS                    EXT3                   EXT3                   EXT3

63 File transfer experiment: results
- Among FTP, SFTP, SABUL, Hurricane, BBCP, and FRTPv1, FTP was seen to have the best performance
- FTP achieves the highest throughput and is the most consistent
- Send and receive buffers have to be tuned for the best performance

