GMPLS optical networks

1 GMPLS optical networks
Malathi Veeraraghavan, Professor, Charles L. Brown Dept. of Electrical & Computer Engineering, University of Virginia
ETRI, Korea, Feb. 2009
GMPLS: Generalized MultiProtocol Label Switched networks (MPLS, SONET, WDM, SDM, VLAN)

2 Outline Telcom “transport network” Cheetah vs. Dragon Approach
Theoretical concepts GMPLS networks Technologies, off-the-shelf switches, control-plane protocols State of the art on different applications & networks Commercial Research-and-Education (REN) networks

3 Spectrum of services Leased line IP
Leased lines are used to connect IP routers. A network that offers leased-line service is called a “transport network” by the telecom industry.
Circuit technologies (time/frequency division multiplexing):
PDH: T1, T3; switch: Digital Cross Connect (DCS)
SONET/SDH: OC3-OC768; switch: SONET/SDH crossconnects
DWDM: OTU1-OTU3; switch: optical WDM crossconnects
Packet technologies (virtual circuit switches): ATM, MPLS, Carrier-grade Ethernet
All of the above are data-plane technologies.

4 IP and leased line service deployment
Telco service provider (transport network) owns the circuit or virtual circuit (VC) switches
Internet service provider or enterprise owns the IP routers
Leased lines through the circuit/VC switches connect the IP routers

5 Management plane (in transport network)
(1) Admins use Web interface to request leased line creation
(2) Network management system (NMS) computes path with available bandwidth
(3) NMS sends provisioning signals to each switch on path using SNMP/CLI/TL1
Switch controller has minimal software (SNMP agent, CLI/TL1 parser)
Endpoints: customer edge devices

6 Spectrum of services Leased line IP New service: rapid provisioning
Verizon Bandwidth-on-Demand (BoD) IP

7 Management plane + control plane
(1) Admins use Web interface to request leased line creation
(2) Network management system (NMS) still computes path with available bandwidth
(3) TL1/CLI to edge node
(4) Hop-by-hop distributed signaling for circuit/VC provisioning
Switch controllers have RSVP-TE software
Endpoints: customer edge devices

8 Progress made in telcom industry
Data-plane progress: excellent. Interesting new switching technologies are being invented for transport networks.
Control-plane progress: switch controllers implement RSVP-TE and are capable of distributed route computation and admission control.
But only the provisioning phase is distributed. Requests for circuits/VCs are still handled through the management plane, with involvement of administrators, even in “dynamic” scenarios.
Why is this an issue? It limits access to the “transport” circuit/VC network.

9 Difference with R&E thinking
(1) Application software running at end host initiates request for circuit/VC
(2) Scheduler computes path with available bandwidth
(3) TL1/CLI to edge node
(3a) Configure enterprise router to filter packets for long flow on to circuit/VC
(4) Hop-by-hop distributed signaling for circuit/VC provisioning
Switch controllers have RSVP-TE software
Diagram labels: scheduler, external controller, enterprise

10 Effect of opening up access to circuit/VC “transport” network
Application software running on end hosts deep inside enterprises can access dynamic circuit/VC services of the backbone transport network.
Circuit network reach does not need to extend all the way to the desktop.
With an additional high-speed line from the enterprise edge router into the transport network, high-speed access can be enabled for short durations.
High call volume of setup/release: calls are generated automatically by software.
New applications!

11 Spectrum of services
Spectrum: Leased line, Verizon BoD, eScience 10G, POTS (Plain Old Telephone Service, 64kbps), IP. The new services lie between leased line and IP.
Book-ahead (BA) mode (OSCARS/DRAGON): call duration specified; current solution is centralized per-domain path computation/admission control; low call handling volume
Immediate-Request (IR) mode (CHEETAH): unspecified call duration; low call setup overhead (so holding times can be shorter); distributed path computation/admission control; high call handling volume

12 Outline Telcom “transport network” Cheetah vs. Dragon Approach
Theoretical concepts GMPLS networks Technologies, off-the-shelf switches, control-plane protocols State of the art on different applications & networks Commercial Research-and-Education (REN) networks

13 Observations "Many e-science experiments ... are optimized to provide maximum throughput to a few facilities, as opposed to moderate throughput to millions of users, which is the raison d'etre for commercial networks." Networks should be scalable. Metcalfe's law: the value of a network grows in proportion to the square of the number of users.

14 Key difference between DRAGON and CHEETAH
DRAGON focus: eScience
Small number of users; high throughput to a few facilities
Transfer technology to Internet2
Implement and deploy software for book-ahead reservations and circuit provisioning by teaming with ESNet and DANTE
CHEETAH focus: general-purpose commercial network
Goal: bring GMPLS services to millions of users, not with just moderate throughput but also at high rates
Analyze GMPLS network bandwidth-sharing modes (BA + IR)
Implementation: IR

15 Background Types of switches Types of bandwidth-sharing modes
IP networks vs connection-oriented (GMPLS) networks Tradeoffs in GMPLS network modes Immediate-request mode (e.g., Plain Old Telephone Service) Book-ahead (advance-reservation)

16 Types of switches
Rows: multiplexing technique on data-plane links. Columns: admission control in control plane?

Multiplexing technique | Connectionless (CL): no admission control | Connection-oriented (CO): admission control
Circuit switch (CS): position based (port, time, lambda) | Not an option | e.g., telephone, SONET, WDM, SDM
Packet switch (PS): header based | e.g., Ethernet | Virtual-circuit, e.g., MPLS, ATM, PBBTE

The connection-oriented column (circuit and virtual-circuit switches) comprises the GMPLS network switches.

17 Difference between bandwidth (BW)-sharing modes
In connectionless networks (e.g., IP):
Pre-1988 IP network: just send data without reservations or any mechanism to adjust rates → congestion collapses in the Internet in the 80s!
Van Jacobson's 1988 contribution: added congestion control to TCP; the sending TCP adjusts its rate
TCP congestion-control pros: proportional fairness and high utilization
TCP congestion-control cons: no rate guarantees and no temporal fairness (job seniority)
In connection-oriented networks (e.g., GMPLS):
Key: admission control

18 Bandwidth sharing modes in GMPLS networks
Can execute admission control in two ways:
Bufferless (immediate-request)
With buffers (book-ahead is effectively the same as having buffers to hold calls to start in the future)
Immediate-request: M/G/m/m model
m: number of channels on a link (servers); if all channels are occupied, reject call
Book-ahead: M/G/m/p model
p: max number in system; advance-reservation window K = p/m timeslots
Metrics: waiting time and call blocking
K cannot be ∞: need to block calls if per-server traffic intensity can be > 1, or engineer the system so per-server traffic intensity ≤ 1
Difference: not as the names suggest
IR calls need bandwidth immediately, but it is a misconception that BA with a book-ahead time of “now” is the same as IR. NOT TRUE.
Instead, call duration needs to be specified to support BA mode; for IR mode, applications do not need to specify duration
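The two admission-control styles on this slide can be sketched in a few lines of Python. This is a minimal single-link illustration, not the CHEETAH or DRAGON implementation: the class names IRLink and BAFirstLink, the slot-based time model, and the first-fit search are all assumptions made for the example.

```python
class IRLink:
    """Immediate-request (bufferless) admission on one link with m channels."""

    def __init__(self, m):
        self.m = m
        self.busy = 0  # channels currently held by calls

    def request(self):
        # No duration is supplied: admit only if a channel is free right now.
        if self.busy < self.m:
            self.busy += 1
            return True
        return False  # all m channels occupied: the call is blocked

    def release(self):
        if self.busy > 0:
            self.busy -= 1


class BAFirstLink:
    """Book-ahead admission with a window of K timeslots (first-fit start)."""

    def __init__(self, m, K):
        self.m = m
        self.K = K
        self.reserved = [0] * K  # channels reserved in each future timeslot

    def request(self, duration):
        # The caller must state the call duration (in timeslots).
        if duration < 1:
            raise ValueError("duration must be at least one timeslot")
        for start in range(self.K - duration + 1):
            slots = range(start, start + duration)
            if all(self.reserved[t] < self.m for t in slots):
                for t in slots:
                    self.reserved[t] += 1
                return start  # first timeslot at which the call can begin
        return None  # no room inside the reservation window: call is blocked
```

The sketch makes the slide's distinction concrete: the book-ahead scheduler cannot place a call without a duration, while the immediate-request link never sees one.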

19 IR mode: M/G/m/m ErlangB formula
ρ: offered traffic load in Erlangs
λ: call arrival rate
1/μ: mean call holding time
ρ/m: per-server traffic intensity
m: number of circuits
Pb: call blocking probability
ub: utilization

For a 1% call blocking probability, i.e., Pb = 0.01:

ρ | m | ub
1 | 4 | 24.8%
10 | 17 | 58.2%
100 | 117 | 84.6%

If m is small, high utilization can only be achieved along with high call blocking probability.
1997 Crovella et al. paper: alpha = 1.06, k = 1000B
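As a hedged illustration of the Erlang B calculation behind this slide, the sketch below uses the standard recursion for the M/G/m/m blocking probability and derives utilization as carried load per circuit. The function names are assumptions for the example; `rho` is the offered load in Erlangs and `m` the number of circuits. Because m is an integer, the blocking values for the slide's (load, m) pairs come out near the 1% target rather than exactly at it.

```python
def erlang_b(rho, m):
    """Blocking probability of an M/G/m/m system with offered load rho (Erlangs)."""
    b = 1.0  # blocking probability with zero circuits
    for k in range(1, m + 1):
        b = rho * b / (k + rho * b)  # standard Erlang B recursion
    return b


def utilization(rho, m):
    """Carried load per circuit: rho * (1 - Pb) / m."""
    return rho * (1.0 - erlang_b(rho, m)) / m


if __name__ == "__main__":
    # (offered load, circuits) pairs taken from the slide's table
    for rho, m in [(1, 4), (10, 17), (100, 117)]:
        print(f"rho={rho:>3} m={m:>3} "
              f"Pb={erlang_b(rho, m):.4f} u={utilization(rho, m):.3f}")
```

The recursion avoids the large factorials of the closed-form expression, so it stays numerically stable for links with hundreds of circuits.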

20 Comparison of Immediate-Request (IR) and Book-Ahead (BA) schemes
U: utilization; K: number of time periods in advance-reservation window
Link capacity C = 10Gbps; m = 10 if per-call allocation = 1Gbps

Scheme | m | K | U | PB
IR | 10 | - | 80% | 23.6%
IR | 100 | - | 80% | 0.4%
BA | 10 | 10 | 80% | 0.4%

Example:
To achieve a 90% utilization with a call blocking probability less than 10%, BA-First schemes are needed when m < 59
To achieve a 90% utilization with a call blocking probability less than 20%, BA-First schemes are needed when m < 32
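The IR figures on this slide can be approximated numerically. The sketch below is an illustration under stated assumptions, not the authors' method: it bisects on the offered load until the carried load per circuit reaches the target utilization, then reports the Erlang B blocking probability at that load. The function name blocking_at_utilization and the search bracket are hypothetical. It covers only the IR rows; the BA result comes from the authors' book-ahead analysis and is not reproduced here.

```python
def erlang_b(rho, m):
    """Erlang B blocking probability for offered load rho on m circuits."""
    b = 1.0
    for k in range(1, m + 1):
        b = rho * b / (k + rho * b)
    return b


def blocking_at_utilization(m, target_u, tol=1e-9):
    """Blocking probability on an m-circuit IR link at utilization target_u.

    Utilization is carried load per circuit, rho * (1 - Pb) / m, which grows
    with the offered load rho, so bisection on rho converges.
    """
    if not 0.0 < target_u < 0.99:
        raise ValueError("target_u must lie in (0, 0.99)")
    lo, hi = 0.0, 1000.0 * m  # assumed bracket for the offered load
    while hi - lo > tol:
        rho = (lo + hi) / 2.0
        carried = rho * (1.0 - erlang_b(rho, m)) / m
        if carried < target_u:
            lo = rho
        else:
            hi = rho
    return erlang_b(hi, m)


if __name__ == "__main__":
    for m in (10, 100):
        print(m, round(blocking_at_utilization(m, 0.8), 4))
```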

21 Bandwidth sharing mechanisms in GMPLS networks
Bandwidth sharing mechanisms:
Immediate-request: unspecified call duration
Book-ahead: call duration specified. Needed if per-call circuit rate is a large fraction of link capacity (e.g., 1Gbps circuits on a 10Gbps link, m = 10)
Book-ahead variants:
BA-n/BA-First: session-type requests (BW, duration)
BA-n: users specify a set of n call-initiation time options
BA-First: users are given first available timeslot
VBDS (Varying-Bandwidth Delayed Start): data-type requests (file size); can assign any rate, even vary rate in different time ranges
X. Zhu, Ph.D. Thesis, UVA,

22 Relate BW sharing modes to network types
Bandwidth-sharing mechanisms by network type:
eScience networks (small number of users), Book-Ahead (BA, high rate per call): very large (TB, PB) file transfers need high BW and long holding times; remote visualization also needs to reserve other resources such as displays. A centralized control-plane solution is sufficient, since call durations are long (OSCARS+DRAGON).
eScience networks, Immediate-Request (IR, moderate rate per call): What applications? Centralized control-plane (DRAGON).
General-purpose networks (large number of users), BA: to assign 1Gb/s on a 10Gb/s link per file transfer, m = 10, so BA mode is needed. A distributed control-plane solution is needed: short durations imply a high call arrival rate at the same utilization (load).
General-purpose networks, IR: moderately large (100MB, GB) file transfers assigned moderate BW ( Mbps) (CHEETAH).
If 1/mu is small, lambda has to be high to achieve the same per-server traffic intensity, which is rho/m, where rho = lambda * (1/mu). If the call arrival rate is high, distributed signaling engines are needed to handle the high load.
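The relation rho = lambda * (1/mu) can be turned into a short worked example. The numbers below (hour-long calls on 10 circuits versus ten-second calls on 100 circuits, both at a per-server traffic intensity of 0.8) are illustrative assumptions, not measurements from the talk; the function name is hypothetical.

```python
def required_arrival_rate(intensity, m, mean_holding_time_s):
    """Call arrival rate (calls/s) that yields per-server intensity rho/m.

    From rho = lambda * (1/mu): lambda = intensity * m / mean_holding_time_s.
    """
    return intensity * m / mean_holding_time_s


# Long-held, high-rate calls: few circuits, hour-long holding times.
long_calls = required_arrival_rate(0.8, 10, 3600.0)

# Short-held, moderate-rate calls: many circuits, ten-second holding times.
short_calls = required_arrival_rate(0.8, 100, 10.0)

if __name__ == "__main__":
    print(f"long-held calls:  {long_calls:.5f} calls/s")
    print(f"short-held calls: {short_calls:.1f} calls/s")
```

Under these assumed numbers the short-held case needs several thousand times the signaling rate of the long-held case at the same intensity, which is the slide's argument for distributed signaling engines.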

23 References on bandwidth sharing modes
IR mode for file transfers with moderate-BW allocation (100Mbps on 10Gbps link):
X. Fang and M. Veeraraghavan, “On using a hybrid architecture for file transfers,” accepted to IEEE Transactions on Parallel and Distributed Systems, 2009.
X. Fang and M. Veeraraghavan, “On using circuit-switched networks for file transfers,” in IEEE Globecom, New Orleans, LA, Nov
X. Zhu, X. Zheng, and M. Veeraraghavan, “Experiences in implementing an experimental wide-area GMPLS network,” IEEE Journal on Selected Areas in Communications (JSAC), Apr
M. Veeraraghavan, X. Fang, and X. Zheng, “On the suitability of applications for GMPLS networks,” in IEEE Globecom, San Francisco, CA, Nov
Large-scale deployment of BA mode (mean waiting time, blocking rate):
X. Zhu and M. Veeraraghavan, “Analysis and Design of Book-ahead Bandwidth-Sharing Mechanisms,” IEEE Transactions on Communications, Dec. 08.
X. Zhu, M. E. McGinley, T. Li, and M. Veeraraghavan, “An Analytical Model for a Book-ahead Bandwidth Scheduler,” in IEEE Globecom, Washington, DC, Nov
Heterogeneous rate allocation

24 Is an opportunity being missed if distributed IR bandwidth sharing mode is not explored?
Yes. Four reasons:
1. Increase end-to-end rate relative to IP service; possible in the presence of admission control (programmable patch panels to share ports)
2. Enable the creation of large-scale circuit/VC networks with moderate-rate circuits that can support a brand new class of applications: economic value for the networking industry
3. A "reservations-oriented" mode of networking to complement today's connectionless Internet (analogy: airlines complement roadways)
4. Alternative pricing models for bandwidth: leased lines and IP service are at two extremes; usage-based pricing; dedicated (moderately high) bandwidth for short durations instead of low bandwidth for all time

Notes: BA support is clearly needed, but for the following two reasons my preference is to work on IR.
1. Scalability loss by not having support for BA in switches. By centralizing scheduling in an external controller, all BW sharing features of MPLS/GMPLS control-plane engines are lost. With a limit on call holding time, IR does have value. Domain-to-domain signaling between schedulers needs standardization. This is the same as the multiple switch problem. The community is on the right track, but is software oriented rather than being interested in algorithms. Nagi's work on accepting any open time slot, etc., in his JTechs. talks from July 06, is the only one I know of that has reached this question of different BA algorithms. It is like our BA-n and BA-all, but our work is only for a single link and needs to be extended to multiple links. Did Chin's OSCARS-BRUW testing involve two schedulers and an inter-domain protocol?
2. It is not a great solution to require a user to always book ahead; IR is more natural in the data world, unlike with airlines. The answer that Nagi, Tom Lehman and Chin give is that IR can be accommodated as part of the BA centralized controller. This is incorrect, because if long-held, high-BW circuits are allocated in BA mode, the probability of blocking of IR calls will be high. Presumably, call holding time is not limited to some small number in BA, which intrinsically becomes a disadvantage for co-existing IR calls. Without partitioning BW, both types of calls can't be handled, so one may as well use the distributed GMPLS CP solution for IR. Bandwidth is not like airline seats: it is not that expensive and scarce, which means BA should not be the only mode.

25 To increase end-to-end rate
Problem: WDM allows 40Gbps/channel with 80 channels/port, but end-to-end rate is still on the order of tens of Mbps. Why?
Access link rates: both for enterprises and residences
2007 data for local access links in US: 1.5M T1, 183K T3, 44K OC3, 21K OC12, 2K OC48 and 2.5K OC192
Inter-domain link cost: Internet2 charges $250K/year for a 1Gbps Ethernet connection. Why so high? High router port cost and no sharing
Router port cost: one-port 10Gbps or ten-port 1Gbps interface card costs $ K
Add leased lines that terminate on a space-division switch; for moderate rate, connect to sub-Gbps ports
With admission control for ports, connect a high-speed link for a short duration for single flows, based on requests from file-transfer apps

26 What "brand new class of applications?"
Moderate-bandwidth Video: “Harry Potter” application, multiple-cameras/automated cameraman for video-tel/conf, distance-learning, virtual reality Cloud computing, gaming Teleoperations, telemedicine High-bandwidth, short-held calls Web, P2P, storage, CDN file transfers

27 Outline Cheetah vs. Dragon Approach GMPLS networks
Theoretical concepts GMPLS networks Technologies, off-the-shelf switches, control-plane protocols State of the art on different applications & networks Commercial Research-and-Education (REN) networks

28 GMPLS related technologies
GMPLS networks Data-(user-) plane protocols packet-switched: MPLS, VLAN Ethernet (PBBTE) circuit-switched: SONET/SDH, WDM, SDM (space div. mux) Control-plane protocols: RSVP-TE: signaling protocol OSPF-TE: routing protocol LMP: link management protocol Internetworking: Ethernet-over SONET/MPLS/WDM GFP, VCAT, LCAS for SONET/SDH PWE3 for MPLS networks Digital wrapper for OTN

29 Why internetworking? GMPLS networks do not exist as standalone entities as data-sourcing end hosts do not have MPLS, SONET, WDM NICs Instead they need to be internetworked with Ethernet interface cards: Common usage: IP layer internetworking IP routers with Packet-over-SONET (PoS) interfaces Newer usage: Ethernet layer internetworking Ethernet over MPLS/SONET/WDM/SDM Port-mapped VLAN-mapped (probably not supported with SDM) Ethernet interface could be on hosts or routers

30 Off-the-shelf GMPLS switches
Vendor/system | Data-plane | Control-plane
Cisco series, Juniper T640 | MPLS switching; PWE3 Ethernet-over-MPLS | RSVP-TE, OSPF-TE
Sycamore SN16000 | SONET switching; GFP/VCAT Ethernet-over-SONET (EoS) | RSVP-TE, OSPF-TE for SONET circuits; no support for EoS
Ciena CDCI | GFP/VCAT EoS | Proprietary signaling/routing protocols
Movaz (now Adva) RayExpress | WDM switching; G.709 Eth-over-WDM | RSVP-TE, OSPF-TE (?)
Calient | SDM switching; Ethernet-over-fiber | RSVP-TE, OSPF-TE (?)
Force10 E600 | Ethernet VLAN switching | None

31 GMPLS control-plane scope
RSVP-TE and OSPF-TE do not have parameters to support admission control for BA calls e.g., call duration, optional desired call-initiation time Strengths: Distributed routing and call setup/release functions for high-call volume IR calls OSPF-TE (in each switch controller) Loading conditions shared only intra-area Link-state + Distance vector (even basic OSPF) RSVP-TE (in each switch controller) Route computation and admission control CSPF can be done only intra-area by ingress switch Any switch could be an ingress switch – hence highly scalable Switch fabric configuration (i.e., provisioning)
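The route computation with admission control that an ingress switch performs (CSPF) can be sketched as a two-step procedure: prune links whose available bandwidth is below the request, then run a shortest-path search on what remains. The code below is a generic illustration, not any vendor's implementation; the link-table layout, the function name, and the example topology are assumptions.

```python
import heapq


def cspf(links, src, dst, bandwidth):
    """Constrained shortest path.

    links: dict {(u, v): (te_metric, available_bw)} describing directed links,
    as an ingress switch might learn them from OSPF-TE (layout is assumed).
    Returns the node list of the path, or None if the request is blocked.
    """
    # Constraint step: keep only links that can carry the requested bandwidth.
    adj = {}
    for (u, v), (metric, avail) in links.items():
        if avail >= bandwidth:
            adj.setdefault(u, []).append((v, metric))

    # Shortest-path step: Dijkstra on the pruned graph.
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, metric in adj.get(u, []):
            nd = d + metric
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))

    if dst not in dist:
        return None  # no feasible path: admission control rejects the call

    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical four-node topology: the A-B link is nearly full.
example_links = {
    ("A", "B"): (1, 0.5), ("B", "D"): (1, 10.0),
    ("A", "C"): (2, 10.0), ("C", "D"): (2, 10.0),
}
```

In this example a 1.0-unit request from A to D is steered around the nearly full A-B link, while a small 0.1-unit request still takes the shorter route through B.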

32 Control-plane for BA calls
Run an external scheduler to perform path computation and admission control for future start time add authentication and authorization Centralized scheduler - one per domain Inter-domain scheduler-to-scheduler protocol: Abstracted topology exchange Reservation phase (path computation + admission control) Signaling phase (not clear why RSVP-TE is not used interdomain) Intradomain Provisioning phase: RSVP-TE is used OSPF-TE data is read out from switch controllers by scheduler for intra-domain path computation Not a scalable solution to support short-duration, high-BW calls

33 Outline Cheetah vs. Dragon Approach GMPLS networks
Theoretical concepts GMPLS networks Technologies, off-the-shelf switches, control-plane protocols State of the art on different applications & networks Commercial Research-and-Education (REN) networks

34 Spectrum of services New services Leased line IP eScience 10G POTS
Verizon BoD

35 Commercial uses Semi-permanent MPLS virtual circuits
Traffic engineering Voice over IP QoS concerns: telephony has a 150ms one-way delay requirement (with echo cancellers) Business or service provider interconnect interconnecting geographically distributed campuses of an enterprise interconnecting wide-area routers of an ISP service provider

36 Traffic engineering (TE)
Since BGP and OSPF routing protocols mainly spread reachability information, routing tables are such that some links become heavily congested while others are lightly loaded MPLS virtual circuits are used to alleviate this problem e.g., NY to SF traffic could be directed to take an MPLS virtual circuit on a lightly loaded route avoiding all paths on which more local traffic may compete This is an application of MPLS VCs without bandwidth allocation

37 Business or service provider interconnect (leased lines)
Multiple options: TDM circuits (traditional private line, T1, T3, OC3, OC12, etc.) Ethernet private line point-to-point (Ethernet over MPLS/SONET/WDM) VPNs (called Virtual private LAN service) MPLS VPNs WDM lightpaths Dark fiber

38 Dynamic circuits/virtual circuit (GMPLS control-plane)
Commercial: fast restoration circuit/VC setup delay significant rapid provisioning Verizon: Bandwidth on Demand (Just-in-Time Provisioning) AT&T: Shared mesh networks Customer Applications for dynamic network configuration Key industries: Financial, Media & Entertainment Corporate Utility Backbone Networks (e.g. reconfigure for disaster recovery) Distribution of real-time content (e.g., Video) Level3: Vyvx service

39 Spectrum of services
Spectrum: Leased line, Verizon BoD, eScience 10G, POTS, IP. The new services lie between leased line and IP.
Book-ahead (BA) mode (OSCARS/DRAGON): call duration specified; current solution is centralized per-domain path computation/admission control; low call handling volume

40 Research & Education (G)MPLS networks
Internet2’s Dynamic Circuit network NSF-funded DRAGON DOE's ESnet - Science Data Network DOE's Ultra Science Network (USN)

41 Internet2 DWDM network
[Image slide: Infinera DWDM system] Source: Rick Summerhill talk (10/11/2007)

42 Internet2 Dynamic Circuit (DC) network
[Image slide: Ciena CD-CI Eth-SONET switch] Source: Rick Summerhill talk (10/11/2007)

43 Internet2 IP-routed network
IP-router-to-router links on one wavelength; SONET switch-to-switch links on another wavelength
[Image slide: Ciena CD-CI Eth-SONET switch, Juniper T640 IP router] Source: Rick Summerhill talk (10/11/2007)

44 Equipment at each PoP
[Image slide] Source: Rick Summerhill talk (10/11/2007)

45 Control-plane software (for DC network)
OSCARS implemented in InterDomain Controller (IDC) - one per domain Abstracted topology exchange Interdomain scheduling Interdomain signaling (for provisioning) DRAGON (intradomain control-plane) Used in Internet2’s DC network Intradomain routing, path computation, signaling (for provisioning)

46 OSCARS On-demand Secure Circuits and Advance Reservation System (OSCARS) DOE Office of Science and ESnet project Co-development with Internet2 Web Service based provisioning infrastructure, which includes scheduling, AAA architecture using X.509 certificates Extended to include the DICE IDCP Reservations held in SQL database Recall no support for book-ahead in GMPLS control protocols Talk by Tom Lehman, Sep. 28, 2008

47 DRAGON
Washington DC metro-area network: Adva (old Movaz) WDM switches and Ethernet switches (G.709)
Control-plane software:
Network Aware Resource Broker (NARB): intradomain listener, path computation
Virtual Label Swapping Router (VLSR): implements OSPF-TE, RSVP-TE; runs on control PCs external to switches (since not all switches implement these GMPLS control-plane protocols); communicates with switches via SNMP, TL1, CLI to configure circuits
Client System Agent (CSA): end-system software for signaling into network (UNI or peer mode)
Application Specific Topology Builder (ASTB): user interface and processing which build topologies on behalf of users; topologies are a user-specific configuration of multiple LSPs

48 Open Source DCN Software Suite
OSCARS (IDC): open source project maintained by ESNet and Internet2; uses WSDL, XML, SQL database to store reservations; reservations accepted with 1 minute granularity
DRAGON (DC): NSF-funded open source project maintained by USC ISI East and MAX
Version 0.4 of DCNSS is the current deployed release
DCN workshops offered for training:
Talk by Tom Lehman, Sep. 28, 2008

49 DICE IDCP Dante, Internet2, CANARIE, ESNet
IDCP: InterDomain Controller Protocol wsdl - web service definition of message types and formats xsd – definition of schemas used for network topology descriptions and path definitions Talk by Tom Lehman, Sep. 28, 2008

50 InterDomain Controller (IDC) Protocol (IDCP)
The following organizations have implemented/deployed systems which are compatible with this IDCP Internet2 Dynamic Circuit Network (DCN) ESNet Science Data Network (SDN) GÉANT2 AutoBahn System Nortel (via a wrapper on top of their commercial DRAC System) Surfnet (via use of above Nortel solution) LHCNet (use of I2 DCN Software Suite) Nysernet (use of I2 DCN Software Suite) LEARN (use of I2 DCN Software Suite) LONI (use of I2 DCN Software Suite) Northrop Grumman (use of I2 DCN Software Suite) University of Amsterdam (use of I2 DCN Software Suite) DRAGON Network The following "higher level service applications" have adapted their existing systems to communicate via the user request side of the IDCP: LambdaStation (FermiLab) – CMS project on Large Hadron Collider TeraPaths (Brookhaven) - ATLAS project on Large Hadron Collider Phoebus Talk by Tom Lehman, Sep. 28, 2008

51 Heterogeneous Network Technologies, Complex End to End Paths
[Diagram: an end-to-end path between two end systems across three domains (AS 1, AS 2, AS 3), each with its own IP control plane. Examples: DRAGON, Internet2 DC, ESNet SDN. Elements shown: VLSRs, routers, a lambda switch, a SONET switch, Ethernet segments (VLSR-established VLANs), Ethernet over WDM, Ethernet over SONET, MPLS LSP.] Source: Rick Summerhill talk (10/11/2007)

52 IDCP operation
Route selection, admission control: centralized per domain at IDC
Advance reservation request and circuit provisioning at scheduled time:
End user signals IDC with a reservation request
Authenticate requester and check authorization
Request reservation (create time, bandwidth, VLAN tag)
Signaling: creation of circuit (automatic or in response to message to IDC)
Topology exchange: interdomain (abstracted topology information)
Monitoring

53 Intra-domain operations
Using DRAGON in Internet2 DCN:
NARB does intra-domain path computation after collecting routing information by listening to OSPF-TE exchanges between VLSRs
These intradomain paths are provided to IDC for use during resource scheduling (up to 3 path options are considered)
5 VLSRs serve 22 CD-CIs: “subnets of CD-CIs”
In the signaling phase, VLSR sends a TL1 command to the edge CD-CI, which initiates proprietary hop-by-hop signaling to configure the circuit through the subnet

54 GOLE: GLIF open lightpath exchange

55 DOE networks
ESnet and Science Data Network (SDN): OSCARS, an advance-reservation system; Science Data Network is an MPLS network
UltraScience Network: research network for DoE labs; GbE and SONET (Ciena CD-CI); centralized scheduler for advance-reservation calls; 5-PoP network: ORNL, Atlanta, Chicago, Seattle, Sunnyvale; connections to Fermi Lab, PNNL, SLAC, CalTech
Lambdastation: CMS project, between Fermi Lab and Univ. of Nebraska

56 Spectrum of services
Spectrum: Leased line, Verizon BoD, eScience 10G, POTS (Plain Old Telephone Service, 64kbps), IP. The new services lie between leased line and IP.
Immediate-Request (IR) mode (CHEETAH): unspecified call duration; low call setup overhead (so holding times can be shorter); distributed path computation/admission control; high call handling volume

57 NSF-funded CHEETAH network
GbEthernet and SONET
[Network diagram labels: TN PoP, GA PoP, NC PoP; UVa, CUNY, NCSU, GaTech, ORNL; SN16000 switches with control card, OC192 cards and GbE/10GbE cards; OC-192 links; GbE-attached end hosts.]
Sycamore SN16000 SONET switch with GbE/10GbE interfaces
We extended the reach of the CHEETAH network to UVa and CUNY through the use of Vortex links in Virginia, NYSERnet connectivity in New York, a HOPI VLAN from New York to Washington, and two MPLS tunnels from Washington to Raleigh, NC on the Abilene and NCREN networks. The second enhancement was that ORNL provided us with the OC192 link between the Sycamore switches at Atlanta and ORNL. We added Georgia Tech to the CHEETAH network.

58 Networking software
Sycamore switch comes with built-in GMPLS control-plane protocols: RSVP-TE and OSPF-TE
We developed CHEETAH software for Linux end hosts:
circuit-requestor: allows users and applications to issue RSVP-TE call setup and release messages asking for dedicated circuits to remote end hosts
CircuitTCP (CTCP) code

59 CHEETAH network usage
[Diagram labels: two end hosts running CHEETAH software (application, DNS client, RSVP-TE module, TCP/IP on NIC 1, CTCP/IP on NIC 2); circuit gateways; IP-routed network; SONET circuit-switched network.]
Bandwidth-sharing mode: immediate-request mode (blocked calls fall back to IP path)
Heterogeneous rate allocation under high loads: higher BW for large files than for small files
Applications: common file transfers (web, P2P, CDN, storage)
Attempt circuits for large files (if blocked, use IP-routed path)
Use IP-routed path for small files
The next slide introduces the details of the CHEETAH software
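The file-transfer policy on this slide (try a circuit for large files, fall back to the IP-routed path when the call is blocked, and keep small files on IP) can be written as a small decision function. This is a sketch only: the 100 MB threshold and the request_circuit callback are hypothetical stand-ins, and the actual CHEETAH software may decide differently.

```python
# Hypothetical cut-off between "small" and "large" files (an assumption,
# not a value given in the talk).
LARGE_FILE_BYTES = 100 * 1000 * 1000


def choose_path(file_size_bytes, request_circuit):
    """Return "circuit" or "ip" for one file transfer.

    request_circuit: callable that attempts an immediate-request circuit
    setup and returns True on success or False if the call is blocked.
    """
    if file_size_bytes < LARGE_FILE_BYTES:
        return "ip"        # small files stay on the IP-routed path
    if request_circuit():
        return "circuit"   # large file and the circuit was admitted
    return "ip"            # blocked call falls back to the IP-routed path
```

Passing the setup attempt in as a callable keeps the policy separate from the signaling, so the same function works whether the circuit request is a real RSVP-TE exchange or a stub in a test.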

60 End-to-end call setup delay measurements
Delays incurred in setting up a circuit between host zelda1 (in Atlanta, GA) and host wuneng (in Raleigh, NC) across the CHEETAH network
Table columns: circuit type; end-to-end circuit setup delay (s); processing delay for Path message at the NC SN16000 (s); processing delay for Resv message at
Table rows: OC-1, OC-3, 1Gb/s EoS
Round-trip signaling message propagation plus emission delay between GA SN16000 and NC SN16000: 0.025s
Observations:
Setup delays for SONET circuits (OC1, OC3) are small (166ms)
Setup delays for Ethernet-over-SONET (EoS) hybrid circuits are much higher (1.6s) (no standard; proprietary implementation)
Signaling message processing delays dominate end-to-end circuit setup delays

61 Conclusions
Need BA service if the per-call bandwidth allocation is a significant fraction of link capacity (1Gbps on a 10Gbps link)
Key differentiator between BA and IR: BA calls specify call duration
GMPLS control-plane protocols are designed for distributed scalable implementation of IR service
GMPLS control-plane protocols do not have parameters to support BA service (e.g., call duration in RSVP-TE)
BA service with centralized schedulers per domain is suitable for long call-duration eScience applications (small number of users)
To support BA service for general-purpose applications with short call durations, e.g., large file transfers in Web, P2P, storage, CDN, need to design a scalable control-plane solution for BA calls
Four reasons to develop an IR service with moderate per-BW calls Pr

