The EU DataTAG Project
Presented at the GGF3 conference, 8th October, Frascati, Italy
Olivier H. Martin, CERN - IT Division
The EU DataTAG project
- Two main focuses:
  - Grid applied network research
  - Interoperability between Grids
- 2.5 Gbps transatlantic lambda between CERN (Geneva) and StarLight (Chicago), dedicated to research (no production traffic)
- Expected outcomes:
  - Hide the complexity of Wide Area Networking
  - Better interoperability between Grid projects in Europe and North America:
    - DataGrid, possibly other EU funded Grid projects
    - PPDG, GriPhyN, DTF, iVDGL (USA)
11/23/2018 The EU DataTAG Project
The EU DataTAG project (cont)
- European partners: INFN (IT), PPARC (UK), University of Amsterdam (NL) and CERN, as project coordinator
- Significant contributions to the DataTAG workplan have been made by Jason Leigh (EVL@University of Illinois), Joel Mambretti (Northwestern University) and Brian Tierney (LBNL).
- Strong collaborations already in place with ANL, Caltech, FNAL, SLAC, University of Michigan, as well as Internet2 and ESnet.
- The budget is 3.9 MEUR
- Expected starting date: December 1, 2001
- NSF support through the existing collaborative agreement with CERN (Eurolink award)
DataTAG project
[Network diagram. Labels: Abilene, ESNET, MREN, STAR-LIGHT, STAR-TAP, New York, CERN, GEANT, UK (SuperJANET4), IT (GARR-B), NL (SURFnet)]
DataTAG planned set up (second half 2002)
[Network diagram. Labels: DataTAG test equipment (several instances); CERN PoP Chicago; STARLIGHT; UvA, INFN, PPARC, …...; ESNET; GEANT; ABILENE; 2.5 Gb; DataGRID; PPDG; iVDGL; DTF; GriPhyN; CERN CIXP]
DataTAG Workplan
- WP1: Provisioning & Operations (CERN)
  - Will be done in cooperation with DANTE
  - Two major issues:
    - Procurement
    - Routing: how can the DataTAG partners have transparent access to the DataTAG circuit across GEANT and their national network?
- WP5: Information dissemination and exploitation (CERN)
- WP6: Project management (CERN)
DataTAG Workplan (cont)
- WP2: High Performance Networking (PPARC)
  - High performance transport
    - TCP/IP performance over large bandwidth*delay networks
    - Alternative transport solutions
  - End to end inter-domain QoS
  - Advance network resource reservation
DataTAG Workplan (cont)
- WP3: Bulk Data Transfer & Application performance monitoring (UvA)
  - Performance validation
  - End to end user performance
    - Validation
    - Monitoring
    - Optimization
  - Application performance
    - Netlogger
DataTAG Workplan (cont)
- WP4: Interoperability between Grid Domains (INFN)
  - GRID resource discovery
  - Access policies, authorization & security
    - Identify major problems
    - Develop inter-Grid mechanisms able to interoperate with domain specific rules
  - Interworking between domain specific Grid services
  - Test Applications
    - Interoperability, performance & scalability issues
DataTAG Planning details
- The lambda is expected to be available in the second half of 2002
- Initially, test systems will be either at CERN or connect via GEANT
- GEANT is expected to provide VPNs (or equivalent) for Datagrid and/or access to the GEANT PoPs.
- Later, it is hoped that GEANT will provide dedicated lambdas for Datagrid
- Initially a 2.5 Gb/sec POS link
- WDM later, depending on equipment availability
- At the STAR TAP but not yet on the map is:
  - KREONet2 (Korea)
- Newcomers we expect in the next few months are:
  - ANSP (Brazil - Sao Paulo R&E network)
  - RNP (Brazil - country-wide R&E network)
  - HEANET (Ireland R&E network)
- STAR LIGHT, a project to run lambdas from participants to a meet point in Chicago, is also underway
The STAR LIGHT
- Next generation STAR TAP with the following main distinguishing features:
  - Neutral location (Northwestern University)
  - 1/10 Gigabit Ethernet based
  - Multiple local loop providers
  - Optical switches for advanced experiments
- The STAR LIGHT will provide a 2*622 Mbps ATM connection to the STAR TAP
- Started in July 2001
- Also hosting other advanced networking projects in Chicago & State of Illinois
- N.B. Most European Internet Exchange Points have already been implemented along the same lines.
StarLight Infrastructure
…Soon, Star Light will be an optical switching facility for wavelengths
Evolving StarLight Optical Network Connections
[Map. Labels: Asia-Pacific; SURFnet, CERN; Vancouver; CA*net4; Seattle; Portland; U Wisconsin; Chicago*; NYC; PSC; San Francisco; IU; DTF 40Gb; NCSA; Caltech; Atlanta; SDSC; AMPATH]
*ANL, UIC, NU, UC, IIT, MREN
Multiple Gigabit/second networking: Facts, Theory & Practice (1)
- Gigabit Ethernet (GBE) nearly ubiquitous
- 10GBE coming very soon
- 10Gbps circuits have been available for some time already in Wide Area Networks (WAN).
- 40Gbps is in sight on WANs, but what comes after?
- THEORY:
  - 1GB file transferred in 11 seconds over a 1Gbps circuit (*)
  - 1TB file transfer would still require 3 hours
  - and 1PB file transfer would require 4 months
- (*) according to the 75% empirical rule
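The THEORY figures above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming that the "75% empirical rule" means 75% of the nominal link rate is usable and that GB, TB and PB are decimal units; the function and variable names are illustrative, not from the slides.

```python
def transfer_time_seconds(size_bytes, link_bps, efficiency=0.75):
    """Seconds needed to move size_bytes over a link running at
    efficiency * link_bps (assumed reading of the 75% rule)."""
    return size_bytes * 8 / (link_bps * efficiency)

GB, TB, PB = 1e9, 1e12, 1e15   # decimal units (assumption)
LINK_BPS = 1e9                 # 1 Gbps circuit

t_gb = transfer_time_seconds(GB, LINK_BPS)
t_tb = transfer_time_seconds(TB, LINK_BPS)
t_pb = transfer_time_seconds(PB, LINK_BPS)

print(f"1 GB: {t_gb:.1f} s")            # about 11 seconds
print(f"1 TB: {t_tb / 3600:.2f} h")     # about 3 hours
print(f"1 PB: {t_pb / 86400:.0f} days") # about 4 months
```

Under these assumptions the results round to the 11 seconds, 3 hours and 4 months quoted on the slide.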
Multiple Gigabit/second networking: Facts, Theory & Practice (2)
- Assuming a suitable window size is used (i.e. bandwidth*RTT), the achieved throughput also depends on the packet size and the packet loss rate.
  - This means that with non-zero packet loss rates, higher throughput will be achieved using Gigabit Ethernet “Jumbo Frames”.
  - Could possibly conflict with strong security requirements in the presence of firewalls (e.g. throughput, transparency (e.g. TCP/IP window scaling option))
- Single stream vs multi-stream
  - Tuning the number of streams is probably as difficult as tuning a single stream
  - However, as explained later, multiple streams are a very effective way to bypass the deficiencies of TCP/IP
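One common way to make the dependence on packet size and loss rate concrete is the steady-state approximation of Mathis et al., rate ≈ C * MSS / (RTT * sqrt(p)) with C = sqrt(3/2). The slides do not name this formula, so the sketch below is an illustration under that assumption. The loss rate and the two MSS values (1460 bytes for a 1500-byte MTU, 8960 bytes for a 9000-byte jumbo frame) are hypothetical inputs chosen for the example.

```python
import math

def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate, c=math.sqrt(1.5)):
    """Approximate steady-state TCP throughput in bits per second
    (Mathis et al. model; not taken from the slides)."""
    return c * mss_bytes * 8 / (rtt_s * math.sqrt(loss_rate))

RTT_S = 0.2        # 200 ms, as in the CERN-Caltech example
LOSS_RATE = 1e-6   # hypothetical packet loss probability

standard = mathis_throughput_bps(1460, RTT_S, LOSS_RATE)  # 1500-byte MTU
jumbo = mathis_throughput_bps(8960, RTT_S, LOSS_RATE)     # 9000-byte MTU

print(f"standard frames: {standard / 1e6:.1f} Mbps")
print(f"jumbo frames:    {jumbo / 1e6:.1f} Mbps")
print(f"gain:            {jumbo / standard:.2f}x")
```

In this model the throughput scales linearly with the MSS, so at a fixed loss rate jumbo frames help in direct proportion to their size.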
Single stream vs Multiple streams (1)
- Why do multiple streams normally yield higher aggregate throughput than a single stream, in the presence of packet losses?
- Assume we have a 200ms RTT (e.g. CERN-Caltech) and a 10Gbps link
- The size of the window is computed according to the following formula: Window Size = Bandwidth*RTT (i.e. 250MB at 10Gbps & 200ms RTT)
- With no packet losses, one 10Gbps stream or two 5Gbps streams are equivalent, even though the CPU load on the end systems may not be the same.
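The window formula above can be evaluated directly. The sketch below also estimates the TCP window scale shift needed to advertise such a window, given that the unscaled TCP window field is 16 bits wide and the window scaling option allows a shift of at most 14. Function names are illustrative, and decimal megabytes are assumed.

```python
def window_bytes(bandwidth_bps, rtt_ms):
    """Bandwidth*RTT product in bytes (integer arithmetic)."""
    return bandwidth_bps * rtt_ms // 8000

def window_scale_shift(window):
    """Smallest shift such that a 16-bit window field, scaled by
    2**shift, can advertise at least `window` bytes."""
    shift = 0
    while (65535 << shift) < window:
        shift += 1
    return shift

single = window_bytes(10_000_000_000, 200)  # one 10 Gbps stream
each = window_bytes(5_000_000_000, 200)     # each of two 5 Gbps streams

print(single, window_scale_shift(single))   # 250000000 12
print(each, window_scale_shift(each))       # 125000000 11
```

Two 5 Gbps streams each need half the window of a single 10 Gbps stream, which is why the two configurations are equivalent when there are no losses.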
Single stream vs Multiple streams (2)
- With one packet loss, the 10Gbps stream will halve its window, reducing its rate to 5Gbps, and will then increase the window by one MSS (1500 bytes) per RTT. The average rate during the congestion avoidance phase will therefore be 7.5 Gbps, at best.
- With one packet loss and two 5Gbps streams, only one stream is affected, and the congestion avoidance phase is shorter (i.e. almost half) because RTTs are hardly affected by the available bandwidth. The average rate of the affected stream will be 3.75Gbps and the aggregate throughput will be 8.75Gbps. In addition, the 10Gbps regime will be reached faster.
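The averages and the length of the congestion avoidance phase can be worked out as follows. This is a sketch under simplifying assumptions: the window is exactly bandwidth*RTT before the loss, it is halved once, and it then grows by exactly one MSS per RTT with no further losses. The function name is illustrative.

```python
def aimd_recovery(rate_bps, rtt_s, mss_bytes):
    """Return (recovery time in seconds, average rate in bps during
    recovery) after a single loss on a stream running at rate_bps."""
    full_window = rate_bps * rtt_s / 8            # bytes in flight
    rtts_needed = (full_window / 2) / mss_bytes   # one MSS regained per RTT
    average_bps = (rate_bps / 2 + rate_bps) / 2   # linear ramp from half to full
    return rtts_needed * rtt_s, average_bps

RTT_S, MSS = 0.2, 1500

t10, avg10 = aimd_recovery(10e9, RTT_S, MSS)   # one 10 Gbps stream
t5, avg5 = aimd_recovery(5e9, RTT_S, MSS)      # the affected 5 Gbps stream
aggregate = avg5 + 5e9                         # the other stream is untouched

print(f"10 Gbps stream: {t10 / 3600:.2f} h, average {avg10 / 1e9} Gbps")
print(f" 5 Gbps stream: {t5 / 3600:.2f} h, aggregate {aggregate / 1e9} Gbps")
```

With these inputs the 5 Gbps stream recovers in roughly 2.3 hours, close to the T = 2.37 hours quoted on the next slide (the exact figure depends on the MSS and unit conventions, which the slides do not spell out), and the 10 Gbps stream needs twice as long.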
Single stream vs Multiple streams (3)
Effect of a single packet loss (e.g. link error, buffer overflow)
[Table, as extracted: Streams/Throughput | 10 | 5 ; 1 | 7.5 | 4.375 ; 2 | 9.375 | 10]
[Figure: throughput (Gbps) versus time, in intervals of T. Labelled averages: 7.5 Gbps, 6.25 Gbps, 4.375 Gbps, 3.75 Gbps]
T = 2.37 hours! (RTT=200msec, MSS=1500B)
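The averages on this slide are consistent with a simple piecewise-linear model, sketched below. The interpretation is an assumption, not something the slide states: T is taken as the time a 5 Gbps stream needs to climb from 2.5 to 5 Gbps, both stream sizes regain rate at the same slope (2.5 Gbps per T, since the increase is one MSS per RTT regardless of rate), and averages are taken over one or two intervals of T.

```python
def average_rate_gbps(peak_gbps, horizon_t, slope_gbps_per_t=2.5):
    """Mean rate over [0, horizon_t] for a stream whose rate is halved
    at time 0 and then ramps linearly back to peak_gbps."""
    half = peak_gbps / 2
    ramp_t = half / slope_gbps_per_t          # time to regain the peak rate
    if horizon_t <= ramp_t:
        end = half + slope_gbps_per_t * horizon_t
        return (half + end) / 2
    area = ramp_t * (half + peak_gbps) / 2 + (horizon_t - ramp_t) * peak_gbps
    return area / horizon_t

single_10g = average_rate_gbps(10, 2)   # one 10 Gbps stream over 2T
hit_5g = average_rate_gbps(5, 2)        # the affected 5 Gbps stream over 2T
two_5g = hit_5g + 5                     # plus the unaffected stream

print(single_10g, hit_5g, two_5g)       # 7.5 4.375 9.375
print(average_rate_gbps(10, 1))         # 6.25 (first interval only)
print(average_rate_gbps(5, 1))          # 3.75 (first interval only)
```

Under this reading, splitting the traffic into two streams raises the average over 2T from 7.5 to 9.375 Gbps for the same single loss.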
Single stream vs Multiple streams (4)
Effect of two packet losses (e.g. link error, buffer overflow)
[Table, as extracted: Streams/Throughput | 10 | 5 ; 1 | 6.25 | 4.583 ; 2 | 9.166 | 10]
[Figure: throughput (Gbps) versus time, in intervals of T. Curve labels: "1 packet losses on two 5Gbps streams", "2 packet losses on one 5Gbps stream". Labelled averages: 8.75 Gbps, 6.25 Gbps, 4.583 Gbps, 4.375 Gbps, 3.75 Gbps]
T = 2.37 hours! (RTT=200msec, MSS=1500B)
Multiple Gigabit/second networking (tentative conclusions)
- Are TCP's "congestion avoidance" algorithms compatible with high speed, long distance networks?
  - The "cut transmit rate in half on single packet loss and then increase the rate additively (1 MSS per RTT)" algorithm, also called AIMD (“additive increase, multiplicative decrease”), may simply not work.
  - New TCP/IP adaptations may be needed in order to better cope with “LFN” (long fat networks), e.g. TCP Vegas, but simpler changes can also be thought of.
- Non-TCP/IP based transport solutions, use of Forward Error Correction (FEC), Explicit Congestion Notification (ECN) rather than active queue management techniques (RED/WRED)?
- We should work closely with the Web100 & Net100 projects
  - Web100 (http://www.web100.org/), a 3MUSD NSF project, might help enormously!
    - better TCP/IP instrumentation (MIB)
    - self-tuning
    - tools for measuring performance
    - improved FTP implementation
  - Net100 (http://www.net100.org/), a complementary, DoE funded project
    - Development of network-aware operating systems
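To illustrate why AIMD becomes problematic as link speeds grow, the sketch below steps the congestion window RTT by RTT after a single loss and counts how long it takes to return to the full bandwidth*RTT window. It is an idealised model (no slow start, no further losses, exactly one MSS regained per RTT). The 200 ms RTT and 1500-byte MSS follow the earlier slides, and the three link speeds are the ones mentioned in this talk.

```python
MSS_BYTES = 1500
RTT_MS = 200

def rtts_to_recover(full_window_bytes, mss_bytes=MSS_BYTES):
    """Count RTTs until the window is back at full size after one loss."""
    cwnd = full_window_bytes // 2          # multiplicative decrease
    rtts = 0
    while cwnd < full_window_bytes:
        cwnd += mss_bytes                  # additive increase, once per RTT
        rtts += 1
    return rtts

recovery_hours = {}
for gbps in (1, 10, 40):
    window = gbps * 10**9 * RTT_MS // 8000           # bandwidth*RTT in bytes
    recovery_hours[gbps] = rtts_to_recover(window) * RTT_MS / 1000 / 3600
    print(f"{gbps:>2} Gbps: {recovery_hours[gbps]:.2f} hours to recover")
```

In this model the recovery time grows linearly with the link speed at a fixed RTT, from under half an hour at 1 Gbps to more than 18 hours at 40 Gbps, so a single loss costs proportionally more on faster paths.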