Networking Shawn McKee University of Michigan DOE/NSF Review November 29, 2001
Why Networking?
Since the early 1980s, physicists have depended on leading-edge networks to enable ever-larger international collaborations.
Major HEP collaborations, such as ATLAS, require rapid access to event samples from massive data stores, not all of which can be stored locally at each computational site.
Evolving integrated applications, i.e. Data Grids, rely on seamless, transparent operation of the underlying LANs and WANs.
Networks are among the most basic Grid building blocks.
Hierarchical Computing Model
[Figure: tiered computing hierarchy from the detector to physicists' workstations]
Levels shown: Online System; Offline Farm, CERN Computer Ctr, ~25 TIPS (Tier 0 +1); BNL Center, France, Italy, UK (Tier 1); Tier2 Centers (Tier 2); Institutes, ~0.25 TIPS, with physics data cache (Tier 3); Workstations (Tier 4)
Link rates shown: ~PByte/sec, ~100 MBytes/sec, ~2.5 Gbits/sec, ~2.5 Gbps, Mbits/sec
CERN/Outside Resource Ratio ~1:2
Tier0/(Tier1)/(Tier2) ~1:1:1
Physicists work on analysis "channels". Each institute has ~10 physicists working on one or more channels.
MONARC Simulations
MONARC (Models of Networked Analysis at Regional Centres) has simulated Tier 0/Tier 1/Tier 2 data processing for ATLAS.
Networking implications: Tier 1 centers require ~140 MBytes/sec to Tier 0 and ~200 MBytes/sec to (each?) other Tier 1s, assuming 1/3 of the ESD is stored at each Tier 1.
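To put the MONARC figures in link terms, the byte rates can be converted to bit rates and compared with the circuit sizes quoted in the site-cost table of this talk (OC3 155 Mbps, OC12 622 Mbps, OC48 2.4 Gbps). The following is only a rough sketch: it assumes 1 MByte = 8 Mbits, ignores protocol overhead, and uses hypothetical function names.

```python
# Nominal circuit rates in Mbps, taken from the site-cost table in this talk.
LINK_RATES_MBPS = {"OC3": 155, "OC12": 622, "OC48": 2400}


def mbytes_to_mbits(rate_mbytes_per_s):
    """Convert MBytes/sec to Mbits/sec (assumes 8 bits per byte, no overhead)."""
    return rate_mbytes_per_s * 8


def smallest_sufficient_link(rate_mbytes_per_s):
    """Return the smallest listed circuit that can carry the rate, or None."""
    needed_mbps = mbytes_to_mbits(rate_mbytes_per_s)
    for name, capacity in sorted(LINK_RATES_MBPS.items(), key=lambda kv: kv[1]):
        if capacity >= needed_mbps:
            return name
    return None  # Exceeds every circuit in the table.


if __name__ == "__main__":
    for label, rate in [("Tier 1 to Tier 0", 140), ("Tier 1 to Tier 1", 200)]:
        print(label, mbytes_to_mbits(rate), "Mbps ->", smallest_sufficient_link(rate))
```

Under these assumptions both MONARC rates exceed an OC12 and fit only within an OC48.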
TCP WAN Performance
Mathis et al., Computer Communication Review 27(3), July 1997, demonstrated the dependence of bandwidth on network parameters:
$$\mathrm{BW} \le \frac{C \cdot \mathrm{MSS}}{\mathrm{RTT}\,\sqrt{\mathrm{PkLoss}}}$$
BW - Bandwidth
MSS - Max. Segment Size
RTT - Round Trip Time
PkLoss - Packet loss rate
C - constant of order 1
To get 90 Mbps via TCP/IP on a WAN link from LBL to IU (~70 ms RTT), you need a packet loss rate below 1.8e-6!
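The LBL to IU number can be checked against the Mathis relation, BW <= C * MSS / (RTT * sqrt(PkLoss)). The sketch below is an illustration, not the author's calculation: it assumes C = 0.7 and MSS = 1500 bytes, values chosen because they reproduce the quoted 1.8e-6; the function names are hypothetical.

```python
import math


def mathis_bandwidth_bps(mss_bytes, rtt_s, loss_rate, c=0.7):
    """Upper bound on TCP throughput in bits/sec from the Mathis relation."""
    return c * mss_bytes * 8 / (rtt_s * math.sqrt(loss_rate))


def max_loss_for_bandwidth(target_bps, mss_bytes, rtt_s, c=0.7):
    """Invert the relation: largest packet loss rate that still allows target_bps."""
    return (c * mss_bytes * 8 / (rtt_s * target_bps)) ** 2


if __name__ == "__main__":
    # 90 Mbps over a ~70 ms RTT path; MSS and C are assumed values.
    loss = max_loss_for_bandwidth(90e6, mss_bytes=1500, rtt_s=0.070)
    print(f"required packet loss rate < {loss:.2e}")
```

Because loss enters as a square root, doubling the target bandwidth on the same path requires a fourfold lower loss rate.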
Network Monitoring: Iperf
We have set up testbed network monitoring using Iperf (V1.2) (S. McKee (UMich), D. Yu (BNL)).
We test both UDP (90 Mbps sending rate) and TCP between all combinations of our 8 testbed sites.
Globus is used to initiate both the client and server Iperf processes.
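The test matrix described here can be sketched as follows. This is an assumption-laden illustration, not the actual monitoring scripts: the site list is taken from the measurement table in this talk, the hostnames are hypothetical placeholders, "all combinations" is read as every ordered pair of sites, and only the widely used Iperf options -c, -u and -b are used (the matching server would be started with -s, plus -u for UDP). The Globus launch step is not shown.

```python
from itertools import permutations

# Testbed sites as listed in the measurement table of this talk.
SITES = ["ANL", "BNL", "BU", "IU", "LBL", "OU", "UM", "UTA"]


def host_for(site):
    """Hypothetical hostname for a site's Iperf endpoint (placeholder only)."""
    return f"iperf.{site.lower()}.example.org"


def iperf_client_command(server_host, protocol, udp_rate="90M"):
    """Build the client-side Iperf command line as a list of arguments."""
    cmd = ["iperf", "-c", server_host]
    if protocol == "udp":
        cmd += ["-u", "-b", udp_rate]  # UDP test at a fixed sending rate.
    return cmd


def build_test_matrix(sites):
    """One UDP and one TCP test for every ordered (source, destination) pair."""
    tests = []
    for src, dst in permutations(sites, 2):
        for protocol in ("udp", "tcp"):
            tests.append((src, dst, protocol, iperf_client_command(host_for(dst), protocol)))
    return tests


if __name__ == "__main__":
    matrix = build_test_matrix(SITES)
    print(len(matrix), "tests")
    print(" ".join(matrix[0][3]))
```

With 8 sites this gives 56 ordered pairs, so one full sweep is 112 individual tests.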
USATLAS Grid Testbed
[Figure: map of the testbed sites and their network connections]
Sites: UC Berkeley LBNL-NERSC, Brookhaven National Laboratory, Indiana University, Boston University, Argonne National Laboratory, U Michigan, University of Texas at Arlington, University of Oklahoma
Network labels: Calren, ESnet, Abilene, Nton, Mren, NPACI
Legend: HPSS sites; Prototype Tier 2s
Testbed Network Measurements
Site  UDP (Mbps)  TCP (Mbps)  PkLoss (%)*  Jitter (ms)  TCP Wind, Bottleneck
ANL  65.4/ / / /0.1  2 M, 100
BNL  66.4/ / / /0.5  4 M, 100
BU  63.4/ / / /  K, 100
IU  35.8/ / / /  M, 45
LBL  70.4/ / / /0.7  2 M, 100
OU  72.1/ / / /0.4  2 M, 100
UM  69.7/ / / /0.6  2 M, 100
UTA  K, 10
Networking Requirements
There is more than a simple requirement of adequate network bandwidth for USATLAS. We need:
- A set of local, regional, national and international networks able to interoperate transparently, without bottlenecks.
- Application software that works together with the network to provide high throughput and bandwidth management.
- A suite of high-level collaborative tools that will enable effective data analysis between internationally distributed collaborators.
The ability of USATLAS to effectively participate at the LHC is closely tied to our underlying networking infrastructure!
Networking as a Common Project
A new Internet2 working group has formed from the LHC Common Projects initiative: HENP (High Energy/Nuclear Physics), co-chaired by Harvey Newman (CMS) and Shawn McKee (ATLAS).
Initial meeting hosted by IU in June; kick-off meeting in Ann Arbor on October 26th.
The issues this group is focusing on are the same ones that USATLAS networking needs to address.
USATLAS gains the advantage of a greater resource pool dedicated to solving network problems, a "louder" voice in standards setting, and a better chance to realize necessary networking changes.
Network Coupling to Software
Our software and computing model will evolve as our network evolves; the two are coupled.
Very different computing models result from different assumptions about the capabilities of the underlying network (distributed vs. local).
We must be careful to keep our software "network aware" while we work to ensure our networks will meet the needs of the computing model.
Achieving High Performance Networking
- Server and client CPU, I/O and NIC throughput sufficient
  - Must consider firmware, hard disk interfaces, bus type/capacity
  - Knowledge base of hardware: performance, tuning issues, examples
- TCP/IP stack configuration and tuning is Absolutely Required
  - Large windows, multiple streams
- No local infrastructure bottlenecks
  - Gigabit Ethernet "clear path" between selected host pairs
  - To 10 Gbps Ethernet by ~2003
- Careful router/switch configuration and monitoring
  - Enough router "horsepower" (CPUs, buffer size, backplane BW)
  - Packet loss must be ~zero (well below 0.1%), i.e. no "commodity" networks (need ESnet, I2 type networks)
- End-to-end monitoring and tracking of performance
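The "large windows, multiple streams" item follows from the bandwidth-delay product: a single TCP stream can keep a path full only if its window covers bandwidth times round-trip time. A minimal sketch, assuming the ~70 ms RTT quoted earlier for the LBL to IU path and an even split of the window across parallel streams; the function names are hypothetical.

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    return bandwidth_bps * rtt_s / 8


def per_stream_window_bytes(bandwidth_bps, rtt_s, streams):
    """Window each stream needs if the load is split evenly over parallel streams."""
    if streams < 1:
        raise ValueError("streams must be >= 1")
    return bdp_bytes(bandwidth_bps, rtt_s) / streams


if __name__ == "__main__":
    rtt = 0.070  # seconds, assumed from the LBL to IU example
    for label, bw in [("100 Mbps", 100e6), ("1 Gbps", 1e9)]:
        window = bdp_bytes(bw, rtt)
        print(f"{label}: single-stream window ~{window / 1e6:.2f} MBytes, "
              f"4 streams ~{per_stream_window_bytes(bw, rtt, 4) / 1e6:.2f} MBytes each")
```

A default window of a few tens of KBytes falls far short of these values on such a path, which is why untuned hosts underperform on long paths.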
Local Networking Infrastructure
LANs used to lead WANs in performance, capabilities and stability, but this is no longer true.
WANs are deploying 10 Gigabit technology, compared with 1 Gigabit on leading-edge LANs.
New protocols and services are appearing on backbones such as ESnet and I2 (Diffserv, IPv6, multicast).
Ensuring that our ATLAS institutions have the LOCAL networking infrastructure required to participate effectively in ATLAS is a major challenge.
Estimating Site Costs
Site Costs | OC3 155 Mbps | OC12 622 Mbps | OC48 2.4 Gbps
Fiber/campus backbone | I2 req. (Sup. Gig) | I2 req. (Sup. Gig) | I2 req. (Sup. Gig)
Network interface | $100/conn. (Fast Eth.) | $1K/conn. (Gigabit) | $1K/conn. (Gigabit)
Routers | $15-30K | $40-80K | $60-120K
Telecom service provider | Variable (~$12K/y) | Variable (~$20K/y) | Variable (~$50K/y)
Network connection fee | $110K | $270K | $430K
Source: Network Planning for US ATLAS Tier 2 Facilities, R. Gardner, G. Bernbom (IU)
Networking Plan of Attack
- Refine our requirements for the network
- Survey existing work and standards
- Estimate likely developments in networking and their timescales
- Focus on gaps between expectations and needs
- Adapt existing work for US ATLAS
- Provide clear, compelling cases to funding agencies about the critical importance of the network
Network Efforts
- Survey of current/future network related efforts
- Determine and document US ATLAS network requirements
- Problem isolation (finger-pointing tools)
- Protocols (achieving high bandwidth and reliable connections)
- Network testbed (implementation, Grid testbed upgrades)
- Services (QoS, multicast, encryption, security)
- Network configuration examples and recommendations
- End-to-end knowledgebase
- Monitoring for both prediction and fault detection
- Liaison to network related efforts and funding agencies
Network Related FTEs/Costs
[Figure: two panels, FTEs and Costs]
Network related efforts to leverage and adapt existing efforts for ATLAS
Support for Networking?
Traditionally, DOE and NSF have provided university networking support indirectly, through the overhead charged to grant recipients.
National labs have network infrastructure provided by DOE, but not at the level we are finding we require.
Unlike networking, computing for HEP has never been considered as simply infrastructure.
The Grid is blurring the boundaries of computing, and the network is taking on a much more significant, fundamental role in HEP computing.
It will be necessary for funding agencies to recognize the fundamental role the network plays in our computing model and to support it directly.
What can we Conclude?
Networks will be vital to the success of our USATLAS efforts.
Network technologies and services are evolving, requiring us to test and develop with current networks while planning for the future.
We must raise and maintain awareness of networking issues among our collaborators, network providers and funding agencies.
We must clearly present network issues to the funding agencies to get the required support.
We need to determine what deficiencies exist in network infrastructure, services and support, and work to ensure those gaps are closed before they adversely impact our program.
References
US ATLAS Facilities Plan
MONARC
HENP Working Group
Iperf monitoring page
Recommended BW for the US-CERN Link: TAN-WG
From the Transatlantic Networking Committee (TAN) report
Network FTE Breakdown
Survey  0.25
Requirements  0.5/
Protocols  0.25
Services  / / /0.25
Configuration  /0.25
Testbed  0.25/
Monitoring  0.25/ /0.25
End-to-End KB  / /0.5
Problem Isolation
Liaison  0.25/0.25
Network K$ Breakdown
Survey  44422
Requirements  44444
Protocols  55105
Services
Configuration
Testbed
Monitoring
End-to-End KB
Problem Isolation  45686
Liaison  77888