US ATLAS Tier-2 Networking. Shawn McKee, University of Michigan. US ATLAS Tier-2 Meeting, San Diego, March 8th, 2007.


1 US ATLAS Tier-2 Networking
Shawn McKee, University of Michigan
US ATLAS Tier-2 Meeting, San Diego, March 8th, 2007

2 Overview
- Tier-2 networking focus areas, typical issues
- Tier-2 status
- Related networking research projects and “technology transfer”
- Appendix has reference URLs and additional slides

3 Network Focus for Tier-2
- Map out details of existing network connections
  - Critical to understanding network usage and problems
- Locate bottlenecks (physical and logical)
  - This will point out areas needing updates/changes
- Implement “Monitoring” and measurement tools
  - The ability to see what is actually happening is REQUIRED
- Set timescales and plans for upgrades
- Tune and optimize existing network use
- Plan for new capabilities, e.g., “managed networks”

4 Simple Network Picture
[Diagram: Host 1, Network Infrastructure, Host 2]
Slides from Rich Carlson, Internet2

5 Network Infrastructure
[Diagram: Switch 1, Switch 2, Switch 3, Switch 4 and routers R1 through R9]
Slides from Rich Carlson, Internet2

6 Network Infrastructure Bottlenecks
- Links too small
  - Using standard Ethernet instead of FastEthernet
- Links congested
  - Too many hosts crossing this link
- Scenic routing
  - End-to-end path is longer than it needs to be
- Broken equipment
  - Bad NIC, broken wire/cable, cross-talk
- Administrative restrictions
  - Firewalls, filters, shapers, restrictors
Slides from Rich Carlson, Internet2
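The "links too small" and "links congested" cases can be illustrated with a little arithmetic. The sketch below is not from the slides: the function names and the example path are hypothetical, and it assumes an idealized network in which a path can carry no more than its slowest link and a congested link is shared evenly among flows.

```python
def path_capacity_mbps(link_capacities_mbps):
    """Upper bound on end-to-end rate: the slowest link on the path."""
    if not link_capacities_mbps:
        raise ValueError("a path needs at least one link")
    return min(link_capacities_mbps)


def fair_share_mbps(link_capacity_mbps, active_flows):
    """Idealized even split of one link among competing flows."""
    if active_flows < 1:
        raise ValueError("need at least one flow")
    return link_capacity_mbps / active_flows


# Hypothetical path: gigabit host links, a 10 Gbps backbone, and one
# leftover 10 Mbps (standard Ethernet) segment in the middle.
example_path = [1000, 10000, 10, 1000]

print("Path limit:", path_capacity_mbps(example_path), "Mbps")
print("Share of a gigabit link with 4 flows:", fair_share_mbps(1000, 4), "Mbps")
```

A single slow segment caps the whole path no matter how fast the other links are, which is why mapping every hop comes before any tuning.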

7 Host Computer Bottlenecks
- CPU utilization
  - What else is the processor doing?
- Memory limitations
  - Main memory and network buffers
- I/O bus speed
  - Getting data into and out of the NIC
- Disk access speed
  - Remember typical “new” disks are 50-60 MB/sec
  - Older disks can be 20-30 MB/sec (or even slower)
  - Gigabit corresponds to almost 125 MB/sec…
Slides from Rich Carlson, Internet2
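The disk figures above can be checked against the gigabit line rate with a short calculation. This is a rough sketch, not part of the original slides: the function names are hypothetical, it uses decimal units (1 MB = 10^6 bytes), it ignores protocol overhead, and it assumes striped disks scale perfectly.

```python
import math


def line_rate_mb_per_s(link_gbps):
    """Raw link rate in MB/sec (decimal units, no protocol overhead)."""
    return link_gbps * 1e9 / 8 / 1e6


def disks_needed(link_gbps, disk_mb_per_s):
    """Disks required to match the link, assuming ideal striping."""
    return math.ceil(line_rate_mb_per_s(link_gbps) / disk_mb_per_s)


print("Gigabit line rate:", line_rate_mb_per_s(1), "MB/sec")
for disk_rate in (25, 50, 60):
    print(f"Disks at {disk_rate} MB/sec needed to fill 1 Gbps:",
          disks_needed(1, disk_rate))
```

Under these assumptions a single disk of either generation cannot keep a gigabit link full, so disk layout matters as much as the NIC.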

8 Application Behavior Bottlenecks
- Chatty protocol
  - Lots of short messages between host 1 & 2
- High reliability protocol
  - Send packet and wait for reply before continuing
- No run-time tuning options
  - Use only default settings
- Blaster protocol
  - Ignore congestion control feedback
Slides from Rich Carlson, Internet2
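The "send packet and wait for reply" case has a simple throughput bound: one packet per round trip. The sketch below is illustrative only. The function names, the 1500-byte packet, and the 50 ms round-trip time are assumptions, and it ignores serialization time and loss. It also computes the bandwidth-delay product, the amount of data that must be in flight (and buffered) to keep a link busy, which is why default buffer settings can limit long-distance transfers.

```python
def stop_and_wait_mbps(packet_bytes, rtt_ms):
    """Throughput when only one packet is in flight per round trip."""
    return packet_bytes * 8 / (rtt_ms / 1000.0) / 1e6


def bandwidth_delay_product_bytes(link_gbps, rtt_ms):
    """Bytes that must be in flight to keep the link fully used."""
    return link_gbps * 1e9 / 8 * (rtt_ms / 1000.0)


# Assumed values for illustration: 1500-byte packets, 50 ms round trip.
rate = stop_and_wait_mbps(1500, 50)
bdp = bandwidth_delay_product_bytes(1, 50)

print(f"Stop-and-wait over 50 ms RTT: {rate:.2f} Mbps")
print(f"In-flight data needed to fill 1 Gbps at 50 ms: {bdp / 1e6:.2f} MB")
```

With these numbers the wait-for-reply pattern uses a tiny fraction of a gigabit path, while filling the path needs several megabytes of buffer per connection.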

9 Network Diagnostic Tool (NDT)
- Rich Carlson (Internet2) has developed NDT, which runs on a system with a Web100 instrumented kernel.
- Measure performance to the user's desktop
- Identify real problems for real users
  - Network infrastructure is the problem
  - Host tuning issues are the problem
- Make tool simple to use and understand
- Make tool useful for users and network administrators

10 Local Host - LISA
- Localhost Information Service Agent. LISA is a Java Web Start application which provides:
  - Integration with MonALISA
  - Complete monitoring of the system (load, CPU, memory, disk, disk I/O, paging, processes, network traffic and connectivity...). History and instantaneous measurements
  - Filters to trigger actions when predefined conditions are detected.
  - A user-friendly GUI to present the monitoring information.
  - Optimization modules for distributed applications.
  - It is a lightweight application that can be easily deployed on any system.
  - Modules for end-to-end network measurements (e.g. IPERF).
  - See

11 Network Research and LHC
- In addition to physical networking infrastructure, ATLAS needs the capability to efficiently and effectively utilize provisioned bandwidth.
- Numerous complementary network research efforts are trying to provide this: UltraLight, TeraPaths, OSCARS, Lambda Station, USNet, HOPI and others.
- I will give a quick overview of two projects. Others are detailed in additional slides at the end for reference
- Goal: to get the useful research into the Tier-2s!

12
- Next generation Information System with the network as a managed component
- Hybrid network infrastructure: packet-switched + dynamic optical paths
- End-to-end monitoring; real-time tracking and optimization; dynamic bandwidth provisioning
- Goal: Enable the network as an integrated managed resource
- Meta-Goal: Enable physics analysis & discoveries which otherwise could not be achieved

13 PLaNetS: Physics Lambda Network Services
- PLaNetS is a step toward a managed network services infrastructure for LHC
- Submitted to NSF Feb 3, 2006
- Collaboration of BNL, Caltech, FNAL, FIU, U Florida, Michigan
- PLaNetS will focus on the efficient use of the network for data transport
- Collaborating with OSG: one shared FTE

14 Status for US ATLAS
- We should conduct a detailed network survey of our sites and maintain it.
  - Being done so far…
- We should install NDT “test points” at each Tier-2 and the Tier-1 for problem diagnosis (new Knoppix CD option)
  - Requires a host system (gigabit?) and a well documented IP for each site
- IEPM, PerfSONAR and/or MonALISA should be installed and configured to monitor our sites' (and inter-site) networks (partially done as part of OSG)
- Use of LISA or end-host agent to help diagnose and auto-tune systems for optimal network performance.
  - Package being developed. End of March timescale. Install on NDT machine / servers?
- We should plan for testing and eventual adoption of useful research deliverables (TeraPaths, UltraLight, Lambda Station, OSCARS and others)

15 Relatively “Easy” Recommendation
- As part of UltraLight a Java application called FDT (Fast Data Transport) has been developed by Iosif Legrand
- See for details. Integrated with LISA.
- Recommendation is to provide FDT on one or more hosts at each Tier-2 for testing of data transport. Could be co-located on the “NDT” host each site provides. Next slide shows what was achievable in the WAN with relatively inexpensive systems

16 FDT Performance
[Plot: FDT transfer rate, ~200 MB/sec]
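To put the ~200 MB/sec figure in network terms, the following sketch converts it to Gbps and estimates how long a sustained transfer would take. It is not from the slides: the function names and the 1 TB dataset size are hypothetical, units are decimal (1 MB = 10^6 bytes, 1 TB = 10^12 bytes), and the rate is assumed to hold for the whole transfer.

```python
def mb_per_s_to_gbps(rate_mb_per_s):
    """Convert a payload rate in MB/sec to Gbps (decimal units)."""
    return rate_mb_per_s * 8 / 1000.0


def transfer_hours(dataset_tb, rate_mb_per_s):
    """Hours to move a dataset at a constant sustained rate."""
    seconds = dataset_tb * 1e6 / rate_mb_per_s
    return seconds / 3600.0


fdt_rate = 200  # MB/sec, the approximate figure quoted on the slide

print(f"{fdt_rate} MB/sec is {mb_per_s_to_gbps(fdt_rate):.1f} Gbps")
print(f"A hypothetical 1 TB dataset would take {transfer_hours(1, fdt_rate):.2f} hours")
```

Since 200 MB/sec exceeds what a single gigabit link can carry, a test host needs faster connectivity, such as the 10GE connection suggested in the action items, to reproduce it.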

17 Some Network Related Points
- Networks will be vital to the success of our US ATLAS efforts. Last mile issues are still a challenge…
- Network technologies and services are always evolving, requiring us to test and develop with current networks while planning for the future.
- We must continue to maintain awareness of networking issues for our collaborators, network providers and funding agencies.
- We need to determine what gaps exist in network infrastructure, services and support and work to ensure those gaps are closed before they adversely impact our program. (Integrate research and practice)

18 Action Items for Tier-2s
- Define host for NDT service, install and “publish”
- Check/update network diagrams
- Provide a network test host (FDT, LISA installs)
  - Ideally 10GE connected, reasonable memory (4 GB), few disks
- Install TeraPaths (see Dantong’s talk at end of session)
- Longer term will explore additional capabilities related to monitoring end-to-end, network management and diagnosis

19 Thanks
QUESTIONS?

20 References
- HENP Internet2 Sponsored Interest Group
- The LHC-Optical Private Network (OPN) page:
- Network Tuning and Optimization
  - PSC tuning page:
  - Internet2 End-to-end page:
- Network Research Projects:
  - UltraLight http://www.ultralight.org
  - TeraPaths
  - Lambda Station http://www.lambdastation.org
  - OSCARS
  - UltraScienceNet
  - HOPI
- HEP Network related sites
  - ICFA-SCIC (status of HEP related networks)

21 Some Network Presentations and URLs
- Russ Hobby (I2) noted an interesting talk from Spring 2006 Internet2 meeting…moving football videos between Universities and the corresponding network issues:
- Rich Carlson (I2) has a nice “problem solving” set of presentations at:
- Les Cottrell (SLAC) pointed out some useful talks/links:
  - The link for Les’s comprehensive (and excellent) talk on Diagnosing Network problems is at:
  - A network case studies page is at:
  - SLAC has a web page in case of WAN problems. It is at:
