
1 SC|05 Bandwidth Challenge. ESCC Meeting, 9th February '06. Yee-Ting Li, Stanford Linear Accelerator Center

2

3 LHC Network Requirements (tiered-architecture diagram). CERN/Outside Resource Ratio ~1:2; Tier0 / (Σ Tier1) / (Σ Tier2) ~1:1:1. The Online System (physics data cache ~PByte/sec) feeds the CERN Tier 0+1 center at ~150-1500 MBytes/sec; Tier 1 centers (FNAL, IN2P3, INFN, RAL) connect to Tier 0+1 at 10-40 Gbps; Tier 2 centers connect at ~10 Gbps / ~1-10 Gbps, with 1 to 10 Gbps down to Tier 3/4 institute workstations. Data volumes: tens of Petabytes by 2007-8, an Exabyte ~5-7 years later.

4 Overview
 Bandwidth Challenge
 'The Bandwidth Challenge highlights the best and brightest in new techniques for creating and utilizing vast rivers of data that can be carried across advanced networks.'
 Transfer as much data as possible using real applications over a 2 hour window
 We did...
 Distributed TeraByte Particle Physics Data Sample Analysis
 'Demonstrated high speed transfers of particle physics data between host labs and collaborating institutes in the USA and worldwide. Using state of the art WAN infrastructure and Grid Web Services based on the LHC Tiered Architecture, they showed real-time particle event analysis requiring transfers of Terabyte-scale datasets.'

5 Overview
 In detail, during the bandwidth challenge (2 hours):
 131 Gbps measured by the SCInet BWC team on 17 of our waves (15-minute average)
 95.37 TB of data transferred (3.8 DVDs per second)
 90-150 Gbps sustained (peak 150.7 Gbps)
 On the day of the challenge:
 Transferred ~475 TB 'practising' (waves were shared, still tuning applications and hardware)
 Peak one-way USN utilisation observed on a single link was 9.1 Gbps (Caltech) and 8.4 Gbps (SLAC)
 Also wrote to StorCloud:
 SLAC: wrote 3.2 TB in 1649 files during the BWC
 Caltech: 6 GB/sec with 20 nodes
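A back-of-the-envelope check, assuming decimal units (1 TB = 10^12 bytes), recovers the average rate implied by the total volume and shows it sits comfortably inside the quoted 90-150 Gbps range:

```python
# Sanity check: average throughput implied by the total volume moved
# during the 2-hour Bandwidth Challenge window.
# Assumes decimal units (1 TB = 1e12 bytes); the official SCInet
# accounting may have differed slightly.

total_bytes = 95.37e12          # 95.37 TB transferred during the BWC
window_seconds = 2 * 60 * 60    # 2-hour challenge window

avg_gbps = total_bytes * 8 / window_seconds / 1e9
print(f"average throughput ~ {avg_gbps:.0f} Gbit/s")
# -> ~106 Gbit/s, consistent with the measured 90-150 Gbit/s range and
#    the 131 Gbps 15-minute average reported by the SCInet BWC team.
```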

6 Participants
 Caltech/HEP/CACR/NetLab: Harvey Newman, Julian Bunn (contact), Dan Nae, Sylvain Ravot, Conrad Steenberg, Yang Xia, Michael Thomas
 SLAC/IEPM: Les Cottrell, Gary Buhrmaster, Yee-Ting Li, Connie Logg
 FNAL: Matt Crawford, Don Petravick, Vyto Grigaliunas, Dan Yocum
 University of Michigan: Shawn McKee, Andy Adamson, Roy Hockett, Bob Ball, Richard French, Dean Hildebrand, Erik Hofer, David Lee, Ali Lotia, Ted Hanss, Scott Gerstenberger
 U Florida: Paul Avery, Dimitri Bourilkov
 University of Manchester: Richard Hughes-Jones
 CERN, Switzerland: David Foster
 KAIST, Korea: Yusung Kim
 Kyungpook University, Korea: Kihwan Kwon
 UERJ, Brazil: Alberto Santoro
 UNESP, Brazil: Sergio Novaes
 USP, Brazil: Luis Fernandez Lopez
 GLORIAD, USA: Greg Cole, Natasha Bulashova

7 Networking Overview
 We had 22 10 Gbit/s waves to the Caltech and SLAC/FNAL booths. Of these:
 15 waves to the Caltech booth (from Florida (1), Korea/GLORIAD (1), Brazil (1 x 2.5 Gbit/s), Caltech (2), LA (2), UCSD (1), CERN (2), U Michigan (3), FNAL (2))
 7 x 10 Gbit/s waves to the SLAC/FNAL booth (2 from SLAC, 1 from the UK, and 4 from FNAL)
 The waves were provided by Abilene, CANARIE, Cisco (5), ESnet (3), GLORIAD (1), HOPI (1), Michigan LambdaRail (MiLR), National LambdaRail (NLR), TeraGrid (3) and UltraScienceNet (4).
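A quick tally of the per-booth wave counts above (a sketch; all waves assumed to be 10 Gbit/s except the single 2.5 Gbit/s Brazil wave) gives the nominal aggregate capacity available to the two booths:

```python
# Nominal aggregate wave capacity into the two booths, tallied from the
# counts quoted on the slide.

caltech_waves = {"Florida": 1, "Korea/GLORIAD": 1, "Brazil": 1, "Caltech": 2,
                 "LA": 2, "UCSD": 1, "CERN": 2, "U Michigan": 3, "FNAL": 2}
slac_fnal_waves = {"SLAC": 2, "UK": 1, "FNAL": 4}

total_waves = sum(caltech_waves.values()) + sum(slac_fnal_waves.values())
capacity_gbps = (total_waves - 1) * 10 + 2.5   # one Brazil wave at 2.5 Gbit/s

print(total_waves, "waves,", capacity_gbps, "Gbit/s nominal capacity")
# -> 22 waves, 212.5 Gbit/s nominal capacity
```

By that count, the 150.7 Gbps peak observed during the challenge corresponds to roughly 70% of the nominal wave capacity.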

8 Network Overview

9 Hardware (SLAC only)
 At SLAC:
 14 x 1.8 GHz Sun v20z (dual Opteron)
 2 x Sun 3500 disk trays (2 TB of storage)
 12 x Chelsio T110 10Gb NICs (LR)
 2 x Neterion/S2io Xframe I NICs (SR)
 Dedicated Cisco 6509 with 4 x 4-port 10Gb blades
 At SC|05:
 14 x 2.6 GHz Sun v20z (dual Opteron)
 10 QLogic HBAs for StorCloud access
 50 TB of storage at SC|05 provided by 3PAR (shared with Caltech)
 12 x Neterion/S2io Xframe I NICs (SR)
 2 x Chelsio T110 NICs (LR)
 Shared Cisco 6509 with 6 x 4-port 10Gb blades

10 Hardware at SC|05

11 Software
 BBCP 'BaBar File Copy'
 Uses ssh for authentication
 Multiple-stream capable
 Features 'rate synchronisation' to reduce byte retransmissions
 Sustained over 9 Gbps on a single session
 XrootD
 Library for transparent file access (standard Unix file functions)
 Designed primarily for LAN access (transaction-based protocol)
 Managed over 35 Gbit/sec (in two directions) on 2 x 10 Gbps waves
 Transferred 18 TBytes in 257,913 files
 dCache
 20 Gbps of production and test cluster traffic
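For illustration, a minimal sketch of driving a multi-stream bbcp copy from Python. The host and paths are hypothetical and the tuning values are not the SC|05 settings; -s (parallel streams) and -w (TCP window) follow bbcp's commonly documented options:

```python
# Minimal sketch: launch a multi-stream bbcp transfer from Python.
# Host, path and tuning values are illustrative only.
import subprocess

def bbcp_copy(src, dest, streams=8, window="2M"):
    cmd = [
        "bbcp",
        "-s", str(streams),   # number of parallel TCP streams
        "-w", window,         # TCP window size per stream
        "-P", "10",           # progress report every 10 seconds
        src, dest,
    ]
    subprocess.run(cmd, check=True)

# Hypothetical example: push a dataset file to a remote data node.
# bbcp_copy("/data/run1234.root", "user@remote.example.org:/scratch/run1234.root")
```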

12 Last year (SC|04) BWC Aggregate Bandwidth

13 Cumulative Data Transferred (chart; Bandwidth Challenge period marked)

14 Component Traffic

15 Bandwidth Contributions (chart: traffic in to and out from the booth, broken down into SLAC-ESnet, SLAC-ESnet-USN, SLAC-FermiLab-UK (UKLight), FermiLab-HOPI and FNAL-UltraLight links)

16 SLAC Cluster Contributions (chart: traffic in to and out from the booth via ESnet routed and ESnet SDN layer 2 via USN; Bandwidth Challenge period marked)

17 SLAC/FNAL Booth Aggregate (chart: Mbps, per-wave breakdown)

18 Problems...
 Managerial/PR
 Initial request for loan hardware took place 6 months in advance!
 Lots and lots of paperwork to keep account of all loan equipment
 Logistical
 Set up and tore down a pseudo-production network and servers in the space of a week!
 Testing could not begin until waves were alight
 Most waves lit the day before the challenge!
 Shipping so much hardware is not cheap!
 Setting up monitoring

19 Problems...
 Tried to configure hardware and software prior to the show
 Hardware
 NICs
 We had 3 bad Chelsios (bad memory)
 Xframe IIs did not work in UKLight's Boston machines
 Hard disks
 3 dead 10K disks (had to ship in spares)
 1 x 4-port 10Gb blade DOA
 MTU mismatch between domains
 Router blade died during stress testing the day before the BWC!
 Cables! Cables! Cables!
 Software
 Used golden disks for duplication (still takes 30 minutes per disk to replicate!)
 Linux kernels:
 Initially used 2.6.14, found severe performance problems compared to 2.6.12
 (New) router firmware caused crashes under heavy load
 Unfortunately, only discovered just before the BWC
 Had to manually restart the affected ports during the BWC
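An MTU mismatch between domains is typically caught by probing whether jumbo-sized packets survive end to end. A minimal sketch, assuming Linux iputils ping and a hypothetical remote host (not the actual SC|05 diagnostic procedure):

```python
# Minimal sketch: check whether jumbo frames (MTU 9000) survive the path
# to a remote host by sending non-fragmentable pings just under the MTU.
# -M do forbids fragmentation, -s sets the ICMP payload size.
import subprocess

def path_supports_mtu(host, mtu=9000, overhead=28):
    """Return True if a packet of `mtu` bytes crosses the path unfragmented.
    `overhead` accounts for the 20-byte IP and 8-byte ICMP headers."""
    result = subprocess.run(
        ["ping", "-c", "3", "-M", "do", "-s", str(mtu - overhead), host],
        capture_output=True,
    )
    return result.returncode == 0

# print(path_supports_mtu("data-node.example.org"))
# A False result hints at an MTU mismatch somewhere between the local
# and remote network domains.
```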

20 Problems
 Most transfers were from memory to memory (ramdisk, etc.)
 Local caching of (small) files in memory
 Reading and writing to disk will be the next bottleneck to overcome

21 Conclusion
 Previewed the IT challenges of the next generation of data-intensive science applications (high energy physics, astronomy, etc.)
 Petabyte-scale datasets
 Tens of national and transoceanic links at 10 Gbps (and up)
 100+ Gbps aggregate data transport sustained for hours; we reached a Petabyte/day transport rate for real physics data
 Learned to gauge the difficulty of the global networks and transport systems required for the LHC mission
 Set up, shook down and successfully ran the systems in < 1 week
 Understood and optimized the configurations of various components (network interfaces, routers/switches, OS, TCP kernels, applications) for high performance over the wide area network

22 Conclusion
 Products from this exercise:
 An optimized Linux (2.6.12 + NFSv4 + FAST and other TCP stacks) kernel for data transport, after 7 full kernel-build cycles in 4 days
 A newly optimized application-level copy program, bbcp, that matches the performance of iperf under some conditions
 Extensions of Xrootd, an optimized low-latency file access application for clusters, across the wide area
 Understanding of the limits of 10 Gbps-capable systems under stress
 How to effectively utilize 10GE- and 1GE-connected systems to drive 10 gigabit wavelengths in both directions
 Use of production and test clusters at FNAL reaching more than 20 Gbps of network throughput
 Significant efforts remain from the perspective of high-energy physics:
 Management, integration and optimization of network resources
 End-to-end capabilities able to utilize these network resources; this includes applications and I/O devices (disk and storage systems)
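For context, the iperf baseline referred to above is the usual memory-to-memory TCP test. A sketch of a comparable run driven from Python, using standard iperf2 flags (-c client, -P parallel streams, -w window, -t duration); the host name and tuning values are illustrative, not the SC|05 settings:

```python
# Sketch: a memory-to-memory iperf baseline of the kind bbcp was being
# compared against. Values are illustrative only.
import subprocess

def iperf_baseline(host, streams=4, window="4M", seconds=30):
    cmd = ["iperf", "-c", host, "-P", str(streams),
           "-w", window, "-t", str(seconds)]
    return subprocess.run(cmd, capture_output=True, text=True).stdout

# The receiving end would run `iperf -s -w 4M`; comparing the reported
# aggregate bandwidth with a bbcp memory-to-memory copy shows how close
# the application gets to the raw TCP limit of the path.
# print(iperf_baseline("booth-node.example.org"))
```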

23 Press and PR
 11/8/05 - 'Brit Boffins aim to Beat LAN speed record', vnunet.com
 'SC|05 Bandwidth Challenge', SLAC Interaction Point
 'Top Researchers, Projects in High Performance Computing Honored at SC/05...', Business Wire (press release), San Francisco, CA, USA
 11/18/05 - Official Winner Announcement
 11/18/05 - SC|05 Bandwidth Challenge Slide Presentation
 11/23/05 - 'Bandwidth Challenge Results', Slashdot
 12/6/05 - Caltech press release
 12/6/05 - 'Neterion Enables High Energy Physics Team to Beat World Record Speed at SC05 Conference', CCN Matthews News Distribution Experts
 'High energy physics team captures network prize at SC|05', SLAC
 'High energy physics team captures network prize at SC|05', EurekAlert!
 12/7/05 - 'High Energy Physics Team Smashes Network Record', Science Grid This Week
 'Congratulations to our Research Partners for a New Bandwidth Record at SuperComputing 2005', Neterion

24

25 SLAC/UK Contribution (chart: traffic in to and out from the booth via ESnet routed, ESnet/USN layer 2 and UKLight)

26 SLAC/ESnet Contribution (chart: Mbps, per-host and aggregate)

27 FermiLab Contribution (chart: HOPI, USN and UltraLight links)

