
1 Global Lambdas and Grids for Particle Physics in the LHC Era. Harvey B. Newman, California Institute of Technology. SC2005, Seattle, November 14-18, 2005

2 Beyond the SM: Great Questions of Particle Physics and Cosmology
1. Where does the pattern of particle families and masses come from?
2. Where are the Higgs particles; what is the mysterious Higgs field?
3. Why do neutrinos and quarks oscillate?
4. Is Nature Supersymmetric?
5. Why is any matter left in the universe?
6. Why is gravity so weak?
7. Are there extra space-time dimensions?
You Are Here. We do not know what makes up 95% of the universe.

3 Large Hadron Collider, CERN, Geneva: 2007 Start
- 27 km tunnel in Switzerland & France
- pp collisions at √s = 14 TeV, L = 10^34 cm^-2 s^-1
- Experiments: ATLAS and CMS (general purpose pp, also heavy ions), ALICE (heavy ions), LHCb (B-physics), TOTEM
- Physics: Higgs, SUSY, Extra Dimensions, CP Violation, QG Plasma, ... the Unexpected
- 5000+ Physicists, 250+ Institutes, 60+ Countries
- Challenges: Analyze petabytes of complex data cooperatively; harness global computing, data & network resources

4 LHC Data Grid Hierarchy. Emerging Vision: A Richly Structured, Global Dynamic System
- CERN/Outside Resource Ratio ~1:2; Tier0/(Σ Tier1)/(Σ Tier2) ~1:1:1
- Experiment and Online System feed the Tier 0+1 CERN Center (PBs of disk; tape robot)
- Tier 1 centers: FNAL, IN2P3, INFN, RAL; then Tier 2 centers, Tier 3 (institute workstations, physics data cache), Tier 4
- Link rates in the figure: ~PByte/sec from the experiment, ~150-1500 MBytes/sec into the CERN Center, 10-40 Gbps Tier 0+1 to Tier 1, ~1-10 Gbps to Tier 2 centers, 1 to 10 Gbps down to Tier 3/Tier 4
- Tens of Petabytes by 2007-8. An Exabyte ~5-7 years later.
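A rough feel for what these link rates mean in practice, as my own illustrative arithmetic rather than anything from the slide: the sketch below computes how long a hypothetical 10 TB event dataset takes to traverse links at the quoted rates, assuming ideal, fully utilized links and no protocol overhead.

    # Back-of-the-envelope transfer times for the tiered links quoted above.
    # Illustrative only: assumes ideal, fully utilized links, no protocol overhead.

    def transfer_time_hours(dataset_bytes: float, link_gbps: float) -> float:
        """Time to move dataset_bytes over a link of link_gbps (decimal units)."""
        seconds = dataset_bytes * 8 / (link_gbps * 1e9)
        return seconds / 3600.0

    dataset = 10e12  # a hypothetical 10 TB event dataset

    for label, gbps in [("Tier0 -> Tier1 at 10 Gbps", 10),
                        ("Tier0 -> Tier1 at 40 Gbps", 40),
                        ("Tier1 -> Tier2 at 10 Gbps", 10),
                        ("Tier2 -> Tier3 at 1 Gbps", 1)]:
        print(f"{label}: {transfer_time_hours(dataset, gbps):.1f} hours for 10 TB")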

5 Long Term Trends in Network Traffic Volumes: 300-1000X/10Yrs (W. Johnston, R. Cottrell)
- SLAC traffic growth in steps: ~10X/4 years
- Projected: ~2 Terabits/s by ~2014
- "Summer" '05: 2 x 10 Gbps links: one for production, one for R&D
- [Chart: ESnet Accepted Traffic 1990-2005, Terabytes per month; progress in steps to 10 Gbit/s] Exponential growth: +82%/year for the last 15 years; 400X per decade
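The "+82%/year" and "400X per decade" figures quoted above are consistent with each other; a one-line check (my own arithmetic, not from the slide):

    # Compound annual growth of 82% sustained over one decade.
    growth_per_year = 1.82
    decade_factor = growth_per_year ** 10
    print(f"{decade_factor:.0f}x per decade")  # ~399x, matching the quoted ~400X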

6 Internet2 Land Speed Record (LSR)
- IPv4 multi-stream record with FAST TCP: 6.86 Gbps x 27 kkm: Nov 2004
- IPv6 record: 5.11 Gbps between Geneva and Starlight: Jan 2005
- Disk-to-disk marks: 536 MBytes/sec (Windows); 500 MBytes/sec (Linux)
- End system issues: PCI-X bus, Linux kernel, NIC drivers, CPU
- NB: Manufacturers' roadmaps for 2006: one server pair to one 10G link
- [Chart: Internet2 LSRs, throughput in Petabit-m/sec; Blue = HEP; Nov. 2004 record: 7.2G x 20.7 kkm]
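The LSR metric is a bandwidth-distance product; converting the Nov. 2004 chart numbers quoted above into the chart's units (my arithmetic, not from the slide):

    # The Internet2 LSR metric is throughput x distance, in petabit-meters/second.
    gbps = 7.2      # Nov. 2004 record throughput quoted in the chart
    kkm = 20.7      # distance in thousands of km
    bit_meters_per_sec = gbps * 1e9 * kkm * 1e6      # bits/s x meters
    print(bit_meters_per_sec / 1e15, "Petabit-m/sec")  # ~149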

7 HENP Bandwidth Roadmap for Major Links (in Gbps) Continuing Trend: ~1000 Times Bandwidth Growth Per Decade; HEP: Co-Developer as well as Application Driver of Global Nets

8 LHCNet, ESnet Plan 2006-2009: 20-80 Gbps US-CERN, ESnet MANs, IRNC
- [Map: ESnet/LHCNet topology with hubs at SEA, SNV, SDG, DEN, ELP, ALB, ATL, CHI, NYC, DC; metropolitan area rings; major DOE Office of Science sites; high-speed cross connects with Internet2/Abilene; links to Europe (GEANT2, SURFnet), Japan, Australia, AsiaPac; CERN, FNAL, BNL, IN2P3]
- Production IP ESnet core: 10 Gbps enterprise IP traffic; Science Data Network core: 40-60 Gbps circuit transport; ESnet 2nd core: 30-50G; ESnet IP core ≥10 Gbps
- LHCNet Data Network: 2 to 8 x 10 Gbps US-CERN
- LHCNet US-CERN wavelength triangle: 10/05: 10G CHI + 10G NY; 2007: 20G + 20G; 2009: ~40G + 40G
- ESnet MANs to FNAL & BNL; dark fiber (60 Gbps) to FNAL
- NSF/IRNC circuit; GVA-AMS connection via SURFnet or GEANT2

9 Global Lambdas for Particle Physics: Caltech/CACR and FNAL/SLAC Booths
- Preview global-scale data analysis of the LHC Era (2007-2020+), using next-generation networks and intelligent grid systems
- Using state-of-the-art WAN infrastructure and Grid-based Web service frameworks, based on the LHC Tiered Data Grid Architecture
- Using a realistic mixture of streams: organized transfer of multi-TB event datasets, plus numerous smaller flows of physics data that absorb the remaining capacity
- The analysis software suites are based on the Grid-enabled Analysis Environment (GAE) developed at Caltech and U. Florida, as well as Xrootd from SLAC and dCache from FNAL
- Monitored by Caltech's MonALISA global monitoring and control system

10 Global Lambdas for Particle Physics: Caltech/CACR and FNAL/SLAC Booths
- We used twenty-two [*] 10 Gbps waves to carry bidirectional traffic between Fermilab, Caltech, SLAC, BNL, CERN and other partner Grid Service sites including: Michigan, Florida, Manchester, Rio de Janeiro (UERJ) and Sao Paulo (UNESP) in Brazil, Korea (KNU), and Japan (KEK)
- Results:
  - 151 Gbps peak, 100+ Gbps of throughput sustained for hours: 475 Terabytes of physics data transported in < 24 hours
  - 131 Gbps measured by the SCInet BWC team on 17 of our waves
  - Used real physics applications and production as well as test systems for data access, transport and analysis: bbcp, xrootd, dCache, and GridFTP, plus grid analysis tool suites (see the transfer sketch below)
  - Linux kernel tuned for TCP-based protocols, including Caltech's FAST
  - Far surpassing our previous SC2004 BWC record of 101 Gbps
[*] 15 at the Caltech/CACR and 7 at the FNAL/SLAC booth
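The slides name the transfer tools but not their invocation. As a rough illustration only, a bulk transfer of one dataset file with bbcp might be driven as below; the host names, paths, stream count, and window size are hypothetical, and option values should be checked against the bbcp documentation for a given installation.

    # Minimal sketch of driving bbcp for a bulk dataset transfer (illustrative only).
    # Host names, paths, stream count, and window size are hypothetical examples.
    import subprocess

    def bbcp_transfer(src: str, dst: str, streams: int = 8, window: str = "4m") -> None:
        """Copy one file with bbcp using multiple TCP streams and a large window."""
        cmd = ["bbcp",
               "-s", str(streams),   # number of parallel TCP streams
               "-w", window,         # per-stream TCP window size
               src, dst]
        subprocess.run(cmd, check=True)

    # Example: push one event file from a booth server to a remote site.
    bbcp_transfer("/data/events/run0001.root",
                  "tier2.example.edu:/storage/events/run0001.root")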

11 Monitoring NLR, Abilene/HOPI, LHCNet, USNet, TeraGrid, PWave, SCInet, Gloriad, JGN2, WHREN, other Int'l R&E Nets, and 14,000+ Grid Nodes Simultaneously (I. Legrand)


14 Switch and Server Interconnections at the Caltech Booth (#428)
- 15 10G waves
- 72 nodes with 280+ cores
- 64 10G switch ports: 2 fully populated Cisco 6509Es
- 45 Neterion 10 GbE NICs
- 200 SATA disks
- 40 Gbps (20 HBAs) to StorCloud
- Thursday - Sunday setup
- http://monalisa-ul.caltech.edu:8080/stats?page=nodeinfo_sys


18 Fermilab
- Our BWC data sources are the production storage systems and file servers used by:
  - CDF
  - DØ
  - US CMS Tier 1
  - Sloan Digital Sky Survey
- Each of these produces, stores and moves multi-TB to PB-scale data: tens of TB per day
- ~600 GridFTP servers (of 1000s) directly involved (see the sketch below)
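For context, a transfer from one of these GridFTP servers is typically driven by a client such as globus-url-copy; the sketch below wraps it from Python. The endpoint, path, and tuning values are hypothetical, and it assumes a valid grid proxy; -p and -tcp-bs are the usual knobs for parallel streams and TCP buffer size, but check the client documentation for your version.

    # Sketch: pull a file from a GridFTP (gsiftp) server with globus-url-copy.
    # Endpoint, path, and tuning values are hypothetical; assumes a valid grid proxy.
    import subprocess

    def gridftp_fetch(remote_url: str, local_path: str,
                      parallel_streams: int = 4, tcp_buffer_bytes: int = 2097152) -> None:
        cmd = ["globus-url-copy",
               "-p", str(parallel_streams),       # parallel data streams
               "-tcp-bs", str(tcp_buffer_bytes),  # TCP buffer size per stream (bytes)
               remote_url, f"file://{local_path}"]
        subprocess.run(cmd, check=True)

    gridftp_fetch("gsiftp://cmsstor01.example.gov//store/data/run0001.root",
                  "/local/scratch/run0001.root")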

19 Fermilab Mass Storage Facilities
- Over 3.3 PB stored
- Ingest ~200 TB/month
- 20 to 300 TB/day read
- Disk pools: "dCache" backed by tape through "Enstore"
- Multiple data transfer protocols:
  - WAN: gsiftp, http
  - LAN: dcap (presents POSIX I/O interface)
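As an illustration of the LAN path, a file held in dCache can be copied out over dcap with the dccp client; the door host, port number, and pnfs path below are hypothetical placeholders.

    # Sketch: read a file out of dCache over the LAN dcap protocol using dccp.
    # The door host, port number, and pnfs path are hypothetical placeholders.
    import subprocess

    def dcache_copy(pnfs_path: str, local_path: str,
                    door: str = "dcachedoor.example.gov:22125") -> None:
        dcap_url = f"dcap://{door}{pnfs_path}"   # dcap URL pointing at the dCache door
        subprocess.run(["dccp", dcap_url, local_path], check=True)

    dcache_copy("/pnfs/example.gov/usr/cms/run0001.root", "/local/scratch/run0001.root")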


21 Xrootd Server Performance (A. Hanushevsky)
- Scientific results: ad hoc analysis of multi-TByte archives; immediate exploration; spurs novel discovery approaches
- Linear scaling: hardware performance; deterministic sizing
- High capacity: thousands of clients; hundreds of parallel streams
- Very low latency: 12 us + transfer cost; device + NIC limited; excellent across WANs

22 Xrootd Clustering
- [Diagram: a client asks the redirector (head node) to open file X; the redirector asks its data servers A, B, C "Who has file X?" (a supervisor/sub-redirector fronts further servers D, E, F); a server replies "I have", the client is told "go to C" (or "go to F") and reopens file X there. The client sees all servers as xrootd data servers.]
- Unbounded clustering: self organizing
- Total fault tolerance: automatic real-time reorganization
- Result: minimum admin overhead; better client CPU utilization; more results in less time at less cost
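From the client side this redirection is invisible: opening a root:// URL against the redirector is enough. A minimal PyROOT sketch, where the redirector host and file path are hypothetical and a ROOT build with xrootd support is assumed:

    # Sketch: open a file through an xrootd redirector; the redirector locates the
    # data server holding the file and the client is redirected there transparently.
    # Redirector host and path are hypothetical; requires ROOT built with xrootd support.
    import ROOT

    f = ROOT.TFile.Open("root://redirector.example.edu//store/events/run0001.root")
    if f and not f.IsZombie():
        print("opened via the cluster:", f.GetName())
        f.Close()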

23 GAE Services and ROOT Analysis at Remote Sites: Caltech, UFL, Brazil, ...
- Authenticated users automatically discover and initiate multiple transfers of physics datasets (ROOT files) through secure Clarens-based GAE services
- Transfers are monitored through MonALISA
- Once data arrives at the target (remote) sites, analysis can be started by authenticated users, using the ROOT analysis framework
- Using the Clarens ROOT viewer or COJAC event viewer, data from remote sites can be presented transparently to the user
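Once a dataset has landed at the remote site, the ROOT analysis step is conventional; a minimal sketch, in which the file name, tree name, and branch name are hypothetical placeholders:

    # Sketch: a minimal ROOT analysis step on a transferred dataset.
    # File name, tree name ("Events"), and branch ("muon_pt") are hypothetical.
    import ROOT

    f = ROOT.TFile.Open("/local/scratch/run0001.root")
    tree = f.Get("Events")                      # hypothetical TTree name
    hist = ROOT.TH1F("h_pt", "Muon pT;pT [GeV];entries", 100, 0.0, 200.0)
    tree.Draw("muon_pt >> h_pt")                # fill the histogram from the branch
    print("entries:", hist.GetEntries())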

24 SC|05 Abilene and HOPI Waves

25 GLORIAD: 10 Gbps Optical Ring Around the Globe by March 2007
GLORIAD circuits today:
- 10 Gbps Hong Kong-Daejon-Seattle
- 10 Gbps Seattle-Chicago-NYC (CANARIE contribution to GLORIAD)
- 622 Mbps Moscow-AMS-NYC
- 2.5 Gbps Moscow-AMS
- 155 Mbps Beijing-Khabarovsk-Moscow
- 2.5 Gbps Beijing-Hong Kong
- 1 GbE NYC-Chicago (CANARIE)
China, Russia, Korea, Japan, US, Netherlands partnership; US: NSF IRNC Program

26 ESLEA/UKLight SC|05 Network Diagram [Diagram: OC-192 link; 6 x 1 GE]

27 KNU (Korea) Main Goals
- Uses the 10 Gbps GLORIAD link from Korea to the US, called BIG-GLORIAD, also part of UltraLight
- Try to saturate this BIG-GLORIAD link with servers and cluster storage connected at 10 Gbps (see the window-sizing sketch below)
- Korea is planning to be a Tier-1 site for LHC experiments
[Map: Korea - BIG-GLORIAD - U.S.]
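Saturating a single 10 Gbps transpacific path with TCP requires windows on the order of the bandwidth-delay product; the sketch below works that out for an assumed Korea-US round-trip time of roughly 130 ms (the RTT is my illustrative assumption, not a figure from the slide).

    # Bandwidth-delay product for a 10 Gbps transpacific path.
    # The 130 ms RTT is an assumed, illustrative value, not a measured figure.
    link_bps = 10e9        # 10 Gbps
    rtt_s = 0.130          # assumed Korea-US round-trip time

    bdp_bytes = link_bps * rtt_s / 8
    print(f"Window of ~{bdp_bytes / 1e6:.0f} MB needed to fill the pipe")
    # A default-window TCP stream falls far short, hence the kernel tuning,
    # FAST TCP, and/or many parallel streams used in the bandwidth challenge.
    print(f"Streams needed with a 1 MB window: ~{bdp_bytes / 1e6:.0f}")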

28 KEK (Japan) at SC05: 10GE Switches on the KEK-JGN2-StarLight Path
- JGN2: 10G network research testbed, operational since 4/04
- 10 Gbps L2 between Tsukuba and Tokyo Otemachi
- 10 Gbps IP to StarLight since August 2004
- 10 Gbps L2 to StarLight since September 2005
- Otemachi-Chicago OC192 link replaced by 10GE WANPHY in September 2005

29 Brazil HEPGrid: Rio de Janeiro (UERJ) and Sao Paulo (UNESP)

30 "Global Lambdas for Particle Physics": A Worldwide Network & Grid Experiment
- We have previewed the IT challenges of next generation science at the high energy frontier (for the LHC and other major programs):
  - Petabyte-scale datasets
  - Tens of national and transoceanic links at 10 Gbps (and up)
  - 100+ Gbps aggregate data transport sustained for hours; we reached a Petabyte/day transport rate for real physics data (see the check below)
- We set the scale and learned to gauge the difficulty of the global networks and transport systems required for the LHC mission
- But we set up, shook down and successfully ran the system in < 1 week
- We have substantive take-aways from this marathon exercise:
  - An optimized Linux (2.6.12 + FAST + NFSv4) kernel for data transport, after 7 full kernel-build cycles in 4 days
  - A newly optimized application-level copy program, bbcp, that matches the performance of iperf under some conditions
  - Extension of Xrootd, an optimized low-latency file access application for clusters, across the wide area
  - Understanding of the limits of 10 Gbps-capable systems under stress
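The Petabyte/day figure follows directly from the sustained rate quoted above; a quick consistency check (my arithmetic, decimal units):

    # Consistency check: 100 Gbps sustained for a day is about a petabyte of data.
    sustained_gbps = 100
    bytes_per_day = sustained_gbps * 1e9 / 8 * 86400
    print(f"{bytes_per_day / 1e15:.2f} PB/day at {sustained_gbps} Gbps sustained")  # ~1.08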

31 "Global Lambdas for Particle Physics": A Worldwide Network & Grid Experiment
- We are grateful to our many network partners: SCInet, LHCNet, Starlight, NLR, Internet2's Abilene and HOPI, ESnet, UltraScience Net, MiLR, FLR, CENIC, Pacific Wave, UKLight, TeraGrid, Gloriad, AMPATH, RNP, ANSP, CANARIE and JGN2
- And to our partner projects: US CMS, US ATLAS, D0, CDF, BaBar, US LHCNet, UltraLight, LambdaStation, Terapaths, PPDG, GriPhyN/iVDGL, LHCNet, StorCloud, SLAC IEPM, ICFA/SCIC and Open Science Grid
- Our supporting agencies: DOE and NSF
- And for the generosity of our vendor supporters, especially Cisco Systems, Neterion, HP, IBM, and many others, who have made this possible
- And the Hudson Bay Fan Company...


33 Extra Slides Follow

34 Global Lambdas for Particle Physics Analysis: SC|05 Bandwidth Challenge Entry
- Caltech, CERN, Fermilab, Florida, Manchester, Michigan, SLAC, Vanderbilt, Brazil, Korea, Japan, et al.
- CERN's Large Hadron Collider experiments: data/compute/network intensive
- Discovering the Higgs, Supersymmetry, or extra space dimensions, with a Global Grid
- Worldwide collaborations of physicists working together, while developing next-generation global network and grid systems

35 [Diagram: an http/https client talks to a Web server hosting Clarens (ACL, X509, Discovery) over XML-RPC, SOAP, Java RMI or JSON-RPC; behind it sit Catalog, Storage, Analysis and Sandbox services and 3rd-party applications; the user selects a dataset, datasets move over the network, and (remote) analysis is started.]
- Clarens services:
  - Authentication
  - Access control on Web Services
  - Remote file access (and access control on files)
  - Discovery of Web Services and software
  - Shell service: shell-like access to remote machines (managed by access control lists)
  - Proxy certificate functionality
  - Virtual Organization management and role management
- The user's point of access to a Grid system; provides an environment where the user can:
  - Access Grid resources and services
  - Execute and monitor Grid applications
  - Collaborate with other users
- A one-stop shop for Grid needs: portals can lower the barrier for users to access Web Services and use Grid-enabled applications
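Since Clarens exposes its services over XML-RPC (among other protocols), a client session might look like the sketch below. The portal URL is hypothetical, the only call shown is generic XML-RPC introspection rather than a specific Clarens method, and real access would also involve the X.509/proxy-certificate authentication described above.

    # Sketch: talking to an XML-RPC web-service endpoint such as a Clarens server.
    # The URL is a hypothetical placeholder; system.listMethods is generic XML-RPC
    # introspection (it may be disabled), not a Clarens-specific API, and the
    # certificate-based authentication a real Clarens server requires is not shown.
    import xmlrpc.client

    portal = xmlrpc.client.ServerProxy("https://gae.example.edu:8443/clarens/")
    try:
        methods = portal.system.listMethods()   # ask the server what it exposes
        print("\n".join(methods))
    except xmlrpc.client.Fault as err:
        print("server fault:", err.faultString)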


