Slide 1
Global Lambdas and Grids for Particle Physics in the LHC Era
Harvey B. Newman, California Institute of Technology
SC2005, Seattle, November 14-18, 2005
Slide 2
Beyond the SM: Great Questions of Particle Physics and Cosmology
1. Where does the pattern of particle families and masses come from?
2. Where are the Higgs particles; what is the mysterious Higgs field?
3. Why do neutrinos and quarks oscillate?
4. Is Nature Supersymmetric?
5. Why is any matter left in the universe?
6. Why is gravity so weak?
7. Are there extra space-time dimensions?
You Are Here: we do not know what makes up 95% of the universe.
Slide 3
Large Hadron Collider, CERN, Geneva: 2007 start
- 27 km tunnel in Switzerland & France; pp collisions at √s = 14 TeV, L = 10^34 cm^-2 s^-1
- Experiments: ATLAS and CMS (general purpose pp), LHCb (B-physics), ALICE (heavy ions, HI), TOTEM
- Physics: Higgs, SUSY, Extra Dimensions, CP Violation, QG Plasma, ... the Unexpected
- 5000+ physicists, 250+ institutes, 60+ countries
- Challenges: analyze petabytes of complex data cooperatively; harness global computing, data & network resources
Slide 4
LHC Data Grid Hierarchy. Emerging Vision: A Richly Structured, Global Dynamic System
[Diagram: tiered architecture running from the experiment's Online System (~PByte/sec) into the Tier 0+1 CERN Center (PBs of disk; tape robot), out to Tier 1 centers (FNAL, IN2P3, INFN, RAL), Tier 2 centers, and Tier 3/Tier 4 institute workstations and physics data caches, with link speeds ranging from ~150-1500 MBytes/sec and 1-10 Gbps up to 10-40 Gbps.]
- CERN/Outside resource ratio ~1:2; Tier0 : Tier1 : Tier2 ~ 1:1:1
- Tens of Petabytes by 2007-8. An Exabyte ~5-7 years later.
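For a rough sense of the scale behind these figures, here is a back-of-the-envelope calculation (not part of the slide) of how long petabyte-scale datasets take to move over the link speeds quoted in the hierarchy, assuming an idealized, fully utilized link:

```python
# Back-of-the-envelope only: time to move a dataset over a single link,
# assuming the link can be kept fully utilized (an idealization).
def transfer_days(dataset_petabytes, link_gbps):
    bits = dataset_petabytes * 1e15 * 8
    seconds = bits / (link_gbps * 1e9)
    return seconds / 86400

print(round(transfer_days(1, 10), 1))   # ~9.3 days for 1 PB over a 10 Gbps link
print(round(transfer_days(10, 40), 1))  # ~23.1 days for 10 PB over a 40 Gbps link
```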
Slide 5
Long Term Trends in Network Traffic Volumes: 300-1000X per 10 Years
- SLAC traffic growth in steps: ~10X per 4 years; projected ~2 Terabits/s by ~2014. "Summer" '05: 2 x 10 Gbps links, one for production, one for R&D (R. Cottrell)
- ESnet accepted traffic, 1990-2005 (Terabytes per month): exponential growth of +82%/year for the last 15 years, i.e. 400X per decade; progress in steps (W. Johnston)
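As a quick cross-check (not on the slide), the quoted growth rates are mutually consistent; a few lines of Python make the compounding explicit:

```python
# Consistency check of the growth figures quoted above.
print(round(1.82 ** 10))                # +82%/year compounded over a decade -> ~400x ("400X per decade")
print(round(10 ** (10 / 4)))            # ~10x every 4 years -> ~316x per decade, inside the 300-1000x band
print(round(0.010 * 10 ** (9 / 4), 1))  # ~10 Gbps (0.010 Tbps) in 2005 at 10x/4yr -> ~1.8 Tbps by ~2014
```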
Slide 6
Internet2 Land Speed Record (LSR)
- IPv4 multi-stream record with FAST TCP: 6.86 Gbps x 27 kkm, Nov. 2004
- IPv6 record: 5.11 Gbps between Geneva and Starlight, Jan. 2005
- Disk-to-disk marks: 536 MBytes/sec (Windows); 500 MBytes/sec (Linux)
- End-system issues: PCI-X bus, Linux kernel, NIC drivers, CPU
- NB: manufacturers' roadmaps for 2006: one server pair to one 10G link
[Chart: Internet2 LSRs, throughput in Petabit-m/sec, blue = HEP; Nov. 2004 record: 7.2G x 20.7 kkm]
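The LSR is ranked by the product of throughput and distance; a small helper (not from the slide) shows how the Petabit-m/sec figures follow from the numbers quoted above:

```python
# Internet2 LSR figure of merit: throughput x distance (kkm = thousands of km).
def lsr_petabit_metres_per_sec(gbps, kkm):
    return (gbps * 1e9) * (kkm * 1e6) / 1e15   # petabit-metres per second

print(round(lsr_petabit_metres_per_sec(6.86, 27.0)))  # ~185, IPv4 multi-stream FAST TCP mark
print(round(lsr_petabit_metres_per_sec(7.2, 20.7)))   # ~149, Nov. 2004 record as plotted
```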
Slide 7
HENP Bandwidth Roadmap for Major Links (in Gbps)
Continuing trend: ~1000 times bandwidth growth per decade. HEP: co-developer as well as application driver of global nets.
Slide 8
LHCNet, ESnet Plan 2006-2009: 20-80 Gbps US-CERN, ESnet MANs, IRNC
[Map: major DOE Office of Science sites, ESnet hubs and new hubs, high-speed cross-connects with Internet2/Abilene, and metropolitan area rings spanning SEA, SNV, SDG, DEN, ELP, ALB, ATL, CHI, NYC and DC, with links to Europe (GEANT2, SURFNet, IN2P3), Japan, Australia and AsiaPac, and the NSF/IRNC circuit providing the GVA-AMS connection via SURFnet or GEANT2. CERN, FNAL and BNL are shown as endpoints.]
- Production IP: ESnet core carrying 10 Gbps enterprise IP traffic; ESnet IP core ≥10 Gbps; ESnet 2nd core: 30-50 Gbps
- Science Data Network core: 40-60 Gbps circuit transport; lab-supplied and major international links
- LHCNet Data Network: 2 to 8 x 10 Gbps US-CERN. LHCNet US-CERN wavelength triangle: 10/05: 10G CHI + 10G NY; 2007: 20G + 20G; 2009: ~40G + 40G
- ESnet MANs to FNAL & BNL; dark fiber (60 Gbps) to FNAL; IRNC links
Slide 9
Global Lambdas for Particle Physics: Caltech/CACR and FNAL/SLAC Booths
- Preview global-scale data analysis of the LHC era (2007-2020+), using next-generation networks and intelligent grid systems
- Using state-of-the-art WAN infrastructure and Grid-based Web service frameworks, based on the LHC tiered Data Grid architecture
- Using a realistic mixture of streams: organized transfer of multi-TB event datasets, plus numerous smaller flows of physics data that absorb the remaining capacity
- The analysis software suites are based on the Grid-enabled Analysis Environment (GAE) developed at Caltech and U. Florida, as well as Xrootd from SLAC and dCache from FNAL
- Monitored by Caltech's MonALISA global monitoring and control system
Slide 10
Global Lambdas for Particle Physics: Caltech/CACR and FNAL/SLAC Booths
- We used twenty-two [*] 10 Gbps waves to carry bidirectional traffic between Fermilab, Caltech, SLAC, BNL, CERN and other partner Grid service sites including: Michigan, Florida, Manchester, Rio de Janeiro (UERJ) and Sao Paulo (UNESP) in Brazil, Korea (KNU), and Japan (KEK)
- Results: 151 Gbps peak, 100+ Gbps of throughput sustained for hours; 475 Terabytes of physics data transported in < 24 hours; 131 Gbps measured by the SCInet BWC team on 17 of our waves (see the averages computed below)
- Using real physics applications and production as well as test systems for data access, transport and analysis: bbcp, xrootd, dCache and GridFTP, plus grid analysis tool suites
- An optimized Linux kernel for TCP-based protocols, including Caltech's FAST
- Far surpassing our previous SC2004 BWC record of 101 Gbps
[*] 15 at the Caltech/CACR booth and 7 at the FNAL/SLAC booth
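For context (not on the slide), the average rates implied by these totals:

```python
# Average rates implied by the totals quoted above (idealized, ignoring overheads).
def avg_gbps(terabytes, hours):
    return terabytes * 1e12 * 8 / (hours * 3600) / 1e9

print(round(avg_gbps(475, 24)))    # ~44 Gbps average to move 475 TB in 24 hours
print(round(avg_gbps(1000, 24)))   # ~93 Gbps average corresponds to a petabyte per day
```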
Slide 11
Monitoring NLR, Abilene/HOPI, LHCNet, USNet, TeraGrid, PWave, SCInet, Gloriad, JGN2, WHREN, other international R&E nets, and 14,000+ Grid nodes simultaneously (I. Legrand)
Slide 14
Switch and Server Interconnections at the Caltech Booth (#428)
- 15 10G waves
- 72 nodes with 280+ cores
- 64 10G switch ports: 2 fully populated Cisco 6509Es
- 45 Neterion 10 GbE NICs
- 200 SATA disks
- 40 Gbps (20 HBAs) to StorCloud
- Thursday-Sunday setup
http://monalisa-ul.caltech.edu:8080/stats?page=nodeinfo_sys
Slide 18
Fermilab
Our BWC data sources are the production storage systems and file servers used by:
- CDF
- DØ
- US CMS Tier 1
- Sloan Digital Sky Survey
Each of these produces, stores and moves multi-TB to PB-scale data: tens of TB per day. ~600 GridFTP servers (of 1000s) directly involved.
Slide 19
Fermilab Mass Storage Facilities
- Over 3.3 PB stored. Ingest ~200 TB/month. 20 to 300 TB/day read.
- Disk pools: "dCache" backed by tape through "Enstore"
- Multiple data transfer protocols. WAN: gsiftp, http. LAN: dcap (presents a POSIX I/O interface). See the sketch below.
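As an illustration of the two access paths named above, here is a hedged sketch of fetching a file from a dCache pool; the door hostnames, port and paths are placeholders, and client options vary between site installations:

```python
# Hedged sketch: reading a dCache-resident file via the protocols listed above.
# Hostnames, port and paths are placeholders, not real Fermilab endpoints.
import subprocess

def fetch_wan(remote_path, local_path):
    # WAN access via gsiftp (GridFTP); "-p 4" asks for parallel streams.
    subprocess.run(
        ["globus-url-copy", "-p", "4",
         f"gsiftp://door.example.fnal.gov{remote_path}",
         f"file://{local_path}"],
        check=True)

def fetch_lan(remote_path, local_path):
    # LAN access via dcap, using the dCache copy client (dccp).
    subprocess.run(
        ["dccp", f"dcap://door.example.fnal.gov:22125{remote_path}", local_path],
        check=True)

fetch_wan("/pnfs/example/cms/run123/events.root", "/tmp/events.root")
```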
Slide 21
Xrootd Server Performance (A. Hanushevsky)
- Scientific results: ad hoc analysis of multi-TByte archives; immediate exploration spurs novel discovery approaches
- Linear scaling: hardware performance, deterministic sizing
- High capacity: thousands of clients, hundreds of parallel streams
- Very low latency: 12 us + transfer cost; device + NIC limited; excellent across WANs
Slide 22
Xrootd Clustering
[Diagram: a client asks the redirector (head node) to "open file X"; the redirector asks its data servers (A, B, C) "Who has file X?"; a server answers "I have" and the client is told "go to C", where it reopens the file. A supervisor (sub-redirector) repeats the same exchange for its own servers (D, E, F), so the client sees all servers as xrootd data servers. A minimal code sketch of this redirect pattern follows below.]
- Unbounded clustering: self-organizing
- Total fault tolerance: automatic real-time reorganization
- Result: minimum admin overhead; better client CPU utilization; more results in less time at less cost
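A minimal sketch (not xrootd's actual code) of the redirect pattern just described, with the server names and file path invented for illustration:

```python
# Minimal sketch of the xrootd redirect pattern described above: a redirector
# asks its data servers who has the file and sends the client to one that does.
class DataServer:
    def __init__(self, name, files):
        self.name, self.files = name, set(files)

    def has(self, path):                 # "Who has file X?"
        return path in self.files

class Redirector:
    def __init__(self, servers):
        self.servers = servers           # may include supervisors (sub-redirectors)

    def open(self, path):
        for server in self.servers:
            if server.has(path):
                return f"go to {server.name}"   # client reopens the file there
        raise FileNotFoundError(path)

cluster = Redirector([DataServer("A", []), DataServer("B", []),
                      DataServer("C", ["/store/fileX"])])
print(cluster.open("/store/fileX"))      # -> "go to C"
```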
Slide 23
GAE Services and ROOT Analysis at Remote Sites: Caltech, UFL, Brazil, ...
- Authenticated users automatically discover and initiate multiple transfers of physics datasets (ROOT files) through secure Clarens-based GAE services, as sketched below. Transfers are monitored through MonALISA.
- Once the data arrives at the target (remote) sites, authenticated users can start analysis using the ROOT analysis framework.
- Using the Clarens ROOT viewer or the COJAC event viewer, remote data can be presented transparently to the user.
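A hedged sketch of how a client might drive such Clarens-based GAE services; the server URL and the method names (file.find, file.fetch) are hypothetical placeholders rather than the actual Clarens API, and X.509/proxy authentication is omitted:

```python
# Hedged sketch of the discover-then-transfer workflow described above.
# URL and method names are placeholders, not the real Clarens interface.
import xmlrpc.client

gae = xmlrpc.client.ServerProxy("https://gae.example.caltech.edu/clarens/")

datasets = gae.file.find("/store/cms/*.root")        # discover ROOT files
for path in datasets:
    gae.file.fetch(path, "tier2.example.ufl.edu")    # request transfer to a target site
# Once MonALISA shows the transfers complete, the files are opened at the
# target site with the ROOT analysis framework.
```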
Slide 24
SC|05 Abilene and HOPI Waves
Slide 25
GLORIAD: 10 Gbps Optical Ring Around the Globe by March 2007
GLORIAD circuits today:
- 10 Gbps Hong Kong-Daejeon-Seattle
- 10 Gbps Seattle-Chicago-NYC (CANARIE contribution to GLORIAD)
- 622 Mbps Moscow-AMS-NYC
- 2.5 Gbps Moscow-AMS
- 155 Mbps Beijing-Khabarovsk-Moscow
- 2.5 Gbps Beijing-Hong Kong
- 1 GbE NYC-Chicago (CANARIE)
China, Russia, Korea, Japan, US, Netherlands partnership. US: NSF IRNC Program.
Slide 26
ESLEA/UKLight SC|05 Network Diagram
[Diagram: OC-192 and 6 x 1 GE links]
Slide 27
KNU (Korea): Main Goals
- Uses the 10 Gbps GLORIAD link from Korea to the US, called BIG-GLORIAD, also part of UltraLight
- Try to saturate the BIG-GLORIAD link with servers and cluster storage connected at 10 Gbps
- Korea is planning to be a Tier-1 site for the LHC experiments
[Diagram: Korea - BIG-GLORIAD - U.S.]
Slide 28
KEK (Japan) at SC05: 10GE Switches on the KEK-JGN2-StarLight Path
- JGN2: 10G network research testbed, operational since 4/04
- 10 Gbps L2 between Tsukuba and Tokyo Otemachi
- 10 Gbps IP to StarLight since August 2004; 10 Gbps L2 to StarLight since September 2005
- Otemachi-Chicago OC-192 link replaced by 10GE WAN PHY in September 2005
Slide 29
Brazil HEPGrid: Rio de Janeiro (UERJ) and Sao Paulo (UNESP)
Slide 30
"Global Lambdas for Particle Physics": A Worldwide Network & Grid Experiment
- We have previewed the IT challenges of next-generation science at the high energy frontier (for the LHC and other major programs): petabyte-scale datasets; tens of national and transoceanic links at 10 Gbps (and up); 100+ Gbps aggregate data transport sustained for hours; we reached a Petabyte/day transport rate for real physics data
- We set the scale and learned to gauge the difficulty of the global networks and transport systems required for the LHC mission
- But we set up, shook down and successfully ran the system in < 1 week
- We have substantive take-aways from this marathon exercise:
  - An optimized Linux (2.6.12 + FAST + NFSv4) kernel for data transport, after 7 full kernel-build cycles in 4 days
  - A newly optimized application-level copy program, bbcp, that matches the performance of iperf under some conditions (see the sketch below)
  - Extension of Xrootd, an optimized low-latency file access application for clusters, across the wide area
  - Understanding of the limits of 10 Gbps-capable systems under stress
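A hedged example of the kind of bbcp invocation behind that take-away; the host, paths and tuning values are placeholders, and the flags shown (-s streams, -w window size, -P progress interval) follow bbcp's usual usage rather than the exact settings used at SC|05:

```python
# Hedged sketch of an application-level copy with bbcp; all values are placeholders.
import subprocess

subprocess.run(
    ["bbcp", "-P", "2", "-s", "16", "-w", "4M",
     "/data/cms/events.root",
     "user@tier2.example.edu:/data/incoming/events.root"],
    check=True)
```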
Slide 31
"Global Lambdas for Particle Physics": A Worldwide Network & Grid Experiment
- We are grateful to our many network partners: SCInet, LHCNet, StarLight, NLR, Internet2's Abilene and HOPI, ESnet, UltraScience Net, MiLR, FLR, CENIC, Pacific Wave, UKLight, TeraGrid, Gloriad, AMPATH, RNP, ANSP, CANARIE and JGN2
- And to our partner projects: US CMS, US ATLAS, D0, CDF, BaBar, US LHCNet, UltraLight, LambdaStation, Terapaths, PPDG, GriPhyN/iVDGL, LHCNet, StorCloud, SLAC IEPM, ICFA/SCIC and Open Science Grid
- Our supporting agencies: DOE and NSF
- And for the generosity of our vendor supporters, especially Cisco Systems, Neterion, HP, IBM, and many others, who have made this possible
- And the Hudson Bay Fan Company...
Slide 33
Extra Slides Follow
Slide 34
Global Lambdas for Particle Physics Analysis: SC|05 Bandwidth Challenge Entry
- Caltech, CERN, Fermilab, Florida, Manchester, Michigan, SLAC, Vanderbilt, Brazil, Korea, Japan, et al.
- CERN's Large Hadron Collider experiments: data/compute/network intensive
- Discovering the Higgs, Supersymmetry, or extra space dimensions, with a Global Grid
- Worldwide collaborations of physicists working together, while developing next-generation global network and grid systems
Slide 35
Clarens
[Diagram: clients and 3rd-party applications connect over http/https to web servers running Clarens (ACL, X509, discovery) via XML-RPC, SOAP, Java RMI or JSON-RPC, in front of Catalog, Storage, Analysis and Sandbox services; the user selects datasets, moves them over the network and starts (remote) analysis.]
Clarens services:
- Authentication and access control on Web Services
- Remote file access (and access control on files)
- Discovery of Web Services and software
- Shell service: shell-like access to remote machines (managed by access control lists)
- Proxy certificate functionality
- Virtual Organization management and role management
Clarens is the user's point of access to a Grid system. It provides an environment where the user can access Grid resources and services, execute and monitor Grid applications, and collaborate with other users: a one-stop shop for Grid needs. Portals can lower the barrier for users to access Web Services and to use Grid-enabled applications. (A hedged example of such a service call follows below.)
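A hedged sketch of an authenticated call to a Clarens-style endpoint in the spirit of the protocols listed above; the URL, the method name (file.ls) and the certificate paths are placeholders, not the actual Clarens API:

```python
# Hedged sketch: JSON-RPC-style call over https with an X.509 (grid proxy)
# client certificate, illustrating authenticated access control on a Web Service.
# All endpoint, method and path names below are placeholders.
import requests

PROXY_CERT = "/tmp/x509up_u1000"     # grid proxy certificate (placeholder path)

response = requests.post(
    "https://clarens.example.edu/clarens/",                            # hypothetical service URL
    json={"method": "file.ls", "params": ["/store/user"], "id": 1},    # JSON-RPC style request
    cert=(PROXY_CERT, PROXY_CERT),             # client certificate and key for mutual TLS
    verify="/etc/grid-security/certificates",  # CA directory (placeholder)
    timeout=30,
)
print(response.json())
```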