Slide 1: High Energy & Nuclear Physics Experiments and Advanced Cyberinfrastructure
Paul Avery, University of Florida
Internet2 Meeting, San Diego, CA, October 11, 2007
Slide 2: Context: Open Science Grid
- Consortium of many organizations (multiple disciplines)
- Production grid cyberinfrastructure
- 75+ sites, 30,000+ CPUs: US, UK, Brazil, Taiwan
Slide 3: OSG Science Drivers
- Experiments at the Large Hadron Collider: new fundamental particles and forces; 100s of petabytes; 2008 – ?
- High Energy & Nuclear Physics experiments: top quark, nuclear matter at extreme density; ~10 petabytes; 1997 – present
- LIGO: search for gravitational waves; ~few petabytes; 2002 – present
- Common trends: data growth, community growth
- Future grid resources: massive CPU (PetaOps), large distributed datasets (>100 PB), global communities (1000s), international optical networks
Slide 4: OSG History in Context
- Primary drivers: LHC and LIGO
- US grid project lineage: PPDG (DOE), GriPhyN (NSF), iVDGL (NSF) -> Trillium -> Grid3 -> OSG (DOE+NSF)
- Parallel efforts: European grids + Worldwide LHC Computing Grid; campus and regional grids
- Timeline context: LHC construction, preparation, commissioning -> LHC ops; LIGO preparation -> LIGO operation
Slide 5: LHC Experiments at CERN
- 27 km tunnel spanning Switzerland & France
- Experiments: ATLAS, CMS, ALICE, LHCb, TOTEM
- Physics goals: search for the origin of mass, new fundamental forces, supersymmetry, other new particles
- 2008 – ?
Slide 6: Collisions at LHC (2008?)
- Proton-proton collisions, ~20 collisions per bunch crossing
- 2835 bunches per beam, ~10^11 protons per bunch
- Beam energy: 7 TeV x 7 TeV; luminosity: 10^34 cm^-2 s^-1
- Crossing rate: every 25 ns; collision rate: ~10^9 Hz
- New physics rate: ~10^-5 Hz; selection: ~1 in 10^13
- (Figure: partons (quarks, gluons) inside the colliding protons produce leptons and jets)
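The rate figures above hang together with simple arithmetic. Below is a back-of-envelope check (a sketch only; the crossing interval, collisions per crossing, and luminosity come from the slide, while the ~1 fb "new physics" cross section is an illustrative assumption):

# Back-of-envelope LHC rate arithmetic (illustrative sketch, not official experiment numbers).

crossing_interval_s = 25e-9            # bunch crossing every 25 ns (slide)
collisions_per_crossing = 20           # ~20 collisions per crossing (slide)
luminosity_cm2_s = 1e34                # design luminosity, cm^-2 s^-1 (slide)
signal_xsec_cm2 = 1e-39                # ~1 fb, an assumed cross section for very rare new physics

crossing_rate_hz = 1.0 / crossing_interval_s                      # 40 MHz
collision_rate_hz = crossing_rate_hz * collisions_per_crossing    # ~8e8, i.e. ~10^9 Hz
signal_rate_hz = luminosity_cm2_s * signal_xsec_cm2               # ~1e-5 Hz

print(f"crossing rate : {crossing_rate_hz:.1e} Hz")
print(f"collision rate: {collision_rate_hz:.1e} Hz")
print(f"signal rate   : {signal_rate_hz:.1e} Hz")
print(f"selectivity   : 1 in {collision_rate_hz / signal_rate_hz:.0e} collisions (with these assumptions)")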
Slide 7: LHC Data and CPU Requirements (CMS, ATLAS, LHCb)
- Storage: raw recording rate 0.2 – 1.5 GB/s; large Monte Carlo data samples; 100 PB by ~2012; 1000 PB later in the decade?
- Processing: PetaOps (> 600,000 3 GHz cores)
- Users: 100s of institutes, 1000s of researchers
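To connect the raw recording rate above with the petabyte totals, here is a rough conversion (a sketch; the assumed 10^7 live seconds per accelerator year is a common rule of thumb, not a number from the slide):

# Rough conversion of raw recording rate into yearly data volume (illustrative sketch).

live_seconds_per_year = 1e7                 # assumed live time per accelerator year (rule of thumb)

for rate_gb_per_s in (0.2, 1.5):            # raw recording rates quoted on the slide
    pb_per_year = rate_gb_per_s * live_seconds_per_year / 1e6   # GB -> PB
    print(f"{rate_gb_per_s:.1f} GB/s  ->  ~{pb_per_year:.0f} PB of raw data per year")

# Raw data alone is a few to ~15 PB/yr; Monte Carlo samples and derived copies
# multiply that, pushing the totals toward the 100 PB scale.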
Slide 8: LHC Global Collaborations (ATLAS, CMS)
- 2000 – 3000 physicists per experiment
- USA is 20–31% of total
Slide 9: CMS Experiment: LHC Global Grid
- Online system -> Tier 0 (CERN Computer Center), raw data at MB/s scale
- Tier 0 -> Tier 1 centers (FermiLab, Korea, Russia, UK) over >10 Gb/s links
- Tier 1 -> Tier 2 universities (Caltech, UCSD, U Florida, Iowa, Maryland) over Gb/s links
- Tier 2 -> Tier 3/4 physics caches and PCs (e.g. FIU) over Gb/s links
- 5000 physicists, 60 countries; 10s of petabytes/yr by 2009; CERN / Outside = 10-20%
- The US portion operates within OSG
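A minimal sketch of the tiered distribution model this slide depicts, written as a plain data structure (site names and link speeds are taken from the figure; the layout itself is only an illustration, not CMS software):

# Illustrative representation of the tiered CMS data-distribution model from the slide.

tier_model = {
    "Tier 0":   {"sites": ["CERN Computer Center"],
                 "feed": "online system (raw data, MB/s scale)"},
    "Tier 1":   {"sites": ["FermiLab", "Korea", "Russia", "UK"],
                 "links": ">10 Gb/s from Tier 0"},
    "Tier 2":   {"sites": ["Caltech", "UCSD", "U Florida", "Iowa", "Maryland"],
                 "links": "Gb/s from Tier 1"},
    "Tier 3/4": {"sites": ["FIU", "physics caches, PCs"],
                 "links": "Gb/s from Tier 2"},
}

for tier, info in tier_model.items():
    print(f"{tier:<9} {', '.join(info['sites'])}")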
Slide 10: LHC Global Grid
- 11 Tier-1 sites
- 112 Tier-2 sites (growing)
- 100s of universities
- (Figure credit: J. Knobloch)
Slide 11: LHC Cyberinfrastructure Growth: CPU
- (Chart: CPU capacity growth at CERN, Tier-1 and Tier-2 sites, reaching ~100,000 cores)
- Multi-core boxes; AC & power challenges
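The "AC & power" note can be made concrete with a rough estimate (every number below is an assumption for illustration; the slide gives only the ~100,000-core scale):

# Rough power estimate for a ~100,000-core facility (all parameters are illustrative assumptions).

cores = 100_000                 # scale from the slide
cores_per_box = 8               # assumed multi-core box
watts_per_box = 250             # assumed draw per box under load
cooling_overhead = 1.8          # assumed PUE-style factor for AC and power distribution

it_power_mw = cores / cores_per_box * watts_per_box / 1e6
total_power_mw = it_power_mw * cooling_overhead
print(f"IT load  : ~{it_power_mw:.1f} MW")
print(f"With AC  : ~{total_power_mw:.1f} MW")
# Megawatt-scale power and cooling, spread across CERN, Tier-1 and Tier-2 machine rooms.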
Slide 12: LHC Cyberinfrastructure Growth: Disk
- (Chart: disk capacity growth at CERN, Tier-1 and Tier-2 sites, reaching ~100 petabytes)
Slide 13: LHC Cyberinfrastructure Growth: Tape
- (Chart: tape capacity growth at CERN and Tier-1 sites, reaching ~100 petabytes)
Slide 14: HENP Bandwidth Roadmap for Major Links (in Gbps)
- (Table of projected link speeds; paralleled by the ESnet roadmap)
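For scale, a short sketch of how long bulk dataset transfers take at speeds spanning this roadmap (the 100 TB dataset size and the 70% achievable-throughput factor are assumptions for illustration):

# Time to move a bulk physics dataset at various link speeds (illustrative sketch).

dataset_tb = 100                      # example dataset size, assumed for illustration
efficiency = 0.7                      # assumed fraction of nominal link speed actually achieved

for link_gbps in (1, 10, 40, 80):     # speeds spanning the roadmap's range
    throughput_tb_per_hour = link_gbps * efficiency * 3600 / 8 / 1000  # Gb/s -> TB/h
    hours = dataset_tb / throughput_tb_per_hour
    print(f"{link_gbps:>3} Gbps: ~{hours:.1f} hours for {dataset_tb} TB")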
Slide 15: HENP Collaboration with Internet2
- HENP SIG
Slide 16: HENP Collaboration with NLR
- UltraLight and other networking initiatives
- Spawning state-wide and regional networks (FLR, SURA, LONI, ...)
Slide 17: US LHCNet, ESnet Plan: 30 -> 80 Gbps US-CERN
- US-LHCNet (NY-CHI-GVA-AMS): 30, 40, 60, 80 Gbps, i.e. 3 to 8 x 10 Gbps US-CERN
- ESnet4 SDN core: 30-50 Gbps; ESnet IP core: ≥10 Gbps; metropolitan area rings
- ESnet MANs to FNAL & BNL; dark fiber to FNAL; peering with GEANT
- Connections to GEANT2, SURFnet, IN2P3; NSF/IRNC circuit; GVA-AMS connection via SURFnet or GEANT2
- High-speed cross connects with Internet2/Abilene; international links to Europe, Japan, Australia, Asia-Pacific
- (Figure: ESnet4 / US-LHCNet network map showing major DOE Office of Science sites and ESnet hubs)
Slide 18: CSA06 Tier1–Tier2 Data Transfers: 2006–07
- (Chart: Tier-1 to Tier-2 transfer rates, Sep. 2006 – Mar. 2007, on a scale up to ~1 GB/s)
Slide 19: FNAL Transfer Rates to Tier-2 Universities (Computing, Offline and CSA07)
- (Chart: US transfers from FNAL to Tier-2 universities, June 2007, with a 1 GB/s reference level)
- Nebraska: one well-configured site
- But ~10 such sites expected in the near future: a network challenge (see the sketch below)
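The network challenge can be quantified with one line of arithmetic (a sketch; the site count is from the slide, and each site is assumed to approach the ~1 GB/s level shown on the chart):

# Why ~10 well-configured Tier-2 sites become a network challenge (illustrative sketch).

per_site_gb_per_s = 1.0       # assumed per-site rate, near the ~1 GB/s level on the chart
n_sites = 10                  # ~10 such sites expected in the near future (slide)

aggregate_gbps = per_site_gb_per_s * 8 * n_sites      # GB/s -> Gb/s, summed over sites
print(f"Aggregate Tier-2 demand: ~{aggregate_gbps:.0f} Gb/s")
# ~80 Gb/s, comparable to the full 30-80 Gbps US-CERN plan on slide 17.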
Slide 20: Current Data Transfer Experience
- Transfers are generally much slower than expected, or stop altogether
- Potential causes are difficult to diagnose: configuration problem? loading? queuing? database errors, experiment S/W error, grid S/W error? end-host problem? network problem? application failure?
- Complicated recovery: insufficient information; too slow to diagnose and correlate at the time the error occurs
- Result: lower transfer rates, longer troubleshooting times
- Need intelligent services and smart end-host systems (see the sketch below)
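One small example of the kind of "intelligent service" the slide asks for: an automatic first-pass triage that sorts failed-transfer messages into coarse categories before a human looks at them. Everything below is hypothetical (the keywords and categories are invented for illustration and are not tied to any particular grid middleware):

# Hypothetical first-pass triage of failed transfer log lines (illustrative sketch only).

CATEGORIES = {
    "authentication": ("certificate", "proxy expired", "authorization"),
    "storage":        ("no space left", "checksum mismatch", "file not found"),
    "network":        ("connection timed out", "connection reset", "no route to host"),
    "database":       ("ORA-", "deadlock", "too many connections"),
}

def classify(error_message: str) -> str:
    """Map a raw error message to a coarse category for routing to the right experts."""
    msg = error_message.lower()
    for category, keywords in CATEGORIES.items():
        if any(k.lower() in msg for k in keywords):
            return category
    return "unknown"          # the hard cases the slide complains about

if __name__ == "__main__":
    sample = [
        "globus_xio: connection timed out after 120s",
        "SRM put failed: no space left on device",
        "proxy expired 3 hours ago",
    ]
    for line in sample:
        print(f"{classify(line):>14}  <-  {line}")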
Slide 21: UltraLight: Integrating Advanced Networking in Applications
- 10 Gb/s+ network
- Partners: Caltech, UF, FIU, UM, MIT, SLAC, FNAL; international partners; Level(3), Cisco, NLR
- Funded by NSF
Slide 22: UltraLight Testbed
- Funded by NSF
Slide 23: Many Near-Term Challenges
- Network: bandwidth, bandwidth, bandwidth
- Need for intelligent services and automation
- More efficient utilization of the network (protocols, NICs, S/W clients, pervasive monitoring)
- Better collaborative tools
- Distributed authentication?
- Scalable services: automation
- Scalable support
Slide 24: END
Slide 25: Extra Slides
Slide 26: The Open Science Grid Consortium
- Open Science Grid brings together:
  - U.S. grid projects
  - LHC experiments
  - Laboratory centers
  - Education communities
  - Science projects & communities
  - Technologists (network, HPC, ...)
  - Computer science
  - University facilities
  - Multi-disciplinary facilities
  - Regional and campus grids
Slide 27: CMS: "Compact" Muon Solenoid
- (Photo of the detector; the caption "Inconsequential humans" marks the people shown for scale)
Slide 28: Collision Complexity: CPU + Storage
- (Event display: all charged tracks with pT > 2 GeV vs. reconstructed tracks with pT > 25 GeV, with +30 minimum-bias events overlaid)
- 10^9 collisions/sec; selectivity: ~1 in 10^13
Slide 29: LHC Data Rates: Detector to Storage
- Physics filtering chain: collisions at 40 MHz, ~TBytes/sec of detector data
- Level 1 trigger (special hardware): output 75 kHz, 75 GB/sec
- Level 2 trigger (commodity CPUs): output 5 kHz, 5 GB/sec
- Level 3 trigger (commodity CPUs): output 100 Hz, 0.15 – 1.5 GB/sec of raw data to storage (+ simulated data)
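The per-event sizes implied by this chain follow directly from dividing the quoted bandwidths by the quoted rates (a consistency sketch; no numbers beyond those on the slide):

# Event sizes implied by the trigger-chain rates and bandwidths on the slide (sketch).

stages = [
    # (stage,                        rate_hz, bandwidth_bytes_per_s)
    ("after Level 1 (hardware)",     75e3,    75e9),     # 75 kHz, 75 GB/s
    ("after Level 2 (CPUs)",         5e3,     5e9),      # 5 kHz, 5 GB/s
    ("after Level 3, to storage",    100,     0.15e9),   # 100 Hz, lower end of 0.15 - 1.5 GB/s
]

for name, rate_hz, bandwidth in stages:
    print(f"{name:<28} ~{bandwidth / rate_hz / 1e6:.1f} MB per event")

# Event size stays at the MB scale while each level cuts the rate by orders of magnitude;
# the 0.15 - 1.5 GB/s storage band corresponds to roughly 1.5 - 15 MB per stored event at 100 Hz.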
Slide 30: LIGO: Search for Gravity Waves
- LIGO Grid: 6 US sites + 3 EU sites (UK & Germany: Cardiff, Birmingham, AEI/Golm)
- LHO, LLO: LIGO observatory sites
- LSC: LIGO Scientific Collaboration
Slide 31: Is HEP Approaching the Productivity Plateau?
- The Gartner Group technology hype cycle applied to HEP grids (from Les Robertson)
- Milestones marked by CHEP conferences: Padova 2000, Beijing 2001, San Diego 2003, Interlaken 2004, Mumbai 2006, Victoria 2007
- (Chart axes: expectations vs. time)
Slide 32: Challenges from Diversity and Growth
- Management of an increasingly diverse enterprise: sci/eng projects, organizations, and disciplines as distinct cultures; accommodating new member communities (expectations?)
- Interoperation with other grids: TeraGrid, international partners (EGEE, NorduGrid, etc.), multiple campus and regional grids
- Education, outreach and training: training for researchers and students, but also project PIs and program officers
- Operating a rapidly growing cyberinfrastructure: 25K -> 100K CPUs, 4 -> 10 PB disk; management of and access to rapidly increasing data stores (slide); monitoring, accounting, achieving high utilization; scalability of the support model (slide)
Slide 33: Collaborative Tools: EVO Videoconferencing
- End-to-end self-managed infrastructure
Slide 34: REDDnet: National Networked Storage
- NSF-funded project (Vanderbilt)
- 8 initial sites; Brazil?
- Multiple disciplines: satellite imagery, HENP, Terascale Supernova Initiative, structural biology, bioinformatics
- Storage: 500 TB disk, 200 TB tape
Slide 35: OSG Operations Model
- Distributed model: scalability! (VOs, sites, providers)
- Rigorous problem tracking & routing
- Security, provisioning, monitoring, reporting
- Partners with EGEE operations