Global Networking for the LHC. Artur Barczyk, California Institute of Technology. ECOC Conference, Geneva, September 18th, 2011
INTRODUCTION: First Year of LHC from the network perspective
WLCG Worldwide Resources
– Today: >140 sites, >250k CPU cores, >150 PB disk
– WLCG Collaboration status: Tier 0; 11 Tier 1s; 68 Tier 2 federations
– Today: 49 MoU signatories, representing 34 countries: Australia, Austria, Belgium, Brazil, Canada, China, Czech Rep., Denmark, Estonia, Finland, France, Germany, Hungary, Italy, India, Israel, Japan, Rep. Korea, Netherlands, Norway, Pakistan, Poland, Portugal, Romania, Russia, Slovenia, Spain, Sweden, Switzerland, Taipei, Turkey, UK, Ukraine, USA
– In addition to WLCG, O(300) Tier-3 sites (not shown)
Data and Computing Models
– The Evolving MONARC Picture, circa 2003 (from Ian Bird, ICHEP 2010); original MONARC model circa 1996
– The models are based on the MONARC model, now 10+ years old
– Variations by experiment
The LHC Optical Private Network
– Serving Tier0 and Tier1 sites: dedicated network resources for Tier0 and Tier1 data movement
– Layer 2 overlay on R&E infrastructure; 130 Gbps total Tier0-Tier1 capacity (see the sketch after this list)
– Simple architecture: point-to-point Layer 2 circuits; flexible and scalable topology
– Grew organically, from star to partial mesh
– Open to technology choices that satisfy the requirements: OC-192/SDH-64, EoMPLS, OTN-3
– Federated governance model: coordination between stakeholders
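A back-of-the-envelope view of the quoted 130 Gbps aggregate: it is simply the sum of the individual Tier0-Tier1 point-to-point circuits. The sketch below uses purely illustrative per-circuit capacities (the actual LHCOPN circuit speeds are not given on this slide):

```python
# Illustrative only: hypothetical Tier0-Tier1 circuit capacities in Gbps.
# The real per-Tier1 values differ; the point is that the quoted 130 Gbps
# is the aggregate of the individual point-to-point circuits.
t0_t1_circuits_gbps = {
    "Tier1-A": 10,
    "Tier1-B": 10,
    "Tier1-C": 20,   # e.g. a Tier1 with two 10G circuits
    # ... one entry per Tier1 (11 Tier1s in total in 2011)
}

aggregate = sum(t0_t1_circuits_gbps.values())
print(f"Aggregate Tier0-Tier1 capacity: {aggregate} Gbps")

# Sustained export volume if the circuits were kept full, in TB/day:
print(f"~{aggregate / 8 * 86400 / 1e3:.0f} TB/day if fully utilised")
```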
2010 Worldwide Data Distribution and Analysis (F. Gianotti)
– Total throughput of ATLAS data through the Grid, 1st January to November [plot in MB/s per day]: ~2 GB/s design rate, with ~6 GB/s sustained and peaks of 10 GB/s reached
– Grid-based analysis in Summer 2010: >1000 different users; >15M analysis jobs
– The excellent Grid performance has been crucial for fast release of physics results; e.g. for ICHEP, the full data sample taken until Monday was shown at the conference on Friday
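For scale, these throughput figures translate into link rates and daily volumes as follows; a minimal sketch using only the 2, 6 and 10 GB/s values quoted above:

```python
# Convert the quoted ATLAS Grid throughputs into link rates and daily volumes.
def summarize(rate_gbytes_per_s: float) -> str:
    gbits = rate_gbytes_per_s * 8                  # GB/s -> Gbps
    tb_per_day = rate_gbytes_per_s * 86400 / 1e3   # GB/s -> TB/day
    return f"{rate_gbytes_per_s:4.1f} GB/s = {gbits:.0f} Gbps, ~{tb_per_day:.0f} TB/day sustained"

for rate in (2.0, 6.0, 10.0):   # design, typical sustained, peak (from the slide)
    print(summarize(rate))
```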
CMS Data Movements (2010), All Sites and Tier1-Tier2
– 1-hour average throughput up to 3.5 GBytes/s; daily average total rates reach over 2 GBytes/s (120 days, June-October 2010)
– Daily average Tier1-Tier2 rates reach 1-1.8 GBytes/s (132 hours in Oct. 2010)
– Tier2-Tier2 traffic is ~25% of Tier1-Tier2 traffic, rising to ~50% during dataset reprocessing and repopulation
THE RESEARCH AND EDUCATION NETWORKING LANDSCAPE: Selected representative examples
GEANT Pan-European Backbone
– Dark fiber core among 19 countries: Austria, Belgium, Croatia, Czech Republic, Denmark, Finland, France, Germany, Hungary, Ireland, Italy, Netherlands, Norway, Slovakia, Slovenia, Spain, Sweden, Switzerland, United Kingdom
– 34 NRENs, ~40M users; 50k km leased lines, 12k km dark fiber; point-to-point services
– GN3 next-generation network started in June 2009
SURFnet & NetherLight: 8000 km Dark Fiber, Flexible Photonic Infrastructure
– 5 photonic subnets, λ switching; 10G, 40G; 100G trials
– Fixed or dynamic lightpaths for LCG, GN3, EXPRES, DEISA, LOFAR, CineGrid
– Cross border fiber: to Belgium, on to CERN (1650 km); to Germany (X-Win), on to NORDUnet
(Erik-Jan Bos)
GARR-X in Italy: Dark Fiber Network Supporting the LHC Tier1 and Nat'l Tier2 Centers
– GARR-X 10G links among the Bologna Tier1 and 5 Tier2s; adding 5 more sites at 10G
– 2 x 10G circuits to the LHCOPN over GEANT, and to Karlsruhe via int'l Tier2-Tier1 circuits
– Cross border fibers to Karlsruhe (via CH, DE)
(M. Marletta)
US: DOE ESnet. Current ESnet4 topology: multi-10G backbone [map legend: SDN node, IP router node, 10G link, major site]
DOE ESnet: 100 Gbps Backbone Upgrade. ESnet5 100G backbone planned for Q4 2012; first deployment started Q3 2011 [map legend: 100G node, router node, 100G link, major site]
US LHCNet: Non-stop Operation, Circuit-oriented Services
– Performance-enhancing standard extensions: VCAT, LCAS
– USLHCNet, ESnet, BNL & FNAL: facility, equipment and link redundancy
– Core: optical multiservice switches
– Hybrid optical network: dynamic circuit-oriented network services with bandwidth guarantees, with robust fallback at Layer 1
Dark Fiber in NREN Backbones, 2005-2010: Greater or Complete Reliance on Dark Fiber (TERENA Compendium 2010: www.terena.org/activities/compendium/) [maps: 2005, 2010]
Cross Border Dark Fiber in Europe, Current and Planned: Increasing Use (TERENA Compendium 2010)
Global Lambda Integrated Facility: A Global Partnership of R&E Networks and Advanced Network R&D Projects Supporting HEP (http://glif.is). GLIF 2010 Map: Global View
GLIF 2010 Map, North America: ~16 10G Trans-Atlantic links in 2010
GLIF 2010 Map, European View: R&E Networks, Links and GOLEs. GLIF Open Lightpath Exchanges: MoscowLight, CzechLight, CERNLight, NorthernLight, NetherLight, UKLight
Open Exchange Points: NetherLight Example. 3 x 40G and 30+ 10G lambdas; use of dark fiber. Convergence of many partners on common lightpath concepts: Internet2, ESnet, GEANT, USLHCNet; nl, cz, ru, be, pl, es, tw, kr, hk, in, nordic
LHC NETWORKING - BEYOND LHCOPN
Computing Models Evolution
– Moving away from the strict MONARC model; changes introduced gradually since 2010
– Three recurring themes:
  – Flat(ter) hierarchy: any site can use any other site as a source of data
  – Dynamic data caching: analysis sites will pull datasets from other sites "on demand", including from Tier2s in other regions; possibly in combination with strategic pre-placement of data sets (see the sketch after this list)
  – Remote data access: jobs executing locally, using data cached at a remote site in quasi-real time; possibly in combination with local caching
– Variations by experiment
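To make the "dynamic data caching" and "flat(ter) hierarchy" themes concrete, here is a minimal, hypothetical sketch (not any experiment's actual data management software): an analysis site pulls a dataset on demand from whichever site holds a replica, and caches it locally. All names in it are invented.

```python
# Minimal, hypothetical sketch of "dynamic data caching": pull a dataset on
# demand from any site that holds a replica (flat hierarchy), cache it locally.
# Site and dataset names are invented; this is not an experiment framework.

local_cache: dict[str, str] = {}   # dataset name -> local path

def open_dataset(dataset: str, replicas: dict[str, list[str]]) -> str:
    if dataset in local_cache:                      # already cached at this site
        return local_cache[dataset]
    sources = replicas.get(dataset, [])             # any Tier1/Tier2 holding it
    if not sources:
        raise LookupError(f"no replica of {dataset} found")
    # "On demand": transfer from the first available source, then cache.
    local_cache[dataset] = f"/cache/{dataset}"      # placeholder for a real copy
    print(f"fetched {dataset} from {sources[0]}")
    return local_cache[dataset]

# Example: an analysis site pulling from a Tier2 in another region.
replicas = {"example_AOD_dataset": ["Tier2_SiteA", "Tier2_SiteB"]}
print(open_dataset("example_AOD_dataset", replicas))   # first call: fetches
print(open_dataset("example_AOD_dataset", replicas))   # second call: cache hit
```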
LHC Open Network Environment
– So far, T1-T2, T2-T2 and T3 data movements have used the general-purpose R&E network infrastructure: shared resources (with other science fields), mostly best-effort service
– Increased reliance on network performance means more than best effort is needed: separate the large LHC data flows from the routed R&E general-purpose network
– Collaboration on a global scale, in a diverse environment with many parties: the solution has to be open, neutral and diverse, with agility and expandability (scalable in bandwidth, extent and scope); an organic activity, growing over time according to needs
– Architecture: switched core, routed edge (sketched after this list). Core: interconnecting trunks between Open Exchanges. Edge: site border routers, or border routers of regional aggregation networks
– Services: multipoint, static point-to-point, dynamic point-to-point
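One way to picture the "switched core, routed edge" split: the open exchange points and the trunks between them form the core graph, and each site's border router (or regional aggregation network) attaches to one exchange. A minimal sketch with invented exchange and site names:

```python
# Illustrative model of the LHCONE "switched core, routed edge" architecture.
# Exchange point names and site attachments are invented for the example.
from collections import deque

core_trunks = {                 # switched core: trunks between open exchanges
    "ExchangeA": {"ExchangeB", "ExchangeC"},
    "ExchangeB": {"ExchangeA", "ExchangeC"},
    "ExchangeC": {"ExchangeA", "ExchangeB"},
}
edge_attachment = {             # routed edge: site border router -> exchange
    "Tier2_X": "ExchangeA",
    "Tier2_Y": "ExchangeC",
}

def core_path(src_site: str, dst_site: str) -> list[str]:
    """Breadth-first search across the switched core between two edge sites."""
    start, goal = edge_attachment[src_site], edge_attachment[dst_site]
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return [src_site] + path + [dst_site]
        for nxt in core_trunks[path[-1]] - seen:
            seen.add(nxt)
            queue.append(path + [nxt])
    return []

print(core_path("Tier2_X", "Tier2_Y"))
# ['Tier2_X', 'ExchangeA', 'ExchangeC', 'Tier2_Y']
```

The design point the sketch illustrates: only the edge does IP routing, while the core just interconnects exchanges with trunks, which is what keeps the large LHC flows off the routed general-purpose network.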
LHCONE High-Level Architecture Overview: LHCONE Conceptual Diagram
LOOKING FORWARD: NEW NETWORK SERVICES
Characterization of User Space (Cees de Laat; http://ext.delaat.net/talks/cdl-2005-02-13.pdf). This is where LHC users are.
[Figure from David Foster, 1st TERENA ASPIRE Workshop, May 2011]
The Case for Dynamic Circuits in LHC Data Processing
– Data models do not require full-mesh connectivity at full rate at all times
– On-demand data movement will augment and partially replace static pre-placement; network utilisation will be more dynamic and less predictable
– Performance expectations will not decrease: there is more dependence on the network for the whole data processing system to work well
– Need to move large data sets fast between computing sites, on demand (caching) or scheduled (pre-placement); transfer latency is important (see the worked example after this list)
– Network traffic is already far in excess of what was anticipated
– As data volumes grow rapidly and experiments rely increasingly on network performance, what will be needed in the future is more bandwidth, more efficient use of network resources, and a systems approach including end-site resources and software stacks
– Note: solutions for the LHC community need global reach
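A worked example of why transfer latency argues for guaranteed bandwidth; the 100 TB dataset size and the rates below are purely illustrative, not numbers from this talk:

```python
# Purely illustrative numbers: how long moving a large dataset takes at
# different effective rates. Neither the 100 TB size nor the rates come from
# the slide; they just show how latency scales with guaranteed bandwidth.
dataset_tb = 100.0

def transfer_hours(size_tb: float, rate_gbps: float) -> float:
    seconds = size_tb * 8e12 / (rate_gbps * 1e9)   # TB -> bits, then / (bits/s)
    return seconds / 3600

for label, rate in [("best-effort share", 1.0),
                    ("dedicated 10G circuit", 10.0),
                    ("dedicated 40G circuit", 40.0)]:
    print(f"{label:>22}: {transfer_hours(dataset_tb, rate):6.1f} hours")
```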
Dynamic Bandwidth Allocation
– Will be one of the services provided in LHCONE
– Allows network capacity to be allocated on an as-needed basis: instantaneous ("bandwidth on demand") or scheduled allocation (a schematic reservation request is sketched after this list)
– Significant effort in the R&E networking community; standardisation through the OGF (OGF-NSI, OGF-NML)
– A dynamic circuit service is already present in several advanced R&E networks: SURFnet (DRAC), ESnet (OSCARS), Internet2 (ION), US LHCNet (OSCARS)
– Planned or in experimental deployment: e.g. GEANT (AutoBAHN), RNP (OSCARS/DCN), ...
– DYNES: NSF-funded project to extend hybrid and dynamic network capabilities to campus and regional networks; in its first deployment phase, fully operational in 2012
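The provisioning systems named above (DRAC, OSCARS, ION, AutoBAHN) each have their own interfaces, with standardisation ongoing in OGF-NSI. The sketch below is a hypothetical illustration of what a scheduled reservation carries; it is not the API of any of those systems:

```python
# Hypothetical illustration of the ingredients of a bandwidth reservation.
# This is NOT the OSCARS/DRAC/ION/NSI API; names and values are invented.
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class CircuitRequest:
    src_endpoint: str       # e.g. a site border router port (invented id)
    dst_endpoint: str
    bandwidth_mbps: int     # guaranteed rate
    start: datetime         # "on demand" = now, or a scheduled future slot
    end: datetime

req = CircuitRequest(
    src_endpoint="tier2-siteA:port1",
    dst_endpoint="tier1-siteB:port4",
    bandwidth_mbps=5000,
    start=datetime.now(),
    end=datetime.now() + timedelta(hours=6),
)
print(req)
```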
US Example: DYNES Project (http://www.internet2.edu/dynes)
– NSF-funded project: DYnamic NEtwork System
– What is it? A nationwide cyber-instrument spanning up to ~40 US universities and ~14 Internet2 connectors; it extends Internet2's ION service into regional networks and campuses, based on ESnet's OSCARS implementation of the IDC protocol
– Who is it? A collaborative team including Internet2, Caltech, University of Michigan and Vanderbilt University; a community of regional networks and campuses; the LHC and astrophysics communities, OSG, WLCG and other virtual organizations
– The goals: support large, long-distance scientific data flows in the LHC, in other leading programs in data-intensive science (such as LIGO, the Virtual Observatory and other large-scale sky surveys), and in the broader scientific community; build a distributed virtual instrument at sites of interest to the LHC but available to the R&E community generally
DYNES System Description
– Aim: extend hybrid and dynamic capabilities to campus and regional networks
– A DYNES instrument must provide two basic capabilities at the Tier2s, Tier3s and regional networks: (1) network resource allocation, such as bandwidth, to ensure transfer performance, and (2) monitoring of the network and data transfer performance
– All networks in the path require the ability to allocate network resources and monitor the transfer; this capability currently exists on backbone networks such as Internet2 and ESnet, but is not widespread at the campus and regional level
– In addition, Tier2 and Tier3 sites require: (3) hardware at the end sites capable of making optimal use of the available network resources (see the sketch after this list)
– Two typical transfers that DYNES supports: one Tier2-Tier3 and another Tier1-Tier2; the clouds in the diagram represent the network domains involved in such a transfer
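Capability (3) is easy to overlook: an allocated circuit only helps if the end-site storage and hosts can fill it. A minimal sanity-check sketch with invented numbers:

```python
# Quick sanity check in the spirit of DYNES capability (3): can the end-site
# storage actually fill the network allocation? All numbers are invented.
def end_site_bottleneck(disk_read_mbytes_s: float, circuit_gbps: float) -> str:
    disk_gbps = disk_read_mbytes_s * 8 / 1000      # MB/s -> Gbps
    limit = min(disk_gbps, circuit_gbps)
    who = "storage" if disk_gbps < circuit_gbps else "network"
    return f"effective rate ~{limit:.1f} Gbps, limited by the {who}"

print(end_site_bottleneck(disk_read_mbytes_s=600, circuit_gbps=10))
# -> effective rate ~4.8 Gbps, limited by the storage
```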
Summary
– LHC computing models rely on efficient high-throughput data movement between computing sites (Tier0/1/2/3)
– Close collaboration between the LHC and R&E networking communities: regional, national, international
– LHCOPN (LHC Optical Private Network): Layer 2 overlay network with dedicated resources for the Tier0 and Tier1 centres; very successful operation
– LHCONE (LHC Open Network Environment): a new initiative to provide reliable services to ALL LHC computing sites (Tier 0-3); being developed as a collaboration between the LHC community and the Research and Education Networks worldwide; user driven, with organic growth; the current architecture is built on a switched core with a routed edge; will provide advanced network services with dynamic bandwidth allocation
THANK YOU! Artur.Barczyk@cern.ch